POSIX Character Classes


The Portable Operating System Interface (POSIX) standard defines a set of named character classes. ColdFusion developers can use these classes within regular expressions, where they help to make the expressions easier to write, read, and maintain.

ColdFusion supports several POSIX character classes. The most common ones include the following:

  • alpha [:alpha:]. Matches any alphabetic character, regardless of case (A-Z, a-z).

  • alnum [:alnum:]. Matches any alphabetic or numeric character, regardless of case (A-Z, a-z, 0-9).

  • digit [:digit:]. Matches any numeric character (0-9).

  • lower [:lower:]. Matches any lowercase alphabetic character (a-z).

  • punct [:punct:]. Matches any punctuation mark. The characters covered by this character class are the following: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~

  • space [:space:]. Matches any whitespace character.

  • upper [:upper:]. Matches any uppercase alphabetic character (A-Z).

Other available POSIX character classes that you can use within your ColdFusion code include the following:

  • cntrl [:cntrl:]. Matches any control (nonprinting) character. Examples include carriage return, formfeed, and newline.

  • graph [:graph:]. Matches any printable character other than carriage return, formfeed, newline, space, tab, or vertical tab.

  • print [:print:]. Matches any printable character.

  • xdigit [:xdigit:]. Matches any hexadecimal digit. It is equivalent to [A-Fa-f0-9].
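
As a rough illustration of how these classes can be used for simple validation, the following sketch tests whether a string consists entirely of characters from one class. The variable names and sample values (colorCode, userName) are hypothetical and are not part of the original text. Note that each class name sits inside an outer pair of square brackets, and that REFind() returns 0 when nothing matches.

```cfml
<!--- Hypothetical sample values, used only for illustration --->
<cfset colorCode = "1A2b3C">
<cfset userName = "jsmith42">

<!--- REFind() returns the position of the first match, or 0 if there is none.
      ^ and $ anchor the pattern to the start and end of the string,
      and + requires one or more characters from the class. --->
<cfset isHex = REFind("^[[:xdigit:]]+$", colorCode) GT 0>
<cfset isAlnum = REFind("^[[:alnum:]]+$", userName) GT 0>

<cfoutput>
colorCode is hexadecimal: #isHex#<br>
userName is alphanumeric: #isAlnum#
</cfoutput>
```

Both sample values contain only characters from the class being tested, so both checks should evaluate to true. Adding a character from outside the class, such as a space, should make the corresponding check false.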

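One rule to remember when using POSIX character classes is that they must always be contained within two pairs of square brackets. The inner pair belongs to the class name itself, and the outer pair is the bracket expression that contains it. Let's start with a simple example: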

 <cfset secretmessage = REReplace("E!V!A!C!U!A!T!E! !N!O!W!", "[[:punct:]]", "",  "ALL")> 

The preceding call to REReplace() returns "EVACUATE NOW" when output. The [:punct:] POSIX character class matches all the ! characters in the string, and each match is replaced with the specified string, which in this case is "", an empty string. Now let's look at something a little more complex:

 <cfset secretmessage =  REReplace(REReplace("I243*w22i3423ll*m2e678e21t234*y234121ou231*083l3452a2343te34r",  "[[:digit:]]", "", "ALL"), "[[:punct:]]", " ", "ALL")> 

Remember that ColdFusion function calls can be nested. The preceding example shows one call to the REReplace() regular expression function used as the first argument of another, with each call replacing a different character type. The inner call removes every digit, and the outer call then replaces each punctuation mark with a space. The resulting string after both calls is "I will meet you later".
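
For readers who find the nested call hard to follow, here is a sketch of the same replacement split into two steps. The intermediate variable names (encoded, noDigits) are introduced here for illustration and do not appear in the original example.

```cfml
<cfset encoded = "I243*w22i3423ll*m2e678e21t234*y234121ou231*083l3452a2343te34r">

<!--- Step 1: remove every digit, leaving "I*will*meet*you*later" --->
<cfset noDigits = REReplace(encoded, "[[:digit:]]", "", "ALL")>

<!--- Step 2: replace each remaining punctuation mark with a space --->
<cfset secretmessage = REReplace(noDigits, "[[:punct:]]", " ", "ALL")>

<cfoutput>#secretmessage#</cfoutput>
```

The result is the same as with the nested form. The two-step version simply makes the intermediate string available for inspection, which can help when debugging a longer chain of replacements.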

It's pretty easy to do what we've done in the preceding examples. We know the strings and know what needs to be replaced. However, what if we don't have that information? Let's look at some functions that can help us with that problem.



Inside ColdFusion MX
ISBN: 0735713049
Year: 2005
Pages: 579
