Regular expressions are composed of standard characters, such as letters and numbers, as well as special characters and sequences called metacharacters and metasequences. Metacharacters and metasequences are what enable regular expressions to match abstract patterns. For example, the metasequence \d matches any digit, which is more abstract than matching one specific digit. Metacharacters enable you to match specific parts of a string, group characters, and even perform logical operations. The list of metacharacters is relatively short; it is summarized in Table 16.2.
The metasequences are sequences of characters that are interpreted in a specific manner by regular expressions. Table 16.3 summarizes the regular expression metasequences.
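As a minimal sketch of the difference between a literal character and a metasequence, the following compares a pattern that matches only the digit 5 with one that uses \d to match any digit. The sample string and the variable names are illustrative assumptions, not taken from the tables.

```actionscript
// Illustrative sample text (an assumption for this sketch).
var sample:String = "Room 5 is on floor 12.";

// A literal pattern matches only the specific digit 5.
var literalPattern:RegExp = /5/g;
trace(sample.match(literalPattern)); // 5

// The \d metasequence matches any single digit.
var digitPattern:RegExp = /\d/g;
trace(sample.match(digitPattern)); // 5,1,2
```

Because both patterns use the g flag, match() returns every match in the string rather than only the first one.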
Using Character Classes

Character classes are denoted by square brackets ([]), and they let you specify a set of characters for one position within a regular expression. For example, the following regular expression uses a character class to match any substring that starts with b, followed by any vowel, and ends with t.

var pattern:RegExp = /b[aeiou]t/g;
var string:String = "The bat lost the bet, but he didn't mind a bit.";
trace(string.match(pattern)); // bat,bet,but,bit

Most metacharacters and metasequences aren't interpreted as such within a character class. For example, {5} placed within a character class is interpreted literally as the left curly brace, the digit 5, and the right curly brace. The exceptions are the metasequences \n, \r, \t, \unnnn, and \xnn. In addition, the -, ], and \ characters have special meaning within character classes.

The - (hyphen) character within a character class can indicate a range of characters. For example, the following code defines a regular expression that matches any lowercase alphabetical character:

var pattern:RegExp = /[a-z]/;

You can define valid ranges of uppercase and lowercase alphabetical characters, digits, and ASCII character codes. If you use a - character such that it does not define a valid range, it is interpreted literally. For example, the following defines a character class that matches any lowercase letter, any digit, or the - character:

var pattern:RegExp = /[a-z-0-9]/;

The ] character closes a character class. If you want to match the literal ] character within a character class, you have to escape it. The backslash character (\) is the escape character. The following example matches any lowercase letter or the right square bracket character:

var pattern:RegExp = /[a-z\]]/;

If you want to match the literal backslash character, you can escape it with a preceding backslash character.
The following matches any lowercase letter or the backslash character:

var pattern:RegExp = /[a-z\\]/;

Working with Quantifiers

The metacharacters and metasequences *, +, ?, {n}, {n,}, and {n,m} are quantifiers. They let you specify repetitions within patterns. A quantifier applies to the item that immediately precedes it. An item can be a character, metasequence, character class, or group. The following example uses the + quantifier to find all the substrings that consist of one or more alphabetical characters:

var pattern:RegExp = /[a-z]+/ig;
var string:String = "There is no path to peace. Peace is the path.";
trace(string.match(pattern)); // There,is,no,path,to,peace,Peace,is,the,path

The following code matches runs of four or five letters. In this string, that selects only the words that are four or five characters long. Because the pattern does not anchor on word boundaries, a longer word would be matched in pieces rather than skipped:

var pattern:RegExp = /[a-z]{4,5}/ig;
var string:String = "There is no path to peace.\nPeace is the path.";
trace(string.match(pattern)); // There,path,peace,Peace,path
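The examples above cover + and {n,m}. As a hedged sketch of the remaining quantifiers (?, *, {n}, and {n,}), the following applies each one to the same string. The sample string and the variable names are illustrative assumptions rather than part of the original examples.

```actionscript
// Illustrative sample text (an assumption for this sketch).
var sample:String = "color colour colouur 7 42 2024";

// ? makes the preceding item optional: zero or one u.
var optionalU:RegExp = /colou?r/g;
trace(sample.match(optionalU)); // color,colour

// * allows the preceding item zero or more times: any number of u characters.
var anyU:RegExp = /colou*r/g;
trace(sample.match(anyU)); // color,colour,colouur

// {n} requires exactly n repetitions: four digits in a row.
var fourDigits:RegExp = /\d{4}/g;
trace(sample.match(fourDigits)); // 2024

// {n,} requires at least n repetitions: two or more digits in a row.
var twoOrMoreDigits:RegExp = /\d{2,}/g;
trace(sample.match(twoOrMoreDigits)); // 42,2024
```

In each pattern the quantifier applies only to the single item before it (the u character or the \d metasequence), not to the whole pattern.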