Using Regular Expression Groups


Regular expression groups are denoted by parentheses. You can use groups for the following basic purposes:

  • Apply a quantifier to more than one character

  • Control the scope of logical OR (|) operations

  • Remember subpattern matches for subsequent use in the code

Quantifiers apply to the preceding item. The preceding item might be a character, metasequence, character code, or group. The following example uses a regular expression that matches the characters is followed by one or more s characters:

var pattern:RegExp = /iss+/g;
var string:String = "Mississippi";
trace(string.match(pattern));  // iss,iss


The following example matches all substrings composed of one or more iss sequences:

var pattern:RegExp = /(iss)+/g;
var string:String = "Mississippi";
trace(string.match(pattern));  // ississ


The | character normally matches the entire pattern on either side of the character. For example, the following code uses a regular expression that matches either re or ad:

var pattern:RegExp = /re|ad/g;
var string:String = "red is rad";
trace(string.match(pattern));  // re,ad


If parentheses are used, then the | operates on just the characters surrounded by the parentheses as shown in the following example:

var pattern:RegExp = /r(e|a)d/g;
var string:String = "red is rad";
trace(string.match(pattern));  // red,rad


Parentheses also enable you to use backreferences. Backreferences allow you to reference a grouped substring within the regular expression. You can reference each group numerically from 1 to 99. The following illustrates a backreference:

var pattern:RegExp = /(\d) = \1/g;
var string:String = "1 = 1, 2 = 1 + 1, 3 = 1 + 1 + 1, 4 = 4";
trace(string.match(pattern));  // 1 = 1,4 = 4


In the preceding example, the \1 references the first group in the regular expression: the substring matched by (\d). The following is a similar example. Notice that in this case, the pattern doesn't match 2 = 2 because the grouped substring must consist of two digits:

var pattern:RegExp = /(\d\d) = \1/g;
var string:String = "20 = 20, 2 = 2, 3 = 1 + 1 + 1, 40 = 40";
trace(string.match(pattern));  // 20 = 20,40 = 40


The following example uses two backreferences:

var pattern:RegExp = /(\d)(\d) = \2\1/g;
var string:String = "42 = 24, 2 = 2, 3 = 1 + 1 + 1, 40 = 40";
trace(string.match(pattern));  // 42 = 24
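A common practical use of a backreference is requiring a closing delimiter to be the same character as the opening one. The following is a minimal sketch of that idea, written in TypeScript rather than ActionScript so that it can be run outside Flash; the grouping and backreference syntax used here is the same in both. The function name extractQuoted and the sample string are hypothetical and chosen only for illustration.

```typescript
// Hypothetical helper: return the text inside each quoted string in source.
function extractQuoted(source: string): string[] {
  // Group 1 captures the opening quote; \1 requires the same character to close it.
  const pattern: RegExp = /(["'])(.*?)\1/g;
  const found: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(source);
  while (match !== null) {
    found.push(match[2]); // group 2 is the text between the quotes
    match = pattern.exec(source);
  }
  return found;
}

const quoted: string[] = extractQuoted("He said \"it's fine\" and 'done'");
console.log(quoted); // ["it's fine", "done"]
```

Because \1 must repeat whatever group 1 captured, the apostrophe inside the double-quoted string does not end the match early.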


You can use $1 through $99 as references to grouped substrings when using the String.replace() method. The following example illustrates how to use these references:

var pattern:RegExp = /([a-z]+) function ([a-zA-Z]+)\(\):([a-zA-Z]+)/g;
var string:String = "public function example():void";
trace(string.replace(pattern, "The function called $2 is declared as $1 with a return type of $3"));
// The function called example is declared as public with a return type of void


When you call the RegExp.exec() method, it returns an array containing the current matching substring followed by any grouped substrings, as in the following example:

var pattern:RegExp = /([a-z]+) function ([a-zA-Z]+)\(\):([a-zA-Z]+)/g;
var string:String = "public function example():void { trace('example');}";
var substrings:Array = pattern.exec(string);
trace(substrings[0]);  // public function example():void
trace(substrings[1]);  // public
trace(substrings[2]);  // example
trace(substrings[3]);  // void
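When the pattern has the g flag, each call to exec() resumes searching where the previous match ended, so calling it in a loop visits every match along with its groups. The following is a minimal sketch of that loop, written in TypeScript rather than ActionScript so that it can be run outside Flash; it assumes the same exec() and g flag behavior in both. The function name collectDeclarations and the sample input are hypothetical.

```typescript
// Hypothetical helper: collect the grouped substrings of every match in source.
function collectDeclarations(source: string): string[][] {
  // Same pattern as in the text; the g flag makes exec() resume after the last match.
  const pattern: RegExp = /([a-z]+) function ([a-zA-Z]+)\(\):([a-zA-Z]+)/g;
  const results: string[][] = [];
  let match: RegExpExecArray | null = pattern.exec(source);
  while (match !== null) {
    // match[0] is the whole match; match[1] through match[3] are the groups.
    results.push([match[1], match[2], match[3]]);
    match = pattern.exec(source);
  }
  return results;
}

const declarations: string[][] = collectDeclarations(
  "public function start():void {} private function getTotal():int {}"
);
console.log(declarations);
// [["public", "start", "void"], ["private", "getTotal", "int"]]
```

The loop ends when exec() returns null, which happens once no further match is found.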


You can also define named groups by using ?P<groupName> immediately following the opening parenthesis. In this case, RegExp.exec() returns an associative array in which the names of the captured groups are keys. The entire matched string is still returned at index 0. The following example is a rewrite of the preceding code such that it uses named groups:

var pattern:RegExp = /(?P<modifier>[a-z]+) function (?P<functionName>[a-zA-Z]+)\(\):(?P<returnType>[a-zA-Z]+)/g;
var string:String = "public function example():void {trace('example');}";
var substrings:Array = pattern.exec(string);
trace(substrings[0]);  // public function example():void
trace(substrings.modifier);  // public
trace(substrings.functionName);  // example
trace(substrings.returnType);  // void


You can also instruct the regular expression not to capture a group. For example, you might want to use a group with a quantifier, but without capturing the group. In such cases, you can use ?: immediately following the opening parenthesis. The following example uses a standard capturing group. Notice that the array returned by exec() has two elements because it captures the subpattern.

var pattern:RegExp = /i(s|p){2}/;
var string:String = "Mississippi";
trace(pattern.exec(string));  // iss,s


The following code rewrites the preceding example such that it uses a non-capturing group. In this example, the array returned by exec() has just one element:

var pattern:RegExp = /i(?:s|p){2}/;
var string:String = "Mississippi";
trace(pattern.exec(string));  // iss


Lookahead groups are non-capturing groups that can be either positive (the subpattern must appear) or negative (the subpattern must not appear). Positive lookahead groups are denoted by ?= following the opening parenthesis. A positive lookahead group says that the specified subpattern must appear at that position, but it is not included in the match. Positive lookahead groups are frequently used to match text that is followed by a specific pattern. For example, consider a string that contains filenames with file extensions. If you want to retrieve the filenames minus the file extensions from the string, you can use a positive lookahead group as in the following example:

var pattern:RegExp = /[a-z]+(?=\.[a-z]+)/g;
var string:String = "Copy the program.exe and run.bat files. Move file.txt.";
trace(string.match(pattern));  // program,run,file


You can use positive lookahead groups for complex patterns that would be extremely difficult or impossible to match otherwise. Consider the example of an alphanumeric password that must be between 6 and 20 characters and must contain at least 2 digits as well as at least 1 lowercase and 1 uppercase character. The following example uses positive lookahead groups to accomplish that goal:

// ^ and $ anchor the pattern so that the entire string must match
var pattern:RegExp = /^(?=.*\d.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z0-9]{6,20}$/;
var string:String = "a1b2cd3e4";  // No uppercase
trace(pattern.test(string));  // false
string = "aBcdefg";  // No digits
trace(pattern.test(string));  // false
string = "a1B2cd3e4";
trace(pattern.test(string));  // true
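The ^ and $ anchors matter for a validation pattern like this one. Without them, test() succeeds as long as some run of 6 to 20 alphanumeric characters inside the string satisfies the lookahead groups, so an overlong password or one containing other characters can still pass. The following sketch compares the two forms. It is written in TypeScript rather than ActionScript so that it can be run outside Flash; the lookahead and anchor syntax is the same in both. The variable names and sample passwords are hypothetical.

```typescript
// Unanchored: test() succeeds if any 6 to 20 character run in the string qualifies.
const unanchoredPassword: RegExp = /(?=.*\d.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z0-9]{6,20}/;
// Anchored: the entire string must consist of 6 to 20 alphanumeric characters.
const anchoredPassword: RegExp = /^(?=.*\d.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z0-9]{6,20}$/;

// Hypothetical sample inputs chosen for illustration.
const validPassword: string = "a1B2cd3e4";
const tooLongPassword: string = "a1B2cd3e4a1B2cd3e4a1B2cd3e4"; // 27 characters
const symbolPassword: string = "a1B2cd!!"; // ends with non-alphanumeric characters

console.log(unanchoredPassword.test(validPassword), anchoredPassword.test(validPassword));     // true true
console.log(unanchoredPassword.test(tooLongPassword), anchoredPassword.test(tooLongPassword)); // true false
console.log(unanchoredPassword.test(symbolPassword), anchoredPassword.test(symbolPassword));   // true false
```

Only the anchored form enforces the length limit and the alphanumeric requirement for the string as a whole.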


Negative lookahead groups are denoted by ?!. Negative lookahead groups work just like positive lookahead groups, but they define subpatterns that must not appear. The following example uses a negative lookahead group to match all filenames (with file extensions) that don't have the file extension .txt.

var pattern:RegExp = /[a-z]+(?!\.txt)\.([a-z]+)/g;
var string:String = "Copy the program.exe and run.bat files. Move file.txt.";
trace(string.match(pattern));  // program.exe,run.bat


The following example rewrites the preceding regular expression slightly so that it matches all filenames except those that have file extensions of .txt or .bat:

var pattern:RegExp = /[a-z]+(?!\.txt|\.bat)\.([a-z]+)/g;
var string:String = "Copy the program.exe and run.bat files. Move file.txt.";
trace(string.match(pattern));  // program.exe





Advanced ActionScript 3 with Design Patterns
ISBN: 0321426568
Year: 2004
Pages: 132