Section 7.5. The System.Text.RegularExpressions Namespace


The System.Text.RegularExpressions namespace contains classes that provide access to the .NET Framework's regular expression engine.

In its simplest form, a regular expression is a text string representing a pattern that other strings may or may not match. In more complicated forms, a regular expression is a kind of programming statement. For instance, the expression:

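     s/ab*c/def/

says to search the given string for text that matches the regular expression ab*c (an a, followed by zero or more b characters, followed by a c). If a match exists, the matching text is replaced with the string def. This s/pattern/replacement/ notation comes from tools such as sed and Perl; in .NET, the pattern and the replacement are passed as ordinary strings to the classes in this namespace. Here are some simple regular expressions for pattern matching: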


Single character

This is matched only by itself. For example, the letter 'q' matches itself.


Dot (.)

This is matched by any character except the newline character.


Selection from Character Set

A string of characters in square brackets matches any single character from the string of characters. For example, [abc] matches the single character a, b, or c. A dash can also be used in the character list to indicate a range; [0-9] matches any single digit. The text [0-9a-z] matches any single digit or any single lowercase letter, and [a-zA-Z] matches any single lowercase or uppercase letter.

The ^ symbol negates the match when it appears immediately inside the opening square bracket. For instance, [^0-9] matches any character except a digit.


Special Match Abbreviations

\d matches any single digit; \D matches any single non-digit.

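\w matches any single word character: a letter, a digit, or an underscore. For ASCII text it is equivalent to [a-zA-Z0-9_], although by default the .NET engine also accepts letters and digits from other Unicode scripts. \W is the negation of \w.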


Asterisk (*)

The asterisk matches zero or more repeated instances of the single character preceding the asterisk. For instance, the regular expression \da*\d matches any string beginning with a single digit, continuing with zero or more a characters, and ending with a single digit, as with 01 or 0aaa1.


Plus Sign (+)

The plus sign matches one or more repeated instances of the single character preceding the plus sign. It is similar to the asterisk, but it requires at least one matching character. For example, the regular expression \da+\d matches any string beginning with a single digit, continuing with one or more a characters, and ending with a single digit, as with 0a1 or 0aaa1, but not 01.


Question Mark (?)

The question mark matches zero or one instance of the single character preceding the question mark. For example, the regular expression \da?\d is matched by any string beginning with a single digit, continuing with zero or one a character, and ending with a single digit, as with 01 and 0a1.


General Multiplier

A set of curly braces with two comma-delimited integer values indicates a match repeated a specific number of times. The format is {x,y}, where x and y are nonnegative integers with x no greater than y, and it matches if and only if there are at least x but at most y instances of the single character preceding the opening brace. For example, the regular expression \da{5,10}\d matches any string beginning with a single digit, continuing with at least 5 but at most 10 a characters, and ending with a single digit, as with 0aaaaaa1.

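You can leave out y. Thus, {x,} means "at least x," and {x} with no comma means "exactly x." To say "at most y," write {0,y}; the .NET engine does not treat {,y} as a multiplier and instead matches those characters literally.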


Escaped Characters

Several characters have special meaning within regular expression patterns, such as [ and ?. These characters must be escaped with the backslash character (\) before they can be matched as ordinary non-special characters. For instance, \[ matches an opening bracket, \? matches a question mark, and \\ matches a backslash.
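The pattern elements listed above can be tried out with a short console program. The following is a minimal sketch, not code from the original text: the module name and the MatchesWhole helper are hypothetical names chosen for illustration, and the helper anchors each pattern with ^ and $ so that the entire input must match rather than a substring of it. It relies on the shared Regex.IsMatch and Regex.Escape methods of the Regex class in this namespace.

```vb
Imports System
Imports System.Text.RegularExpressions

Module PatternElementsSketch
    ' Hypothetical helper: wraps the pattern so that the whole input
    ' must match, not just part of it.
    Function MatchesWhole(ByVal input As String, ByVal pattern As String) As Boolean
        Return Regex.IsMatch(input, "^(?:" & pattern & ")$")
    End Function

    Sub Main()
        ' Character sets and negated sets
        Console.WriteLine(MatchesWhole("b", "[abc]"))               ' True
        Console.WriteLine(MatchesWhole("7", "[^0-9]"))              ' False

        ' Special match abbreviations
        Console.WriteLine(MatchesWhole("5", "\d"))                  ' True
        Console.WriteLine(MatchesWhole("a_b", "\w\w\w"))            ' True

        ' Asterisk, plus sign, and question mark
        Console.WriteLine(MatchesWhole("01", "\da*\d"))             ' True
        Console.WriteLine(MatchesWhole("01", "\da+\d"))             ' False
        Console.WriteLine(MatchesWhole("0a1", "\da?\d"))            ' True
        Console.WriteLine(MatchesWhole("0aa1", "\da?\d"))           ' False

        ' General multiplier
        Console.WriteLine(MatchesWhole("0aaaaaa1", "\da{5,10}\d"))  ' True
        Console.WriteLine(MatchesWhole("0aaaa1", "\da{5,10}\d"))    ' False

        ' Escaped characters: \[ \? and \] match the literal characters
        Console.WriteLine(MatchesWhole("[?]", "\[\?\]"))            ' True

        ' Regex.Escape adds the backslashes for you, so arbitrary text
        ' can be matched literally.
        Dim literal As String = "a[1]?"
        Console.WriteLine(MatchesWhole(literal, Regex.Escape(literal)))  ' True
    End Sub
End Module
```

Anchoring is a design choice for the sketch only: without ^ and $, IsMatch returns True whenever any part of the input matches the pattern.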

The System.Text.RegularExpressions namespace includes the Regex class, whose instances represent regular expressions. Here's a simple example of using the Regex class. Note that IsMatch returns True if the pattern matches any part of the input string.

     Dim matchPattern As New System.Text.RegularExpressions.Regex( _
        "\da{3,5}\d")
     MsgBox(matchPattern.IsMatch("0a1"))      ' Displays False
     MsgBox(matchPattern.IsMatch("0aaa1"))    ' Displays True
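The substitution described at the start of this section, which replaces text matching ab*c with def, can be approximated with the Regex.Replace method. This is a minimal sketch rather than code from the original text; the module name and the ReplaceAbc helper are hypothetical names used only for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ReplaceSketch
    ' Hypothetical helper: replaces every substring that matches
    ' ab*c with def and leaves the rest of the input unchanged.
    Function ReplaceAbc(ByVal input As String) As String
        Dim pattern As New Regex("ab*c")
        Return pattern.Replace(input, "def")
    End Function

    Sub Main()
        Console.WriteLine(ReplaceAbc("xacy"))        ' Displays xdefy
        Console.WriteLine(ReplaceAbc("abbbc abc"))   ' Displays def def
        Console.WriteLine(ReplaceAbc("abd"))         ' Displays abd (no match)

        ' The Match method reports the text and position of the first match.
        Dim m As Match = Regex.Match("xxabbcxx", "ab*c")
        If m.Success Then
            ' Displays abbc at index 2
            Console.WriteLine(m.Value & " at index " & m.Index.ToString())
        End If
    End Sub
End Module
```

Replace changes every match in the input; when only the first match should change, the instance overload that also takes a count argument can be used instead.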




