Regular Expressions Primer


Regular expressions are among the most powerful tools available for manipulating text and data. A regular expression is a pattern that gives a developer fine-grained control over searching, replacing, and extracting data. To manipulate large amounts of text productively, you need to be proficient with regular expressions.

To understand regular expressions, you must be familiar with the characters that make them up. Learning regular expressions is almost like learning another language. Tables 12.1 through 12.6 list some of the characters most frequently used in regular expressions.

Table 12.1. Using Characters with Regular Expressions

Character       Action of Character
()\n            Matches a clause again, as numbered by its left parenthesis (n is the clause number)
(\w+)\s+\1      Example: matches any word that occurs twice in a row, such as "How is is your day?"
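
The backreference row can be tried out with a short sketch. It uses Python's re module as a stand-in, not the VBScript engine this chapter describes; the name doubled_word and the added \b anchors are assumptions made for the illustration.

```python
import re

# (\w+) captures a word as clause 1, \s+ skips the whitespace after it,
# and \1 requires the same text to appear again.
# The \b anchors are an addition so that only whole words are compared.
doubled_word = re.compile(r"\b(\w+)\s+\1\b")

match = doubled_word.search("How is is your day?")
print(match.group(0))  # is is
print(match.group(1))  # is
```

A sentence with no repeated word gives no match, so search returns None.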

Table 12.2. Multiple Searches of the Clause in the Regular Expression Using Repetition

Character       Action of Character
*               Matches zero or more occurrences; same as {0,}
{x,y}           Matches x to y occurrences of a regular expression
+               Matches one or more occurrences; same as {1,}
{x,}            Matches x or more occurrences of a regular expression
\s{2,}          Example: matches at least two space characters
?               Matches zero or one occurrence; same as {0,1}
{x}             Matches exactly x occurrences of a regular expression
\d{7}           Example: matches 7 digits
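
As a rough illustration of the repetition characters, the following sketch uses Python's re module, which shares this quantifier syntax but is not the VBScript engine. The sample strings and the name seven_digits are made up for the example.

```python
import re

# {7} demands exactly seven digits; fullmatch rejects shorter or longer input.
seven_digits = re.compile(r"\d{7}")
print(bool(seven_digits.fullmatch("5551234")))  # True
print(bool(seven_digits.fullmatch("555123")))   # False

# {2,} means "two or more"; here it collapses runs of spaces into one space.
collapsed = re.sub(r"\s{2,}", " ", "too    many   spaces")
print(collapsed)  # too many spaces

# ? makes the preceding item optional, the same as {0,1}.
spellings = re.findall(r"colou?r", "color and colour")
print(spellings)  # ['color', 'colour']
```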

Table 12.3. Using Characters to Search Through Literals

Character       Action of Character
\{              Matches {
\}              Matches }
\xxx            Matches the ASCII character expressed by the octal number xxx
\50             Example: matches "(" or chr(40)
\+              Matches +
\[              Matches [
\]              Matches ]
\f              Matches a form feed
\v              Matches a vertical tab
\.              Matches .
\\              Matches \
\*              Matches *
\r              Matches a carriage return
\?              Matches ?
\xdd            Matches the ASCII character expressed by the hex number dd
\x28            Example: matches "(" or chr(40)
\t              Matches a horizontal tab
\(              Matches (
\)              Matches )
Alphanumeric    Matches alphabetical and numerical characters literally
\n              Matches a new line
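
Escaped literals are easy to check interactively. The sketch below uses Python's re module as a stand-in for the VBScript engine; one known difference is that Python needs a leading zero for an octal escape (\050), because it would read \50 as a reference to clause 50.

```python
import re

# A backslash strips a metacharacter of its special meaning.
calls = re.findall(r"\(\d+\)", "call f(1) then g(22)")
print(calls)  # ['(1)', '(22)']

# Hex 28 and octal 50 are both decimal 40, the code for "(".
by_hex = re.findall(r"\x28", "f(x)")
by_octal = re.findall(r"\050", "f(x)")  # Python spelling of the octal escape
print(by_hex, by_octal)  # ['('] ['(']

# re.escape builds an escaped pattern from arbitrary literal text.
literal = "1+1=2?"
escaped = re.escape(literal)
print(bool(re.fullmatch(escaped, literal)))  # True
```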

Table 12.4. Using Characters to Search Strings

Character       Action of Character
\b              Matches a word boundary
es\b            Example: matches es in "lots of movies"
^               Matches only the beginning of a string
^C              Example: matches the first C in "Current Characters"
\B              Matches a non-word boundary
$               Matches only the end of a string
s$              Example: matches the last s in "She sells sea shells"
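
Position matching can be sketched as follows, again with Python's re module standing in for the VBScript engine. The variable names are illustrative only.

```python
import re

text = "She sells sea shells"

# ^ anchors at the start of the string, $ at the end.
start_pos = re.search(r"^S", text).start()
end_pos = re.search(r"s$", text).start()
print(start_pos, end_pos)  # 0 19

# \b matches at a word boundary; \B matches everywhere else.
word_end = re.findall(r"es\b", "lots of movies")
inside_word = re.findall(r"\Bea", text)   # "ea" preceded by a word character
at_word_start = re.findall(r"\bea", text) # no word in text starts with "ea"
print(word_end, inside_word, at_word_start)  # ['es'] ['ea'] []
```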

Table 12.5. Developing Complex Regular Expressions with Alternation and Grouping

Character         Action of Character
|                 Alternation joins clauses into one regular expression and then matches any of the individual clauses
(uv)|(wx)|(yz)    Example: matches "uv" or "wx" or "yz"
()                Groups a clause to create a clause; may be nested
(xy)?(z)          Example: matches "xyz" or "z"
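
A minimal sketch of alternation and grouping, written against Python's re module rather than the VBScript engine. The names either and optional_prefix and the test string are assumptions for the example.

```python
import re

# Alternation: any one of the three clauses may match.
either = re.compile(r"(uv)|(wx)|(yz)")
found = [m.group(0) for m in either.finditer("uv-wx-yz-ab")]
print(found)  # ['uv', 'wx', 'yz']

# Grouping: (xy)? makes the whole clause "xy" optional.
optional_prefix = re.compile(r"(xy)?(z)")
with_prefix = optional_prefix.fullmatch("xyz").groups()
without_prefix = optional_prefix.fullmatch("z").groups()
print(with_prefix)     # ('xy', 'z')
print(without_prefix)  # (None, 'z')
```

When the optional clause takes no part in the match, Python reports it as None; other engines may report an empty string instead.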

Table 12.6. Customized Grouping by Putting Expressions Within Braces

Character       Action of Character
\s              Matches any space character; same as [ \t\r\n\v\f]
\S              Matches any non-space character; same as [^ \t\r\n\v\f]
.               Matches any character except \n
\d              Matches any digit; same as [0-9]
\D              Matches any non-digit; same as [^0-9]
[xyz]           Matches any one character enclosed in the character set
\w              Matches any word character; same as [a-zA-Z_0-9]
\W              Matches any non-word character; same as [^a-zA-Z_0-9]
[^xyz]          Matches any character not enclosed in the character set
                Example: matches "r" in "horse"
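
The character classes can be exercised with a short sketch. It assumes Python's re module, where the re.ASCII flag is needed to limit \d, \w, and \s to the ASCII sets listed in the table. The sample string is made up.

```python
import re

sample = "Room 12B, floor_3"

digits = re.findall(r"\d", sample, re.ASCII)
words = re.findall(r"\w+", sample, re.ASCII)
punctuation = re.findall(r"[^\w\s]", sample, re.ASCII)  # neither word nor space
early_letters = re.findall(r"[a-e]", "horse")           # only letters a through e

print(digits)         # ['1', '2', '3']
print(words)          # ['Room', '12B', 'floor_3']
print(punctuation)    # [',']
print(early_letters)  # ['e']
```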

A few basic concepts apply whenever you use regular expressions. The string that defines a regular expression is called a pattern, and the pattern must be set before the regular expression can be used.

Two properties control how the pattern is applied. The Global property decides whether the regular expression is tested against all possible matches in a string or only the first one. The IgnoreCase property decides whether the match is case-insensitive. Both default to False. The Test method searches a string and reports whether the pattern can be matched: it returns True if a match is found and False if not.

Another helpful method is Replace. Replace takes two strings as its arguments, a search string and a replace string, and tries to match the regular expression in the search string. If it finds a match, it replaces the match with the replace string. If it finds no match, it returns the original search string unchanged. The Execute method also takes a search string, but instead of returning a string it returns a Matches collection object that contains a Match object for every match it finds.
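
For comparison, here is a sketch of the closest Python re equivalents: sub plays the role of Replace and finditer plays the role of Execute. These are analogues chosen for illustration, not the VBScript API. One difference is that Python's sub replaces every match unless count=1 is passed.

```python
import re

pattern = re.compile(r"\d+")

# Like Replace: each match is swapped for the replacement string.
replaced = pattern.sub("#", "order 66, aisle 7")
first_only = pattern.sub("#", "order 66, aisle 7", count=1)
untouched = pattern.sub("#", "no digits here")  # no match, input returned as is
print(replaced)    # order #, aisle #
print(first_only)  # order #, aisle 7
print(untouched)   # no digits here

# Like Execute: one match object per match found.
matches = list(pattern.finditer("order 66, aisle 7"))
print(len(matches))  # 2
```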

Two properties of the Matches collection are worth mentioning: Item and Count. The Item property lets you access Match objects in the collection either directly by index or one after another in order. The Count property holds the number of Match objects in the collection.

Each Match object has a read-only property called FirstIndex, which gives the position in the original string where the match occurred. The position is a zero-based offset, so the first character of the string is at position 0. Two other properties are helpful: Value and Length. Value contains the matched text and is the default property of the Match object. Length holds the total length of the matched text.
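
The same information is available from Python's match objects, which makes a convenient way to experiment. In this sketch the mapping to Count, Item, FirstIndex, Length, and Value is an analogy, and the variable names are illustrative.

```python
import re

matches = list(re.finditer(r"s\w+", "She sells sea shells"))

count = len(matches)   # like Count
second = matches[1]    # like Item(1), zero-based
rows = []
for m in matches:
    first_index = m.start()  # zero-based offset, like FirstIndex
    value = m.group(0)       # matched text, like Value
    length = len(value)      # like Length
    rows.append((first_index, length, value))

print(count)            # 3
print(second.group(0))  # sea
print(rows)             # [(4, 5, 'sells'), (10, 3, 'sea'), (14, 6, 'shells')]
```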

To use regular expressions as objects, you need VBScript Version 5.0 or later. The VBScript RegExp object provides functionality similar to JScript's RegExp and String objects, while its syntax is closest to that of Visual Basic.

The VBScript regular expression engine is implemented as a COM object, so it can also be called from outside VBScript. Visual Basic, for example, can use it to work with regular expressions.

Keep in mind that regular expressions are not completely consistent from program to program. When you move a pattern between programs, verify that it still produces the desired effect. If discrepancies arise, consult each program's online manual pages.
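
One concrete discrepancy, shown here in Python's re module as an illustration: by default Python's \d accepts any Unicode decimal digit, so it is not the same as [0-9] unless the re.ASCII flag is set. The sample character is chosen only for the demonstration.

```python
import re

arabic_indic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE

unicode_digit = bool(re.fullmatch(r"\d", arabic_indic_three))
ascii_digit = bool(re.fullmatch(r"\d", arabic_indic_three, re.ASCII))
bracket_digit = bool(re.fullmatch(r"[0-9]", arabic_indic_three))

print(unicode_digit)  # True
print(ascii_digit)    # False
print(bracket_digit)  # False
```

A pattern that validates input in one engine can therefore accept different strings in another.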


Special Edition Using ASP.NET
ISBN: 0789725606
Year: 2002
Pages: 233