Regular expressions are among the most powerful tools available for manipulating text and data. A regular expression is a pattern that gives a developer precise control over searching, replacing, and extracting data. To manipulate large amounts of data productively, you need to be proficient with regular expressions, and that starts with knowing the characters that make them up. Learning regular expressions is almost like learning another language. Tables 12.1 through 12.6 list some of the most frequently used characters in regular expressions.
Table 12.1. Using Characters with Regular Expressions
Table 12.2. Multiple Searches of the Clause in the Regular Expression Using Repetition
Table 12.3. Using Characters to Search Through Literals
Table 12.4. Using Characters to Search Strings
Table 12.5. Developing Complex Regular Expressions with Alternation and Grouping
Table 12.6. Customized Grouping by Putting Expressions Within Braces
You need to understand a few basic concepts when using regular expressions. A string that lays out a regular expression is called a pattern. The pattern must be set, through the Pattern property, before the regular expression can be used. The Global property decides whether the regular expression is tested against all possible matches in a string or only the first; it defaults to False. The IgnoreCase property decides whether matching ignores case; it also defaults to False.
The Test method searches a string and reports whether the pattern can be matched. If it can, Test returns True; if it cannot, Test returns False. Another helpful method is Replace. Replace takes two strings as its arguments, a search string and a replace string, and tries to match the regular expression in the search string. If it finds a match, it returns the search string with the match replaced by the replace string. If it finds no match, it returns the original search string. The Execute method also searches a string, but rather than returning a string, it returns a Matches collection object that contains a Match object for every match it finds.
Two properties of the Matches collection are worth mentioning: Item and Count. The Item property gives access to the Match objects, either by index or incrementally in a loop. The Count property holds the number of Match objects in the collection. Each Match object has a read-only value called FirstIndex, which is the position in the original string where the match occurs, expressed as a zero-based offset from the first character. Two other Match properties are helpful: Value and Length. Value contains the matched text and is the default property of the Match object. Length is the total length of the matched string.
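
The properties and methods described above can be sketched in a few lines of VBScript. This is a minimal illustration, not a definitive implementation: the function names HasDigits, MaskDigits, and DescribeMatches and the patterns they use are hypothetical and chosen only for the example. It also sets the RegExp object's Global property, which controls whether all matches or only the first are processed.

```vbscript
' Test: returns True when the text contains a run of digits.
Function HasDigits(text)
    Dim re
    Set re = New RegExp
    re.Pattern = "\d+"          ' the pattern must be set before use
    HasDigits = re.Test(text)
End Function

' Replace: returns the text with every run of digits replaced by "#".
Function MaskDigits(text)
    Dim re
    Set re = New RegExp
    re.Pattern = "\d+"
    re.Global = True            ' with the default (False), only the first match is replaced
    MaskDigits = re.Replace(text, "#")
End Function

' Execute: walks the Matches collection and describes each Match
' as "FirstIndex:Value:Length", joining the descriptions with ";".
Function DescribeMatches(text)
    Dim re, matches, m, i, result
    Set re = New RegExp
    re.Pattern = "[a-z]+"
    re.IgnoreCase = True        ' let [a-z] match uppercase letters too
    re.Global = True            ' collect every match, not just the first
    Set matches = re.Execute(text)
    result = ""
    For i = 0 To matches.Count - 1
        Set m = matches.Item(i)
        If i > 0 Then result = result & ";"
        result = result & m.FirstIndex & ":" & m.Value & ":" & m.Length
    Next
    DescribeMatches = result
End Function
```

For example, MaskDigits("a1 b22") returns "a# b#", and DescribeMatches("Ab 12 cd") returns "0:Ab:2;6:cd:2", which shows the zero-based FirstIndex of each match alongside its Value and Length.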
To use regular expressions as objects, you need VBScript version 5.0 or later. The VBScript RegExp object emulates JScript's RegExp and String objects, although VBScript's syntax is closest to Visual Basic. Regular expressions are also powerful because they can be called from sources outside VBScript: the VBScript regular expression engine is set up as a COM object, so Visual Basic, for example, can use it to work with regular expressions. Keep in mind that regular expressions might not be completely consistent from program to program. When you move between programs, check that you are getting the effect you intend. If questions arise about discrepancies, consult the online manual pages.
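
Because the engine is exposed through COM, it can be created by its programmatic identifier instead of with New RegExp. The sketch below uses CreateObject with the ProgID "VBScript.RegExp"; the function name IsFiveDigits and its pattern are hypothetical and serve only as an example. The same late-bound call should also work from Visual Basic, assuming the VBScript engine is installed on the machine.

```vbscript
' Creates the regular expression engine through COM (late binding)
' and tests whether the text consists of exactly five digits.
Function IsFiveDigits(text)
    Dim re
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^\d{5}$"      ' anchors require the whole string to match
    IsFiveDigits = re.Test(text)
End Function
```

Creating the object by ProgID keeps the calling code independent of the VBScript language itself, which is what lets other COM-aware hosts reuse the same engine.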