Recipe 5.40. Compiling Regular Expressions for Speed


Problem

You want to compile a regular expression to maximize runtime speed.

Solution

Sample code folder: Chapter 05\RegexDLL

There are two steps to this solution, best described by working through an example. The first step is to run the code to create the compiled DLL file, and the second is to use the new compiled regular expression in one or more applications.

Discussion

First, run the following code one time only to compile and create a DLL file containing two regular expressions: one with a pattern designed to find all numbers in a string, and one that matches words:

 Imports System.Text.RegularExpressions

 ' …Later, in a method…
 Dim numPattern As String = _
    "[-+]?([0-9]*\.)?[0-9]+([eE][-+]?[0-9]+)?"
 Dim wordPattern As String = "\w+"
 Dim whichNamespace As String = "NumbersRegex"
 Dim isPublic As Boolean = True
 Dim compNumbers As New RegexCompilationInfo(numPattern, _
    RegexOptions.Compiled, "RgxNumbers", _
    whichNamespace, isPublic)
 Dim compWords As New RegexCompilationInfo(wordPattern, _
    RegexOptions.Compiled, "RgxWords", whichNamespace, _
    isPublic)
 Dim compAll() As RegexCompilationInfo = _
    {compNumbers, compWords}
 Dim whichAssembly As New _
    System.Reflection.AssemblyName("RgxNumbersWords")
 Regex.CompileToAssembly(compAll, whichAssembly)

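This code creates a new file named RgxNumbersWords.dll that contains both compiled regular expressions, exposed as the classes RgxNumbers and RgxWords in the NumbersRegex namespace. The file is written to the application's current directory, which is normally the folder in which the executable program is located.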

To use the new DLL in an application, you need to add a reference to it. Right-click on References in the Solution Explorer, select Add Reference, click the Browse tab, find the DLL file in the folder where the application's EXE file is located, and select it to add the reference. Figure 5-45 shows the new reference in the Solution Explorer.

Figure 5-45. The DLL file named RgxNumbersWords added to the References list in the Solution Explorer


You also need to import the namespace defined in this DLL into your application. Either add an Imports statement at the top of your source code or, in the Project Properties window, select the References tab and place a checkmark next to the name of the namespace, as shown in Figure 5-46.

Figure 5-46. Importing a namespace via the Project Properties window


Once the new DLL is referenced and its namespace has been imported, you can use the compiled regular expressions in an application. The following code uses the new RgxNumbers regular expression to count the numbers in a string:

 Imports System.Text.RegularExpressions
 Imports NumbersRegex   ' Not needed if imported through Project Properties

 ' …Later, in a method…
 Dim source As String = _
    "Making a Pi (3.1415926) is easy as One 1 Two 2 Three 3"
 Dim parser As New RgxNumbers
 Dim totalMatches As Integer = parser.Matches(source).Count
 MsgBox(source & vbNewLine & "Number count: " & _
    totalMatches.ToString())

Figure 5-47 shows the result of running this code to determine how many numbers are in the sample string.

Figure 5-47. Quickly counting numbers in a string using the compiled regular expression
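
To see which substrings the number pattern actually picks up, the same pattern can be run through the ordinary Regex class, without building or referencing the DLL. The following is only a minimal sketch: the module name NumberCountSketch and the helper CountNumbers are hypothetical names introduced for illustration and are not part of the recipe's sample code.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NumberCountSketch
    ' Same pattern the recipe compiles into the RgxNumbers class.
    Private Const NumPattern As String = _
        "[-+]?([0-9]*\.)?[0-9]+([eE][-+]?[0-9]+)?"

    ' Hypothetical helper: counts the numbers found in a string
    ' using the interpreted (non-precompiled) Regex class.
    Function CountNumbers(ByVal source As String) As Integer
        Return Regex.Matches(source, NumPattern).Count
    End Function

    Sub Main()
        Dim source As String = _
            "Making a Pi (3.1415926) is easy as One 1 Two 2 Three 3"

        ' List each match so the count can be checked by eye.
        For Each found As Match In Regex.Matches(source, NumPattern)
            Console.WriteLine(found.Value)
        Next
        Console.WriteLine("Number count: " & _
            CountNumbers(source).ToString())
    End Sub
End Module
```

For the sample string, the matches are 3.1415926, 1, 2, and 3, so the reported count is 4. The spelled-out words One, Two, and Three are not matched because the pattern requires at least one digit.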


See Also

Recipe 5.37 also discusses regular expression processing.




Visual Basic 2005 Cookbook: Solutions for VB 2005 Programmers
ISBN: 0596101775
Year: 2006
Pages: 400
