6.6 Cookbook Regular Expressions

To wrap up this overview of how regular expressions are used in C# applications, here is a set of useful expressions that have been used in other environments. [1]

[1] These expressions were taken from the Perl Cookbook by Tom Christiansen and Nathan Torkington (O'Reilly), and updated for the C# environment by Brad Merrill of Microsoft.

  • Matching roman numerals:

     string p1 = "^m*(d?c{0,3}|c[dm])"
       + "(l?x{0,3}|x[lc])(v?i{0,3}|i[vx])$";
     string t1 = "vii";
     Match m1 = Regex.Match(t1, p1);
  • Swapping first two words:

     string t2 = "the quick brown fox";
     string p2 = @"(\S+)(\s+)(\S+)";
     Regex x2 = new Regex(p2);
     string r2 = x2.Replace(t2, "$3$2$1", 1);
  • Matching "keyword = value" patterns:

     string t3 = "myval = 3";
     string p3 = @"(\w+)\s*=\s*(.*)\s*$";
     Match m3 = Regex.Match(t3, p3);
  • Matching lines of at least 80 characters:

     string t4 = "********************"
       + "******************************"
       + "******************************";
     string p4 = ".{80,}";
     Match m4 = Regex.Match(t4, p4);
  • Extracting date/time values (MM/DD/YY HH:MM:SS):

     string t5 = "01/01/01 16:10:01";
     string p5 =
       @"(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)";
     Match m5 = Regex.Match(t5, p5);
  • Changing directories (for Windows):

     string t6 =
       @"C:\Documents and Settings\user1\Desktop\";
     string r6 = Regex.Replace(t6,
       @"\\user1\\",
       @"\user2\");
  • Expanding (%nn) hex escapes:

     string t7 = "%41"; // capital A
     string p7 = "%([0-9A-Fa-f][0-9A-Fa-f])";
     // uses a MatchEvaluator delegate
     string r7 = Regex.Replace(t7, p7,
       HexConvert);
  • Deleting C comments (imperfectly):

     string t8 = @"
     /*
      * this is an old cstyle comment block
      */
     ";
     string p8 = @"
       /\*  # match the opening delimiter
       .*?  # match a minimal number of characters
       \*/  # match the closing delimiter
     ";
     string r8 = Regex.Replace(t8, p8, "",
       RegexOptions.Singleline
       | RegexOptions.IgnorePatternWhitespace);
  • Removing leading and trailing whitespace:

     string t9a = "   leading";
     string p9a = @"^\s+";
     string r9a = Regex.Replace(t9a, p9a, "");

     string t9b = "trailing  ";
     string p9b = @"\s+$";
     string r9b = Regex.Replace(t9b, p9b, "");
  • Turning "\" followed by "n" into a real newline:

     string t10 = @"\ntest\n";
     string r10 = Regex.Replace(t10, @"\\n", "\n");
  • Detecting IP addresses:

     string t11 = "55.54.53.52";
     // \d\d? also admits single-digit octets such as 1.2.3.4
     string p11 = "^" +
       @"([01]?\d\d?|2[0-4]\d|25[0-5])\." +
       @"([01]?\d\d?|2[0-4]\d|25[0-5])\." +
       @"([01]?\d\d?|2[0-4]\d|25[0-5])\." +
       @"([01]?\d\d?|2[0-4]\d|25[0-5])" +
       "$";
     Match m11 = Regex.Match(t11, p11);
  • Removing leading path from filename:

     string t12 = @"c:\file.txt";
     string p12 = @"^.*\\";
     string r12 = Regex.Replace(t12, p12, "");
  • Joining lines in multiline strings:

     string t13 = @"this is
     a split line";
     string p13 = @"\s*\r?\n\s*";
     string r13 = Regex.Replace(t13, p13, " ");
  • Extracting all numbers from a string:

     string t14 = @"
     test 1
     test 2.3
     test 47
     ";
     string p14 = @"(\d+\.?\d*|\.\d+)";
     MatchCollection mc14 = Regex.Matches(t14, p14);
  • Finding all caps words:

     string t15 = "This IS a Test OF ALL Caps";
     string p15 = @"(\b[^\Wa-z0-9_]+\b)";
     MatchCollection mc15 = Regex.Matches(t15, p15);
  • Finding all lowercase words:

     string t16 = "This is A Test of lowercase";
     string p16 = @"(\b[^\WA-Z0-9_]+\b)";
     MatchCollection mc16 = Regex.Matches(t16, p16);
  • Finding all initial caps words:

     string t17 = "This is A Test of Initial Caps";
     string p17 = @"(\b[^\Wa-z0-9_][^\WA-Z0-9_]*\b)";
     MatchCollection mc17 = Regex.Matches(t17, p17);
  • Finding links in simple HTML:

     string t18 = @"
     <html>
     <a href=""http://windows.oreilly.com/news/first.htm"">first tag text</a>
     <a href=""http://windows.oreilly.com/news/next.htm"">next tag text</a>
     </html>
     ";
     string p18 = @"<A[^>]*?HREF\s*=\s*[""']?"
       + @"([^'"" >]+?)[ '""]?>";
     MatchCollection mc18 = Regex.Matches(t18, p18,
       RegexOptions.IgnoreCase
       | RegexOptions.Singleline);
  • Finding middle initials:

     string t19 = "Hanley A. Strappman";
     string p19 = @"^\S+\s+(\S)\S*\s+\S";
     Match m19 = Regex.Match(t19, p19);
  • Changing inch marks to quotation marks:

     string t20 = @"2' 2"" ";
     string p20 = "\"([^\"]*)";
     string r20 = Regex.Replace(t20, p20, "``$1''");
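
The hex-escape recipe above passes a HexConvert method to Regex.Replace without showing its body, and none of the recipes show how the resulting Match or MatchCollection is read. The following is a minimal sketch of how that might look. The class name CookbookDemo, the ExpandHexEscapes wrapper, and the body of HexConvert are assumptions made for illustration and are not taken from the book.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class CookbookDemo
{
    // Hypothetical helper (not from the book): turns the two hex digits
    // captured in group 1 of a "%nn" match into the corresponding character.
    public static string HexConvert(Match m)
    {
        int code = int.Parse(m.Groups[1].Value, NumberStyles.HexNumber);
        return ((char)code).ToString();
    }

    // Hypothetical wrapper around the "%nn" recipe; Regex.Replace calls
    // HexConvert once for every match it finds.
    public static string ExpandHexEscapes(string input)
    {
        return Regex.Replace(input, "%([0-9A-Fa-f][0-9A-Fa-f])",
            new MatchEvaluator(HexConvert));
    }

    public static void Main()
    {
        Console.WriteLine(ExpandHexEscapes("%41%42c"));   // prints "ABc"

        // Reading captured groups from a single Match
        // (the "keyword = value" recipe).
        Match m3 = Regex.Match("myval = 3", @"(\w+)\s*=\s*(.*)\s*$");
        if (m3.Success)
        {
            Console.WriteLine(m3.Groups[1].Value);        // the keyword
            Console.WriteLine(m3.Groups[2].Value);        // the value
        }

        // Walking a MatchCollection (the all-caps recipe).
        MatchCollection mc15 = Regex.Matches(
            "This IS a Test OF ALL Caps", @"(\b[^\Wa-z0-9_]+\b)");
        foreach (Match m in mc15)
        {
            Console.WriteLine(m.Value);
        }
    }
}
```

Group 0 of a Match always holds the whole matched text, so the parenthesized captures in the recipes start at Groups[1]. Checking Success before reading groups avoids working with empty values when the pattern does not match.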


C# in a Nutshell, Second Edition
ISBN: 0596005261
Year: 2005
Pages: 963
