Recipe 10.10. Returning the Entire Line in Which a Match Is Found

Problem

You have a string or file that contains multiple lines. When a specific character pattern is found on a line, you want to return the entire line, not just the matched text.

Solution

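For a file, use the StreamReader.ReadLine method to read one line at a time and run a regular expression against each line. For a string, run the regular expression once against the whole string and locate the line that surrounds each match. Both approaches are shown in Example 10-10.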

Example 10-10. Returning the entire line in which a match is found

    using System.Collections.Generic;
    using System.IO;
    using System.Text.RegularExpressions;

    public static List<string> GetLines(string source, string pattern, bool isFileName)
    {
        string text = source;
        List<string> matchedLines = new List<string>();

        // If this is a file, read and test it one line at a time.
        if (isFileName)
        {
            using (FileStream FS = new FileStream(source, FileMode.Open,
                   FileAccess.Read, FileShare.Read))
            {
                using (StreamReader SR = new StreamReader(FS))
                {
                    Regex RE = new Regex(pattern, RegexOptions.Multiline);
                    while (text != null)
                    {
                        text = SR.ReadLine();
                        if (text != null)
                        {
                            // Run the regex on each line in the file.
                            MatchCollection theMatches = RE.Matches(text);
                            if (theMatches.Count > 0)
                            {
                                // Get the line if a match was found.
                                matchedLines.Add(text);
                            }
                        }
                    }
                }
            }
        }
        else
        {
            // Run the regex once on the entire string.
            Regex RE = new Regex(pattern, RegexOptions.Multiline);
            MatchCollection theMatches = RE.Matches(text);

            // Use these vars to remember the last line added to matchedLines
            // so that we do not add duplicate lines.
            int lastLineStartPos = -1;
            int lastLineEndPos = -1;

            // Get the line for each match.
            foreach (Match m in theMatches)
            {
                int lineStartPos = GetBeginningOfLine(text, m.Index);
                int lineEndPos = GetEndOfLine(text, (m.Index + m.Length - 1));

                // If this is not a duplicate line, add it.
                if (lastLineStartPos != lineStartPos &&
                    lastLineEndPos != lineEndPos)
                {
                    string line = text.Substring(lineStartPos,
                                    lineEndPos - lineStartPos);
                    matchedLines.Add(line);

                    // Reset line positions.
                    lastLineStartPos = lineStartPos;
                    lastLineEndPos = lineEndPos;
                }
            }
        }

        return (matchedLines);
    }

    public static int GetBeginningOfLine(string text, int startPointOfMatch)
    {
        if (startPointOfMatch > 0)
        {
            --startPointOfMatch;
        }

        if (startPointOfMatch >= 0 && startPointOfMatch < text.Length)
        {
            // Move to the left until the first '\n' char is found.
            for (int index = startPointOfMatch; index >= 0; index--)
            {
                if (text[index] == '\n')
                {
                    return (index + 1);
                }
            }
            return (0);
        }

        return (startPointOfMatch);
    }

    public static int GetEndOfLine(string text, int endPointOfMatch)
    {
        if (endPointOfMatch >= 0 && endPointOfMatch < text.Length)
        {
            // Move to the right until the first '\n' char is found.
            for (int index = endPointOfMatch; index < text.Length; index++)
            {
                if (text[index] == '\n')
                {
                    return (index);
                }
            }
            return (text.Length);
        }

        return (endPointOfMatch);
    }

The following method shows how to call the GetLines method with either a filename or a string:

    public static void TestGetLine()
    {
        // Get each line within the file TestFile.txt as a separate string.
        // ReadLine strips line terminators, so "\n" would never match in
        // file mode; "^" matches at the start of every line instead.
        Console.WriteLine();
        List<string> lines = GetLines(@"C:\TestFile.txt", "^", true);
        foreach (string s in lines)
            Console.WriteLine("MatchedLine: " + s);

        // Get the lines matching the text "Line" within the given string.
        Console.WriteLine();
        lines = GetLines("Line1\r\nLine2\r\nLine3\nLine4", "Line", false);
        foreach (string s in lines)
            Console.WriteLine("MatchedLine: " + s);
    }

Discussion

The GetLines method accepts three parameters:


source

The string or filename in which to search for a pattern


pattern

The regular expression pattern to apply to the source string


isFileName

Pass in true if the source is a filename or false if source is a string

This method returns a List<string> that contains each line in which a regular expression match was found.

The GetLines method can obtain the lines on which matches occur within either a string or a file. When isFileName is true, the source parameter is treated as a filename: the file is opened and read line by line, the regular expression is run against each line, and any line that produces a match is stored in the matchedLines List<string>. Using the ReadLine method of the StreamReader object saves you from having to determine where each line starts and ends. Determining where a line starts and ends in a string requires some work, as you shall see.
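The file branch boils down to "read a line, test it, keep it if it matches." A minimal sketch of that loop follows. It assumes a hypothetical helper named MatchingLines that accepts any TextReader (so a StringReader can stand in for a file) and uses Regex.IsMatch in place of the Matches(...).Count > 0 check; neither the helper nor the class name is part of the recipe.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class LineMatchSketch
{
    // Hypothetical helper (not part of the recipe): returns every line read
    // from 'reader' that contains at least one match for 'pattern'.
    public static List<string> MatchingLines(TextReader reader, string pattern)
    {
        Regex regex = new Regex(pattern);
        List<string> matchedLines = new List<string>();
        string line;

        // ReadLine returns null at the end of the input and strips the
        // line terminator ("\n", "\r", or "\r\n") from each line it returns.
        while ((line = reader.ReadLine()) != null)
        {
            if (regex.IsMatch(line))
            {
                matchedLines.Add(line);
            }
        }
        return matchedLines;
    }

    // Example call: a StringReader stands in for a StreamReader over a file.
    public static void Demo()
    {
        using (StringReader reader = new StringReader("Line1\r\nother\nLine3"))
        {
            foreach (string s in MatchingLines(reader, "Line"))
                Console.WriteLine("MatchedLine: " + s);
        }
    }
}
```

Because the reader hands back one terminator-free line at a time, no line-boundary arithmetic is needed in this branch.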

When isFileName is false, the regular expression is run once against the entire string passed in to the source parameter, which produces a MatchCollection. Each Match object in this collection is used to locate the line on which it occurs. The GetBeginningOfLine method starts at the position of the first character of the match (m.Index) and moves to the left one character at a time until it finds either a '\n' character or the beginning of the source string. This gives you the beginning of the line, which is placed in the lineStartPos variable. Next, the GetEndOfLine method starts at the last character of the match (m.Index + m.Length - 1) and moves to the right until it finds either a '\n' character or the end of the source string. This ending position is placed in the lineEndPos variable. All of the text between lineStartPos and lineEndPos is the line in which the match was found. Because only '\n' is treated as a line boundary, a line taken from text that uses "\r\n" line endings keeps its trailing '\r' character. Each line found this way is added to the matchedLines List<string>, which is returned to the caller.
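The same boundary search can be expressed with the built-in String.LastIndexOf and String.IndexOf methods instead of hand-written loops. The following is only a sketch under stated assumptions: the class and method names are hypothetical, duplicates are detected by comparing line start positions alone, and a trailing '\r' is trimmed, which the recipe's code does not do.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LineBoundsSketch
{
    // Hypothetical helper (not part of the recipe): returns each distinct
    // line of 'text' that contains a match for 'pattern'.
    public static List<string> LinesContainingMatches(string text, string pattern)
    {
        List<string> lines = new List<string>();
        int lastStart = -1;

        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.Multiline))
        {
            // Start of line: one past the nearest '\n' before the match.
            // LastIndexOf returns -1 when there is none, which yields 0.
            int start = m.Index == 0
                ? 0
                : text.LastIndexOf('\n', m.Index - 1) + 1;

            // End of line: the first '\n' at or after the last character of
            // the match, or the end of the string when there is none.
            int searchFrom = m.Length > 0 ? m.Index + m.Length - 1 : m.Index;
            int end = text.IndexOf('\n', searchFrom);
            if (end < 0)
            {
                end = text.Length;
            }

            // Matches arrive in ascending order, so a repeated start
            // position means this line has already been added.
            if (start != lastStart)
            {
                // Assumption: drop the '\r' left over from "\r\n" endings.
                lines.Add(text.Substring(start, end - start).TrimEnd('\r'));
                lastStart = start;
            }
        }
        return lines;
    }
}
```

For the input "Line1\r\nLine2 Line2b\r\nLine3\nLine4" and the pattern "Line", this returns four lines; the second line contains two matches but appears only once.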

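Something interesting you can do with the GetLines method is to pass the string "\n" as the pattern parameter when source is a string. This returns every line that ends with a newline character as a separate string in the List<string>; a final line with no trailing newline is not returned. The trick does not carry over to files: ReadLine strips the line terminator from each line it returns, so "\n" never matches when isFileName is true.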
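When the goal is simply to obtain every line of a string, reading it through a StringReader is one alternative that needs no regular expression at all. The sketch below assumes a hypothetical AllLines helper that is not part of the recipe; it relies on StringReader.ReadLine, which strips "\n", "\r", and "\r\n" terminators and also returns a final line that has no trailing newline.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SplitLinesSketch
{
    // Hypothetical helper (not part of the recipe): returns every line of
    // 'text' as a separate string, without line terminators.
    public static List<string> AllLines(string text)
    {
        List<string> lines = new List<string>();
        using (StringReader reader = new StringReader(text))
        {
            string line;
            // ReadLine returns null once the end of the string is reached.
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

Calling AllLines("Line1\r\nLine2\r\nLine3\nLine4") yields the four strings Line1 through Line4.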

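Note that if more than one match is found on a line, that line is added to the List<string> only once. In file mode, each line is tested once and added at most once; in string mode, the lastLineStartPos and lastLineEndPos variables are used to skip a line that has already been added.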

See Also

See the ".NET Framework Regular Expressions," "FileStream Class," and "StreamReader Class" topics in the MSDN documentation.



C# Cookbook
