Problem
You need to match newlines in text.

Solution
Use \n or \r. See also the flag constant Pattern.MULTILINE, which makes ^ and $ match at the beginning and end of each line rather than only at the beginning and end of the entire input.

Discussion
While line-oriented tools from Unix such as sed and grep match regular expressions one line at a time, not all tools do. The sam text editor from Bell Laboratories was the first interactive tool I know of to allow multiline regular expressions; the Perl scripting language followed shortly. In the Java API, the newline character by default has no special significance to ^ and $. The BufferedReader method readLine( ) strips the line terminator it finds at the end of each line. If you read in gobs of characters using some method other than readLine( ), you may have some number of \n, \r, or \r\n sequences in your text string.[4] Normally the regex package treats all of these as line terminators, equivalent to \n. If you want only \n to be recognized, pass the UNIX_LINES flag to the Pattern.compile( ) method.
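The effect of UNIX_LINES can be seen in a small sketch. The class name UnixLinesDemo and the helper found( ) are made up for illustration; only Pattern.compile( ), Pattern.MULTILINE, and Pattern.UNIX_LINES come from the API. It assumes the input uses a bare \r as its line separator.

```java
import java.util.regex.Pattern;

class UnixLinesDemo {
    /** True if the regex is found anywhere in the input under the given flags. */
    static boolean found(String regex, String input, int flags) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String crSeparated = "one\rtwo";  // lines separated by a bare carriage return
        String nlSeparated = "one\ntwo";  // lines separated by a newline

        // By default \r counts as a line terminator, so $ matches before it.
        System.out.println(found("one$", crSeparated, Pattern.MULTILINE));

        // With UNIX_LINES only \n counts, so $ no longer matches before \r.
        System.out.println(found("one$", crSeparated,
            Pattern.MULTILINE | Pattern.UNIX_LINES));

        // \n is still recognized under UNIX_LINES.
        System.out.println(found("one$", nlSeparated,
            Pattern.MULTILINE | Pattern.UNIX_LINES));
    }
}
```

Run as written, this should print true, false, and true, in that order.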
In Unix, ^ and $ are commonly used to match the beginning or end of a line, respectively. In this API, the regex metacharacters ^ and $ by default ignore line terminators inside the text and match only at the beginning and the end, respectively, of the entire string. However, if you pass the MULTILINE flag into Pattern.compile( ), these expressions match just after or just before, respectively, a line terminator; $ also matches the very end of the string. The line ending is otherwise an ordinary character: . matches it if you also pass the DOTALL flag (by default, . matches any character except a line terminator), and, if you want to know exactly where it is, \n or \r in the pattern match it as well. In other words, apart from its effect on ., ^, and $, a newline character is just another character to this API. See the sidebar Pattern.compile( ) Flags. An example of newline matching is shown in Example 4-6.

Example 4-6. NLMatch.java

    import java.util.regex.*;

    /**
     * Show line ending matching using regex class.
     * @author Ian F. Darwin, ian@darwinsys.com
     * @version $Id: ch04.xml,v 1.4 2004/05/04 20:11:27 ian Exp $
     */
    public class NLMatch {
        public static void main(String[] argv) {

            String input = "I dream of engines\nmore engines, all day long";
            System.out.println("INPUT: " + input);
            System.out.println();

            String[] patt = {
                "engines.more engines",
                "engines$"
            };

            for (int i = 0; i < patt.length; i++) {
                System.out.println("PATTERN " + patt[i]);

                boolean found;

                // Default flags: . skips line terminators, $ matches only at the end
                Pattern p1l = Pattern.compile(patt[i]);
                found = p1l.matcher(input).find();
                System.out.println("DEFAULT match " + found);

                // DOTALL lets . match the newline; MULTILINE lets $ match before it
                Pattern pml = Pattern.compile(patt[i],
                    Pattern.DOTALL|Pattern.MULTILINE);
                found = pml.matcher(input).find();
                System.out.println("MultiLine match " + found);
                System.out.println();
            }
        }
    }

If you run this code, the first pattern (with the wildcard character .) matches only when DOTALL is set, because by default . does not match a line terminator. The second pattern (with $) matches only when MULTILINE is set.

    > java NLMatch
    INPUT: I dream of engines
    more engines, all day long

    PATTERN engines.more engines
    DEFAULT match false
    MultiLine match true

    PATTERN engines$
    DEFAULT match false
    MultiLine match true
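To see what MULTILINE does to ^ rather than $, here is a minimal sketch that collects the first word of each line. The class name LineStartWords and the method firstWords( ) are invented for this illustration, and the code uses generics and the diamond operator, so it assumes a newer Java release than the listing above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineStartWords {
    /** Collect every run of word characters that sits where ^ matches. */
    static List<String> firstWords(String input, int flags) {
        List<String> words = new ArrayList<>();
        Matcher m = Pattern.compile("^\\w+", flags).matcher(input);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String input = "I dream of engines\nmore engines, all day long";

        // Default: ^ matches only at the start of the whole input.
        System.out.println(firstWords(input, 0));

        // MULTILINE: ^ also matches just after the embedded newline.
        System.out.println(firstWords(input, Pattern.MULTILINE));
    }
}
```

With the same input string as NLMatch, the first call should yield [I] and the second [I, more].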