Section 8.2. Using java.util.regex



The mechanics of wielding regular expressions with java.util.regex are fairly simple, with the functionality provided by only two classes, an interface, and an unchecked exception:

 java.util.regex.Pattern
 java.util.regex.Matcher
 java.util.regex.MatchResult
 java.util.regex.PatternSyntaxException

Informally, I'll refer to the first two simply as "pattern" and "matcher," and in many cases, these are the only classes we'll use. A Pattern object is, in short, a compiled regular expression that can be applied to any number of strings, and a Matcher object is an individual instance of that regex being applied to a specific target string.
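To make the pattern/matcher split concrete, here is a minimal sketch in which one compiled Pattern is applied to several target strings, with a fresh Matcher created for each. The class name PatternReuseSketch, the constant ORDINAL, the helper firstMatch, and the sample strings are all made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternReuseSketch {
    // Compiled once; the same Pattern object can be applied to any number of strings.
    static final Pattern ORDINAL = Pattern.compile("\\d+\\w+");

    // Hypothetical helper: returns the first match in target, or null if there is none.
    static String firstMatch(String target) {
        Matcher m = ORDINAL.matcher(target); // one matcher per target string
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] targets = { "my 1st test", "the 22nd attempt", "no digits here" };
        for (String t : targets) {
            System.out.println(t + " -> " + firstMatch(t));
        }
    }
}
```

Each call to matcher() yields an independent Matcher, so the state of one match attempt never leaks into another.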

New in Java 1.5, MatchResult encapsulates the data from a successful match. Match data is available from the matcher itself until the next match attempt, but can be saved by extracting it as a MatchResult.
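As a rough illustration of saving match data, the following sketch takes a snapshot of the first match with the matcher's toMatchResult method and then lets the matcher move on to the next match. The class name SavedMatchSketch, the helper savedThenCurrent, and the sample text are assumptions made for this example.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SavedMatchSketch {
    // Hypothetical helper: returns { text of the saved first match, text of the
    // matcher's current (second) match }, or null if there are fewer than two matches.
    static String[] savedThenCurrent(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.find()) {
            return null;
        }
        MatchResult saved = m.toMatchResult(); // snapshot of the first match
        if (!m.find()) {                       // the matcher itself moves on
            return null;
        }
        return new String[] { saved.group(), m.group() };
    }

    public static void main(String[] args) {
        String[] r = savedThenCurrent("\\d+\\w+", "my 1st and 2nd tests");
        System.out.println("saved: " + r[0] + ", current: " + r[1]);
        // prints: saved: 1st, current: 2nd
    }
}
```

The snapshot is unaffected by later match attempts, which is the point of extracting it.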

PatternSyntaxException is thrown when an attempt is made to use an ill-formed regular expression (one such as [oops) that's not syntactically correct). It extends java.lang.IllegalArgumentException and is unchecked.
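Because the exception is unchecked, nothing forces you to catch it, but doing so is useful when a regex comes from user input. The sketch below shows one way this might look; the class name SyntaxCheckSketch and the helper syntaxError are hypothetical, and the exact wording of the error description depends on the Java implementation.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxCheckSketch {
    // Hypothetical helper: returns null if regex compiles,
    // otherwise a short description of what went wrong.
    static String syntaxError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            // getDescription() and getIndex() report the problem and its position.
            return e.getDescription() + " (index " + e.getIndex() + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println("[oops)   -> " + syntaxError("[oops)"));
        System.out.println("\\d+\\w+ -> " + syntaxError("\\d+\\w+"));
    }
}
```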

Here's a complete, verbose example showing a simple match:

 public class SimpleRegexTest {
     public static void main(String[] args)
     {
         String myText  = "this is my 1st test string";
         String myRegex = "\\d+\\w+"; // This provides for \d+\w+
         java.util.regex.Pattern p = java.util.regex.Pattern.compile(myRegex);
         java.util.regex.Matcher m = p.matcher(myText);
         if (m.find()) {
             String matchedText = m.group();
             int    matchedFrom = m.start();
             int    matchedTo   = m.end();
             System.out.println("matched [" + matchedText + "] " +
                                "from " + matchedFrom +
                                " to " + matchedTo + ".");
         } else {
             System.out.println("didn't match");
         }
     }
 }

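This prints 'matched [1st] from 11 to 14.' (offsets are zero-based, and the ending offset is the position just past the last matched character). As with all examples in this chapter, names such as myText, myRegex, p, and m are ones I've chosen. The explicit java.util.regex. qualifiers in front of Pattern and Matcher can be omitted if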

 import java.util.regex.*; 

is inserted at the head of the program, as with the examples in Chapter 3 (page 95).

Doing so is the standard approach, and makes the code more manageable. The rest of this chapter assumes the import statement is always supplied.

The object model used by java.util.regex is a bit different from most. In the previous example, notice that the Matcher object m, after being created by associating a Pattern object and a target string, is used to launch the actual match attempt (with its find method), and to query the results (with its group, start, and end methods).

This approach may seem a bit odd at first glance, but you'll quickly get used to it.
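One way to get used to this model is to see the same matcher drive several match attempts in a row. The following sketch calls find repeatedly and queries the matcher after each success; the class name FindAllSketch, the helper describeMatches, and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllSketch {
    // Hypothetical helper: describes every match as "text@start-end".
    static List<String> describeMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) { // each call launches the next match attempt
            // group, start, and end always describe the most recent match
            found.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(describeMatches("\\d+\\w+", "my 1st and 2nd tests"));
        // prints: [1st@3-6, 2nd@11-14]
    }
}
```

The matcher holds both the position of the search and the results of the latest match, which is why a single object serves for launching attempts and for querying them.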



Mastering Regular Expressions
ISBN: 0596528124
Year: 2004
Pages: 113
