Regular Expressions in Java


The classes you will use to perform regular expression operations, Pattern and Matcher, are contained in the java.util.regex package. These classes allow you to both find and match character sequences against regular expression patterns. You might be wondering what the difference is between finding and matching. The find operation locates occurrences of the pattern anywhere within a string, whereas the match operation requires the entire string to match the regular expression. Tasks for which you might have used the StringTokenizer class in the past are usually good candidates for simplification with regular expressions.
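
To make the distinction between finding and matching concrete, here is a minimal sketch. The class name FindVersusMatch and its helper methods are hypothetical names chosen for illustration; only Pattern, Matcher, find(), group(), and matches() come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helpers are illustrative only.
class FindVersusMatch {

    // Collects every substring of input that the pattern matches, using find().
    public static List<String> findAll(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> results = new ArrayList<String>();
        while (matcher.find()) {
            results.add(matcher.group());
        }
        return results;
    }

    // Returns true only if the whole input matches the pattern, using matches().
    public static boolean matchesEntirely(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "red, green, blue";
        System.out.println(findAll("[a-z]+", input));           // [red, green, blue]
        System.out.println(matchesEntirely("[a-z]+", input));   // false
        System.out.println(matchesEntirely("[a-z]+", "green")); // true
    }
}
```

The same pattern finds three words inside the input, yet fails to match the input as a whole because of the commas and spaces.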

Note

If you are not able to use a version of Java that contains the regular expression package (version 1.4 or later), a good alternative is the Apache Jakarta Regular Expression package. This book does not cover the Jakarta package, but you can find information and complete documentation for it at http://jakarta.apache.org/regexp.


Table 6.1 shows the common regular expression matching characters. You might want to refer back to this table as you read the phrases in this chapter.

Table 6.1. Regular Expressions Table: Commonly Used Special Characters

Special Character    Description
^                    Beginning of the string.
$                    End of the string.
?                    0 or 1 times (refers to the previous expression).
*                    0 or more times (refers to the previous expression).
+                    1 or more times (refers to the previous expression).
[...]                Alternative characters (any one of the characters listed in the brackets).
|                    Alternative patterns.
.                    Any character (by default, excluding line terminators).
\d                   A digit.
\D                   A non-digit.
\s                   A whitespace character (space, tab, newline, form-feed, carriage return).
\S                   A non-whitespace character.
\w                   A word character [a-zA-Z_0-9].
\W                   A non-word character [^\w].
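
As a rough illustration of the special characters in Table 6.1, the following sketch tries several of them with the find operation. The class name SpecialCharactersDemo, the helper found(), and the sample strings are assumptions made for this example; the patterns deliberately avoid the backslash sequences.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample inputs are illustrative only.
class SpecialCharactersDemo {

    // True if the regex matches somewhere in the input (find semantics).
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^Java", "Java regex"));  // true: input begins with "Java"
        System.out.println(found("regex$", "Java regex")); // true: input ends with "regex"
        System.out.println(found("colou?r", "color"));     // true: the "u" is optional
        System.out.println(found("ab*c", "ac"));           // true: zero "b" characters allowed
        System.out.println(found("ab+c", "ac"));           // false: at least one "b" required
        System.out.println(found("[bc]at", "cat"));        // true: "b" or "c", then "at"
        System.out.println(found("cat|dog", "hotdog"));    // true: "dog" is one alternative
        System.out.println(found("c.t", "cot"));           // true: "." stands for any character
    }
}
```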


Notice that although the regular expression escape sequences in Table 6.1 are shown with a single backslash, you must write them with two backslashes when you use them in a Java string literal. This is because the backslash character has special meaning in a Java string; a double backslash escapes the backslash itself and produces a single backslash character in the resulting string.
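
The doubling rule can be seen in a short sketch. The class name BackslashEscaping, the constant DIGITS, and the helper isAllDigits() are hypothetical names used only for this example; Pattern.matches() is the standard static convenience method.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class BackslashEscaping {

    // The Java literal "\\d+" holds the three characters \d+ that the regex engine sees.
    public static final String DIGITS = "\\d+";

    // True if the entire input consists of one or more digits.
    public static boolean isAllDigits(String input) {
        return Pattern.matches(DIGITS, input);
    }

    public static void main(String[] args) {
        System.out.println(DIGITS);               // \d+
        System.out.println(DIGITS.length());      // 3
        System.out.println(isAllDigits("2004"));  // true
        System.out.println(isAllDigits("20o4"));  // false
        // Matching one literal backslash takes four backslashes in Java source:
        // the literal "\\\\" is the regex \\, which matches a single backslash.
        System.out.println(Pattern.matches("\\\\", "\\")); // true
    }
}
```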

A more complete listing of regular expression characters can be found in the JavaDoc for the Pattern class, available at http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html.




Java Phrasebook: Essential Code and Commands
ISBN: 0672329077
EAN: 2147483647
Year: 2004
Pages: 166
