Regular Expressions


Many Linux applications use regular expressions to simplify searches. In Perl, regular expressions have been taken to the level of an art form.

Statements with regular expressions use several of the same symbols that shell wildcards and operators use, but the symbols often carry different meanings. In a regular expression, the symbols describe patterns to match in text.

Some folks suggest that Perl regular expressions even represent a small programming language of their own. The perlrequick man page (a brief introduction to regular expressions in Perl) reinforces this idea by opening with a "Hello World" example. To find the word World in "Hello World", use this line in a Perl script:

print "It matches\n" if "Hello World" =~ /World/; 

This line tells Perl to test (=~) whether the string "Hello World" contains the pattern World and, if it does, to print the words "It matches" followed by a newline (\n).

Any scalar can be matched against a regular expression in a script. This offers a wealth of possibilities to locate and manipulate data. The trick is learning the shorthand used to write the patterns. This shorthand consists of metacharacters that stand for search terms and operators. Because patterns are built largely from metacharacters, some people find them cryptic and difficult to decipher.
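To see a scalar variable matched against a pattern, consider the following minimal sketch. The subroutine name contains_word and the sample strings are made up for illustration; the \Q...\E pair simply tells Perl to treat any metacharacters inside the interpolated variable as ordinary text.

```perl
use strict;
use warnings;

# Return 1 if $text contains the literal string $word, 0 otherwise.
# \Q...\E quotes any metacharacters that happen to be inside $word.
sub contains_word {
    my ($text, $word) = @_;
    return $text =~ /\Q$word\E/ ? 1 : 0;
}

# Hypothetical sample data: any scalar holding text can be tested this way.
my $greeting = "Hello World";
for my $word ("World", "world", "Perl") {
    my $result = contains_word($greeting, $word) ? "matches" : "does not match";
    print "'$word' $result\n";
}
```

Note that matching is case sensitive by default, so "world" is not found in "Hello World".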

Table 30.4 shows you some of the primary metacharacters and their uses in Perl. If you are familiar with regular expressions and wildcards in the shell or other applications, watch carefully for differences in Perl's metacharacter usage.

Table 30.4. Some Perl Metacharacters

Metacharacter   Meaning

.               Match any single character (except a newline).
*               Match the preceding character or group zero or more times.
+               Match the preceding character or group one or more times.
?               Match the preceding character or group zero or one time; that is, the item is optional.
|               Alternation (Boolean OR): match the pattern on either side.
\t              Match a tab character.
^               Match at the beginning of the string (or of each line when the /m modifier is used).
$               Match at the end of the string or just before a trailing newline (or at the end of each line when the /m modifier is used).
\w              Match a word character: a letter, digit, or underscore.
\d              Match a digit.
\s              Match any whitespace character, such as a space, tab, or newline.
\               Escape the metacharacter that follows so that it is treated as an ordinary character.
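The following sketch tries several of these metacharacters against short sample strings. The helper name matches, the patterns, and the sample text are all hypothetical and chosen only to illustrate the table; the comments note which pairs are expected to match.

```perl
use strict;
use warnings;

# Return 1 if $text matches the pattern held in $pattern, 0 otherwise.
sub matches {
    my ($pattern, $text) = @_;
    return $text =~ /$pattern/ ? 1 : 0;
}

# Each entry is [pattern, sample text]. Single quotes keep the
# backslashes intact so that the regex engine sees them.
my @checks = (
    [ 'c.t',       "cat"         ],  # . stands for any one character: match
    [ 'ab*c',      "ac"          ],  # * allows zero b's: match
    [ 'ab+c',      "ac"          ],  # + needs at least one b: no match
    [ 'colou?r',   "color"       ],  # ? makes the u optional: match
    [ 'cat|dog',   "hot dog"     ],  # | accepts either alternative: match
    [ '^Hello',    "Hello World" ],  # ^ anchors at the start: match
    [ 'World$',    "Hello World" ],  # $ anchors at the end: match
    [ '\w+\s\d+',  "SUSE 10"     ],  # word chars, whitespace, digits: match
    [ '\t',        "a\tb"        ],  # a literal tab is present: match
    [ '3\.14',     "3x14"        ],  # escaped . is a plain period: no match
);

for my $check (@checks) {
    my ($pattern, $text) = @$check;
    my $verdict = matches($pattern, $text) ? "matches" : "does not match";
    print "/$pattern/ $verdict '$text'\n";
}
```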


In addition to searching and matching text segments and character patterns, regular expressions can also be used to validate user input. The form of expected input is expressed as a regular expression against which the input is compared.

chomp(my $email = <STDIN>);
print "Input is in valid format\n"
    if $email =~ /^[A-Za-z0-9._%-]+\@[A-Za-z0-9._%-]+\.[A-Za-z]{2,4}$/;
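If you need the same check in more than one place, you might wrap it in a small subroutine. The following is a sketch only: the name looks_like_email and the sample inputs are hypothetical, and the pattern checks the general shape of an address rather than proving that the address exists.

```perl
use strict;
use warnings;

# Return 1 if $input has the expected address shape, 0 otherwise.
# ^ and $ anchor the pattern so that the whole input must fit it.
sub looks_like_email {
    my ($input) = @_;
    return $input =~ /^[A-Za-z0-9._%-]+\@[A-Za-z0-9._%-]+\.[A-Za-z]{2,4}$/ ? 1 : 0;
}

# Hypothetical sample inputs standing in for what a user might type.
for my $candidate ('user@example.com', 'user@example', 'not an address') {
    my $verdict = looks_like_email($candidate) ? "valid format" : "invalid format";
    print "$candidate: $verdict\n";
}
```

Because {2,4} limits the final part to between two and four letters, addresses in longer top-level domains are rejected by this pattern; adjust the range if your input calls for it.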

This is just a taste of how to use regular expressions in Perl. Read the perlrequick and perlretut man pages for a more in-depth look, or check out Teach Yourself Regular Expressions in 10 Minutes by Ben Forta, published by Sams Publishing.



SUSE Linux 10 Unleashed
SUSE Linux 10.0 Unleashed
ISBN: 0672327260
Year: 2003
Pages: 332
