Extended Regular Expressions (Egrep)


Extended regular expressions are an alternative to the basic regular expressions discussed so far in this chapter. When you use an extended regular expression, grep recognizes a different set of metacharacters, which are listed in Table 17.4.

Table 17.4: Egrep Metacharacters

Metacharacter       Description
. (period)          Match any character except end of line.
| (vertical bar)    Perform an OR.
?                   The preceding pattern is optional and is to be matched at most once.
*                   The preceding pattern is optional and is to be matched zero or more times.
+                   The preceding pattern is not optional and is to be matched one or more times.
^                   Anchor the match at the beginning of the line.
$                   Anchor the match at the end of the line.
[ ]                 Match any character within the brackets. Ranges may be specified with a hyphen.
\                   Turn off the special meaning of the following character.
()                  Group characters or patterns into a larger pattern for more complex matches. For example, (abc)+ matches abc, abcabc, abcabcabc, etc.
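The grouping and + entries can be tried in isolation. The following is a minimal sketch rather than part of the chapter's running example: the sample strings are made up for illustration, and it uses grep -E together with the -x option (match the whole line) so that only lines consisting entirely of repeated abc are reported.

```shell
# Hypothetical sample input: one candidate string per line.
samples=$(printf '%s\n' 'abc' 'abcabc' 'ab' 'xabcabcy' 'acb')

# -E enables the extended metacharacters; -x requires the whole line to match.
grouped=$(printf '%s\n' "$samples" | grep -E -x '(abc)+')

printf '%s\n' "$grouped"
```

With this input the command prints abc and abcabc. Without -x, the line xabcabcy would also be reported, because grep looks for the pattern anywhere in the line.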

There are two ways to use the alternate metacharacter set. One is the egrep utility; the other is grep with the -E option. The following examples illustrate the egrep features that differ from basic regular expressions. In the first example, egrep finds lines that contain either Oct or Feb:


egrep 'Oct|Feb' goodoleboys.txt


Bubba     Oct 13   444-1111 Buck       Mary Jean     12


Roscoe    Feb 2    444-2234 Rover      Alice Jean    410


Bill      Feb 29   333-4444 Daisy      Daisy         20
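To see why the extended set matters for alternation, the sketch below runs the same pattern with and without -E. It assumes a small stand-in for goodoleboys.txt, abridged to three fields per record; the variable names are illustrative only.

```shell
# Stand-in for goodoleboys.txt (assumption: abridged to three fields per record).
records=$(printf '%s\n' 'Bubba Oct 13' 'Roscoe Feb 2' 'Chuck Dec 25' 'Claude May 31')

# Extended syntax: an unescaped | separates the alternatives.
with_extended=$(printf '%s\n' "$records" | grep -E 'Oct|Feb')

# Basic syntax: | is an ordinary character, so grep looks for the literal text "Oct|Feb".
with_basic=$(printf '%s\n' "$records" | grep 'Oct|Feb' || true)

printf 'extended:\n%s\n' "$with_extended"
printf 'basic: [%s]\n' "$with_basic"
```

The extended form reports the Bubba and Roscoe records; the basic form reports nothing, because no record contains the seven-character string Oct|Feb.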

The following command finds lines with an uppercase B, zero or one i, and a lowercase l:


egrep 'Bi?l' goodoleboys.txt


Chuck     Dec 25     444-2345 Blue      Mary Sue     12     .50


Billy Bob June 11    444-4340 Leotis    Lisa Sue     12


Claude    May 31     333-4340 Blue      Etheline     12


Bill      Feb 29     333-4444 Daisy     Daisy        20
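The output above includes lines that match through Blue (zero i) as well as Bill and Billy (one i). A small sketch with made-up words, not taken from goodoleboys.txt, shows where the "at most once" limit applies:

```shell
# Hypothetical word list; "Biil" has two i characters between B and l.
words=$(printf '%s\n' 'Blue' 'Bill' 'Biil' 'Bubba')

# ? makes the preceding i optional, but allows it at most once.
optional_i=$(printf '%s\n' "$words" | grep -E 'Bi?l')

printf '%s\n' "$optional_i"
```

This prints Blue and Bill. Biil is rejected because two i characters separate the B from the l, and Bubba contains no l at all.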

Finally, this example finds lines where an uppercase E is preceded by zero or more spaces. Because zero spaces are allowed, the pattern matches any line that contains an uppercase E:


egrep ' *E' goodoleboys.txt


Claude    May 31     333-4340 Blue      Etheline      12


Ernest T. ??         none     none      none         none
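The difference between * and + shows up on exactly this kind of pattern. The sketch below assumes three abridged records (not the full goodoleboys.txt) and compares ' *E' with ' +E':

```shell
# Abridged stand-in records (assumption: fields trimmed for brevity).
records=$(printf '%s\n' 'Claude May 31 Etheline' 'Ernest T. none' 'Bubba Oct 13 Buck')

# Zero or more spaces before E: any line containing E qualifies.
star=$(printf '%s\n' "$records" | grep -E ' *E')

# One or more spaces before E: an E at the very start of the line no longer qualifies.
plus=$(printf '%s\n' "$records" | grep -E ' +E')

printf 'star:\n%s\n' "$star"
printf 'plus:\n%s\n' "$plus"
```

With * both the Claude and Ernest records are reported. With + only the Claude record remains, since the E in Ernest has no space in front of it.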



Fgrep

You can tell grep to interpret metacharacters literally when you are looking for a character that would otherwise be interpreted as a metacharacter. For example, you might wish to search for a period, which grep interprets as a wildcard that stands for any character. You have already learned one method to have the period metacharacter interpreted literally, which is to precede it with a backslash, as shown here:

grep '\.' goodoleboys.txt

A second method is to use the -F option, which tells grep to treat metacharacters literally:

grep -F '.' goodoleboys.txt

A third method is to use the fgrep utility. Fgrep, which stands for fast grep or fixed grep (depending on whom you ask), does not interpret metacharacters. The following example illustrates the use of fgrep:

fgrep '.' goodoleboys.txt
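A quick way to confirm that the escaped and fixed-string forms agree is to count matching lines with -c. This sketch assumes three made-up input lines and uses grep -F in place of fgrep; the unescaped pattern is included for contrast.

```shell
# Hypothetical input: two lines contain a period, one does not.
lines=$(printf '%s\n' 'Ernest T.' 'Bubba' 'Chuck .50')

# Escaped period: matches only a literal ".".
escaped=$(printf '%s\n' "$lines" | grep -c '\.')

# Fixed-string mode: every character in the pattern is literal.
fixed=$(printf '%s\n' "$lines" | grep -c -F '.')

# Unescaped period: matches any character, so every non-empty line counts.
wildcard=$(printf '%s\n' "$lines" | grep -c '.')

printf 'escaped=%s fixed=%s wildcard=%s\n' "$escaped" "$fixed" "$wildcard"
```

This prints escaped=2 fixed=2 wildcard=3: the two literal searches agree, while the bare period matches every line.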

The fgrep method is probably the easiest of the three. However, it cannot be used when you want some metacharacters interpreted literally while others keep their special meaning, as in this example:

grep '\$[0-9]*\.[0-9]*.$' goodoleboys.txt

This command searches for records that have a dollar sign, then zero or more digits, then a period, then zero or more digits again, and finally one other character, anchored at the end of the line. Table 17.5 shows the parts of this search string.

Table 17.5: A Search String, Explained

Symbols     Description
\$          Search for a dollar sign.
[0-9]*      Search for zero or more digits.
\.          Search for a period.
[0-9]*      Search for zero or more digits, again.
.           Search for any character.
$           Search for the end-of-line anchor.
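The breakdown can be checked against a few test lines. The sketch below is illustrative only: the input lines are invented, not taken from goodoleboys.txt, and were chosen so that each part of the pattern decides at least one case.

```shell
# Hypothetical test lines, single-quoted so the shell leaves $ untouched.
candidates=$(printf '%s\n' 'price $12.50' '$.5' '$12.' 'cost 12.50' '$12.50 due')

# \$ and \. are literal; [0-9]*, the bare . and the trailing $ keep their special meaning.
matched=$(printf '%s\n' "$candidates" | grep '\$[0-9]*\.[0-9]*.$')

printf '%s\n' "$matched"
```

Only the first two lines are printed. The line $12. fails because nothing follows the period for the final . to match, cost 12.50 has no dollar sign, and in $12.50 due the amount is not at the end of the line.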