Regular Expressions | GNU/Linux Application Programming (Programming Series)

A key aspect of sed is its use of regular expressions. A regular expression is a pattern that can match text strings. Regular expressions form a formal language in which very complex patterns can be expressed. Let's look at a few examples (shown between the sed / delimiters for completeness):

/dog/	Matches any occurrence of dog
/[a-z]/	Matches a single character a through z
/[a-zA-Z]/	Matches any single letter, a through z or A through Z
/[0-9]/	Matches any single digit (0 through 9)
/0[ab]1/	Matches 0a1 and 0b1
/Z*/	Matches zero or more occurrences of Z (the empty string, Z, ZZ, ZZZ, and so on)
/Z?/	Matches zero or one instance of Z (the empty string or Z)
/[^0-9]/	Matches any single character other than digits
/t.m/	Matches t, followed by any single character, followed by m, such as tim, tom, and so on

These patterns illustrate a number of special symbols used within regular expressions. The brackets [ ] define a set of characters, any one of which can match at that position, and the - symbol inside the brackets defines a range. The * character indicates that the preceding character may repeat zero or more times. The ? specifies that the preceding character appears zero or one time; note that sed treats ? as special only in extended syntax (sed -E), while GNU sed's default syntax spells it \?. The ^ character, when it is the first character inside the brackets, negates the set so that it matches any character NOT listed. Finally, the . character matches any single character.
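The symbols above can be tried directly at the shell. The following is a minimal sketch, assuming a sed that accepts the -E option for extended syntax (GNU and BSD sed both do); the sample strings and the variable name sample are illustrative and do not come from the text.

```shell
# Illustrative sample data (not from the text): one candidate string per line.
sample='dog
0a1
0c1
tim
tom
tm
42'

# sed -n suppresses default output; the p command prints only matching lines.
printf '%s\n' "$sample" | sed -n '/dog/p'       # prints: dog
printf '%s\n' "$sample" | sed -n '/0[ab]1/p'    # prints: 0a1
printf '%s\n' "$sample" | sed -n '/t.m/p'       # prints: tim and tom, not tm
printf '%s\n' "$sample" | sed -n '/[^0-9]/p'    # prints every line except 42

# * applies to the preceding character: a, then zero or more Z, then b.
echo 'ab aZb aZZZb' | sed 's/aZ*b/X/g'          # prints: X X X

# ? needs extended syntax here (-E): a, then at most one Z, then b.
echo 'ab aZb aZZb' | sed -E 's/aZ?b/X/g'        # prints: X X aZZb
```

Note that /[^0-9]/ selects any line containing at least one non-digit, so 0a1 is printed even though it also contains digits.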

Two other regular expression symbols (called anchors) that we'll explore with sed in this chapter are ^ (used here outside brackets, so in a different context than before) to match at the beginning of the line and $ to match at the end of the line. For example:

/^The/	Matches The at the beginning of a line
/end\.$/	Matches end. at the end of a line (the \ makes the . literal)
/^T.*\.$/	Matches lines that start with T and end with .
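A short sketch of the anchors in use follows. The four input lines and the variable name text are made up for illustration. The example escapes the period as \. wherever a literal period is intended, because an unescaped . matches any single character.

```shell
# Illustrative input (not from the text).
text='The end.
the end.
The start
Weekend!'

# ^ anchors the match to the beginning of the line.
printf '%s\n' "$text" | sed -n '/^The/p'      # prints: The end. and The start

# $ anchors the match to the end of the line; \. is a literal period.
printf '%s\n' "$text" | sed -n '/end\.$/p'    # prints: The end. and the end.

# Unescaped, the . also accepts the ! in Weekend!
printf '%s\n' "$text" | sed -n '/end.$/p'     # prints all lines except The start

# Both anchors: starts with T, ends with a literal period.
printf '%s\n' "$text" | sed -n '/^T.*\.$/p'   # prints: The end.
```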