A key aspect of sed is its use of regular expressions. A regular expression is a pattern that can match text strings. Regular expressions are a formal language in which very complex patterns can be expressed. Let's look at a few examples (using the sed delimiter for completeness):
/dog/ | Matches any occurrence of dog |
/[a-z]/ | Matches a single character a through z |
/[a-zA-Z]/ | Matches single characters a through z and A through Z |
/[0-9]/ | Matches any single digit (0 through 9) |
/0[ab]1/ | Matches 0a1 and 0b1 |
/Z*/ | Matches zero or more occurrences of Z (the empty string, Z, ZZ, ZZZ, and so on) |
/Z?/ | Matches zero or one instance of Z (the empty string or Z) |
/[^0-9]/ | Matches any single character other than digits |
/t.m/ | Matches any occurrence of t followed by any single character and then m, such as tim, tom, and so on |
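These patterns illustrate a number of special symbols used within regular expressions. For example, [ ] encloses a set of characters, any one of which may match, and the - symbol defines a range within that set. The * character indicates that the preceding character may repeat zero or more times. The ? specifies that zero or one instance of the preceding character is used for the match; note that sed treats ? as special only with extended regular expressions (for example, sed -E), while GNU sed's basic syntax spells it \?. The ^ character, when it appears first inside [ ], indicates the characters that are NOT used for the match. Finally, the . character matches any single character.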
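The symbols described above can be tried directly from the shell. The following is a minimal sketch, not a definitive reference: the sample words and the variable names are hypothetical, and it assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do).

```shell
# Hypothetical sample input; the words are chosen only for illustration.
sample='dog
tim
tom
team
0a1
0c1
42'

# /dog/ : print only the lines that contain "dog"
dog_lines=$(printf '%s\n' "$sample" | sed -n '/dog/p')

# /t.m/ : "t", exactly one character, then "m" (tim and tom, but not team)
tm_lines=$(printf '%s\n' "$sample" | sed -n '/t.m/p')

# /0[ab]1/ : "0", then either "a" or "b", then "1" (0a1, but not 0c1)
set_lines=$(printf '%s\n' "$sample" | sed -n '/0[ab]1/p')

# /[^0-9]/ with ! : print the lines that contain NO non-digit character
digit_lines=$(printf '%s\n' "$sample" | sed -n '/[^0-9]/!p')

# Z* can match zero characters, so the first match is the empty
# string at the start of the line: the result is XaZZZb
zero_match=$(printf 'aZZZb\n' | sed 's/Z*/X/')

# ZZ* requires at least one Z, so the run of Zs is replaced: aXb
one_or_more=$(printf 'aZZZb\n' | sed 's/ZZ*/X/')

# Z? needs extended syntax (-E): matches aZb and ab, but not aZZb
optional_z=$(printf 'aZb\nab\naZZb\n' | sed -E -n '/aZ?b/p')

printf '%s\n' "$dog_lines" "$tm_lines" "$set_lines" "$digit_lines" \
    "$zero_match" "$one_or_more" "$optional_z"
```

The s/Z*/X/ case is worth noticing: because zero occurrences count as a match, the substitution happens before the first character, not at the run of Zs.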
Two other regular expression symbols (called anchors) that we'll explore with sed in this chapter are ^ (in a different context than before), which matches at the beginning of the line, and $, which matches at the end of the line. For example:
/^The/ | Matches The at the beginning of a line |
/end\.$/ | Matches end. at the end of a line (the \ makes the . a literal period) |
/^T.*\.$/ | Matches lines that start with T and end with . |
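The anchors can be checked the same way. This is a small sketch under assumed input: the sample lines and variable names are hypothetical. It also shows why a literal period needs a backslash, since an unescaped . matches any single character.

```shell
# Hypothetical sample lines, chosen only for illustration.
text='The end.
At the end
The start
Then we stop.
Very trendy'

# /^The/ : lines that begin with "The" (note that "Then" also qualifies)
starts=$(printf '%s\n' "$text" | sed -n '/^The/p')

# /end\.$/ : lines that finish with a literal "end."
literal_dot=$(printf '%s\n' "$text" | sed -n '/end\.$/p')

# /end.$/ : unescaped dot, so "end" plus ANY final character matches,
# which also selects "Very trendy"
any_char=$(printf '%s\n' "$text" | sed -n '/end.$/p')

# /^T.*\.$/ : lines that start with T and end with a literal period
t_to_dot=$(printf '%s\n' "$text" | sed -n '/^T.*\.$/p')

printf '%s\n' "$starts" "$literal_dot" "$any_char" "$t_to_dot"
```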