7.2. Using Simple Patterns

To match a pattern (regular expression) against the contents of $_, put the pattern between a pair of forward slashes (/) as we do here:

    $_ = "yabba dabba doo";
    if (/abba/) {
        print "It matched!\n";
    }

The expression /abba/ looks for that four-letter string in $_; if it finds it, it returns a true value. In this case, it's found more than once, but that doesn't make any difference. If it's found at all, it's a match; if it's not in there at all, it fails.

Because the pattern match is generally being used to return a true or false value, it is almost always found in the conditional expression of if or while.

All of the usual backslash escapes that you can put into double-quoted strings are available in patterns, so you could use the pattern /coke\tsprite/ to match the eleven characters of coke, a tab, and sprite.

7.2.1. About Metacharacters

If patterns matched only literal strings, they wouldn't be very useful. That's why a number of special characters, called metacharacters, have special meanings in regular expressions.

For example, the dot (.) is a wildcard character: it matches any single character except a newline (which is represented by "\n"). So, the pattern /bet.y/ would match betty. It would also match betsy, bet=y, bet.y, or any other string that has bet, followed by any one character (except a newline), followed by y. It wouldn't match bety or betsey since those don't have one character between the t and the y. The dot always matches exactly one character.

If you wanted to match a period in the string, you could use the dot. But that would match any possible character (except a newline), which might be more than you wanted. If you want the dot to match a period, you can backslash it. That rule goes for all of Perl's regular expression metacharacters: a backslash in front of any metacharacter makes it nonspecial. So, the pattern /3\.14159/ doesn't have a wildcard character.

The backslash is our second metacharacter.
If you mean a real backslash, use a pair of them; that rule applies everywhere else in Perl.

7.2.2. Simple Quantifiers

It often happens that you'll need to repeat something in a pattern. The star (*) means to match the preceding item zero or more times. So, /fred\t*barney/ matches any number of tab characters between fred and barney. It matches "fred\tbarney" with one tab, "fred\t\tbarney" with two tabs, "fred\t\t\tbarney" with three tabs, or "fredbarney" with nothing in between at all. That's because the star means "zero or more", so you could have hundreds of tab characters in between but nothing other than tabs. Think of the star as saying, "That previous thing, any number of times, even zero times" (because * is the "times" operator in multiplication).

What if you wanted to allow something besides tab characters? The dot matches any character (except a newline), so .* will match any character, any number of times. That means that the /fred.*barney/ pattern matches "any old junk" between fred and barney. Any line that mentions fred and (somewhere later) barney will match that pattern. We often call .* the "any old junk" pattern because it can match any old junk in your strings.
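As a rough check of the patterns discussed so far, the following sketch runs the wildcard dot, the backslashed dot, the star, and .* against a few sample strings. The sample strings and the array names are made up for illustration; only the patterns themselves come from the text.

```perl
use strict;
use warnings;

# Hypothetical arrays that collect the strings each pattern accepts.
my (@dot, @literal, @star, @junk);

# The dot needs exactly one character between the t and the y.
foreach ("betty", "bet=y", "bety", "betsey") {
    push @dot, $_ if /bet.y/;
}

# A backslashed dot matches only a real period.
foreach ("3.14159", "3x14159") {
    push @literal, $_ if /3\.14159/;
}

# The star accepts zero or more tabs, but nothing other than tabs.
foreach ("fredbarney", "fred\t\tbarney", "fred barney") {
    push @star, $_ if /fred\t*barney/;
}

# .* accepts any old junk, as long as fred comes before barney.
foreach ("fred and then barney", "barney before fred") {
    push @junk, $_ if /fred.*barney/;
}

print "dot:     @dot\n";        # betty bet=y
print "literal: @literal\n";    # 3.14159
print "star:    ", scalar(@star), " of 3 matched\n";
print "junk:    @junk\n";       # fred and then barney
```

Each loop sets $_ in turn, so the bare /pattern/ form works exactly as in the if example at the start of this section.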
The star is formally called a quantifier, meaning that it specifies a quantity of the preceding item. It's not the only quantifier; the plus (+) is another. The plus means to match the preceding item one or more times: /fred +barney/ matches if fred and barney are separated by spaces and only spaces. (The space is not a metacharacter.) This won't match fredbarney since the plus means there must be one or more spaces between the two names, so at least one space is required. Think of the plus as saying, "That last thing, plus (optionally) more of the same thing."

There's a third quantifier like the star and plus, which is more limited. It's the question mark (?), which means that the preceding item is optional in that it may occur once or not at all. Like the other two quantifiers, the question mark means that the preceding item appears a certain number of times. In this case, the item may match one time (if it's there) or zero times (if it's not). There aren't any other possibilities. So, /bamm-?bamm/ matches either spelling: bamm-bamm or bammbamm. This is easy to remember since it's saying, "That last thing, maybe? Or maybe not?"

All three of these quantifiers must follow something since they tell how many times the previous item may repeat.

7.2.3. Grouping in Patterns

Parentheses are also metacharacters. As in mathematics, parentheses, ( ), may be used for grouping. As an example, the pattern /fred+/ matches strings like freddddddddd, but strings like that don't show up often in real life. But the pattern /(fred)+/ matches strings like fredfredfred, which is more likely to be what you wanted. What about the pattern /(fred)*/? That matches strings like hello, world. Because the star allows zero repetitions, the pattern is satisfied by matching nothing at all, so it succeeds on any string.
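The plus, the question mark, and grouping can be tried the same way. This is a minimal sketch: the sample strings, the array names, and the literal x and y placed around fred are assumptions added here so that the effect of the parentheses is visible. They are not part of the original examples.

```perl
use strict;
use warnings;

# Hypothetical arrays that collect the strings each pattern accepts.
my (@plus, @optional, @grouped, @ungrouped, @anything);

# The plus requires at least one space between the names.
foreach ("fred barney", "fred    barney", "fredbarney") {
    push @plus, $_ if /fred +barney/;
}

# The question mark makes the hyphen optional, but allows at most one.
foreach ("bamm-bamm", "bammbamm", "bamm--bamm") {
    push @optional, $_ if /bamm-?bamm/;
}

# A quantifier repeats only the item just before it. The x and y
# (made up for this sketch) pin down what must sit in between.
foreach ("xfredfredy", "xfredddy") {
    push @grouped,   $_ if /x(fred)+y/;    # repeats the whole group
    push @ungrouped, $_ if /xfred+y/;      # repeats only the d
}

# Zero repetitions are allowed, so (fred)* succeeds on any string.
foreach ("hello, world", "") {
    push @anything, $_ if /(fred)*/;
}

print "plus:      ", scalar(@plus), " of 3 matched\n";
print "optional:  @optional\n";     # bamm-bamm bammbamm
print "grouped:   @grouped\n";      # xfredfredy
print "ungrouped: @ungrouped\n";    # xfredddy
print "anything:  ", scalar(@anything), " of 2 matched\n";
```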