Recipe 6.9 Matching Shell Globs as Regular Expressions

6.9.1 Problem

You want to allow users to specify matches using traditional shell wildcards, not full Perl regular expressions. Wildcards are easier to type than full regular expressions for simple cases.

6.9.2 Solution

Use the following subroutine to convert four shell wildcard characters into their equivalent regular expression; all other characters are quoted to render them literals.

    sub glob2pat {
        my $globstr = shift;
        my %patmap = (
            '*' => '.*',
            '?' => '.',
            '[' => '[',
            ']' => ']',
        );
        $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
        return '^' . $globstr . '$';
    }

6.9.3 Discussion

A Perl regex pattern is not the same as a shell wildcard pattern. The shell's *.* is not a valid regular expression. Its meaning as a pattern would be /^.*\..*$/s, which is admittedly much less fun to type.
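As a minimal sketch of the point above, the following checks whether each string compiles as a Perl pattern. The helper name compiles and the sample filename are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the string compiles as a regex, 0 otherwise.
sub compiles {
    my $pattern = shift;
    my $re = eval { qr/$pattern/ };   # qr// dies on an invalid pattern
    return defined $re ? 1 : 0;
}

my $shell_glob = '*.*';        # the glob as typed in the shell
my $regex      = '^.*\..*$';   # its meaning written as a regular expression

print "glob used as regex: ", (compiles($shell_glob) ? "compiles" : "rejected"), "\n";
print "translated pattern: ", (compiles($regex)      ? "compiles" : "rejected"), "\n";

# The translated pattern matches a name containing a period.
print "notes.txt matches\n" if "notes.txt" =~ /$regex/s;
```

The leading asterisk has nothing to quantify, which is why Perl refuses the raw glob.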

The function given in the Solution makes these conversions for you, following the standard wildcard rules used by the glob built-in. Table 6-1 shows equivalent wildcard patterns in the shell and in Perl.

Table 6-2. Shell globs and equivalent Perl wildcard patterns

    Shell         Perl
    ----------    ----------------
    list.?        ^list\..$
    project.*     ^project\..*$
    *old          ^.*old$
    type*.[ch]    ^type.*\.[ch]$
    *.*           ^.*\..*$
    *             ^.*$
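The rows of the table can be checked mechanically. The sketch below repeats glob2pat from the Solution so that it runs on its own, then compares its output with each expected pattern; the @rows list and the mismatch counter are illustrative scaffolding, not part of the recipe.

```perl
use strict;
use warnings;

# glob2pat as given in the Solution.
sub glob2pat {
    my $globstr = shift;
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    return '^' . $globstr . '$';
}

# Each row pairs a shell glob with the pattern listed in the table.
my @rows = (
    [ 'list.?',     '^list\..$'      ],
    [ 'project.*',  '^project\..*$'  ],
    [ '*old',       '^.*old$'        ],
    [ 'type*.[ch]', '^type.*\.[ch]$' ],
    [ '*.*',        '^.*\..*$'       ],
    [ '*',          '^.*$'           ],
);

my $mismatches = 0;
for my $row (@rows) {
    my ($glob, $expected) = @$row;
    my $got = glob2pat($glob);
    $mismatches++ if $got ne $expected;
    printf "%-12s %s\n", $glob, $got;
}
print "mismatches: $mismatches\n";
```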

The function returns a string, not a regex object, because the latter would lock in (and out) any modifier flags, such as /i, but we'd rather delay that decision until later.
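Because the result is a plain string, the caller chooses the modifiers at match time. The sketch below repeats glob2pat so that it runs on its own; the filenames are made up for illustration.

```perl
use strict;
use warnings;

# glob2pat as given in the Solution.
sub glob2pat {
    my $globstr = shift;
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    return '^' . $globstr . '$';
}

my $pat   = glob2pat('*.TXT');
my @files = qw(README.txt notes.TXT main.c);   # hypothetical filenames

my @exact  = grep { /$pat/  } @files;   # case-sensitive match
my @nocase = grep { /$pat/i } @files;   # same string, /i added by the caller

print "exact:  @exact\n";
print "nocase: @nocase\n";
```

Had the function returned a qr// object, the case-sensitivity decision would already have been made inside glob2pat.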

Shell wildcard rules are different from those of a regular expression. The entire pattern is implicitly anchored at both ends; a question mark matches any single character; an asterisk matches any number of any characters, including none; and brackets enclose character ranges. Everything else is a literal.

Most shells do more than simple one-directory globbing. For instance, */* means "all files (including directory files) in all subdirectories of the current directory." Also, shells usually don't expand wildcards to include files with names beginning with a period; you usually have to put that leading period into your glob pattern explicitly. Our glob2pat function doesn't do these things; if you need them, use the File::KGlob module from CPAN.
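The sketch below shows both differences, then one possible way to approximate the leading-period rule. The helper visible_match is hypothetical and not part of the recipe, and it makes no attempt at directory handling.

```perl
use strict;
use warnings;

# glob2pat as given in the Solution.
sub glob2pat {
    my $globstr = shift;
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    return '^' . $globstr . '$';
}

my $pat = glob2pat('*');

# Unlike a shell, the converted pattern accepts hidden names and slashes.
my $dot_ok   = '.profile'   =~ /$pat/ ? 1 : 0;
my $slash_ok = 'src/main.c' =~ /$pat/ ? 1 : 0;
print "'*' matches .profile:   $dot_ok\n";
print "'*' matches src/main.c: $slash_ok\n";

# Hypothetical helper: refuse names starting with a period unless the
# glob itself starts with one, roughly as shells behave.
sub visible_match {
    my ($glob, $name) = @_;
    return 0 if $name =~ /^\./ && $glob !~ /^\./;
    my $re = glob2pat($glob);
    return $name =~ /$re/ ? 1 : 0;
}

print "visible_match('*',  '.profile'): ", visible_match('*',  '.profile'), "\n";
print "visible_match('.*', '.profile'): ", visible_match('.*', '.profile'), "\n";
```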

6.9.4 See Also

Your system manpages for the various shells, such as csh(1), tcsh(1), sh(1), ksh(1), and bash(1); the glob function in perlfunc(1) and Chapter 29 of Programming Perl; the documentation for the CPAN module Glob::DosGlob; the "I/O Operators" section of perlop(1); we talk more about globbing in Recipe 9.6.



Perl Cookbook, Second Edition
ISBN: 0596003137
Year: 2003
Pages: 501
