regex

Use regular expressions V8.9 and above

The regex type allows you to parse tokens in the workspace using POSIX regular expressions. For information on how to use regular expressions, see the online manuals ed(1) and regexp(1). A regex database-map type is declared like this:

 Kname regex expression

The name is the symbolic name you will use to reference this database map from inside the RHS of rule sets. The expression is the literal text that composes your regular expression. Here is a simple example:

 Knumberedname regex ^[0-9]+<@(aol|msn).com.?>

The intention here is for this regular expression to match any address that has an all-numeric user part (the part before the <@), and a domain part that is either aol.com or msn.com (the | character separates the two alternatives). To make rules that use this type easier to write, you can add a -a switch to the declaration:

 Knumberedname regex -a.FOUND ^[0-9]+<@(aol|msn).com.?>

Here the -a database switch causes .FOUND to be appended to any successful match.

Note that because of the way we have declared this database map, nothing but the suffix will be returned on a successful match. To get the original key returned, you also need to use the -m database switch.
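The effect of -a with and without -m can be modeled outside of sendmail. The following is only a rough sketch, not sendmail's implementation: it assumes Python's re module as a stand-in for the POSIX regex library, and the function name lookup and its parameters are hypothetical names chosen for illustration.

```python
import re

# Same expression as in the Knumberedname declaration above.
# Assumption: Python's re module stands in for the POSIX regex library.
PATTERN = re.compile(r"^[0-9]+<@(aol|msn).com.?>")


def lookup(key, suffix=".FOUND", match_only=False):
    """Model of the map lookup; returns None when the lookup fails.

    suffix     models the -a switch (text appended on success)
    match_only models the -m switch (return the original key)
    """
    if PATTERN.search(key) is None:
        return None
    # Without -m, nothing but the suffix is returned.
    return (key if match_only else "") + suffix


print(lookup("12345<@aol.com.>"))                  # suffix only
print(lookup("12345<@msn.com>", match_only=True))  # key plus suffix
print(lookup("bob<@aol.com>"))                     # user part is not numeric
```

A rule can then test for the presence of the suffix in the rewritten workspace rather than repeating the expression itself.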

This regex type can use a number of switches to good advantage. The complete list is shown in Table 23-24.

Table 23-24. The regex database-map type K command switches

 Switch  Description
 -a      Append tag on successful match
 -b      Use basic, not extended, regular expression matching (described below)
 -D      Don't use this database map if DeliveryMode=defer
 -d      The delimiting string (described below)
 -f      Don't fold keys to lowercase, and cause the regular expression to match in a case-sensitive manner
 -m      Suppress replacement on match
 -n      NOT; that is, invert the test (described below)
 -q      Don't strip quotes from key
 -S      Space replacement character
 -s      Substring to match and return (described below)
 -T      Suffix to append on temporary failure
 -t      Ignore temporary errors

Note that some additional explanation for a few of these switches is provided in the sections that follow. Also, for an actual example of the regex type, see the file cf/cf/knecht.mc, which demonstrates a way to deal with one type of spam email.

The -b regex database-map switch

The -b switch restricts the regular expression to a simpler but faster form. If you are using only simple regular expressions, such as those defined by ed(1), you can use the -b switch to slightly speed up matching:

 Kmatch regex -b -aLOCAL @localhost 

Here, the search is for a workspace that contains the substring @localhost. Because this is a very simple regular expression, the -b switch is appropriate. If you use -b on a complex match (such as the one in the -n example later in this section), you might see an error such as this:

 configfile: line num: field (2) out of range, only 1 substring in pattern

The -d regex database-map switch

There might be times when you would prefer some other character, operator, or token to replace the $| that is returned when using the -s switch. If so, you can specify a different one with the -d database switch. Consider:

 Kmatch regex -s2,3 -d++ -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)

Here we specify that the characters ++ will replace the single operator $| in the returned value:

 > test a@bob.com
 test               input: a @ bob . com
 test             returns: bob++com . FOUND

Note that here the bob++com is a single token.

You can opt to have the original key returned. This is done by specifying the -m database switch:

 Kmatch regex -s2,3 -m -d++ -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)

Note that the -m switch overrides the presence of the -s and -d switches:

 > test a@bob.com
 test               input: a @ bob . com
 test             returns: a @ bob . com . FOUND
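The interaction between -d and -m can be sketched as follows. This is a simplified model under stated assumptions, not sendmail's code: Python's re module replaces the POSIX library, \b stands in for the \< and \> word anchors (which Python does not support), and the name lookup and its parameters are hypothetical. The returned strings are not split into tokens the way rule-testing mode prints them.

```python
import re

# Assumption: \b approximates the POSIX \< and \> word anchors.
PATTERN = re.compile(r"(\ba\b|\bb\b)@(\bbob\b|\bted\b).(\bcom\b|\borg\b)")


def lookup(key, fields=(2, 3), delim="++", suffix=".FOUND", match_only=False):
    """Model -s2,3 (fields), -d++ (delim), -a.FOUND (suffix) and -m."""
    m = PATTERN.search(key)
    if m is None:
        return None  # lookup failed
    if match_only:
        # -m overrides -s and -d: the original key comes back.
        return key + suffix
    # Otherwise the selected substrings are joined by the delimiter.
    return delim.join(m.group(i) for i in fields) + suffix


print(lookup("a@bob.com"))                   # substrings joined by ++
print(lookup("a@bob.com", match_only=True))  # original key kept
```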

The -n regex database-map switch

The -n switch inverts the entire sense of the regular expression lookup. It returns a successful match only if the regular expression does not match. Consider:

 Kmatch regex -m -n -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)

If you view the effect of this switch in rule-testing mode, you will see that the result is inverted:

 > test a@bob.com
 test               input: a @ bob . com
 test             returns: a @ bob . com
 > test x@y.net
 test               input: x @ y . net
 test             returns: x @ y . net . FOUND
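The inversion can be modeled in a few lines. As before, this is a sketch under assumptions rather than sendmail's implementation: Python's re module and \b replace the POSIX library and its \< \> anchors, and lookup_inverted is a hypothetical name.

```python
import re

# Assumption: \b approximates the POSIX \< and \> word anchors.
PATTERN = re.compile(r"(\ba\b|\bb\b)@(\bbob\b|\bted\b).(\bcom\b|\borg\b)")


def lookup_inverted(key, suffix=".FOUND"):
    """Model of -m -n -a.FOUND: succeed only when the regex does NOT match."""
    if PATTERN.search(key) is not None:
        # With -n, a regex match counts as a failed lookup, so the
        # workspace is left unchanged.
        return key
    # No regex match: the lookup succeeds, -m keeps the key, -a appends.
    return key + suffix


print(lookup_inverted("a@bob.com"))  # matches the regex, so unchanged
print(lookup_inverted("x@y.net"))    # does not match, so suffix appended
```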

The -s regex database-map switch

The -s database-map switch is used with the regex type to specify a substring to match and return. To illustrate, consider the following mini-configuration file:

 V10
 Kmatch regex -s (\<bob\>|\<ted\>)
 Stest
 R $*	$@ $(match $1 $)

The regular expression looks to match either the name bob or ted, but no other names. The -s says to return the substring actually matched in the expression along with the key, the two separated from each other by a $| operator. Now, observe this mini-configuration file in rule-testing mode:

 % /usr/sbin/sendmail -bt -Cdemo.cf
 ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
 Enter <ruleset> <address>
 > test bob
 test               input: bob
 test             returns: bob $| bob
 > test alice
 test               input: alice
 test             returns: alice

By adding a -a switch, which appends text to the matched key:

 Kmatch regex -s -a.FOUND (bob|ted)

we see that the matched key with -s is second:

 > test bob
 test               input: bob
 test             returns: bob $| bob . FOUND

When multiple substrings can be matched, the -s database switch can be used to specify which substring match to return. Consider:

 Kmatch regex -s2 -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>)

There are two substring searches here, first the (\<a\>|\<b\>) choice, then the (\<bob\>|\<ted\>) choice. Because the -s has a 2 as its argument, the second matched substring will be returned, not the first:

 > test a@bob
 test               input: a @ bob
 test             returns: bob . FOUND

In more complex expressions it might be desirable to return multiple substrings. To do that, just list them following the -s, with each separated from the next by a comma:

 Kmatch regex -s2,3 -a.FOUND (\<a\>|\<b\>)@(\<bob\>|\<ted\>).(\<com\>|\<org\>)

When multiple substrings are listed in this way, they are separated by the $| operator when they are returned:

 > test a@bob.com
 test               input: a @ bob . com
 test             returns: bob $| com . FOUND
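Substring selection with -s maps naturally onto regex capture groups. The sketch below is a hedged model, not sendmail's implementation: Python's re module and \b stand in for the POSIX library and its \< \> anchors, regex_map and its parameters are hypothetical names, and the literal string "$|" merely represents the $| operator. Rule-testing mode would print the result as separate tokens.

```python
import re


def regex_map(pattern, key, fields, delim="$|", suffix=""):
    """Model of -s<fields> with -a<suffix>; returns None on a failed lookup.

    fields lists the parenthesized substrings to return, counted from 1,
    in the same way as the comma-separated list that follows -s.
    """
    m = re.search(pattern, key)
    if m is None:
        return None
    # Selected substrings are joined by the delimiter, then -a is applied.
    return delim.join(m.group(i) for i in fields) + suffix


# Assumption: \b approximates the POSIX \< and \> word anchors.
TWO = r"(\ba\b|\bb\b)@(\bbob\b|\bted\b)"
THREE = TWO + r".(\bcom\b|\borg\b)"

print(regex_map(TWO, "a@bob", fields=[2], suffix=".FOUND"))           # like -s2
print(regex_map(THREE, "a@bob.com", fields=[2, 3], suffix=".FOUND"))  # like -s2,3
```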


sendmail, 4th Edition
ISBN: 0596510292
Year: 2002
Pages: 1174
