11.3. Patterns and Procedures


awk scripts consist of patterns and actions (also called procedures):

     pattern  { action }

Either part may be omitted, but not both. If pattern is missing, { action } is applied to all lines. If { action } is missing, the matched line is printed.
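As a minimal sketch of the three forms, the following shell functions run awk over a small made-up input. The sample lines and the function names (sample, pattern_only, action_only, both) are illustrative assumptions, not part of the original text.

```shell
# Hypothetical sample input: three lines, two of which contain "error".
sample() { printf '%s\n' 'error one' 'ok two' 'error three'; }

# Pattern only: each matching line is printed unchanged.
pattern_only() { sample | awk '/error/'; }

# Action only: the action runs for every line.
action_only() { sample | awk '{ print $2 }'; }

# Pattern and action: the action runs only for matching lines.
both() { sample | awk '/error/ { print $2 }'; }

pattern_only
action_only
both
```

The first call prints the two lines containing "error", the second prints the second field of all three lines, and the third prints the second field of the matching lines only.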

11.3.1. Patterns

A pattern can be any of the following:

     general expression
     /regular expression/
     relational expression
     pattern-matching expression
     BEGIN
     END

  • General expressions can be composed of quoted strings, numbers, operators, function calls, user-defined variables, or any of the predefined variables described later in the section "Built-in Variables."

  • Regular expressions use the extended set of metacharacters as described in Chapter 7.

  • The ^ and $ metacharacters refer to the beginning and end of a string (such as the fields), respectively, rather than the beginning and end of a line. In particular, these metacharacters will not match at a newline embedded in the middle of a string.

  • Relational expressions use the relational operators listed in the section "Operators," later in this chapter. For example, $2 > $1 selects lines for which the second field is greater than the first. Comparisons can be either string or numeric. Thus, depending upon the types of data in $1 and $2, awk will do either a numeric or a string comparison. This can change from one record to the next.

  • Pattern-matching expressions use the operators ~ (matches) and !~ (doesn't match). See the section "Operators" later in this chapter.

  • The BEGIN pattern lets you specify actions that take place before the first input line is processed. (Generally, you process the command line and set global variables here.)

  • The END pattern lets you specify actions that take place after the last input record is read.

  • BEGIN and END patterns may appear multiple times. The actions are merged as if there had been one large action.
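Two of the points above can be checked with a short experiment: the switch between numeric and string comparison, and the behavior of ^ in a string with an embedded newline. This is a sketch only. The input values and the function names compare_fields and anchor_demo are made up for illustration.

```shell
# Relational pattern: the comparison type depends on the data in each record.
# "9 10"   -> both fields look numeric, so 10 > 9 is tested numerically (true).
# "9x 10x" -> neither field is numeric, so "10x" > "9x" is a string test (false).
compare_fields() {
  printf '%s\n' '9 10' '9x 10x' | awk '$2 > $1'
}

# ^ anchors at the start of the string, not after an embedded newline,
# so /^b/ does not match the string "a<newline>b".
anchor_demo() {
  awk 'BEGIN {
    s = "a\nb"
    if (s ~ /^b/)
      print "match"
    else
      print "no match"
  }'
}

compare_fields
anchor_demo
```

Only the first input line is selected by $2 > $1, and the anchor test reports no match.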

Except for BEGIN and END, patterns can be combined with the Boolean operators || (or), && (and), and ! (not). An inclusive range of lines can also be specified using comma-separated patterns:

     pattern,pattern
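A brief sketch of both forms, using an invented five-line input; the marker words start and stop and the function names are assumptions chosen for the example.

```shell
# Hypothetical input with a marked region in the middle.
lines() { printf '%s\n' 'header' 'start a' 'b' 'stop c' 'footer'; }

# Range pattern: prints from the first line matching /start/
# through the next line matching /stop/, inclusive.
range_demo() { lines | awk '/start/,/stop/'; }

# Boolean combination: two-field lines that do not contain "stop".
bool_demo() { lines | awk 'NF == 2 && !/stop/'; }

range_demo
bool_demo
```

The range form selects the three lines from "start a" to "stop c". The Boolean form selects only "start a".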

11.3.2. Procedures

Procedures consist of one or more commands, function calls, or variable assignments, separated by newlines or semicolons, and are contained within curly braces. Commands fall into five groups:

  • Variable or array assignments

  • Input/output commands

  • Built-in functions

  • Control-flow commands

  • User-defined functions
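The five groups can appear together in one short program. The following is a sketch under stated assumptions: the function name hyp, the array name seen, and the input numbers are all hypothetical.

```shell
awk_groups() {
  printf '%s\n' '3 4' '5 12' | awk '
    # User-defined function (hypothetical name); sqrt is a built-in function.
    function hyp(a, b) { return sqrt(a * a + b * b) }
    {
      h = hyp($1, $2)            # variable assignment
      seen[NR] = h               # array assignment
      if (h > 10)                # control-flow command
        print $0, "long", h      # output command
      else
        print $0, "short", h
    }'
}

awk_groups
```

Statements inside the braces are separated by newlines here; semicolons would work equally well on a single line.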

11.3.3. Simple Pattern-Action Examples

  • Print first field of each line:

         { print $1 }

  • Print all lines that contain pattern:

         /pattern/

  • Print first field of lines that contain pattern:

         /pattern/ { print $1 }

  • Select records containing more than two fields:

         NF > 2

  • Interpret input records as a group of lines up to a blank line. Each line is a single field:

         BEGIN { FS = "\n"; RS = "" }

  • Print fields 2 and 3 in switched order, but only on lines whose first field contains the string URGENT:

         $1 ~ /URGENT/ { print $3, $2 }

  • Count and print the number of lines matching pattern:

         /pattern/ { ++x }
         END { print x }

  • Add numbers in second column and print the total:

         { total += $2 }
         END { print "column total is", total }

  • Print lines that contain fewer than 20 characters:

         length($0) < 20

  • Print each line that begins with Name: and that contains exactly 7 fields:

         NF == 7 && /^Name:/

  • Print the fields of each record in reverse order, one per line:

         {
             for (i = NF; i >= 1; i--)
                 print $i
         }
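Several of these one-liners are easier to follow when run against concrete input. The sketch below wraps three of them in shell functions; the names, addresses, numbers, and function names are invented for illustration.

```shell
# Blank-line-separated records, one field per line.
records() {
  printf '%s\n' 'Alice' '12 Elm St' '' 'Bob' '34 Oak Ave' |
    awk 'BEGIN { FS = "\n"; RS = "" }
         { print NR ": " $1 " lives at " $2 }'
}

# Sum the second column and report the total at the end.
column_total() {
  printf '%s\n' 'a 1' 'b 2' 'c 3' |
    awk '{ total += $2 }
         END { print "column total is", total }'
}

# Count matching lines; x + 0 forces a numeric 0 when nothing matched.
count_matches() {
  printf '%s\n' 'a 1' 'b 2' 'c 3' |
    awk '/z/ { ++x }
         END { print x + 0 }'
}

records
column_total
count_matches
```

In the counting example, printing x alone would produce an empty line when no input line matches, because an unset variable prints as the empty string. Adding 0 is one common way to get a numeric result instead.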



    Unix in a Nutshell
    Unix in a Nutshell, Fourth Edition
    ISBN: 0596100299
    EAN: 2147483647
    Year: 2005
    Pages: 201
