# Chapter 6. The awk Utility: awk Programming Constructs

•  6.1 Comparison Expressions
•  6.2 Review
•  UNIX TOOLS LAB EXERCISE

### 6.1 Comparison Expressions

A comparison expression selects the lines for which a condition is true; when the condition is true, the action is performed. These expressions use relational operators to compare numbers or strings. Table 6.1 lists the relational operators. The value of a comparison expression is 1 if it evaluates to true and 0 if it evaluates to false.
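
The 1-or-0 value of a comparison can be seen directly by printing it. The following is a minimal sketch, assuming a POSIX shell and a POSIX `awk` (substitute `nawk` where that is the installed name); the variable names `lt`, `gt`, `eq`, and `result` are chosen here for illustration only.

```shell
# Store the value of each comparison, then print the three values.
result=$(awk 'BEGIN {
    lt = (3 < 5)            # true, so the value is 1
    gt = (5 < 3)            # false, so the value is 0
    eq = ("abc" == "abc")   # string comparison, true
    print lt, gt, eq
}')
echo "$result"   # prints: 1 0 1
```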

#### 6.1.1 Relational Operators

##### Table 6.1. Relational Operators

| Operator | Meaning | Example |
|---|---|---|
| `<` | Less than | `x < y` |
| `<=` | Less than or equal to | `x <= y` |
| `==` | Equal to | `x == y` |
| `!=` | Not equal to | `x != y` |
| `>=` | Greater than or equal to | `x >= y` |
| `>` | Greater than | `x > y` |
| `~` | Matched by regular expression | `x ~ /y/` |
| `!~` | Not matched by regular expression | `x !~ /y/` |

##### Example 6.1
```
(The Database)
% cat employees
Tom Jones      4423      5/12/66   543354
Mary Adams     5346      11/4/63   28765
Sally Chang    1654      7/22/54   650000
Billy Black    1683      9/23/44   336500

(The Command Line)
1   % nawk '$3 == 5346' employees
    Mary Adams     5346      11/4/63   28765
2   % nawk '$3 > 5000{print $1}' employees
    Mary
3   % nawk '$2 ~ /Adam/' employees
    Mary Adams     5346      11/4/63   28765
4   % nawk '$2 !~ /Adam/' employees
    Tom Jones      4423      5/12/66   543354
    Sally Chang    1654      7/22/54   650000
    Billy Black    1683      9/23/44   336500
```

## EXPLANATION

1. If the third field is equal to 5346, the condition is true and awk performs the default action: it prints the line. When an if condition is implied in this way, it is a conditional pattern test.

2. If the third field is greater than 5000, awk prints the first field.

3. If the second field matches the regular expression Adam, the record is printed.

4. If the second field does not match the regular expression Adam, the record is printed.

When a numeric value is compared to a string value with an operator that requires a numeric comparison, the string value is converted to a numeric value. If the operator requires a string value, the numeric value is converted to a string value.
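
Whether a comparison is made numerically or as strings changes the result. The sketch below assumes a POSIX `awk` (the text's `nawk` should behave the same way) and uses made-up values; the variable names `consts` and `fields` are illustrative only.

```shell
# Constants: 10 < 9 compares numbers, "10" < "9" compares strings
# character by character ("1" sorts before "9").
consts=$(awk 'BEGIN { a = (10 < 9); b = ("10" < "9"); print a, b }')

# Fields that look like numbers are compared as numbers; concatenating
# an empty string forces a string comparison instead.
fields=$(echo "10 9" | awk '{ a = ($1 < $2); b = (($1 "") < ($2 "")); print a, b }')

echo "$consts"   # prints: 0 1
echo "$fields"   # prints: 0 1
```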

#### 6.1.2 Conditional Expressions

A conditional expression uses two symbols, the question mark and the colon, to choose between two expressions. It is a shorthand for an if/else statement. The general format is shown below.

## FORMAT

`conditional expression1 ? expression2 : expression3 `

This produces the same result as the if/else shown below. (A complete discussion of the if/else construct is given later.)

```
{
    if (expression1)
        expression2
    else
        expression3
}
```

##### Example 6.2
```
nawk '{max=($1 > $2) ? $1 : $2; print max}' filename
```

## EXPLANATION

If the first field is greater than the second field, the value of the expression after the question mark is assigned to max, otherwise the value of the expression after the colon is assigned to max.

This is comparable to

```
if ($1 > $2)
    max=$1
else
    max=$2
```
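
To see the conditional expression at work, the same program can be fed a few lines on standard input. This is a small sketch with invented input, assuming a POSIX shell and `awk` in place of `nawk`; `result` is just a holding variable.

```shell
# Three hypothetical input lines, each with two numeric fields.
result=$(printf '3 7\n9 2\n5 5\n' |
    awk '{ max = ($1 > $2) ? $1 : $2; print max }')
echo "$result"
# prints:
# 7
# 9
# 5
```

When the two fields are equal, the condition is false, so the expression after the colon (the second field) is the one assigned.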

#### 6.1.3 Computation

Computation can be performed within patterns. Awk performs all arithmetic in floating point. The arithmetic operators are provided in Table 6.2.

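##### Table 6.2. Arithmetic Operators

| Operator | Meaning | Example |
|---|---|---|
| `+` | Add | `x + y` |
| `-` | Subtract | `x - y` |
| `*` | Multiply | `x * y` |
| `/` | Divide | `x / y` |
| `%` | Modulus | `x % y` |
| `^` | Exponentiation | `x ^ y` |
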
##### Example 6.3
```
nawk '$3 * $4 > 500' filename
```

## EXPLANATION

Awk will multiply the third field (\$3) by the fourth field (\$4), and if the result is greater than 500, it will display those lines. (filename is assumed to be a file containing the input.)
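
A computation in the pattern can be repeated in the action to show the value that was tested. The sketch below uses hypothetical inventory lines (name, material, quantity, price) and assumes a POSIX `awk` standing in for `nawk`.

```shell
# Print the first field and the product only when the product exceeds 500.
result=$(printf 'bolts steel 10 60\nnuts brass 20 20\nwashers zinc 25 21\n' |
    awk '$3 * $4 > 500 { print $1, $3 * $4 }')
echo "$result"
# prints:
# bolts 600
# washers 525
```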

#### 6.1.4 Compound Patterns

Compound patterns are expressions that combine patterns with logical operators (see Table 6.3). An expression is evaluated from left to right.

##### Table 6.3. Logical Operators

| Operator | Meaning | Example |
|---|---|---|
| `&&` | Logical and | `a && b` |
| `\|\|` | Logical or | `a \|\| b` |
| `!` | Not | `! a` |

##### Example 6.4
```
nawk '$2 > 5 && $2 <= 15' filename
```

## EXPLANATION

Awk will display those lines that match both conditions; that is, where the second field (\$2) is greater than 5 and the second field (\$2) is also less than or equal to 15. With the && operator, both conditions must be true. (filename is assumed to be a file containing the input.)

##### Example 6.5
```
nawk '$3 == 100 || $4 > 50' filename
```

## EXPLANATION

Awk will display those lines that match at least one of the conditions; that is, where the third field is equal to 100 or the fourth field is greater than 50. With the || operator, only one of the conditions needs to be true, although both may be. (filename is assumed to be a file containing the input.)

##### Example 6.6
```
nawk '!($2 < 100 && $3 < 20)' filename
```

## EXPLANATION

Awk first evaluates the expression in parentheses, which is true only if both conditions are true. The unary ! operator then negates that result, so that if the expression yields a true condition, the not makes it false, and vice versa. The lines displayed are therefore those for which one or both conditions are false. (filename is assumed to be a file containing the input.)
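
For numeric fields, a negated && can be rewritten as an || of the opposite tests, which is one way to check what the negation selects. The following sketch uses invented data and assumes a POSIX shell with `awk` in place of `nawk`; the names `data`, `negated`, and `rewritten` are illustrative.

```shell
# Hypothetical input: a label followed by two numeric fields.
data='a 50 10
b 150 10
c 50 30
d 150 30'

# The negated compound pattern from the example above.
negated=$(printf '%s\n' "$data" | awk '!($2 < 100 && $3 < 20)')
# The same selection written without the ! operator.
rewritten=$(printf '%s\n' "$data" | awk '$2 >= 100 || $3 >= 20')

echo "$negated"
# prints:
# b 150 10
# c 50 30
# d 150 30
```

Only line `a` satisfies both original conditions, so it is the only line left out.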

#### 6.1.5 Range Patterns

Range patterns match from the first occurrence of one pattern to the first occurrence of the second pattern, then match for the next occurrence of the first pattern to the next occurrence of the second pattern, etc. If the first pattern is matched and the second pattern is not found, awk will display all lines to the end of the file.

##### Example 6.7
`nawk '/Tom/,/Suzanne/' filename `

## EXPLANATION

Awk will display all lines, inclusive, that range between the first occurrence of Tom and the first occurrence of Suzanne. If Suzanne is not found, awk will continue processing lines until the end of file. If, after the range between Tom and Suzanne is printed, Tom appears again, awk will start displaying lines until another Suzanne is found or the file ends.
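
The behavior described above, including a range that reopens and then runs to the end of the input, can be sketched with a short list of names. The input is made up, and `awk` is assumed to be a POSIX awk equivalent to `nawk`; the record number (NR) is printed to show which lines fall inside a range.

```shell
# Hypothetical input: the second Tom has no closing Suzanne.
data='Bob
Tom
Ann
Suzanne
Carl
Tom
Dave'

result=$(printf '%s\n' "$data" | awk '/Tom/,/Suzanne/ { print NR ": " $0 }')
echo "$result"
# prints:
# 2: Tom
# 3: Ann
# 4: Suzanne
# 6: Tom
# 7: Dave
```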

#### 6.1.6 A Data Validation Program

Using the awk commands discussed so far, the following password-checking program from the book, The AWK Programming Language,[1] illustrates how the data in a file can be validated.

##### Example 6.8
```
(The Password Database)
1   % cat /etc/passwd
    tooth:pwHfudo.eC9sM:476:40:Contract Admin.:/home/rickenbacker/tooth:/bin/csh
    lisam:9JY7OuS2f3lHY:4467:40:Lisa M. Spencer:/home/fortune1/lisam:/bin/csh
    goode:v7Ww.nWJCeSIQ:32555:60:Goodwill Guest User:/usr/goodwill:/bin/csh
    bonzo:eTZbu6M2jM7VA:5101:911: SSTOOL Log account :/home/sun4/bonzo:/bin/csh
    info:mKZsrioPtW9hA:611:41:Terri Stern:/home/chewie/info:/bin/csh
    cnc:IN1IVqVj1bVv2:10209:41:Charles Carnell:/home/christine/cnc:/bin/csh
    bee:*:347:40:Contract Temp.:/home/chanel5/bee:/bin/csh
    friedman:oyuIiKoFTV0TE:3561:50:Jay Friedman:/home/ibanez/friedman:/bin/csh
    chambers:Rw7R1k77yUY4.:592:40:Carol Chambers:/usr/callisto2/chambers:/bin/csh
    gregc:nk2EYi7kLulOg:7777:30:Greg Champlin FE Chicago
    ramona:gbDQLdDBeRc46:16660:68:Ramona Leininger MWA Customer Service Rep:/home/forsh:

(The Awk Commands)
2   % cat /etc/passwd | nawk -F: '\
3   NF != 7{\
4   printf("line %d, does not have 7 fields: %s\n",NR,$0)} \
5   $1 !~ /[A-Za-z0-9]/{printf("line %d, nonalphanumeric user id: %s\n",NR,$0)} \
6   $2 == "*" {printf("line %d, no password: %s\n",NR,$0)} '

(The Output)
line 7, no password: bee:*:347:40:Contract Temp.:/home/chanel5/bee:/bin/csh
line 10, does not have 7 fields: gregc:nk2EYi7kLulOg:7777:30:Greg Champlin FE Chicago
line 11, does not have 7 fields: ramona:gbDQLdDBeRc46:16660:68:Ramona Leininger MWA Customer Service Rep:/home/forsh:
```

## EXPLANATION

1. The contents of the /etc/passwd file are displayed.

2. The cat program sends its output to awk. Awk's field separator is a colon.

3. If the number of fields (NF) is not equal to 7, the following action block is executed.

4. The printf function prints the string line %d, does not have 7 fields: with the number of the current record (NR) substituted for %d, followed by the record itself (\$0).

5. If the first field (\$1) does not contain any alphanumeric characters, the printf function prints the string nonalphanumeric user id:, followed by the number of the record and the record.

6. If the second field (\$2) equals an asterisk, the string no password: is printed, preceded by the number of the record and followed by the record itself.
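
The same three checks can be tried without touching a real password file. The sketch below runs them over a few hypothetical colon-separated records (the user names and fields are invented for illustration) and assumes a POSIX shell with `awk` in place of `nawk`.

```shell
# Hypothetical records, not taken from any real /etc/passwd.
records='alice:x9Kq:100:10:Alice A.:/home/alice:/bin/sh
bob:*:101:10:Bob B.:/home/bob:/bin/sh
carol:p4Zz:102:10:Carol C.
---:q1Ab:103:10:Dash D.:/home/dash:/bin/sh'

# One pattern/action pair per validation rule; -F: sets the field separator.
report=$(printf '%s\n' "$records" | awk -F: '
NF != 7             { printf("line %d, does not have 7 fields: %s\n", NR, $0) }
$1 !~ /[A-Za-z0-9]/ { printf("line %d, nonalphanumeric user id: %s\n", NR, $0) }
$2 == "*"           { printf("line %d, no password: %s\n", NR, $0) }
')
printf '%s\n' "$report"
# prints:
# line 2, no password: bob:*:101:10:Bob B.:/home/bob:/bin/sh
# line 3, does not have 7 fields: carol:p4Zz:102:10:Carol C.
# line 4, nonalphanumeric user id: ---:q1Ab:103:10:Dash D.:/home/dash:/bin/sh
```

Because each rule is an independent pattern, a single record can trigger more than one message.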

### 6.2 Review

#### 6.2.1 Equality Testing

```
% cat datafile
northwest           NW   Joel Craig        3.0   .98    3    4
western             WE   Sharon Kelly      5.3   .97    5    23
southwest           SW   Chris Foster      2.7   .8     2    18
southern            SO   May Chin          5.1   .95    4    15
southeast           SE   Derek Johnson     4.0   .7     4    17
eastern             EA   Susan Beal        4.4   .84    5    20
northeast           NE   TJ Nichols        5.1   .94    3    13
north               NO   Val Shultz        4.5   .89    5    9
central             CT   Sheri Watson      5.7   .94    5    13
```

##### Example 6.9
```
nawk '$7 == 5' datafile
western       WE      Sharon Kelly       5.3  .97  5    23
eastern       EA      Susan Beal         4.4  .84  5    20
north         NO      Val Shultz         4.5  .89  5    9
central       CT      Sheri Watson       5.7  .94  5    13
```

## EXPLANATION

If the seventh field (\$7) is equal to the number 5, the record is printed.

##### Example 6.10
```
nawk '$2 == "CT"{print $1, $2}' datafile
central CT
```

## EXPLANATION

If the second field is equal to the string CT, fields one and two (\$1, \$2) are printed. Strings must be quoted.
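
The reason strings must be quoted is that an unquoted word is treated as a variable name, and an unset variable holds the empty string. A minimal sketch, assuming a POSIX `awk` in place of `nawk`; the messages printed are invented for illustration.

```shell
# CT without quotes is an uninitialized variable, so only the second rule fires.
result=$(echo "central CT" | awk '
$2 == CT   { print "unquoted CT matched" }
$2 == "CT" { print "quoted CT matched" }
')
echo "$result"   # prints: quoted CT matched
```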

#### 6.2.2 Relational Operators

##### Example 6.11
```
nawk '$7 != 5' datafile
northwest     NW     Joel Craig        3.0  .98  3    4
southwest     SW     Chris Foster      2.7  .8   2    18
southern      SO     May Chin          5.1  .95  4    15
southeast     SE     Derek Johnson     4.0  .7   4    17
northeast     NE     TJ Nichols        5.1  .94  3    13
```

## EXPLANATION

If the seventh field (\$7) is not equal to the number 5, the record is printed.

##### Example 6.12
```
nawk '$7 < 5 {print $4, $7}' datafile
Craig 3
Foster 2
Chin 4
Johnson 4
Nichols 3
```

## EXPLANATION

If the seventh field (\$7) is less than 5, fields 4 and 7 are printed.

##### Example 6.13
```
nawk '$6 > .9 {print $1, $6}' datafile
northwest .98
western .97
southern .95
northeast .94
central .94
```

## EXPLANATION

If the sixth field (\$6) is greater than .9, fields 1 and 6 are printed.

##### Example 6.14
```
nawk '$8 <= 17 { print $8}' datafile
4
15
17
13
9
13
```

## EXPLANATION

If the eighth field (\$8) is less than or equal to 17, it is printed.

##### Example 6.15
```
nawk '$8 >= 17 {print $8}' datafile
23
18
17
20
```

## EXPLANATION

If the eighth field is greater than or equal to 17, the eighth field is printed.

#### 6.2.3 Logical Operators

##### Example 6.16
```
nawk '$8 > 10 && $8 < 17' datafile
southern      SO     May Chin         5.1  .95  4    15
northeast     NE     TJ Nichols       5.1  .94  3    13
central       CT     Sheri Watson     5.7  .94  5    13
```

## EXPLANATION

If the eighth field (\$8) is greater than 10 and less than 17, the record is printed. The record will be printed only if both expressions are true.

##### Example 6.17
```
nawk '$2 == "NW" || $1 ~ /south/{print $1, $2}' datafile
northwest NW
southwest SW
southern SO
southeast SE
```

## EXPLANATION

If the second field (\$2) is equal to the string NW or the first field (\$1) contains the pattern south, the first and second fields (\$1, \$2) are printed. The fields are printed if either one of the expressions is true.

#### 6.2.4 Logical not Operator

##### Example 6.18
```
nawk '!($8 == 13){print $8}' datafile
4
23
18
15
17
20
9
```

## EXPLANATION

If the eighth field (\$8) is not equal to 13, the eighth field is printed. The expression in parentheses tests whether \$8 is equal to 13, and the ! (not operator) negates the result. The ! is a unary negation operator.

#### 6.2.5 Arithmetic Operators

##### Example 6.19
```
nawk '/southern/{print $5 + 10}' datafile
15.1
```

## EXPLANATION

If the record contains the regular expression southern, 10 is added to the value of the fifth field (\$5) and printed. Note that the number prints in floating point.

##### Example 6.20
```
nawk '/southern/{print $8 + 10}' datafile
25
```

## EXPLANATION

If the record contains the regular expression southern, 10 is added to the value of the eighth field (\$8) and printed. Note that the number prints as an integer, because the result has no fractional part.

##### Example 6.21
```
nawk '/southern/{print $5 + 10.56}' datafile
15.66
```

## EXPLANATION

If the record contains the regular expression southern, 10.56 is added to the value of the fifth field (\$5) and printed.

##### Example 6.22
```
nawk '/southern/{print $8 - 10}' datafile
5
```

## EXPLANATION

If the record contains the regular expression southern, 10 is subtracted from the value of the eighth field (\$8) and printed.

##### Example 6.23
```
nawk '/southern/{print $8 / 2}' datafile
7.5
```

## EXPLANATION

If the record contains the regular expression southern, the value of the eighth field (\$8) is divided by 2 and printed.

##### Example 6.24
```
nawk '/northeast/{print $8 / 3}' datafile
4.33333
```

## EXPLANATION

If the record contains the regular expression northeast, the value of the eighth field (\$8) is divided by 3 and printed. By default, print shows six significant digits (the default output format, OFMT, is %.6g), which here gives five places to the right of the decimal point.
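
The number of digits shown can be changed with printf or by setting OFMT. The following is a small sketch assuming a POSIX `awk` in place of `nawk`; the shell variable names are illustrative only.

```shell
# Default formatting of a non-integer result: six significant digits.
default_out=$(awk 'BEGIN { print 13 / 3 }')
# printf gives explicit control over the number of decimal places.
two_places=$(awk 'BEGIN { printf("%.2f\n", 13 / 3) }')
# OFMT changes how print formats non-integer numbers.
three_places=$(awk 'BEGIN { OFMT = "%.3f"; print 13 / 3 }')
# Integer results print without a decimal point.
whole=$(awk 'BEGIN { print 15 + 10 }')

echo "$default_out $two_places $three_places $whole"
# prints: 4.33333 4.33 4.333 25
```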

##### Example 6.25
```
nawk '/southern/{print $8 * 2}' datafile
30
```

## EXPLANATION

If the record contains the regular expression southern, the eighth field (\$8) is multiplied by 2 and printed.

##### Example 6.26
```
nawk '/northeast/ {print $8 % 3}' datafile
1
```

## EXPLANATION

If the record contains the regular expression northeast, the eighth field (\$8) is divided by 3 and the remainder (modulus) is printed.

##### Example 6.27
```
nawk '$3 ~ /^Susan/ {print "Percentage: "$6 + .2 " Volume: " $8}' datafile
Percentage: 1.04 Volume: 20
```

## EXPLANATION

If the third field (\$3) begins with the string Susan, print displays the strings in double quotes, the sum of the sixth field (\$6) and .2, and the eighth field (\$8).
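
The addition is performed before the strings are joined because + binds more tightly than concatenation in awk. The sketch below, which assumes a POSIX `awk` in place of `nawk` and reuses the eastern record from the datafile, compares the original expression with a fully parenthesized one.

```shell
line='eastern EA Susan Beal 4.4 .84 5 20'

# As written in the example: no parentheses around the addition.
implicit=$(echo "$line" | awk '{ print "Percentage: " $6 + .2 " Volume: " $8 }')
# With the addition grouped explicitly; the result is the same.
explicit=$(echo "$line" | awk '{ print "Percentage: " ($6 + .2) " Volume: " $8 }')

echo "$implicit"   # prints: Percentage: 1.04 Volume: 20
```

Adding the parentheses does not change the output, but it makes the intent easier to read.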

#### 6.2.6 Range Operator

##### Example 6.28
```
nawk '/^western/,/^eastern/' datafile
western       WE     Sharon Kelly       5.3  .97  5   23
southwest     SW     Chris Foster       2.7  .8   2   18
southern      SO     May Chin           5.1  .95  4   15
southeast     SE     Derek Johnson      4.0  .7   4   17
eastern       EA     Susan Beal         4.4  .84  5   20
```

## EXPLANATION

All records within the range beginning with the regular expression western are printed until a record beginning with the expression eastern is found. Records will start being printed again if the pattern western is found and will continue to print until eastern or end of file is reached.

#### 6.2.7 Conditional Operator

##### Example 6.29
```
nawk '{print ($7 > 4 ? "high "$7 : "low "$7)}' datafile
low 3
high 5
low 2
low 4
low 4
high 5
low 3
high 5
high 5
```

## EXPLANATION

If the seventh field (\$7) is greater than 4, the print function gets the value of the expression after the question mark (the string high and the seventh field), else the print function gets the value of the expression after the colon (the string low and the value of the seventh field).

#### 6.2.8 Assignment Operators

##### Example 6.30
```
nawk '$3 == "Chris"{ $3 = "Christian"; print}' datafile
southwest SW Christian Foster 2.7 .8 2 18
```

## EXPLANATION

If the third field (\$3) is equal to the string Chris, the action is to assign Christian to the third field (\$3) and print the record. The double equal tests its operands for equality, whereas the single equal is used for assignment.
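
Notice that the output of this example has single spaces between fields, unlike the datafile. Assigning to a field causes awk to rebuild the record, joining the fields with the output field separator (a single space by default). A minimal sketch, assuming a POSIX `awk` in place of `nawk` and an input line modeled on the southwest record:

```shell
line='southwest     SW     Chris Foster      2.7   .8    2    18'

# No field is assigned, so the record keeps its original spacing.
unchanged=$(echo "$line" | awk '{ print }')
# Assigning to $3 rebuilds the record with single spaces between fields.
rebuilt=$(echo "$line" | awk '$3 == "Chris" { $3 = "Christian"; print }')

echo "$unchanged"
echo "$rebuilt"   # prints: southwest SW Christian Foster 2.7 .8 2 18
```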

##### Example 6.31
```
nawk '/Derek/{$8 += 12; print $8}' datafile
29
```

## EXPLANATION

If the regular expression Derek is found, 12 is added and assigned to (+=) the eighth field (\$8), and that value is printed. Another way to write this is: \$8 = \$8 + 12.

##### Example 6.32
```
nawk '{$7 %= 3; print $7}' datafile
0
2
2
1
1
2
0
2
2
```

## EXPLANATION

For each record, the seventh field (\$7) is divided by 3, and the remainder of that division (modulus) is assigned to the seventh field and printed.

### UNIX TOOLS LAB EXERCISE

#### Lab 4: awk Exercise

Mike Harrington:(510) 548-1278:250:100:175

Christian Dobbins:(408) 538-2358:155:90:201

Susan Dalsass:(206) 654-6279:250:60:50

Archie McNichol:(206) 548-1348:250:100:175

Jody Savage:(206) 548-1278:15:188:150

Guy Quigley:(916) 343-6410:250:100:175

Dan Savage:(406) 298-7744:450:300:275

Nancy McNeil:(206) 548-1278:250:80:75

John Goldenrod:(916) 348-4278:250:100:175

Chet Main:(510) 548-5258:50:95:135

Tom Savage:(408) 926-3456:250:168:200

Elizabeth Stachelin:(916) 440-1763:175:75:300

(Refer to the database called lab4.data on the CD.)

The database above contains the names, phone numbers, and money contributions to the party campaign for the past three months.

1. Print the first and last names of those who contributed over \$100 in the first month.
2. Print the names and phone numbers of those who contributed less than \$60 in the first month.
3. Print those who contributed between \$90 and \$150 in the third month.
4. Print those who contributed more than \$800 over the three-month period.
5. Print the names and phone numbers of those with an average monthly contribution greater than \$150.
6. Print the first name of those not in the 916 area code.
7. Print each record preceded by the number of the record.
8. Print the name and total contribution of each person.
9. Add \$10 to Elizabeth's second contribution.
10. Change Nancy McNeil's name to Louise McInnes.

[1] Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger, The AWK Programming Language (Boston: Addison-Wesley, 1988). © 1988 Bell Telephone Laboratories, Inc. Reprinted by permission of Pearson Education, Inc.

UNIX Shells by Example, 3rd Edition
ISBN: 013066538X
EAN: 2147483647
Year: 2001
Pages: 18
Authors: Ellie Quigley