Numeric constants can be represented as integers, such as 243, floating point numbers, such as 3.14, or numbers using scientific notation, such as .723E-1 or 3.4e7. Strings, such as "Hello world", are enclosed in double quotes.
Initialization and Type Coercion. Just mentioning a variable in your awk program causes it to exist. A variable can be a string, a number, or both. When it is set, it becomes the type of the expression on the right-hand side of the equal sign.
Uninitialized variables have the value zero or the null string, "", depending on the context in which they are used.
name = "Nancy"     name is a string
x++                x is a number; x is initialized to zero and incremented by 1
number = 35        number is a number
To coerce a string to be a number:
name + 0
To coerce a number to be a string:
number ""
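The two coercions above can be tried directly at the shell prompt. This is a minimal sketch with illustrative values; it assumes a POSIX awk is installed as awk (substitute nawk where that is the name on your system), and the variable names n, s, and kind are made up for the demonstration.

```shell
coerced=$(awk 'BEGIN {
    name = "35 apples"     # a string that happens to start with digits
    number = 35            # a number
    n = name + 0           # coerced to a number: the leading numeric part, 35
    s = number ""          # coerced to a string by concatenating the null string
    # Two strings are compared character by character, so "35" sorts before "4".
    kind = (s < "4") ? "string" : "number"
    print n, kind
}')
echo "$coerced"    # 35 string
```

Had s still been a number, the comparison 35 < 4 would have been false, which is one way to see that the concatenation really changed its type.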
All fields and array elements created by the split function are considered strings, unless they contain only a numeric value. If a field or array element is null, it has the string value of null. An empty line is also considered to be a null string.
User-defined variables consist of letters, digits, and underscores, and cannot begin with a digit. Variables in awk are not declared. Awk infers data type by the context of the variable in the expression. If the variable is not initialized, awk initializes string variables to null and numeric variables to zero. If necessary, awk will convert a string variable to a numeric variable, and vice versa. Variables are assigned values with awk's assignment operators. See Table 7.1.
Operator | Long Form | Shorthand |
---|---|---|
= | a = 5 | a = 5 |
+= | a = a + 5 | a += 5 |
-= | a = a - 5 | a -= 5 |
*= | a = a * 5 | a *= 5 |
/= | a = a / 5 | a /= 5 |
%= | a = a % 5 | a %= 5 |
^= | a = a ^ 5 | a ^= 5 |
The simplest assignment takes the result of an expression and assigns it to a variable.
FORMAT
variable = expression
% nawk '$1 ~ /Tom/ {wage = $2 * $3; print wage}' filename
EXPLANATION
Awk will scan the first field for Tom and when there is a match, it will multiply the value of the second field by the value of the third field and assign the result to the user-defined variable wage. Since the multiplication operation is arithmetic, awk assigns wage an initial value of zero. (The % is the UNIX prompt and filename is an input file.)
Increment and Decrement Operators. To add one to an operand, the increment operator is used. The expression x++ is equivalent to x = x + 1. Similarly, the decrement operator subtracts one from its operand. The expression x-- is equivalent to x = x - 1. This notation is useful in looping operations when you simply want to increment or decrement a counter. You can place the increment and decrement operators either before the operand, as in ++x, or after the operand, as in x++. If these expressions are used in assignment statements, their placement will make a difference in the result of the operation.
{x = 1; y = x++ ; print x, y}
The ++ here is called a post-increment operator; y is assigned the value of one, and then x is increased by one, so that when all is said and done, y will equal one, and x will equal two.
{x = 1; y = ++x; print x, y}
The ++ here is called a pre-increment operator; x is incremented first, and the value of two is assigned to y, so that when this statement is finished, y will equal two, and x will equal two.
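Both forms can be checked side by side in a BEGIN block, which needs no input file. A small sketch, assuming a POSIX awk installed as awk; the second pair of variables, a and b, is added here only so the two cases do not interfere.

```shell
incr=$(awk 'BEGIN {
    x = 1; y = x++     # post-increment: y is assigned 1, then x becomes 2
    a = 1; b = ++a     # pre-increment: a becomes 2, then b is assigned 2
    print x, y, a, b
}')
echo "$incr"    # 2 1 2 2
```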
User-Defined Variables at the Command Line. A variable can be assigned a value at the command line and passed into an awk script. For more on processing arguments and ARGV, the array of command line arguments, see "Processing Command Arguments (nawk)".
nawk -F: -f awkscript month=4 year=2001 filename
EXPLANATION
The user-defined variables month and year are assigned the values 4 and 2001, respectively. In the awk script, these variables may be used as though they were created in the script. Note: Each assignment takes effect only when awk reaches it in the argument list. The variables are therefore not available in BEGIN statements, and if filename precedes the assignments, they are not set while that file is being processed. (See "BEGIN Patterns".)
The -v Option (nawk). The -v option provided by nawk allows command line arguments to be processed within a BEGIN statement. For each argument passed at the command line, there must be a -v option preceding it.
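The difference between the two ways of passing variables shows up most clearly inside a BEGIN block. The sketch below reuses the names month and year from the example above with illustrative values, and assumes a POSIX awk installed as awk.

```shell
# With -v, the assignments are made before the BEGIN block runs.
with_v=$(awk -v month=4 -v year=2001 'BEGIN { print month "/" year }')

# Without -v, the assignments are ordinary operands that awk has not
# reached yet when BEGIN runs, so both variables are still null.
without_v=$(awk 'BEGIN { print month "/" year }' month=4 year=2001)

echo "$with_v"       # 4/2001
echo "$without_v"    # /
```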
Field Variables. Field variables can be used like user-defined variables, except they reference fields. New fields can be created by assignment. A field that is referenced and has no value will be assigned the null string. If a field value is changed, the $0 variable is recomputed using the current value of OFS as a field separator. Some older versions of awk limit the number of fields to 100.
% nawk ' { $5 = 1000 * $3 / $2; print } ' filename
EXPLANATION
If $5 does not exist, awk will create it and assign the result of the expression 1000 * $3 / $2 to the fifth field ($5). If the fifth field exists, the result will be assigned to it, overwriting what is there.
% nawk ' $4 == "CA" { $4 = "California"; print}' filename
EXPLANATION
If the fourth field ($4) is equal to the string CA, awk will reassign the fourth field to California. The double quotes are essential. Without them, the strings become user-defined variables with an initial value of null.
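The effect of field assignment on $0 and NF is easiest to see with a visible output field separator. A hedged sketch, assuming a POSIX awk installed as awk; the input record and the value assigned to $6 are invented for illustration.

```shell
expected=$(printf '6\nTom:40:20:California::new')

rebuilt=$(printf 'Tom 40 20 CA\n' | awk 'BEGIN { OFS = ":" }
    # Changing $4 rebuilds $0 with OFS; assigning $6 creates $5 as a null field.
    $4 == "CA" { $4 = "California"; $6 = "new" }
    { print NF; print $0 }')

echo "$rebuilt"
```

The record now has six fields, and the empty fifth field appears as two adjacent colons in the rebuilt $0.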
Built-In Variables. Built-in variables have uppercase names. They can be used in expressions and can be reset. See Table 7.2 for a list of built-in variables.
Variable Name | Variable Contents |
---|---|
ARGC | Number of command line arguments. |
ARGV | Array of command line arguments. |
FILENAME | Name of current input file. |
FNR | Record number in current file. |
FS | The input field separator, by default a space, which splits fields on runs of whitespace. |
NF | Number of fields in current record. |
NR | Number of records so far. |
OFMT | Output format for numbers. |
OFS | Output field separator. |
ORS | Output record separator. |
RLENGTH | Length of string matched by match function. |
RS | Input record separator. |
RSTART | Offset of string matched by match function. |
SUBSEP | Subscript separator. |
(The Employees Database)
% cat employees2
Tom Jones:4423:5/12/66:543354
Mary Adams:5346:11/4/63:28765
Sally Chang:1654:7/22/54:650000
Mary Black:1683:9/23/44:336500
(The Command Line)
% nawk -F: '$1 == "Mary Adams"{print NR, $1, $2, $NF}' employees2
(The Output)
2 Mary Adams 5346 28765
EXPLANATION
The -F option sets the field separator to a colon. The print function prints the record number, the first field, the second field, and the last field ($NF).
The BEGIN pattern is followed by an action block that is executed before awk processes any lines from the input file. In fact, a BEGIN block can be tested without any input file, since awk does not start reading input until the BEGIN action block has completed. The BEGIN action is often used to change the value of the built-in variables, OFS, RS, FS, and so forth, to assign initial values to user-defined variables and to print headers or titles as part of the output.
% nawk 'BEGIN{FS=":"; OFS="\t"; ORS="\n\n"}{print $1,$2,$3}' file
EXPLANATION
Before the input file is processed, the field separator (FS) is set to a colon, the output field separator (OFS) to a tab, and the output record separator (ORS) to two newlines. If there are two or more statements in the action block, they should be separated with semicolons or placed on separate lines (use a backslash to escape the newline character if at the shell prompt).
% nawk 'BEGIN{print "MAKE YEAR"}'
MAKE YEAR
EXPLANATION
Awk will display MAKE YEAR. The print function is executed before awk opens the input file, and even though the input file has not been assigned, awk will still print MAKE and YEAR. When debugging awk scripts, you can test the BEGIN block actions before writing the rest of the program.
END patterns do not match any input lines, but execute any actions that are associated with the END pattern. END patterns are handled after all lines of input have been processed.
% nawk 'END{print "The number of records is " NR }' filename
The number of records is 4
EXPLANATION
The END block is executed after awk has finished processing the file. The value of NR is the number of the last record read.
% nawk '/Mary/{count++}END{print "Mary was found " count " times."}'\
employees
Mary was found 2 times.
EXPLANATION
For every line that contains the pattern Mary, the value of the count variable is incremented by one. After awk has processed the entire file, the END block prints the string Mary was found, the value of count, and the string times.
When redirecting output from within awk to a UNIX file, the shell redirection operators are used. The filename must be enclosed in double quotes. When the > symbol is used, the file is opened and truncated. Once the file is opened, it remains open until explicitly closed or the awk program terminates. Output from subsequent print statements to that file will be appended to the file.
The >> symbol is used to open the file, but does not clear it out; instead it simply appends to it.
% nawk '$4 >= 70 {print $1, $2 > "passing_file" }' filename
EXPLANATION
If the value of the fourth field is greater than or equal to 70, the first and second fields will be printed to the file passing_file.
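The difference between > and >> can be seen by running two separate awk commands against the same file. This is a sketch under stated assumptions: a POSIX awk installed as awk, a scratch file created with mktemp standing in for passing_file, and made-up names and scores as input.

```shell
tmp=$(mktemp)    # hypothetical scratch file in place of "passing_file"

# First run: > truncates the file when it is first opened, and later
# print statements in the same run keep adding to it.
printf 'ann 90\nbob 50\ncal 75\n' |
    awk -v out="$tmp" '$2 >= 70 { print $1 > out }'

# Second run: >> leaves the existing contents alone and appends.
printf 'dee 80\n' |
    awk -v out="$tmp" '$2 >= 70 { print $1 >> out }'

passing=$(cat "$tmp")
rm -f "$tmp"
echo "$passing"    # ann, cal, and dee, one per line
```

Had the second command used > instead, the file would contain only dee.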
The getline Function. The getline function is used to read input from the standard input, a pipe, or a file other than the current file being processed. It gets the next line of input and sets the NF, NR, and the FNR built-in variables. The getline function returns one if a record is found and zero if EOF (end of file) is reached. If there is an error, such as failure to open a file, the getline function returns a value of -1.
% nawk 'BEGIN{ "date" | getline d; print d}' filename
Thu Jan 14 11:24:24 PST 2001
EXPLANATION
Awk will execute the UNIX date command, pipe the output to getline, assign it to the user-defined variable d, and then print d.
% nawk 'BEGIN{ "date " | getline d; split( d, mon) ; print mon[2]}'\
filename
Jan
EXPLANATION
Awk will execute the date command and pipe the output to getline. The getline function will read from the pipe and store the input in a user-defined variable, d. The split function will create an array called mon out of variable d and then the second element of the array mon will be printed.
% nawk 'BEGIN{while("ls" | getline) print}'
a.out
db
dbook
getdir
file
sortedf
EXPLANATION
Awk will send the output of the ls command to getline; for each iteration of the loop, getline will read one more line of the output from ls and then print it to the screen. An input file is not necessary, since the BEGIN block is processed before awk attempts to open input.
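(The Command Line)
1 % nawk 'BEGIN{ printf "What is your name?" ;\
      getline name < "/dev/tty"}\
2     $1 ~ name {print "Found " name " on line ", NR "."}\
3     END{print "See ya, " name "."}' filename
(The Output)
What is your name? Ellie     < Waits for input from user >
Found Ellie on line 5.
See ya, Ellie.
EXPLANATION
1. In the BEGIN block, printf displays the prompt What is your name? and getline reads the reply from the terminal, /dev/tty, storing it in the user-defined variable name.
2. If the first field matches the value stored in name, the print function displays the name and the number of the current record (NR).
3. After all input has been processed, the END block prints See ya, followed by the value of name and a period.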
(The Command Line)
% nawk 'BEGIN{while ((getline < "/etc/passwd") > 0) lc++; print lc}'\
file
(The Output)
16
EXPLANATION
Awk will read each line from the /etc/passwd file, increment lc until EOF is reached, then print the value of lc, which is the number of lines in the passwd file. The parentheses around the getline expression make it clear that the return value of getline, not the filename, is compared with zero.
Note: The value returned by getline is negative one if the file does not exist. If the end of file is reached, the return value is zero, and if a line was read, the return value is one. Therefore, the command while ( getline < "/etc/junk") would start an infinite loop if the file /etc/junk did not exist, since the return value of negative one yields a true condition. Testing explicitly for a value greater than zero avoids the problem.
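The three return values described in the note can be observed with a file that exists and one that does not. A minimal sketch, assuming a POSIX awk installed as awk; the scratch file comes from mktemp, and the path with .missing appended is a hypothetical name assumed not to exist.

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

counts=$(awk -v good="$tmp" -v bad="$tmp.missing" 'BEGIN {
    # Comparing with 0 ends the loop on end of file (0) and on error (-1).
    while ((getline line < good) > 0)
        n++
    close(good)
    status = (getline line < bad)    # the file cannot be opened, so -1
    print n + 0, status
}')

rm -f "$tmp"
echo "$counts"    # 3 -1
```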
If you open a pipe in an awk program, close it before opening another one; in older versions of awk, only one pipe can be open at a time. The command on the right-hand side of the pipe symbol is enclosed in double quotes.
(The Database)
% cat names
john smith
alice cheba
george goldberg
susan goldberg
tony tram
barbara nguyen
elizabeth lone
dan savage
eliza goldberg
john goldenrod
(The Command Line)
% nawk '{print $1, $2 | "sort -r +1 -2 +0 -1"}' names
(The Output)
tony tram
john smith
dan savage
barbara nguyen
elizabeth lone
john goldenrod
susan goldberg
george goldberg
eliza goldberg
alice cheba
EXPLANATION
Awk will pipe the output of the print statement as input to the UNIX sort command, which does a reversed sort using the second field as the primary key and the first field as the secondary key. The UNIX command must be enclosed in double quotes. (See "sort" in Appendix A.)
If you plan to use a file or pipe in an awk program again for reading or writing, you may want to close it first, since once opened it remains open until awk exits. Statements in the END block will therefore also be affected by the pipe. In the following script, the first line in the END block closes the pipe.
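(In Script)
1 { print $1, $2, $3 | "sort -r +1 -2 +0 -1" }
  END{
2    close("sort -r +1 -2 +0 -1")
     <rest of statements>
  }
EXPLANATION
1. Awk pipes each line of print output to the UNIX sort command.
2. When the END block is reached, the pipe is closed. The string passed to close must be identical, character for character, to the command string with which the pipe was opened.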
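Closing the pipe in the END block also fixes the order of the output, because awk waits for the command to finish before going on. The sketch below assumes a POSIX awk installed as awk and uses sort -k2,2, the POSIX key syntax, in place of the older +1 -2 form shown in the text; the three names are taken from the names file above.

```shell
sorted=$(printf 'john smith\nalice cheba\ntony tram\n' | awk '
    { print $1, $2 | "sort -k2,2" }    # sort on the second field
    END {
        close("sort -k2,2")            # same string that opened the pipe
        print "done sorting"           # appears after the sorted lines
    }')
echo "$sorted"
```

Without the close, the position of the final message relative to the sorted lines is not guaranteed.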
The system Function. The built-in system function takes a UNIX system command as its argument, executes the command, and returns the exit status to the awk program. It is similar to the C standard library function, also called system(). The UNIX command must be enclosed in double quotes.
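FORMAT
system( "UNIX Command")
(In Script)
{
1   system ( "cat " $1 )
2   system ( "clear" )
}
EXPLANATION
1. The system function is given the UNIX cat command followed by the value of the first field, a filename. Note the space inside the quotes after cat; without it, the command name and the filename would be joined into a single word. The shell executes the cat command.
2. The system function is given the UNIX clear command as its argument. The shell executes the command, clearing the screen.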
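The exit status returned by system can be captured in a variable and tested. A short sketch, assuming a POSIX awk installed as awk and the standard true and false utilities; because the exact nonzero value may differ between awk versions, the failing status is only tested for being nonzero.

```shell
result=$(awk 'BEGIN {
    ok  = system("true")       # a command that succeeds: status 0
    bad = system("false")      # a command that fails: nonzero status
    failed = (bad != 0)        # 1 if the second command failed
    print ok, failed
}')
echo "$result"    # 0 1
```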
% cat datafile
northwest   NW   Joel Craig      3.0   .98   3    4
western     WE   Sharon Kelly    5.3   .97   5   23
southwest   SW   Chris Foster    2.7   .8    2   18
southern    SO   May Chin        5.1   .95   4   15
southeast   SE   Derek Johnson   4.0   .7    4   17
eastern     EA   Susan Beal      4.4   .84   5   20
northeast   NE   TJ Nichols      5.1   .94   3   13
north       NO   Val Shultz      4.5   .89   5    9
central     CT   Sheri Watson    5.7   .94   5   13
% nawk '/^north/{count += 1; print count}' datafile
1
2
3
EXPLANATION
If the record begins with the regular expression north, a user-defined variable, count, is created; count is incremented by 1 and its value is printed.
% nawk '/^north/{count++; print count}' datafile
1
2
3
EXPLANATION
The auto-increment operator increments the user-defined variable count by 1. The value of count is printed.
% nawk '{x = $7--; print "x = "x ", $7 = "$7}' datafile
x = 3, $7 = 2
x = 5, $7 = 4
x = 2, $7 = 1
x = 4, $7 = 3
x = 4, $7 = 3
x = 5, $7 = 4
x = 3, $7 = 2
x = 5, $7 = 4
x = 5, $7 = 4
EXPLANATION
After the value of the seventh field ($7) is assigned to the user-defined variable x, the auto-decrement operator decrements the seventh field by one. The value of x and the seventh field are printed.
EXAMPLE 7.22
% nawk '/^north/{print "The record number is " NR}' datafile
The record number is 1
The record number is 7
The record number is 8
EXPLANATION
If the record begins with the regular expression north, the string The record number is and the value of NR (record number) are printed.
% nawk '{print NR, $0}' datafile
1 northwest NW Joel Craig 3.0 .98 3 4
2 western WE Sharon Kelly 5.3 .97 5 23
3 southwest SW Chris Foster 2.7 .8 2 18
4 southern SO May Chin 5.1 .95 4 15
5 southeast SE Derek Johnson 4.0 .7 4 17
6 eastern EA Susan Beal 4.4 .84 5 20
7 northeast NE TJ Nichols 5.1 .94 3 13
8 north NO Val Shultz 4.5 .89 5 9
9 central CT Sheri Watson 5.7 .94 5 13
EXPLANATION
The value of NR, the number of the current record, and the value of $0, the entire record, are printed.
% nawk 'NR==2,NR==5{print NR, $0}' datafile
2 western WE Sharon Kelly 5.3 .97 5 23
3 southwest SW Chris Foster 2.7 .8 2 18
4 southern SO May Chin 5.1 .95 4 15
5 southeast SE Derek Johnson 4.0 .7 4 17
EXPLANATION
If the value of NR is in the range between 2 and 5 (record numbers 2 through 5), the number of the record (NR) and the record ($0) are printed.
% nawk '/^north/{print NR, $1, $2, $NF, RS}' datafile
1 northwest NW 4

7 northeast NE 13

8 north NO 9

EXPLANATION
If the record begins with the regular expression north, the number of the record (NR), followed by the first field, the second field, the value of the last field (NF preceded by a dollar sign) and the value of RS (a newline) are printed. Since the print function generates a newline by default, RS will generate another newline, resulting in double spacing between records.
% cat datafile2
Joel Craig:northwest:NW:3.0:.98:3:4
Sharon Kelly:western:WE:5.3:.97:5:23
Chris Foster:southwest:SW:2.7:.8:2:18
May Chin:southern:SO:5.1:.95:4:15
Derek Johnson:southeast:SE:4.0:.7:4:17
Susan Beal:eastern:EA:4.4:.84:5:20
TJ Nichols:northeast:NE:5.1:.94:3:13
Val Shultz:north:NO:4.5:.89:5:9
Sheri Watson:central:CT:5.7:.94:5:13
% nawk -F: 'NR == 5{print NF}' datafile2
7
EXPLANATION
The field separator is set to a colon at the command line with the -F option. If the number of the record (NR) is 5, the number of fields (NF) is printed.
% nawk 'BEGIN{OFMT="%.2f";print 1.2456789,12E-2}' datafile2
1.25 0.12
EXPLANATION
The OFMT, output format variable for the print function, is set so that floating point numbers will be printed with a decimal-point precision of two digits. The numbers 1.2456789 and 12E-2 are printed in the new format.
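OFMT affects only numbers that have a fractional part; integer values are printed as integers whatever the setting. A small check, assuming a POSIX awk installed as awk; the value 17 is added here purely to show the integer case.

```shell
fmt=$(awk 'BEGIN { OFMT = "%.2f"; print 1.2456789, 12E-2, 17 }')
echo "$fmt"    # 1.25 0.12 17
```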
% nawk '{$9 = $6 * $7; print $9}' datafile
2.94
4.85
1.6
3.8
2.8
4.2
2.82
4.45
4.7
EXPLANATION
The result of multiplying the sixth field ($6) and the seventh field ($7) is stored in a new field, $9, and printed. There were eight fields; now there are nine.
% nawk '{$10 = 100; print NF, $9, $0}' datafile
10  northwest NW Joel Craig 3.0 .98 3 4  100
10  western WE Sharon Kelly 5.3 .97 5 23  100
10  southwest SW Chris Foster 2.7 .8 2 18  100
10  southern SO May Chin 5.1 .95 4 15  100
10  southeast SE Derek Johnson 4.0 .7 4 17  100
10  eastern EA Susan Beal 4.4 .84 5 20  100
10  northeast NE TJ Nichols 5.1 .94 3 13  100
10  north NO Val Shultz 4.5 .89 5 9  100
10  central CT Sheri Watson 5.7 .94 5 13  100
EXPLANATION
The tenth field ($10) is assigned 100 for each record. This is a new field. The ninth field ($9) does not exist, so it will be considered a null field. The number of fields is printed (NF), followed by the value of $9, the null field, and the entire record ($0). The value of the tenth field is 100.
% nawk 'BEGIN{print "---------EMPLOYEES---------"}'
---------EMPLOYEES---------
EXPLANATION
The BEGIN pattern is followed by an action block. The action is to print out the string ---------EMPLOYEES--------- before opening the input file. Note that an input file has not been provided and awk does not complain.
% nawk 'BEGIN{print "\t\t---------EMPLOYEES-------\n"}\
{print $0}' datafile
                ---------EMPLOYEES-------

northwest NW Joel Craig 3.0 .98 3 4
western WE Sharon Kelly 5.3 .97 5 23
southwest SW Chris Foster 2.7 .8 2 18
southern SO May Chin 5.1 .95 4 15
southeast SE Derek Johnson 4.0 .7 4 17
eastern EA Susan Beal 4.4 .84 5 20
northeast NE TJ Nichols 5.1 .94 3 13
north NO Val Shultz 4.5 .89 5 9
central CT Sheri Watson 5.7 .94 5 13
EXPLANATION
The BEGIN action block is executed first. The title ---------EMPLOYEES------- is printed. The second action block prints each record in the input file. When breaking lines, the backslash is used to suppress the carriage return. Lines can be broken at a semicolon or a curly brace.
% nawk 'BEGIN{ FS=":";OFS="\t"};/^Sharon/{print $1, $2, $7 }'\
datafile2
Sharon Kelly    western 23
EXPLANATION
The BEGIN action block is used to initialize variables. The FS variable (field separator) is assigned a colon. The OFS variable (output field separator) is assigned a tab (\t). If a record begins with the regular expression Sharon, the first, second, and seventh fields ($1, $2, $7) are printed. Each field in the output is separated by a tab. (The records in datafile2 have seven fields, so the seventh field is the last one.)
% nawk 'END{print "The total number of records is " NR}' datafile
The total number of records is 9
EXPLANATION
After awk has finished processing the input file, the statements in the END block are executed. The string The total number of records is is printed, followed by the value of NR, the number of the last record.
% nawk '/^north/{count++}END{print count}' datafile
3
EXPLANATION
If the record begins with the regular expression north, the user-defined variable count is incremented by one. When awk has finished processing the input file, the value stored in the variable count is printed.
# Second awk script -- awk.sc2
1 BEGIN{ FS=":"; OFS="\t"
      print " NAME\t\tDISTRICT\tQUANTITY"
      print "___________________________________________\n"
  }
2 {print $1"\t " $3"\t\t" $7}
  {total+=$7}
  /north/{count++}
3 END{
      print "---------------------------------------------"
      print "The total quantity is " total
      print "The number of northern salespersons is " count "."
  }
(The Output)
4 % nawk -f awk.sc2 datafile2
 NAME            DISTRICT        QUANTITY
___________________________________________

Joel Craig       NW              4
Sharon Kelly     WE              23
Chris Foster     SW              18
May Chin         SO              15
Derek Johnson    SE              17
Susan Beal       EA              20
TJ Nichols       NE              13
Val Shultz       NO              9
Sheri Watson     CT              13
---------------------------------------------
The total quantity is 132
The number of northern salespersons is 3.
EXPLANATION
1. The BEGIN block is executed first. The field separator (FS) is set to a colon, the output field separator (OFS) is set to a tab, and the headings are printed.
2. For each record, the first, third, and seventh fields are printed, the seventh field is added to the user-defined variable total, and if the record contains north, the variable count is incremented.
3. After all input has been processed, the END block prints the total quantity and the number of northern salespersons.
4. The script is executed at the command line with the -f option, using datafile2 as input.
% nawk '{printf "$%6.2f\n",$6 * 100}' datafile
$ 98.00
$ 97.00
$ 80.00
$ 95.00
$ 70.00
$ 84.00
$ 94.00
$ 89.00
$ 94.00
EXPLANATION
The printf function formats a floating point number to be right-justified (the default) in a field six characters wide, counting the decimal point and the two digits to the right of it. The number is rounded to two decimal places and printed.
% nawk '{printf "|%-15s|\n",$4}' datafile
|Craig          |
|Kelly          |
|Foster         |
|Chin           |
|Johnson        |
|Beal           |
|Nichols        |
|Shultz         |
|Watson         |
EXPLANATION
A left-justified, 15-space string is printed. The fourth field ($4) is printed enclosed in vertical bars to illustrate the spacing.
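The two conversions from the examples above can be combined in a single printf to compare left and right justification. A sketch assuming a POSIX awk installed as awk; the name Craig and the value .98 come from the first record of datafile, and the vertical bars only mark the edges of the fields.

```shell
line=$(awk 'BEGIN { printf "|%-15s|$%6.2f|\n", "Craig", .98 * 100 }')
echo "$line"    # Craig padded to 15 characters, 98.00 padded to 6
```

The string is padded on the right, the number on the left.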
% nawk '/north/{print $1, $3, $4 > "districts"}' datafile
% cat districts
northwest Joel Craig
northeast TJ Nichols
north Val Shultz
EXPLANATION
If the record contains the regular expression north, the first, third, and fourth fields ($1, $3, $4) are printed to an output file called districts. Once the file is opened, it remains open until closed or the program terminates. The filename "districts" must be enclosed in double quotes.
% nawk '/south/{print $1, $2, $3 >> "districts"}' datafile
% cat districts
northwest Joel Craig
northeast TJ Nichols
north Val Shultz
southwest SW Chris
southern SO May
southeast SE Derek
EXPLANATION
If the record contains the pattern south, the first, second, and third fields ($1, $2, $3) are appended to the output file districts.
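# awk script using pipes -- awk.sc3
1 BEGIN{
2     printf " %-22s%s\n", "NAME", "DISTRICT"
      print "--------------------------------------"
3 }
4 /west/{count++}
5 {printf "%s %s\t\t%-15s\n", $3, $4, $1 | "sort +1" }
6 END{
7     close("sort +1")
      printf "The number of sales persons in the western "
      printf "region is " count ".\n"}
(The Output)
% nawk -f awk.sc3 datafile
 NAME                  DISTRICT
--------------------------------------------------
Susan Beal             eastern
May Chin               southern
Joel Craig             northwest
Chris Foster           southwest
Derek Johnson          southeast
Sharon Kelly           western
TJ Nichols             northeast
Val Shultz             north
Sheri Watson           central
The number of sales persons in the western region is 3.
EXPLANATION
1. The BEGIN block is executed before any input is read.
2. The printf function prints the headings NAME and DISTRICT; NAME is left-justified in a 22-character field. The print function then prints a line of dashes.
3. The BEGIN block ends.
4. For each record that contains west, the user-defined variable count is incremented.
5. The third, fourth, and first fields are formatted by printf and piped to the UNIX sort command, which sorts the lines starting at their second field, the last name.
6. The END block is executed after all input has been processed.
7. The pipe is closed, so the sorted output appears before the final message. The argument to close must be enclosed in parentheses and must be identical to the command string that opened the pipe. The two printf statements then print the number of salespersons in the western region.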
Mike Harrington:(510) 548-1278:250:100:175
Christian Dobbins:(408) 538-2358:155:90:201
Susan Dalsass:(206) 654-6279:250:60:50
Archie McNichol:(206) 548-1348:250:100:175
Jody Savage:(206) 548-1278:15:188:150
Guy Quigley:(916) 343-6410:250:100:175
Dan Savage:(406) 298-7744:450:300:275
Nancy McNeil:(206) 548-1278:250:80:75
John Goldenrod:(916) 348-4278:250:100:175
Chet Main:(510) 548-5258:50:95:135
Tom Savage:(408) 926-3456:250:168:200
Elizabeth Stachelin:(916) 440-1763:175:75:300
(Refer to the database called lab5.data on the CD.)
The database above contains the names, phone numbers, and money contributions to the party campaign for the past three months.
Write a nawk script to produce the following output:
% nawk -f nawk.sc db
                 ***CAMPAIGN 1998 CONTRIBUTIONS***
----------------------------------------------------------------------
NAME                 PHONE             Jan  |   Feb  |   Mar  | Total Donated
----------------------------------------------------------------------
Mike Harrington      (510) 548-1278   250.00   100.00   175.00     525.00
Christian Dobbins    (408) 538-2358   155.00    90.00   201.00     446.00
Susan Dalsass        (206) 654-6279   250.00    60.00    50.00     360.00
Archie McNichol      (206) 548-1348   250.00   100.00   175.00     525.00
Jody Savage          (206) 548-1278    15.00   188.00   150.00     353.00
Guy Quigley          (916) 343-6410   250.00   100.00   175.00     525.00
Dan Savage           (406) 298-7744   450.00   300.00   275.00    1025.00
Nancy McNeil         (206) 548-1278   250.00    80.00    75.00     405.00
John Goldenrod       (916) 348-4278   250.00   100.00   175.00     525.00
Chet Main            (510) 548-5258    50.00    95.00   135.00     280.00
Tom Savage           (408) 926-3456   250.00   168.00   200.00     618.00
Elizabeth Stacheli   (916) 440-1763   175.00    75.00   300.00     550.00
----------------------------------------------------------------------
                          SUMMARY
----------------------------------------------------------------------
The campaign received a total of $6137.00 for this quarter.
The average donation for the 12 contributors was $511.42.
The highest contribution was $300.00.
The lowest contribution was $15.00.
The conditional statements in awk were borrowed from the C language. They are used to control the flow of the program in making decisions.
Statements beginning with the if construct are action statements. With conditional patterns, the if is implied; with a conditional action statement, the if is explicitly stated, and followed by an expression enclosed in parentheses. If the expression evaluates true (nonzero or non-null), the statement or block of statements following the expression is executed. If there is more than one statement following the conditional expression, the statements are separated either by semicolons or a newline, and the group of statements must be enclosed in curly braces so that the statements are executed as a block.
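FORMAT
if (expression) {
    statement; statement; ...
}
1 % nawk '{if ( $6 > 50 ) print $1 " Too high"}' filename
2 % nawk '{if ($6 > 20 && $6 <= 50){safe++; print "OK"}}' filename
EXPLANATION
1. In the if action block, the expression is tested. If the value of the sixth field ($6) is greater than 50, the print statement is executed. Since only one statement follows the expression, curly braces are not required.
2. If the value of the sixth field is greater than 20 and less than or equal to 50, the statements following the expression are executed as a block and must be enclosed in curly braces: the variable safe is incremented and OK is printed.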
The if/else statement allows a two-way decision. If the expression after the if keyword is true, the block of statements associated with that expression is executed. If the first expression evaluates to false or zero, the block of statements after the else keyword is executed. If multiple statements are to be included with the if or else, they must be blocked with curly braces.
FORMAT
{if (expression) {
    statement; statement; ...
 }
 else {
    statement; statement; ...
 }
}
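1 % nawk '{if( $6 > 50) print $1 " Too high" ;\
      else print "Range is OK"}' filename
2 % nawk '{if ( $6 > 50 ) { count++; print $3 } \
      else { x += 5; print $2 } }' filename
EXPLANATION
1. If the sixth field is greater than 50, the first field and the string Too high are printed; otherwise, Range is OK is printed. Each branch holds a single statement, so no curly braces are needed around it.
2. If the sixth field is greater than 50, count is incremented and the third field is printed; otherwise, 5 is added to x and the second field is printed. Because each branch holds more than one statement, each is enclosed in curly braces.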
The if/else and else if statements allow a multiway decision. If the expression following the keyword if is true, the block of statements associated with that expression is executed and control starts again after the last closing curly brace associated with the final else. Otherwise, control goes to the else if and that expression is tested. When the first else if condition is true, the statements following the expression are executed. If none of the conditional expressions test true, control goes to the else statements. The else is called the default action because if none of the other statements are true, the else block is executed.
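FORMAT
{if (expression) {
    statement; statement; ...
 }
 else if (expression) {
    statement; statement; ...
 }
 else if (expression) {
    statement; statement; ...
 }
 else {
    statement
 }
}
(In the Script)
1 {if ( $3 > 89 && $3 < 101 ) Agrade++
2  else if ( $3 > 79 ) Bgrade++
3  else if ( $3 > 69 ) Cgrade++
4  else if ( $3 > 59 ) Dgrade++
5  else Fgrade++
  }
  END{print "The number of failures is " Fgrade }
EXPLANATION
1. If the third field is greater than 89 and less than 101, Agrade is incremented and none of the remaining tests is made for this record.
2. Otherwise, if the third field is greater than 79, Bgrade is incremented.
3. Otherwise, if the third field is greater than 69, Cgrade is incremented.
4. Otherwise, if the third field is greater than 59, Dgrade is incremented.
5. If none of the expressions is true, the else, the default action, increments Fgrade. After all input has been processed, the END block prints the number of failures.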
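The grading chain above can be run against a few made-up scores to confirm that each record falls into exactly one branch. A sketch assuming a POSIX awk installed as awk; the five input records are invented, and adding 0 to each counter makes an unused counter print as 0 rather than as the null string.

```shell
grades=$(printf 'a x 95\nb x 84\nc x 40\nd x 72\ne x 12\n' | awk '
    {
        if ($3 > 89 && $3 < 101) Agrade++
        else if ($3 > 79) Bgrade++
        else if ($3 > 69) Cgrade++
        else if ($3 > 59) Dgrade++
        else Fgrade++                  # default: none of the tests was true
    }
    END {
        print "A=" Agrade + 0, "B=" Bgrade + 0, "C=" Cgrade + 0, \
              "D=" Dgrade + 0, "F=" Fgrade + 0
    }')
echo "$grades"    # A=1 B=1 C=1 D=0 F=2
```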
Loops are used to repeatedly execute the statements following the test expression if a condition is true. Loops are often used to iterate through the fields within a record and to loop through the elements of an array in the END block. Awk has three types of loops: the while loop, the for loop, and the special for loop, which will be discussed later when working with awk arrays.
The first step in using a while loop is to set a variable to an initial value. The value is then tested in the while expression. If the expression evaluates to true (nonzero), the body of the loop is entered and the statements within that body are executed. If there is more than one statement within the body of the loop, those statements must be enclosed in curly braces. Before ending the loop block, the variable controlling the loop expression must be updated or the loop will continue forever. In the following example, the variable is reinitialized each time a new record is processed.
The do/while loop is similar to the while loop, except that the expression is not tested until the body of the loop is executed at least once.
% nawk '{ i = 1; while ( i <= NF ) { print NF, $i ; i++ } }' filename
EXPLANATION
The variable i is initialized to one; while i is less than or equal to the number of fields (NF) in the record, the print statement will be executed, then i will be incremented by one. The expression will then be tested again, until the variable i is greater than the value of NF. The variable i is not reinitialized until awk starts processing the next record.
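The difference between the while loop and the do/while loop described earlier shows up when the test is false from the start. A minimal sketch, assuming a POSIX awk installed as awk; the counters w and d are hypothetical names that record how many times each loop body ran.

```shell
loops=$(awk 'BEGIN {
    i = 5
    while (i < 5) { w++; i++ }       # test is false at once: body never runs
    i = 5
    do { d++; i++ } while (i < 5)    # body runs once before the test is made
    print w + 0, d + 0
}')
echo "$loops"    # 0 1
```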
The for loop and while loop are essentially the same, except the for loop requires three expressions within the parentheses: the initialization expression, the test expression, and the expression to update the variables within the test expression. In awk, the first statement within the parentheses of the for loop can perform only one initialization. (In C, you can have multiple initializations separated by commas.)
% nawk '{ for( i = 1; i <= NF; i++) print NF,$i }' filex
EXPLANATION
The variable i is initialized to one and tested to see whether it is less than or equal to the number of fields (NF) in the record. If so, the print function prints the value of NF and the value of $i (the $ preceding the i yields the value of the ith field), then i is incremented by one. (Frequently the for loop is used with arrays in an END action to loop through the elements of an array.) See "Arrays".
break and continue Statements. The break statement lets you break out of a loop if a certain condition is true. The continue statement causes the loop to skip any statements that follow if a certain condition is true, and returns control to the top of the loop, starting at the next iteration.
(In the Script)
1   {for ( x = 3; x <= NF; x++ )
        if ( $x < 0 ){ print "Bottomed out!"; break}
        # breaks out of for loop
    }
2   {for ( x = 3; x <= NF; x++ )
        if ( $x == 0 ) { print "Get next item"; continue}
        # starts next iteration of the for loop
    }
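EXPLANATION
1. Starting with the third field, each field is tested. If the value of a field is less than zero, the message is printed and the break statement exits the for loop.
2. If the value of a field is zero, the message is printed and the continue statement passes control to the top of the loop, where x is incremented and the next field is tested.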
The next statement gets the next line of input from the input file, restarting execution at the top of the awk script.
(In Script) { if ($1 ~ /Peter/){next} else {print} }
EXPLANATION
If the first field contains Peter, awk skips over this line and gets the next line from the input file. The script resumes execution at the beginning.
The exit statement is used to terminate the awk program. It stops processing records, but does not skip over an END statement. If the exit statement is given a value between 0 and 255 as an argument (exit 1), this value can be printed at the command line to indicate success or failure by typing:
(In Script)
{exit (1) }
(The Command Line)
% echo $status (csh)
1
$ echo $? (sh/ksh)
1
EXPLANATION
An exit status of zero indicates success, and an exit value of nonzero indicates failure (a convention in UNIX). It is up to the programmer to provide the exit status in a program. The exit value returned in this example is 1.
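The interaction between exit and the END block can be checked with a short experiment. This is only a sketch, assuming a POSIX awk installed as awk and a Bourne-style shell; the three input lines are invented.

```shell
# exit stops record processing, but the END block still runs.
# The echo after || runs only because awk returns a nonzero status.
printf 'one\ntwo\nthree\n' |
awk 'NR == 2 { exit 3 } END { print "END ran after record", NR }' ||
echo "awk exited with status $?"
```

The third line is never read, the END block reports record 2, and the shell sees the status 3 that was passed to exit.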
Arrays in awk are called associative arrays because the subscripts can be either numbers or strings. The subscript is often called the key and is associated with the value assigned to the corresponding array element. The keys and values are stored internally in a table where a hashing algorithm is applied to the value of the key in question. Due to the techniques used for hashing, the array elements are not stored in a sequential order, and when the contents of the array are displayed, they may not be in the order you expected.
An array, like a variable, is created by using it, and awk can infer whether it is used to store numbers or strings. Array elements are initialized with numeric value zero and string value null, depending on the context. You do not have to declare the size of an awk array. Awk arrays are used to collect information from records and may be used for accumulating totals, counting words, tracking the number of times a pattern occurred, and so forth.
Using Variables as Array Indexes
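(The Input File)
% cat employees
Tom Jones 4424 5/12/66 543354
Mary Adams 5346 11/4/63 28765
Sally Chang 1654 7/22/54 650000
Billy Black 1683 9/23/44 336500
(The Command Line)
1 % nawk '{name[x++]=$2};END{for(i=0; i<NR; i++)\
  print i, name[i]}' employees
0 Jones
1 Adams
2 Chang
3 Black
2 % nawk '{id[NR]=$3};END{for(x = 1; x <= NR; x++)\
  print id[x]}' employees
4424
5346
1654
1683
EXPLANATION
1. The subscript of the name array is the user-defined variable x, which starts at zero and is incremented after each record, so the second field of each record is stored in name[0], name[1], and so on. In the END block, the for loop prints each subscript and the value stored there.
2. The subscript of the id array is NR, the number of the current record, and the value stored is the third field. In the END block, the for loop prints the elements id[1] through id[NR].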
The Special for Loop. The special for loop is used to read through an associative array in cases where the for loop is not practical; that is, when strings are used as subscripts or the subscripts are not consecutive numbers. The special for loop uses the subscript as a key into the value associated with it.
FORMAT{for(item in arrayname){ print arrayname[item] } } |
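(The Input File)
% cat db
Tom Jones
Mary Adams
Sally Chang
Billy Black
Tom Savage
Tom Chung
Reggie Steel
Tommy Tucker
(The Command Line, for Loop)
1 % nawk '/^Tom/{name[NR]=$1};\
  END{for( i = 1; i <= NR; i++ )print name[i]}' db
Tom
Tom
Tom
Tommy
(The Command Line, Special for Loop)
2 % nawk '/^Tom/{name[NR]=$1};\
  END{for(i in name){print name[i]}}' db
Tom
Tommy
Tom
Tom
EXPLANATION
1. For each line that begins with Tom, the first field is stored in the name array with NR as the subscript. Only records 1, 5, 6, and 8 match, so the subscripts are not consecutive. The ordinary for loop steps through every number from 1 to NR and therefore also visits subscripts that were never assigned, printing an empty line for each of them (the empty lines are not shown above).
2. The special for loop visits only the subscripts that exist in the array, so only the stored names are printed. The order in which they are printed is not guaranteed.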
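When an ordinary for loop must be used on an array with gaps, the in operator can test whether a subscript exists before the element is printed. The following is a hedged sketch with invented input, assuming awk is on the PATH.

```shell
# Only records 1 and 4 match, so name[2] and name[3] are never created.
# (i in name) is true only for subscripts that exist in the array.
printf 'Tom Jones\nMary Adams\nSally Chang\nTom Savage\n' |
awk '/^Tom/ { name[NR] = $2 }
     END { for (i = 1; i <= NR; i++) if (i in name) print i, name[i] }'
```

Testing with in has one more advantage: unlike a reference such as name[i], it does not create the element as a side effect.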
Using Strings as Array Subscripts. A subscript may consist of a variable containing a string or literal string. If the string is a literal, it must be enclosed in double quotes.
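(The Input File)
% cat datafile3
tom
mary
sean
tom
mary
mary
bob
mary
alex
(The Script)
# awk.sc script
1 /tom/ { count["tom"]++ }
2 /mary/ { count["mary"]++ }
3 END{print "There are " count["tom"] " Toms in the file and " count["mary"]" Marys in the file."}
(The Command Line)
% nawk -f awk.sc datafile3
There are 2 Toms in the file and 4 Marys in the file.
EXPLANATION
1. Each time a line contains tom, the element of the count array whose subscript is the literal string "tom" is incremented by one.
2. The same is done for mary, using the subscript "mary".
3. In the END block, the values stored in the two array elements are printed.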
Using Field Values as Array Subscripts. Any expression can be used as a subscript in an array. Therefore, fields can be used. The program in Example 7.52 counts the frequency of all names appearing in the second field and introduces a new form of the for loop.
(The Input File) % cat datafile4 4234 Tom 43 4567 Arch 45 2008 Eliza 65 4571 Tom 22 3298 Eliza 21 4622 Tom 53 2345 Mary 24 (The Command Line) % nawk '{count[$2]++}END{for(name in count)print name,count[name] }'\ datafile4 Tom 3 Arch 1 Eliza 2 Mary 1
EXPLANATION
The awk statement first will use the second field as an index in the count array. The index varies as the second field varies, thus the first index in the count array is Tom and the value stored in count["Tom"] is one. Next, count["Arch"] is set to one, count["Eliza"] to one, and count["Mary"] to one. When awk finds the next occurrence of Tom in the second field, count["Tom"] is incremented, now containing the value 2. The same thing happens for each occurrence of Arch, Eliza, and Mary.
for( index_value in array ) statement
The for loop found in the END block of the previous example works as follows: On each iteration, the variable name is set to the next index value of the count array and the print action is performed, first printing the value of the index, and then the value stored in that element. (The order of the printout is not guaranteed.)
(The Input File)
% cat datafile4
4234 Tom 43
4567 Arch 45
2008 Eliza 65
4571 Tom 22
3298 Eliza 21
4622 Tom 53
2345 Mary 24
(The Command Line)
% nawk '{dup[$2]++; if (dup[$2] > 1){name[$2]++ }}\
END{print "The duplicates were";\
for (i in name){print i, name[i]}}' datafile4
(The Output)
The duplicates were
Tom 2
Eliza 1
EXPLANATION
The subscript for the dup array is the value in the second field, that is, the name of a person. The value stored there is initially zero, and it is incremented by one each time a record containing that name is processed. If the name is a duplicate, the value stored for that subscript will go up to two, and so forth. Whenever the value in the dup array is greater than one, the element of a second array, name, with the same subscript is incremented, so name counts the repeat occurrences of each name. Tom appears three times, so name["Tom"] is 2; Eliza appears twice, so name["Eliza"] is 1.
Arrays and the split Function. Awk's built-in split function allows you to split a string into words and store them in an array. You can define the field separator or use the value currently stored in FS.
FORMAT
split(string, array, field separator)
split (string, array)
(The Command Line)
% nawk 'BEGIN{ split( "3/15/2001", date, "/");\
print "The month is " date[1] " and the year is " date[3] "."}' filename
(The Output)
The month is 3 and the year is 2001.
EXPLANATION
The string 3/15/2001 is split into the array date, using the forward slash as the field separator. Now date[1] contains 3, date[2] contains 15, and date[3] contains 2001. The field separator is specified in the third argument; if not specified, the value of FS is used as the separator.
The delete Function. The delete statement (often called the delete function) removes an array element. The element is written directly after the keyword, without enclosing parentheses.
% nawk '{line[x++]=$2}END{for(x in line) delete line[x]}'\
filename
EXPLANATION
The value assigned to the array line is the value of the second field. After all the records have been processed, the special for loop will go through each element of the array, and delete will in turn remove each element.
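Whether an element is really gone after delete can be checked with the in operator. This is a minimal sketch, assuming a POSIX awk installed as awk; the array line and the variables a_left and b_left are made up for the illustration.

```shell
awk 'BEGIN {
    line["a"] = 1; line["b"] = 2
    delete line["a"]                 # remove a single element
    a_left = ("a" in line)           # 0: the subscript no longer exists
    b_left = ("b" in line)           # 1: the other element is untouched
    print a_left, b_left
}'
```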
Multidimensional Arrays (nawk). Although awk does not officially support multidimensional arrays, a syntax is provided that gives the appearance of a multidimensional array. This is done by concatenating the indices into a string separated by the value of a special built-in variable, SUBSEP. The SUBSEP variable contains the value "\034", an unprintable character that is so unusual that it is unlikely to be found as an index character. The expression matrix[2,8] is really the array element matrix[2 SUBSEP 8], which evaluates to matrix["2\0348"]. The index becomes a unique string for an associative array.
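(The Input File)
1 2 3 4 5
2 3 4 5 6
6 7 8 9 10
(The Script)
1   {nf=NF
2   for(x = 1; x <= NF; x++ ){
3       matrix[NR, x] = $x
    }
    }
4   END { for (x=1; x <= NR; x++ ){
        for (y = 1; y <= nf; y++ )
            printf "%d ", matrix[x,y]
        printf"\n"
        }
    }
(The Output)
1 2 3 4 5
2 3 4 5 6
6 7 8 9 10
EXPLANATION
1. The variable nf is assigned the value of NF, the number of fields, so that it is still available in the END block. (The script assumes every record has the same number of fields.)
2. The for loop steps through each field of the current record.
3. The value of each field is stored in the matrix array. The two subscripts, NR (the row) and x (the column), are joined by SUBSEP to form a single index.
4. In the END block, the nested for loops print the elements of the array one row at a time, followed by a newline.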
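That a two-subscript index is really one string can be seen by splitting the key on SUBSEP. The following is a small sketch under the assumption of a POSIX awk installed as awk; idx is a hypothetical helper array.

```shell
awk 'BEGIN {
    matrix[2, 8] = "x"
    for (key in matrix) {            # key is the string 2 SUBSEP 8
        split(key, idx, SUBSEP)      # recover the two subscripts
        print idx[1], idx[2], matrix[key]
    }
}'
```

Splitting the key this way is the usual approach when a special for loop has to walk an array that was filled with two subscripts.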
ARGV. Command line arguments are available to nawk (the new version of awk) with the built-in array called ARGV. These arguments include the command nawk, but not any of the options passed to nawk. The index of the ARGV array starts at zero. (This works only for nawk.)
ARGC. ARGC is a built-in variable that contains the number of command line arguments.
(The Script)
# This script is called argvs
BEGIN{
    for ( i=0; i < ARGC; i++ ){
        printf("argv[%d] is %s\n", i, ARGV[i])
    }
    printf("The number of arguments, ARGC=%d\n", ARGC)
}
(The Output)
% nawk -f argvs datafile
argv[0] is nawk
argv[1] is datafile
The number of arguments, ARGC=2
EXPLANATION
In the for loop, i is set to zero, i is tested to see if it is less than the number of command line arguments (ARGC), and the printf function displays each argument encountered, in turn. When all of the arguments have been processed, the last printf statement outputs the number of arguments, ARGC. The example demonstrates that awk does not count command line options as arguments.
(The Command Line)
% nawk -f argvs datafile "Peter Pan" 12
argv[0] is nawk
argv[1] is datafile
argv[2] is Peter Pan
argv[3] is 12
The number of arguments, ARGC=4
EXPLANATION
As in the last example, each of the arguments is printed. The nawk command is considered the first argument, whereas the -f option and script name, argvs, are excluded.
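(The Datafile)
% cat datafile5
Tom Jones:123:03/14/56
Peter Pan:456:06/22/58
Joe Blow:145:12/12/78
Santa Ana:234:02/03/66
Ariel Jones:987:11/12/66
(The Script)
% cat arging.sc
# This script is called arging.sc
1 BEGIN{FS=":"; name=ARGV[2]
2 print "ARGV[2] is "ARGV[2]
  }
  $1 ~ name { print $0 }
(The Command Line)
% nawk -f arging.sc datafile5 "Peter Pan"
ARGV[2] is Peter Pan
Peter Pan:456:06/22/58
nawk: can't open Peter Pan
input record number 5, file Peter Pan
source line number 2
EXPLANATION
1. In the BEGIN block, the field separator is set to a colon and the variable name is assigned the value of ARGV[2], which is Peter Pan.
2. The value of ARGV[2] is printed. Awk then reads datafile5 and prints the record whose first field matches the value of name. When datafile5 has been processed, awk tries to open the next element of ARGV, Peter Pan, as an input file. Since there is no such file, the error message is displayed.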
(The Script)
% cat arging2.sc
BEGIN{FS=":"; name=ARGV[2]
    print "ARGV[2] is " ARGV[2]
    delete ARGV[2]
}
$1 ~ name { print $0 }
(The Command Line)
% nawk -f arging2.sc datafile5 "Peter Pan"
ARGV[2] is Peter Pan
Peter Pan:456:06/22/58
EXPLANATION
Awk treats the elements of the ARGV array as input files; after an argument is used, it is shifted to the left and the next one is processed, until the ARGV array is empty. If the argument is deleted immediately after it is used, it will not be processed as the next input file.
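An alternative that avoids touching ARGV is to hand the value to the script with the -v option, which POSIX awk provides for assigning a variable before the program starts. The sketch below uses invented input on standard input rather than datafile5, and assumes awk is on the PATH.

```shell
# name is set from the command line, so no argument is mistaken for a file.
printf 'Tom Jones:123\nPeter Pan:456\n' |
awk -F: -v name="Peter Pan" '$1 ~ name { print $0 }'
```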
The sub and gsub Functions. The sub function matches the regular expression for the largest and leftmost substring in the record, and then replaces that substring with the substitution string. If a target string is specified, the regular expression is matched for the largest and leftmost substring in the target string, and the substring is replaced with the substitution string. If a target string is not specified, the entire record is used.
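FORMAT
sub (regular expression, substitution string);
sub (regular expression, substitution string, target string)
1 % nawk '{sub(/Mac/, "MacIntosh"); print}' filename
2 % nawk '{sub(/Mac/, "MacIntosh", $1); print}' filename
EXPLANATION
1. The first time the regular expression Mac is matched in the record ($0), it is replaced with the string MacIntosh. The replacement is made only once per line.
2. The first time the regular expression Mac is matched in the first field, it is replaced with the string MacIntosh. The rest of the record is not searched.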
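FORMAT
gsub(regular expression, substitution string)
gsub(regular expression, substitution string, target string)
1 % nawk '{ gsub(/CA/, "California"); print }' datafile
2 % nawk '{ gsub(/[Tt]om/, "Thomas", $1 ); print }' filename
EXPLANATION
1. Every occurrence of the regular expression CA in the record ($0) is replaced with the string California.
2. Every occurrence of Tom or tom in the first field is replaced with the string Thomas.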
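Both functions return the number of substitutions they made, which gives a quick way to see the difference between them. This is a minimal sketch with an invented input line, assuming awk is on the PATH.

```shell
# gsub replaces every match; sub replaces only the first one.
echo 'CA to CA' | awk '{ n = gsub(/CA/, "California"); print n, $0 }'
echo 'CA to CA' | awk '{ n = sub(/CA/, "California"); print n, $0 }'
```

The first command reports 2 substitutions and the second reports 1, leaving the final CA in place.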
The index Function. The index function returns the first position where a substring is found in a string. Offset starts at position 1.
FORMAT
index(string, substring)
% nawk '{ print index("hollow", "low") }' filename
4
EXPLANATION
The number returned is the position where the substring low is found in hollow, with the offset starting at one.
The length Function. The length function returns the number of characters in a string. Without an argument, the length function returns the number of characters in a record.
FORMAT
length ( string )
length
% nawk '{ print length("hello") }' filename
5
EXPLANATION
The length function returns the number of characters in the string hello.
The substr Function. The substr function returns the substring of a string starting at a position where the first position is one. If the length of the substring is given, that part of the string is returned. If the specified length exceeds the number of characters remaining in the string, the rest of the string is returned.
FORMAT
substr(string, starting position)
substr(string, starting position, length of string)
% nawk ' { print substr("Santa Claus", 7, 6 )} ' filename
Claus
EXPLANATION
In the string Santa Claus, print the substring starting at position 7 with a length of 6 characters. Only five characters remain from position 7, so Claus is returned.
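The index, substr, and length functions are often combined, for example to cut a string at its first space. The following is a sketch only, assuming awk is on the PATH; s and p are variables invented for the illustration.

```shell
awk 'BEGIN {
    s = "Santa Claus"
    p = index(s, " ")                # position of the first space
    print substr(s, 1, p - 1)        # characters before the space
    print substr(s, p + 1)           # everything after the space
    print length(s)                  # total number of characters
}'
```

The three lines printed are Santa, Claus, and 11.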
The match Function. The match function returns the index where the regular expression is found in the string, or zero if not found. The match function sets the built-in variable RSTART to the starting position of the substring within the string, and RLENGTH to the length of the matched substring. These variables can be used with the substr function to extract the pattern. (Works only with nawk.)
FORMAT
match(string, regular expression)
% nawk 'END{start=match("Good ole USA", /[A-Z]+$/); print start}'\
filename
10
EXPLANATION
The regular expression /[A-Z]+$/ says search for consecutive uppercase letters at the end of the string. The substring USA is found starting at the tenth character of the string Good ole USA. If the string cannot be matched, 0 is returned.
1 % nawk 'END{start=match("Good ole USA", /[A-Z]+$/);\
  print RSTART, RLENGTH}' filename
10 3
2 % nawk 'BEGIN{ line="Good ole USA"}; \
  END{ match( line, /[A-Z]+$/);\
  print substr(line, RSTART,RLENGTH)}' filename
USA
EXPLANATION
1. The match function sets RSTART to 10, the position where USA starts, and RLENGTH to 3, the length of the matched substring. Both values are printed in the END block.
2. The string is assigned to the variable line in the BEGIN block. In the END block, match searches line, and the substr function uses RSTART and RLENGTH to extract the matched substring, USA.
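It is worth checking the return value of match before using RSTART and RLENGTH, because a failed match changes them too. The sketch below assumes a POSIX awk installed as awk, where a failed match sets RSTART to 0 and RLENGTH to -1.

```shell
awk 'BEGIN {
    s = "Good ole USA"
    if (match(s, /[A-Z]+$/))         # nonzero return means a match was found
        print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
    match(s, /[0-9]+/)               # no digits, so the match fails
    print RSTART, RLENGTH
}'
```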
The split Function. The split function splits a string into an array using whatever field separator is designated as the third parameter. If the third parameter is not provided, awk will use the current value of FS.
FORMAT
split (string, array, field separator)
split (string, array)
% awk 'BEGIN{split("12/25/2001",date,"/");print date[2]}' filename
25
EXPLANATION
The split function splits the string 12/25/2001 into an array, called date, using the forward slash as the separator. The array subscript starts at 1. The second element of the date array is printed.
The sprintf Function. The sprintf function returns an expression in a specified format. It allows you to apply the format specifications of the printf function.
FORMAT
variable=sprintf("string with format specifiers ", expr1, expr2, ... , exprn)
% awk '{line = sprintf ( "%-15s %6.2f ", $1 , $3 );\
print line}' filename
EXPLANATION
The first and third fields are formatted according to the printf specifications (a left-justified, 15-space string and a right-justified, 6-character floating point number). The result is assigned to the user-defined variable line. See "The printf Function".
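The effect of the two format specifiers is easier to see with visible delimiters around each field. This is a small sketch with made-up values, assuming awk is on the PATH; the vertical bars are added only to mark where each formatted field ends.

```shell
# %-15s pads the string on the right; %6.2f pads the number on the left.
awk 'BEGIN { line = sprintf("%-15s|%6.2f|", "Tom", 3.14159); print line; print length(line) }'
```

The formatted line is 23 characters long: 15 for the name, 6 for the number, and 2 for the bars.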
Table 7.3 lists the built-in arithmetic functions, where x and y are arbitrary expressions.
Name | Value Returned |
---|---|
atan2(y,x) | Arctangent of y/x in the range -π to π. |
cos(x) | Cosine of x, with x in radians. |
exp(x) | Exponential function of x, e^x. |
int(x) | Integer part of x; truncated toward 0 when x > 0. |
log(x) | Natural (base e) logarithm of x. |
rand( ) | Random number r, where 0 <= r < 1. |
sin(x) | Sine of x, with x in radians. |
sqrt(x) | Square root of x. |
srand(x) | x is a new seed for rand( ).[a] |
[a] From Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger, The AWK Programming Language (Boston: Addison-Wesley, 1988). © 1988 Bell Telephone Laboratories, Inc. Reprinted by permission of Pearson Education, Inc.
The int function truncates any digits to the right of the decimal point to create a whole number. There is no rounding off.
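1 % awk 'END{print 31/3}' filename
10.3333
2 % awk 'END{print int(31/3)}' filename
10
EXPLANATION
1. In the END block, the result of the division is printed as a floating point number.
2. The int function truncates the result of the division at the decimal point, so a whole number is displayed.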
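Because int only truncates, rounding has to be done by hand. A common idiom, shown here as a sketch that is valid for non-negative values only, is to add 0.5 before truncating; awk is assumed to be on the PATH.

```shell
# 32/3 is 10.6667: int() drops the fraction, adding 0.5 first rounds it.
# int(-3.7) shows that truncation is toward zero.
awk 'BEGIN { x = 32 / 3; print int(x), int(x + 0.5), int(-3.7) }'
```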
The rand Function. The rand function generates a pseudorandom floating point number greater than or equal to zero and less than one.
% nawk '{print rand()}' filename
0.513871
0.175726
0.308634
% nawk '{print rand()}' filename
0.513871
0.175726
0.308634
EXPLANATION
Each time the program runs, the same set of numbers is printed. The srand function can be used to seed the rand function with a new starting value. Otherwise, as in this example, the same sequence is repeated each time the program is run.
The srand Function. The srand function without an argument uses the time of day to generate the seed for the rand function. Srand(x) uses x as the seed. Normally, x should vary during the run of the program.
% nawk 'BEGIN{srand()};{print rand()}' filename
0.508744
0.639485
0.657277
% nawk 'BEGIN{srand()};{print rand()}' filename
0.133518
0.324747
0.691794
EXPLANATION
The srand function sets a new seed for rand. The starting point is the time of day. Each time the program is run, a new sequence of numbers is printed.
% nawk 'BEGIN{srand()};{print 1 + int(rand() * 25)}' filename
6
24
14
EXPLANATION
The srand function sets a new seed for rand. The starting point is the time of day. The rand function returns a number greater than or equal to 0 and less than 1; multiplying it by 25 and truncating with int gives a whole number from 0 through 24, and adding 1 yields a random integer from 1 through 25.
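For testing, a fixed seed makes the sequence repeatable. The sketch below, assuming a POSIX awk installed as awk, reseeds with the arbitrary value 42 and also checks that the 1 + int(rand() * 25) idiom stays within 1 through 25; the variable names are made up.

```shell
awk 'BEGIN {
    srand(42); first = rand()
    srand(42); again = rand()        # same seed, same sequence
    verdict = (first == again) ? "same" : "different"
    print verdict
    for (i = 1; i <= 1000; i++) {
        n = 1 + int(rand() * 25)
        if (n < 1 || n > 25) bad++
    }
    print "out of range:", bad + 0
}'
```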
A user-defined function can be placed anywhere in the script that a pattern action rule can.
FORMAT
function name ( parameter, parameter, parameter, ... ) {
    statements
    return expression
    (The return statement and expression are optional)
}
Variables are passed by value and are local to the function where they are used. Only copies of the variables are used. Arrays are passed by address or by reference, so array elements can be directly changed within the function. Any variable used within the function that has not been passed in the parameter list is considered a global variable; that is, it is visible to the entire awk program, and if changed in the function, is changed throughout the program. The only way to provide local variables within a function is to include them in the parameter list. Such parameters are usually placed at the end of the list. If the function call does not supply an argument for a formal parameter, the parameter is initially set to null. The return statement returns control and possibly a value to the caller.
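(The Command Line Display of grades File before Sort)
% cat grades
44 55 66 22 77 99
100 22 77 99 33 66
55 66 100 99 88 45
(The Script)
% cat sorter.sc
# Script is called sorter
# It sorts numbers in ascending order
1   function sort ( scores, num_elements, temp, i, j ) {
        # temp, i, and j will be local and private,
        # with an initial value of null.
2       for( i = 2; i <= num_elements ; ++i ) {
3           for ( j = i; scores[j-1] > scores[j]; j-- ){
                temp = scores[j]
                scores[j] = scores[j-1]
                scores[j-1] = temp
            }
4       }
5   }
6   {for ( i = 1; i <= NF; i++)
        grades[i]=$i
7   sort(grades, NF)    # Two arguments are passed
8   for( j = 1; j <= NF; ++j )
        printf( "%d ", grades[j] )
    printf("\n")
    }
(After the Sort)
% nawk -f sorter.sc grades
22 44 55 66 77 99
22 33 66 77 99 100
45 55 66 88 99 100
EXPLANATION
1. The function sort is defined. It takes an array, scores, and the number of elements, num_elements. The extra parameters temp, i, and j serve as local variables.
2. The outer for loop steps through the array, starting at the second element.
3. The inner for loop compares each element with the one before it and swaps the two, using temp, as long as the earlier element is larger.
4. The outer loop ends.
5. The function definition ends.
6. For each record, every field is copied into the grades array.
7. The sort function is called with two arguments, the grades array and NF. Since the array is passed by reference, it is sorted in place.
8. The sorted elements of the array are printed, followed by a newline.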
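The passing rules described above (scalars by value, arrays by reference, extra parameters as locals) can be demonstrated in a few lines. This is a hedged sketch, assuming a POSIX awk installed as awk; bump and its parameter names are hypothetical.

```shell
awk 'function bump(n, arr, tmp) {    # tmp is an extra parameter used as a local
    n++; arr[1]++; tmp = 99
}
BEGIN {
    n = 1; a[1] = 1; tmp = "global"
    bump(n, a)                       # only two arguments are passed
    print n, a[1], tmp
}'
```

The output is 1 2 global: the caller's n and tmp are unchanged, while the array element was modified through the reference.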
% cat datafile
northwest NW Joel Craig 3.0 .98 3 4
western WE Sharon Kelly 5.3 .97 5 23
southwest SW Chris Foster 2.7 .8 2 18
southern SO May Chin 5.1 .95 4 15
southeast SE Derek Johnson 4.0 .7 4 17
eastern EA Susan Beal 4.4 .84 5 20
northeast NE TJ Nichols 5.1 .94 3 13
north NO Val Shultz 4.5 .89 5 9
central CT Sheri Watson 5.7 .94 5 13
% nawk '{if ( $8 > 15 ){ print $3 " has a high rating"}\
else print $3 "---NOT A COMPETITOR---"}' datafile
Joel---NOT A COMPETITOR---
Sharon has a high rating
Chris has a high rating
May---NOT A COMPETITOR---
Derek has a high rating
Susan has a high rating
TJ---NOT A COMPETITOR---
Val---NOT A COMPETITOR---
Sheri---NOT A COMPETITOR---
EXPLANATION
The if statement is an action statement. If more than one statement follows the expression, the statements must be enclosed in curly braces. (Curly braces are not required in this example, since there is only one statement following the expression.) The expression reads: if the eighth field is greater than 15, print the third field and the string has a high rating; else print the third field and ---NOT A COMPETITOR---.
% nawk '{i=1; while(i<=NF && NR < 2){print $i; i++}}' datafile
northwest
NW
Joel
Craig
3.0
.98
3
4
EXPLANATION
The user-defined variable i is assigned 1. The while loop is entered and the expression tested. If the expression evaluates true, the print statement is executed and the value of the ith field is printed. Next, the value of i is incremented by 1 and the expression is tested again. The loop expression becomes false when the value of i is greater than NF or the value of NR is two or more, so only the fields of the first record are printed. The variable i will not be reinitialized until the next record is entered.
% nawk '{ for( i=3 ; i <= NF && NR == 3 ; i++ ){ print $i }}' datafile
Chris
Foster
2.7
.8
2
18
EXPLANATION
This is similar to the while loop in functionality. The initialization, test, and loop control statements are all in one expression. The value of i (i = 3) is initialized once for the current record. The expression is then tested. If i is less than or equal to NF, and NR is equal to 3, the print block is executed. After the value of the ith field is printed, control is returned to the loop expression. The value of i is incremented and the test is repeated.
(The Command Line)
% cat nawk.sc4
# Awk script illustrating arrays
BEGIN{OFS="\t"}
{ list[NR] = $1 }   # The array is called list. The index is the
                    # number of the current record. The value of the
                    # first field is assigned to the array element.
END{ for( n = 1; n <= NR; n++){
        print list[n]}   # for loop is used to loop
                         # through the array.
}
(The Command Line)
% nawk -f nawk.sc4 datafile
northwest
western
southwest
southern
southeast
eastern
northeast
north
central
EXPLANATION
The array, list, uses NR as an index value. Each time a line of input is processed, the first field is assigned to the list array. In the END block, the for loop iterates through each element of the array.
(The Command Line)
% cat nawk.sc5
# Awk script with special for loop
/north/{name[count++]=$3}
END{ print "The number living in a northern district: " count
     print "Their names are: "
     for ( i in name )     # Special nawk for loop is used to
        print name[i]      # iterate through the array.
}
% nawk -f nawk.sc5 datafile
The number living in a northern district: 3
Their names are:
Joel
TJ
Val
EXPLANATION
Each time the regular expression north is matched on a line, the value of the third field is assigned to an element of the name array. The index count is incremented each time a matching record is processed, thus producing another element in the array. In the END block, the special for loop is used to iterate through the array.
(The Command Line)
% cat nawk.sc6
# Awk and the special for loop
{region[$1]++}   # The index is the first field of each record
END{for(item in region){
        print region[item], item
    }
}
% nawk -f nawk.sc6 datafile
1 central
1 northwest
1 western
1 southeast
1 north
1 southern
1 northeast
1 southwest
1 eastern
% nawk -f nawk.sc6 datafile3
4 mary
2 tom
1 alex
1 bob
1 sean
EXPLANATION
The region array uses the first field as an index. The value stored is the number of times each region was found. The END block uses the special awk for loop to iterate through the array called region.
Mike Harrington:(510) 548-1278:250:100:175
Christian Dobbins:(408) 538-2358:155:90:201
Susan Dalsass:(206) 654-6279:250:60:50
Archie McNichol:(206) 548-1348:250:100:175
Jody Savage:(206) 548-1278:15:188:150
Guy Quigley:(916) 343-6410:250:100:175
Dan Savage:(406) 298-7744:450:300:275
Nancy McNeil:(206) 548-1278:250:80:75
John Goldenrod:(916) 348-4278:250:100:175
Chet Main:(510) 548-5258:50:95:135
Tom Savage:(408) 926-3456:250:168:200
Elizabeth Stachelin:(916) 440-1763:175:75:300
(Refer to the database called lab6.data on the CD.)
The database above contains the names, phone numbers, and money contributions to the party campaign for the past three months.
1. Write a nawk script that will produce the following report:
                 ***FIRST QUARTERLY REPORT***
              ***CAMPAIGN 1998 CONTRIBUTIONS***
----------------------------------------------------------------------
NAME                 PHONE             Jan  |   Feb  |   Mar  | Total Donated
----------------------------------------------------------------------
Mike Harrington      (510) 548-1278  250.00   100.00   175.00    525.00
Christian Dobbins    (408) 538-2358  155.00    90.00   201.00    446.00
Susan Dalsass        (206) 654-6279  250.00    60.00    50.00    360.00
Archie McNichol      (206) 548-1348  250.00   100.00   175.00    525.00
Jody Savage          (206) 548-1278   15.00   188.00   150.00    353.00
Guy Quigley          (916) 343-6410  250.00   100.00   175.00    525.00
Dan Savage           (406) 298-7744  450.00   300.00   275.00   1025.00
Nancy McNeil         (206) 548-1278  250.00    80.00    75.00    405.00
John Goldenrod       (916) 348-4278  250.00   100.00   175.00    525.00
Chet Main            (510) 548-5258   50.00    95.00   135.00    280.00
Tom Savage           (408) 926-3456  250.00   168.00   200.00    618.00
Elizabeth Stachelin  (916) 440-1763  175.00    75.00   300.00    550.00
----------------------------------------------------------------------
                            SUMMARY
----------------------------------------------------------------------
The campaign received a total of $6137.00 for this quarter.
The average donation for the 12 contributors was $511.42.
The highest total contribution was $1025.00 made by Dan Savage.
                        ***THANKS Dan***
The following people donated over $500 to the campaign.
They are eligible for the quarterly drawing!!
Listed are their names (sorted by last names) and phone numbers:
    John Goldenrod--(916) 348-4278
    Mike Harrington--(510) 548-1278
    Archie McNichol--(206) 548-1348
    Guy Quigley--(916) 343-6410
    Dan Savage--(406) 298-7744
    Tom Savage--(408) 926-3456
    Elizabeth Stachelin--(916) 440-1763
Thanks to all of you for your continued support!!
Some data (e.g., that read in from tape or from a spreadsheet) may not have obvious field separators but may instead have fixed-width columns. To preprocess this type of data, the substr function is useful.
In the following example, the fields are of a fixed width, but are not separated by a field separator. The substr function is used to create fields.
% cat fixed
031291ax5633(408)987-0124
021589bg2435(415)866-1345
122490de1237(916)933-1234
010187ax3458(408)264-2546
092491bd9923(415)134-8900
112990bg4567(803)234-1456
070489qr3455(415)899-1426
% nawk '{printf substr($0,1,6)" ";printf substr($0,7,6)" ";\
print substr($0,13,length)}' fixed
031291 ax5633 (408)987-0124
021589 bg2435 (415)866-1345
122490 de1237 (916)933-1234
010187 ax3458 (408)264-2546
092491 bd9923 (415)134-8900
112990 bg4567 (803)234-1456
070489 qr3455 (415)899-1426
EXPLANATION
The first field is obtained by getting the substring of the entire record, starting at the first character, for a length of 6 characters. Next, a space is printed. The second field is obtained by getting the substring of the record, starting at position 7, for a length of 6 characters, followed by a space. The last field is obtained by getting the substring of the entire record, starting at position 13 and running to the end of the line; the length given is the length of the whole line, which is more than enough to reach the end. (The length function returns the length of the current line, $0, if it does not have an argument.)
Empty Fields. If the data is stored in fixed-width fields, it is possible that some of the fields are empty. In the following example, the substr function is used to preserve the fields, regardless of whether they contain data.
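1 % cat db
xxx xxx xxx abc xxx xxx a bbb xxx xx
% cat awkfix
# Preserving empty fields. Field width is fixed.
{
2 f[1]=substr($0,1,3)
3 f[2]=substr($0,5,3)
4 f[3]=substr($0,9,3)
5 line=sprintf("%-4s%-4s%-4s\n", f[1],f[2], f[3])
6 print line
}
% nawk -f awkfix db
xxx xxx xxx abc xxx xxx a bbb xxx xx
EXPLANATION
1. The contents of the file db are displayed. The script treats each line as three fixed-width fields of three characters each, starting at positions 1, 5, and 9. (The original column alignment of the data is not preserved in this listing.)
2. The first element of the f array is assigned the three characters of the record starting at position 1.
3. The second element is assigned the three characters starting at position 5.
4. The third element is assigned the three characters starting at position 9.
5. The sprintf function formats the three elements as left-justified, four-character strings and assigns the result to the variable line. An empty field is still given its four spaces, so the columns stay aligned.
6. The value of line is printed.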
Numbers with $, Commas, or Other Characters. In the following example, the price field contains a dollar sign and comma. The script must eliminate these characters to add up the prices to get the total cost. This is done using the gsub function.
% cat vendor
access tech:gp237221:220:vax789:20/20:11/01/90:$1,043.00
alisa systems:bp262292:280:macintosh:new updates:06/30/91:$456.00
alisa systems:gp262345:260:vax8700:alisa talk:02/03/91:$1,598.50
apple computer:zx342567:240:macs:e mail:06/25/90:$575.75
caci:gp262313:280:sparc station:network11.5:05/12/91:$1,250.75
datalogics:bp132455:260:microvax2:pagestation maint:07/01/90:$1,200.00
dec:zx354612:220:microvax2:vms sms:07/20/90:$1,350.00
% nawk -F: '{gsub(/\$/,"");gsub(/,/,""); cost +=$7};\
END{print "The total is $" cost}' vendor
The total is $7474
EXPLANATION
The first gsub function globally substitutes the literal dollar sign (\$) with the null string, and the second gsub function substitutes commas with a null string. The user-defined cost variable is then totalled by adding the seventh field to cost and assigning the result back to cost. In the END block, the string The total is $ is printed, followed by the value of cost.[1]
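A variation on the same idea restricts the substitution to the price field and formats the total with two decimal places. This is a sketch only, using two invented records with the price in the second field and assuming awk is on the PATH.

```shell
# A bracket expression removes both characters in one gsub call;
# giving $2 as the target leaves the other fields untouched.
printf 'tech:$1,043.00\nalisa:$456.00\n' |
awk -F: '{ gsub(/[$,]/, "", $2); total += $2 } END { printf "The total is $%.2f\n", total }'
```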
The Bundle Program. In The AWK Programming Language, the program to bundle files together is very short and to the point. We are trying to combine several files into one file to save disk space, to send files through electronic mail, and so forth. The following awk command will print every line of each file, preceded with the filename.
% nawk '{ print FILENAME, $0 }' file1 file2 file3 > bundled
EXPLANATION
The name of the current input file, FILENAME, is printed, followed by the record ($0) for each line of input in file1. After file1 has reached the end of file, awk will open the next file, file2, and do the same thing, and so on. The output is redirected to a file called bundled.
Unbundle. The following example displays how to unbundle files, or put them back into separate files.
% nawk '$1 != previous { close(previous); previous=$1};\ {print substr($0, index($0, " ") + 1) > $1}' bundled
EXPLANATION
The first field is the name of the file. If the name of the file is not equal to the value of the user-defined variable previous (initially null), the action block is executed. The file assigned to previous is closed, and previous is assigned the value of the first field. Then the substr of the record, starting at the position returned from the index function (the position of the first space + 1), is redirected to the filename contained in the first field.
To bundle the files so that the filename appears on a line by itself, above the contents of the file, use the following command:
% nawk '{if(FNR==1){print FILENAME;print $0}\
else print $0}' file1 file2 file3 > bundled
The following command will unbundle the files:
% nawk 'NF==1{filename=$NF} ;\
NF != 1{print $0 > filename}' bundled
In the sample data files used so far, each record is on a line by itself. In the following sample datafile, called checkbook, the records are separated by blank lines and the fields are separated by newlines. To process this file, the record separator (RS) is assigned a value of null, and the field separator (FS) is assigned the newline.
(The Input File) % cat checkbook 1/1/01 #125 695.00 Mortgage 1/1/01 #126 56.89 PG&E 1/2/01 #127 89.99 Safeway 1/3/01 +750.00 Pay Check 1/4/01 #128 60.00 Visa (The Script) % cat awkchecker 1 BEGIN{RS=""; FS="\n";ORS="\n\n"} 2 {print NR, $1,$2,$3,$4} (The Output) % nawk f awkchecker checkbook 1 1/1/01 #125 695.00 Mortgage 2 1/1/01 #126 56.89 PG&E 3 1/2/01 #127 89.99 Safeway 4 1/3/01 +750.00 Pay Check 5 1/4/01 #128 60.00 Visa
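EXPLANATION
In the BEGIN block, the record separator (RS) is set to the null string, so records are separated by blank lines, and the field separator (FS) is set to a newline, so each line of a record is one field. The output record separator (ORS) is set to two newlines, so each output record is followed by a blank line. For each record, the record number (NR) and the first four fields are printed.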
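The effect of a null record separator and a newline field separator can be seen by counting the fields in each record. This is a minimal sketch assuming a POSIX awk installed as awk; the two records are taken from the checkbook sample and fed in through printf instead of a file.

```sh
# Two blank-line-separated records, one field per line.
# Assumption: "awk" is a POSIX awk (the chapter uses nawk).
out=$(printf '1/1/01\n#125\n695.00\nMortgage\n\n1/1/01\n#126\n56.89\nPG&E\n' |
  awk 'BEGIN { RS = ""; FS = "\n" }
       { print NR ": " NF " fields, payee = " $NF }')
printf '%s\n' "$out"
```

Each record reports four fields, and $NF picks up the payee on the last line of the record.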
The following example is modified from a program in The AWK Programming Language.[2] The tricky part of this is keeping track of what is actually being processed. The input file is called data.form. It contains just the data. Each field in the input file is separated by colons. The other file is called form.letter. It is the actual form that will be used to create the letter. This file is loaded into awk's memory with the getline function. Each line of the form letter is stored in an array. The program gets its data from data.form, and the letter is created by substituting real data for the special strings preceded by # and @ found in form.letter. A temporary variable, temp, holds the actual line that will be displayed after the data has been substituted. This program allows you to create personalized form letters for each person listed in data.form.
(The Awk Script)
% cat form.awk
# form.awk is an awk script that requires access to 2 files: The
# first file is called "form.letter." This file contains the
# format for a form letter. The awk script uses another file,
# "data.form," as its input file. This file contains the
# information that will be substituted into the form letters in
# the place of the numbers preceded by pound signs. Today's date
# is substituted in the place of "@date" in "form.letter."
BEGIN{ FS=":"; n=1
    while((getline < "form.letter") > 0)
        form[n++] = $0    # Store lines from form.letter in an array
    "date" | getline d; split(d, today, " ")
        # Output of date is Fri Mar 2 14:35:50 PST 2001
    thisday=today[2]". "today[3]", "today[6]
}
{ for( i = 1; i < n; i++ ){
      temp=form[i]
      for ( j = 1; j <=NF; j++ ){
          gsub("@date", thisday, temp)
          gsub("#" j, $j , temp )
      }
      print temp
  }
}

The form letter, form.letter, looks like this:

% cat form.letter
*********************************************************
Subject: Status Report for Project "#1"
To: #2
From: #3
Date: @date
This letter is to tell you, #2, that project "#1" is up to date.
We expect that everything will be completed and ready for
shipment as scheduled on #4.
Sincerely,
#3
**********************************************************

The file, data.form, is awk's input file containing the data that will replace #1 through #4 and the @date in form.letter.

% cat data.form
Dynamo:John Stevens:Dana Smith, Mgr:4/12/2001
Gallactius:Guy Sterling:Dana Smith, Mgr:5/18/2001

(The Command Line)
% nawk -f form.awk data.form
*********************************************************
Subject: Status Report for Project "Dynamo"
To: John Stevens
From: Dana Smith, Mgr
Date: Mar. 2, 2001
This letter is to tell you, John Stevens, that project "Dynamo" is up to date.
We expect that everything will be completed and ready for
shipment as scheduled on 4/12/2001.
Sincerely,
Dana Smith, Mgr
**********************************************************
*********************************************************
Subject: Status Report for Project "Gallactius"
To: Guy Sterling
From: Dana Smith, Mgr
Date: Mar. 2, 2001
This letter is to tell you, Guy Sterling, that project "Gallactius" is up to date.
We expect that everything will be completed and ready for
shipment as scheduled on 5/18/2001.
Sincerely,
Dana Smith, Mgr
**********************************************************
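EXPLANATION
In the BEGIN block, the field separator is set to a colon, each line of form.letter is read with getline and stored in the array form, and the output of the date command is split into the array today to build the string thisday. For each record in data.form, the main action copies each stored line into temp, replaces @date with thisday and each #j with the value of field j, and prints the result. One complete letter is printed for each input record.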
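The substitution step at the heart of the form letter program can be tried on a single line. This is a minimal sketch assuming a POSIX awk installed as awk; the one-line template and the two-field record are made up for illustration and stand in for form.letter and data.form.

```sh
# Replace #1, #2, ... in a template line with the matching input fields.
# Assumption: "awk" is a POSIX awk; the template text is hypothetical.
out=$(echo 'Dynamo:John Stevens' | awk -F: '{
    temp = "To: #2 (project #1)"
    for (j = 1; j <= NF; j++)
        gsub("#" j, $j, temp)    # "#" j builds the pattern #1, #2, ...
    print temp
}')
printf '%s\n' "$out"
```

Two limits of this approach are worth keeping in mind: an ampersand in a field stands for the matched text in the gsub replacement, and with ten or more fields the pattern #1 would also match the start of #10.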
Now that you have seen how awk works, you will find that awk is a very powerful utility when writing shell scripts. You can embed one-line awk commands or awk scripts within your shell scripts. The following is a sample of a Korn shell program embedded with awk commands.
#!/bin/ksh
# This korn shell script will collect data for awk to use in
# generating form letter(s). See above.
print "Hello $LOGNAME. "
print "This report is for the month and year:"
cal | nawk 'NR==1{print $0}'
if [[ -f data.form || -f formletter? ]]
then
    rm data.form formletter? 2> /dev/null
fi
integer num=1
while true
do
    print "Form letter #$num:"
    read project?"What is the name of the project? "
    read sender?"Who is the status report from? "
    read recipient?"Who is the status report to? "
    read due_date?"What is the completion date scheduled? "
    echo $project:$recipient:$sender:$due_date > data.form
    print -n "Do you wish to generate another form letter? "
    read answer
    if [[ "$answer" != [Yy]* ]]
    then
        break
    else
        nawk -f form.awk data.form > formletter$num
    fi
    (( num+=1 ))
done
nawk -f form.awk data.form > formletter$num
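EXPLANATION
The first line of output from the cal command is piped to nawk, which prints only the first record (NR == 1), the month and year. Inside the loop, the user's answers are written to data.form as colon-separated fields, the format form.awk expects. If the user asks for another letter, nawk runs form.awk on data.form and saves the letter in a file named formletter followed by the current value of num; otherwise the loop exits and the final letter is generated by the nawk command after done.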
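When awk is embedded in a shell script, shell variables can also be handed to awk directly instead of through a data file. The following is a minimal sketch assuming a POSIX awk installed as awk and a POSIX shell; the variable names p and r and the sample values are made up for illustration.

```sh
# Pass shell variables into awk with -v instead of writing a data file.
# Assumption: "awk" is a POSIX awk; p and r are hypothetical names.
project="Dynamo"
recipient="John Stevens"
out=$(awk -v p="$project" -v r="$recipient" \
    'BEGIN { printf "Project %s: report to %s\n", p, r }')
printf '%s\n' "$out"
```

Quoting "$project" and "$recipient" keeps values that contain spaces intact as single awk variables.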
% cat datafile
northwest    NW    Joel Craig       3.0   .98   3    4
western      WE    Sharon Kelly     5.3   .97   5   23
southwest    SW    Chris Foster     2.7   .8    2   18
southern     SO    May Chin         5.1   .95   4   15
southeast    SE    Derek Johnson    4.0   .7    4   17
eastern      EA    Susan Beal       4.4   .84   5   20
northeast    NE    TJ Nichols       5.1   .94   3   13
north        NO    Val Shultz       4.5   .89   5    9
central      CT    Sheri Watson     5.7   .94   5   13
% nawk 'NR==1{gsub(/northwest/,"southeast", $1) ;print}' datafile
southeast NW Joel Craig 3.0 .98 3 4

EXPLANATION
If this is the first record (NR == 1), globally substitute the regular expression northwest with southeast, if northwest is found in the first field, and then print the record.
% nawk 'NR==1{print substr($3, 1, 3)}' datafile
Joe

EXPLANATION
If this is the first record, display the substring of the third field, starting at the first character, and extracting a length of 3 characters. The substring Joe is printed.
% nawk 'NR==1{print length($1)}' datafile
9

EXPLANATION
If this is the first record, the length (number of characters) of the first field is printed.
% nawk 'NR==1{print index($1,"west")}' datafile
6

EXPLANATION
If this is the first record, print the first position where the substring west is found in the first field. The string west starts at the sixth position (index) in the string northwest.
% nawk '{if(match($1,/^no/)){print substr($1,RSTART,RLENGTH)}}'\
datafile
no
no
no

EXPLANATION
If the match function finds the regular expression /^no/ in the first field, the index position of the leftmost character is returned. The built-in variable RSTART is set to the index position and the RLENGTH variable is set to the length of the matched substring. The substr function returns the string in the first field starting at position RSTART, RLENGTH number of characters.
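The values that match leaves in RSTART and RLENGTH are easiest to see on input where the match does not start at the first character, and on input with no match at all. This is a minimal sketch assuming a POSIX awk installed as awk; the two input lines are made up for illustration.

```sh
# Show RSTART and RLENGTH for a successful and a failed match.
# Assumption: "awk" is a POSIX awk; the input lines are hypothetical.
out=$(printf 'order 4417 shipped\nno digits here\n' | awk '{
    if (match($0, /[0-9]+/))
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    else
        print "no match: RSTART=" RSTART, "RLENGTH=" RLENGTH
}')
printf '%s\n' "$out"
```

On a failed match, POSIX awk sets RSTART to 0 and RLENGTH to -1, which is why testing the return value of match before calling substr is the safer habit.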
% nawk 'BEGIN{split("10/14/01",now,"/");print now[1],now[2],now[3]}'
10 14 01

EXPLANATION
The string 10/14/01 is split into an array called now. The delimiter is the forward slash. The elements of the array are printed, starting at the first element of the array.
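The split function also returns the number of elements it created, which makes it possible to loop over the array without knowing its size in advance. A minimal sketch, assuming a POSIX awk installed as awk; the variable n is introduced here for illustration.

```sh
# Use the return value of split to walk the resulting array.
# Assumption: "awk" is a POSIX awk (the chapter uses nawk).
out=$(awk 'BEGIN {
    n = split("10/14/01", now, "/")    # n is the number of elements
    for (i = 1; i <= n; i++)
        printf "now[%d] = %s\n", i, now[i]
    print "elements:", n
}')
printf '%s\n' "$out"
```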
% cat datafile2
Joel Craig:northwest:NW:3.0:.98:3:4
Sharon Kelly:western:WE:5.3:.97:5:23
Chris Foster:southwest:SW:2.7:.8:2:18
May Chin:southern:SO:5.1:.95:4:15
Derek Johnson:southeast:SE:4.0:.7:4:17
Susan Beal:eastern:EA:4.4:.84:5:20
TJ Nichols:northeast:NE:5.1:.94:3:13
Val Shultz:north:NO:4.5:.89:5:9
Sheri Watson:central:CT:5.7:.94:5:13
% nawk -F: '/north/{split($1, name, " ");\
print "First name: "name[1];\
print "Last name: " name[2];\
print "\n--------------------"}' datafile2
First name: Joel
Last name: Craig

--------------------
First name: TJ
Last name: Nichols

--------------------
First name: Val
Last name: Shultz

--------------------

EXPLANATION
The input field separator is set to a colon (-F:). If the record contains the regular expression north, the first field is split into an array called name, where a space is the delimiter. The elements of the array are printed.
% cat datafile
northwest    NW    Joel Craig       3.0   .98   3    4
western      WE    Sharon Kelly     5.3   .97   5   23
southwest    SW    Chris Foster     2.7   .8    2   18
southern     SO    May Chin         5.1   .95   4   15
southeast    SE    Derek Johnson    4.0   .7    4   17
eastern      EA    Susan Beal       4.4   .84   5   20
northeast    NE    TJ Nichols       5.1   .94   3   13
north        NO    Val Shultz       4.5   .89   5    9
central      CT    Sheri Watson     5.7   .94   5   13
% nawk '{line=sprintf("%10.2f%5s\n",$7,$2); print line}' datafile
      3.00   NW

      5.00   WE

      2.00   SW

      4.00   SO

      4.00   SE

      5.00   EA

      3.00   NE

      5.00   NO

      5.00   CT

EXPLANATION
The sprintf function formats the seventh and the second fields ($7, $2) using the formatting conventions of the printf function. The formatted string is returned and assigned to the user-defined variable line and printed. Because the format string already ends in a newline and print appends another, each line of output is followed by a blank line.
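Because sprintf returns the formatted string instead of printing it, the result can be combined with other text before it is output. The following minimal sketch assumes a POSIX awk installed as awk; it formats the first record of datafile, leaves the newline out of the format so that print supplies the only one, and adds square brackets purely to make the padding visible.

```sh
# Format fields with sprintf, then wrap the result to show the padding.
# Assumption: "awk" is a POSIX awk; the brackets are illustrative only.
out=$(printf 'northwest NW Joel Craig 3.0 .98 3 4\n' |
  awk '{ line = sprintf("%10.2f%5s", $7, $2); print "[" line "]" }')
printf '%s\n' "$out"
```

The number is right-justified in a field 10 characters wide with two decimal places, and the string is right-justified in a field 5 characters wide.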
% cat argvs.sc
# Testing command line arguments with ARGV and ARGC using a for loop.
BEGIN{
    for(i=0;i < ARGC;i++)
        printf("argv[%d] is %s\n", i, ARGV[i])
    printf("The number of arguments, ARGC=%d\n", ARGC)
}

% nawk -f argvs.sc datafile
argv[0] is nawk
argv[1] is datafile
The number of arguments, ARGC=2

EXPLANATION
The BEGIN block contains a for loop to process the command line arguments. ARGC is the number of arguments and ARGV is an array that contains the actual arguments. Nawk does not count options as arguments. The only valid arguments in this example are the nawk command and the input file, datafile.
% nawk 'BEGIN{name=ARGV[1]};\
$0 ~ name {print $3 , $4}' "Derek" datafile
nawk: can't open Derek
source line number 1

% nawk 'BEGIN{name=ARGV[1]; delete ARGV[1]};\
$0 ~ name {print $3, $4}' "Derek" datafile
Derek Johnson
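EXPLANATION
In the first example, the BEGIN block assigns ARGV[1], the string Derek, to the variable name. Because Derek is still in the ARGV array when the BEGIN block finishes, nawk tries to open it as an input file and fails. In the second example, ARGV[1] is deleted after its value has been saved in name, so nawk opens only datafile and prints the third and fourth fields of each record that matches Derek.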
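Deleting an element of ARGV is one way to pass a search string on the command line; the -v option is another. The following is a minimal sketch comparing the two, assuming a POSIX awk installed as awk; the temporary file holds two records copied from datafile.

```sh
# Two ways to hand a search string to awk without it being opened as a file.
# Assumption: "awk" is a POSIX awk; the temp file stands in for datafile.
data=$(mktemp)
printf 'southeast SE Derek Johnson 4.0 .7 4 17\n' > "$data"
printf 'eastern EA Susan Beal 4.4 .84 5 20\n' >> "$data"

# Save ARGV[1], then delete it so awk does not try to open "Derek".
out_argv=$(awk 'BEGIN { name = ARGV[1]; delete ARGV[1] }
                $0 ~ name { print $3, $4 }' Derek "$data")

# Assign the variable before the program starts with -v.
out_v=$(awk -v name=Derek '$0 ~ name { print $3, $4 }' "$data")

rm -f "$data"
printf '%s\n%s\n' "$out_argv" "$out_v"
```

Both commands print the same name; the -v form keeps the operand list free of anything that is not an input file.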
% cat datafile
northwest    NW    Joel Craig       3.0   .98   3    4
western      WE    Sharon Kelly     5.3   .97   5   23
southwest    SW    Chris Foster     2.7   .8    2   18
southern     SO    May Chin         5.1   .95   4   15
southeast    SE    Derek Johnson    4.0   .7    4   17
eastern      EA    Susan Beal       4.4   .84   5   20
northeast    NE    TJ Nichols       5.1   .94   3   13
north        NO    Val Shultz       4.5   .89   5    9
central      CT    Sheri Watson     5.7   .94   5   13
% nawk 'BEGIN{ "date" | getline d; print d}' datafile
Mon Jan 15 11:24:24 PST 2001

EXPLANATION
The UNIX date command is piped to the getline function. The results are stored in the variable d and printed.
% nawk 'BEGIN{ "date " | getline d; split( d, mon) ;print mon[2]}'\
datafile
Jan

EXPLANATION
The UNIX date command is piped to the getline function and the results are stored in d. The split function splits the string d into an array called mon. The second element of the array is printed.
% nawk 'BEGIN{ printf "Who are you looking for?" ; \ getline name < "/dev/tty"};\
EXPLANATION
Input is read from the terminal, /dev/tty, and stored in the variable called name.
% nawk 'BEGIN{while((getline < "/etc/passwd") > 0 ){lc++}; print lc}'\
datafile
16

EXPLANATION
The while loop is used to loop through the /etc/passwd file one line at a time. Each time the loop is entered, a line is read by getline and the value of the variable lc is incremented. As long as the return value from getline is greater than 0, i.e., a line has been read, the looping continues. The getline function returns 0 at end of file and -1 if the file cannot be read. When the loop exits, the value of lc is printed, i.e., the number of lines in the /etc/passwd file.
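The reason for testing whether getline returned a value greater than 0, and not simply a nonzero value, shows up when the file cannot be opened. The following is a minimal sketch assuming a POSIX awk installed as awk; the temporary file and the path /nonexistent/file are made up for illustration, and the latter is assumed not to exist.

```sh
# Count lines with getline, then show the status for an unreadable file.
# Assumption: "awk" is a POSIX awk; /nonexistent/file is hypothetical.
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"
out=$(awk -v file="$tmp" 'BEGIN {
    while ((getline line < file) > 0)    # stops at 0 (EOF) or -1 (error)
        lc++
    close(file)
    status = (getline line < "/nonexistent/file")
    print "lines:", lc, "missing-file status:", status
}')
rm -f "$tmp"
printf '%s\n' "$out"
```

A loop written as while (getline line < file) would never end on a missing file, because -1 counts as true.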
% nawk '{if ( $5 > 4.5) next; print $1}' datafile
northwest
southwest
southeast
eastern
north

EXPLANATION
If the fifth field is greater than 4.5, the next line is read from the input file (datafile) and processing starts at the beginning of the awk script (after the BEGIN block). Otherwise, the first field is printed.
% nawk '{if ($2 ~ /S/){print ; exit 0}}' datafile
southwest SW Chris Foster 2.7 .8 2 18

% echo $status        (csh)  or  echo $?  (sh or ksh)
0

EXPLANATION
If the second field contains an S, the record is printed and the awk program exits. The C shell status variable contains the exit value. If using the Bourne or Korn shells, the $? variable contains the exit status.
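An exit statement in a main action stops the reading of input, but any END block still runs before awk terminates, so the END block is a convenient place to decide the final exit status. The following is a minimal sketch assuming a POSIX awk installed as awk and a POSIX shell; the three input records and the variable found are made up for illustration.

```sh
# Stop at the first matching record and report success through $?.
# Assumption: "awk" is a POSIX awk; the input records are hypothetical.
out=$(printf 'a 1\nb 2\nc 3\n' | awk '
    $2 == 2 { found = 1; exit }            # stop reading; END still runs
    END {
        if (found) print "stopped at record " NR
        exit !found                         # status 0 only if a match was found
    }')
status=$?
printf '%s\nexit status: %s\n' "$out" "$status"
```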
% cat datafile
northwest    NW    Joel Craig       3.0   .98   3    4
western      WE    Sharon Kelly     5.3   .97   5   23
southwest    SW    Chris Foster     2.7   .8    2   18
southern     SO    May Chin         5.1   .95   4   15
southeast    SE    Derek Johnson    4.0   .7    4   17
eastern      EA    Susan Beal       4.4   .84   5   20
northeast    NE    TJ Nichols       5.1   .94   3   13
north        NO    Val Shultz       4.5   .89   5    9
central      CT    Sheri Watson     5.7   .94   5   13
(The Command Line)
% cat nawk.sc7
BEGIN{largest=0}
{maximum=max($5)}
function max ( num ) {
    if ( num > largest){ largest=num }
    return largest
}
END{ print "The maximum is " maximum "."}

% nawk -f nawk.sc7 datafile
The maximum is 5.7.
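EXPLANATION
In the BEGIN block, the variable largest is initialized to 0. For each record, the fifth field is passed to the user-defined function max, and the value returned is assigned to maximum. The function max compares its parameter num with the global variable largest; if num is greater, largest is set to num. The function then returns largest. After all records have been read, the END block prints maximum, the largest value found in the fifth field.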
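A user-defined function does not have to rely on a global variable. The following minimal sketch, assuming a POSIX awk installed as awk, rewrites max to take both values as parameters; the two-parameter form and the three input values (taken from the fifth column of datafile) are illustrative choices, not the book's version.

```sh
# A max function that depends only on its two parameters.
# Assumption: "awk" is a POSIX awk; this variant is illustrative.
out=$(printf '3.0\n5.7\n4.4\n' | awk '
function max(a, b) { return a > b ? a : b }
NR == 1 { largest = $1 + 0 }               # seed with the first value
        { largest = max(largest, $1 + 0) } # + 0 forces a numeric comparison
END     { print "The maximum is " largest "." }')
printf '%s\n' "$out"
```

Seeding largest from the first record, instead of starting at 0, also gives the right answer when every value is negative.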
Mike Harrington:(510) 548-1278:250:100:175
Christian Dobbins:(408) 538-2358:155:90:201
Susan Dalsass:(206) 654-6279:250:60:50
Archie McNichol:(206) 548-1348:250:100:175
Jody Savage:(206) 548-1278:15:188:150
Guy Quigley:(916) 343-6410:250:100:175
Dan Savage:(406) 298-7744:450:300:275
Nancy McNeil:(206) 548-1278:250:80:75
John Goldenrod:(916) 348-4278:250:100:175
Chet Main:(510) 548-5258:50:95:135
Tom Savage:(408) 926-3456:250:168:200
Elizabeth Stachelin:(916) 440-1763:175:75:300
(Refer to the database called lab7.data on the CD.)
The database above contains the names, phone numbers, and money contributions to the party campaign for the past three months.
1. Write a user-defined function to return the average of all the contributions for a given month. The month will be passed in at the command line.
[1] For details on how commas are added back into the program, see Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger, The AWK Programming Language (Boston: Addison-Wesley, 1988), p. 72.
[2] Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger, The AWK Programming Language (Boston: Addison-Wesley, 1988). © 1988 Bell Telephone Laboratories, Inc. Reprinted by permission of Pearson Education, Inc.