11.11. Alphabetical Summary of awk Functions and Commands

The following alphabetical list of keywords and functions includes all that are available in POSIX awk and gawk. Extensions that aren't part of POSIX awk but that are in both gawk and the Bell Laboratories awk are marked as {E}. Cases where gawk has extensions are marked as {G}. Items that aren't marked with a symbol are available in all versions.

#

     # 

Ignore all text that follows on the same line. # is used in awk scripts as the comment character and is not really a command.

and

     and(expr1, expr2) {G} 

Return the bitwise AND of expr1 and expr2, which should be values that fit in a C unsigned long.

asort

     asort(src [,dest]) {G} 

Sort the array src based on the element values, destructively replacing the indices with values from one to the number of elements in the array. If dest is supplied, copy src to dest and sort dest, leaving src unchanged. Returns the number of elements in src.

asorti

     asorti(src [,dest]) {G} 

Like asort( ), but the sorting is done based on the indices in the array, not based on the element values. For gawk 3.1.2 and later.

atan2

     atan2(y, x) 

Return the arctangent of y/x in radians.
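Because awk has no built-in constant for pi, atan2 is often used to derive it. A minimal sketch; the shell function name awk_pi is only illustrative:

```shell
# Hypothetical helper: print pi to five decimal places using atan2().
awk_pi() {
    awk 'BEGIN {
        pi = atan2(0, -1)        # the angle of the point (-1, 0) is pi radians
        printf "%.5f\n", pi
    }'
}
awk_pi
```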

bindtextdomain

     bindtextdomain(dir [,domain]) {G} 

Look in directory dir for message translation files for text domain domain (default: value of TEXTDOMAIN). Returns the directory where domain is bound.

break

     break 

Exit from a while, for, or do loop.

close

     close(expr)
     close(expr, how) {G}

In most implementations of awk, you can only have up to ten files and one pipe open simultaneously. Therefore, POSIX awk provides a close( ) function that allows you to close a file or a pipe. It takes as its argument the same expression that opened the pipe or file. This expression must be identical, character by character, to the one that opened the file or pipe; even whitespace is significant.

In the second form, close one end of either a TCP/IP socket or a two-way pipe to a coprocess. how is a string, either "from" or "to". Case does not matter.
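As a rough illustration of the first form, the sketch below pipes records to sort and then closes the pipe with the identical command string, so that sort finishes before awk prints its own trailer line. The function name sorted_with_trailer is hypothetical, and a sort command supporting -n is assumed to be on the PATH:

```shell
# Hypothetical helper: sort stdin numerically, then print a trailer line.
sorted_with_trailer() {
    awk '
    { print $1 | "sort -n" }     # the first use opens the pipe to sort
    END {
        close("sort -n")         # same string as above; waits for sort to finish
        print "done"
    }'
}
printf '3\n1\n2\n' | sorted_with_trailer
```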

compl

     compl(expr) {G} 

Return the bitwise complement of expr, which should be a value that fits in a C unsigned long.

continue

     continue 

Begin next iteration of while, for, or do loop.

cos

     cos(x) 

Return the cosine of x, an angle in radians.

dcgettext

     dcgettext(str [, dom [,cat]]) {G} 

Return the translation of str for the text domain dom in message category cat. Default text domain is value of TEXTDOMAIN. Default category is "LC_MESSAGES".

dcngettext

     dcngettext(str1, str2, num [, dom [,cat]]) {G} 

If num is one, return the translation of str1 for the text domain dom in message category cat. Otherwise, return the translation of str2. Default text domain is value of TEXTDOMAIN. Default category is "LC_MESSAGES". For gawk 3.1.1 and later.

delete

     delete array[element]
     delete array {E}

Delete element from array. The brackets are typed literally. The second form is a common extension, which deletes all elements of the array in one shot.
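A short sketch of the first form, with a hypothetical function name delete_demo. It removes one element and then uses the in operator to confirm that the element is gone:

```shell
# Hypothetical helper: delete one element and report what is left.
delete_demo() {
    awk 'BEGIN {
        seen["a"] = 1; seen["b"] = 1; seen["c"] = 1
        delete seen["b"]
        n = 0
        for (k in seen) n++          # count the remaining elements
        print n, ("b" in seen)       # "in" tests membership without creating
    }'
}
delete_demo
```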

do

     do
          statement
     while (expr)

Looping statement. Execute statement, then evaluate expr and if true, execute statement again. A series of statements must be put within braces.

exit

     exit [expr] 

Exit from script, reading no new input. The END procedure, if it exists, will be executed. An optional expr becomes awk's return value.
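The following sketch shows both effects described above: the END procedure still runs after exit, and the expression becomes the exit status seen by the shell. The function name first_negative and the sample input are assumptions made for the example:

```shell
# Hypothetical helper: stop at the first negative value and exit with status 2.
first_negative() {
    awk '
    $1 < 0 { bad = NR; exit 2 }      # stop reading input; END still runs
    END    { if (bad) print "negative value on line " bad }'
}
printf '4\n-1\n7\n' | first_negative || echo "status: $?"
```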

exp

     exp(x) 

Return exponential of x (e^x).

extension

     extension(lib, init) {G} 

Dynamically load the shared object file lib, calling the function init to initialize it. Return the value returned by the init function. This function allows you to add new built-in functions to gawk. See Effective awk Programming (O'Reilly) for the details.

fflush

     fflush([output-expr]) {E} 

Flush any buffers associated with open output file or pipe output-expr.

gawk extends this function. If no output-expr is supplied, it flushes standard output. If output-expr is the null string (""), it flushes all open files and pipes.

for

     for (init-expr; test-expr; incr-expr)
          statement

C-style looping construct. init-expr assigns the initial value of a counter variable. test-expr is a relational expression that is evaluated each time before executing the statement. When test-expr is false, the loop is exited. incr-expr is used to increment the counter variable after each pass. All of the expressions are optional. A missing test-expr is considered to be true. A series of statements must be put within braces.

for

     for (item in array)
          statement

Special loop designed for reading associative arrays. For each element of the array, the statement is executed; the element can be referenced by array[item]. A series of statements must be put within braces.
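A typical use of this loop is a word-frequency count. In the sketch below (word_counts is a hypothetical name), the output is passed through sort because the order in which the loop visits array elements is not specified:

```shell
# Hypothetical helper: count how often each whitespace-separated word occurs.
word_counts() {
    awk '
    { for (i = 1; i <= NF; i++) count[$i]++ }
    END {
        for (word in count)              # traversal order is unspecified
            print word, count[word]
    }' | sort                            # sort to get stable output
}
printf 'a b a\nc a\n' | word_counts
```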

function

     function name(parameter-list) {
          statements
     }

Create name as a user-defined function consisting of awk statements that apply to the specified list of parameters. No space is allowed between name and the left parenthesis when the function is called.
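A small sketch of a user-defined function follows. The names max_of_fields and maxfield are made up for the example. It also relies on a common awk idiom that this entry does not mention: extra parameters that the caller omits serve as local variables.

```shell
# Hypothetical helper: print the largest numeric field on each input line.
max_of_fields() {
    awk '
    # The extra parameters i and m act as local variables; callers omit them.
    function maxfield(   i, m) {
        m = $1 + 0
        for (i = 2; i <= NF; i++)
            if ($i + 0 > m) m = $i + 0   # adding 0 forces a numeric comparison
        return m
    }
    { print maxfield() }'                # no space before "(" in the call
}
printf '3 9 4\n10 2 8\n' | max_of_fields
```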

gensub

     gensub(regex, str, how [, target]) {G} 

General substitution function. Substitute str for matches of the regular expression regex in the string target. If how is a number, replace the howth match. If it is "g" or "G", substitute globally. If target is not supplied, $0 is used. Return the new string value. The original target is not modified. (Compare with gsub and sub.) Use & in the replacement string to stand for the text matched by the pattern.

getline

     getline
     getline [var] [< file]
     command | getline [var]
     command |& getline [var] {G}

Read next line of input.

The second form reads input from file, and the third form reads the output of command. All forms read one record at a time, and each time the statement is executed, it gets the next record of input. The record is assigned to $0 and is parsed into fields, setting NF, NR and FNR. If var is specified, the result is assigned to var and $0 and NF are not changed. Thus, if the result is assigned to a variable, the current record does not change. getline is actually a function, and it returns 1 if it reads a record successfully, 0 if end-of-file is encountered, and -1 if for some reason it is otherwise unsuccessful.

The fourth form reads the output from coprocess command. See the section "Coprocesses and Sockets," for more information.
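The return value makes the third form convenient in a while loop. The sketch below (count_cmd_lines is a hypothetical name, and the echo commands are stand-ins for a real command) reads every line of a command's output into a variable and stops at end-of-file:

```shell
# Hypothetical helper: read a command's output line by line with getline.
count_cmd_lines() {
    awk 'BEGIN {
        cmd = "echo one; echo two"
        while ((cmd | getline line) > 0)   # 1 = record read, 0 = EOF, -1 = error
            n++
        close(cmd)
        print n, line                      # line still holds the last record read
    }'
}
count_cmd_lines
```

Comparing the result with > 0 rather than testing it for truth avoids an endless loop when getline returns -1.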

gsub

     gsub(regex, str [, target]) 

Globally substitute str for each match of the regular expression regex in the string target. If target is not supplied, defaults to $0. Return the number of substitutions. Use & in the replacement string to stand for the text matched by the pattern.
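For instance, this sketch wraps every vowel in brackets by using & in the replacement, then prints the substitution count returned by gsub. The function name bracket_vowels is only illustrative:

```shell
# Hypothetical helper: bracket each vowel in $0 and report how many were changed.
bracket_vowels() {
    awk '{ n = gsub(/[aeiou]/, "[&]"); print n ": " $0 }'
}
echo "hello world" | bracket_vowels
```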

if

     if (condition)
          statement1
     [else
          statement2]

If condition is true, do statement1; otherwise do statement2 in optional else clause. The condition can be an expression using any of the relational operators <, <=, ==, !=, >=, or >, as well as the array membership operator in, and the pattern-matching operators ~ and !~ (e.g., if ($1 ~ /[Aa].*/)). A series of statements must be put within braces. Another if can directly follow an else in order to produce a chain of tests or decisions.

index

     index(str, substr) 

Return the position (starting at 1) of substr in str, or zero if substr is not present in str.

int

     int(x) 

Return integer value of x by truncating any fractional part.
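Truncation means rounding toward zero, which matters for negative values. A brief sketch, using a hypothetical function name int_demo:

```shell
# Hypothetical helper: show that int() truncates toward zero.
int_demo() {
    awk 'BEGIN { print int(3.7), int(-3.7), int("42abc") }'
}
int_demo
```

The last call illustrates that a string argument is first converted to a number using its leading numeric prefix.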

length

     length([arg]) 

Return length of arg, or the length of $0 if no argument.

log

     log(x) 

Return the natural logarithm (base e) of x.

lshift

     lshift(expr, count) {G} 

Return the result of shifting expr left by count bits. Both expr and count should be values that fit in a C unsigned long.

match

     match(str, regex)
     match(str, regex [, array]) {G}

Function that matches the pattern, specified by the regular expression regex, in the string str and returns either the position in str where the match begins, or 0 if no occurrences are found. Sets the values of RSTART and RLENGTH to the start and length of the match, respectively.

If array is provided, gawk puts the text that matched the entire regular expression in array[0], the text that matched the first parenthesized subexpression in array[1], the second in array[2], and so on.
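The two-argument form is commonly paired with substr to extract the matched text. A minimal sketch, with first_number as a hypothetical name:

```shell
# Hypothetical helper: locate the first run of digits on each line.
first_number() {
    awk '{
        if (match($0, /[0-9]+/))
            print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
        else
            print "no match"
    }'
}
echo "order 1234 shipped" | first_number
```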

mktime

     mktime(timespec) {G} 

Turns timespec (a string of the form YYYY MM DD HH MM SS[DST] representing a local time) into a time-of-day value in seconds since midnight, January 1, 1970, UTC.

next

     next 

Read next input line and start new cycle through pattern/procedures statements.

nextfile

     nextfile {E} 

Stop processing the current input file and start new cycle through pattern/procedures statements, beginning with the first record of the next file.

or

     or(expr1, expr2) {G} 

Return the bitwise OR of expr1 and expr2, which should be values that fit in a C unsigned long.

print

     print [ output-expr[ , ...]] [ dest-expr ] 

Evaluate the output-expr and direct it to standard output followed by the value of ORS. Each comma-separated output-expr is separated in the output by the value of OFS. With no output-expr, print $0. The output may be redirected to a file or pipe via the dest-expr, which is described in "Output Redirections," later in this chapter.
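The sketch below contrasts comma-separated expressions, which are joined with OFS, with adjacent expressions, which are simply concatenated. The function name join_fields and the chosen separator values are assumptions for the example:

```shell
# Hypothetical helper: show the effect of OFS and ORS on print.
join_fields() {
    awk 'BEGIN { OFS = "-"; ORS = ";\n" }
         { print $1, $2      # comma: fields are joined with OFS
           print $1 $2 }     # no comma: plain string concatenation'
}
echo "a b" | join_fields
```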

printf

     printf(format [, expr-list ]) [ dest-expr ] 

An alternative output statement borrowed from the C language. It has the ability to produce formatted output. It can also be used to output data without automatically producing a newline. format is a string of format specifications and constants. expr-list is a list of arguments corresponding to format specifiers. As for print, output may be redirected to a file or pipe. See "printf Formats," later in this chapter, for a description of allowed format specifiers.

Like any string, format can also contain embedded escape sequences: \n (newline) or \t (tab) being the most common. Spaces and literal text can be placed in the format argument by quoting the entire argument. If there are multiple expressions to be printed, there should be multiple formats specified.

Examples

Using the script:

          { printf("The sum on line %d is %.0f.\n", NR, $1+$2) } 

The following input line:

          5   5 

produces this output, followed by a newline:

     The sum on line 1 is 10. 

rand

     rand(  ) 

Generate a random number between 0 and 1. This function returns the same series of numbers each time the script is executed, unless the random number generator is seeded using srand( ).

return

     return [expr] 

Used within a user-defined function to exit the function, returning the value of expr. The return value of a function is undefined if expr is not provided.

rshift

     rshift(expr, count) {G} 

Return the result of shifting expr right by count bits. Both expr and count should be values that fit in a C unsigned long.

sin

     sin(x) 

Return the sine of x, an angle in radians.

split

     split(string, array [, sep]) 

Split string into elements of array array[1],...,array[n]. Return the number of array elements created. The string is split at each occurrence of separator sep. If sep is not specified, FS is used.
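As a quick illustration, this sketch breaks a date string into its parts. The function name split_date and the sample string are hypothetical:

```shell
# Hypothetical helper: split a date on "-" and print the count and the pieces.
split_date() {
    awk 'BEGIN {
        n = split("2004-05-17", d, "-")
        print n, d[1], d[2], d[3]
    }'
}
split_date
```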

sprintf

     sprintf(format [, expressions]) 

Return the formatted value of one or more expressions, using the specified format. Data is formatted but not printed. See "printf Formats," later in this chapter, for a description of allowed format specifiers.

sqrt

     sqrt(arg) 

Return the square root of arg.

srand

     srand([expr]) 

Use optional expr to set a new seed for the random number generator. Default is the time of day. Return value is the old seed.
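A common pattern combines srand with rand and int to produce a random integer in a range. In this sketch the function name roll_die is hypothetical, and a fixed seed is passed in so that the result is repeatable. The exact value produced depends on the awk implementation.

```shell
# Hypothetical helper: simulate one roll of a six-sided die for a given seed.
roll_die() {
    awk -v seed="$1" 'BEGIN {
        srand(seed)                   # a fixed seed gives a repeatable sequence
        print int(rand() * 6) + 1     # integer in the range 1..6
    }'
}
roll_die 42
```

Calling srand( ) with no argument instead seeds from the time of day, so successive runs would usually differ.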

strftime

     strftime([format [,timestamp]]) {G} 

Format timestamp according to format. Return the formatted string. The timestamp is a time-of-day value in seconds since midnight, January 1, 1970, UTC. The format string is similar to that of sprintf. If timestamp is omitted, it defaults to the current time. If format is omitted, it defaults to a value that produces output similar to that of the Unix date command. See the date entry in Chapter 3 for a list.

strtonum

     strtonum(expr) {G} 

Return the numeric value of expr, which is a string representing an octal, decimal, or hexadecimal number in the usual C notations. Use this function for processing nondecimal input data.

sub

     sub(regex, str [, target]) 

Substitute str for first match of the regular expression regex in the string target. If target is not supplied, defaults to $0. Returns 1 if successful, 0 otherwise. Use & in the replacement string to stand for the text matched by the pattern.

substr

     substr(string, beg [, len]) 

Return substring of string at beginning position beg (counting from 1), and the characters that follow to maximum specified length len. If no length is given, use the rest of the string.
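substr is often combined with index to cut a string at a delimiter. A minimal sketch, with a hypothetical function name split_at_colon, that separates the text before the first colon from the text after it:

```shell
# Hypothetical helper: split each line at its first ":".
split_at_colon() {
    awk '{
        p = index($0, ":")            # position of the first colon, or 0
        if (p) print substr($0, 1, p - 1) "|" substr($0, p + 1)
    }'
}
echo "key:value:more" | split_at_colon
```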

system

     system(command) 

Function that executes the specified command and returns its exit status. The status of the executed command typically indicates success or failure. A value of 0 means that the command executed successfully. A nonzero value indicates a failure of some sort. The documentation for the command you're running will give you the details.

awk does not make the output of the command available for processing within the awk script. Use command | getline to read the output of a command into the script.
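The sketch below checks the returned status of two commands. The function name check_dirs is hypothetical, and the second path is assumed not to exist on the system running the example:

```shell
# Hypothetical helper: run two commands and report their exit statuses.
check_dirs() {
    awk 'BEGIN {
        ok  = system("test -d /")
        bad = system("test -d /no/such/dir/for/this/example")
        print ok, (bad != 0)          # 0 means success; nonzero means failure
    }'
}
check_dirs
```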

systime

     systime(  ) {G} 

Return a time-of-day value in seconds since midnight, January 1, 1970, UTC.

Examples

Log the start and end times of a data-processing program:

     BEGIN {
             now = systime(  )
             mesg = strftime("Started at %m/%d/%Y %H:%M:%S", now)
             print mesg
     }
     process data ...
     END {
             now = systime(  )
             mesg = strftime("Ended at %m/%d/%Y %H:%M:%S", now)
             print mesg
     }

tolower

     tolower(str) 

Translate all uppercase characters in str to lowercase and return the new string.[*]

[*] Very early versions of nawk don't support tolower( ) and toupper( ). However, they are now part of the POSIX specification for awk.

toupper

     toupper(str) 

Translate all lowercase characters in str to uppercase and return the new string.

while

     while (condition)
          statement

Do statement while condition is true (see if for a description of allowable conditions). A series of statements must be put within braces.

xor

     xor(expr1, expr2) {G} 

Return the bitwise XOR of expr1 and expr2, which should be values that fit in a C unsigned long.

11.12.1. Output Redirections

For print and printf, dest-expr is an optional expression that directs the output to a file or pipe.


> file

Direct the output to a file, overwriting its previous contents.


>> file

Append the output to a file, preserving its previous contents. In both this case and the > file case, the file will be created if it does not already exist.


| command

Direct the output as the input to a system command.


|& command

Direct the output as the input to a coprocess. gawk only.

Be careful not to mix > and >> for the same file. Once a file has been opened with >, subsequent output statements continue to append to the file until it is closed.

Remember to call close( ) when you have finished with a file, pipe, or coprocess. If you don't, eventually you will hit the system limit on the number of simultaneously open files.
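The behavior of > and >> can be seen in the following sketch. The function name redirect_demo is hypothetical, and the mktemp command is assumed to be available for creating a scratch file:

```shell
# Hypothetical helper: write to a temporary file with > and >>, then show it.
redirect_demo() {
    tmp=$(mktemp) || return 1
    awk -v out="$tmp" 'BEGIN {
        print "first"  > out      # opens the file and truncates it
        print "second" > out      # same open file, so this line is appended
        close(out)
        print "third" >> out      # reopened in append mode
        close(out)
    }'
    cat "$tmp"
    rm -f "$tmp"
}
redirect_demo
```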

11.12.2. printf Formats

Format specifiers for printf and sprintf have the following form:

     %[posn$][flag][width][.precision]letter 

The control letter is required. The format-conversion control letters are given in the following table.

Character   Description
c           ASCII character.
d           Decimal integer.
i           Decimal integer. (Added in POSIX)
e           Floating-point format ([-]d.precisione[+-]dd).
E           Floating-point format ([-]d.precisionE[+-]dd).
f           Floating-point format ([-]ddd.precision).
g           e or f conversion, whichever is shortest, with trailing zeros removed.
G           E or f conversion, whichever is shortest, with trailing zeros removed.
o           Unsigned octal value.
s           String.
u           Unsigned decimal value.
x           Unsigned hexadecimal number. Uses a-f for 10 to 15.
X           Unsigned hexadecimal number. Uses A-F for 10 to 15.
%           Literal %.


gawk allows you to provide a positional specifier after the % (posn$). A positional specifier is an integer count followed by a $. The count indicates which argument to use at that point. Counts start at one and don't include the format string. This feature is primarily for use in producing translations of format strings. For example:

     $ gawk 'BEGIN { printf "%2$s, %1$s\n", "world", "hello" }'
     hello, world

The optional flag is one of the following:

Character   Description
-           Left-justify the formatted value within the field.
space       Prefix positive values with a space and negative values with a minus.
+           Always prefix numeric values with a sign, even if the value is positive.
#           Use an alternate form: %o has a preceding 0; %x and %X are prefixed with 0x and 0X, respectively; %e, %E and %f always have a decimal point in the result; and %g and %G do not have trailing zeros removed.
0           Pad output with zeros, not spaces. This only happens when the field width is wider than the converted result. This flag applies to all output formats, even nonnumeric ones.
'           gawk 3.1.4 and later only. For numeric formats, in locales that support it, supply a thousands-separator character.


The optional width is the minimum number of characters to output. The result will be padded to this size if it is smaller. The 0 flag causes padding with zeros; otherwise, padding is with spaces.

The precision is optional. Its meaning varies by control letter, as shown in the following table:

Conversion               Precision means
%d, %i, %o, %u, %x, %X   The minimum number of digits to print.
%e, %E, %f               The number of digits to the right of the decimal point.
%g, %G                   The maximum number of significant digits.
%s                       The maximum number of characters to print.
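The following sketch exercises several flag, width, and precision combinations with portable conversions only. The function name format_demo is hypothetical, and the brackets are there just to make the padding visible:

```shell
# Hypothetical helper: demonstrate width, precision, and flags in printf.
format_demo() {
    awk 'BEGIN {
        # left-justified, right-justified, and truncated strings
        printf "[%-6s][%6s][%.2s]\n", "abc", "abc", "abc"
        # zero padding, forced sign, fixed decimals, and hexadecimal
        printf "[%05d][%+d][%.3f][%x]\n", 42, 42, 3.14159, 255
    }'
}
format_demo
```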




Linux in a Nutshell
ISBN: 0596154488
Year: 2004