
11.6. Variable and Array Assignment


Variables can be assigned a value with an = sign. For example:

FS = ","

Expressions using the operators +, -, *, /, and % (modulo) can be assigned to variables.
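As a minimal sketch of both kinds of assignment, the following sets FS in a BEGIN block and then assigns arithmetic expressions to variables. The input record and the variable names total, avg, and rem are made up for illustration.

```shell
# Assumed sample data: one comma-separated record with four fields.
out=$(printf 'a,b,c,d\n' | awk '
BEGIN { FS = "," }        # split fields on commas
{
    total = 18
    avg = total / NF      # division: 18 / 4
    rem = total % NF      # modulo: remainder of 18 / 4
    print $2, avg, rem
}')
echo "$out"               # prints: b 4.5 2
```

Because FS is assigned in BEGIN, it is already in effect when the first record is split into fields.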

Arrays can be created with the split( ) function (described later), or they can simply be named in an assignment statement. Array elements can be subscripted with numbers (array[1], ..., array[n]) or with strings. Arrays subscripted by strings are called associative arrays.[*] For example, to count the number of widgets you have, you could use the following script:

[*] In fact, all arrays in awk are associative; numeric subscripts are converted to strings before using them as array subscripts. Associative arrays are one of awk's most powerful features.

/widget/ { count["widget"]++ }       # Count widgets
END      { print count["widget"] }   # Print the count


You can use the special for loop to read all the elements of an associative array:

for (item in array)
    process array[item]

The index of the array is available as item, while the value of an element of the array can be referenced as array[item].
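The loop form above can be sketched as a small word counter. The sample input and the array name count are illustrative. The order in which for (item in array) visits subscripts is not specified, so this sketch pipes the result through sort to get stable output.

```shell
# Assumed sample data: three lines of words.
out=$(printf 'widget gadget\nwidget\ngizmo widget\n' | awk '
{
    for (i = 1; i <= NF; i++)
        count[$i]++               # string subscript; element is created on first use
}
END {
    for (item in count)           # item takes each subscript in turn
        print item, count[item]
}' | sort)
echo "$out"
```

With this input, the sorted output lists gadget 1, gizmo 1, and widget 3.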

You can use the in operator to test that an element exists by testing to see if its index exists. For example:

if (index in array)
    ...

tests that array[index] exists, but you cannot use it to test the value of the element referenced by array[index].

You can also delete individual elements of the array using the delete statement. (See also the delete entry in the section "Alphabetical Summary of awk Functions and Commands," later in this chapter.)
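The following sketch, using a made-up array named stock, shows that in tests for existence rather than value (an element holding 0 still exists), and that delete removes the element entirely.

```shell
out=$(awk 'BEGIN {
    stock["widget"] = 0
    if ("widget" in stock)
        print "widget: present"       # exists even though its value is 0
    if (!("gadget" in stock))
        print "gadget: absent"        # the test itself does not create the element
    delete stock["widget"]
    if (!("widget" in stock))
        print "widget: deleted"
}')
echo "$out"
```

Note that simply referencing an element, as in if (stock["gadget"] == ""), would create it as a side effect; the in test avoids this.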

11.6.1. Escape Sequences

Within string and regular expression constants, the following escape sequences may be used.

Sequence    Meaning
\a          Alert (bell)
\b          Backspace
\f          Form feed
\n          Newline
\r          Carriage return
\t          TAB
\v          Vertical tab
\\          Literal backslash
\nnn        Octal value nnn
\xnn        Hexadecimal value nn
\"          Literal double quote (in strings)
\/          Literal slash (in regular expressions)


The \x escape sequence is a common extension; it is not part of POSIX awk.
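A short sketch exercising several of these sequences follows. It sticks to the POSIX escapes and leaves out \x, and the strings themselves are arbitrary examples. The octal values assume an ASCII-compatible character set.

```shell
out=$(awk 'BEGIN {
    printf "name\tqty\n"                          # \t is TAB, \n is newline
    print "say \"hi\""                            # \" embeds a double quote in a string
    print "back\\slash"                           # \\ is one literal backslash
    print "\101\102"                              # octal 101 and 102 are A and B in ASCII
    if ("a/b" ~ /a\/b/) print "slash matched"     # \/ inside a regular expression constant
}')
echo "$out"
```

The program is enclosed in single quotes so that the shell passes the backslashes through to awk unchanged.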


11.6.2. Octal and Hexadecimal Constants in gawk

gawk allows you to use octal and hexadecimal constants in your program source code. The form is as in C: octal constants start with a leading 0, and hexadecimal constants with a leading 0x or 0X. The hexadecimal digits a-f may be in either upper- or lowercase.

$ gawk 'BEGIN { print 042, 42, 0x42 }'
34 42 66

Use the strtonum( ) function to convert octal or hexadecimal input data into numerical values.


11.7. User-Defined Functions

POSIX awk allows you to define your own functions. This makes it easy to encapsulate sequences of steps that need to be repeated into a single place, and reuse the code from anywhere in your program.

The following function capitalizes each word in a string. It has one parameter, named input, and five local variables that are written as extra parameters:

# capitalize each word in a string
function capitalize(input,    result, words, n, i, w)
{
    result = ""
    n = split(input, words, " ")
    for (i = 1; i <= n; i++) {
        w = words[i]
        w = toupper(substr(w, 1, 1)) substr(w, 2)
        if (i > 1)
            result = result " "
        result = result w
    }
    return result
}

# main program, for testing
{ print capitalize($0) }

With this input data:

A test line with words and numbers like 12 on it.

This program produces:

A Test Line With Words And Numbers Like 12 On It.

For user-defined functions, no space is allowed between the function name and the left parenthesis when the function is called.
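To illustrate why local variables are declared as extra parameters, here is a minimal sketch with a hypothetical function sum_to. Its loop counter i is local, so the global variable i set in BEGIN is left untouched by the call.

```shell
out=$(awk '
# i and total are locals: declared as extra parameters, never passed by the caller
function sum_to(n,    i, total)
{
    for (i = 1; i <= n; i++)
        total += i                # total starts out empty, which acts as 0
    return total
}
BEGIN {
    i = 100                       # global i
    print sum_to(4), i            # call with no space before the parenthesis
}')
echo "$out"                       # prints: 10 100
```

Had i not been listed among the parameters, the loop inside sum_to would have overwritten the global i. The extra spaces before the local names in the parameter list are only a readability convention.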

