6.21. awk Built-In Functions

6.21.1 String Functions

The sub and gsub Functions

The sub function finds the leftmost, longest substring of the record that matches the regular expression and replaces it with the substitution string. If a target string is specified, the search and replacement are made in the target string instead. If a target string is not specified, the entire record ($0) is used.

FORMAT

 sub(regular expression, substitution string)
 sub(regular expression, substitution string, target string)

Example 6.147.
 1  %  nawk '{sub(/Mac/, "MacIntosh"); print}' filename
 2  %  nawk '{sub(/Mac/, "MacIntosh", $1); print}' filename

EXPLANATION

  1. The first time the regular expression Mac is matched in the record ( $0 ), it will be replaced with the string MacIntosh . The replacement is made only on the first occurrence of a match on the line. (See gsub for multiple occurrences.)

  2. The first time the regular expression Mac is matched in the first field of the record ($1), it will be replaced with the string MacIntosh . The replacement is made only on the first occurrence of a match in the target string.

The gsub function substitutes a regular expression with a string globally, that is, for every occurrence where the regular expression is matched in the record ($0) or, if one is specified, in the target string.

FORMAT

 gsub(regular expression, substitution string)
 gsub(regular expression, substitution string, target string)

Example 6.148.
 1  %  nawk '{ gsub(/CA/, "California"); print }' datafile
 2  %  nawk '{ gsub(/[Tt]om/, "Thomas", $1); print }' filename

EXPLANATION

  1. Everywhere the regular expression CA is found in the record ( $0 ), it will be replaced with the string California .

  2. Everywhere the regular expression Tom or tom is found in the first field, it will be replaced with the string Thomas .
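The difference between sub and gsub is easiest to see on a single line. The following is a minimal sketch, assuming a POSIX-compatible awk is installed as awk (the examples above use nawk); the sample record is hypothetical and not taken from the book's data files. It also shows that gsub returns the number of replacements it made.

```shell
# Hypothetical sample record, chosen only for illustration.
line='Tom met tom and Tommy'

# sub replaces only the first (leftmost) match in the record.
first=$(printf '%s\n' "$line" | awk '{ sub(/[Tt]om/, "Thomas"); print }')

# gsub replaces every match and returns how many replacements it made.
all=$(printf '%s\n' "$line" | awk '{ n = gsub(/[Tt]om/, "Thomas"); print n ": " $0 }')

# With a target string, only that field is searched; other fields are untouched.
field=$(printf '%s\n' "$line" | awk '{ gsub(/[Tt]om/, "Thomas", $3); print }')

printf '%s\n' "$first" "$all" "$field"
```

Note that the match on Tommy is replaced too, because the regular expression is not anchored to a whole word.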

The index Function

The index function returns the position of the first occurrence of a substring within a string, or 0 if the substring is not found. Positions are counted from 1.

FORMAT

 index(string, substring) 

Example 6.149.
 %  nawk '{ print index("hollow", "low") }' filename
 4

EXPLANATION

The number returned is the position where the substring low is found in hollow , with the offset starting at one.
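Because positions start at 1, a return value of 0 can only mean that the substring is absent. A short sketch of both cases, assuming a POSIX-compatible awk installed as awk; the BEGIN block is used so that no input file is needed.

```shell
# Substring present: index returns its 1-based starting position.
found=$(awk 'BEGIN { print index("hollow", "low") }')

# Substring absent: index returns 0.
missing=$(awk 'BEGIN { print index("hollow", "high") }')

echo "$found $missing"
```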

The length Function

The length function returns the number of characters in a string. Without an argument, the length function returns the number of characters in a record.

FORMAT

 length(string)
 length

Example 6.150.
 %  nawk '{ print length("hello") }' filename
 5

EXPLANATION

The length function returns the number of characters in the string hello .
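The two forms of length can be compared side by side. This is a small sketch, assuming a POSIX-compatible awk installed as awk; the input line hello world is made up for the illustration.

```shell
# With an argument, length measures that string.
chars=$(awk 'BEGIN { print length("hello") }')

# Without an argument, length measures the current record ($0).
record=$(printf 'hello world\n' | awk '{ print length }')

echo "$chars $record"
```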

The substr Function

The substr function returns the substring of a string starting at a given position, where the first position is one. If the length of the substring is given, only that many characters are returned. If no length is given, or if the specified length runs past the end of the string, everything from the starting position to the end of the string is returned.

FORMAT

 substr(string, starting position)
 substr(string, starting position, length of string)

Example 6.151.
 %  nawk ' { print substr("Santa Claus", 7, 6)} ' filename
 Claus

EXPLANATION

In the string Santa Claus , print the substring starting at position 7 with a length of 6 characters. Only five characters remain from position 7, so Claus is returned.
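The two-argument form and an oversized length both return the rest of the string. A minimal sketch of the variations, assuming a POSIX-compatible awk installed as awk:

```shell
# No length: everything from position 7 to the end of the string.
rest=$(awk 'BEGIN { print substr("Santa Claus", 7) }')

# Length longer than what remains: the result stops at the end of the string.
clipped=$(awk 'BEGIN { print substr("Santa Claus", 7, 100) }')

# Explicit length: exactly five characters starting at position 1.
head=$(awk 'BEGIN { print substr("Santa Claus", 1, 5) }')

echo "$rest $clipped $head"
```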

The match Function

The match function returns the index where the regular expression is found in the string, or zero if not found. It also sets the built-in variable RSTART to the starting position of the matched substring within the string, and RLENGTH to the length of the matched substring in characters. These variables can be used with the substr function to extract the matched text. (Works only with nawk .)

FORMAT

 match(string, regular expression) 

Example 6.152.
 %  nawk 'END{start=match("Good ole USA", /[A-Z]+$/); print start}'\
   filename
 10

EXPLANATION

The regular expression /[A-Z]+$/ says search for consecutive uppercase letters at the end of the string. The substring USA is found starting at the tenth character of the string Good ole USA . If the string cannot be matched, 0 is returned.

Example 6.153.
 1   %  nawk 'END{start=match("Good ole USA", /[A-Z]+$/);\
        print RSTART, RLENGTH}' filename
     10 3

 2   %  nawk 'BEGIN{ line="Good ole USA"}; \
        END{ match(line, /[A-Z]+$/);\
        print substr(line, RSTART,RLENGTH)}' filename
     USA

EXPLANATION

  1. The RSTART variable is set by the match function to the starting position of the regular expression matched. The RLENGTH variable is set to the length of the substring.

  2. The substr function is used to find a substring in the variable line , and uses the RSTART and RLENGTH values (set by the match function) as the beginning position and length of the substring.
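Since match returns 0 on failure, its return value can guard the call to substr. The sketch below assumes a POSIX-compatible awk installed as awk rather than nawk; the input line is hypothetical. The second command shows the values left in RSTART and RLENGTH when nothing matches, which in POSIX awk are 0 and -1.

```shell
# Hypothetical input line, chosen only for illustration.
hit=$(printf 'order 4521 shipped\n' | awk '{
    # match is true (nonzero) only when the pattern is found.
    if (match($0, /[0-9]+/))
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    else
        print "no match"
}')

# On failure, RSTART is 0 and RLENGTH is -1.
miss=$(awk 'BEGIN { match("abc", /[0-9]+/); print RSTART, RLENGTH }')

printf '%s\n' "$hit" "$miss"
```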

The split Function

The split function splits a string into an array using whatever field separator is designated as the third parameter. If the third parameter is not provided, awk will use the current value of FS .

FORMAT

 split(string, array, field separator)
 split(string, array)

Example 6.154.
 %  awk 'BEGIN{split("12/25/2001",date,"/");print date[2]}' filename
 25

EXPLANATION

The split function splits the string 12/25/2001 into an array, called date , using the forward slash as the separator. The array subscript starts at 1. The second element of the date array is printed.
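The split function also returns the number of elements it stored, which is useful for looping over the array. A brief sketch of both forms, assuming a POSIX-compatible awk installed as awk; the colon-separated string in the second command is made up for the illustration.

```shell
# Three-argument form: split on "/" and report the element count.
parts=$(awk 'BEGIN {
    n = split("12/25/2001", date, "/")
    print n, date[1], date[2], date[3]
}')

# Two-argument form: the current FS (set here with -F:) is the separator.
by_fs=$(awk -F: 'BEGIN { n = split("root:x:0", f); print n, f[3] }')

printf '%s\n' "$parts" "$by_fs"
```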

The sprintf Function

The sprintf function returns a string built by formatting its expressions according to a format specification. It accepts the same format specifications as the printf function.

FORMAT

 variable=sprintf("string with format specifiers ", expr1, expr2, ... , exprn)

Example 6.155.
 %  awk '{line = sprintf ("%-15s %6.2f ", $1, $3); print line}' filename

EXPLANATION

The first and third fields are formatted according to the printf specifications (a left-justified, 15-space string and a right-justified, 6-character floating-point number). The result is assigned to the user-defined variable line . See "The printf Function."
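To make the padding visible, the sketch below puts a | character after each formatted field. It assumes a POSIX-compatible awk installed as awk; the input record is hypothetical. The - flag in %-15s left-justifies the string in a 15-character field, and %6.2f right-justifies the number in 6 characters with two decimal places.

```shell
# Hypothetical record: name, department, amount.
line=$(printf 'Tom sales 42.5\n' |
    awk '{ s = sprintf("%-15s|%6.2f|", $1, $3); print s }')

# The name is padded on the right to 15 characters;
# the number is padded on the left to 6 characters.
printf '%s\n' "$line"
```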


UNIX Shells by Example (4th Edition)
ISBN: 013147572X
EAN: 2147483647
Year: 2004
Pages: 454
Authors: Ellie Quigley
