Changing Files with awk


While `sed` is line-oriented and lets you fiddle and diddle to your heart's content, `awk` is field-oriented and is ideal for manipulating database or comma-delimited files. For example, if you have an address book file, you can use `awk` to find and change information in fields you specify, as in Code Listing 6.11. The following steps show a sampling of what you can do with `awk`, using an address book file as the example.

Code Listing 6.11. `awk` lets you access individual fields in a file.

    [ejr@hobbes manipulate]$ awk '{ print $1 }' address.book
    Schmidt,
    Feldman,
    Brown,
    Smith,
    Jones,
    [ejr@hobbes manipulate]$

De-what?

A delimited file uses a specific character to show where one bit of information ends and another begins. Each piece of information is a separate field. For example, a file that contains "John, Doe, Thornton, Colorado" is comma-delimited, sporting a comma between fields. Other files, such as the `/etc/passwd` file, use a colon (`:`) to separate the fields. Just about any symbol that's not used in the content could be used as a delimiter.

To change files with `awk`:

1.	`awk '{ print $1 }' address.book` At the shell prompt, use this command to look at the `address.book` file and send the first field of each record (line) to standard output. Working from the inside out: `$1` references the first field in each line. Unless you specify otherwise, `awk` assumes that whitespace (spaces or tabs) separates the fields, so the first field starts at the beginning of the line and continues to the first space. The braces `{}` contain the `awk` action, and the single quotes hold the `awk` program together, so the shell passes it to `awk` as one argument and doesn't try to interpret `$1` or the spaces itself. See Code Listing 6.11.
2.	`awk -F, '{ print $1 }' address.book` The `-F` flag tells `awk` to use the character following it (in this case, a comma) as the field separator. This change makes the output of the command a little cleaner and more accurate, because the commas no longer show up as part of the first field. If you were working with `/etc/passwd`, you'd use `-F:` to specify that the colon is the field separator.

3.	`awk -F, '{ print $2 " " $1 " " $7 }' address.book > phone.list` With this code, you can pull specific fields, in an arbitrary order, from your database. Although it looks complex, it's just one additional step from the previous example. Rather than printing a single field from the address book, we're printing field 2, then a space, then field 1, then a space, then field 7. The final bit just redirects the output into a new file. This example would produce a list of names and phone numbers, as shown in Code Listing 6.12.
4.	`awk -F, '/CA/ { print $2 " " $1 " " $7 }' address.book > phone.list` You can also specify a matching pattern. Here, we added `/CA/` to search for and act on only the lines that contain `CA`, so only those lines end up in the `phone.list` file. The pattern matches `CA` anywhere in the line, not just in the state field.
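The four steps can be tried end to end with a short script. Treat this as a sketch rather than the book's actual data: the three-line `address.book` below is hypothetical (last name, first name, street, city, state, ZIP, phone), the script works in a scratch directory made with `mktemp -d`, and step 4 writes to a made-up file name, `ca.list`, so that it doesn't overwrite the result of step 3.

```shell
#!/bin/sh
# Work in a scratch directory so no real address book gets overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: last, first, street, city, state, zip, phone.
cat > address.book <<'EOF'
Schmidt, Sven, 1 Circle Drive, Denver, CO, 80221, 555-555-8382
Brown, John, 2 Main Street, Tulsa, OK, 74101, 918-555-1234
Jones, Kelly, 3 Oak Avenue, San Jose, CA, 95101, 408-555-7253
EOF

# Step 1: default whitespace splitting leaves the comma attached to field 1.
awk '{ print $1 }' address.book

# Step 2: split on commas instead, so field 1 is just the last name.
awk -F, '{ print $1 }' address.book

# Step 3: first name, last name, and phone, redirected into a new file.
awk -F, '{ print $2 " " $1 " " $7 }' address.book > phone.list

# Step 4: the same fields, but only for lines that contain CA.
# (ca.list is a hypothetical name chosen to keep phone.list intact.)
awk -F, '/CA/ { print $2 " " $1 " " $7 }' address.book > ca.list
```

Running `cat phone.list ca.list` afterward shows the difference between the unfiltered and the filtered output.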

Code Listing 6.12. With a little more tweaking, `awk` lets you do a lot of processing on the files to get just the information you need.

    [ejr@hobbes manipulate]$ awk -F, '{ print $2 " " $1 " " $7 }' address.book > phone.list
    [ejr@hobbes manipulate]$ more phone.list
     Sven Schmidt  555-555-8382
     Fester Feldman
     John Brown  918-555-1234
     Sally Smith  801-555-8982
     Kelly Jones  408-555-7253
    [ejr@hobbes manipulate]$
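The doubled space before each phone number in the listing suggests that every field after the first keeps the space that follows its comma. One possible refinement, assuming every comma in the file is followed by exactly one space, is to make the separator a comma plus a space. When the separator is longer than one character, `awk` treats it as a regular expression, and `, ` simply matches itself. The sample records and the `phone.clean` file name below are hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data; the second record has no phone number.
cat > address.book <<'EOF'
Schmidt, Sven, 1 Circle Drive, Denver, CO, 80221, 555-555-8382
Feldman, Fester, 4 Elm Court, Logan, UT, 84321
EOF

# Comma-only separator: fields 2 and 7 each start with a space.
awk -F, '{ print $2 " " $1 " " $7 }' address.book > phone.list

# Comma-plus-space separator: the leading spaces are consumed by the split.
awk -F', ' '{ print $2 " " $1 " " $7 }' address.book > phone.clean

# Show both results for comparison.
cat phone.list phone.clean
```

If some lines use a bare comma and others a comma plus space, this separator would split them inconsistently, so it is worth checking the file before relying on it.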

Tips

You can load `awk` scripts from a file with `awk -f script.awk filename`. Just as with `sed`, this keeps the retyping to a minimum, which is helpful with these long and convoluted commands. Refer to Chapter 10 for more details about scripting.
Take a glance at "Sorting Files with `sort`" later in this chapter and consider piping your `awk` output to `sort`. Let Unix do the tedious work for you!
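Both tips can be combined in one short run. This is a minimal sketch under a few assumptions: the address data is hypothetical, the contents of `script.awk` are one plausible script rather than anything from the chapter, and `phone.sorted` is a made-up output name. The script sets the field separator in a `BEGIN` block, which does the same job as `-F,` on the command line.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: last, first, street, city, state, zip, phone.
cat > address.book <<'EOF'
Schmidt, Sven, 1 Circle Drive, Denver, CO, 80221, 555-555-8382
Brown, John, 2 Main Street, Tulsa, OK, 74101, 918-555-1234
Jones, Kelly, 3 Oak Avenue, San Jose, CA, 95101, 408-555-7253
EOF

# The awk program lives in its own file instead of between single quotes.
cat > script.awk <<'EOF'
# Split on commas, then print last name, first name, and phone.
BEGIN { FS = "," }
{ print $1 "," $2 "," $7 }
EOF

# -f reads the program from script.awk; sort orders the lines by last name.
awk -f script.awk address.book | sort > phone.sorted
cat phone.sorted
```

Because the last name comes first on each output line, plain `sort` with no options is enough to alphabetize the list.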