Changing Files with awk


While sed is line-oriented and lets you fiddle and diddle to your heart's content, awk is field-oriented and is ideal for manipulating database or comma-delimited files. For example, if you have an address book file, you can use awk to find and change information in the fields you specify, as in Code Listing 6.11. The following steps show a sampling of what you can do with awk, using an address book file as the example.

Code Listing 6.11. awk lets you access individual fields in a file.

[ejr@hobbes manipulate]$ awk '{ print $1 }' address.book
Schmidt,
Feldman,
Brown,
Smith,
Jones,
[ejr@hobbes manipulate]$

To change files with awk:

De-what?

A delimited file uses a specific character to show where one bit of information ends and another begins. Each piece of information is a separate field. For example, a file that contains "John, Doe, Thornton, Colorado" is comma-delimited, sporting a comma between fields. Other files, such as the /etc/passwd file, use a colon (:) to separate the fields. Just about any symbol that's not used in the content could be used as a delimiter.
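To make the idea of fields concrete, here is a minimal sketch that counts the fields in a comma-delimited record and pulls one field out of a colon-delimited one. Both sample lines are made up for illustration (the passwd-style entry is hypothetical), NF is awk's built-in count of fields in the current line, and the -F option used here is explained in step 2.

```shell
# A comma-delimited record (written without spaces after the commas).
record='John,Doe,Thornton,Colorado'

# NF holds the number of fields awk found in the current line.
field_count=$(printf '%s\n' "$record" | awk -F, '{ print NF }')
echo "$field_count"

# A hypothetical /etc/passwd-style line, delimited by colons.
passwd_line='ejr:x:500:500:Eric:/home/ejr:/bin/bash'

# The seventh colon-separated field is the login shell.
login_shell=$(printf '%s\n' "$passwd_line" | awk -F: '{ print $7 }')
echo "$login_shell"
```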


1.

awk '{ print $1 }' address.book

At the shell prompt, use awk '{ print $1 }' address.book to look at the address.book file and select (and send to standard output) the first field in each record (line). More specifically, starting from the inside out:

  • $1 references the first field in each line. Unless you specify otherwise, awk assumes that whitespace (spaces or tabs) separates the fields, so the first field starts at the beginning of the line and continues to the first space or tab.

  • {} contain the awk command, and the single quotes are necessary to tie the awk command together (so the first space within the command isn't interpreted by the shell as the end of the command, and so the shell leaves $1 alone for awk to interpret). See Code Listing 6.11.
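If you'd like to try this without touching a real address book, the sketch below builds a small stand-in file first. The book's actual address.book isn't shown, so the two records (and their street, city, state, and zip fields) are assumptions for illustration only.

```shell
# Hypothetical two-line sample; the real address.book layout is an assumption.
sample=$(mktemp)
printf '%s\n' \
  'Schmidt, Sven, 1 Main St, Tulsa, OK, 74101, 555-555-8382' \
  'Jones, Kelly, 9 Oak Ave, San Jose, CA, 95101, 408-555-7253' > "$sample"

# With the default separator, $1 runs up to the first space,
# so the comma after the last name stays attached to the field.
first_fields=$(awk '{ print $1 }' "$sample")
echo "$first_fields"

# Clean up the temporary file.
rm -f "$sample"
```

The trailing commas in the output are the reason step 2 switches to a comma as the field separator.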

2.

awk -F, '{ print $1 }' address.book

The -F flag tells awk to use the character following it (in this case, a comma) as the field separator. This change makes the output of the command a little cleaner and more accurate. If you were working with /etc/passwd, you'd use -F: to specify that the : is the field separator.
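Here's a small side-by-side sketch of what -F changes, using a shortened, made-up record. It also shows one thing to watch for: when the file has a space after each comma, every field after the first begins with that space.

```shell
# Hypothetical, shortened record for illustration.
line='Schmidt, Sven, 555-555-8382'

# Default separator: the comma stays attached to the first field.
default_first=$(printf '%s\n' "$line" | awk '{ print $1 }')

# -F, splits on commas, so the comma is no longer part of field 1.
comma_first=$(printf '%s\n' "$line" | awk -F, '{ print $1 }')

# Fields after the first keep the space that followed the comma;
# the brackets just make that leading space visible.
comma_second=$(printf '%s\n' "$line" | awk -F, '{ print "[" $2 "]" }')

echo "$default_first" "$comma_first" "$comma_second"
```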

3.

awk -F, '{ print $2 " " $1 " " $7 }' address.book > phone.list

With this code, you can pull specific fields, in an arbitrary order, from your database. Although it looks complex, it's just one additional step from the previous example. Rather than printing a single field from the address book, we're printing field 2, then a space, then field 1, then a space, then field 7. The final bit just redirects the output into a new file. This example would produce a list of names and phone numbers, as shown in Code Listing 6.12.
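The same command can be tried on a throwaway copy, as in the sketch below. It assumes a seven-field layout (last name, first name, street, city, state, zip, phone); only the names and phone numbers come from the listing, and the rest of each record is invented for illustration.

```shell
# Work in a temporary directory so no real files are overwritten.
workdir=$(mktemp -d)

# Hypothetical records; the address fields are assumptions.
printf '%s\n' \
  'Schmidt, Sven, 1 Main St, Tulsa, OK, 74101, 555-555-8382' \
  'Jones, Kelly, 9 Oak Ave, San Jose, CA, 95101, 408-555-7253' \
  > "$workdir/address.book"

# Print first name, last name, and phone, then save the result.
awk -F, '{ print $2 " " $1 " " $7 }' "$workdir/address.book" > "$workdir/phone.list"

phone_list=$(cat "$workdir/phone.list")
echo "$phone_list"

# Clean up the temporary directory.
rm -rf "$workdir"
```

Because fields 2 and 7 each start with the space that followed the comma, every line begins with a space and has two spaces before the phone number.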

4.

awk -F, '/CA/{ print $2 " " $1 " " $7 }' address.book > phone.list

You can also specify a matching pattern. Here, we added /CA/ to search and act on only the lines that contain CA, so only those lines will be in the phone.list file.
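One thing worth knowing is that a bare /CA/ matches the letters CA anywhere in the line, not just in the state field. The sketch below compares it with a pattern tied to a single field using awk's ~ operator. It assumes the state is field 5, which is a guess about the file layout, and the sample records are hypothetical.

```shell
workdir=$(mktemp -d)

# Hypothetical records; note the street name containing "CA" in the last one.
printf '%s\n' \
  'Schmidt, Sven, 1 Main St, Tulsa, OK, 74101, 555-555-8382' \
  'Jones, Kelly, 9 Oak Ave, San Jose, CA, 95101, 408-555-7253' \
  'Brown, John, 7 CASTLE Rd, Tulsa, OK, 74101, 918-555-1234' \
  > "$workdir/address.book"

# /CA/ matches anywhere in the line, so the CASTLE Rd record is selected too.
any_match=$(awk -F, '/CA/ { print $1 }' "$workdir/address.book")

# $5 ~ /CA/ tests only the fifth field (assumed here to be the state).
state_match=$(awk -F, '$5 ~ /CA/ { print $1 }' "$workdir/address.book")

echo "$any_match"
echo "$state_match"

rm -rf "$workdir"
```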

Code Listing 6.12. With a little more tweaking, awk lets you do a lot of processing on the files to get just the information you need.

[ejr@hobbes manipulate]$ awk -F, '{ print $2 " " $1 " " $7 }' address.book > phone.list
[ejr@hobbes manipulate]$ more phone.list
 Sven Schmidt  555-555-8382
 Fester Feldman
 John Brown  918-555-1234
 Sally Smith  801-555-8982
 Kelly Jones  408-555-7253
[ejr@hobbes manipulate]$

Tips

  • You can load awk scripts from a file with awk -f script.awk filename. Just as with sed, this keeps the typing to a minimum, which is helpful with these long and convoluted commands. Refer to Chapter 10 for more details about scripting.

  • Take a glance at "Sorting Files with sort" later in this chapter and consider piping your awk output to sort. Let Unix do the tedious work for you!
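Both tips can be combined, roughly as in the sketch below: the awk commands live in a script file, and the output is piped through sort. The script name phone.awk and the sample records are hypothetical, and setting FS in a BEGIN block is one way to choose the field separator from inside a script instead of typing -F, each time.

```shell
workdir=$(mktemp -d)

# Hypothetical records; only the names and phone numbers come from the listing.
printf '%s\n' \
  'Schmidt, Sven, 1 Main St, Tulsa, OK, 74101, 555-555-8382' \
  'Brown, John, 7 Castle Rd, Tulsa, OK, 74101, 918-555-1234' \
  'Jones, Kelly, 9 Oak Ave, San Jose, CA, 95101, 408-555-7253' \
  > "$workdir/address.book"

# Save the awk commands in a script file (phone.awk is a made-up name).
cat > "$workdir/phone.awk" <<'EOF'
# Use a comma as the field separator, just like -F, on the command line.
BEGIN { FS = "," }
# Print last name, first name, and phone, separated by commas.
{ print $1 "," $2 "," $7 }
EOF

# Run the script with -f and let sort put the lines in order.
sorted_list=$(awk -f "$workdir/phone.awk" "$workdir/address.book" | sort)
echo "$sorted_list"

rm -rf "$workdir"
```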





Unix(c) Visual Quickstart Guide
UNIX, Third Edition
ISBN: 0321442458
Year: 2006
Pages: 251
