Chapter 21: awk and sed


Overview

The Swiss army knife of the UNIX System toolkit is awk. Many useful awk programs are only one line long, and even a one-line awk program can be the equivalent of a regular UNIX System tool. For example, with a one-line awk program, you can count the number of lines in a file (like wc), print the first field in each line (like cut), print all lines that contain the phrase “open source” (like grep), or exchange the position of the third and fourth fields in each line (like join and paste). Beyond one-liners, awk is a complete programming language with control structures, functions, and variables that allow you to write much more complex programs.
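The one-liners mentioned above might look like the following sketch. The sample file and its contents are hypothetical, invented only so that the commands have something to read.

```shell
# Hypothetical sample data: three lines of space-separated fields.
sample=$(mktemp)
printf '%s\n' \
  'alpha beta gamma delta' \
  'open source tools rock' \
  'one two three four' > "$sample"

# Count the lines in the file, like wc -l.
# NR holds the number of records read; in END it is the total.
awk 'END { print NR }' "$sample"

# Print the first field of each line, like cut.
awk '{ print $1 }' "$sample"

# Print every line containing the phrase "open source", like grep.
awk '/open source/' "$sample"

# Exchange the third and fourth fields of each line.
awk '{ tmp = $3; $3 = $4; $4 = tmp; print }' "$sample"
```

Each program is a pattern, an action in braces, or both. A pattern with no action prints the matching line, and an action with no pattern runs on every line.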

awk is specially designed for working with structured files and text patterns. It has built-in features for breaking input lines into fields and comparing these fields to patterns that you specify. This chapter will show you how to use awk to work with structured files such as inventories, mailing lists, and other tables or simple databases.
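As a small illustration of field splitting and pattern comparison, the sketch below assumes a hypothetical mailing list whose fields (name, city, state) are separated by colons. The data and field layout are made up for this example.

```shell
# Hypothetical mailing list: name:city:state
# -F: sets the field separator to a colon.
# $3 ~ /^TX$/ selects lines whose third field is exactly TX.
printf '%s\n' \
  'Ann Lee:Austin:TX' \
  'Bob Ray:Boston:MA' \
  'Cy Dunn:Dallas:TX' |
  awk -F: '$3 ~ /^TX$/ { print $1 " lives in " $2 }'
```

Because the separator is a colon rather than white space, the name "Ann Lee" stays together as a single field.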

awk is often used in command pipelines with tools like sort, tr, or sed. Each of these commands can act as a preprocessor or filter to simplify a problem before solving it in awk. For example, it is difficult to sort lines in awk, so using sort on a file before passing the information to awk can make your programs much simpler. In fact, you can process a file in awk, send the result to sort through a pipeline, and then return the output to awk for further processing.
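A pipeline of this kind could be sketched as follows. The inventory file, its three columns (item, quantity, unit price), and the choice to rank by total value are all assumptions made for illustration.

```shell
# Hypothetical inventory: item name, quantity, unit price.
inventory=$(mktemp)
printf '%s\n' \
  'widget 10 2.50' \
  'gadget 3 10.00' \
  'gizmo 7 4.00' > "$inventory"

# Stage 1 (awk):  compute the total value of each item.
# Stage 2 (sort): order the lines numerically by total, largest first.
# Stage 3 (awk):  number the sorted lines.
awk '{ printf "%s %.2f\n", $1, $2 * $3 }' "$inventory" |
  sort -k2,2nr |
  awk '{ print NR ". " $0 }'
```

Leaving the ordering to sort keeps both awk programs to a single line each.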

sed is an abbreviation for stream editor. Like awk, it can do complex pattern matching and editing on a stream of characters, although it does not have all of the powerful programming capabilities of awk. In addition to processing text like awk, sed can be used as an efficient noninteractive editor for very large files. sed uses a syntax that is very similar to many vi and ed commands. sed is more challenging to learn than awk, but it is often used as a preprocessor for awk programs.
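To show how closely sed follows ed and vi, here is a brief sketch of three common commands. The input text is invented for the example.

```shell
# Substitute, as with s in ed or :s in vi.
# The trailing g replaces every match on a line, not just the first.
printf '%s\n' 'the cat sat' 'cat and cat' | sed 's/cat/dog/g'

# Delete every line that matches a pattern (here, lines starting with #).
printf '%s\n' 'keep this' '# drop this' 'keep that' | sed '/^#/d'

# Print only lines 2 through 3.
# -n suppresses the default output, so p prints just the selected lines.
printf '%s\n' one two three four | sed -n '2,3p'
```

In each case sed reads its input one line at a time and writes the result to standard output, which is why it copes well with very large files.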

This chapter will describe many of the commands of awk, enough to enable you to use it for many applications. It does not cover all of the functions, built-in variables, or control structures that awk provides. For a full description of the awk language with many examples, refer to The AWK Programming Language, by Alfred Aho, Brian Kernighan, and Peter Weinberger.

Because awk can be used for almost all of the same tasks, and most people find awk easier to use, this chapter does not devote as much time to sed. If you want to learn sed in greater depth, consult sed & awk, by Dale Dougherty and Arnold Robbins (see the last section of this chapter for bibliographical information).




UNIX: The Complete Reference, Second Edition (Complete Reference Series)
ISBN: 0072263369
Year: 2006
Pages: 316
