Project 59. Learn the sed Stream Editor

"How do I write a script to perform the same sequence of editing commands on a number of text files?"

This project shows you how to use the sed stream editor, which changes text files by reading editing commands from a script. Project 58 shows how to apply such commands to a batch of files. Project 61 covers more advanced use of sed, and Projects 60 and 62 cover the awk command.

The sed Basics

The sed stream editor was written to edit text files, but it's not an interactive editor like nano, vim, or emacs. Instead of following commands entered "live" by a user, sed executes edits according to instructions provided in a command script. The most common use of sed is to apply the same set of edits to many files, either as a one-time transformation or at regular intervals: to make a small change across hundreds of HTML files, for example, or to process Apache log files once a day.

The sed command writes its output to standard out, so it can easily create new files as it edits existing ones. As of Mac OS X 10.4 (Tiger), sed also accepts the option -i, which directs it to write changes back to the original source file.

A sed script consists of editing commands; each command describes a line range and a function. When sed receives files as input, it reads each file line by line. When an input line falls in a command's line range, sed applies the corresponding function to that input line. An input line may fall in the line range of many commands and, therefore, will have many functions applied to it.

You can write a sed script directly on the command line or in a file. This project considers simple scripts of just a few lines, which we'll write directly on the command line. Scripts that are more complex are usually written to files and are the subject of Project 61.
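As a minimal sketch of the two output styles described above, the commands below first redirect sed's standard output to a new file and then use -i to edit a file in place. The file names (notes.txt, notes-new.txt) and the scratch directory are invented for illustration. The -i.bak form, which keeps a backup copy with a .bak suffix, is an assumption chosen here because it should behave the same way in the sed shipped with Mac OS X and in GNU sed.

```shell
# Work in a scratch directory so nothing important is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A hypothetical two-line input file.
printf 'draft one\ndraft two\n' > notes.txt

# Style 1: sed writes to standard output; redirect it to create a new file.
sed 's/draft/final/' notes.txt > notes-new.txt

# Style 2: -i edits notes.txt itself; the .bak suffix keeps the
# original contents as notes.txt.bak.
sed -i.bak 's/draft/final/' notes.txt

cat notes-new.txt   # the edited copy
cat notes.txt       # the edited original
cat notes.txt.bak   # the untouched backup
```

Redirecting to a new file leaves the source untouched, which is the safer choice while a script is still being tested.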
Next, we'll look at a few examples to clarify what we've just learned.

Let's Edit

Substituting one pattern for another is a common use for sed. Should you wish to be formal and replace Jill with Jillean, for example, you could employ the following command. First, let's view the original file.

$ cat fuse.txt
Jill has a short fuse - light it and stand well back. So who lit Jill's fuse, and did he stand well back? Read on

Next, invoke sed to perform the edit by typing

$ sed 's/Jill/Jillean/g' fuse.txt
Jillean has a short fuse - light it and stand well back. So who lit Jillean's fuse, and did he stand well back? Read on

We invoked sed, passing the quoted script 's/Jill/Jillean/g' and the name of the file to process. Although not necessary in this example, it's wise always to quote the script to prevent the shell from expanding special characters before passing them to sed.

Our script consists of one command, which does not define a line range; therefore, its function is applied to every line of the file. The function is s for substitute, which has the syntax s/match-text/replace-text/flags. The flag g is for global replace; see "sed Functions" later in this project.

Our next example deletes all blank lines from a file. The sed command to do this specifies a line range, "every blank line," and the function delete. "Every blank line" is defined by the regular expression ^$, delimited by forward slash characters (/). The function d deletes the matched line.

$ sed '/^$/d' blanks.txt

Line ranges are usually given as plain text or regular expressions, and all lines that contain a match for the text or regular expression fall in the line range.

Make sed grep

Make sed behave like grep by combining function p, to print matching lines, and option -n, to suppress the automatic echoing of every input line. We'll search the file biff.txt for all lines that contain the text Biff.
$ sed -n '/Biff/p' biff.txt
*I forget the name - let's assume Biff for want of a single syllable, grunt-able word).
*'Biff' grinned, and I swear that I could hear a few synaptic connections sparking the thought 'threesome' (had he been able to count to three).
*I declined the unspoken suggestion, confusing Biff somewhat

(I've added a star [ * ] to mark each line because lines in the original text occupy several lines when printed in this book.)

In this example, the line range is described by the plain-text expression /Biff/; every line that contains the text Biff will fall in the range and be printed by the function p.

Encode a File

Have some fun encoding text files. In this example, we apply function y, which transforms input lines by replacing characters listed in the first set with those listed in the second set. Our example replaces each lowercase letter with the letter that follows it in the alphabet, wrapping z around to a. No filenames are specified, so sed reads standard input and writes standard output. Press Control-d (end of input) when you get bored.

$ sed 'y/abcdefghijklmnopqrstuvwxyz/bcdefghijklmnopqrstuvwxyza/'
this is just a bit of fun
uijt jt kvtu b cju pg gvo
<Control-d>
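The encoding can be undone by swapping the two character sets, and the grep-like behavior can be checked against grep itself. The sketch below assumes a made-up file, sample.txt, and hypothetical variable names (plain, shifted). It uses a double-quoted script only so that the shell can expand those two variables.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample text; only the second line mentions Biff.
printf 'this is just a bit of fun\nBiff grinned\n' > sample.txt

plain=abcdefghijklmnopqrstuvwxyz
shifted=bcdefghijklmnopqrstuvwxyza

# Encode: each lowercase letter becomes the next letter (z wraps to a).
# Uppercase letters are not in either set, so they pass through unchanged.
sed "y/$plain/$shifted/" sample.txt > encoded.txt
cat encoded.txt

# Decode: swap the two sets to map every letter back again.
sed "y/$shifted/$plain/" encoded.txt > decoded.txt
cmp sample.txt decoded.txt && echo "round trip OK"

# sed -n with function p prints only the matching lines, as grep does.
sed -n '/Biff/p' sample.txt
grep 'Biff' sample.txt
```

Both character sets given to y must be the same length, which is why the decode step can simply reverse them.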
Line Ranges in sed Scripts
Let's examine line ranges in more detail. The most basic line range is the empty one that matches all lines in the input file. When not empty, a line range may be a single address, which usually consists of a regular expression (such as /^$/ to select all empty lines) or plain text (such as /Jill/ to select all lines containing the text Jill). Two addresses separated by a comma select all lines from the first line that matches the first address to the first subsequent line that matches the second address. To select just the lines in Chapter One, for example, we might specify the line range /Chapter One/,/Chapter Two/ (assuming that Chapter One starts with the text "Chapter One," and similarly for Chapter Two).

sed Functions

Immediately following a line range, sed expects to see an editing function to be applied to each line in the range. Here are some of the most useful sed functions:
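The functions used elsewhere in this project (s, d, and p) can each be paired with any kind of line range. The sketch below tries them on a small made-up file, book.txt, whose contents are assumed purely for illustration. It covers the empty range, a single address, and the two-address form /Chapter One/,/Chapter Two/ described above.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: a preface, two chapters, and one empty line.
printf 'Preface\nChapter One\nJill arrives\n\nChapter Two\nJill leaves\n' > book.txt

# Empty line range: s is applied to every line.
sed 's/Jill/Jillean/' book.txt

# Single address (regular expression): d deletes each empty line.
sed '/^$/d' book.txt

# Single address (plain text): p prints only lines containing Jill.
sed -n '/Jill/p' book.txt

# Two addresses: p prints from the first line matching "Chapter One"
# through the next line matching "Chapter Two", inclusive.
sed -n '/Chapter One/,/Chapter Two/p' book.txt
```

Note that a two-address range includes the line that matches the second address, so the Chapter Two heading is printed along with the lines of Chapter One.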
Multiple sed Commands

Suppose that we have a couple of replacements to make to a text file and that we also want to delete lines that contain the author's notes. We could apply sed several times, once for each edit, but instead, let's take advantage of the fact that a sed script, like any script, can include multiple commands. With sed, there are three methods available for writing multiple-command scripts: separate the commands with semicolons within a single script, give each command its own -e option, or write the commands to a file and name that file with the -f option.
The following examples have Jan drinking gin instead of vodka, make Sophie 5 years younger (she'll love me for that), and remove the author's notes. Here's the original text.

$ cat sophie.txt
Note Move this section down
I returned to planet earth when a lady sat down beside me and announced: "Hi, I'm Sophie". "Hello, I'm... (thinking through a Vodka haze) Jan".
Note Check the grammar here
She smiled and we chatted for a while. Sophie was about 30, bleached-blonde, good-looking, and just a shade overweight.

Now let's apply a three-command sed script in which we separate the commands with semicolons.

$ sed 's/Vodka/Gin/g;s/30/25/g;/^[Nn]ote/d' sophie.txt
I returned to planet earth when a lady sat down beside me and announced: "Hi, I'm Sophie". "Hello, I'm... (thinking through a Gin haze) Jan".
She smiled and we chatted for a while. Sophie was about 25, bleached-blonde, good-looking, and just a shade overweight.

Alternatively, we could specify three separate commands by typing

$ sed -e 's/Vodka/Gin/g' -e 's/30/25/g' -e '/^[Nn]ote/d' sophie.txt

For the third alternative, we'll create a script file called 3edits and pass the name of that file to sed. A sed script is a regular text file with each edit command on a separate line.

$ cat 3edits
s/Vodka/Gin/g
s/30/25/g
/^[Nn]ote/d
$ sed -f 3edits sophie.txt

All three alternatives yield the same results.

Complex Line Ranges

In previous examples, we used a single address (in the form of a regular expression) to match the lines we wanted to edit. Tell sed to select a range of lines to edit by specifying two addresses. The first line to match the first address marks the start of the range. The first line after that to match the second address marks the end of the range.

As an example, suppose that we want to remove every paragraph in an HTML file that has been assigned the class tail. Each begins with the HTML tag <p class="tail">, and each ends with a closing tag </p>, like this abbreviated example.

<p class="tail">
<b>Six Vodkas: </b>A tale of Vodka, misunderstanding, an...
</p>

To delete all such paragraphs, we use the following command.

$ sed '/<p class="tail">/,/<\/p>/d' tails.html

Having matched and deleted the first such paragraph, sed continues to search the file for subsequent ranges and deletes all those it finds.

To print the paragraphs instead of deleting them, type

$ sed -n '/<p class="tail">/,/<\/p>/p' index.php
...
<p class="tail">
<b>The Immovable Object: </b>A tale of pain, pursed lips...
</p>

Suppose you wish to edit only in paragraphs that meet certain criteria: changing tale to tail only within paragraphs of class tail, for example. To do so, we again use a range to specify the matching criteria and apply a sed substitute function to the range.

$ sed '/<p class="tail">/,/<\/p>/s/tale/tail/' index.php
<p class="tail">
<b>The Immovable Object: </b>A tail of pain, pursed lips...
</p>
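The range commands above can be tried safely on a small test file. The sketch below builds a hypothetical page.html whose contents are invented for illustration. It writes the opening tag as <p class="tail">, an assumption based on the description of paragraphs that have been assigned the class tail.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical page: one ordinary paragraph and one of class "tail".
cat > page.html <<'EOF'
<p>
A tale told in the main text.
</p>
<p class="tail">
<b>Six Vodkas: </b>A tale of Vodka
</p>
EOF

# Delete every class-tail paragraph, from opening tag to closing tag.
sed '/<p class="tail">/,/<\/p>/d' page.html

# Print only the class-tail paragraphs.
sed -n '/<p class="tail">/,/<\/p>/p' page.html

# Substitute only inside class-tail paragraphs; the "tale" in the
# ordinary paragraph lies outside the range and is left alone.
sed '/<p class="tail">/,/<\/p>/s/tale/tail/' page.html
```

Because a range ends at the first line matching the second address, this approach relies on each closing </p> tag appearing after its opening tag, with no nested paragraphs in between.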