Project 61. Learn Advanced sed

"What functionality does the sed command offer for the more advanced editing tasks?"

This project presents a couple of tasks that illustrate some of the more advanced editing capabilities of the sed stream editor. Project 58 shows how to apply such commands to a batch of files. Project 59 covers basic use of sed, and Projects 60 and 62 cover the awk command.

Highlight a Block of Text

We'll illustrate some of the advanced features of sed through a couple of tasks that highlight first a block of text and then individual lines of text.

For task one, we'll write a sed script that highlights a block of text by placing >>> at the start of each line in the block. Let's assume that the region to be highlighted has previously been marked by mark-start and mark-end.

The (abridged) text looks like this. (In reality we might have hundreds of lines to mark, justifying the use of sed.)

$ cat sophie.txt
I hopped out of the car and promptly ate gravel.
The non-retracting seat belt had wrapped itself around my ankle
mark-start
clearly attempting to do what Sophie failed to do during
the drive home - kill me. :-) Sophie rushed to my rescue,
mark-end
helped me up, and brushed off the stones from my dress.
There are better ways to get "stoned"!


The sed script must specify the line range as /mark-start/,/mark-end/, and within this range, add the text >>> to the start of each matching line. ("Start of line" is denoted by the regular expression ^.)
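Before modifying a block, it can help to confirm which lines the range actually selects. The following is a minimal sketch, assuming a small hypothetical file sample.txt rather than the full passage; option -n suppresses sed's normal output, and the p function prints only the lines inside the range.

```shell
# Hypothetical sample file with a marked block (not the book's full text).
tmp=$(mktemp -d)
cat > "$tmp/sample.txt" <<'EOF'
before the block
mark-start
inside the block
mark-end
after the block
EOF

# -n suppresses the default output; p prints only lines within the range.
selected=$(sed -n '/mark-start/,/mark-end/p' "$tmp/sample.txt")
printf '%s\n' "$selected"

rm -r "$tmp"
```

Only the two marker lines and the line between them should be printed; both marker lines are part of the range.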

Learn More

See Project 77 if you are unfamiliar with regular expressions.


To highlight the marked block from file sophie.txt, type the following command.

$ sed '/mark-start/,/mark-end/s/^/>>>/' sophie.txt
I hopped out of the car and promptly ate gravel.
The non-retracting seat belt had wrapped itself around my ankle
>>>mark-start
>>>clearly attempting to do what Sophie failed to do during
>>>the drive home - kill me. :-) Sophie rushed to my rescue,
>>>mark-end
helped me up, and brushed off the stones from my dress.
There are better ways to get "stoned"!


(For illustrative purposes, we've let output go to the Terminal screen. Normally, you'd write it back to the file by specifying option -i or, before Mac OS X 10.4, by redirecting output to a new file.)
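Writing the result back might be sketched as follows, again assuming a hypothetical sample.txt. The redirect-and-rename approach works with any sed. The -i forms appear only as comments because the syntax differs between versions: BSD sed, as shipped with Mac OS X, expects a backup extension after -i (an empty string means no backup), whereas GNU sed takes -i on its own.

```shell
# Hypothetical sample file with a marked block.
tmp=$(mktemp -d)
cat > "$tmp/sample.txt" <<'EOF'
before the block
mark-start
inside the block
mark-end
after the block
EOF

# Portable approach: write to a new file, then replace the original
# only if sed succeeded.
sed '/mark-start/,/mark-end/s/^/>>>/' "$tmp/sample.txt" > "$tmp/sample.new" &&
  mv "$tmp/sample.new" "$tmp/sample.txt"

# In-place alternatives (not run here; syntax differs between versions):
#   sed -i '' '/mark-start/,/mark-end/s/^/>>>/' file    # BSD sed (Mac OS X)
#   sed -i '/mark-start/,/mark-end/s/^/>>>/' file       # GNU sed

result=$(cat "$tmp/sample.txt")
printf '%s\n' "$result"

rm -r "$tmp"
```

Never redirect output straight back onto the input file (sed ... file > file); the shell empties the file before sed reads it.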

When you wish to remove the highlights we just added, write a sed script to delete >>> from the line range; then delete the marker lines too (we probably don't want them there). The following command does just this, using three edit commands: one to remove >>> and one each to remove the two marker lines.

$ sed '/mark-start/,/mark-end/s/^>>>//;/mark-start/d;/mark-end/d' sophie.txt
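As a quick check, the highlight and removal commands can be chained in a pipeline. This is only a sketch on a hypothetical sample.txt; the first sed adds the highlight and the second strips it and deletes both marker lines.

```shell
# Hypothetical sample file with a marked block.
tmp=$(mktemp -d)
cat > "$tmp/sample.txt" <<'EOF'
before the block
mark-start
inside the block
mark-end
after the block
EOF

# Highlight the block, then remove the highlight and the marker lines.
roundtrip=$(sed '/mark-start/,/mark-end/s/^/>>>/' "$tmp/sample.txt" |
  sed '/mark-start/,/mark-end/s/^>>>//;/mark-start/d;/mark-end/d')
printf '%s\n' "$roundtrip"

rm -r "$tmp"
```

The result should contain the three text lines with neither >>> nor the markers.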


If you'd like to remove the highlight characters from the text but leave the marker lines in place, still highlighted, you need to be a bit cleverer. We could either remove >>> from every line in the block and then add it back to lines starting with mark-, or we could be more specific in selecting the lines to modify. The latter alternative shows off more sed tricks, so we'll choose that one.

Type the following command line.

$ sed '/mark-start/,/mark-end/{/^>>>mark-/!s/^>>>//;}' sophie.txt
I hopped out of the car and promptly ate gravel.
The non-retracting seat belt had wrapped itself around my ankle
>>>mark-start
clearly attempting to do what Sophie failed to do during
the drive home - kill me. :-) Sophie rushed to my rescue,
>>>mark-end
helped me up, and brushed off the stones from my dress.
There are better ways to get "stoned"!


This does the trick but requires some explanation. The line range /mark-start/,/mark-end/ should be familiar. The action employs braces {...} to introduce a function list. A function list lets us apply more than one function to a line range, where normally only one is allowed, and also lets us apply a further address within the existing range.

The example selects lines within the marked block that also start with >>>mark- by specifying /^>>>mark-/ within the braces. The exclamation point that follows inverts the sense of the match, thereby selecting only lines that do not start with >>>mark- (within the marked block). To the selected lines, we apply a substitute function s/^>>>// to remove the text >>> from the start of the line. The function list is terminated by ;}.
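For comparison, the other approach mentioned earlier (remove >>> from every line in the block, then add it back to the marker lines) might be sketched like this. The input is a hypothetical, already highlighted sample, and both commands are run so their results can be compared.

```shell
# Hypothetical sample: a block that has already been highlighted.
tmp=$(mktemp -d)
cat > "$tmp/marked.txt" <<'EOF'
before the block
>>>mark-start
>>>inside the block
>>>mark-end
after the block
EOF

# Approach 1: strip >>> from every line in the range, then put it back
# on lines that now start with mark-.
readded=$(sed '/mark-start/,/mark-end/{s/^>>>//;s/^mark-/>>>mark-/;}' "$tmp/marked.txt")

# Approach 2: the negated address, which skips the marker lines.
negated=$(sed '/mark-start/,/mark-end/{/^>>>mark-/!s/^>>>//;}' "$tmp/marked.txt")

printf '%s\n' "$readded"

rm -r "$tmp"
```

Both variables should hold the same text. The negated address avoids modifying the marker lines at all, whereas the first approach edits them twice.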

Tip

If a pattern contains the delimiter character /, escape it by using a backslash. For example, we can specify the pattern http:// using /http:\/\//. Alternatively, we can choose a different delimiter, such as %. For the substitute function, simply use the new delimiter throughout, typing s%http://% followed by the replacement text and a closing %. When specifying a line range, the first delimiter must be preceded by a backslash, as in \%http://%.
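Alternative delimiters can be tried out as in the sketch below. The sample text and the replacement https:// are illustrative assumptions. Note that an address can also take a different delimiter, provided the opening delimiter is escaped with a backslash.

```shell
# Substitute using % as the delimiter, so the slashes need no escaping.
swapped=$(printf '%s\n' 'see http://example.com' |
  sed 's%http://%https://%')

# An address with % as the delimiter: the opening % is escaped.
matched=$(printf '%s\n' 'no link here' 'see http://example.com' |
  sed -n '\%http://%p')

printf '%s\n' "$swapped" "$matched"
```

The first command rewrites the URL scheme; the second prints only the line that contains http://.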


Highlight Lines

For our second task, we'll search a text file for a pattern, sophie, and mark each line that contains the pattern by inserting a line containing vvvvvvvvvvvv before it and another containing ^^^^^^^^^^^^ after it. To make things more interesting, we'll make our search case insensitive. Here's the sed script, which we've written to a script file called mark.script.

Learn More

Project 59 covers basic sed use and has an example of using the y function.


$ cat mark.script
# convert input to lower case, making a copy in hold space
h; y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
# match lines that contain the search term 'sophie'
/sophie/ {
i\
vvvvvvvvvvvv
a\
^^^^^^^^^^^^
}
# restore original input from the hold space, and print it
x


Let's examine the script. First, lines starting with the hash (#) symbol are treated as comments and ignored by sed.

Standard sed does not provide a case-insensitive pattern match (GNU sed offers an I flag as an extension), so we employ the y function to convert the input line to lowercase, and later, we specify the search term in lowercase. Before we corrupt the input line, we must preserve it by copying it to the hold space, using function h. Neither function has a line range, so both are applied to every line of input.

Next, we operate on all lines containing the text sophie. The line range is followed by a function list in {...}, meaning that every function inside the braces is applied to a matching line. The function i\ writes the text vvvvvvvvvvvv before the current line, and the function a\ writes the text ^^^^^^^^^^^^ after the current line. The text to be written must be on a new script line, as shown.

Finally, the function x exchanges the contents of the hold space with the current input line, effectively restoring our input to its original mixed-case form before sed writes it to the Terminal screen.

The text passage we'll be marking is in file sophie.txt.

$ cat sophie.txt
I hopped out of the car and promptly ate gravel.
The non-retracting seat belt had wrapped itself around my ankle
clearly attempting to do what Sophie failed to do during
the drive home - kill me. :-) Sophie rushed to my rescue,
helped me up, and brushed off the stones from my dress.
There are better ways to get "stoned"!


To mark the text, we invoke sed, and pass it the name of our script and the input file as arguments.

$ sed -f mark.script sophie.txt
I hopped out of the car and promptly ate gravel.
The non-retracting seat belt had wrapped itself around my ankle
vvvvvvvvvvvv
clearly attempting to do what Sophie failed to do during
^^^^^^^^^^^^
vvvvvvvvvvvv
the drive home - kill me. :-) Sophie rushed to my rescue,
^^^^^^^^^^^^
helped me up, and brushed off the stones from my dress.
There are better ways to get "stoned"!
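The same hold-space technique can be tried in isolation. The sketch below is not part of mark.script; it assumes three hypothetical input lines and uses h, y, and x to build a case-insensitive search that prints matching lines in their original mixed case.

```shell
# h saves the original line, y lowercases the working copy, and, when
# the lowercase copy matches, x swaps the original back so p prints it.
found=$(printf '%s\n' \
  'Sophie rushed to my rescue,' \
  'helped me up,' \
  'and SOPHIE laughed.' |
  sed -n 'h;y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/;/sophie/{x;p;}')

printf '%s\n' "$found"
```

The first and third lines should be printed unchanged, even though the match was made against their lowercase copies.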





Mac OS X UNIX 101 Byte-Sized Projects
ISBN: 0321374118
EAN: 2147483647
Year: 2003
Pages: 153
Authors: Adrian Mayo
