Forms of the Sed Command



The syntax of the sed command comes in two forms, as follows:

sed [-an] command [file ...]

sed [-an] [-e command] [-f command_file] [file ...]

You will use one form or the other, but not both, in one command. The first form is the simpler one: one sed command is used to edit the input file(s). If no input files are used, sed edits data read from stdin. Here is a command of this type:


sed '/A/d' goodoleboys.txt

The sed command is /A/d, which deletes every line that contains an uppercase A. The input file is goodoleboys.txt, and the modified data is written to stdout.
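As a small sketch of the first form reading from stdin instead of a file: the three sample lines below are invented for illustration and are not the contents of goodoleboys.txt.

```shell
# Hypothetical sample data; not the contents of goodoleboys.txt.
# No input file is named, so sed edits the lines it reads from stdin.
result=$(printf '%s\n' 'Amos' 'Bill' 'Chuck' | sed '/A/d')

# Only the lines without an uppercase A remain.
printf '%s\n' "$result"
```

Here the line Amos is deleted, and Bill and Chuck pass through unchanged.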

You need to use the second form when you want to apply more than one editing command to the input. This type of command is shown in the following example:

sed -e 's/Daisy/Ethel/' -f seddata1.txt goodoleboys.txt

The -e option allows you to include editing commands within the command string itself, while the -f option tells sed to read editing commands from a file. In this example, sed reads commands from two places. First, it applies the command following the -e option. Then, it applies the commands in the seddata1.txt file. If the -e option followed the -f option, sed would apply the commands in the seddata1.txt file first.
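The effect of option order can be seen in a small sketch. The file names, the sample line, and the command stored in the file are all hypothetical, and the sketch assumes that the mktemp utility is available in your environment.

```shell
# Hypothetical data file and command file, created in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Daisy likes Ethel' > "$tmpdir/names.txt"
printf '%s\n' 's/Ethel/Lucy/' > "$tmpdir/cmds.sed"

# -e first: Daisy becomes Ethel, then the file's command changes the first Ethel to Lucy.
e_first=$(sed -e 's/Daisy/Ethel/' -f "$tmpdir/cmds.sed" "$tmpdir/names.txt")

# -f first: Ethel becomes Lucy, then the -e command changes Daisy to Ethel.
f_first=$(sed -f "$tmpdir/cmds.sed" -e 's/Daisy/Ethel/' "$tmpdir/names.txt")

printf '%s\n' "$e_first"   # Lucy likes Ethel
printf '%s\n' "$f_first"   # Ethel likes Lucy

rm -rf "$tmpdir"
```

The same two editing commands produce different output because each line passes through the commands in the order in which the options appear.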

You can use the filter form of sed when you're doing interactive or scripted Qshell work. Sometimes, a sed solution is easier to write correctly than one built from Qshell variable expansion and substitution.

Here is an example of sed as a filter:

for i in *.txt ; do j=$(echo "$i" | sed -e 's/\.txt$/.new/');
echo mv "$i" "$j"; done

This example sends the name of each text file in the current directory to sed in a command substitution. Sed changes each name so that it has an extension of .new, and assigns the result to the j variable. The loop then generates a command using the mv utility to rename the original text file to the new file. The mv command isn't executed directly; instead, echo is used to display the mv command to standard output. Displaying generated Qshell commands is always a prudent debugging step before executing them.



Sed Options

The four options for sed are listed in Table 18.1.

Table 18.1: Sed Options

| Option | Description |
|--------|-------------|
| -a | Delay opening of files to which output is directed with the w command. |
| -e | Read a sed command from the following argument. |
| -f | Read sed commands from the file named in the following argument. |
| -n | Do not automatically write to stdout. |

The -a option delays the opening of files that are to be overwritten until the last possible moment. Normally, Qshell clears files that are to be overwritten before sed begins to run. This means that a file might be cleared even though nothing is ever written to it. The -a option ensures that a file is not cleared unless it is written to.

The -e option, which is repeatable, precedes a sed command. The -f option, which is also repeatable, precedes the name of a file in which sed commands are stored. The -e and -f options are not mutually exclusive. As you saw in the previous example, you may use both of them in the same command.

The -n option tells sed not to automatically write the contents of the pattern space to stdout after applying all editing commands.
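A minimal sketch of the difference the -n option makes, using invented sample lines and the p (print) function:

```shell
# Hypothetical sample data.
data=$(printf '%s\n' 'Amos' 'Bill' 'Chuck')

# Without -n, every line is written automatically, and p writes the
# matching line a second time.
with_auto=$(printf '%s\n' "$data" | sed '/Bill/p')

# With -n, only the lines explicitly printed by p are written.
only_p=$(printf '%s\n' "$data" | sed -n '/Bill/p')

printf '%s\n' "$with_auto"   # Amos, Bill, Bill, Chuck (one per line)
printf '%s\n' "$only_p"      # Bill
```

Combining -n with p in this way makes sed behave much like a simple grep.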



Sed Commands

A sed command consists of three parts: the address, a function, and arguments. You may precede the address and function parts of the command with white space. As the following syntax shows, the only required part is the function:


[address[,address]]function[arguments]

Let's look at each of the three parts in more detail.

Address

The address identifies the lines to be selected. Depending on the function, you may specify no address, a single address, or two addresses separated from one another by a comma.

If you do not specify an address, all lines of the input file are selected for editing. If you specify one address, only the lines matching the address are edited. If you specify two addresses, sed edits one or more ranges of lines.

Each address can be one of the following:

  • A line number, from all input files numbered consecutively

  • A dollar sign, to indicate the last line of the last input file

  • A regular expression delimited by the forward-slash character, /
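The address forms above can be sketched as follows. The four sample lines are invented, and the -n option with the p function is used only to make the selected lines visible.

```shell
# Hypothetical sample data.
data=$(printf '%s\n' 'Amos' 'Bill' 'Chuck' 'Dave')

second=$(printf '%s\n' "$data" | sed -n '2p')        # line number: Bill
last=$(printf '%s\n' "$data" | sed -n '$p')          # last line: Dave
match=$(printf '%s\n' "$data" | sed -n '/^C/p')      # regular expression: Chuck
range=$(printf '%s\n' "$data" | sed -n '2,/^C/p')    # two addresses: Bill and Chuck

printf '%s\n' "$second" "$last" "$match" "$range"
```

The last command mixes address types: the range starts at line 2 and ends at the next line that begins with C.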

The regular expressions are similar to the basic regular expressions that grep and other utilities recognize, but sed adds two features of its own:

  • The escape sequence \n matches the newline character.

  • Any character other than a backslash or newline may be used as a delimiter in regular expressions. Any delimiter may be escaped with a backslash.
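Both features can be illustrated with a short sketch. The path names and sample lines are hypothetical; the N function, described later in Table 18.3, is used only to place a newline inside the pattern space.

```shell
# A comma as the delimiter of the s function avoids escaping the slashes.
path=$(echo '/usr/local/bin' | sed 's,/usr/local,/opt,')

# In an address, a delimiter other than / is introduced with a backslash.
home=$(printf '%s\n' '/home/amos' '/tmp/x' | sed -n '\,^/home,p')

# N appends the next line to the pattern space; \n then matches the
# embedded newline so that it can be replaced with a space.
joined=$(printf '%s\n' 'Amos' 'Bill' | sed 'N;s/\n/ /')

printf '%s\n' "$path"     # /opt/bin
printf '%s\n' "$home"     # /home/amos
printf '%s\n' "$joined"   # Amos Bill
```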

Table 18.2 lists the regular-expression metacharacters for sed.

Table 18.2: Metacharacters for Use with Sed

| Metacharacter | Description |
|---------------|-------------|
| . (period) | Match any character except end-of-line. |
| * | Match zero or more occurrences of the preceding character. |
| ^ | Match from the beginning of the line. |
| $ | Match from the end of the line. |
| [ ] | Match any character within the brackets. Ranges may be specified with a hyphen. |
| [^ ] | Negate the groups or ranges of characters in the brackets. The caret must be the first character within the brackets. |
| \{m\} | Match exactly m occurrences of the preceding character. |
| \{m,\} | Match m or more occurrences of the preceding character. |
| \{m,n\} | Match m to n occurrences of the preceding character. |
| \ | Turn off the special meaning of the following character. |
| \(\) | Define a back reference to save matched characters as a pattern. The saved pattern can be referenced with a backslash followed by a number. |
| // | Match the last-used regular expression. |
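A few of these metacharacters are sketched below with invented input strings. The s (substitute) function used here is described in Table 18.3.

```shell
# \( \) saves each matched word; \2 and \1 refer to them in the replacement.
swapped=$(echo 'Bill Amos' | sed 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2 \1/')

# \{3\} with ^ and $ selects lines made up of exactly three digits.
three=$(printf '%s\n' '12' '123' '1234' | sed -n '/^[0-9]\{3\}$/p')

# [^0-9]* matches a leading run of characters that are not digits.
digits=$(echo 'abc123' | sed 's/^[^0-9]*//')

# The empty regular expression // reuses the last one used, here X.
reused=$(echo 'aXbXc' | sed '/X/s//-/')

printf '%s\n' "$swapped"   # Amos Bill
printf '%s\n' "$three"     # 123
printf '%s\n' "$digits"    # 123
printf '%s\n' "$reused"    # a-bXc
```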

Function and Arguments

The function is the command itself. It tells sed what to do with the input record. All functions are one character long. They are listed in Table 18.3.

Table 18.3: Sed Functions

| Function | Arguments | Description | Maximum Addresses |
|----------|-----------|-------------|-------------------|
| a | text | Write text to stdout after writing the pattern space. | 1 |
| b | label (optional) | Branch to a label. If a label is not specified, branch to the end of the list of functions. | 2 |
| c | text | Replace line(s) with new text. | 2 |
| d | | Do not write the pattern space to stdout. | 2 |
| D | | Delete the pattern space up to and including the first newline character. | 2 |
| g | | Copy the holding buffer to the pattern space. | 2 |
| G | | Append the holding buffer to the pattern space. | 2 |
| h | | Copy the pattern space to the holding buffer. | 2 |
| H | | Append the pattern space to the holding buffer. | 2 |
| i | text | Write text to stdout before writing the pattern space. | 1 |
| l (ell) | | Replace nonprintable characters with visual representations. | 2 |
| n | | Write the pattern space to stdout (unless the -n option was specified), and read the next line of input into the pattern space. | 2 |
| N | | Append the next input line to the pattern space. | 2 |
| p | | Print the pattern space to stdout immediately. | 2 |
| P | | Print the pattern space, up to and including the first newline character, immediately. | 2 |
| q | | Terminate the editing session after processing the current input record. | 1 |
| r | file name | Read a file into stdout. | 1 |
| s | search string, replacement string, flags | Substitute one string for another. | 2 |
| t | | Branch if substitutions have been made. | 2 |
| w | file name | Write the pattern space to a file. | 2 |
| x | | Exchange the contents of the holding buffer and the pattern space. | 2 |
| y | | Replace each character in a set with the corresponding character of another set. | 2 |
| = | | Write the line number to stdout. | 1 |
| : (colon) | label | Define a label as a target for a branch. | |
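As a brief sketch of how a few of these functions behave, the commands below use invented sample lines. The line-reversing command is a common sed idiom chosen here for illustration.

```shell
# Hypothetical sample data.
data=$(printf '%s\n' 'Amos' 'Bill' 'Chuck')

# G and h: on every line but the first, append the holding buffer to the
# pattern space, then copy the result back to the holding buffer. On the
# last line, p prints the accumulated lines, which are now in reverse order.
reversed=$(printf '%s\n' "$data" | sed -n '1!G;h;$p')

# y: replace each character of the first set with its partner in the second.
upper=$(echo 'abc' | sed 'y/abc/ABC/')

# q: stop after processing line 2.
first_two=$(printf '%s\n' "$data" | sed '2q')

printf '%s\n' "$reversed"    # Chuck, Bill, Amos (one per line)
printf '%s\n' "$upper"       # ABC
printf '%s\n' "$first_two"   # Amos, Bill (one per line)
```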

You may negate a function by preceding it with an exclamation point. The following example deletes all lines except line 2:

sed '2!d' goodoleboys.txt
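The same negation can be tried on stdin. The sample lines below are invented and stand in for the contents of goodoleboys.txt.

```shell
# Hypothetical sample data.
data=$(printf '%s\n' 'Amos' 'Bill' 'Chuck')

# Delete every line except line 2.
only_second=$(printf '%s\n' "$data" | sed '2!d')

# Negation also works with a regular-expression address:
# print every line that does not contain an uppercase A.
not_a=$(printf '%s\n' "$data" | sed -n '/A/!p')

printf '%s\n' "$only_second"   # Bill
printf '%s\n' "$not_a"         # Bill, Chuck (one per line)
```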

You may also include more than one function for an address. Enclose the group of functions in braces, and follow each function with a semicolon and at least one space, as shown here:

sed '2{h; d; }' goodoleboys.txt

When sed reads line 2, it executes two functions, h and d.

Instead of a semicolon and space, you may also separate functions with newline characters, like this:

sed '2{
> h
> d
> }' goodoleboys.txt

As you can see, sed allows each function to be listed on its own line, with newline characters rather than semicolons separating the functions. The greater-than signs are the Qshell secondary prompt character; you do not type them.
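On its own, the group h and d only removes line 2 from the output, because nothing retrieves the held line. A possible extension, sketched here with invented sample lines, adds a second command that appends the holding buffer at the last line, which moves line 2 to the end of the output.

```shell
# Hypothetical sample data.
data=$(printf '%s\n' 'Amos' 'Bill' 'Chuck')

# Line 2 is copied to the holding buffer and deleted. At the last line,
# G appends the holding buffer to the pattern space before it is written.
moved=$(printf '%s\n' "$data" | sed -e '2{h; d; }' -e '$G')

printf '%s\n' "$moved"   # Amos, Chuck, Bill (one per line)
```

The sketch also relies on the -e option being repeatable, so each command can be given as its own argument.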