CPS 444/544 Lecture notes: sed (stream editor)
Coverage: [UPE] Chapter 4, §4.3 (pp. 108-114)
ex (line editor)
- vi is close to a full programming language because of its
use of ex
- a masterpiece in user-interface software design
- approaches to studying: memorize commands or
learn/know general syntax
- general syntax of ex commands: :[address]command[options]
- deleting all blank lines: :g/^$/d
- example addresses
- 10,20 (lines 10 thru 20)
- .,100 (current line thru line 100)
- .,$ (current line thru last line of file)
- % = 1,$
- :set list (displays each
TAB as ^I and each end of line as $)
- :set nolist
- search and replace:
:%s/RE/replacement_text/g
(same as 1,$s/RE/replacement_text/g); examples:
- :%s/Alice/Lucy/g (the g makes it global,
i.e., replace all occurrences, not just the first, on each line)
- :%s/hello/& world/g (& represents the
matched text)
- :%s/[TAB]/   /g
(replaces TABs with 3 consecutive spaces on every line)
- :%s/[ TAB][ TAB]*$//
(purges trailing whitespace on every line)
- :%s/fprintf/FPRINTF/g
(replaces all occurrences of fprintf with FPRINTF)
- :.,$s/fprintf/FPRINTF/g
(replaces occurrences of fprintf from the
current line (.) to the last line of the file ($)
with FPRINTF)
- :10,20s/fprintf/FPRINTF/g
(replaces occurrences of fprintf
from line 10 to 20 with FPRINTF)
- :%s/^\([A-Z][a-z-]*\)[,][ ]\([A-Z][a-z-]*\)$/\2 \1/
(converts names from <last>, <first>
format to <first> <last> format)
- :%s/^\([[:alpha:]]*\)[ ]\([[:alpha:]]*\)$/\2, \1/
(undoes the previous transformation)
- move text: :100,200m.
(moves lines 100 thru 200 to just after the current line)
- another example - :10,20w newfile
(extracts lines 10 thru 20 and writes them to newfile)
Essential sed
- (non-interactive) stream editor
- beginnings of a complete command language
- execution model
for each line in the input stream:
- read input line into pattern space,
- apply commands to pattern space,
- send pattern space to stdout.
- similar syntax to ex
- basic syntax: <condition><action>
- detailed syntax:
[<address>[,<address>]][!]<command>[<arguments>]
| conditions | actions |
| /RE/ | d |
| m,n | p |
| $ | q |
| <condition>! | s/RE/string/ |
| <condition1>,<condition2>! | w <filename> |
| | i |
| | a |
- invoking sed
- sed '<edit commands>' <file(s)>
- cat <file(s)> | sed '<edit commands>'
- sed -f <edit commands file> <file(s)>
- -e option
- sed -e '{ ... }' file (... stands for
one or more editing commands on separate lines; an address placed
before the { applies to every command inside the braces)
- if curly braces ({ }) are omitted, give each editing command its
own, possibly distinct, address
- use of -n option, which suppresses the automatic output (step 3 above)
(compare runs with and without the p action or d action)
- without -n option, the pattern space is printed after every cycle, as if a p action were assumed
- two examples which produce the same output
- one with -n: sed -n '/one/p' file
- one without -n: sed '/one/!d' file
- sed -n '/<RE>/p' file = grep <RE> file
- sed is Turing complete
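A minimal sketch of the execution model and of the -n/p versus !d equivalence described above. The three-line input file and all variable names are hypothetical, made up only for this illustration:

```shell
# Hypothetical three-line input, created only for this illustration.
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/file"

# Each line passes through every command before the next line is read,
# so "one" becomes "two" and is then immediately turned into "three".
chained=$(sed -e 's/one/two/' -e 's/two/three/' "$tmp/file")

# Three ways to keep only the lines that match an RE.
with_n=$(sed -n '/t/p' "$tmp/file")     # -n: print only on request
without_n=$(sed '/t/!d' "$tmp/file")    # no -n: delete whatever does not match
with_grep=$(grep 't' "$tmp/file")

printf '%s\n' "$chained" '---' "$with_n"
```

Every line of the first result reads three, which would not happen if sed applied the first command to the whole file before starting on the second.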
Some representative examples
- sed 's/[TAB]/   /g' main.c
# converts every TAB to three consecutive spaces on every line
(will changes take effect in the file main.c?)
- sed 's/[ TAB][ TAB]*$//' main.c
# purges trailing whitespace from each line
- sed 's/index1/index2/g' main.c # replaces string index1 with string index2
- sed -n '20,30p' file
- sed '1,10d' file
- sed '$d' file
- du -a | sed 's/.*[TAB]//' (ref. [UPE] p. 109)
- sed 's/^\([A-Z][a-z-]*\)[,][ ]\([A-Z][a-z-]*\)$/\2 \1/' file
- sed '10,20w newfile' file
- sed '1,/^$/d' file
- sed -n '/^$/,/^end/p' file
- sed 's/^/[TAB]/' file (ref. [UPE] p. 109)
- sed '/./s/^/[TAB]/' file (ref. [UPE] p. 110)
- sed '/^$/!s/^/[TAB]/' file (! inverts the condition)
(ref. [UPE] p. 110)
- deleting line(s) which contain the strings one or two
sed '/one/d
/two/d' file
- put the editing commands above in a file commands.sed and invoke:
sed -f commands.sed <file(s)>
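A small sketch of the TAB and trailing-whitespace examples above, run against a throwaway file. The contents of main.c here are invented, and the variable tab is a stand-in for the [TAB] placeholder used in these notes:

```shell
# Hypothetical sample file; its contents are made up for this illustration.
tmp=$(mktemp -d)
printf 'int x;\t\n\tint y;  \n' > "$tmp/main.c"
before=$(cksum < "$tmp/main.c")

tab=$(printf '\t')   # a literal TAB, standing in for [TAB] in the notes

# Convert every TAB to three spaces, then purge trailing whitespace.
cleaned=$(sed -e "s/$tab/   /g" -e "s/[ $tab][ $tab]*\$//" "$tmp/main.c")

after=$(cksum < "$tmp/main.c")
printf '%s\n' "$cleaned"
```

The checksums taken before and after are identical: sed writes its result to stdout and leaves the input file as it was.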
More examples
For the remainder of these notes, consider the following file
named faculty.details:
Name: Barbara Smith Office: 150 Anderson Hall Course: CPS 350
Name: James P. Buckley Office: 139 Anderson Hall Course: ASI 150
Name: Dale Courte Office: 147 Anderson Hall Course: CPS 499/599
Name: Saverio Perugini Office: 145 Anderson Hall Course: CPS 444/544
Name: Atif Abueida Office: 105-B Science Center Course: MTH 218
Name: Jennifer Seitzer Office: 147 Anderson Hall Course: CPS 481/581
Name: R. Sritharan Office: 149 Anderson Hall Course: CPS 487
Name: Joseph E. Lang Office: 146 Anderson Hall Course: CPS 346
Name: William F. Moroney Office: 305 St. Joe's Course: PSY 495/506
Examples:
sed -n '/CPS/p' faculty.details # same as grep CPS faculty.details
sed '/CPS/!d' faculty.details # same as above
sed -n '/[/]/p' faculty.details # prints lines with a cross-listed course; same as sed -n '/\//p' or grep '\/' faculty.details
sed '/\//d' faculty.details # print lines containing a non-cross-listed course; same as grep -v '\/' faculty.details
sed 's/^Name:[ ]//' faculty.details # removes "Name: " from file faculty.details
sed 's/^Name:[ ]//' faculty.details | sed 's/Office:[ ]//' # removes "Name: " & "Office: " from faculty.details
# how can we purge all attribute labels (i.e., "Name: ", "Office: ", "Course: ")? multiple ways:
sed 's/[A-Za-z][A-Za-z]*: //g' faculty.details
sed 's/[A-Za-z]+: //g' faculty.details # will not work, since sed uses basic regular expressions (where + is an ordinary character) and not extended REs
sed 's/[A-Za-z]\{1,\}: //g' faculty.details
sed 's/^Name:[ ]//' faculty.details | sed 's/Office:[ ]//' | sed 's/Course:[ ]//' # purges all attribute labels
sed -e 's/^Name:[ ]//
s/Office:[ ]//
s/Course:[ ]//' faculty.details
cat sedfile
s/^Name:[ ]//
s/Office:[ ]//
s/Course:[ ]//
sed -f sedfile faculty.details
sed 's/^Name:[ ]\(.*\)Office:[ ]\(.*\)Course:[ ]\(.*\)$/\1\2\3/' faculty.details
sed 's/[A-Za-z][A-Za-z]*://g' faculty.details
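As a quick check that several of the label-purging variants above agree, here is a sketch that runs them on a two-line excerpt of faculty.details (the temporary directory and variable names are assumptions of this example):

```shell
tmp=$(mktemp -d)
cat > "$tmp/faculty.details" <<'EOF'
Name: Barbara Smith Office: 150 Anderson Hall Course: CPS 350
Name: William F. Moroney Office: 305 St. Joe's Course: PSY 495/506
EOF

# One substitution with the g flag ...
a=$(sed 's/[A-Za-z][A-Za-z]*: //g' "$tmp/faculty.details")
# ... one substitution per label ...
b=$(sed -e 's/^Name: //' -e 's/Office: //' -e 's/Course: //' "$tmp/faculty.details")
# ... and the interval form \{1,\}.
c=$(sed 's/[A-Za-z]\{1,\}: //g' "$tmp/faculty.details")
# The + form matches nothing here, so the text comes back unchanged.
plus=$(sed 's/[A-Za-z]+: //g' "$tmp/faculty.details")

printf '%s\n' "$a"
```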
d for delete
- delete lines from the output stream, not original file
- examples:
- sed 'd' faculty.details reads in one line at a time into
a buffer (work space) and deletes it; d also ends the cycle, so
nothing is printed and the output is empty
- sed '1d' faculty.details reads in one line at a time
into the buffer, deletes it if it is line 1, and prints the buffer
contents onto output (in this case, all lines except 1 would be output)
- sed '$d' faculty.details does the same, but for the last line
- sed '2,4d' faculty.details deletes lines from 2 up to
and including line 4
- sed '/Smith/,/ran/d' faculty.details deletes lines
starting from one which matches Smith up to and including one
which matches ran
- sed '/Smith/,/ran/!d' faculty.details negates the address (i.e.,
do not delete these lines, and delete others)
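The d examples above can be tried on a shorter input. This sketch assumes a hypothetical five-line file of surnames in place of faculty.details:

```shell
# Hypothetical five-line input, one surname per line.
tmp=$(mktemp -d)
printf 'Smith\nBuckley\nCourte\nSritharan\nLang\n' > "$tmp/names"

none=$(sed 'd' "$tmp/names")                   # every line deleted: no output
no_first=$(sed '1d' "$tmp/names")              # all lines except line 1
no_last=$(sed '$d' "$tmp/names")               # all lines except the last
outside=$(sed '/Smith/,/ran/d' "$tmp/names")   # drops Smith through Sritharan
inside=$(sed '/Smith/,/ran/!d' "$tmp/names")   # keeps only that range

printf '%s\n' "$outside" '---' "$inside"
```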
p for print
- print lines from the buffer
- examples:
- sed 'p' faculty.details reads in one line
at a time into the buffer and prints each. Notice that by default
sed prints what is in the buffer. Therefore, you will get
two copies of each line.
- in sed -n 'p' faculty.details, the -n suppresses
the default print action of sed. Therefore, this is the equivalent
of doing a cat.
- we can use the same addresses as before (e.g.,
sed -n '4,6p' faculty.details prints lines 4 through 6).
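A short sketch of the three p cases above, assuming a hypothetical seven-line input. The line count uses the = command with the $ address, which prints the number of the last line:

```shell
# Hypothetical seven-line input.
tmp=$(mktemp -d)
printf 'l1\nl2\nl3\nl4\nl5\nl6\nl7\n' > "$tmp/lines"

# p plus the default print gives two copies of each line: 14 lines in all.
doubled=$(sed 'p' "$tmp/lines" | sed -n '$=')
# With -n only the explicit p prints, so the result matches cat.
same=$(sed -n 'p' "$tmp/lines")
# An address restricts p to lines 4 through 6.
middle=$(sed -n '4,6p' "$tmp/lines")

printf '%s\n' "$doubled" "$middle"
```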
More sed jargon
- = prints (just) the line number
- a appends text to the output after the current line; use
it as a\ followed by what you want to append
- b with no label branches to the end of the script
(i.e., the remaining commands are skipped for the current line)
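To see =, a and b working together, here is a minimal sketch. The script demo.sed, the input words and the appended text are all hypothetical, chosen only to show the order in which output appears:

```shell
tmp=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$tmp/words"

# Hypothetical script file, one command per line.
cat > "$tmp/demo.sed" <<'EOF'
/beta/{
=
a\
(found beta)
b
}
s/$/ (no match)/
EOF

out=$(sed -f "$tmp/demo.sed" "$tmp/words")
printf '%s\n' "$out"
```

For the matching line, = prints the line number first, the line itself is printed at the end of the cycle, and the appended text follows it. The b skips the final s command, so only the other two lines receive the (no match) suffix.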
Exercises
Write sed commands/scripts to do the following:
Put the editing commands above in a file rank.f and invoke it
as: sed -n -f rank.f faculty.details. Note that the
b commands are important: the last command has no address and
therefore applies to every line, so without them
everybody would also be listed as an assistant professor. Also note that
you can append multiple lines; each must end with a \ except
the last (observe the \ after the a command). The
braces { and } must be where they are
(i.e., the { must end the
first line and the } must be on a line by itself).
print the lines in the format
<name>:<office>:<course>
(i.e., strip the headers Name:, Office:, and Course:):
sed 's/Name: \(.*\) Office: \(.*\) Course: \(.*\)/\1:\2:\3/' faculty.details
print the lines in the format <course>:<office>:<name>
sed 's/Name: \(.*\) Office: \(.*\) Course: \(.*\)/\3:\2:\1/' faculty.details
break down every entry onto three lines (\n in the replacement text is a GNU sed extension):
sed 's/Name: \(.*\) Office: \(.*\) Course: \(.*\)/\1\n\2\n\3/' faculty.details
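A sketch of the three reformatting exercises on a single record. The three-line split is written here with a backslash followed by a real newline in the replacement, an assumption made for portability across sed implementations; the variable names are hypothetical:

```shell
line='Name: Dale Courte Office: 147 Anderson Hall Course: CPS 499/599'

# <name>:<office>:<course>
forward=$(printf '%s\n' "$line" |
  sed 's/Name: \(.*\) Office: \(.*\) Course: \(.*\)/\1:\2:\3/')

# <course>:<office>:<name>
reverse=$(printf '%s\n' "$line" |
  sed 's/Name: \(.*\) Office: \(.*\) Course: \(.*\)/\3:\2:\1/')

# One field per line: backslash-newline in the replacement inserts a newline.
split=$(printf '%s\n' "$line" |
  sed 's/Name: \(.*\) Office: \(.*\) Course: \(.*\)/\1\
\2\
\3/')

printf '%s\n' "$forward" "$reverse" "$split"
```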
A tale of two buffers
Normally, sed reads one line at a time into its
main buffer (sometimes called the pattern buffer). There is another
buffer (called the hold buffer) available for use.
Some commands to work with this buffer include:
- h copies the contents of the main buffer into the hold
buffer, thus overwriting whatever it was that was already in the
hold buffer
- g copies the contents of the hold buffer into the main buffer,
overwriting it
- H does the same as h, except it appends the contents of
the main buffer after the last line in the hold buffer
- G does the same as g, except it appends the contents of the hold buffer after the last line in the main buffer
- x exchanges contents of the two buffers; what was
in hold buffer is now in the pattern buffer, and vice versa
- N reads in an additional line and appends it to the
contents of the pattern buffer; in between the original line
and the newly added line, N will insert a newline (\n)
character; useful for reading in multiple
lines at a time (see flip example below)
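Two well-known one-liners give a feel for the hold buffer commands listed above. This is only a sketch on a hypothetical three-line input, not part of the exercises that follow:

```shell
# Hypothetical three-line input.
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/abc"

# Reverse the file: on every line but the first, append the hold buffer (G),
# then save the growing result (h); print it once the last line is reached.
reversed=$(sed -n '1!G;h;$p' "$tmp/abc")

# Join all lines with commas: collect lines in the hold buffer (h, then H),
# swap it into the pattern buffer at the end (x), and replace the newlines.
joined=$(sed -n '1h;1!H;${x;s/\n/,/g;p;}' "$tmp/abc")

printf '%s\n' "$reversed" "$joined"
```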
More exercises
Write sed commands/scripts to do the following (put each solution in a separate file,
and invoke it using sed -n -f <file>):
Notice that you have to do the transformations in this order,
else everybody gets assigned to Miriam Hall! This example shows that
sed reads in one line at a time, applies all the commands
sequentially, then picks the next line, and so on. This is
in contrast to reading all lines at once, applying the first
command, then reading all again, applying the second command,
and so on.
Pretty print the file so that each line is preceded by one line describing
what it is about (e.g., "The next line is about: Barbara Smith" before
the first line).
{
h # hold buffer now contains a copy of the current line
s/Name: \(.*\) Office: .* Course: .*/The next line is about: \1/
G # appends hold buffer to pattern buffer
p
}
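The script above can be tried on one record as follows. The file name pretty.sed is hypothetical, and the comments are left out of the script file so that it does not depend on how a given sed treats trailing comments:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Name: Barbara Smith Office: 150 Anderson Hall Course: CPS 350' \
  > "$tmp/faculty.details"

# Hypothetical script file holding the four commands from the notes.
cat > "$tmp/pretty.sed" <<'EOF'
h
s/Name: \(.*\) Office: .* Course: .*/The next line is about: \1/
G
p
EOF

out=$(sed -n -f "$tmp/pretty.sed" "$tmp/faculty.details")
printf '%s\n' "$out"
```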
Completely capitalize the names of faculty.
{
h # save the current line in hold buffer
s/Name: \(.*\) Office: .* Course: .*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G # current buffer contains a capital name, newline, old line
s/\(.*\)\nName: \(.*\) Office: \(.*\)/Name: \1 Office: \3/
p
}
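A sketch of running the capitalization script on one record (caps.sed is a hypothetical file name). The comments mark the state of the pattern buffer after each step:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Name: James P. Buckley Office: 139 Anderson Hall Course: ASI 150' \
  > "$tmp/faculty.details"

cat > "$tmp/caps.sed" <<'EOF'
# save the original line
h
# pattern buffer: just the name
s/Name: \(.*\) Office: .* Course: .*/\1/
# pattern buffer: the name in capitals
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
# pattern buffer: capital name, newline, original line
G
# splice the capital name into the original line
s/\(.*\)\nName: \(.*\) Office: \(.*\)/Name: \1 Office: \3/
p
EOF

out=$(sed -n -f "$tmp/caps.sed" "$tmp/faculty.details")
printf '%s\n' "$out"
```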
Flip alternate lines
$p
{
N # read the next line, we now have two lines
s/\(.*\)\n\(.*\)/\2\n\1/ # flip the two lines
p # print it
}
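The flip script behaves differently on inputs with an odd and an even number of lines, which the following sketch exercises. It is a variant of the script above: the replacement uses a backslash followed by a real newline instead of \n, an assumption made for portability, and flip.sed is a hypothetical file name:

```shell
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/odd"
printf 'a\nb\nc\nd\n' > "$tmp/even"

cat > "$tmp/flip.sed" <<'EOF'
$p
N
s/\(.*\)\n\(.*\)/\2\
\1/
p
EOF

# Odd count: the unpaired last line is printed by $p, then N finds no more input.
odd=$(sed -n -f "$tmp/flip.sed" "$tmp/odd")
# Even count: every line is consumed as part of a pair, so $p never fires.
even=$(sed -n -f "$tmp/flip.sed" "$tmp/even")

printf '%s\n' "$odd" '---' "$even"
```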
Delete all the blank lines.
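/^$/ {
d # delete the blank line and start the next cycle
b # not reached: d has already ended the cycle
}
p # print every non-blank line (needed because of -n)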
Replace multiple blank lines wherever they occur with just one blank line.
/^$/ {
N
/^\n$/D
}
p
Notice that this uses a new command, namely D. D
is just like d in that it deletes from the pattern (main)
buffer. However, while d deletes the entire buffer,
D deletes only up to and including the first embedded newline and then
restarts the cycle on what remains, without reading a new line of input.
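A sketch of the blank-line squeezing script on a hypothetical input with three consecutive blank lines (squeeze.sed is an assumed file name):

```shell
tmp=$(mktemp -d)
printf 'a\n\n\n\nb\n' > "$tmp/gaps"    # a, three blank lines, b

cat > "$tmp/squeeze.sed" <<'EOF'
/^$/{
N
/^\n$/D
}
p
EOF

out=$(sed -n -f "$tmp/squeeze.sed" "$tmp/gaps")
printf '%s\n' "$out"
```

Each time two blank lines sit in the pattern buffer, D drops the first and the cycle restarts with the second, so the blank line is only printed once a non-blank line has been joined to it.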
References
[UPE] B.W. Kernighan and R. Pike. The UNIX Programming Environment.
Prentice Hall, Upper Saddle River, NJ, Second edition, 1984.