CPS 444/544 Lecture notes: Filters
Coverage: [UPE] Chapter 4, §4.2 (pp. 106-108)
tr (transliterate)
- only reads from standard input
- syntax: tr <string1> <string2>
- converts characters in <string1> to those, respectively, in <string2>
- tr A-Z a-z < myfile
- options:
- tr -d (delete character(s) in <string1>)
- tr -c (act on complement of <string1>)
- tr -s (squeeze strings of repeated characters)
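A minimal sketch of the tr options above. The sample text and the variable names are made up for illustration; tr itself reads only standard input, so the text is piped in.

```shell
# Sample text is made up for illustration.
text='Hello,   World!  123'

# Map upper case to lower case.
lower=$(printf '%s\n' "$text" | tr 'A-Z' 'a-z')

# -d: delete every digit.
no_digits=$(printf '%s\n' "$text" | tr -d '0-9')

# -s: squeeze each run of spaces into a single space.
squeezed=$(printf '%s\n' "$text" | tr -s ' ')

# -cs: turn every run of non-letters into one newline (one word per line).
words=$(printf '%s\n' "$text" | tr -cs 'A-Za-z' '\n')

printf '%s\n' "$lower" "$no_digits" "$squeezed" "$words"
```

The -cs combination is a common way to split text into words before handing it to sort and uniq.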
sort
- can be fine-tuned to sort columns in a variety of ways
- sort -n (numeric-sort: compare according to string numerical value)
- sort -g (general-numeric-sort: compare according to general numerical value)
- sort -r (reverse sort: reverse the result of comparisons)
- sort -rn (reverse numeric-sort)
- sort -d (dictionary order: consider only blanks and alphanumeric characters)
- sort -b (ignore leading blanks)
- sort -f (ignore-case: fold lower case to upper case characters)
- sort -k2 (sort on column 2 through end of line; use -k2,2 for column 2 only)
- sort -t":" -k2 (sort on column 2 using colon-delimited columns)
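The difference between string and numeric comparison on a chosen column can be sketched as follows. The records and variable names are made up for illustration, and the locale is pinned to C as an assumption so that string comparisons are byte-wise and reproducible.

```shell
# Pin the locale so string comparisons are byte-wise (an assumption for reproducibility).
export LC_ALL=C

# Made-up colon-delimited records: name:count
data='pear:10
apple:9
fig:100'

# Default sort compares whole lines as strings.
by_line=$(printf '%s\n' "$data" | sort)

# Field 2 compared as strings: "10" < "100" < "9".
by_string=$(printf '%s\n' "$data" | sort -t: -k2,2)

# Field 2 compared numerically: 9 < 10 < 100.
by_number=$(printf '%s\n' "$data" | sort -t: -k2,2n)

# Field 2 compared numerically, in reverse: 100 > 10 > 9.
by_number_desc=$(printf '%s\n' "$data" | sort -t: -k2,2nr)

printf '%s\n' "$by_line" "$by_string" "$by_number" "$by_number_desc"
```

Writing the key as -k2,2 restricts the comparison to column 2 alone; a bare -k2 would extend the key to the end of the line.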
uniq
- purges duplicate consecutive lines (must be adjacent)
- fast (linear time)
- options:
- uniq -d (only prints the lines which are repeated)
- uniq -u (only prints the lines which are not repeated)
- uniq -c (count: prefixes each line with its number of occurrences)
hello
hi
hi
hello
exercise: give the output of the following command lines on the above input stream:
uniq
uniq -u
uniq -d
uniq -c
to purge all duplicates, first sort and then apply uniq, e.g.,
sort names | uniq (equivalent to sort -u names)
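A small sketch of why the sort is needed. It deliberately uses a different, made-up input from the exercise, in which no two duplicates are adjacent; the variable names are illustrative only.

```shell
# Made-up input with duplicates that are not adjacent.
input='b
a
b
c
a'

# uniq alone only collapses adjacent duplicates, so nothing is removed here.
adjacent_only=$(printf '%s\n' "$input" | uniq)

# Sorting first makes the duplicates adjacent; both forms give the same result.
via_uniq=$(printf '%s\n' "$input" | sort | uniq)
via_sort_u=$(printf '%s\n' "$input" | sort -u)

# After sorting: lines that occur more than once, and lines that occur exactly once.
repeated=$(printf '%s\n' "$input" | sort | uniq -d)
singletons=$(printf '%s\n' "$input" | sort | uniq -u)

printf '%s\n' "$via_uniq" "$repeated" "$singletons"
```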
Spellers
- spell
- ispell (interactive spell)
- aspell
- add the following line to your .vimrc to invoke aspell on the current file in vim using <ctrl-t>:
map ^T <CR>:!aspell --dont-backup check %<CR>:e! %<CR>
Pipeline of filters
(recall the UNIX model of computation;
the communication mechanism is set up for free by the shell)
$ spell uist2003.tex | sort | uniq
$ spell uist2003.tex | sort | uniq | wc -l
$ spell uist2003.tex | sort -u
$ spell uist2003.tex | sort -u | wc -l
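The same pipeline idea can be tried without a speller installed. The sketch below is a word-frequency count built only from the filters covered so far; the sample text and variable names are made up, and head is used only to trim the output.

```shell
# Made-up text; any file could be substituted.
text='the cat and the dog and the bird'

# tr: one word per line; sort: group equal words;
# uniq -c: count each group; sort -rn: most frequent first.
top=$(printf '%s\n' "$text" | tr -cs 'A-Za-z' '\n' | sort | uniq -c | sort -rn | head -n 2)

# Number of distinct words, in the style of "sort -u | wc -l" above.
distinct=$(printf '%s\n' "$text" | tr -cs 'A-Za-z' '\n' | sort -u | wc -l)

printf '%s\n' "$top" "$distinct"
```

Each stage reads its standard input and writes its standard output; the shell connects them, so no stage needs to know about its neighbours.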
cut and paste
- extract or merge fields or columns from lines
- $ who | cut -d" " -f1 | paste - -
- join (relational database operator)
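A minimal sketch of cut, paste, and join on two small files. The file names, their contents, and the scratch directory are all made up for illustration; join assumes both files are already sorted on the join field.

```shell
# Work in a scratch directory; file names and contents are made up.
dir=$(mktemp -d)
printf '1:alice\n2:bob\n3:carol\n4:dave\n' > "$dir/names"
printf '1:math\n2:physics\n3:music\n4:art\n' > "$dir/majors"

# cut: keep only field 2 of each colon-delimited line.
names=$(cut -d: -f2 "$dir/names")

# paste - -: read standard input twice per output line,
# so consecutive lines are merged in pairs, separated by a tab.
pairs=$(cut -d: -f2 "$dir/names" | paste - -)

# join: match lines of the two files on their first field.
joined=$(join -t: "$dir/names" "$dir/majors")

printf '%s\n' "$names" "$pairs" "$joined"
rm -r "$dir"
```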
File comparison utilities
cmp
diff (find and output differences between two files or two directories)
$ diff file1 file2
$ diff dir1 dir2
sdiff
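A short sketch contrasting cmp and diff on two made-up files that differ in one line. The file names and variables are illustrative; the point is that cmp answers "same or different?" through its exit status, while diff also reports what changed.

```shell
# Scratch files with made-up contents that differ on line 2.
dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$dir/file1"
printf 'one\n2\nthree\n'   > "$dir/file2"

# cmp -s: silent; exit status 0 if identical, 1 if different.
if cmp -s "$dir/file1" "$dir/file2"; then same=yes; else same=no; fi

# diff prints the changes and exits with status 1 when the files differ.
diff_status=0
changes=$(diff "$dir/file1" "$dir/file2") || diff_status=$?

printf '%s\n' "$same" "$changes" "$diff_status"
rm -r "$dir"
```

In the diff output, 2c2 means line 2 of the first file was changed into line 2 of the second; lines starting with < come from the first file and lines starting with > from the second.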
Printing utilities
- script
- lpr
- lpd
- lpq
- a2ps (ascii to postscript)
- enscript
- nenscript
- ghostview
- gv
- ggv
- xpdf
- acroread
- ps2pdf
- pdf2ps
- latex
- dvips
- troff
- nroff
- expand (converts tabs to spaces)
- unexpand
- indent
- dos2unix
- unix2dos
- xfig
- ppds
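Of the utilities listed, expand and unexpand are simple filters and easy to try. The sketch below uses a made-up one-line input and a tab stop every 4 columns, chosen only to keep the output short.

```shell
# expand: each tab becomes spaces up to the next tab stop (every 4 columns here).
spaced=$(printf 'a\tb\n' | expand -t 4)

# unexpand -a: convert runs of spaces that end at a tab stop back into tabs.
tabbed=$(printf '%s\n' "$spaced" | unexpand -a -t 4)

printf '%s\n' "$spaced" "$tabbed"
```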
References
[UPE] B.W. Kernighan and R. Pike. The UNIX Programming Environment.
Prentice Hall, Upper Saddle River, NJ, Second edition, 1984.