Commands

colrm

[startcol [endcol]]

Removes the specified columns from each line of its input. This command reads from standard input and writes to standard output. Columns are numbered starting at 1. Startcol must be specified. If no endcol is specified, colrm removes all columns from startcol to the end of the line. Otherwise, colrm removes the columns from startcol through endcol, inclusive.

Example: To remove the first three characters from each line in the file testfile, use

cat testfile | colrm 1 3

column

[-tx] [-c columns] [-s sep] [input_file]

This program formats the input data into multiple columns. Typically, you will be formatting a file; however, if no file is specified, column will read from standard input.

Example: To create a table from a text file delimited by "|"s, use

column -t -s \| datafile.txt > table.txt

This command is useful for formatting data created by databases and spreadsheets.

`-c`	Format output for a display that is the specified number of columns wide.
`-s`	Specify character(s) used to delimit columns (for use with -t).
`-t`	Determine the number of columns the input contains and create a table. Columns are delimited with white space, by default, or with the characters supplied using the -s option. Useful for pretty-printing displays.
`-x`	Force column to fill columns before filling rows. (Default is to fill rows before columns.)

csplit

[OPTION] PATTERN

Splits the input set into output files according to the criteria set by PATTERN. The output filenames are constructed of a prefix ("xx" by default) and a suffix (00-99 by default).

Example: To split up a file into 100-line segments (the first segment holds lines 1 through 99, because the split happens just before line 100):

csplit -k testfile 100 '{*}'

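Example: To split up a file into segments delimited by the word "Chapter":

csplit -k testfile /Chapter/ '{*}'

Use the -k option with these commands; without it, csplit deletes the output files it has already written as soon as it hits an error (for example, a line number past the end of the input), and you usually don't get anything. Quoting {*} keeps the shell from interpreting it.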

PATTERN types

`line_num`	Copy to output all lines from input up to, but not including, line_num.
`{repeat_count}`	Used with the other options to specify the number of times to repeat the splitting. May be either an integer or, to repeat until the input is depleted, an asterisk.
`/reg_exp/[[+|-]offset]`	Copy to output all lines from input up to, but not including, the line containing a match for reg_exp. Optionally, move the split point the specified number of lines after (+offset) or before (-offset) the matching line.
`%reg_exp%[offset]`	Discard all input up to the line that matches reg_exp.

`-fPREFIX, --prefix=PREFIX`	Use the specified prefix when constructing output filenames.
`-bSUFFIX, --suffix-format=SUFFIX`	Use the specified suffix when constructing output filenames. SUFFIX is a printf-style format (e.g., %03d) that is applied to the file's sequence number.
`-nDIGITS, --digits=DIGITS`	Construct output filenames that are DIGITS digits long.
`-k, --keep-files`	In case of error, keep the output files. (Default is to remove them.)
`-z, --elide-empty-files`	Do not generate zero length output files.
`-s, -q, --silent, --quiet`	Suppress printing of output file sizes.
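
The following is a minimal sketch of the /Chapter/ split described above, assuming GNU csplit. The file book.txt, its contents, and the temporary directory are made up for illustration; -f is used only to send the output files to that directory.

```shell
# Work in a throwaway directory so the output files do not clutter anything.
workdir=$(mktemp -d)

# book.txt is a hypothetical sample containing three "Chapter" headings.
printf 'Chapter 1\nalpha\nChapter 2\nbeta\ngamma\nChapter 3\ndelta\n' > "$workdir/book.txt"

# Split just before every line that matches "Chapter".
#   -k  keep the pieces even if csplit reports an error
#   -s  do not print the size of each piece
#   -z  skip the empty piece that would precede the first heading
#   -f  write pieces as $workdir/xx00, $workdir/xx01, ...
csplit -k -s -z -f "$workdir/xx" "$workdir/book.txt" '/Chapter/' '{*}'

# Collect the piece names and the contents of the second piece.
pieces=$(cd "$workdir" && echo xx*)
second=$(cat "$workdir/xx01")

echo "$pieces"
echo "$second"

rm -rf "$workdir"
```

Each heading line starts a new piece, so the second piece begins with "Chapter 2".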

cut

[OPTION] [FILE]

Writes to standard output the specified parts of each line of the input set. A RANGE is either a single number or two numbers separated by a dash (e.g., 1-9); numbering starts at 1. More than one range may be specified. Separate multiple ranges with commas and no spaces (e.g., 1-9,12-43).

Example: To print to standard output the first three fields in a file delimited by the vertical bar symbol, use

cut -f 1,2,3 -d \| testfile

Example: To print to standard output the first five characters of each line in the file testfile, use

cut -c 1-5 testfile

`-bRANGE, --bytes=RANGE`	Output only the bytes specified in RANGE.
`-cRANGE, --characters=RANGE`	Output only the characters specified by RANGE.
`-fRANGE, --fields=RANGE`	Output only the fields specified by RANGE. By default, fields are delimited by <TAB>s.
`-dDELIM, --delimiter=DELIM`	Optionally, specify a field delimiter other than <TAB>.
`-s, --only-delimited`	When used with -f, ignore lines that do not contain the field delimiter character.
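
As a small sketch of the options above, the following runs cut on made-up records supplied through printf rather than a file. The record contents and variable names are assumptions for illustration only.

```shell
# Two hypothetical records whose fields are separated by a vertical bar.
records=$(printf 'id1|Ada|London\nid2|Linus|Helsinki')

# Keep fields 2 through 3 of each line; the delimiter is quoted for the shell.
names_cities=$(printf '%s\n' "$records" | cut -d '|' -f 2-3)

# Keep the first three characters of each line.
prefixes=$(printf '%s\n' "$records" | cut -c 1-3)

# With -s, a line that contains no delimiter is dropped instead of passed through.
delimited_only=$(printf 'no delimiter here\na|b\n' | cut -d '|' -f 1 -s)

echo "$names_cities"
echo "$prefixes"
echo "$delimited_only"
```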

expand

[OPTION] [FILE]

Used to turn <TAB>s into spaces. Reads from standard input (default) or a file, and writes to standard output.

`-TAB1[,TAB2]..., -tTAB1[,TAB2]..., --tabs=TAB1[,TAB2]...`	Set the tab stops. If only one TAB is specified, tab stops are set that many columns apart (the default is 8). Otherwise, the tab stops are set at the specified columns.
`-i, --initial`	Convert only the initial <TAB>s on each line, that is, those that come before the first character that is neither a space nor a <TAB>.
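
A short sketch of how the tab-stop setting changes the result, using input made up for illustration:

```shell
# With tab stops every 4 columns, the tab after "a" is padded out to column 4.
expanded=$(printf 'a\tb\n' | expand -t 4)

# With -i, only the leading tab is converted; the tab between x and y is kept.
initial_only=$(printf '\tx\ty\n' | expand -i -t 4)

echo "$expanded"
echo "$initial_only"
```

In the first case the output is "a", three spaces, then "b".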

fmt

[OPTION] [FILE]

Reformats input set to produce lines of (at most) a certain width. (Default = 75.) This command preserves blank lines, indentation, and spacing between words by default.

Example: To format textfile into lines at most 65 characters wide and send the results to standard output, use

fmt -w 65 textfile

`-c, --crown-margin`	This option tells fmt to preserve the indentation of the first two lines within a paragraph and align the left margin of any subsequent lines with the second line.
`-t, --tagged-paragraph`	This option is similar to -c, except that if indentation of the first and second lines of the paragraph match, the first line is treated as a one-line paragraph.
`-s, --split-only`	This option splits lines only, never joins.
`-u, --uniform-spacing`	This option reduces spacing between words to one space and between sentences to two spaces.
`-WIDTH, -wWIDTH, --width=WIDTH`	This option tells fmt to fill lines up to the specified width. (Default is 75.) fmt normally aims for lines slightly shorter than WIDTH, leaving a little room at the end of the line.
`-pPREFIX, --prefix=PREFIX`	This option formats only lines beginning with PREFIX.
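
The following sketch refills a single long line to a narrower width. The sample sentence is made up, and awk is used only to measure the longest resulting line.

```shell
# A hypothetical one-line paragraph that is far wider than 30 columns.
text='The quick brown fox jumps over the lazy dog and then wanders off to find something else to do.'

# Refill the paragraph so that no line is wider than 30 characters.
wrapped=$(printf '%s\n' "$text" | fmt -w 30)
printf '%s\n' "$wrapped"

# Measure the longest line fmt produced.
longest=$(printf '%s\n' "$wrapped" | awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }')
echo "longest line: $longest"
```

Exactly where fmt breaks each line depends on how it balances line lengths, so the sketch checks only the width limit.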

fold

[OPTION] [FILE]

Writes each input file (or standard input) to standard output, breaking any lines longer than 80 characters.

Example: To write textfile to standard output in lines at most 40 bytes long, use

fold -b -w 40 textfile

Use the -b option when you're concerned with size, rather than appearance. A <tab> takes up one byte, but looks like several spaces.

`-b, --bytes`	Count bytes rather than columns.
`-s, --spaces`	Try to break only at word boundaries.
`-wWIDTH, --width=WIDTH`	Use lines of width WIDTH rather than the default of 80.
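
A brief sketch of the difference between a hard break and a break at a space, using made-up input:

```shell
# Break a 10-character line into pieces of at most 4 columns.
hard=$(printf 'abcdefghij\n' | fold -w 4)

# With -s, the break is made after the last space that fits within the width,
# so the first output line keeps its trailing space.
soft=$(printf 'one two three\n' | fold -s -w 10)

echo "$hard"
echo "$soft"
```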

merge

[ options ] mod1_file original_file mod2_file

If mod1_file and mod2_file are both modified versions of original_file, merge incorporates into mod1_file all of the changes that turned original_file into mod2_file. A conflict occurs when mod1_file and mod2_file have both changed the same chunk of lines. In that case, merge prints a warning and includes both sets of modifications in the result, bracketed by lines of <<<<<<< and >>>>>>> that show which file each came from.

Example: To merge the changes you made to your homework last night with the changes you made last Sunday, use

merge homework.yestrdy homework.c homework.sun

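`-A`	Report conflicts in the most verbose style (the -A style of diff3), merging all changes leading from original_file to mod2_file into mod1_file.
`-E, -e`	Report conflicts with less information than -A. (-E is the default; with -e, merge does not warn about conflicts.)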
`-L label`	Specify a label to be used, instead of the corresponding filename, in case of conflict.
`-p`	Rather than overwrite mod1_file, send results to standard output.
`-q`	Suppress warning about conflicts.

paste

[OPTION] [FILE]

Merges the specified files horizontally.

Example: Say you have three files: f1, f2, and f3. f1 contains 1 line: "abc". f2 contains 1 line: "def". f3 contains 1 line: "ghi". In that case, the command

paste f1 f2 f3

would yield the three strings on one line, separated by <TAB>s:

abc def ghi

`-s, --serial`	Paste one file at a time: join all the lines of each file into a single output line, rather than merging corresponding lines from every file.
`-dDELIM-LIST, --delimiters=DELIM-LIST`	Use the listed delimiter(s), instead of <TAB>, to separate the merged lines.
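
The example above, along with the two options, can be sketched as follows. The files are created in a temporary directory, and the nums file is a made-up addition for showing -s.

```shell
workdir=$(mktemp -d)

# The three one-line files from the example, plus a hypothetical three-line file.
printf 'abc\n' > "$workdir/f1"
printf 'def\n' > "$workdir/f2"
printf 'ghi\n' > "$workdir/f3"
printf '1\n2\n3\n' > "$workdir/nums"

# Default: corresponding lines are joined with a tab.
tabbed=$(paste "$workdir/f1" "$workdir/f2" "$workdir/f3")

# -d replaces the tab with another delimiter.
comma=$(paste -d , "$workdir/f1" "$workdir/f2" "$workdir/f3")

# -s joins all the lines of one file into a single line.
serial=$(paste -s -d + "$workdir/nums")

echo "$tabbed"
echo "$comma"
echo "$serial"

rm -rf "$workdir"
```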

sort

[OPTION] [FILE]

Sorts the lines in the input set. Writes the results to standard output by default. May also be used to merge files that are already sorted or to check whether a file is already sorted.

Example: To sort the combined contents of file1 and file2 numerically, in descending order, and store the result in file3, use

sort -nr file1 file2 > file3

`-c`	Check to see whether the files are already sorted.
`-m`	Merge files that are already sorted, without sorting them again.
`-b`	Ignore leading blanks.
`-d`	When sorting, ignore all characters, except letters, digits, and blanks.
`-f`	Treat lowercase and uppercase characters as if they were the same.
`-g`	Sort numerically, but also allow floating point numbers.
`-i`	Ignore nonprintable characters.
`-M`	Order any three-letter month abbreviations (Jan, Feb, ...) by month, rather than alphabetically.
`-n`	Sort numerically.
`-r`	Reverse the results of the sort.
`-oOUTPUT-FILE`	Send results to OUTPUT-FILE, rather than standard output.
`-tSEPARATOR`	Use character SEPARATOR as the field separator when finding the sort keys in each line.
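
A small sketch contrasting the default ordering with -n, -r, -m, and -c. The numbers and the two already-sorted files (odd and even) are made up for illustration.

```shell
workdir=$(mktemp -d)

# Two hypothetical files, each already in numeric order.
printf '1\n5\n9\n' > "$workdir/odd"
printf '2\n4\n10\n' > "$workdir/even"

# Default comparison is character by character, so 100 sorts before 9.
lexical=$(printf '10\n9\n100\n' | sort)

# -n compares by numeric value; adding -r reverses the result.
numeric=$(printf '10\n9\n100\n' | sort -n)
descending=$(printf '10\n9\n100\n' | sort -nr)

# -m interleaves inputs that are already sorted.
merged=$(sort -m -n "$workdir/odd" "$workdir/even")

# -c only checks the order; the exit status reports the answer.
if sort -c -n "$workdir/odd"; then sorted_check=yes; else sorted_check=no; fi

echo "$lexical"; echo "$numeric"; echo "$descending"; echo "$merged"; echo "$sorted_check"

rm -rf "$workdir"
```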

tr

[OPTION] SET1 [SET2]

tr (translate) copies from standard input to standard output, while making one of the following modifications to the data stream:

1. Translate (and optionally squeeze) specified characters.

2. Squeeze repeated characters.

3. Delete characters.

4. Delete characters, then squeeze repeated characters from the result.

The idea is that a character in the ordered set SET1 is translated into the corresponding character in SET2.

Example: To convert all uppercase letters in a file into lowercase letters and store in the file uppercase.txt, use

cat sourcefile | tr '[:upper:]' '[:lower:]' > uppercase.txt

Backslash escapes
A backslash followed by a character that is not among the following ones causes an error message:
`\a`	Control-G
`\b`	Control-H
`\f`	Control-L
`\n`	Control-J
`\r`	Control-M
`\t`	Control-I
`\v`	Control-K
`\OOO`	The character with the value given by OOO, which is 1 to 3 octal digits

`Ranges`	Specify a range by giving the first and last characters separated by a dash (e.g., A-Z, 0-9).
`Repeated characters`	The notation "[C*N]" in SET2 expands to N copies of character C (e.g., [a*6] = aaaaaa).

Character classes
The following are some of the predefined character classes that are available. In a set, write a class as [:name:] (e.g., [:upper:]):
alnum	Letters and digits.
alpha	Letters.
blank	Horizontal white space.
cntrl	Control characters.
digit	Digits.
graph	Printable characters, not including space.
lower	Lowercase letters.
print	Printable characters, including space.
punct	Punctuation characters.
space	Horizontal or vertical white space.
upper	Uppercase letters.
xdigit	Hexadecimal digits.

`-d, --delete`	Delete any input characters found in SET1. No translation is done.
`-s, --squeeze-repeats`	Replace each run of a repeated character that is listed in SET1 with a single instance of that character.
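
The translate, delete, and squeeze modes can be sketched as follows, using short made-up strings as input:

```shell
# Translate: each uppercase letter becomes the corresponding lowercase letter.
lowered=$(printf 'Hello World\n' | tr '[:upper:]' '[:lower:]')

# Translate with ranges: a, b, c map to A, B, C.
ranged=$(printf 'abcdef\n' | tr 'a-c' 'A-C')

# Delete: remove every digit.
no_digits=$(printf 'a1b2c3\n' | tr -d '[:digit:]')

# Squeeze: collapse each run of spaces into one space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

echo "$lowered"; echo "$ranged"; echo "$no_digits"; echo "$squeezed"
```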

unexpand

[OPTION] [FILE]

Copies standard input to standard output, while replacing any sequences of spaces that are equal to a <TAB> with (you guessed it) a <TAB>.

Example: To set tab stops every five columns and compress any sequence of spaces in textfile that reaches a tab stop into a <TAB>, use

unexpand -a -t 5 textfile

`-TAB1[,TAB2]..., -tTAB1[,TAB2]..., --tabs=TAB1[,TAB2]...`	Set the tab stops. (If only one TAB is specified, it will space any succeeding tab stops at equivalent intervals. Otherwise, multiple <TAB> stops may be set at the specified columns.)
`-i`	Specify the width of a single tab. (Default is 8.)
`-a, --all`	Convert all applicable strings, not just the initial ones.
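
A minimal sketch of the default behavior versus -a, assuming the default tab stops of 8 columns and made-up input:

```shell
# Eight leading spaces reach the first tab stop, so they become one tab.
leading=$(printf '        x\n' | unexpand)

# By default, spaces that follow other text are left alone.
interior_default=$(printf '12345678        x\n' | unexpand)

# With -a, the eight interior spaces (columns 8 through 15) also become a tab.
interior_all=$(printf '12345678        x\n' | unexpand -a)

printf '%s\n' "$leading" "$interior_default" "$interior_all"
```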

uniq

[OPTION] [INPUT [OUTPUT]]

uniq goes through a sorted file and discards any repeats it finds. Only adjacent lines are compared, which is why the input should be sorted first.

Example: Let's say that your company has engaged in a partnership with people who are supposed to ship you a data file each month. You're supposed to load it into your database and process it. The guys who shipped you the file say that the field, which occurs from characters 5 to 10, is the primary key, but their database is terrible and Informix won't let you load their file because what they're trying to call a primary key isn't unique. The following will make your problem go away:

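uniq -s 4 -w 6 -u datafile > datafile.good

The -s 4 option (written +4 in older versions) tells uniq to ignore the first four characters. The -w 6 specifies that only the next six characters are compared. The -u option tells the program to print only the lines whose key is not repeated, so every copy of a duplicated key is discarded, and the > character redirects the results from standard output to the file datafile.good. Lines with the same key must be adjacent for this to work.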

`-N, -fN, --skip-fields=N`	Skip N fields on each line before checking for uniqueness. Fields are separated from each other by spaces or <TAB>s.
`+N, -sN, --skip-chars=N`	Skip N characters before checking for uniqueness.
`-c, --count`	Add a count of the number of times each line occurred to the output.
`-i, --ignore-case`	When comparing lines, treat uppercase and lowercase as equivalent.
`-d, --repeated`	Print only the duplicated lines.
`-u, --unique`	Print only the unique lines.
`-wN, --check-chars=N`	Compare only N characters on each line (instead of the whole line).
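
The following sketch shows -d, -u, and the key comparison from the example above. The input lines are made up; in the keyed records, characters 5 to 10 play the role of the primary key.

```shell
# Hypothetical sorted input with some repeated lines.
lines=$(printf 'a\na\nb\nc\nc\nc')

collapsed=$(printf '%s\n' "$lines" | uniq)       # one copy of each line
repeated=$(printf '%s\n' "$lines" | uniq -d)     # only lines that were repeated
singles=$(printf '%s\n' "$lines" | uniq -u)      # only lines that were not repeated

# Hypothetical records: a 4-character serial number, a 6-character key, then data.
keyed=$(printf '0001AAAAAArest1\n0002AAAAAArest2\n0003BBBBBBrest3')

# Skip the serial number, compare only the key, and keep unrepeated keys.
good=$(printf '%s\n' "$keyed" | uniq -s 4 -w 6 -u)

echo "$collapsed"; echo "$repeated"; echo "$singles"; echo "$good"
```

Both records with the key AAAAAA are dropped, which is what makes the remaining file safe to load under a unique-key constraint.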