Module 28 csplit (SV)

DESCRIPTION

The external csplit command divides an input into smaller files based on context or line numbers. Context splitting is driven by the patterns you specify on the command line; line numbers are given as command-line arguments in the same way. The input file remains unaltered.

COMMAND FORMAT

Following is the general format of the csplit command.

 csplit [ -ks ] [ -f prefix ] - arg1 [ arg2 ... ]
 csplit [ -ks ] [ -f prefix ] file arg1 [ arg2 ... ]

Options

The following list describes the options and their arguments that may be used to control how csplit functions.

-k Keeps all output files created so far if an error occurs. By default, csplit removes every file it has created when an error occurs.
-s Suppresses the display of character counts. Normally csplit displays the character count for each file it creates.
-f prefix Replaces the default xx prefix used for output filenames. The prefix must be 12 characters or fewer, because csplit appends a two-digit suffix and traditional System V filenames are limited to 14 characters. Some System V systems implement BSD file systems, which allow longer filenames.
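The effect of the -s and -f options is easiest to see on a tiny input. The following is a minimal sketch, not taken from the original text: the file name input.txt, its contents, and the prefix part are all invented for illustration, and the commands assume a csplit that follows the option descriptions above.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
dir=$(mktemp -d)
cd "$dir" || exit 1

# input.txt is a hypothetical six-line file created for this illustration.
printf '%s\n' one two three four five six > input.txt

# Default behavior: csplit reports the size of each file it writes (xx00, xx01).
report=$(csplit input.txt 4)
echo "$report"

# -s silences that report; -f part names the pieces part00 and part01.
quiet=$(csplit -s -f part input.txt 4)

cat part00    # the lines before line 4: one, two, three
cat part01    # line 4 through the end: four, five, six
```

The first run leaves xx00 and xx01 behind as well, which shows that the prefix only changes the names of the output files, not how the input is divided.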

Arguments

The following list describes the arguments that may be passed to the csplit command.

- Forces csplit to read from the standard input. Useful if you need csplit to read the output of a pipe.
file The input file that is read by csplit and divided into smaller files.
arg1 The arguments (patterns or line numbers) that csplit uses to divide the input into smaller, more manageable output files. See the discussion in the following Context section.
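Reading from a pipe with the - argument might look like the sketch below. The printf command merely stands in for any program whose output you want to split; the ten numbered lines are an assumption made for this example.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

# printf stands in for any command that writes lines to its standard output.
# The - argument tells csplit to read that output instead of a named file.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | csplit -s - 6

cat xx00    # lines 1 through 5
cat xx01    # lines 6 through 10
```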

Context

The csplit command offers several ways to divide input into smaller files. Each argument on the command line may be any of the following:


NOTE:    
Enclose all regular expressions in quotes to prevent interpretation by the shell.

/expr/ Creates a file containing the lines from the current input line up to but not including the line that matches the regular expression expr. You may append +n or -n to this argument to place the split n lines after or before the matching line. No spaces are allowed between the expression and the offset. The current line then becomes the line at which the split occurred. csplit recognizes the same regular expressions as ed.
%expr% Same as /expr/, except that no output file is created for the lines that are skipped.
lineno Creates a file containing the lines from the current line up to but not including line lineno. lineno must be an integer.
{num} Repeats the preceding argument. You may use it after any of the above arguments. If {num} follows an expr argument, csplit applies that argument num more times. If {num} follows a lineno, the input is split every lineno lines, num more times. For example, csplit file 100 {10} splits the file at line 100 and then every 100 lines, 10 more times.
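The pattern arguments can be combined on one command line. The sketch below is an illustration only: data.txt and its BEGIN and END marker lines are invented. It skips everything before the BEGIN line with a % pattern, then uses a / pattern with a +1 offset so that the END line stays in the same file as the block it closes.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

# data.txt is a hypothetical file: two unwanted lines, a marked block, a tail.
printf '%s\n' header junk BEGIN a b END c d > data.txt

# %^BEGIN%   skip the lines before BEGIN; no file is written for them
# /^END/+1   write the lines up to one line past END, so END is included
# Both patterns are quoted to keep the shell from interpreting them.
csplit -s data.txt '%^BEGIN%' '/^END/+1'

cat xx00    # BEGIN, a, b, END
cat xx01    # c, d
```

Because the skipped lines produce no file, numbering starts with the first file that is actually written.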

FURTHER DISCUSSION

The csplit command is an enhanced split command. It lets you divide a file based on certain criteria. You are not restricted to an equal number of lines per output file as with the split command. By specifying regular expression strings as arguments you can divide the input into files based on meaningful sections. The splitting of the input can also be done based on line numbers. Thus csplit can perform equal file splitting and context file splitting.
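Splitting by line number with a repeat count might look like the following sketch. The ten-line file nums.txt and the prefix chunk are assumptions made for the example. Note that a line-number argument names the first line of the next file, so the first piece is one line shorter than the ones that follow.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

# nums.txt is a hypothetical ten-line file.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > nums.txt

# Split before line 4, then repeat once more four lines later (before line 8).
csplit -s -f chunk nums.txt 4 '{1}'

wc -l chunk00 chunk01 chunk02
# chunk00 holds lines 1-3, chunk01 lines 4-7, chunk02 lines 8-10
```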

The output is written to files whose names consist of a prefix and a two-digit suffix. You can specify the prefix using the -f prefix option. If you do not specify a prefix, csplit uses xx as the default. csplit appends the suffix itself; suffixes begin at 00 and continue to 99. The input is divided into these files as follows:

xx00 Contains all lines from the beginning of the file up to but not including the line referenced by the first argument.
xx01 Contains all lines from arg1 (argument one) up to but not including arg2.
xxNN Contains all lines from argNN to the end of the file, where NN is the number of arguments you specified on the command line.
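The naming scheme above can be checked with two arguments, which should yield three files. This is a sketch under stated assumptions: letters.txt and its contents are invented, and one line-number argument is mixed with one pattern argument simply to show that both kinds count the same way.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

# letters.txt is a hypothetical six-line file.
printf '%s\n' a b c d e f > letters.txt

# arg1 is line number 3; arg2 is the first later line consisting of just "e".
csplit -s letters.txt 3 '/^e$/'

cat xx00    # start of file up to arg1: a, b
cat xx01    # from arg1 up to arg2:     c, d
cat xx02    # from arg2 to end of file: e, f
```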

One useful application of csplit is dividing a large troff or nroff document into smaller sections. One way to do this is to place each level 1 heading, together with the text that follows it, in a file of its own. For instance, you might try the following command:

 cj> csplit -k document '/^\.H 1/' {99} 

This splits a document that uses troff (nroff) level 1 headings into files that each begin at a level 1 heading and contain that entire section. Any text before the first heading goes into the first file. The -k option keeps csplit from removing the files it has created when the input contains fewer level 1 headings than the repeat count requests.
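To watch the heading split at work, you can try it on a small stand-in document. The sketch below makes several assumptions: report.mm and its contents are invented, the dot in the pattern is escaped so that it matches only a literal period, and the error message csplit prints when it runs out of headings is discarded because its wording varies between implementations.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

# report.mm is a hypothetical document with a title and three level 1 headings.
cat > report.mm <<'EOF'
.TL
Sample Report
.H 1 "Introduction"
Intro text.
.H 1 "Method"
Method text.
.H 2 "Details"
Detail text.
.H 1 "Results"
Result text.
EOF

# The document has far fewer than 100 headings, so csplit eventually fails to
# find another match and exits with an error. -k keeps the files written so far.
csplit -s -k report.mm '/^\.H 1/' '{99}' 2>/dev/null ||
    echo "csplit ran out of headings (expected here)"

ls xx*
cat xx02    # the Method section, including its level 2 subsection
```

Without -k the same command would leave no output files at all, since the missing match counts as an error.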

DIAGNOSTICS AND BUGS

One of the messages returned by csplit is not very clear:

arg - out of range This message tells you that one of the arguments you supplied did not locate a position within the allowed range. The allowed range is from the current line to the end of the file.
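One way to provoke this kind of error is to ask for a line number past the end of the input. The sketch below assumes a three-line file named short.txt, invented for the example. The exact wording of the message differs between csplit implementations, so the script saves whatever is printed rather than expecting particular text.

```shell
# Work in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

# short.txt is a hypothetical three-line file.
printf '%s\n' a b c > short.txt

# Line 50 lies beyond the end of the file, so csplit reports an error.
if csplit -s short.txt 50 2>err.txt; then status=0; else status=$?; fi
echo "exit status: $status"
cat err.txt

# Without -k, csplit removes the files it created before the error.
if [ -e xx00 ]; then removed=no; else removed=yes; fi
echo "output removed: $removed"

# With -k, the partial output is kept.
csplit -s -k short.txt 50 2>/dev/null || true
cat xx00
```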


Copyright Wordware Publishing, Inc.


Illustrated UNIX System V
Illustrated Unix System V/Bsd
ISBN: 1556221878
Pages: 144
Authors: Robert Felps
