Processing Files on TAPE in UNIX Environments | SAS 9.1 Companion for UNIX Environments

Introduction to Processing Tape Files

Tape devices are inherently slow and should be used on a regular basis only for archiving files or for transferring data from one system to another.

There are five UNIX commands that are frequently used to process tape files:

mt	positions the tape (winds forward and rewinds). On AIX, this command is tctl.
dd	converts, reblocks, translates, and copies files.
cat	concatenates, copies, and prints files.
tar	saves and restores archive files.
remsh	connects to the specified host and executes the specified command.

For a complete description of these commands, refer to the man pages.

In addition, you will almost always need to use a no-rewind device and the SAS system option TAPECLOSE=LEAVE to get the results you want.

You can use either the TAPE device type or the PIPE device type to process tape files.

Using the TAPE Device Type

To use the TAPE device type, enter the FILENAME statement as follows:

FILENAME fileref TAPE 'tape-device-pathname' <options>;

The tape-device-pathname is the pathname of the special file associated with the tape device. Check with your system administrator for details. Enclose the name in quotation marks.

For example, this FILENAME statement associates YR1999 with a file stored on a tape that is mounted on device /dev/tp0:

 filename yr1999 tape '/dev/tp0';

Using the PIPE Device Type

You can also use the PIPE device type together with the UNIX dd command to process the tape:

FILENAME fileref PIPE 'UNIX-commands';

UNIX-commands are the commands needed to process the tape.

Using the PIPE device type with the dd command can process the tape more efficiently than the TAPE device type, and it allows you to use remote tape drives. However, using UNIX commands in your application means that the application will have to be modified if it is ported to a non-UNIX environment.

For example, the following DATA step writes an external file to tape:

    options tapeclose=leave;
    x 'mt -t /dev/rmt/0mn rewind';
    filename outtape pipe 'dd of=/dev/rmt/0mn 2> /dev/null';
    data _null_;
       file outtape;
       put '1 one';
       put '2 two';
       put '3 three';
       put '4 four';
       put '5 five';
    run;

The following DATA step reads the file from tape:

    options tapeclose=leave;
    x 'mt -t /dev/rmt/0mn rewind';
    filename intape pipe 'dd if=/dev/rmt/0mn 2> /dev/null';
    data numbers;
       infile intape pad;
       input digit word $8.;
    run;
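The dd commands in these FILENAME statements can be rehearsed from the shell before they are pointed at a real drive. The following sketch is an illustration only: it substitutes an ordinary temporary file (the hypothetical variable fake_tape) for /dev/rmt/0mn, so it shows what dd does with the piped records but not tape positioning or rewind behavior.

```shell
#!/bin/sh
# Assumption: an ordinary file stands in for the tape device.
fake_tape=$(mktemp)

# Equivalent of the writing step: records arrive on dd's standard input.
printf '%s\n' '1 one' '2 two' '3 three' '4 four' '5 five' |
    dd of="$fake_tape" 2> /dev/null

# Equivalent of the reading step: dd copies the file to standard output.
records=$(dd if="$fake_tape" 2> /dev/null)
echo "$records"

# Clean up the stand-in file.
rm -f "$fake_tape"
```

dd reports its record counts on standard error, which is why the examples redirect that stream to /dev/null.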

If the tape drive that you want to access is on a remote machine, add remsh machine-name to the commands in the X and FILENAME statements. For example, if the remote machine name is wizard, then you could read and write tape files on wizard by modifying the X and FILENAME statements as follows:

    x 'remsh wizard mt -t /dev/rmt/0mn rewind';
    filename intape pipe 'remsh wizard \
                      dd if=/dev/rmt/0mn 2> /dev/null';

Working with External Files Created on the Mainframe

There are three main points to remember when dealing with tapes on UNIX that were created on a mainframe:

UNIX does not support IBM standard label tapes. IBM standard label tapes contain user data files and labels, which themselves are files on the tape. To process the user data files on these tapes, use a no-rewind device (such as /dev/rmt/0mn) and the mt command with the fsf count subcommand to position the tape to the desired user data file. The formula for calculating count is
```
 count = (3 x user_data_file_number) - 2
```
UNIX does not support multivolume tapes. To process multivolume tapes on UNIX, the contents of each tape must be copied to disk using the dd command. After all of the tapes have been unloaded, you can use the cat command to concatenate all of the pieces in the correct order. You can then use SAS to process the concatenated file on disk.
You must know the DCB characteristics of the file. The records in files that are created on a mainframe are not delimited with end-of-line characters, so you must specify the original DCB parameters on the INFILE or FILENAME statement. In the INFILE statement, specify the record length, record format, and block size with the LRECL, RECFM, and BLKSIZE host options. In the FILENAME statement, if you use the PIPE device type and the dd command, you must also specify the block size with the ibs subcommand. For more information about host options on the INFILE statement, see "INFILE Statement" on page 299. For more information about the ibs subcommand, refer to the man page for the dd command.
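The count formula in the first point can be scripted. The following is a minimal sketch, not part of the SAS documentation: the function name fsf_count is hypothetical, and the mt commands are only printed rather than executed so that the snippet can be tried without a tape drive. The device name is the example no-rewind device used in this section and may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: compute the fsf count for the Nth user data file
# on an IBM standard label tape, using count = (3 x N) - 2.
fsf_count() {
    echo $(( 3 * $1 - 2 ))
}

# Print (do not run) the positioning commands for user data file 2.
# Assumption: /dev/rmt/0mn is your no-rewind tape device.
device=/dev/rmt/0mn
file_number=2
echo "mt -t $device rewind"
echo "mt -t $device fsf $(fsf_count "$file_number")"
```

For the first user data file the count is 1, which skips only the leading label file. Each later user data file adds 3 to the count.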

Example: Multivolume, Standard Label Tapes

This example assumes the use of a no-rewind device and TAPECLOSE=LEAVE.

Suppose that you are given a two-reel, multivolume, standard label tape set containing a mainframe external file and told that the record length is 7 and the record format is fixed. You will need to unload the data portion of each tape into disk files, concatenate the two disk files, and process the resultant file.

Make sure that the first tape is in the tape drive, then use the mt command to rewind the tape, skip over the label file, and position the tape at the beginning of the user data file. In this case, the user data file that you want to access is the first (and only) user data file on the tape. To skip over the label and position the tape at the beginning of the user data file, use the fsf count subcommand. Using the formula in "Working with External Files Created on the Mainframe" on page 150, the fsf count value is 1.

    mt -t /dev/rmt/0mn rewind
    mt -t /dev/rmt/0mn fsf 1
    dd if=/dev/rmt/0mn of=/tmp/tape1 ibs=7

Repeat this process with the second tape, then concatenate the two disk files into one file.

    mt -t /dev/rmt/0mn rewind
    mt -t /dev/rmt/0mn fsf 1
    dd if=/dev/rmt/0mn of=/tmp/tape2 ibs=7
    cat /tmp/tape1 /tmp/tape2 > /tmp/ebcdic.numbers
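Because the records are fixed length, one quick sanity check before handing the file to SAS is that the size of the concatenated file is an exact multiple of the record length. The sketch below is an illustration, not part of the original procedure: it uses two small temporary files as stand-ins for the unloaded tape images so that it can be run without a tape drive, and the names workdir, lrecl, and check are hypothetical.

```shell
#!/bin/sh
# Stand-ins for the unloaded tape images (assumption: on a real system
# these would be the files that dd created from each tape). The sample
# bytes are ASCII, not EBCDIC; only the byte count matters here.
workdir=$(mktemp -d)
printf '1one   2two   ' > "$workdir/tape1"   # two 7-byte records
printf '3three 4four  ' > "$workdir/tape2"   # two 7-byte records

# Concatenate the pieces in volume order.
cat "$workdir/tape1" "$workdir/tape2" > "$workdir/ebcdic.numbers"

# The total size should be a whole number of 7-byte records.
lrecl=7
size=$(( $(wc -c < "$workdir/ebcdic.numbers") ))
if [ $(( size % lrecl )) -eq 0 ]; then
    check="ok: $(( size / lrecl )) records"
else
    check="size $size is not a multiple of $lrecl"
fi
echo "$check"

# Clean up the stand-in files.
rm -r "$workdir"
```

A size that is not a multiple of the record length suggests that a label file was copied along with the data or that a volume was truncated.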

You can then use the following DATA step to refer to the concatenated file (/tmp/ebcdic.numbers) and to convert the data using the appropriate EBCDIC informats:

    filename ibmfile '/tmp/ebcdic.numbers';
    data numbers;
       infile ibmfile lrecl=7 recfm=f;
       length digit 8 temp $ 1 word $ 6;
       input temp $ebcdic1. word $ebcdic6.;
       digit=input(temp,8.);
       drop temp;
    run;
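To inspect the unloaded data outside SAS, dd itself can translate EBCDIC to ASCII. The following is a hedged sketch: it assumes a dd that supports the conv=ascii and cbs operands (GNU dd does; check the man page on your system), and it builds a two-record EBCDIC sample with printf rather than reading a real tape image. The variable names sample and ascii are hypothetical.

```shell
#!/bin/sh
# Build a tiny EBCDIC sample: two 7-byte fixed records, "1one   " and "2two   ".
# Octal EBCDIC codes: 361='1' 362='2' 226='o' 225='n' 205='e'
#                     243='t' 246='w' 100=space
sample=$(mktemp)
printf '\361\226\225\205\100\100\100\362\243\246\226\100\100\100' > "$sample"

# conv=ascii translates EBCDIC to ASCII; cbs=7 gives the record length so
# that each fixed record becomes one line with trailing spaces removed.
ascii=$(dd if="$sample" conv=ascii cbs=7 2> /dev/null)
echo "$ascii"

# Clean up the sample file.
rm -f "$sample"
```

This is only a viewing aid for character data. The DATA step above remains the way to read the file into SAS, and packed or binary fields would not survive a character translation.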