This section describes SAS procedures whose behavior or syntax is specific to UNIX environments. Each procedure description includes a brief "UNIX specifics" section that explains which aspect of the procedure is specific to UNIX. Each procedure is described both in this documentation and in Base SAS Procedures Guide.
UNIX specifics: FILE= option in the CONTENTS statement
See: CATALOG Procedure in Base SAS Procedures Guide
PROC CATALOG CATALOG=< libref. > catalog <ENTRYTYPE= etype > <KILL>; CONTENTS <OUT= SAS-data-set > <FILE= fileref >;
| Note |
This is a simplified version of the CATALOG procedure syntax. For the complete syntax and its explanation, see the CATALOG procedure in Base SAS Procedures Guide . |
fileref
The FILE= option in the CONTENTS statement of the CATALOG procedure accepts a fileref. If the
proc catalog catalog=sasuser.profile;
contents file=myfile;
run;
| Note |
The filename that is created is always in lowercase, even if you specified it in uppercase. |
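The following is a minimal sketch of assigning the fileref explicitly before the CONTENTS statement uses it. The path is a hypothetical placeholder, not a location that SAS requires.

```sas
/* Assumption: /tmp/profile_contents.txt is a hypothetical, writable path. */
filename myfile '/tmp/profile_contents.txt';

/* Write the listing of the Sasuser.Profile catalog to the fileref. */
proc catalog catalog=sasuser.profile;
   contents file=myfile;
run;
quit;

/* Release the fileref when it is no longer needed. */
filename myfile clear;
```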
UNIX specifics: name and location of transport file
See: CIMPORT Procedure in Base SAS Procedures Guide
PROC CIMPORT destination = libref | < libref. > member-name < option(s) >;
| Note |
This is a simplified version of the CIMPORT procedure syntax. For the complete syntax and its explanation, see the CIMPORT procedure in Base SAS Procedures Guide . |
destination
identifies the file(s) in the transport file as a single SAS data set, single SAS catalog, or multiple members of a SAS data library.
libref | <libref.>member-name
specifies the name of the SAS data set, catalog, or library to be created from the transport file.
| Note |
Starting in SAS 9.1, you can use the MIGRATE procedure to convert your SAS files. For more information, see "Migrating 32-Bit SAS Files to 64-Bit in UNIX Environments" on page 106. |
The CIMPORT procedure imports a transport file that was created (exported) by the CPORT procedure. The transport file can contain a SAS data set, a SAS catalog, or an entire SAS library.
Typically the INFILE= option is used to
| Note |
CIMPORT works only with transport files created by the CPORT procedure. If the transport file was created using the XPORT engine with the COPY procedure, then another PROC COPY must be used to restore the transport file. For more information about PROC COPY, see Base SAS Procedures Guide . |
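For the XPORT engine case described in the note, a restore might look like the following sketch. Both paths and both librefs are hypothetical placeholders.

```sas
/* Assumption: /tmp/transport.xpt is a hypothetical transport file that was   */
/* written by PROC COPY with the XPORT engine, and /tmp/newlib is an existing */
/* directory that will receive the restored data sets.                        */
libname xptin xport '/tmp/transport.xpt';
libname newlib '/tmp/newlib';

/* Copy every data set from the transport file into the target library. */
proc copy in=xptin out=newlib;
run;
```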
For this example, a SAS data library that contains multiple SAS data sets was exported to a file (called transport-file) using the CPORT procedure on a foreign host. The transport file is then moved by a binary transfer to the receiving host.
The following code
libname newlib 'SAS-data-library';
filename tranfile 'transport-file';
proc cimport lib=newlib infile=tranfile;
run;
"CPORT Procedure" on page 275
"Migrating 32-Bit SAS Files to 64-Bit in UNIX Environments" on page 106
The MIGRATE Procedure at support.sas.com/rnd/migration
Moving and Accessing SAS Files
Prints descriptions of one or more files from a SAS data library
UNIX specifics: information displayed in the SAS output
See: CONTENTS Procedure in Base SAS Procedures Guide
PROC CONTENTS < option(s) >;
The CONTENTS procedure produces the same information as the CONTENTS statement in the DATASETS procedure. See "DATASETS Procedure" on page 276 for sample output.
Converts BMDP and OSIRIS system files, and SPSS export files to SAS data sets
UNIX specifics: all
PROC CONVERT product-specification < option-list >;
The CONVERT procedure converts BMDP and OSIRIS system files, and SPSS export files to SAS data sets. The procedure is supplied for compatibility. The procedure invokes the appropriate engine to convert files.
PROC CONVERT produces one output data set, but no printed output. The new data set contains the same information as the input system file; exceptions are noted in "How Missing Values Are Handled" on page 273.
The procedure converts system files from these three products:
BMDP saves files up to and including the most recent release of BMDP (available for AIX, HP-UX, and Solaris only).
SPSS saves files in a portable file format. An SPSS portable file can have any file extension. Two common extensions are .por and .exp.
OSIRIS saves files through and including OSIRIS IV. (Hierarchical file structures are not supported.)
Because the BMDP, OSIRIS, and SPSS products are
In the PROC CONVERT statement, product-specification is required and can be one of the following:
BMDP= fileref <(CODE= code CONTENT=
converts the first member of a BMDP save file created under UNIX (AIX) into a SAS data set. Here is an example:
filename save '/usr/mydir/bmdp.dat';
proc convert bmdp=save;
run;
If you have more than one save file in the BMDP file referenced by the fileref argument, you can use two options in parentheses after fileref . The CODE= option specifies the code of the save file that you want, and the CONTENT= option specifies the content of the save file. For example, if a file with CODE=JUDGES has a content of DATA, you can use the following statements:
filename save '/usr/mydir/bmdp.dat';
proc convert bmdp=save(code=judges
content=data);
run;
OSIRIS= fileref | libref
specifies a fileref or libref for the OSIRIS file to be converted into a SAS data set. You must also include the DICT= option.
SPSS= fileref | libref
specifies a fileref or libref for the SPSS export file that is to be converted into a SAS data set. The SPSS file must be created by using the SPSS EXPORT command, but it can be from any operating system.
The option-list can be one or more of the following:
DICT= fileref | libref
specifies a fileref or libref of the dictionary file for the OSIRIS file. DICT= is valid only when used with the OSIRIS product specification.
FIRSTOBS= n
gives the number of the observation where the conversion is to begin, so that you can skip observations at the beginning of the BMDP, OSIRIS, or SPSS file.
OBS= n
specifies the number of the last observation to be converted. This enables you to exclude observations at the end of the file.
OUT= SAS-data-set
names the SAS data set that will hold the converted data. If OUT= is omitted, SAS still creates a Work data set and automatically names it DATA n , just as if you had omitted a data set name in a DATA statement. See Chapter 4, "Using SAS Files," on page 101 for more information.
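As an illustration of how the FIRSTOBS=, OBS=, and OUT= options combine, here is a short sketch. The file path and the data set name Work.Subset are hypothetical.

```sas
/* Assumption: /tmp/survey.por is a hypothetical SPSS export (portable) file. */
filename spssfile '/tmp/survey.por';

/* Convert observations 11 through 60 only and store them in Work.Subset.  */
/* FIRSTOBS= gives the first observation to convert; OBS= gives the last.  */
proc convert spss=spssfile out=work.subset firstobs=11 obs=60;
run;
```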
If a numeric variable in the input data set has no value or a system missing value, CONVERT
The following sections explain how names are assigned to the SAS
| Caution |
Be sure that the translated names will be unique. Variable names are translated as indicated in the following sections. |
Variable names from the BMDP save file are used in the SAS data set, but nontrailing blanks and all special
For single-response variables, the V1 through V9999 name becomes the SAS variable name. For multiple-response variables, the suffix R n is added to the variable name where n is the response. For example, V25R1 would be the first response of the multiple-response V25. If the variable after V1000 has 100 or more responses, responses above 99 are eliminated. Numeric variables that OSIRIS stores in character, fixed-point binary, or floating-point binary mode become SAS numeric variables. Alphabetic variables become SAS character variables; any alphabetic variable of length greater than 200 is truncated to 200. The OSIRIS variable description becomes a SAS variable label, and OSIRIS print format information becomes a SAS format.
SPSS variable names and variable labels become variable names and labels without change. SPSS alphabetic variables become SAS character variables of the same length. SPSS blank values are converted to SAS missing values. SPSS print formats become SAS formats, and the SPSS default precision of no decimal places becomes part of the variables' formats. The SPSS DOCUMENT data is copied so that the CONTENTS procedure can display it. SPSS value labels are not
These three examples show how to convert BMDP, OSIRIS, and SPSS files to SAS data sets.
Converting a BMDP save file
The following statements convert a BMDP save file and produce the temporary SAS data set Temp, which contains the converted data:
filename bmdpfile 'bmdp.savefile';
proc convert bmdp=bmdpfile out=temp;
run;
Converting an OSIRIS file
The following statements convert an OSIRIS file and produce the temporary SAS data set Temp, which contains the converted data:
filename osirfile 'osirdata';
filename dictfile 'osirdict';
proc convert osiris=osirfile dict=dictfile
out=temp;
run;
Converting an SPSS file
The following statements convert an SPSS Release 9 file and produce the temporary SAS data set Temp, which contains the converted data:
filename spssfile 'spssfile.num1';
proc convert spss=spssfile out=temp;
run;
The CONVERT procedure is closely
filename myfile 'mybmdp.dat';
proc convert bmdp=myfile out=temp;
run;
libname myfile bmdp 'mybmdp.dat';
data temp;
set myfile._first_;
run;
However, the BMDP, OSIRIS, and SPSS engines provide more
"Accessing BMDP, OSIRIS, or SPSS Files in UNIX Environments" on page 125
Writes SAS data sets and catalogs into a special format in a transport file that can be moved between different
UNIX specifics: name and location of transport file
See: CPORT Procedure in Base SAS Procedures Guide
PROC CPORT source-type = libref | < libref. > member-name < option(s) >;
| Note |
This is a simplified version of the CPORT procedure syntax. For the complete syntax and its explanation, see the "CPORT Procedure" in Base SAS Procedures Guide . |
source-type
identifies the file(s) to export as either a single SAS data set, single SAS catalog, or multiple members of a SAS data library.
libref | <libref.> member-name
specifies the name of the SAS data set, catalog, or library to be exported.
| Note |
Starting in SAS 9.1, you can use the MIGRATE procedure to convert your SAS files. For more information, see "Migrating 32-Bit SAS Files to 64-Bit in UNIX Environments" on page 106. |
The CPORT procedure creates a transport file to later be restored ( imported ) by the CIMPORT procedure. The transport file can contain a SAS data set, SAS catalog, or an entire SAS library.
Typically the FILE= option is used to specify the
In this example, a SAS data library (called Oldlib) that contains multiple SAS data sets is being exported to the file, called transport-file.
libname oldlib 'SAS-data-library';
filename tranfile 'transport-file';
proc cport lib=oldlib file=tranfile;
run;
This transport file is then typically moved by binary transfer to a different host where the CIMPORT procedure will be used to restore the SAS data library.
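When only one member needs to be moved, the source type can name a single data set instead of the whole library. The sketch below assumes the DATA= source type from the complete syntax in Base SAS Procedures Guide. The paths and the data set name Sales are hypothetical.

```sas
/* Assumption: /tmp/oldlib is a hypothetical library directory that contains */
/* a data set named sales, and /tmp/sales.cpt is a hypothetical output file. */
libname oldlib '/tmp/oldlib';
filename tranfile '/tmp/sales.cpt';

/* Export one data set rather than every member of the library. */
proc cport data=oldlib.sales file=tranfile;
run;
```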
"CIMPORT Procedure" on page 270
"Migrating 32-Bit SAS Files to 64-Bit in UNIX Environments" on page 106
The MIGRATE Procedure at support.sas.com/rnd/migration
Moving and Accessing SAS Files
Lists, copies, renames, and deletes SAS files, and also manages indexes for and appends SAS data sets in a SAS data library
UNIX specifics: Directory information, CONTENTS statement output
See: DATASETS Procedure in Base SAS Procedures Guide
PROC DATASETS < option(s) >; CONTENTS < option(s) >;
| Note |
This is a simplified version of the DATASETS procedure syntax. For the complete syntax and its explanation, see the DATASETS procedure in Base SAS Procedures Guide . |
CONTENTS option(s)
the value for option(s) can be the following:
DIRECTORY
prints a list of information specific to the UNIX operating environment.
The output from the DATASETS procedure shows you the libref, engine, and physical name that are associated with the library, as well as the names and other properties of the SAS files that are contained in the library. Some of the SAS data library information, such as the filenames and access permissions, that is displayed in the SAS log by the DATASETS procedure depends on the operating environment and the engine. The information generated by the CONTENTS statement also varies according to the device type or access method associated with the data set.
If you specify the DIRECTORY option in the CONTENTS statement, the directory information is displayed in both the log and output
The CONTENTS statement in the DATASETS procedure generates the same Engine/ Host Dependent information as the CONTENTS procedure.
The following SAS code creates two data sets, Grades.sas7bdat and Majors.sas7bdat, and runs PROC DATASETS on Majors.sas7bdat.
options nodate pageno=1;
libname classes '.';
data classes.grades (label='First Data Set');
input student year state $ grade1 grade2;
label year='Year of Birth';
format grade1 4.1;
datalines;
1000 1980 NC 85 87
1042 1981 MD 92 92
1095 1979 PA 78 72
1187 1980 MA 87 94
;
data classes.majors(label='Second Data Set');
input student $ year state $ grade1 grade2 major $;
label state='Home State';
format grade1 5.2;
datalines;
1000 1980 NC 84 87 Math
1042 1981 MD 92 92 History
1095 1979 PA 79 73 Physics
1187 1980 MA 87 74 Dance
1204 1981 NC 82 96 French
;
proc datasets library=classes;
contents data=majors directory;
run;
The output of this example is shown in Output 15.1. The first page of output from this example SAS code is produced by the DIRECTORY option in the CONTENTS statement. This information also appears in the SAS log. Pages 2 and 3 in this output describe the data set Classes.Majors.sas7bdat and appear only in the SAS output.
Output 15.1: PROC DATASETS Example
The SAS System
The DATASETS Procedure
Directory
Libref CLASSES
Engine V9
Physical Name /remote/u/yourid
File Name /remote/u/yourid
Inode Number 1058605
Access Permission rwxrwxrwx
Owner Name yourid
File Size (bytes) 1024
Member File
# Name Type Size Last modified
1 GRADES DATA 16384 12MAY2003:14:30:19
2 MAJORS DATA 16384 12MAY2003:14:31:20
The SAS System
The DATASETS Procedure
Data Set Name CLASSES.MAJORS Observations 5
Member Type DATA Variables 6
Engine V9 Indexes 0
Created Monday, May 12, 2003 14:31:20 Observation Length 48
Last Modified Monday, May 12, 2003 14:31:20 Deleted Observations 0
Protection Compressed NO
Data Set Type Sorted NO
Label Second Data Set
Data Representation HP_UX_64, RS_6000_AIX_64, SOLARIS_64, HP_IA64
Encoding latin1 Western (ISO)
Engine/Host Dependent Information
Data Set Page Size 8192
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 169
Obs in First Data Page 5
Number of Data Set Repairs 0
File Name /remote/u/yourid/majors.sas7bdat
Release Created 9.0101B0
Host Created SunOS
Inode Number 1059264
Access Permission rw-r--r--
Owner Name yourid
File Size (bytes) 16384
The SAS System
The DATASETS Procedure
Alphabetic List of Variables and Attributes
# Variable Type Len Format Label
4 grade1 Num 8 5.2
5 grade2 Num 8
6 major Char 8
3 state Char 8 Home State
1 student Char 8
2 year Num 8
"CONTENTS Procedure" in Base SAS Procedures Guide
Lists the current values of all SAS system options
UNIX specifics: options available only under UNIX
See: OPTIONS Procedure in Base SAS Procedures Guide
PROC OPTIONS < option(s) >;
| Note |
This is a simplified version of the OPTIONS procedure syntax. For the complete syntax and its explanation, see the OPTIONS procedure in Base SAS Procedures Guide . |
option(s)
HOST | NOHOST
displays only host options (HOST) or only portable options (NOHOST). PORTABLE is an alias for NOHOST.
RESTRICT
displays the system options that have been restricted by your site administrator. These options cannot be changed by the
If your site administrator has not restricted any options, then the following message will appear in the SAS log:
Your Site Administrator has not restricted any options.
PROC OPTIONS lists the current values of the system options that are available in all operating environments and, if you specify the HOST option in the PROC OPTIONS statement, it lists those options that are available only under UNIX (host options). The option values displayed by PROC OPTIONS depend on the default values shipped with SAS, the default values specified by your site administrator, the default values in your own configuration file, any changes made in your current session through the System Options window or OPTIONS statement, and possibly, the device on which you are running SAS.
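The options described above can be tried one at a time, as in this brief sketch. Each step writes its listing to the SAS log.

```sas
/* List only the host (UNIX-specific) system options. */
proc options host;
run;

/* List only the portable system options. */
proc options nohost;
run;

/* List the options that the site administrator has restricted, if any. */
proc options restrict;
run;
```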
For more information about a specific option, refer to Chapter 17, "System Options under UNIX," on page 311.
For more information about restricted options, see "Order of Precedence for SAS Configuration Files" on page 17.
Defines PMENU facilities for windows created with SAS software
UNIX specifics:
ATTR= and
See: PMENU Procedure in Base SAS Procedures Guide
PROC PMENU <CATALOG=< libref .> catalog > <DESC ' entry-description '>;
| Note |
This is a simplified version of the PMENU procedure syntax. For the complete syntax and its explanation, see the PMENU procedure in Base SAS Procedures Guide . |
CATALOG= <libref.>catalog
specifies the catalog in which you want to store PMENU entries. If you omit libref , the PMENU entries are stored in a catalog in the Sasuser data library. If you omit CATALOG=, the entries are stored in the Sasuser.Profile catalog.
DESC 'entry-description'
provides a description of the PMENU catalog entries created in the step.
The PMENU procedure defines PMENU facilities for windows created by using the WINDOW statement in Base SAS software, the %WINDOW macro statement, the BUILD procedure of SAS/AF software, or the SAS Component Language (SCL) PMENU function with SAS/AF, SAS/CALC, and SAS/FSP software.
Under UNIX, the following options are ignored:
ATTR= and COLOR= options in the TEXT statement. The colors and attributes for text and input fields are controlled by the CPARMS colors specified in the SASCOLOR window. See "Customizing Colors in UNIX Environments" on page 84 for more information.
ACCELERATE= and the MNEMONIC= options in the ITEM statement.
Defines destinations for SAS procedure output and the SAS log
UNIX specifics: Valid values of file specification
See: PRINTTO Procedure in Base SAS Procedures Guide
PROC PRINTTO < option(s) >;
| Note |
This is a simplified version of the PRINTTO procedure syntax. For the complete syntax and its explanation, see the PRINTTO procedure in Base SAS Procedures Guide . |
LOG= file-specification
specifies a fully qualified pathname (in quotation marks), an environment variable, a fileref, or a file in the current directory (without extension).
PRINT= file-specification
specifies a fully qualified pathname (in quotation marks), an environment variable, a fileref, or a file in the current directory (without extension). If you specify a fileref that is defined with the PRINTER device-type keyword, output is sent directly to the printer.
The following statements send any SAS log entries that are generated after the RUN statement to the external file that is associated with the fileref MyFile:
filename myfile '/users/myid/mydir/mylog';
proc printto log=myfile;
run;
If MyFile has not been defined as a fileref, PROC PRINTTO will create the file MyFile.log in the current directory.
The following statements send any procedure output that is generated after the RUN statement to the file /users/myid/mydir/myout :
proc printto print='/users/myid/mydir/myout'; run;
The following statements send the procedure output from the CONTENTS procedure directly to the system printer:
filename myfile printer;
proc printto print=myfile;
run;
proc contents data=oranges;
run;
To redirect the SAS log and procedure output to their original default destinations, run PROC PRINTTO without any options:
proc printto; run;
If MYPRINT and MYLOG have not been defined as
proc printto print=myprint log=mylog; run;
If filerefs MyPrint and MyLog had been defined, the output would have gone to the files associated with these filerefs.
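Putting these pieces together, a typical pattern is to redirect both destinations for one step and then restore the defaults. In this sketch the two paths are hypothetical, and Sashelp.Class is assumed to be available as sample data.

```sas
/* Assumption: both paths are hypothetical, writable locations. */
proc printto log='/tmp/step1.log' print='/tmp/step1.lst';
run;

/* Log entries and procedure output from this step go to the two files. */
proc print data=sashelp.class;
run;

/* Restore the default destinations for the log and procedure output. */
proc printto;
run;
```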
Chapter 6, "Printing and Routing Output," on page 153
Sorts observations in a SAS data set by one or more variables, then stores the resulting sorted observations in a new SAS data set or
UNIX specifics: sort utilities available
See: SORT Procedure in Base SAS Procedures Guide
PROC SORT < option(s) >< collating-sequence-option >;
| Note |
This is a simplified version of the SORT procedure syntax. For the complete syntax and its explanation, see the SORT procedure in Base SAS Procedures Guide . |
option(s)
SORTSIZE= memory-specification
specifies the maximum amount of memory available to the SORT procedure. For further explanation of the SORTSIZE= option, see the following Details section.
TAGSORT
stores only the BY variables and the observation number in temporary files. For further explanation of the TAGSORT option, see the following Details section.
| Note |
The TAGSORT option is ignored when used with a host sort. |
The SORT procedure sorts observations in a SAS data set by one or more character or numeric variables, either replacing the original data set or creating a new, sorted data set. By default under UNIX, the SORT procedure uses the ASCII collating sequence.
The SORT procedure uses the sort utility specified by the SORTPGM system option. Sorting can be done by SAS or the syncsort utility. You can use all of the options available to the SAS sort utility, such as the SORTSEQ and NODUPKEY options. In some situations, you can improve your performance by using the NOEQUALS option. If you specify an option that is not supported by the host sort, then the SAS sort will be used instead. For more information about all of the options that are available, see the SORT procedure in Base SAS Procedures Guide .
You can use the SORTSIZE= option in the PROC SORT statement to limit the amount of memory available to the SORT procedure. This option can reduce the amount of swapping SAS must do to sort the data set.
| Note |
If you do not specify the SORTSIZE= option, PROC SORT uses the value of the SORTSIZE system option. The SORTSIZE system option can be defined on the command line or in the SAS configuration file. |
The syntax of the SORTSIZE= option is as follows:
SORTSIZE= memory-specification
where memory-specification can be one of the following:
n
specifies the amount of memory in bytes.
nK
specifies the amount of memory in 1-kilobyte multiples.
nM
specifies the amount of memory in 1-megabyte multiples.
nG
specifies the amount of memory in 1-gigabyte multiples.
The default SAS configuration file sets this option based on the value of the SORTSIZE system option. The default for the SORTSIZE system option is MAX; however, the value of MAX depends on your operating system. To view the value of MAX for your operating environment, run the following code:
proc options option=sortsize; run;
You can override the default value of the SORTSIZE system option by
specifying a different SORTSIZE= value in the PROC SORT statement
submitting an OPTIONS statement that sets the SORTSIZE system option to a new value
setting the SORTSIZE system option on the command line during the invocation of SAS.
In general, you should set the SORTSIZE= option no larger than the amount of physical memory available to the SAS process. If the SORTSIZE= value is larger than the amount of available memory, then the operating system will be forced to page excessively. If the SORTSIZE= value is too small, then not all of the sorting can be done in memory, which also results in more disk I/O.
When the SORTSIZE= value is large enough to sort the entire data set in memory, you can achieve optimal sort performance. If the entire data set to be sorted will not fit in memory, SAS creates a temporary utility file to store the data. In this case, SAS uses a sort algorithm that is
| Note |
You can also use the SORTSIZE system option, which has the same effect as the SORTSIZE= option in the PROC SORT statement. |
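The two ways of setting the limit can be sketched as follows. The data set names and the value 256M are illustrative assumptions, not recommended settings.

```sas
/* Build a small, hypothetical data set to sort. */
data work.big;
   do id = 5 to 1 by -1;
      output;
   end;
run;

/* Limit this one sort to 256 megabytes of memory. */
proc sort data=work.big out=work.big_sorted sortsize=256M;
   by id;
run;

/* Alternatively, set the system option so that later sorts use the same limit. */
options sortsize=256M;
```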
The TAGSORT option in the PROC SORT statement is useful when there might not be enough disk space to sort a large SAS data set. When you specify the TAGSORT option, only the sort keys (that is, the variables specified in the BY statement) and the observation number for each observation are stored in the temporary utility files. The sort keys, together with the observation number, are referred to as tags . At the completion of the sorting process, the tags are used to retrieve the records from the input data set in sorted order. Thus, in cases where the total number of bytes of the sort keys is small compared with the length of the record, temporary disk use is reduced considerably.
You must have enough disk space to hold an additional copy of the data set (the output data set) and the utility file that contains the tags. By default, this utility file is stored in the Work library. If this directory is too small, you can change this directory using the WORK system option. For more information, see "WORK System Option" on page 381.
| Note |
If you are using a host sort utility, then you can use the SORTDEV system option to change the location of your temporary files. For more information, see "SORTDEV System Option" on page 366. |
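A TAGSORT request might look like the sketch below. The data set is hypothetical and deliberately pairs a short sort key with a long record, which is the case where tags save the most temporary space.

```sas
/* Hypothetical data set: a numeric key plus a 200-byte character payload. */
data work.wide;
   length payload $ 200;
   do id = 3, 1, 2;
      payload = repeat('x', 199);   /* 200 characters in total */
      output;
   end;
run;

/* Only the BY variable and the observation number go to the utility file. */
proc sort data=work.wide out=work.wide_sorted tagsort;
   by id;
run;
```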
Note that while using the TAGSORT option may reduce temporary disk use, the processing time could be higher. However, on computers with limited available disk space, the TAGSORT option might enable sorts to be performed in situations where they would
You need to consider the following items when determining the amount of the disk space needed to run PROC SORT:
input SAS data set
PROC SORT uses the input data set specified by the DATA= option.
output SAS data set
PROC SORT stores the output data set in the location specified by the OUT= option. If the OUT= option is not specified, the sorted data set replaces the input data set, so the output is written to the directory of the input data set.
utility file stored in the Work library
This utility file is approximately the size of the input SAS data set.
temporary output SAS data set
During the sort, PROC SORT creates its output in the directory specified in the OUT= option (or directory of the input data set if the OUT= option is not specified). The temporary data set has the same filename as the original data set, except it has an extension of .lck. After the sort completes successfully, the original data set is deleted, and the temporary data set is
You can reduce the amount of disk space needed by specifying the OVERWRITE option on the PROC SORT statement. When you specify this option, SAS replaces the input data set with the sorted data set. This option should only be used with a data set that is
Generally, SAS uses the memory value specified in the REALMEMSIZE system option. However, this value is limited by the SORTSIZE value (which is limited by the value of the MEMSIZE system option). If SORTSIZE is set to the default value of MAX, then PROC SORT uses the REALMEMSIZE value to determine the amount of memory to use. For information about setting the REALMEMSIZE system option, see "Guidelines for Setting the REALMEMSIZE System Option" on page 285.
| Note |
If you receive an out of memory error, then increase the value of MEMSIZE. For more information, see "MEMSIZE System Option" on page 344. |
Since PROC SORT uses the REALMEMSIZE system option to determine how much memory to use, it is important that the REALMEMSIZE value reflects the amount of memory that is available on your system. If REALMEMSIZE is set too high, then PROC SORT might use more memory than is actually available. Using too much memory will cause excessive paging and adversely impact system performance.
In general, REALMEMSIZE should be set to the amount of physical memory (not including swap space) that you expect to be available to SAS at run time. A good starting value is the amount of physical memory installed on the computer less the amount that is being used by running applications and the operating system. You can experiment with the REALMEMSIZE value until you reach optimum performance for your environment. In some cases, optimum performance can be achieved with a very low REALMEMSIZE value. A low value could cause SAS to use less memory and leave more memory for the operating system to perform I/O caching.
For more information, see "REALMEMSIZE System Option" on page 353.
If you want to provide your own collating sequences or change a collating sequence provided for you, use the TRANTAB procedure to create or modify translation tables. For more information about the TRANTAB procedure, see SAS National Language Support (NLS): User's Guide . When you create your own translation tables, they are stored in your Sasuser.Profile catalog, and they override any translation tables by the same name that are stored in the Host catalog.
| Note |
System managers can modify the Host catalog by copying newly created tables from the Profile catalog to the Host catalog. Then all users can access the new or modified translation table. |
If you are using the SAS windowing environment and want to see the names of the collating sequences that are stored in the Host catalog, issue the following command from any window:
catalog sashelp.host
If you are not using the SAS windowing environment, then issue the following statements to generate a list of the contents of the Host catalog:
proc catalog catalog=sashelp.host; contents; run;
Entries of type TRANTAB are the collating sequences.
To see the contents of a particular translation table, use the following statements:
proc trantab table= table-name ; list; run;
The contents of collating sequences are displayed in the SAS log.
UNIX has one host sort utility, syncsort . You can use this sorting application as an alternative sorting algorithm to the SAS sort. SAS determines which sort to use by the values that are set for the SORTNAME, SORTPGM, SORTCUT, and SORTCUTP system options.
To specify a host sort utility as the sort algorithm, complete the following steps:
Specify the name of the host utility ( syncsort ) in the SORTNAME system option.
Set the SORTPGM system option to tell SAS when to use the host sort utility.
If SORTPGM=HOST, then SAS will always use the host sort utility.
If SORTPGM=BEST, then SAS chooses the best sorting method (either the SAS sort or the host sort) for the situation. For more information, see "Sorting Based on Size or Observations" on page 286.
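The two steps above might be combined as in this sketch. It assumes that syncsort is installed and licensed on the host and that your site allows these options to be set in an OPTIONS statement. If not, set them at invocation or in the configuration file instead.

```sas
/* Step 1: name the host sort utility. Step 2: always use the host sort. */
options sortname=syncsort sortpgm=host;

/* Confirm the current settings; the values are written to the SAS log. */
proc options option=sortname;
run;
proc options option=sortpgm;
run;
```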
The sort routine that SAS uses can be based on either the number of observations in a data set or on the size of the data set. When the SORTPGM system option is set to BEST, SAS uses the first available and pertinent sorting algorithm based on this order of precedence:
host sort utility
SAS sort utility
SAS looks at the values for the SORTCUT and SORTCUTP system options to determine which sort to use.
The SORTCUT option specifies the number of observations above which the host sort utility is used instead of the SAS sort. The SORTCUTP option specifies the number of bytes in the data set above which the host sort utility is used.
If SORTCUT and SORTCUTP are set to zero, SAS uses the SAS sort routine. If you specify both options and either condition is met, SAS uses the host sort utility.
When the following OPTIONS statement is in effect, the host sort utility is used when the number of observations is 501 or greater:
options sortpgm=best sortcut=500;
In this example, the host sort utility is used when the size of the data set is greater than 40M:
options sortpgm=best sortcutp=40M;
For more information about these sort options, see "SORTCUT System Option" on page 364, "SORTCUTP System Option" on page 365, and "SORTPGM System Option" on page 368.
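Both thresholds can be set in a single statement. In the sketch below, which reuses the values from the two examples above, the host sort utility is chosen when either condition is met, following the rule stated earlier.

```sas
/* Use the host sort when a data set has more than 500 observations */
/* or is larger than 40M; otherwise use the SAS sort.               */
options sortpgm=best sortcut=500 sortcutp=40M;

/* Check the thresholds currently in effect (written to the SAS log). */
proc options option=sortcut;
run;
proc options option=sortcutp;
run;
```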
By default, the host sort utilities use the location that is specified in the -WORK option for temporary files. To change the location of these temporary files, specify a location by using the SORTDEV system option. Here is an example:
options sortdev="/tmp/host";
For more information, see "SORTDEV System Option" on page 366.
To specify options for the sort utility, use the SORTANOM system option. For a list of valid options, see "SORTANOM System Option" on page 363.
To pass parameters to the sort utility, use the SORTPARM system option. The parameters that you can specify depend on the host sort utility. For more information, see "SORTPARM System Option" on page 367.
| Caution |
If you are using a host sort utility to sort your data, then specifying the SORTSEQ= option might corrupt the character BY variables if the sort sequence translation table and its inverse are not one-to-one mappings. In other words for the sort to work, the translation table must map each character to a unique weight, and the inverse table must map each weight to a unique character variable. |
If your translation tables do not map one-to-one, then you can use one of the following methods:
create a translation table that maps one-to-one. Once you create a translation table that maps one-to-one, you can easily create a corresponding inverse table using the TRANTAB procedure. If your table is not mapped one-to-one, then you will receive the following note in the SAS log when you try to create an inverse table:
NOTE: This table cannot be mapped one to one.
For more information, see TRANTAB Procedure in SAS National Language Support (NLS): User's Guide .
use the SAS sort. You can specify the SAS sort using the SORTPGM system option. For more information, see "SORTPGM System Option" on page 368.
specify the collation order options of your host sort utility. See the documentation for your host sort utility for more information.
create a view with a dummy BY variable. For an example, see "Example: Creating a View with a
| Note |
After using one of these methods, you might need to perform
|
The following code is an example of creating a view using a dummy BY variable:
options nodate nostimer ls=78 ps=60;
options sortpgm=host msglevel=i;
data one;
input name $ age;
datalines;
anne 35
ALBERT 10
JUAN 90
janet 5
bridget 23
BRIAN 45
;
data oneview / view=oneview;
set one;
name1=upcase(name);
run;
proc sort data=oneview out=final(drop=name1);
by name1;
run;
proc print data=final;
run;
The output is the following:
Output 15.2: Creating a View with a Dummy BY Variable
The SAS System
Obs    name       age
1      ALBERT     10
2      anne       35
3      BRIAN      45
4      bridget    23
5      janet       5
6      JUAN       90
"TRANTAB Procedure" in SAS National Language Support (NLS): User's Guide
"MEMSIZE System Option" on page 344
"REALMEMSIZE System Option" on page 353
"SORTANOM System Option" on page 363
"SORTCUT System Option" on page 364
"SORTCUTP System Option" on page 365
"SORTDEV System Option" on page 366
"SORTNAME System Option" on page 367
"SORTPARM System Option" on page 367
"SORTPGM System Option" on page 368
"SORTSIZE System Option" on page 368
"UTILLOC System Option" in SAS Language Reference: Dictionary