Results: DATASETS Procedure | Base SAS 9.1 Procedures Guide, Volumes 1, 2, 3 and 4

Directory Listing to the SAS Log

The PROC DATASETS statement lists the SAS files in the procedure input library unless the NOLIST option is specified. The NOLIST option prevents the creation of the procedure results that go to the log. If you specify the MEMTYPE= option, only specified types are listed. If you specify the DETAILS option, PROC DATASETS prints these additional columns of information: Obs, Entries or Indexes, Vars, and Label.
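The options described above can be sketched as follows. This is a minimal illustration that uses the WORK library, which exists in every SAS session; substitute your own libref as needed.

```sas
/* List only DATA members of WORK in the log, with the extra
   DETAILS columns (Obs, Entries or Indexes, Vars, Label). */
proc datasets library=work memtype=data details;
run;
quit;

/* NOLIST suppresses the directory listing in the log. */
proc datasets library=work nolist;
run;
quit;
```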

Directory Listing as SAS Output

The CONTENTS statement lists the directory of the procedure input library if you use the DIRECTORY option or specify DATA=_ALL_.

If you want only a directory, use the NODS option and the _ALL_ keyword in the DATA= option. The NODS option suppresses the description of the SAS data sets; only the directory appears in the output.

Note: The CONTENTS statement does not put a directory in an output data set. If you try to create an output data set using the NODS option, you receive an empty output data set. Use the SQL procedure to create a SAS data set that contains information about a SAS data library.
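The two approaches in this section might look like the sketch below. SASHELP stands in for the procedure input library and WORK.LIBINFO is a hypothetical output name. Querying DICTIONARY.TABLES is one way to use PROC SQL for library information; the columns selected here are an illustrative subset.

```sas
/* Directory only: NODS suppresses the per-data-set descriptions. */
proc datasets library=sashelp;
   contents data=_all_ nods;
run;
quit;

/* Library information as a SAS data set, through PROC SQL. */
proc sql;
   create table work.libinfo as
   select memname, memtype, nobs, nvar, crdate
      from dictionary.tables
      where libname = 'SASHELP';
quit;
```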

Note: If you specify the ODS RTF destination, the PROC DATASETS output will go to both the SAS log and the ODS output area. The NOLIST option will suppress output to both. To see the output only in the SAS log, use the ODS EXCLUDE statement by specifying the member directory as the exclusion.

Procedure Output

The CONTENTS Statement

The only statement in PROC DATASETS that produces procedure output is the CONTENTS statement. This section shows the output from the CONTENTS statement for the GROUP data set, which is shown in Output 15.3.

Output 15.3: Data Set Attributes for the GROUP Data Set

                               The SAS System                              1

                           The DATASETS Procedure

 Data Set Name        HEALTH.GROUP                     Observations         148
 Member Type          DATA                             Variables            11
 Engine               V9                               Indexes              1
 Created              Wednesday, February              Observation Length   96
                      05, 2003 02:20:56
 Last Modified        Wednesday, February              Deleted Observations 0
                      05, 2003 02:20:56
 Protection           READ                             Compressed           NO
 Data Set Type                                         Sorted               YES
 Label                Test Subjects
 Data Representation  WINDOWS_32
 Encoding             wlatin1  Western (Windows)

Only the items in the output that require explanation are discussed.

Data Set Attributes

Here are descriptions of selected fields shown in Output 15.3:

Member Type

is the type of library member (DATA or VIEW).

Protection

indicates whether the SAS data set is READ, WRITE, or ALTER password protected.

Data Set Type

names the special data set type (such as CORR, COV, SSPC, EST, or FACTOR), if any.

Observations

is the total number of observations currently in the file. Note that for a very large data set, if the number of observations exceeds the number that can be stored in a double-precision integer, the count will show as missing.

Deleted Observations

is the number of observations marked for deletion. These observations are not included in the total number of observations, shown in the Observations field. Note that for a very large data set, if the number of deleted observations exceeds the number that can be stored in a double-precision integer, the count will show as missing.

Compressed

indicates whether the data set is compressed. If the data set is compressed, the output includes an additional item, Reuse Space (with a value of YES or NO), that indicates whether to reuse space that is made available when observations are deleted.

Sorted

indicates whether the data set is sorted. If you sort the data set with PROC SORT, PROC SQL, or specify sort information with the SORTEDBY= data set option, a value of YES appears here, and there is an additional section to the output. See Sort Information for details.

Data Representation

is the format in which data is represented on a computer architecture or in an operating environment. For example, on an IBM PC, character data is represented by its ASCII encoding and byte-swapped integers. Native data representation means that the data representation of the file matches that of the CPU that is accessing the file. For example, a file that is in Windows data representation is native to the Windows operating environment.

Encoding

is the encoding value. Encoding is a set of characters (letters, logograms, digits, punctuation, symbols, control characters, and so on) that have been mapped to numeric values (called code points) that can be used by computers. The code points are assigned to the characters in the character set when you apply an encoding method.

Engine and Operating Environment-Dependent Information

The CONTENTS statement produces operating environment-specific and engine-specific information. This information differs depending on the operating environment. The following output is from the Windows operating environment.

Output 15.4: Engine and Operating Environment Dependent Information Section of CONTENTS Output

 Engine/Host Dependent Information

 Data Set Page Size          8192
 Number of Data Set Pages    4
 First Data Page             1
 Max Obs per Page            84
 Obs in First Data Page      62
 Index File Page Size        4096
 Number of Index File Pages  2
 Number of Data Set Repairs  0
 File Name                   c:\Myfiles\health\group.sas7bdat
 Release Created             9.0101B0
 Host Created                XP_PRO

Alphabetic List of Variables and Attributes

Here are descriptions of selected columns in Output 15.5:

#

is the logical position of each variable in the observation. This is the number that is assigned to the variable when it is defined.

Variable

is the name of each variable. By default, variables appear alphabetically.
Note: Variable names are sorted such that X1, X2, and X10 appear in that order and not in the true collating sequence of X1, X10, and X2. Variable names that contain an underscore and digits may appear in a nonstandard sort order. For example, P25 and P75 appear before P2_5.

Type

specifies the type of variable: character or numeric.

Len

specifies the variable's length, which is the number of bytes used to store each of a variable's values in a SAS data set.

Transcode

specifies whether a character variable is transcoded. If the attribute is NO, then transcoding is suppressed. By default, character variables are transcoded when required. For information on transcoding, see SAS National Language Support (NLS): User's Guide.

Output 15.5: Variable Attributes Section

                     Alphabetic List of Variables and Attributes

  #   Variable   Type   Len   Format    Informat   Label                            Transcode

  9   BIRTH      Num      8   DATE7.    DATE7.                                      YES
  4   CITY       Char    15   $.        $.                                          NO
  3   FNAME      Char    15   $.        $.                                          NO
 10   HIRED      Num      8   DATE7.    DATE7.                                      YES
 11   HPHONE     Char    12   $.        $.                                          YES
  1   IDNUM      Char     4   $.        $.                                          YES
  7   JOBCODE    Char     3   $.        $.                                          YES
  2   LNAME      Char    15   $.        $.                                          YES
  8   SALARY     Num      8   COMMA8.              current salary excluding bonus   YES
  6   SEX        Char     1   $.        $.                                          YES
  5   STATE      Char     2   $.        $.                                          YES

Note: If none of the variables in the SAS data set has a format, informat, or label associated with it, or if none of the variables are set to no transcoding, then the column for that attribute does not display.

Alphabetic List of Indexes and Attributes

The section shown in Output 15.6 appears only if the data set has indexes associated with it.

Output 15.6: Index Attributes Section

           Alphabetic List of Indexes and Attributes

                                       # of
                 Unique    NoMiss    Unique
 #      Index    Option    Option    Values    Variables

 1      vital    YES       YES          148    BIRTH SALARY

#

indicates the number of each index. The indexes are numbered sequentially as they are defined.

Index

displays the name of each index. For simple indexes, the name of the index is the same as a variable in the data set.

Unique Option

indicates whether the index must have unique values. If the column contains YES, the combination of values of the index variables is unique for each observation.

NoMiss Option

indicates whether the index excludes missing values for all index variables. If the column contains YES, the index does not contain observations with missing values for all index variables.

# of Unique Values

gives the number of unique values in the index.

Variables

names the variables in a composite index.
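As an illustration of where this section comes from, the sketch below builds a composite index and then runs the CONTENTS statement. WORK.CLASS (a copy of SASHELP.CLASS) and the index name NAMEAGE are stand-ins chosen for this example; they are not the GROUP data set or the vital index shown in Output 15.6.

```sas
/* Work on a copy so that the shipped SASHELP data set is untouched. */
data work.class;
   set sashelp.class;
run;

proc datasets library=work nolist;
   modify class;
   /* Composite index on two variables. UNIQUE requires each
      combination of values to be unique; NOMISS excludes
      observations in which all index variables are missing. */
   index create nameage=(name age) / unique nomiss;
run;
   /* The output now includes the list of indexes. */
   contents data=class;
run;
quit;
```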

Sort Information

The section shown in Output 15.7 appears only if the Sorted field has a value of YES.

Output 15.7: Sort Information Section

 The SAS System                            2

 The DATASETS Procedure

    Sort Information

  Sortedby       LNAME
  Validated      NO
  Character Set  ANSI

Sortedby

indicates how the data are currently sorted. This field contains either the variables and options you use in the BY statement in PROC SORT, the column name in PROC SQL, or the values you specify in the SORTEDBY= option.

Validated

indicates whether PROC SORT or PROC SQL sorted the data. If PROC SORT or PROC SQL sorted the data set, the value is YES. If you assigned the sort information with the SORTEDBY= data set option, the value is NO.

Character Set

is the character set used to sort the data. The value for this field can be ASCII, EBCDIC, or PASCII.

Collating Sequence

is the collating sequence used to sort the data set. This field does not appear if you do not specify a specific collating sequence that is different from the character set. (not shown)

Sort Option

indicates whether PROC SORT used the NODUPKEY or NODUPREC option when sorting the data set. This field does not appear if you did not use one of these options in a PROC SORT statement. (not shown)
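The difference between sorting a data set and asserting its sort order might be sketched as follows. WORK.STAFF and its variables are hypothetical names invented for this example.

```sas
/* Small hypothetical data set. */
data work.staff;
   input lname $ salary;
   datalines;
Smith 50000
Adams 42000
Jones 61000
;
run;

/* PROC SORT sorts the data and records the sort information. */
proc sort data=work.staff out=work.staff_sorted;
   by lname;
run;

/* SORTEDBY= only asserts the order; SAS does not verify it here. */
data work.staff_asserted(sortedby=lname);
   set work.staff_sorted;
run;

/* Compare the Sort Information sections of the two data sets. */
proc datasets library=work nolist;
   contents data=staff_sorted;
   contents data=staff_asserted;
run;
quit;
```

Based on the field descriptions above, the Validated field should read YES for the first data set and NO for the second.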

PROC DATASETS and the Output Delivery System (ODS)

Most SAS procedures send their messages to the SAS log and their procedure results to the output. PROC DATASETS is unique because it sends procedure results to both the SAS log and the procedure output file. When the interface to ODS was created, it was decided that all procedure results (from both the log and the procedure output file) should be available to ODS. In order to implement this feature and maintain compatibility with earlier releases, the interface to ODS had to be slightly different from the usual interface.

By default, the PROC DATASETS statement itself produces two output objects: Members and Directory. These objects are routed to the SAS log. The CONTENTS statement produces three output objects by default: Attributes, EngineHost, and Variables. (The use of various options adds other output objects.) These objects are routed to the procedure output file. If you open an ODS destination (such as HTML, RTF, or PRINTER), all of these objects are, by default, routed to that destination.

You can use ODS SELECT and ODS EXCLUDE statements to control which objects go to which destination, just as you can for any other procedure. However, because of the unique interface between PROC DATASETS and ODS, when you use the keyword LISTING in an ODS SELECT or ODS EXCLUDE statement, you affect both the log and the listing.
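A minimal sketch of selecting a single output object follows. SASHELP.CLASS stands in for your own data set; Variables is one of the object names listed in this section.

```sas
/* Keep only the Variables object from the CONTENTS statement. */
ods select Variables;

proc datasets library=sashelp nolist;
   contents data=class;
run;
quit;
```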

ODS Table Names

PROC DATASETS and PROC CONTENTS assign a name to each table they create. You can use these names to reference the table when using the Output Delivery System (ODS) to select tables and create output data sets. For more information, see The Complete Guide to the SAS Output Delivery System.

PROC CONTENTS generates the same ODS tables as PROC DATASETS with the CONTENTS statement.

Table 15.6: ODS Tables Produced by the DATASETS Procedure without the CONTENTS Statement
ODS Table	Description	Table is generated:
Directory	General library information	unless you specify the NOLIST option.
Members	Library member information	unless you specify the NOLIST option.

Table 15.7: ODS Table Names Produced by PROC CONTENTS and PROC DATASETS with the CONTENTS Statement
ODS Table	Description	Table is generated:
Attributes	Data set attributes	unless you specify the SHORT option.
Directory	General library information	if you specify DATA=<libref.>_ALL_ or the DIRECTORY option. ^[*]
EngineHost	Engine and operating environment information	unless you specify the SHORT option.
IntegrityConstraints	A detailed listing of integrity constraints	if the data set has integrity constraints and you do not specify the SHORT option.
IntegrityConstraintsShort	A concise listing of integrity constraints	if the data set has integrity constraints and you specify the SHORT option.
Indexes	A detailed listing of indexes	if the data set is indexed and you do not specify the SHORT option.
IndexesShort	A concise listing of indexes	if the data set is indexed and you specify the SHORT option.
Members	Library member information	if you specify DATA=<libref.>_ALL_ or the DIRECTORY option. ^[*]
Position	A detailed listing of variables by logical position in the data set	if you specify the VARNUM option and you do not specify the SHORT option.
PositionShort	A concise listing of variables by logical position in the data set	if you specify the VARNUM option and the SHORT option.
Sortedby	Detailed sort information	if the data set is sorted and you do not specify the SHORT option.
SortedbyShort	Concise sort information	if the data set is sorted and you specify the SHORT option.
Variables	A detailed listing of variables in alphabetical order	unless you specify the SHORT option.
VariablesShort	A concise listing of variables in alphabetical order	if you specify the SHORT option.
^[*] For PROC DATASETS, if both the NOLIST option and either the DIRECTORY option or DATA=<libref.>_ALL_ are specified, then the NOLIST option is ignored.
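The table names above can be used with the ODS OUTPUT statement to capture tables as data sets. The sketch below assumes SASHELP.CLASS as the input; WORK.ATTR and WORK.VARS are hypothetical output names.

```sas
/* Route two of the CONTENTS tables to data sets. */
ods output Attributes=work.attr Variables=work.vars;

proc datasets library=sashelp nolist;
   contents data=class;
run;
quit;

/* Inspect the captured variable listing. */
proc print data=work.vars;
run;
```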

Output Data Sets

The CONTENTS Statement

The CONTENTS statement is the only statement in the DATASETS procedure that generates output data sets.

The OUT= Data Set

The OUT= option in the CONTENTS statement creates an output data set. Each variable in each DATA= data set has one observation in the OUT= data set. These are the variables in the output data set:

CHARSET
- the character set used to sort the data set. The value is ASCII, EBCDIC, or PASCII. A blank appears if the data set does not have sort information stored with it.

COLLATE
- the collating sequence used to sort the data set. A blank appears if the sort information for the input data set does not include a collating sequence.

COMPRESS
- indicates whether the data set is compressed.

CRDATE
- date the data set was created.

DELOBS
- number of observations marked for deletion in the data set. (Observations can be marked for deletion but not actually deleted when you use the FSEDIT procedure of SAS/FSP software.)

ENCRYPT
- indicates whether the data set is encrypted.

ENGINE
- name of the method used to read from and write to the data set.

FLAGS
- indicates whether a variable in an SQL view is protected (P) or contributes (C) to a derived variable.
  
  P
  
  indicates the variable is protected. The value of the variable can be displayed but not updated.
  
  C
  
  indicates whether the variable contributes to a derived variable.
  
  The value of FLAGS is blank if P or C does not apply to an SQL view or if it is a data set view.

FORMAT
- variable format. The value of FORMAT is a blank if you do not associate a format with the variable.

FORMATD
- number of decimals you specify when you associate the format with the variable. The value of FORMATD is 0 if you do not specify decimals in the format.

FORMATL
- format length. If you specify a length for the format when you associate the format with a variable, the length you specify is the value of FORMATL. If you do not specify a length for the format when you associate the format with a variable, the value of FORMATL is the default length of the format if you use the FMTLEN option and 0 if you do not use the FMTLEN option.

GENMAX
- maximum number of versions for the generation group.

GENNEXT
- the next generation number for a generation group.

GENNUM
- the version number.

IDXCOUNT
- number of indexes for the data set.

IDXUSAGE
- use of the variable in indexes. Possible values are
  - NONE
    - the variable is not part of an index.
  - SIMPLE
    - the variable has a simple index. No other variables are included in the index.
  - COMPOSITE
    - the variable is part of a composite index.
  - BOTH
    - the variable has a simple index and is part of a composite index.

INFORMAT
- variable informat. The value is a blank if you do not associate an informat with the variable.

INFORMD
- number of decimals you specify when you associate the informat with the variable. The value is 0 if you do not specify decimals when you associate the informat with the variable.

INFORML
- informat length. If you specify a length for the informat when you associate the informat with a variable, the length you specify is the value of INFORML. If you do not specify a length for the informat when you associate the informat with a variable, the value of INFORML is the default length of the informat if you use the FMTLEN option and 0 if you do not use the FMTLEN option.

JUST
- justification (0=left, 1=right).

LABEL
- variable label (blank if none given).

LENGTH
- variable length.

LIBNAME
- libref used for the data library.

MEMLABEL
- label for this SAS data set (blank if no label).

MEMNAME
- SAS data set that contains the variable.

MEMTYPE
- library member type (DATA or VIEW).

MODATE
- date the data set was last modified.

NAME
- variable name.

NOBS
- number of observations in the data set.

NODUPKEY
- indicates whether the NODUPKEY option was used in a PROC SORT statement to sort the input data set.

NODUPREC
- indicates whether the NODUPREC option was used in a PROC SORT statement to sort the input data set.

NPOS
- physical position of the first character of the variable in the data set.

POINTOBS
- indicates if the data set can be addressed by observation.

PROTECT
- the first letter of the level of protection. The value for PROTECT is one or more of the following:
  
  A
  
  indicates the data set is alter-protected.
  
  R
  
  indicates the data set is read-protected.
  
  W
  
  indicates the data set is write-protected.

REUSE
- indicates whether the space made available when observations are deleted from a compressed data set should be reused. If the data set is not compressed, the REUSE variable has a value of NO.

SORTED
- the value depends on the sorting characteristics of the input data set. Possible values are
  
  . (period)
  
  for not sorted.
  
  0
  
  for sorted but not validated.
  
  1
  
  for sorted and validated.

SORTEDBY
- the value depends on that variable's role in the sort. Possible values are
  - . (period)
    - if the variable was not used to sort the input data set.
  - n
    - where n is an integer that denotes the position of that variable in the sort. A negative value of n indicates that the data set is sorted by the descending order of that variable.

TYPE
- type of the variable (1=numeric, 2=character).

TYPEMEM
- special data set type (blank if no TYPE= value is specified).

VARNUM
- variable number in the data set. Variables are numbered in the order they appear.

The output data set is sorted by the variables LIBNAME and MEMNAME.

Note: The variable names are sorted so that the values X1, X2, and X10 are listed in that order, not in the true collating sequence of X1, X10, X2. Therefore, if you want to use a BY statement on MEMNAME in subsequent steps, run a PROC SORT step on the output data set first or use the NOTSORTED option in the BY statement.
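The following sketch shows one way to create an OUT= data set and then use BY-group processing on it, as the note above recommends. The data sets WORK.A and WORK.B and the output names WORK.ALLVARS and WORK.VARCOUNT are hypothetical.

```sas
/* Two small hypothetical data sets to describe. */
data work.a;
   x1 = 1; x2 = 2; x10 = 3;
run;

data work.b;
   y = 'abc';
run;

/* One observation per variable per data set; NOPRINT suppresses
   the printed CONTENTS output. */
proc datasets library=work;
   contents data=_all_ memtype=data out=work.allvars noprint;
run;
quit;

/* Sort explicitly before BY-group processing. */
proc sort data=work.allvars;
   by libname memname varnum;
run;

/* Count the variables in each data set. */
data work.varcount;
   set work.allvars;
   by libname memname;
   if first.memname then nvars = 0;
   nvars + 1;
   if last.memname then output;
   keep libname memname nvars;
run;
```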

The following is an example of an output data set created from the GROUP data set, which is shown in Example 4 and in Procedure Output.

Output 15.8: The Data Set HEALTH.GRPOUT

                     An Example of an Output Data Set                      1

 OBS LIBNAME  MEMNAME    MEMLABEL     TYPEMEM  NAME     TYPE  LENGTH  VARNUM

   1 HEALTH    GROUP   Test Subjects           BIRTH      1      8       9
   2 HEALTH    GROUP   Test Subjects           CITY       2     15       4
   3 HEALTH    GROUP   Test Subjects           FNAME      2     15       3
   4 HEALTH    GROUP   Test Subjects           HIRED      1      8      10
   5 HEALTH    GROUP   Test Subjects           HPHONE     2     12      11
   6 HEALTH    GROUP   Test Subjects           IDNUM      2      4       1
   7 HEALTH    GROUP   Test Subjects           JOBCODE    2      3       7
   8 HEALTH    GROUP   Test Subjects           LNAME      2     15       2
   9 HEALTH    GROUP   Test Subjects           SALARY     1      8       8
  10 HEALTH    GROUP   Test Subjects           SEX        2      1       6
  11 HEALTH    GROUP   Test Subjects           STATE      2      2       5

 OBS             LABEL               FORMAT  FORMATL  FORMATD  INFORMAT  INFORML

   1                                 DATE       7        0       DATE       7
   2                                 $          0        0       $          0
   3                                 $          0        0       $          0
   4                                 DATE       7        0       DATE       7
   5                                 $          0        0       $          0
   6                                 $          0        0       $          0
   7                                 $          0        0       $          0
   8                                 $          0        0       $          0
   9 current salary excluding bonus  COMMA      8        0                  0
  10                                 $          0        0       $          0
  11                                 $          0        0       $          0

                     An Example of an Output Data Set                       2

 Obs INFORMD JUST NPOS NOBS ENGINE           CRDATE           MODATE DELOBS

   1    0      1    8   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
   2    0      0   58   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
   3    0      0   43   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
   4    0      1   16   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
   5    0      0   79   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
   6    0      0   24   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
   7    0      0   76   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
   8    0      0   28   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
   9    0      1    0   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
  10    0      0   75   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0
  11    0      0   73   148   V9   29JAN02:08:06:46 29JAN02:09:13:36    0

 OBS IDXUSAGE  MEMTYPE IDXCOUNT PROTECT FLAGS COMPRESS REUSE SORTED SORTEDBY

   1 COMPOSITE  DATA       1      R--    ---     NO     NO      0       .
   2 NONE       DATA       1      R--    ---     NO     NO      0       .
   3 NONE       DATA       1      R--    ---     NO     NO      0       .
   4 NONE       DATA       1      R--    ---     NO     NO      0       .
   5 NONE       DATA       1      R--    ---     NO     NO      0       .
   6 NONE       DATA       1      R--    ---     NO     NO      0       .
   7 NONE       DATA       1      R--    ---     NO     NO      0       .
   8 NONE       DATA       1      R--    ---     NO     NO      0       1
   9 COMPOSITE  DATA       1      R--    ---     NO     NO      0       .
  10 NONE       DATA       1      R--    ---     NO     NO      0       .
  11 NONE       DATA       1      R--    ---     NO     NO      0       .

                     An Example of an Output Data Set                       3

 OBS CHARSET COLLATE NODUPKEY NODUPREC ENCRYPT POINTOBS GENMAX GENNUM GENNEXT

   1  ANSI              NO       NO      NO      YES       0      .      .
   2  ANSI              NO       NO      NO      YES       0      .      .
   3  ANSI              NO       NO      NO      YES       0      .      .
   4  ANSI              NO       NO      NO      YES       0      .      .
   5  ANSI              NO       NO      NO      YES       0      .      .
   6  ANSI              NO       NO      NO      YES       0      .      .
   7  ANSI              NO       NO      NO      YES       0      .      .
   8  ANSI              NO       NO      NO      YES       0      .      .
   9  ANSI              NO       NO      NO      YES       0      .      .
  10  ANSI              NO       NO      NO      YES       0      .      .
  11  ANSI              NO       NO      NO      YES       0      .      .

Note: For information about how to get the CONTENTS output into an ODS data set for processing, see Example 7.

The OUT2= Data Set

The OUT2= option in the CONTENTS statement creates an output data set that contains information about indexes and integrity constraints. These are the variables in the output data set:

IC_OWN
- contains YES if the index is owned by the integrity constraint.

INACTIVE
- contains YES if the integrity constraint is inactive.

LIBNAME
- libref used for the data library.

MEMNAME
- SAS data set that contains the variable.

MG
- the value of MESSAGE=, if it is used, in the IC CREATE statement.

MSGTYPE
- the value will be blank unless an integrity constraint is violated and you specified a message.

NAME
- the name of the index or integrity constraint.

NOMISS
- contains YES if the NOMISS option is defined for the index.

NUMVALS
- the number of distinct values in the index (displayed for centiles).

NUMVARS
- the number of variables involved in the index or integrity constraint.

ONDELETE
- for a foreign key integrity constraint, contains RESTRICT or SET NULL if applicable (the ON DELETE option in the IC CREATE statement).

ONUPDATE
- for a foreign key integrity constraint, contains RESTRICT or SET NULL if applicable (the ON UPDATE option in the IC CREATE statement).

RECREATE
- the SAS statement necessary to recreate the index or integrity constraint.

REFERENCE
- for a foreign key integrity constraint, contains the name of the referenced data set.

TYPE
- the type. For an index, the value is Index while for an integrity constraint, the value is the type of integrity constraint (Not Null, Check, Primary Key, etc.).

UNIQUE
- contains YES if the UNIQUE option is defined for the index.

UPERC
- the percentage of the index that has been updated since the last refresh (displayed for centiles).

UPERCMX
- the percentage of the index update that triggers a refresh (displayed for centiles).

WHERE
- for a check integrity constraint, contains the WHERE statement.
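A sketch of creating and inspecting an OUT2= data set follows. WORK.CLASS (a copy of SASHELP.CLASS), the simple index on NAME, and the output name WORK.CLASSIDX are assumptions made for this example.

```sas
/* Copy a shipped data set so that an index can be added to it. */
data work.class;
   set sashelp.class;
run;

proc datasets library=work nolist;
   modify class;
   /* Simple index: the index name is the variable name. */
   index create name / unique;
run;
   /* OUT2= receives index and integrity constraint information. */
   contents data=class out2=work.classidx noprint;
run;
quit;

/* Print a few of the OUT2= variables described above. */
proc print data=work.classidx;
   var name type unique nomiss numvars recreate;
run;
```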
