Preprocessing Input Data for BY-Group Processing


Sorting Observations for BY-Group Processing

You can use the SORT procedure to change the physical order of the observations in the data set. You can either replace the original data set, or create a new, sorted data set by using the OUT= option of the SORT procedure. In this example, PROC SORT rearranges the observations in the data set INFORMATION based on ascending values of the variables State and ZipCode, and replaces the original data set.

   proc sort data=information;
      by State ZipCode;
   run;
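To keep the original data set unchanged, the OUT= option mentioned above can be used instead. The following is a minimal sketch; the output data set name information_sorted is a hypothetical name chosen for illustration.

```sas
/* Sketch: write the sorted observations to a new data set.          */
/* INFORMATION is left in its original order; information_sorted is  */
/* an assumed name, not one used in the original text.               */
proc sort data=information out=information_sorted;
   by State ZipCode;
run;
```

A later DATA step that uses a BY statement would then read information_sorted rather than INFORMATION.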

As a general rule, when you use PROC SORT, specify the variables in the BY statement in the same order that you plan to specify them in the BY statement in the DATA step. For a detailed description of the default sorting orders for numeric and character variables, see the SORT procedure in Base SAS Procedures Guide.
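As an illustration of this rule, a DATA step that reads the sorted data set might look like the sketch below. It assumes INFORMATION has already been sorted by State and ZipCode; the PUT statement and the use of the FIRST.ZipCode variable are illustrative additions only.

```sas
/* Sketch: BY variables are listed in the same order as in PROC SORT. */
data _null_;
   set information;
   by State ZipCode;
   /* FIRST.ZipCode is 1 on the first observation of each             */
   /* State/ZipCode BY group; write one log line per group.           */
   if first.ZipCode then put 'Start of BY group: ' State= ZipCode=;
run;
```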

Indexing for BY-Group Processing

You can also ensure that observations are processed in ascending numeric or character order by creating an index based on one or more variables in the SAS data set. If you specify a BY statement in a DATA step, SAS looks for an appropriate index. If one exists, SAS automatically retrieves the observations from the data set in indexed order.
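One way to create such an index is with the INDEX CREATE statement of PROC DATASETS. The sketch below assumes that INFORMATION is stored in the WORK library; the composite index name StateZip is a hypothetical name chosen for illustration.

```sas
/* Sketch: build a composite index on State and ZipCode.              */
/* The library (WORK) and the index name (StateZip) are assumptions.  */
proc datasets library=work nolist;
   modify information;
   index create StateZip=(State ZipCode);
quit;

/* With the index in place, a BY statement can retrieve observations  */
/* in indexed order without a prior PROC SORT.                        */
data _null_;
   set information;
   by State ZipCode;
   if first.State then put 'First observation for ' State=;
run;
```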

Note: Because indexes require additional resources to create and maintain, you should determine whether their use significantly improves performance. Depending on the nature of the data in your SAS data set, using PROC SORT to order data values can be more advantageous than indexing. For an overview of indexes, see 'Understanding SAS Indexes'.




SAS 9.1 Language Reference. Concepts
ISBN: 1590471989
Year: 2004
Pages: 255
