Statistical Procedures


Available Statistical Procedures

Table 1.2 on page 6 lists statistical procedures according to task. Table A1.1 on page 1355 lists the most common statistics and the procedures that compute them.

Table 1.2: Elementary Statistical Procedures by Task

| To produce | Use this procedure | Which |
|---|---|---|
| Descriptive statistics | CORR | computes simple descriptive statistics. |
| | MEANS or SUMMARY | computes descriptive statistics; can produce printed output and output data sets. By default, PROC MEANS produces printed output and PROC SUMMARY creates an output data set. |
| | REPORT | computes most of the same statistics as PROC TABULATE; allows customization of format. |
| | SQL | computes descriptive statistics for data in one or more DBMS tables; can produce a printed report or create a SAS data set. |
| | TABULATE | produces tabular reports for descriptive statistics; can create an output data set. |
| | UNIVARIATE | computes the broadest set of descriptive statistics; can create an output data set. |
| Frequency and cross-tabulation tables | FREQ | produces one-way to n-way tables; reports frequency counts; computes chi-square tests; computes tests and measures of association and agreement for two-way to n-way cross-tabulation tables; can compute exact tests and asymptotic tests; can create output data sets. |
| | TABULATE | produces one-way and two-way cross-tabulation tables; can create an output data set. |
| | UNIVARIATE | produces one-way frequency tables. |
| Correlation analysis | CORR | computes Pearson's, Spearman's, and Kendall's correlations and partial correlations; also computes Hoeffding's D and Cronbach's coefficient alpha. |
| Distribution analysis | UNIVARIATE | computes tests for location and tests for normality. |
| | FREQ | computes a test for the binomial proportion for one-way tables; computes a goodness-of-fit test for one-way tables; computes a chi-square test of equal distribution for two-way tables. |
| Robust estimation | UNIVARIATE | computes robust estimates of scale, trimmed means, and Winsorized means. |
| Data transformation: computing ranks | RANK | computes ranks for one or more numeric variables across the observations of a SAS data set and creates an output data set; can produce normal scores or other rank scores. |
| Data transformation: standardizing data | STANDARD | creates an output data set that contains variables that are standardized to a given mean and standard deviation. |
| Low-resolution graphics [*] | CHART | produces a graphical report that can show one of the following statistics for the chart variable: frequency counts, percentages, cumulative frequencies, cumulative percentages, totals, or averages. |
| | UNIVARIATE | produces descriptive plots such as stem and leaf, box plot, and normal probability plot. |

[*] To produce high-resolution graphical reports, use SAS/GRAPH software.
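
The different defaults of PROC MEANS and PROC SUMMARY noted in Table 1.2 can be illustrated with a minimal sketch. It assumes that the sample data set SASHELP.CLASS (with numeric variables HEIGHT and WEIGHT) is available in your installation; the output data set name WORK.CLASS_STATS and the output variable names are hypothetical and chosen only for illustration.

```sas
/* PROC MEANS prints a report by default. */
proc means data=sashelp.class n mean std min max;
   var height weight;
run;

/* PROC SUMMARY prints nothing by default, so the statistics */
/* are written to an output data set with an OUTPUT statement. */
proc summary data=sashelp.class;
   var height weight;
   output out=work.class_stats
          mean=mean_height mean_weight   /* hypothetical names */
          std=std_height std_weight;
run;

/* Display the data set that PROC SUMMARY created. */
proc print data=work.class_stats;
run;
```

Both procedures compute the same statistics here; the choice between them depends mainly on whether you want a printed report or a data set for further processing.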

Efficiency Issues

Quantiles

For a large sample size n, the calculation of quantiles, including the median, requires computing time proportional to n log(n). Therefore, a procedure such as UNIVARIATE that automatically calculates quantiles may require more time than other data summarization procedures. Furthermore, because the data is held in memory, the procedure also requires more storage space to perform the computations. By default, the report procedures PROC MEANS, PROC SUMMARY, and PROC TABULATE require less memory because they do not automatically compute quantiles. These procedures also provide an option to use a new fixed-memory quantile estimation method that is usually less memory-intensive. See Quantiles on page 555 for more information.
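
As a hedged illustration of the two approaches, the sketch below requests the median and quartiles from PROC MEANS twice. It assumes that the fixed-memory method referred to above is the one selected with the QMETHOD=P2 option and that QMETHOD=OS (order statistics) is the default exact method; check the PROC MEANS documentation for your release. The data set WORK.BIG and the variable X are hypothetical and are generated only so that the example is self-contained.

```sas
/* Generate a hypothetical test data set with one million */
/* standard normal values.                                */
data work.big;
   do i = 1 to 1000000;
      x = rannor(12345);
      output;
   end;
   drop i;
run;

/* Quantiles from order statistics: the data values are held */
/* in memory while the quantiles are computed.                */
proc means data=work.big median q1 q3 qmethod=os;
   var x;
run;

/* Fixed-memory approximation of the same quantiles. */
proc means data=work.big median q1 q3 qmethod=p2;
   var x;
run;
```

The second step returns approximations, not exact order statistics, so the two reports can differ slightly.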

Computing Statistics for Groups of Observations

To compute statistics for several groups of observations, you can use any of the previous procedures with a BY statement to specify BY-group variables. However, BY-group processing requires that you previously sort or index the data set, which for very large data sets may require substantial computer resources. A more efficient way to compute statistics within groups without sorting is to use a CLASS statement with one of the following procedures: MEANS, SUMMARY, or TABULATE.
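
The two approaches can be compared with a short sketch. It assumes that the sample data set SASHELP.CLASS (with the grouping variable SEX and the numeric variable HEIGHT) is available; the sorted copy WORK.CLASS_SORTED is a hypothetical name used only for illustration.

```sas
/* BY-group processing: the data set must first be sorted */
/* (or indexed) by the BY variable.                       */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;
run;

proc means data=work.class_sorted mean std;
   by sex;
   var height;
run;

/* CLASS statement: the same group statistics are computed */
/* directly from the unsorted data set.                    */
proc means data=sashelp.class mean std;
   class sex;
   var height;
run;
```

The BY version prints a separate report for each group, whereas the CLASS version prints one report with a row for each group.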

Additional Information about the Statistical Procedures

Appendix 1, SAS Elementary Statistics Procedures, on page 1353 lists standard keywords, statistical notation, and formulas for the statistics that base SAS procedures compute frequently. The documentation for each statistical procedure discusses the statistical concepts that are useful for interpreting that procedure's output.




Base SAS 9.1.3 Procedures Guide (Vol. 1)
Base SAS 9.1 Procedures Guide, Volumes 1, 2, 3 and 4
ISBN: 1590472047
Year: 2004
Pages: 260