Tip: Supports the Output Delivery System. See Output Delivery System on page 32 for details.
ODS Table Name: Standard
Reminder: You can use the ATTRIB, FORMAT, LABEL, and WHERE statements. See Chapter 3, Statements with the Same Function in Multiple Procedures, on page 57 for details. You can also use any global statements. See Global Statements on page 18 for a list.
PROC STANDARD < option(s) >;
BY <DESCENDING> variable-1 < <DESCENDING> variable-n >
<NOTSORTED>;
FREQ variable ;
VAR variable(s) ;
WEIGHT variable ;
To do this | Use this statement |
---|---|
Calculate separate standardized values for each BY group | BY |
Identify a variable whose values represent the frequency of each observation | FREQ |
Select the variables to standardize and determine the order in which they appear in the printed output | VAR |
Identify a variable whose values weight each observation in the statistical calculations | WEIGHT |
PROC STANDARD < option(s) >;
To do this | Use this option |
---|---|
Specify the input data set | DATA= |
Specify the output data set | OUT= |
Computational options | |
Exclude observations with nonpositive weights | EXCLNPWGT |
Specify the mean value | MEAN= |
Replace missing values with a variable mean or MEAN= value | REPLACE |
Specify the standard deviation value | STD= |
Specify the divisor for variance calculations | VARDEF= |
Control printed output | |
Print statistics for each variable to standardize | PRINT |
If you do not specify MEAN=, REPLACE, or STD=, the output data set is an identical copy of the input data set.
DATA= SAS-data-set
identifies the input SAS data set.
Main discussion: Input Data Sets on page 19
Restriction: You cannot use PROC STANDARD with an engine that supports concurrent access if another user is updating the data set at the same time.
EXCLNPWGT
excludes observations with nonpositive weight values (zero or negative). The procedure does not use the observation to calculate the mean and standard deviation, but the observation is still standardized. By default, the procedure treats observations with negative weights like those with zero weights and counts them in the total number of observations.
MEAN= mean-value
standardizes variables to a mean of mean-value.
Alias: M=
Default: mean of the input values
Featured in: Example 1 on page 1185
OUT= SAS-data-set
identifies the output data set. If SAS-data-set does not exist, PROC STANDARD creates it. If you omit OUT=, the data set is named DATAn, where n is the smallest integer that makes the name unique.
Default: DATAn
Featured in: Example 1 on page 1185
PRINT
prints the original frequency, mean, and standard deviation for each variable to standardize.
Featured in: Example 2 on page 1187
REPLACE
replaces missing values with the variable mean.
Interaction: If you use MEAN=, PROC STANDARD replaces missing values with the given mean.
Featured in: Example 2 on page 1187
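The choice of fill value that REPLACE makes can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not SAS code: the helper name `replace_missing` and the use of `None` for a missing value are assumptions made for the example, and the sketch deliberately leaves out the standardization that MEAN= also applies to nonmissing values.

```python
from statistics import mean

def replace_missing(values, mean_value=None):
    """Fill missing entries (None) the way REPLACE is described.

    Hypothetical helper for illustration; it shows only which value is
    used as the fill, not the rescaling of the nonmissing values.
    """
    present = [x for x in values if x is not None]
    # Without MEAN=, the fill is the variable mean; with MEAN=, it is that value.
    fill = mean(present) if mean_value is None else mean_value
    return [fill if x is None else x for x in values]

print(replace_missing([1.0, None, 3.0]))                   # [1.0, 2.0, 3.0]
print(replace_missing([1.0, None, 3.0], mean_value=0.0))   # [1.0, 0.0, 3.0]
```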
STD= std-value
standardizes variables to a standard deviation of std-value.
Alias: S=
Default: standard deviation of the input values
Featured in: Example 1 on page 1185
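As a rough illustration of how MEAN= and STD= interact, the following Python sketch rescales a list of values with the usual linear transformation: subtract the input mean, divide by the input standard deviation, multiply by the target standard deviation, and add the target mean. The function name `standardize` and its parameters are hypothetical, and the sketch assumes unweighted data with the default divisor of n - 1. When neither target is given, each default falls back to the input statistic, so the values come back unchanged, which matches the statement that the output is then a copy of the input.

```python
from statistics import mean, stdev

def standardize(values, target_mean=None, target_std=None):
    """Rescale values to a target mean and standard deviation.

    Hypothetical helper, not a SAS API. Omitted targets default to the
    statistics of the input, mirroring the MEAN= and STD= defaults.
    """
    m = mean(values)
    s = stdev(values)  # divisor n - 1, as with the VARDEF=DF default
    new_m = m if target_mean is None else target_mean
    new_s = s if target_std is None else target_std
    return [new_m + new_s * (x - m) / s for x in values]

scores = [2.0, 4.0, 6.0, 8.0]
z = standardize(scores, target_mean=0, target_std=1)  # mean 0, std 1
same = standardize(scores)                            # no targets: unchanged
```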
VARDEF= divisor
specifies the divisor to use in the calculation of variances and standard deviation. Table 46.1 on page 1181 shows the possible values for divisor and the associated divisors.
Value | Divisor | Formula for Divisor |
---|---|---|
DF | degrees of freedom | $n - 1$ |
N | number of observations | $n$ |
WDF | sum of weights minus one | $(\sum_i w_i) - 1$ |
WEIGHT or WGT | sum of weights | $\sum_i w_i$ |
The procedure computes the variance as CSS/divisor, where CSS is the corrected sum of squares and equals $\sum_i (x_i - \bar{x})^2$. When you weight the analysis variables, CSS equals $\sum_i w_i (x_i - \bar{x}_w)^2$, where $\bar{x}_w$ is the weighted mean.
Default: DF
Tip: When you use the WEIGHT statement and VARDEF=DF, the variance is an estimate of $\sigma^2$, where the variance of the $i$th observation is $\mathrm{var}(x_i) = \sigma^2 / w_i$ and $w_i$ is the weight for the $i$th observation. This yields an estimate of the variance of an observation with unit weight.
Tip: When you use the WEIGHT statement and VARDEF=WGT, the computed variance is asymptotically (for large $n$) an estimate of $\sigma^2 / \bar{w}$, where $\bar{w}$ is the average weight. This yields an asymptotic estimate of the variance of an observation with average weight.
Main discussion: Keywords and Formulas on page 1354
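To make the effect of each VARDEF= value concrete, here is a small Python sketch that computes CSS/divisor for the four divisors in the table. It is an illustration under stated assumptions, not SAS code: the function name `variance` is hypothetical, CSS is taken to be the weighted sum of squared deviations about the weighted mean, and all weights are assumed to be positive and nonmissing.

```python
def variance(x, w=None, vardef="DF"):
    """Return CSS / divisor for the chosen VARDEF-style divisor.

    Hypothetical helper; assumes positive, nonmissing weights.
    """
    n = len(x)
    w = [1.0] * n if w is None else w
    sum_w = sum(w)
    # Weighted mean; equals the ordinary mean when all weights are 1.
    xbar_w = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    # Corrected sum of squares about the weighted mean.
    css = sum(wi * (xi - xbar_w) ** 2 for wi, xi in zip(w, x))
    divisors = {
        "DF": n - 1,
        "N": n,
        "WDF": sum_w - 1,
        "WEIGHT": sum_w,
        "WGT": sum_w,
    }
    return css / divisors[vardef]

x = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 2.0, 2.0]
for name in ("DF", "N", "WDF", "WGT"):
    print(name, variance(x, w, vardef=name))
```

With these sample values CSS is 41/6, so the four results differ only in the divisor (3, 4, 5, and 6 respectively).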
Calculates standardized values separately for each BY group.
Main discussion: BY on page 58
Featured in: Example 2 on page 1187
BY <DESCENDING> variable-1 < <DESCENDING> variable-n ><NOTSORTED>;
Required Arguments
variable
specifies the variable that the procedure uses to form BY groups. You can specify more than one variable. If you do not use the NOTSORTED option in the BY statement, the observations in the data set must either be sorted by all the variables that you specify, or they must be indexed appropriately. These variables are called BY variables .
DESCENDING
specifies that the data set is sorted in descending order by the variable that immediately follows the word DESCENDING in the BY statement.
NOTSORTED
specifies that observations are not necessarily sorted in alphabetic or numeric order. The data are grouped in another way, such as chronological order.
The requirement for ordering or indexing observations according to the values of BY variables is suspended for BY-group processing when you use the NOTSORTED option. In fact, the procedure does not use an index if you specify NOTSORTED. The procedure defines a BY group as a set of contiguous observations that have the same values for all BY variables. If observations with the same values for the BY variables are not contiguous, the procedure treats each contiguous set as a separate BY group.
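The contiguous-run definition of a BY group under NOTSORTED behaves much like Python's `itertools.groupby`, which also starts a new group whenever the key changes. The sketch below is only an analogy with made-up sample rows; it is not how SAS implements BY-group processing.

```python
from itertools import groupby

# Hypothetical rows: (BY value, analysis value), in chronological order.
rows = [("Mar", 1), ("Mar", 2), ("Jan", 3), ("Mar", 4)]

# groupby starts a new group each time the key changes, so the two
# separate runs of "Mar" become two separate groups.
groups = [
    (key, [value for _, value in run])
    for key, run in groupby(rows, key=lambda row: row[0])
]

print(groups)  # [('Mar', [1, 2]), ('Jan', [3]), ('Mar', [4])]
```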
Specifies a numeric variable whose values represent the frequency of the observation.
Tip: The effects of the FREQ and WEIGHT statements are similar except when calculating degrees of freedom.
See also: For an example that uses the FREQ statement, see FREQ on page 61
FREQ variable ;
variable
specifies a numeric variable whose value represents the frequency of the observation. If you use the FREQ statement, the procedure assumes that each observation represents n observations, where n is the value of variable. If n is not an integer, the SAS System truncates it. If n is less than 1 or is missing, the procedure does not use that observation to calculate statistics, but the observation is still standardized.
The sum of the frequency variable represents the total number of observations.
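The frequency rules above can be sketched in Python as follows. The helper name `freq_mean` and the use of `None` for a missing frequency are assumptions made for this illustration; the sketch computes only the mean and the total number of observations.

```python
import math

def freq_mean(values, freqs):
    """Mean and observation count under FREQ-style frequencies.

    Hypothetical helper: noninteger frequencies are truncated, and
    observations with a frequency below 1 or missing (None) are skipped.
    """
    total = 0.0
    n = 0
    for x, f in zip(values, freqs):
        if f is None or f < 1:
            continue  # not used to calculate statistics
        k = math.trunc(f)  # 2.7 counts as 2 observations
        total += k * x
        n += k
    return total / n, n

m, n = freq_mean([10.0, 20.0, 30.0, 40.0], [2.7, 1, 0.5, None])
print(m, n)  # n is 3: two copies of 10.0 and one copy of 20.0
```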
Specifies the variables to standardize and their order in the printed output.
Default: If you omit the VAR statement, PROC STANDARD standardizes all numeric variables not listed in the other statements.
Featured in: Example 1 on page 1185
VAR variable(s) ;
variable(s)
identifies one or more variables to standardize.
Specifies weights for analysis variables in the statistical calculations.
See also: For information about calculating weighted statistics and for an example that uses the WEIGHT statement, see WEIGHT on page 63
WEIGHT variable ;
variable
specifies a numeric variable whose values weight the values of the analysis variables. The values of the variable do not have to be integers. The following table shows how PROC STANDARD treats each weight value:
Weight value | PROC STANDARD |
---|---|
0 | counts the observation in the total number of observations |
less than 0 | converts the weight value to zero and counts the observation in the total number of observations |
missing | excludes the observation from the calculation of mean and standard deviation |
To exclude observations that contain negative and zero weights from the calculation of mean and standard deviation, use EXCLNPWGT. Note that most SAS/STAT procedures, such as PROC GLM, exclude negative and zero weights by default.
Tip: When you use the WEIGHT statement, consider which value of the VARDEF= option is appropriate. See VARDEF= on page 1181 and the calculation of weighted statistics in Keywords and Formulas on page 1354 for more information.
Note: Prior to Version 7 of the SAS System, the procedure did not exclude the observations with missing weights from the count of observations.
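The weight-handling rules in this section can be summarized with a short Python sketch. The function name `weight_summary`, the `exclnpwgt` flag, and the use of `None` for a missing weight are all hypothetical; the sketch follows the current behavior described above, in which observations with missing weights are also left out of the observation count.

```python
def weight_summary(values, weights, exclnpwgt=False):
    """Return (observation count, weighted mean) under the rules above.

    Hypothetical helper for illustration, not a SAS API.
    """
    n = 0
    sum_w = 0.0
    sum_wx = 0.0
    for x, w in zip(values, weights):
        if w is None:
            continue  # missing weight: excluded from the calculations
        if w <= 0:
            if exclnpwgt:
                continue  # EXCLNPWGT: drop nonpositive weights entirely
            w = 0.0  # negative weight is treated as zero but still counted
        n += 1
        sum_w += w
        sum_wx += w * x
    return n, sum_wx / sum_w

values = [10.0, 20.0, 30.0, 40.0, 50.0]
weights = [1.0, 0.0, -2.0, None, 3.0]
print(weight_summary(values, weights))                  # (4, 40.0)
print(weight_summary(values, weights, exclnpwgt=True))  # (2, 40.0)
```

The weighted mean is the same in both calls because zero weights contribute nothing to it; only the observation count changes, which matters for divisors such as VARDEF=DF.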