Standardizes the values of one or more variables
Category: Mathematical
CALL STDIZE(<option-1, option-2, ...,> variable-1 <, variable-2, ...>);
option
specifies a character expression whose values can be uppercase, lowercase, or mixed-case letters. Leading and trailing blanks are ignored.
Default: STD
Restriction: Use a separate argument for each option because you cannot specify more than one option in a single argument.
Tip: Character expressions can end with an equal sign that is followed by another argument that is a numeric constant, variable, or expression.
See Also: PROC STDIZE in SAS/STAT User's Guide for information about formulas and other details. The options that are used in CALL STDIZE are the same as those used in PROC STDIZE.
Options fall into the following three categories:
standardization-options
specify how to compute the location and scale measures that are used to standardize the variables. The following standardization options are available:
ABW=
must be followed by an argument that is a numeric expression specifying the tuning constant.
AGK=
must be followed by an argument that is a numeric expression that specifies the proportion of pairs to be included in the estimation of the within-cluster variances.
AHUBER=
must be followed by an argument that is a numeric expression specifying the tuning constant.
AWAVE=
must be followed by an argument that is a numeric expression specifying the tuning constant.
EUCLEN
specifies the Euclidean length.
IQR
specifies the interquartile range.
L=
must be followed by an argument that is a numeric expression with a value greater than or equal to 1 specifying the power to which differences are to be raised in computing an L(p) or Minkowski metric.
MAD
specifies the median absolute deviation from the median.
MAXABS
specifies the maximum absolute values.
MEAN
specifies the arithmetic mean (average).
MEDIAN
specifies the middle number in a set of data that is ordered according to rank.
MIDRANGE
specifies the midpoint of the range.
RANGE
specifies a range of values.
SPACING=
must be followed by an argument that is a numeric expression that specifies the proportion of data to be contained in the spacing.
STD
specifies the standard deviation.
SUM
specifies the result that you obtain when you add numbers.
USTD
specifies the standard deviation about the origin, which is based on the uncorrected sum of squares.
VARDEF-options
specify the divisor to be used in the calculation of variances. VARDEF options can have the following values:
DF
specifies degrees of freedom.
N
specifies the number of observations.
miscellaneous-options
Miscellaneous options can have the following values:
ADD=
is followed by a numeric argument that specifies a number to add to each value after standardizing and multiplying by the value from the MULT= option. The default value is 0.
FUZZ=
is followed by a numeric argument that specifies the relative fuzz factor.
MISSING=
is followed by a numeric argument that specifies a value to be assigned to variables that have a missing value.
MULT=
is followed by a numeric argument that specifies a number by which to multiply each value after standardizing. The default value is 1.
NORM
normalizes the scale estimator to be consistent for the standard deviation of a normal distribution. This option affects only the methods AGK=, IQR, MAD, and SPACING=.
PSTAT
writes the values of the location and scale measures in the log.
REPLACE
replaces missing values with the value 0 in the standardized data (this value corresponds to the location measure before standardizing). To replace missing values by other values, see the MISSING= option.
SNORM
normalizes the scale estimator to have an expectation of approximately 1 for a standard normal distribution.
Tip: This option affects only the SPACING= method.
Default for VARDEF-options: DF
variable
is a numeric variable. Its value is standardized according to the method that you specify.
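The argument rules above can be illustrated with a minimal sketch. The variable names a, b, and c and the constants 100 and 5 are hypothetical and were chosen only to show the syntax: each option is passed as its own character argument, case is not significant, and an option that ends with an equal sign is followed by a separate numeric argument.

```sas
data _null_;
   /* Hypothetical values, used only to illustrate the argument syntax. */
   a = 1; b = 2; c = 3;

   /* One option per argument. 'Range' is mixed case on purpose.        */
   /* 'mult=' and 'add=' end with an equal sign, so each is followed by */
   /* its own numeric argument (100 and 5).                             */
   call stdize('Range', 'mult=', 100, 'add=', 5, a, b, c);

   put a= b= c=;
run;
```

Given the RANGE result of 0, 0.5, and 1 for the values 1, 2, and 3, multiplying by 100 and then adding 5 should write a=5 b=55 c=105 to the log.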
The CALL STDIZE routine transforms one or more arguments that are numeric variables by subtracting a location measure and dividing by a scale measure. You can use a variety of location and scale measures.
In addition, you can multiply each standardized value by a constant and then add a constant. The final output value is
$$\text{result} = \text{add} + \text{mult} \times \frac{\text{original} - \text{location}}{\text{scale}}$$
where
Term | Description |
---|---|
result | specifies the final value that is returned for each variable. |
add | specifies the constant to add (ADD= option). |
mult | specifies the constant to multiply by (MULT= option). |
original | specifies the original input value. |
location | specifies the location measure. |
scale | specifies the scale measure. |
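The terms defined above can be checked by hand. The sketch below assumes that the output is add + mult * ((original - location) / scale), as the ADD= and MULT= descriptions imply, and that the RANGE method uses the minimum as the location measure and the range as the scale measure (as documented for PROC STDIZE). The variable names loc, scl, and manual_y are hypothetical.

```sas
data _null_;
   x = 1; y = 2; z = 3;

   /* Assumption: for RANGE, location = minimum and scale = range. */
   loc  = min(x, y, z);      /* 1 */
   scl  = range(x, y, z);    /* 3 - 1 = 2 */
   add  = 0;                 /* default for ADD= */
   mult = 10;

   /* Apply the formula by hand to the middle value before it is changed. */
   manual_y = add + mult * ((y - loc) / scl);

   /* CALL STDIZE overwrites x, y, and z with their standardized values. */
   call stdize('range', 'mult=', mult, x, y, z);

   put manual_y= y=;
run;
```

Both values should be 5, which agrees with the y=5 result shown for the MULT= example in the results table.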
You can replace missing values by any constant. If you do not specify the MISSING= or the REPLACE option, variables that have missing values are not altered. The initial estimation method for the ABW=, AHUBER=, and AWAVE= methods is MAD. Percentiles are computed using definition 5. For more information about percentile calculations, see 'SAS Elementary Statistics Procedures' in Base SAS Procedures Guide.
The CALL STDIZE routine is similar to the STDIZE procedure in the SAS/STAT product. However, the CALL STDIZE routine is primarily useful for standardizing the rows of a SAS data set, whereas the STDIZE procedure can standardize only the columns of a SAS data set. For more information, see PROC STDIZE in SAS/STAT User's Guide.
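Row-wise standardization might look like the following sketch. The data set names scores and row_std, the variables id and t1-t3, and the data values are all hypothetical. The sketch also assumes that CALL STDIZE accepts an OF variable list, as other CALL routines do.

```sas
/* Hypothetical data set: each observation holds three test scores. */
data scores;
   input id t1 t2 t3;
   datalines;
1 10 20 30
2 5 5 20
3 70 80 75
;
run;

data row_std;
   set scores;
   /* Standardize within each observation. No option is given, so the   */
   /* default STD method is used: each row's mean is subtracted and the */
   /* result is divided by that row's standard deviation.               */
   call stdize(of t1-t3);
run;

proc print data=row_std noobs;
run;
```

Because the routine runs once per observation in the DATA step, each row is standardized independently of the others.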
The following SAS statements produce these results.
SAS Statements | Results |
---|---|
retain x 1 y 2 z 3; call stdize(x,y,z); put x= y= z=; | x=-1 y=0 z=1 |
retain w 10 x 11 y 12 z 13; call stdize('iqr',w,x,y,z); put w= x= y= z=; | w=-0.75 x=-0.25 y=0.25 z=0.75 |
retain w . x 1 y 2 z 3; call stdize('range',w,x,y,z); put w= x= y= z=; | w=. x=0 y=0.5 z=1 |
retain w . x 1 y 2 z 3; call stdize('mult=',10,'missing=', -1,'range',w,x,y,z); put w= x= y= z=; | w=-1 x=0 y=5 z=10 |