Examples: STANDARD Procedure


Example 1: Standardizing to a Given Mean and Standard Deviation

Procedure features:

  • PROC STANDARD statement options:

    • MEAN=

    • OUT=

    • STD=

  • VAR statement

Other features:

  • PRINT procedure

This example

  • standardizes two variables to a mean of 75 and a standard deviation of 5

  • specifies the output data set

  • combines standardized variables with original variables

  • prints the output data set.

Program

Set the SAS system options. The NODATE option suppresses the date and time at which the SAS job began. The PAGENO= option specifies the page number for the next page of output that SAS produces. The LINESIZE= option specifies the line size. The PAGESIZE= option specifies the number of lines for a page of SAS output.

 options nodate pageno=1 linesize=80 pagesize=60; 

Create the SCORE data set. This data set contains test scores for students who took two tests and a final exam. The FORMAT statement assigns the Zw.d format to StudentNumber. This format pads right-justified output with 0s instead of blanks. The LENGTH statement specifies the number of bytes to use to store values of Student.

 data score;
    length Student $ 9;
    input Student $ StudentNumber Section $
          Test1 Test2 Final @@;
    format studentnumber z4.;
    datalines;
 Capalleti 0545 1 94 91 87  Dubose    1252 2 51 65 91
 Engles    1167 1 95 97 97  Grant     1230 2 63 75 80
 Krupski   2527 2 80 69 71  Lundsford 4860 1 92 40 86
 McBane    0674 1 75 78 72  Mullen    6445 2 89 82 93
 Nguyen    0886 1 79 76 80  Patel     9164 2 71 77 83
 Si        4915 1 75 71 73  Tanaka    8534 2 87 73 76
 ;

Generate the standardized data and create the output data set STNDTEST. PROC STANDARD uses a mean of 75 and a standard deviation of 5 to standardize the values. OUT= identifies STNDTEST as the data set to contain the standardized values.

 proc standard data=score mean=75 std=5 out=stndtest; 

Specify the variables to standardize. The VAR statement specifies the variables to standardize and their order in the output.

    var test1 test2;
 run;
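One quick way to confirm that the step behaved as described is to summarize the output data set. The following is a minimal sketch and is not part of the original example; it assumes that the STNDTEST data set created above exists in the current session. The reported mean and standard deviation for Test1 and Test2 should be 75 and 5.

```sas
/* Sketch: summarize the standardized variables in STNDTEST.  */
/* Assumes the PROC STANDARD step above has already run.      */
proc means data=stndtest n mean std maxdec=2;
   var test1 test2;
run;
```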

Create a data set that combines the original values with the standardized values. PROC SQL joins SCORE and STNDTEST to create the COMBINED data set (table) that contains standardized and original test scores for each student. Using AS to rename the standardized variables NEW.TEST1 to StdTest1 and NEW.TEST2 to StdTest2 makes the variable names unique.

 proc sql;
    create table combined as
    select old.student, old.studentnumber,
           old.section,
           old.test1, new.test1 as StdTest1,
           old.test2, new.test2 as StdTest2,
           old.final
    from score as old, stndtest as new
    where old.student=new.student;
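The same combination can also be built without PROC SQL. The sketch below is an alternative and not the method this example uses. It sorts both data sets by Student and merges them in a DATA step, renaming the standardized variables on the way in. The data set names SCORE_SORTED, STND_SORTED, and COMBINED2 are hypothetical and were chosen only to avoid overwriting the data sets in the example.

```sas
/* Sketch: DATA step alternative to the PROC SQL join.          */
/* SCORE_SORTED, STND_SORTED, and COMBINED2 are made-up names.  */
proc sort data=score out=score_sorted;
   by student;
run;

proc sort data=stndtest out=stnd_sorted;
   by student;
run;

data combined2;
   /* KEEP= is applied before RENAME=, so it uses the old names. */
   merge score_sorted
         stnd_sorted(keep=student test1 test2
                     rename=(test1=StdTest1 test2=StdTest2));
   by student;
run;
```

In this version the standardized variables follow Final in the output rather than sitting beside the original scores, so the column order differs from the PROC SQL result.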

Print the data set. PROC PRINT prints the COMBINED data set. ROUND rounds the standardized values to two decimal places. The TITLE statement specifies a title.

 proc print data=combined noobs round;
    title 'Standardized Test Scores for a College Course';
 run;

Output

The data set contains variables with both standardized and original values. StdTest1 and StdTest2 store the standardized test scores that PROC STANDARD computes.

                  Standardized Test Scores for a College Course                1

              Student                         Std               Std
 Student      Number     Section    Test1    Test1    Test2    Test2    Final

 Capalleti     0545         1         94     80.54      91     80.86      87
 Dubose        1252         2         51     64.39      65     71.63      91
 Engles        1167         1         95     80.91      97     82.99      97
 Grant         1230         2         63     68.90      75     75.18      80
 Krupski       2527         2         80     75.28      69     73.05      71
 Lundsford     4860         1         92     79.79      40     62.75      86
 McBane        0674         1         75     73.40      78     76.24      72
 Mullen        6445         2         89     78.66      82     77.66      93
 Nguyen        0886         1         79     74.91      76     75.53      80
 Patel         9164         2         71     71.90      77     75.89      83
 Si            4915         1         75     73.40      71     73.76      73
 Tanaka        8534         2         87     77.91      73     74.47      76
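The values in this output can be checked by hand. As a worked illustration (not part of the original example), each standardized value is the original value shifted and rescaled so that the variable has the requested mean M (MEAN=) and standard deviation S (STD=). The calculation below assumes the procedure's default divisor of n - 1 for the sample standard deviation s, and uses Capalleti's Test1 score.

```latex
x'_i = M + S \cdot \frac{x_i - \bar{x}}{s}

% Test1 over all 12 students: sum = 951, so \bar{x} = 951 / 12 = 79.25
% Sum of squared deviations = 1950.25, so s = \sqrt{1950.25 / 11} \approx 13.315

% Capalleti, Test1: x = 94, M = 75, S = 5
x' = 75 + 5 \cdot \frac{94 - 79.25}{13.315} \approx 80.54
```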

Example 2: Standardizing BY Groups and Replacing Missing Values

Procedure features:

  • PROC STANDARD statement options:

    • PRINT

    • REPLACE

  • BY statement

Other features:

  • FORMAT procedure

  • PRINT procedure

  • SORT procedure

This example

  • calculates Z scores separately for each BY group using a mean of 0 and a standard deviation of 1

  • replaces missing values with the given mean

  • prints the mean and standard deviation for the variables to standardize

  • prints the output data set.

Program

Set the SAS system options. The NODATE option suppresses the date and time at which the SAS job began. The PAGENO= option specifies the page number for the next page of output that SAS produces. The LINESIZE= option specifies the line size. The PAGESIZE= option specifies the number of lines for a page of SAS output.

 options nodate pageno=1 linesize=80 pagesize=60; 

Assign a character string format to a numeric value. PROC FORMAT creates the format POPFMT to identify birth rates with a character value.

 proc format;
    value popfmt 1='Stable'
                 2='Rapid';
 run;

Create the LIFEXP data set. Each observation in this data set contains information on 1950 and 1993 life expectancies at birth for 16 nations. [*] The birth rate for each nation is classified as stable (1) or rapid (2). The nations with missing data obtained independent status after 1950.

 data lifexp;
    input PopulationRate Country $char14. Life50 Life93 @@;
    label life50='1950 life expectancy'
          life93='1993 life expectancy';
    datalines;
 2 Bangladesh     .  53 2 Brazil        51 67
 2 China          41 70 2 Egypt         42 60
 2 Ethiopia       33 46 1 France        67 77
 1 Germany        68 75 2 India         39 59
 2 Indonesia      38 59 1 Japan         64 79
 2 Mozambique      . 47 2 Philippines   48 64
 1 Russia          . 65 2 Turkey        44 66
 1 United Kingdom 69 76 1 United States 69 75
 ;

Sort the LIFEXP data set. PROC SORT sorts the observations by the birth rate.

 proc sort data=lifexp;
    by populationrate;
 run;

Generate the standardized data for all numeric variables and create the output data set ZSCORE. PROC STANDARD standardizes all numeric variables to a mean of 0 and a standard deviation of 1. REPLACE replaces missing values. PRINT prints statistics.

 proc standard data=lifexp mean=0 std=1 replace
               print out=zscore;

Create the standardized values for each BY group. The BY statement standardizes the values separately by birth rate.

 by populationrate; 

Assign a format to a variable and specify a title for the report. The FORMAT statement assigns a format to PopulationRate. The output data set contains formatted values. The TITLE statement specifies a title.

    format populationrate popfmt.;
    title1 'Life Expectancies by Birth Rate';
 run;
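To see what REPLACE contributes, the same step can be run without it. The following is a minimal sketch for contrast and is not part of the original example; it assumes that LIFEXP has already been sorted by PopulationRate and that the POPFMT format has been defined, and the output name ZSCORE_MISSING is hypothetical. Without REPLACE, missing values stay missing in the output data set. With REPLACE and MEAN=, they are set to the MEAN= value; with REPLACE alone, they are set to the variable mean.

```sas
/* Sketch: same standardization without REPLACE.               */
/* ZSCORE_MISSING is a made-up name; missing values of Life50  */
/* remain missing in this output data set.                     */
proc standard data=lifexp mean=0 std=1 out=zscore_missing;
   by populationrate;
   format populationrate popfmt.;
run;
```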

Print the data set. PROC PRINT prints the ZSCORE data set with the standardized values. The TITLE statements specify two titles to print.

 proc print data=zscore noobs;
    title 'Standardized Life Expectancies at Birth';
    title2 'by a Country''s Birth Rate';
 run;

Output

PROC STANDARD prints the variable name, mean, standard deviation, input frequency, and label of each variable to standardize for each BY group.

Life expectancies for Bangladesh, Mozambique, and Russia are no longer missing. The missing values are replaced with the given mean (0).

                        Life Expectancies by Birth Rate                       1

 ---------------------------- PopulationRate=Stable ----------------------------

                                  Standard
    Name              Mean        Deviation             N    Label

    Life50       67.400000         1.854724             5    1950 life expectancy
    Life93       74.500000         4.888763             6    1993 life expectancy

 ----------------------------- PopulationRate=Rapid ----------------------------

                                  Standard
    Name              Mean        Deviation             N     Label

    Life50       42.000000         5.033223             8     1950 life expectancy
    Life93       59.100000         8.225300            10     1993 life expectancy


                    Standardized Life Expectancies at Birth                    2
                           by a Country's Birth Rate

               Population
                  Rate        Country            Life50      Life93

                 Stable       France            -0.21567     0.51138
                 Stable       Germany            0.32350     0.10228
                 Stable       Japan             -1.83316     0.92048
                 Stable       Russia             0.00000    -1.94323
                 Stable       United Kingdom     0.86266     0.30683
                 Stable       United States      0.86266     0.10228
                 Rapid        Bangladesh         0.00000    -0.74161
                 Rapid        Brazil             1.78812     0.96045
                 Rapid        China             -0.19868     1.32518
                 Rapid        Egypt              0.00000     0.10942
                 Rapid        Ethiopia          -1.78812    -1.59265
                 Rapid        India             -0.59604    -0.01216
                 Rapid        Indonesia         -0.79472    -0.01216
                 Rapid        Mozambique         0.00000    -1.47107
                 Rapid        Philippines        1.19208     0.59572
                 Rapid        Turkey             0.39736     0.83888
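The printed group statistics can be used to spot-check individual rows of the listing. As an illustration (not part of the original output), with MEAN=0 and STD=1 each standardized value is the original value minus the mean of its BY group, divided by the standard deviation printed for that group. The two checks below use the Life50 statistics exactly as printed.

```latex
z = \frac{x - \bar{x}_g}{s_g}

% France (Stable), Life50: x = 67, \bar{x}_g = 67.4, s_g = 1.854724
z = \frac{67 - 67.4}{1.854724} \approx -0.21567

% Brazil (Rapid), Life50: x = 51, \bar{x}_g = 42, s_g = 5.033223
z = \frac{51 - 42}{5.033223} \approx 1.78812
```

Egypt's Life50 value of 42 equals the Rapid group mean, which is why its standardized value is 0 even though it was never missing.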

[*] Data are from Vital Signs 1994: The Trends That Are Shaping Our Future, Lester R. Brown, Hal Kane, and David Malin Roodman, eds. Copyright 1994 by Worldwatch Institute. Reprinted by permission of W.W. Norton & Company, Inc.




Base SAS 9.1.3 Procedures Guide (Vol. 1)
Base SAS 9.1 Procedures Guide, Volumes 1, 2, 3 and 4
ISBN: 1590472047
EAN: 2147483647
Year: 2004
Pages: 260
