Details


Missing Values

When computing statistics for an analysis variable, PROC SURVEYMEANS omits observations with missing values for that variable. The procedure bases statistics for each variable only on observations that have nonmissing values for that variable. If you specify the MISSING option described on page 4323 in the PROC SURVEYMEANS statement, the procedure treats missing values of a categorical variable as a valid category.

An observation is also excluded if it has a missing value for any STRATA or CLUSTER variable, unless the MISSING option is used.

If an observation has a missing value or a nonpositive value for the WEIGHT variable, then PROC SURVEYMEANS excludes that observation from the analysis.

The procedure performs univariate analysis and analyzes each VAR variable separately. Thus, the number of missing observations may be different for different variables. You can specify the keyword NMISS in the PROC SURVEYMEANS statement to display the number of missing values for each analysis variable in the 'Statistics' table.

If you have missing values in your survey data for any reason (such as nonresponse), this can compromise the quality of your survey results. An observation without missing values is called a complete respondent, and an observation with missing values is called an incomplete respondent. If the complete respondents are different from the incomplete respondents with regard to a survey effect or outcome, then the survey estimates will be biased and will not accurately represent the survey population.

A variety of techniques in sample design and survey operations can reduce nonresponse. Once data collection is complete, you can use imputation to replace missing values with acceptable values, and you can use sampling weight adjustments to compensate for nonresponse. You should complete this data preparation and adjustment before you analyze your data with PROC SURVEYMEANS. Refer to Cochran (1977), Kalton and Kasprzyk (1986), and Brick and Kalton (1996) for more details.

PROC SURVEYMEANS assumes that missing data are missing at random, because the patterns of missing data are unknown. Therefore, PROC SURVEYMEANS excludes those observations with missing values.

If there is evidence indicating that the missing data are not missing at random, for example, if complete respondents are different from incomplete respondents for your study, you can use the DOMAIN statement to compute the descriptive statistics among complete respondents from your survey data without imputation on incomplete respondents. See Example 70.4 on page 4358.
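As a rough sketch of this approach (not a reproduction of Example 70.4), the following statements flag complete respondents and pass the flag to the DOMAIN statement. The data set name MySurvey and the variables Income, Age, SamplingWeight, and Complete are hypothetical names chosen for illustration.

```sas
/* Hypothetical data set and variable names, for illustration only. */
data SurveyFlagged;
   set MySurvey;
   /* Complete = 1 when Income is nonmissing, 0 otherwise */
   Complete = (nmiss(Income) = 0);
run;

proc surveymeans data=SurveyFlagged mean nobs nmiss;
   var Age;
   weight SamplingWeight;
   domain Complete;   /* statistics for Age within each level of Complete */
run;
```

The level Complete=1 then gives the descriptive statistics among complete respondents, while the variance estimation still uses the entire sample.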

If missing values result in empty strata in the sample, then they will have an impact on the statistical computation, which uses the total number of strata. If all the observations in a stratum have missing weights or missing values for the current analysis variable, this stratum is an empty stratum. For example,

  data new;
     input stratum y z w;
     datalines;
  1 . 13 40
  1 2  9  .
  1 .  5 25
  2 5 10 20
  2 8 60 15
  ;
  proc surveymeans df mean nobs nmiss;
     strata stratum;
     var y z;
     weight w;
  run;

You analyze variables Y and Z, with weight variable W and stratum variable STRATUM. For variable Y, all observations have missing values or missing weights in STRATUM=1; therefore, the analysis for variable Y uses only observations in STRATUM=2. Thus, for variable Y, STRATUM=1 is an empty stratum and STRATUM=2 is a non-empty stratum. Note, however, that STRATUM=1 is a non-empty stratum for variable Z.

If your sample design contains stratification, PROC SURVEYMEANS analyzes only the data in non-empty strata. Therefore, the total number of strata for an analysis variable means the total number of non-empty strata. In this example, the total number of strata for Y and Z is one and two, respectively.

Survey Data Analysis

Specification of Population Totals and Sampling Rates

If your analysis should include a finite population correction (fpc), you can input either the sampling rate or the population total using the RATE= option or the TOTAL= option. (You cannot specify both of these options in the same PROC SURVEYMEANS statement.) If you do not specify one of these options, the procedure does not use the fpc when computing variance estimates. For fairly small sampling fractions, it is appropriate to ignore this correction. Refer to Cochran (1977) and Kish (1965).

If your design has multiple stages of selection and you are specifying the RATE= option, you should input the first-stage sampling rate, which is the ratio of the number of PSUs in the sample to the total number of PSUs in the study population. If you are specifying the TOTAL= option for a multistage design, you should input the total number of PSUs in the study population. See the section 'Primary Sampling Units (PSUs)' on page 4335 for more details.

For a nonstratified sample design, or for a stratified sample design with the same sampling rate or the same population total in all strata, you should use the RATE= value option or the TOTAL= value option. If your sample design is stratified with different sampling rates or population totals in the strata, then you can use the RATE= SAS-data-set option or the TOTAL= SAS-data-set option to name a SAS data set that contains the stratum sampling rates or totals. This data set is called a secondary data set, as opposed to the primary data set that you specify with the DATA= option.

The secondary data set must contain all the stratification variables listed in the STRATA statement and all the variables in the BY statement. If there are formats associated with the STRATA variables and the BY variables, then the formats must be consistent in the primary and the secondary data sets. If you specify the TOTAL= SAS-data-set option, the secondary data set must have a variable named _TOTAL_ that contains the stratum population totals. Or if you specify the RATE= SAS-data-set option, the secondary data set must have a variable named _RATE_ that contains the stratum sampling rates. If the secondary data set contains more than one observation for any one stratum, then the procedure uses the first value of _TOTAL_ or _RATE_ for that stratum and ignores the rest.
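A minimal sketch of a secondary data set for the TOTAL= option might look like the following. The data set names StratumTotals and MySample, the variables Region, Income, and SamplingWeight, and the totals themselves are hypothetical; only the variable name _TOTAL_ is required by the procedure.

```sas
/* Hypothetical stratum population totals, one observation per stratum. */
data StratumTotals;
   input Region $ _TOTAL_;
   datalines;
East  1200
West   950
;

/* Region must also be the STRATA variable in the primary data set. */
proc surveymeans data=MySample total=StratumTotals;
   strata Region;
   var Income;
   weight SamplingWeight;
run;
```

A secondary data set for the RATE= option would have the same layout, with a variable named _RATE_ in place of _TOTAL_.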

The value in the RATE= option or the values of _RATE_ in the secondary data set must be nonnegative numbers. You can specify value as a number between 0 and 1. Or you can specify value in percentage form as a number between 1 and 100, and PROC SURVEYMEANS will convert that number to a proportion. The procedure treats the value 1 as 100%, and not the percentage form 1%.

If you specify the TOTAL= value option, value must not be less than the sample size. If you provide stratum population totals in a secondary data set, these values must not be less than the corresponding stratum sample sizes.
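The value forms of these options can be sketched as follows. The data set MySample, the variables Income and SamplingWeight, and the numbers 4000 and 5% are hypothetical and serve only to illustrate the syntax.

```sas
/* Population total supplied directly (hypothetical value) */
proc surveymeans data=MySample total=4000;
   var Income;
   weight SamplingWeight;
run;

/* A 5% sampling rate written as a proportion */
proc surveymeans data=MySample rate=0.05;
   var Income;
   weight SamplingWeight;
run;

/* The same rate written in percentage form */
proc surveymeans data=MySample rate=5;
   var Income;
   weight SamplingWeight;
run;
```

Because the value 1 is read as 100%, a 1% sampling rate should be written as 0.01.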

Primary Sampling Units (PSUs)

When you have clusters, or primary sampling units (PSUs), in your sample design, the procedure estimates variance from the variation among PSUs. See the section 'Variance and Standard Error of the Mean' on page 4338 and the section 'Variance and Standard Deviation of the Total' on page 4341. You can use the CLUSTER statement to identify the first stage clusters in your design. PROC SURVEYMEANS assumes that each cluster represents a PSU in the sample and that each observation is an element of a PSU. If you do not specify a CLUSTER statement, the procedure treats each observation as a PSU.
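For instance, a stratified design with first-stage clusters might be specified as in the following sketch. The data set MySample and the variables Region, School, Score, and SamplingWeight are hypothetical names.

```sas
proc surveymeans data=MySample mean stderr nclu;
   strata Region;          /* first-stage strata */
   cluster School;         /* first-stage clusters (PSUs) */
   var Score;
   weight SamplingWeight;
run;
```

Omitting the CLUSTER statement from this sketch would make the procedure treat each observation as its own PSU.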

Domain Analysis

It is common practice to compute statistics for subpopulations, or domains, in addition to computing statistics for the entire study population. Analysis for domains using the entire sample is called domain analysis (also called subgroup analysis, subpopulation analysis, or subdomain analysis). The formation of these subpopulations of interest may be unrelated to the sample design. Therefore, the sample sizes for the subpopulations may actually be random variables.

In order to incorporate this variability into the variance estimation, you should use a DOMAIN statement. Note that using a BY statement provides completely separate analyses of the BY groups. It does not provide a statistically valid subpopulation or domain analysis, where the total number of units in the subpopulation is not known with certainty. For more detailed information about domain analysis, refer to Kish (1965).
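The contrast between the two statements can be sketched as follows, using hypothetical names (MySample, Region, School, Score, SamplingWeight, Gender). The first step is a domain analysis; the second runs separate BY-group analyses and is shown only for comparison.

```sas
/* Domain analysis: variance estimation uses the entire sample */
proc surveymeans data=MySample mean stderr;
   strata Region;
   cluster School;
   var Score;
   weight SamplingWeight;
   domain Gender;
run;

/* BY-group processing: completely separate analyses of each group.
   The input data must be sorted by the BY variable. */
proc sort data=MySample out=MySorted;
   by Gender;
run;

proc surveymeans data=MySorted mean stderr;
   by Gender;
   strata Region;
   cluster School;
   var Score;
   weight SamplingWeight;
run;
```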

Statistical Computations

The SURVEYMEANS procedure uses the Taylor expansion method to estimate sampling errors of estimators based on complex sample designs. This method obtains a linear approximation for the estimator and then uses the variance estimate for this approximation to estimate the variance of the estimate itself (Woodruff 1971, Fuller 1975). When there are clusters, or PSUs, in the sample design, the procedure estimates variance from the variation among PSUs. When the design is stratified, the procedure pools stratum variance estimates to compute the overall variance estimate. For t tests of the estimates, the degrees of freedom equals the number of clusters minus the number of strata in the sample design.

For a multistage sample design, the variance estimation method depends only on the first stage of the sample design. So, the required input includes only first-stage cluster (PSU) and first-stage stratum identification. You do not need to input design information about any additional stages of sampling. This variance estimation method assumes that the first-stage sampling fraction is small, or the first-stage sample is drawn with replacement, as it often is in practice.

Quite often in complex surveys, respondents have unequal weights, which reflect unequal selection probabilities and adjustments for nonresponse. In such surveys, the appropriate sampling weights must be used to obtain valid estimates for the study population.

For more information on the analysis of sample survey data, refer to Lee, Forthofer, and Lorimor (1989), Cochran (1977), Kish (1965), and Hansen, Hurwitz, and Madow (1953).

Definition and Notation

For a stratified clustered sample design, together with the sampling weights, the sample can be represented by an $n \times (P + 1)$ matrix

$$(\mathbf{w}, \mathbf{Y}) = \left( w_{hij},\, \mathbf{y}_{hij} \right) = \left( w_{hij},\, y_{hij}^{(1)},\, y_{hij}^{(2)},\, \ldots,\, y_{hij}^{(P)} \right)$$

where

  • $h = 1, 2, \ldots, H$ is the stratum number, with a total of $H$ strata

  • $i = 1, 2, \ldots, n_h$ is the cluster number within stratum $h$, with a total of $n_h$ clusters

  • $j = 1, 2, \ldots, m_{hi}$ is the unit number within cluster $i$ of stratum $h$, with a total of $m_{hi}$ units

  • $p = 1, 2, \ldots, P$ is the analysis variable number, with a total of $P$ variables

  • $n = \sum_{h=1}^{H} \sum_{i=1}^{n_h} m_{hi}$ is the total number of observations in the sample

  • $w_{hij}$ denotes the sampling weight for observation $j$ in cluster $i$ of stratum $h$

  • $\mathbf{y}_{hij} = \left( y_{hij}^{(1)},\, y_{hij}^{(2)},\, \ldots,\, y_{hij}^{(P)} \right)$ are the observed values of the analysis variables for observation $j$ in cluster $i$ of stratum $h$, including both the values of numerical variables and the values of indicator variables for levels of categorical variables.

For a categorical variable $C$, let $l$ denote the number of levels of $C$, and denote the level values as $c_1, c_2, \ldots, c_l$. Then there are $l$ indicator variables associated with these levels. That is, for level $C = c_k$ ($k = 1, 2, \ldots, l$), a $y^{(q)}$ ($q \in \{1, 2, \ldots, P\}$) contains the values of the indicator variable for the category $C = c_k$, with the value of observation $j$ in cluster $i$ of stratum $h$:

$$y_{hij}^{(q)} = I_{\{C = c_k\}}(h, i, j) = \begin{cases} 1 & \text{if observation } (h, i, j) \text{ has } C = c_k \\ 0 & \text{otherwise} \end{cases}$$

Therefore, the total number of analysis variables, P , is the total number of numerical variables plus the total number of levels of all categorical variables.

Also, $f_h$ denotes the sampling rate for stratum $h$. You can use the TOTAL= option or the RATE= option to input population totals or sampling rates. See the section 'Specification of Population Totals and Sampling Rates' on page 4334 for details. If you input stratum totals, PROC SURVEYMEANS computes $f_h$ as the ratio of the stratum sample size to the stratum total. If you input stratum sampling rates, PROC SURVEYMEANS uses these values directly for $f_h$. If you do not specify the TOTAL= option or the RATE= option, then the procedure assumes that the stratum sampling rates $f_h$ are negligible, and a finite population correction is not used when computing variances.

This notation is also applicable to other sample designs. For example, for a sample design without stratification, you can let $H = 1$; for a sample design without clusters, you can let $m_{hi} = 1$ for every $h$ and $i$.

Mean

When you specify the keyword MEAN, the procedure computes the estimate of the mean (mean per element) from the survey data. Also, the procedure computes the mean by default if you do not specify any statistic-keywords in the PROC SURVEYMEANS statement.

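PROC SURVEYMEANS computes the estimate of the mean as

$$\hat{\bar{Y}} = \frac{\sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}\, y_{hij}}{w_{\cdot\cdot\cdot}}$$

where

$$w_{\cdot\cdot\cdot} = \sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}$$

is the sum of the weights over all observations in the sample.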
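As a quick check on this definition, the weighted mean (the weighted sum of the values divided by the sum of the weights) can be worked by hand for the five-observation data set shown in the section 'Missing Values'. For variable Z, the second observation drops out because its weight is missing; for variable Y, only the two observations in STRATUM=2 remain. This is an illustrative hand calculation, not output reproduced from the procedure.

```latex
\begin{align*}
\text{Variable } Z:\quad
\sum w_{hij}\, z_{hij} &= 40(13) + 25(5) + 20(10) + 15(60) = 1745, \\
\sum w_{hij} &= 40 + 25 + 20 + 15 = 100, \\
\hat{\bar{Z}} &= \frac{1745}{100} = 17.45 \\[6pt]
\text{Variable } Y:\quad
\sum w_{hij}\, y_{hij} &= 20(5) + 15(8) = 220, \\
\sum w_{hij} &= 20 + 15 = 35, \\
\hat{\bar{Y}} &= \frac{220}{35} \approx 6.286
\end{align*}
```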

Variance and Standard Error of the Mean

When you specify the keyword STDERR, the procedure computes the standard error of the mean. Also, the procedure computes the standard error by default if you specify the keyword MEAN, or if you do not specify any statistic-keywords in the PROC SURVEYMEANS statement. The keyword VAR requests the variance of the mean.

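PROC SURVEYMEANS uses the Taylor series expansion theory to estimate the variance of the mean $\hat{\bar{Y}}$. The procedure computes the estimated variance as

$$\hat{V}(\hat{\bar{Y}}) = \sum_{h=1}^{H} \hat{V}_h(\hat{\bar{Y}})$$

where if $n_h > 1$,

$$\hat{V}_h(\hat{\bar{Y}}) = \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( e_{hi\cdot} - \bar{e}_{h\cdot\cdot} \right)^2$$

$$e_{hi\cdot} = \frac{\sum_{j=1}^{m_{hi}} w_{hij} \left( y_{hij} - \hat{\bar{Y}} \right)}{w_{\cdot\cdot\cdot}}, \qquad \bar{e}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} e_{hi\cdot}$$

with $w_{\cdot\cdot\cdot}$ the sum of the weights over all observations in the sample, and if $n_h = 1$,

$$\hat{V}_h(\hat{\bar{Y}}) = \begin{cases} \text{missing} & \text{if } n_{h'} = 1 \text{ for } h' = 1, 2, \ldots, H \\ 0 & \text{if } n_{h'} > 1 \text{ for some } 1 \le h' \le H \end{cases}$$

The standard error of the mean is the square root of the estimated variance.

$$\mathrm{StdErr}(\hat{\bar{Y}}) = \sqrt{\hat{V}(\hat{\bar{Y}})}$$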
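The following hand calculation sketches how such a variance estimate comes together for variable Z in the five-observation data set from the section 'Missing Values'. It assumes the usual stratified, with-replacement Taylor linearization form, $\hat{V}_h = \frac{n_h(1-f_h)}{n_h-1}\sum_i (e_{hi\cdot} - \bar{e}_{h\cdot\cdot})^2$ with $e_{hi\cdot} = \sum_j w_{hij}(z_{hij} - \hat{\bar{Z}}) / \sum w_{hij}$, no finite population correction ($f_h = 0$), and each observation treated as its own PSU because there is no CLUSTER statement. The numbers are worked by hand for illustration and are not output reproduced from the procedure.

```latex
\begin{align*}
\hat{\bar{Z}} &= 17.45, \qquad \textstyle\sum w_{hij} = 100 \\[4pt]
\text{Stratum 1:}\quad
e_{11\cdot} &= \tfrac{40(13 - 17.45)}{100} = -1.78, \quad
e_{12\cdot} = \tfrac{25(5 - 17.45)}{100} = -3.1125, \quad
\bar{e}_{1\cdot\cdot} = -2.44625 \\
\hat{V}_1 &= \tfrac{2}{2-1}\left[ (0.66625)^2 + (-0.66625)^2 \right] \approx 1.7756 \\[4pt]
\text{Stratum 2:}\quad
e_{21\cdot} &= \tfrac{20(10 - 17.45)}{100} = -1.49, \quad
e_{22\cdot} = \tfrac{15(60 - 17.45)}{100} = 6.3825, \quad
\bar{e}_{2\cdot\cdot} = 2.44625 \\
\hat{V}_2 &= \tfrac{2}{2-1}\left[ (-3.93625)^2 + (3.93625)^2 \right] \approx 61.9763 \\[4pt]
\hat{V}(\hat{\bar{Z}}) &\approx 1.7756 + 61.9763 = 63.7518, \qquad
\mathrm{StdErr}(\hat{\bar{Z}}) = \sqrt{63.7518} \approx 7.98
\end{align*}
```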

Ratio

When you use a RATIO statement, the procedure produces statistics requested by the statistics-keywords in the PROC SURVEYMEANS statement.

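Suppose that you want to calculate the ratio of variable $Y$ over variable $X$. Let $x_{hij}$ be the value of variable $X$ for the $j$th member in cluster $i$ in the $h$th stratum.

The ratio of $Y$ over $X$ is

$$\hat{R} = \frac{\sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}\, y_{hij}}{\sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}\, x_{hij}}$$

PROC SURVEYMEANS uses the Taylor series expansion method to estimate the variance of the ratio $\hat{R}$ as

$$\hat{V}(\hat{R}) = \sum_{h=1}^{H} \hat{V}_h(\hat{R})$$

where if $n_h > 1$,

$$\hat{V}_h(\hat{R}) = \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( g_{hi\cdot} - \bar{g}_{h\cdot\cdot} \right)^2$$

$$g_{hi\cdot} = \frac{\sum_{j=1}^{m_{hi}} w_{hij} \left( y_{hij} - x_{hij} \hat{R} \right)}{\sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}\, x_{hij}}, \qquad \bar{g}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} g_{hi\cdot}$$

and if $n_h = 1$,

$$\hat{V}_h(\hat{R}) = \begin{cases} \text{missing} & \text{if } n_{h'} = 1 \text{ for } h' = 1, 2, \ldots, H \\ 0 & \text{if } n_{h'} > 1 \text{ for some } 1 \le h' \le H \end{cases}$$

The standard error of the ratio is the square root of the estimated variance.

$$\mathrm{StdErr}(\hat{R}) = \sqrt{\hat{V}(\hat{R})}$$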
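A ratio analysis can be requested along the following lines. The data set MySample and the variables Expense, Income, and SamplingWeight are hypothetical; the numerator variable precedes the slash in the RATIO statement and the denominator variable follows it.

```sas
proc surveymeans data=MySample;
   var Expense Income;
   weight SamplingWeight;
   ratio Expense / Income;   /* ratio of Expense over Income */
run;
```

With no statistic-keywords in the PROC SURVEYMEANS statement, the 'Ratio Analysis' table shows the ratio and its standard error.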

t Test for the Mean

If you specify the keyword T, PROC SURVEYMEANS computes the t-value for testing that the population mean equals zero, $H_0: \bar{Y} = 0$. The test statistic equals

$$t(\hat{\bar{Y}}) = \frac{\hat{\bar{Y}}}{\mathrm{StdErr}(\hat{\bar{Y}})}$$

The two-sided p-value for this test is

$$\mathrm{Prob}\left( |T| > \left| t(\hat{\bar{Y}}) \right| \right)$$

where $T$ is a random variable with the t distribution with df degrees of freedom.

PROC SURVEYMEANS calculates the degrees of freedom for the t test as the number of clusters minus the number of strata. If there are no clusters, then df equals the number of observations minus the number of strata. If the design is not stratified, then df equals the number of clusters minus one. The procedure displays df for the t test if you specify the keyword DF in the PROC SURVEYMEANS statement.

If missing values or missing weights are present in your data, the number of strata, the number of observations, and the number of clusters are counted based on the observations in non-empty strata. See the section 'Missing Values' on page 4333 for details. For degrees of freedom in domain analysis, see the section 'Domain Statistics' on page 4342.
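Applied to the five-observation data set in the section 'Missing Values', which has strata but no clusters, the rule gives the following degrees of freedom. Variable Z has four usable observations in two non-empty strata, and variable Y has two usable observations in one non-empty stratum.

```latex
\begin{align*}
df_Z &= (\text{observations in non-empty strata}) - (\text{non-empty strata}) = 4 - 2 = 2 \\
df_Y &= 2 - 1 = 1
\end{align*}
```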

Confidence Limits for the Mean

If you specify the keyword CLM, the procedure computes two-sided confidence limits for the mean. Also, the procedure includes the confidence limits by default if you do not specify any statistic-keywords in the PROC SURVEYMEANS statement.

The confidence coefficient is determined by the value of the ALPHA= option, which by default equals 0.05 and produces 95% confidence limits. The confidence limits are computed as

$$\hat{\bar{Y}} \pm \mathrm{StdErr}(\hat{\bar{Y}}) \cdot t_{df,\, \alpha/2}$$

where $\hat{\bar{Y}}$ is the estimate of the mean, $\mathrm{StdErr}(\hat{\bar{Y}})$ is the standard error of the mean, and $t_{df,\, \alpha/2}$ is the $100(1 - \alpha/2)$ percentile of the t distribution with df calculated as described in the section 't Test for the Mean' on page 4339.

If you specify the keyword UCLM, the procedure computes the one-sided upper $100(1 - \alpha)\%$ confidence limit for the mean:

$$\hat{\bar{Y}} + \mathrm{StdErr}(\hat{\bar{Y}}) \cdot t_{df,\, \alpha}$$

If you specify the keyword LCLM, the procedure computes the one-sided lower $100(1 - \alpha)\%$ confidence limit for the mean:

$$\hat{\bar{Y}} - \mathrm{StdErr}(\hat{\bar{Y}}) \cdot t_{df,\, \alpha}$$
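For example, the following sketch requests 90% limits rather than the default 95% limits. The data set MySample and the variables Income and SamplingWeight are hypothetical names.

```sas
proc surveymeans data=MySample alpha=0.10 mean stderr df clm uclm lclm;
   var Income;
   weight SamplingWeight;
run;
```

With ALPHA=0.10, CLM yields two-sided 90% limits, and UCLM and LCLM each yield a one-sided 90% limit.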

Coefficient of Variation

If you specify the keyword CV, PROC SURVEYMEANS computes the coefficient of variation, which is the ratio of the standard error of the mean to the estimated mean.

$$cv(\hat{\bar{Y}}) = \frac{\mathrm{StdErr}(\hat{\bar{Y}})}{\hat{\bar{Y}}}$$

If you specify the keyword CVSUM, PROC SURVEYMEANS computes the coefficient of variation for the estimated total, which is the ratio of the standard deviation of the sum to the estimated total.

$$cv(\hat{Y}) = \frac{\mathrm{Std}(\hat{Y})}{\hat{Y}}$$

Proportions

If you specify the keyword MEAN for a categorical variable, PROC SURVEYMEANS estimates the proportion, or relative frequency, for each level of the categorical variable. If you do not specify any statistic-keywords in the PROC SURVEYMEANS statement, the procedure estimates the proportions for levels of the categorical variables, together with their standard errors and confidence limits.

The procedure estimates the proportion in level $c_k$ for variable $C$ as

$$\hat{p} = \frac{\sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}\, y_{hij}^{(q)}}{\sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}}$$

where $y_{hij}^{(q)}$ is the value of the indicator function for level $C = c_k$, defined in the section 'Definition and Notation' on page 4336, and equals 1 if the observed value of variable $C$ equals $c_k$, and equals 0 otherwise. Since the proportion estimator is actually an estimator of the mean for an indicator variable, the procedure computes its variance and standard error according to the method outlined in the section 'Variance and Standard Error of the Mean' on page 4338. Similarly, the procedure computes confidence limits for proportions as described in the section 'Confidence Limits for the Mean' on page 4340.
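A proportion analysis for a numeric variable that should be treated as categorical might be set up as in this sketch. The data set MySample and the variables Group and SamplingWeight are hypothetical names.

```sas
proc surveymeans data=MySample mean stderr clm;
   class Group;            /* treat the numeric variable Group as categorical */
   var Group;
   weight SamplingWeight;
run;
```

Each level of Group then gets its own row in the 'Statistics' table, with the mean column holding the estimated proportion for that level.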

Total

If you specify the keyword SUM, the procedure computes the estimate of the population total from the survey data. The estimate of the total is the weighted sum over the sample.

$$\hat{Y} = \sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}\, y_{hij}$$

For a categorical variable level, $\hat{Y}$ estimates its total frequency in the population.

Variance and Standard Deviation of the Total

When you specify the keyword STD or the keyword SUM, the procedure estimates the standard deviation of the total. The keyword VARSUM requests the variance of the total.

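PROC SURVEYMEANS estimates the variance of the total as

$$\hat{V}(\hat{Y}) = \sum_{h=1}^{H} \hat{V}_h(\hat{Y})$$

where if $n_h > 1$,

$$\hat{V}_h(\hat{Y}) = \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( y_{hi\cdot} - \bar{y}_{h\cdot\cdot} \right)^2$$

$$y_{hi\cdot} = \sum_{j=1}^{m_{hi}} w_{hij}\, y_{hij}, \qquad \bar{y}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} y_{hi\cdot}$$

and if $n_h = 1$,

$$\hat{V}_h(\hat{Y}) = \begin{cases} \text{missing} & \text{if } n_{h'} = 1 \text{ for } h' = 1, 2, \ldots, H \\ 0 & \text{if } n_{h'} > 1 \text{ for some } 1 \le h' \le H \end{cases}$$

The standard deviation of the total equals

$$\mathrm{Std}(\hat{Y}) = \sqrt{\hat{V}(\hat{Y})}$$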

Confidence Limits of a Total

If you specify the keyword CLSUM, the procedure computes confidence limits for the total. The confidence coefficient is determined by the value of the ALPHA= option, which by default equals 0.05 and produces 95% confidence limits. The confidence limits are computed as

$$\hat{Y} \pm \mathrm{Std}(\hat{Y}) \cdot t_{df,\, \alpha/2}$$

where $\hat{Y}$ is the estimate of the total, $\mathrm{Std}(\hat{Y})$ is the estimated standard deviation, and $t_{df,\, \alpha/2}$ is the $100(1 - \alpha/2)$ percentile of the t distribution with df calculated as described in the section 't Test for the Mean' on page 4339.

If you specify the keyword UCLSUM, the procedure computes the one-sided upper $100(1 - \alpha)\%$ confidence limit for the sum:

$$\hat{Y} + \mathrm{Std}(\hat{Y}) \cdot t_{df,\, \alpha}$$

If you specify the keyword LCLSUM, the procedure computes the one-sided lower $100(1 - \alpha)\%$ confidence limit for the sum:

$$\hat{Y} - \mathrm{Std}(\hat{Y}) \cdot t_{df,\, \alpha}$$
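The statistics for a total can be requested together, as in the following sketch. The data set MySample, the variables Income and SamplingWeight, and the population total 4000 are hypothetical.

```sas
proc surveymeans data=MySample total=4000 sum std varsum cvsum clsum;
   var Income;
   weight SamplingWeight;
run;
```

Here TOTAL= supplies the population size for the finite population correction; it can be dropped when the sampling fraction is small.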

Domain Statistics

When you use a DOMAIN statement to request a domain analysis, the procedure computes the requested statistics for each domain.

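For a domain $D$, let $I_D$ be the corresponding indicator variable:

$$I_D(h, i, j) = \begin{cases} 1 & \text{if observation } (h, i, j) \text{ belongs to } D \\ 0 & \text{otherwise} \end{cases}$$

Let

$$z_{hij} = y_{hij}\, I_D(h, i, j) = \begin{cases} y_{hij} & \text{if observation } (h, i, j) \text{ belongs to } D \\ 0 & \text{otherwise} \end{cases}$$

The requested statistics for variable $y$ in domain $D$ are computed based on the values of $z$.

Domain Mean

The estimated mean of $y$ in the domain $D$ is

$$\hat{\bar{Y}}_D = \frac{\sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} v_{hij}\, z_{hij}}{v_{\cdot\cdot\cdot}}$$

where

$$v_{hij} = w_{hij}\, I_D(h, i, j), \qquad v_{\cdot\cdot\cdot} = \sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} v_{hij}$$

The variance of $\hat{\bar{Y}}_D$ is estimated by

$$\hat{V}(\hat{\bar{Y}}_D) = \sum_{h=1}^{H} \hat{V}_h(\hat{\bar{Y}}_D)$$

where if $n_h > 1$,

$$\hat{V}_h(\hat{\bar{Y}}_D) = \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( r_{hi\cdot} - \bar{r}_{h\cdot\cdot} \right)^2$$

$$r_{hi\cdot} = \frac{\sum_{j=1}^{m_{hi}} v_{hij} \left( z_{hij} - \hat{\bar{Y}}_D \right)}{v_{\cdot\cdot\cdot}}, \qquad \bar{r}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} r_{hi\cdot}$$

and if $n_h = 1$,

$$\hat{V}_h(\hat{\bar{Y}}_D) = \begin{cases} \text{missing} & \text{if } n_{h'} = 1 \text{ for } h' = 1, 2, \ldots, H \\ 0 & \text{if } n_{h'} > 1 \text{ for some } 1 \le h' \le H \end{cases}$$

Domain Total

The estimated total in domain $D$ is

$$\hat{Y}_D = \sum_{h=1}^{H} \sum_{i=1}^{n_h} \sum_{j=1}^{m_{hi}} w_{hij}\, z_{hij}$$

and its estimated variance is

$$\hat{V}(\hat{Y}_D) = \sum_{h=1}^{H} \hat{V}_h(\hat{Y}_D)$$

where if $n_h > 1$,

$$\hat{V}_h(\hat{Y}_D) = \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( z_{hi\cdot} - \bar{z}_{h\cdot\cdot} \right)^2$$

$$z_{hi\cdot} = \sum_{j=1}^{m_{hi}} w_{hij}\, z_{hij}, \qquad \bar{z}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} z_{hi\cdot}$$

and if $n_h = 1$,

$$\hat{V}_h(\hat{Y}_D) = \begin{cases} \text{missing} & \text{if } n_{h'} = 1 \text{ for } h' = 1, 2, \ldots, H \\ 0 & \text{if } n_{h'} > 1 \text{ for some } 1 \le h' \le H \end{cases}$$

Degrees of Freedom

For domain analysis, PROC SURVEYMEANS computes the degrees of freedom for t tests as the number of clusters in the non-empty strata minus the number of non-empty strata. When the sample design has no clusters, the degrees of freedom equals the number of observations in non-empty strata minus the number of non-empty strata. As discussed in the section 'Missing Values' on page 4333, missing values and missing weights can result in empty strata. In domain analysis, an empty stratum can also occur when the stratum contains no observations in the specified domain. If no observations in a whole stratum belong to a domain, then this stratum is called an empty stratum for that domain.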

For example,

  data new;
     input str clu y w d;
     datalines;
  1 1 . 40 9
  1 2 2  . 9
  1 3 . 25 9
  2 4 5 20 9
  2 5 8 15 9
  3 6 5 30 7
  3 7 9 89 7
  3 8 6 23 7
  ;
  proc surveymeans df nobs nclu nmiss;
     strata str;
     cluster clu;
     var y;
     weight w;
     domain d;
  run;

Table 70.2: Calculations of df for Y

                                   Domain D=7                   Domain D=9
  ------------------------------------------------------------------------------
  Non-Empty Strata                 STR=3                        STR=2
  Clusters Used in the Analysis    CLU=6, CLU=7, and CLU=8      CLU=4 and CLU=5
  df                               3 − 1 = 2                    2 − 1 = 1

Although there are three strata in the data set, STR=1 is an empty stratum for variable Y because of missing values and missing weights. In addition, no observations in stratum STR=3 belong to domain D=9. Therefore, STR=3 becomes an empty stratum as well for variable Y in domain D=9. As a result, the total number of non-empty strata for domain D=9 is one. The non-empty stratum for domain D=9 and variable Y is stratum STR=2. The total number of clusters for domain D=9 is two, which belong to stratum STR=2. Thus, for variable Y in domain D=9, the degrees of freedom for the t tests of the domain mean is df = 2 − 1 = 1. Similarly, for domain D=7, strata STR=1 and STR=2 are both empty strata, so the total number of strata is one (STR=3), and the total number of clusters is three (CLU=6, CLU=7, and CLU=8). Table 70.2 illustrates how domains affect the total number of clusters and total number of strata in the df calculation. Figure 70.8 shows the df computed by the procedure.

  The SURVEYMEANS Procedure

  Domain Analysis: d

  d    Variable               N          N Miss        Clusters        DF
  ------------------------------------------------------------------------------
  7    y                      3               0               3         6
  9    y                      2               2               2         4
  ------------------------------------------------------------------------------

Figure 70.8: Degrees of Freedom in Domain Analysis

Output

Output Data Sets

Output data sets from PROC SURVEYMEANS are produced using ODS (Output Delivery System). ODS encompasses more than just the production of output data sets. For example, you can use ODS to manipulate the format of your output, the headers and titles of the tables, and the order of the columns in a table. For a more detailed description on using ODS, see Chapter 14, 'Using the Output Delivery System.'

Displayed Output

By default PROC SURVEYMEANS displays a 'Data Summary' table and a 'Statistics' table. If you specify CLASS variables, or if you specify any character variables in the VAR statement, then the procedure displays a 'Class Level Information' table. If you specify the LIST option in the STRATA statement, then the procedure displays a 'Stratum Information' table. If you have a DOMAIN statement, the procedure displays a 'Domain Analysis' table. If you have a RATIO statement, the procedure displays a 'Ratio Analysis' table.

Data and Sample Design Summary

  • The 'Data Summary' table provides information on the input data set and the sample design. This table displays the total number of valid observations, where an observation is considered valid if it has nonmissing values for all procedure variables other than the analysis variables; that is, for all specified STRATA, CLUSTER, and WEIGHT variables. This number may differ from the number of nonmissing observations for an individual analysis variable, which the procedure displays in the 'Statistics' table. See the section 'Missing Values' on page 4333 for more information.

  • PROC SURVEYMEANS displays the following information in the 'Data Summary' table:

    • Number of Strata, if you specify a STRATA statement

    • Number of Clusters, if you specify a CLUSTER statement

    • Number of Observations, which is the total number of valid observations

    • Sum of Weights, which is the sum over all valid observations, if you specify a WEIGHT statement

Class Level Information

  • If you use a CLASS statement to name classification variables for categorical analysis, or if you list any character variables in the VAR statement, then PROC SURVEYMEANS displays a 'Class Level Information' table. This table contains the following information for each classification variable:

    • Class Variable, which lists each CLASS variable name

    • Levels, which is the number of values or levels of the classification variable

    • Values, which lists the values of the classification variable. The values are separated by a white space character; therefore, to avoid confusion, you should not include a white space character within a classification variable value.

Stratum Information

  • If you specify the LIST option in the STRATA statement, PROC SURVEYMEANS displays a 'Stratum Information' table. This table displays the number of valid observations in each stratum, as well as the number of nonmissing stratum observations for each analysis variable. The 'Stratum Information' table provides the following for each stratum:

    • Stratum Index, which is a sequential stratum identification number

    • STRATA variable(s), which lists the levels of STRATA variables for the stratum

    • Population Total, if you specify the TOTAL= option

    • Sampling Rate, if you specify the TOTAL= option or the RATE= option. If you specify the TOTAL= option, the sampling rate is based on the number of valid observations in the stratum.

    • N Obs, which is the number of valid observations

    • Variable, which lists each analysis variable name

    • Levels, which identifies each level for categorical variables

    • N, which is the number of nonmissing observations for the analysis variable

    • Clusters, which is the number of clusters, if you specify a CLUSTER statement

Statistics

  • The 'Statistics' table displays all of the statistics that you request with statistic-keywords described on page 4326 in the PROC SURVEYMEANS statement. If you do not specify any statistic-keywords, then by default this table displays the following information for each analysis variable: the sample size, the mean, the standard error of the mean, and the confidence limits for the mean. The 'Statistics' table may contain the following information for each analysis variable, depending on which statistic-keywords you request:

    • Variable name

    • Level, which identifies each level for categorical variables

    • N, which is the number of nonmissing observations

    • N Miss, which is the number of missing observations

    • Minimum

    • Maximum

    • Range

    • Number of Clusters

    • Sum of Weights

    • DF, which is the degrees of freedom for the t test

    • Mean

    • Std Error of Mean, which is the standard error of the mean

    • Var of Mean, which is the variance of the mean

    • t Value, for testing H0: population MEAN = 0

    • Pr > |t|, which is the two-sided p-value for the t test

    • 100(1 − α)% CL for Mean, which are two-sided confidence limits for the mean

    • 100(1 − α)% Upper CL for Mean, which are one-sided upper confidence limits for the mean

    • 100(1 − α)% Lower CL for Mean, which are one-sided lower confidence limits for the mean

    • Coeff of Variation, which are the coefficients of variation for the mean and the sum

    • Sum

    • Std Dev, which is the standard deviation of the sum

    • Var of Sum, which is the variance of the sum

    • 100(1 − α)% CL for Sum, which are two-sided confidence limits for the sum

    • 100(1 − α)% Upper CL for Sum, which are one-sided upper confidence limits for the sum

    • 100(1 − α)% Lower CL for Sum, which are one-sided lower confidence limits for the sum

Domain Analysis

  • If you use a DOMAIN statement, the procedure displays statistics in each domain in a 'Domain Analysis' table. A 'Domain Analysis' table contains all the columns in the 'Statistics' table, plus columns of domain variable values.

  • Note that depending on how you define the domains with domain variables, the procedure may produce more than one 'Domain Analysis' table. For example, in the following DOMAIN statement

      domain A B*C*D A*C C;  
  • you use four definitions to define domains:

    • A : all the levels of A

    • C : all the levels of C

    • A*C : all the interactive levels of A and C

    • B*C*D : all the interactive levels of B , C , and D

  • The procedure displays four 'Domain Analysis' tables, one for each domain definition. However, if you use an ODS OUTPUT statement to create an output data set for domain analysis, the output data set contains a variable Domain whose values are these domain definitions.
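A sketch of capturing the domain statistics in a data set follows. The data set MySample and the variables Score, SamplingWeight, Gender, and Region are hypothetical, as is the output data set name MyDomain; the ODS table name Domain is the one listed in the section 'ODS Table Names'.

```sas
proc surveymeans data=MySample mean stderr;
   var Score;
   weight SamplingWeight;
   domain Gender Region*Gender;     /* two domain definitions */
   ods output Domain=MyDomain;      /* both definitions go to one data set */
run;

proc print data=MyDomain;
run;
```

In the printed data set, the variable Domain distinguishes the rows that belong to each domain definition.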

Ratio Analysis

  • The 'Ratio Analysis' table displays all of the statistics that you request with statistic-keywords in the PROC statement described on page 4326. If you do not specify any statistic-keywords, then by default this table displays the ratio and its standard error. The 'Ratio Analysis' table may contain the following information for each ratio, depending on which statistic-keywords you request:

    • Numerator, which identifies the numerator variable of the ratio

    • Denominator, which identifies the denominator variable of the ratio

    • N, which is the number of observations used in the ratio analysis

    • Number of Clusters

    • Sum of Weights

    • DF, which is the degrees of freedom for the t test

    • Ratio

    • Std Error of Ratio, which is the standard error of the ratio

    • Var of Ratio, which is the variance of the ratio

    • t Value, for testing H0: population RATIO = 0

    • Pr > |t|, which is the two-sided p-value for the t test

    • 100(1 − α)% CL for Ratio, which are two-sided confidence limits for the ratio

    • Upper 100(1 − α)% CL for Ratio, which are one-sided upper confidence limits for the ratio

    • Lower 100(1 − α)% CL for Ratio, which are one-sided lower confidence limits for the ratio

  • When you use an ODS OUTPUT statement to create an output data set, if you use labels for your RATIO statement, these labels are saved in a variable Ratio Statement in the output data set.

ODS Table Names

PROC SURVEYMEANS assigns a name to each table it creates. You can use these names to reference the table when using the Output Delivery System (ODS) to select tables and create output data sets. These names are listed in the following table. For more information on ODS, see Chapter 14, 'Using the Output Delivery System.'

Table 70.3: ODS Tables Produced in PROC SURVEYMEANS

  ODS Table Name    Description                Statement    Option
  ------------------------------------------------------------------
  ClassVarInfo      Class level information    CLASS        default
  Domain            Statistics in domains      DOMAIN       default
  Ratio             Statistics for ratios      RATIO        default
  Statistics        Statistics                 PROC         default
  StrataInfo        Stratum information        STRATA       LIST
  Summary           Data summary               PROC         default

For example, the following statements create an output data set named MyStrata , which contains the 'StrataInfo' table, and an output data set named MyStat , which contains the 'Statistics' table for the ice cream study discussed in the section 'Stratified Sampling' on page 4318:

  title1 'Analysis of Ice Cream Spending';
  title2 'Stratified Simple Random Sample Design';
  proc surveymeans data=IceCream total=StudentTotals;
     strata Grade / list;
     var Spending Group;
     weight Weight;
     ods output StrataInfo = MyStrata
                Statistics = MyStat;
  run;



SAS.STAT 9.1 Users Guide (Vol. 6)
Year: 2004