Syntax


The following statements are available in PROC FREQ.

  • PROC FREQ < options > ;

    • BY variables ;

    • EXACT statistic-options < / computation-options > ;

    • OUTPUT < OUT= SAS-data-set > options ;

    • TABLES requests < / options > ;

    • TEST options ;

    • WEIGHT variable < / option > ;

The PROC FREQ statement is the only required statement for the FREQ procedure. If you specify the following statements, PROC FREQ produces a one-way frequency table for each variable in the most recently created data set.

  proc freq;
  run;

The rest of this section gives detailed syntax information for the BY, EXACT, OUTPUT, TABLES, TEST, and WEIGHT statements in alphabetical order after the description of the PROC FREQ statement. Table 2.3 summarizes the basic functions of each statement.

Table 2.3: Summary of PROC FREQ Statements

Statement   Description
BY          calculates separate frequency or crosstabulation tables for each BY group.
EXACT       requests exact tests for specified statistics.
OUTPUT      creates an output data set that contains specified statistics.
TABLES      specifies frequency or crosstabulation tables and requests tests and measures of association.
TEST        requests asymptotic tests for measures of association and agreement.
WEIGHT      identifies a variable with values that weight each observation.

PROC FREQ Statement

  • PROC FREQ < options > ;

The PROC FREQ statement invokes the procedure.

The following table lists the options available in the PROC FREQ statement. Descriptions follow in alphabetical order.

Table 2.4: PROC FREQ Statement Options

Option      Description
DATA=       specifies the input data set.
COMPRESS    begins the next one-way table on the current page.
FORMCHAR=   specifies the outline and cell divider characters for the cells of the crosstabulation table.
NLEVELS     displays the number of levels for all TABLES variables.
NOPRINT     suppresses all displayed output.
ORDER=      specifies the order for listing variable values.
PAGE        displays one table per page.

You can specify the following options in the PROC FREQ statement.

COMPRESS

  • begins display of the next one-way frequency table on the same page as the preceding one-way table if there is enough space to begin the table. By default, the next one-way table begins on the current page only if the entire table fits on that page. The COMPRESS option is not valid with the PAGE option.

DATA= SAS-data-set

  • names the SAS data set to be analyzed by PROC FREQ. If you omit the DATA= option, the procedure uses the most recently created SAS data set.

FORMCHAR (1,2,7) = formchar-string

  • defines the characters to be used for constructing the outlines and dividers for the cells of contingency tables. The FORMCHAR= option can specify 20 different SAS formatting characters used to display output; however, PROC FREQ uses only the first, second, and seventh formatting characters. Therefore, the proper specification for PROC FREQ is FORMCHAR(1,2,7)=formchar-string. The formchar-string should be three characters long. The characters denote (1) the vertical separator, (2) the horizontal separator, and (7) the vertical-horizontal intersection. You can use any character in formchar-string, including hexadecimal characters. If you use hexadecimal characters, you must put an x after the closing quote. For information on which hexadecimal codes to use for which characters, consult the documentation for your hardware.

  • Specifying all blanks for formchar-string produces tables with no outlines or dividers:

      formchar (1,2,7)='   '

  • If you do not specify the FORMCHAR= option, PROC FREQ uses the default:

      formchar (1,2,7)='|-+'

  • Refer to the CALENDAR, PLOT, and TABULATE procedures in the Base SAS 9.1 Procedures Guide for more information on form characters.

    Table 2.5: Formatting Characters Used by PROC FREQ

    Position   Default   Used to Draw
    1          |         vertical separators
    2          -         horizontal separators
    7          +         intersections of vertical and horizontal separators

NLEVELS

  • displays the Number of Variable Levels table. This table provides the number of levels for each variable named in the TABLES statements. See the section Number of Variable Levels Table on page 151 for more information. PROC FREQ determines the variable levels from the formatted variable values, as described in the section Grouping with Formats on page 99.

NOPRINT

  • suppresses the display of all output. Note that this option temporarily disables the Output Delivery System (ODS). For more information, see Chapter 14, "Using the Output Delivery System" (SAS/STAT User's Guide).

    Note: A NOPRINT option is also available in the TABLES statement. It suppresses display of the crosstabulation tables but allows display of the requested statistics.

ORDER=DATA | FORMATTED | FREQ | INTERNAL

  • specifies the order in which the values of the frequency and crosstabulation table variables are reported. The following table shows how PROC FREQ interprets values of the ORDER= option.

    DATA        orders values according to their order in the input data set.
    FORMATTED   orders values by their formatted values. This order is operating-environment dependent. By default, the order is ascending.
    FREQ        orders values by descending frequency count.
    INTERNAL    orders values by their unformatted values, which yields the same order that the SORT procedure does. This order is operating-environment dependent.

    By default, ORDER=INTERNAL. The ORDER= option does not apply to missing values, which are always ordered first.

PAGE

  • displays only one table per page. Otherwise, PROC FREQ displays multiple tables per page as space permits. The PAGE option is not valid with the COMPRESS option.
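The following statements sketch how several PROC FREQ statement options can be combined. The data set work.survey and its variables are hypothetical and exist only for illustration.

```sas
/* Hypothetical input data; names and values are for illustration only */
data work.survey;
   input region $ answer $;
   datalines;
East yes
East no
West yes
West yes
North no
West no
;
run;

/* DATA= names the input data set explicitly,            */
/* NLEVELS adds the Number of Variable Levels table,     */
/* ORDER=FREQ lists values by descending frequency count */
proc freq data=work.survey nlevels order=freq;
   tables region answer;
run;
```

Naming the data set with DATA= avoids depending on whichever data set happened to be created most recently.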

BY Statement

  • BY variables ;

You can specify a BY statement with PROC FREQ to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.

If your input data set is not sorted in ascending order, use one of the following alternatives:

  • Sort the data using the SORT procedure with a similar BY statement.

  • Specify the BY statement option NOTSORTED or DESCENDING in the BY statement for the FREQ procedure. The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.

  • Create an index on the BY variables using the DATASETS procedure.

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the Base SAS 9.1 Procedures Guide.
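As a minimal sketch of the first alternative, the following statements sort a data set by the BY variable and then request one frequency table per BY group. The data set work.visits and the variables clinic and outcome are assumptions made for this example.

```sas
/* Hypothetical data, deliberately not sorted by the BY variable */
data work.visits;
   input clinic $ outcome $;
   datalines;
B improved
A improved
B same
A same
A improved
;
run;

/* Sort with the same BY statement that PROC FREQ will use */
proc sort data=work.visits;
   by clinic;
run;

/* One frequency table of outcome for each value of clinic */
proc freq data=work.visits;
   by clinic;
   tables outcome;
run;
```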

EXACT Statement

  • EXACT statistic-options < / computation-options > ;

The EXACT statement requests exact tests or confidence limits for the specified statistics. Optionally, PROC FREQ computes Monte Carlo estimates of the exact p-values. The statistic-options specify the statistics for which to provide exact tests or confidence limits. The computation-options specify options for the computation of exact statistics.

CAUTION: PROC FREQ computes exact tests with fast and efficient algorithms that are superior to direct enumeration. Exact tests are appropriate when a data set is small, sparse, skewed, or heavily tied. For some large problems, computation of exact tests may require a large amount of time and memory. Consider using asymptotic tests for such problems. Alternatively, when asymptotic methods may not be sufficient for such large problems, consider using Monte Carlo estimation of exact p-values. See the section "Computational Resources" on page 145 for more information.

Statistic-Options

The statistic-options specify the statistics for which exact tests or confidence limits are computed. PROC FREQ can compute exact p-values for the following hypothesis tests: chi-square goodness-of-fit test for one-way tables; Pearson chi-square, likelihood-ratio chi-square, Mantel-Haenszel chi-square, Fisher's exact test, Jonckheere-Terpstra test, Cochran-Armitage test for trend, and McNemar's test for two-way tables. PROC FREQ can also compute exact p-values for tests of the following statistics: Pearson correlation coefficient, Spearman correlation coefficient, simple kappa coefficient, weighted kappa coefficient, and common odds ratio. PROC FREQ can compute exact p-values for the binomial proportion test for one-way tables, as well as exact confidence limits for the binomial proportion. Additionally, PROC FREQ can compute exact confidence limits for the odds ratio for 2 × 2 tables, as well as exact confidence limits for the common odds ratio for stratified 2 × 2 tables.

Table 2.6 lists the available statistic-options and the exact statistics computed. Most of the option names are identical to the corresponding options in the TABLES statement and the OUTPUT statement. You can request exact computations for groups of statistics by using options that are identical to the following TABLES statement options: CHISQ, MEASURES, and AGREE. For example, when you specify the CHISQ option in the EXACT statement, PROC FREQ computes exact p-values for the Pearson chi-square, likelihood-ratio chi-square, and Mantel-Haenszel chi-square tests. You request exact p-values for an individual test by specifying one of the statistic-options shown in Table 2.6.

Table 2.6: EXACT Statement Statistic-Options

Option     Exact Statistics Computed
AGREE      McNemar's test for 2 × 2 tables, simple kappa coefficient, and weighted kappa coefficient
BINOMIAL   binomial proportion test for one-way tables
CHISQ      chi-square goodness-of-fit test for one-way tables; Pearson chi-square, likelihood-ratio chi-square, and Mantel-Haenszel chi-square tests for two-way tables
COMOR      confidence limits for the common odds ratio for h × 2 × 2 tables; common odds ratio test
FISHER     Fisher's exact test
JT         Jonckheere-Terpstra test
KAPPA      test for the simple kappa coefficient
LRCHI      likelihood-ratio chi-square test
MCNEM      McNemar's test
MEASURES   tests for the Pearson correlation and the Spearman correlation, and the odds ratio confidence limits for 2 × 2 tables
MHCHI      Mantel-Haenszel chi-square test
OR         confidence limits for the odds ratio for 2 × 2 tables
PCHI       Pearson chi-square test
PCORR      test for the Pearson correlation coefficient
SCORR      test for the Spearman correlation coefficient
TREND      Cochran-Armitage test for trend
WTKAP      test for the weighted kappa coefficient

Computation-Options

The computation-options specify options for computation of exact statistics. You can specify the following computation-options in the EXACT statement.

ALPHA= α

  • specifies the level of the confidence limits for Monte Carlo p-value estimates. The value of the ALPHA= option must be between 0 and 1, and the default is 0.01. A confidence level of α produces 100(1 − α)% confidence limits. The default of ALPHA=.01 produces 99% confidence limits for the Monte Carlo estimates. The ALPHA= option invokes the MC option.

MAXTIME= value

  • specifies the maximum clock time (in seconds) that PROC FREQ can use to compute an exact p-value. If the procedure does not complete the computation within the specified time, the computation terminates. The value of the MAXTIME= option must be a positive number. The MAXTIME= option is valid for Monte Carlo estimation of exact p-values, as well as for direct exact p-value computation.

    See the section Computational Resources on page 145 for more information.

MC

  • requests Monte Carlo estimation of exact p-values instead of direct exact p-value computation. Monte Carlo estimation can be useful for large problems that require a great amount of time and memory for exact computations but for which asymptotic approximations may not be sufficient. See the section "Monte Carlo Estimation" on page 146 for more information.

    The MC option is available for all EXACT statistic-options except BINOMIAL, COMOR, MCNEM, and OR. PROC FREQ computes only exact tests or confidence limits for those statistics.

    The ALPHA=, N=, and SEED= options also invoke the MC option.

N= n

  • specifies the number of samples for Monte Carlo estimation. The value of the N= option must be a positive integer, and the default is 10000 samples. Larger values of n produce more precise estimates of exact p-values. Because larger values of n generate more samples, the computation time increases. The N= option invokes the MC option.

POINT

  • requests exact point probabilities for the test statistics.

    The POINT option is available for all the EXACT statement statistic-options except the OR option, which provides exact confidence limits as opposed to an exact test. The POINT option is not available with the MC option.

SEED= number

  • specifies the initial seed for random number generation for Monte Carlo estimation. The value of the SEED= option must be an integer. If you do not specify the SEED= option, or if the SEED= value is negative or zero, PROC FREQ uses the time of day from the computer's clock to obtain the initial seed. The SEED= option invokes the MC option.
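The computation-options described above can be combined in one EXACT statement. The following statements are a sketch only: the data set work.small, its variables, and the particular option values (20000 samples, seed 12345, 60 seconds) are arbitrary choices made for illustration.

```sas
/* Hypothetical cell counts for a small 2 x 3 table */
data work.small;
   input dose $ response $ count;
   datalines;
low none 4
low some 1
low full 1
high none 1
high some 2
high full 5
;
run;

proc freq data=work.small;
   weight count;                 /* each record carries a cell count */
   tables dose*response;
   /* Monte Carlo estimate of the exact Pearson chi-square p-value:  */
   /* N= sets the number of samples, SEED= makes the run repeatable, */
   /* ALPHA=0.05 gives 95% limits for the estimate, and MAXTIME=     */
   /* caps the computation at 60 seconds                             */
   exact pchi / mc n=20000 seed=12345 alpha=0.05 maxtime=60;
run;
```

Because N=, SEED=, and ALPHA= each invoke the MC option, the explicit MC keyword here is redundant and is kept only for readability.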

Using TABLES Statement Options with the EXACT Statement

If you use only one TABLES statement, you do not need to specify options in the TABLES statement that are identical to options appearing in the EXACT statement. PROC FREQ automatically invokes the corresponding TABLES statement option when you specify the option in the EXACT statement. However, when you use multiple TABLES statements and want exact computations, you must specify options in the TABLES statement to compute the desired statistics. PROC FREQ then performs exact computations for all statistics that are also specified in the EXACT statement.
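The difference between the two cases can be sketched as follows. The data set work.pairs and its variables are hypothetical.

```sas
/* Hypothetical paired responses, stored as cell counts */
data work.pairs;
   input before $ after $ count;
   datalines;
yes yes 10
yes no 6
no yes 2
no no 12
;
run;

/* One TABLES statement: the EXACT statement alone is enough, */
/* because PROC FREQ invokes the matching TABLES option       */
proc freq data=work.pairs;
   weight count;
   tables before*after;
   exact mcnem;
run;

/* Several TABLES statements: request the statistic in the    */
/* TABLES statement (here through AGREE) as well as in EXACT  */
proc freq data=work.pairs;
   weight count;
   tables before*after / agree;
   tables after;
   exact mcnem;
run;
```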

OUTPUT Statement

  • OUTPUT < OUT= SAS-data-set > options ;

The OUTPUT statement creates a SAS data set containing statistics computed by PROC FREQ. The variables contain statistics for each two-way table or stratum, as well as summary statistics across all strata.

Only one OUTPUT statement is allowed for each execution of PROC FREQ. You must specify a TABLES statement with the OUTPUT statement. If you use multiple TABLES statements, the contents of the OUTPUT data set correspond to the last TABLES statement. If you use multiple table requests in a TABLES statement, the contents of the OUTPUT data set correspond to the last table request.

For more information, see the section Output Data Sets on page 148.

Note that you can use the Output Delivery System (ODS) to create a SAS data set from any piece of PROC FREQ output. For more information, see Table 2.11 on page 159 and Chapter 14, "Using the Output Delivery System" (SAS/STAT User's Guide).

You can specify the following options in an OUTPUT statement.

OUT= SAS-data-set

  • names the output data set. If you omit the OUT= option, the data set is named DATAn, where n is the smallest integer that makes the name unique.

options

  • specify the statistics that you want in the output data set. Available statistics are those produced by PROC FREQ for each one-way or two-way table, as well as the summary statistics across all strata. When you request a statistic, the OUTPUT data set contains that estimate or test statistic plus any associated standard error, confidence limits, p-values, and degrees of freedom. You can output statistics by using group options identical to those specified in the TABLES statement: AGREE, ALL, CHISQ, CMH, and MEASURES. Alternatively, you can request an individual statistic by specifying one of the options shown in the following table.

Table 2.7: OUTPUT Statement Options and Required TABLES Statement Options

Option

Output Data Set Statistics

Required TABLES Statement Option

AGREE

McNemar's test for 2 × 2 tables, simple kappa coefficient, and weighted kappa coefficient; for square tables with more than two response categories, Bowker's test of symmetry; for multiple strata, overall simple and weighted kappa statistics, and tests for equal kappas among strata; for multiple strata with two response categories, Cochran's Q test

AGREE

AJCHI

continuity-adjusted chi-square for 2 × 2 tables

ALL or CHISQ

ALL

all statistics under CHISQ, MEASURES, and CMH, and the number of nonmissing subjects

ALL

BDCHI

Breslow-Day test

ALL or CMH or CMH1 or CMH2

BIN | BINOMIAL

for one-way tables, binomial proportion statistics

BINOMIAL

CHISQ

chi-square goodness-of-fit test for one-way tables; for two-way tables, Pearson chi-square, likelihood-ratio chi-square, continuity-adjusted chi-square for 2 × 2 tables, Mantel-Haenszel chi-square, Fisher's exact test for 2 × 2 tables, phi coefficient, contingency coefficient, and Cramer's V

ALL or CHISQ

CMH

Cochran-Mantel-Haenszel correlation, row mean scores (ANOVA), and general association statistics; for 2 × 2 tables, logit and Mantel-Haenszel adjusted odds ratios, relative risks, and Breslow-Day test

ALL or CMH

CMH1

same as CMH, but excludes general association and row mean scores (ANOVA) statistics

ALL or CMH or CMH1

CMH2

same as CMH, but excludes the general association statistic

ALL or CMH or CMH2

CMHCOR

Cochran-Mantel-Haenszel correlation statistic

ALL or CMH or CMH1 or CMH2

CMHGA

Cochran-Mantel-Haenszel general association statistic

ALL or CMH

CMHRMS

Cochran-Mantel-Haenszel row mean scores (ANOVA) statistic

ALL or CMH or CMH2

COCHQ

Cochran's Q

AGREE

CONTGY

contingency coefficient

ALL or CHISQ

CRAMV

Cramer's V

ALL or CHISQ

EQKAP

test for equal simple kappas

AGREE

EQWKP

test for equal weighted kappas

AGREE

FISHER | EXACT

Fisher's exact test

ALL or CHISQ

GAMMA

gamma

ALL or MEASURES

JT

Jonckheere-Terpstra test

JT

KAPPA

simple kappa coefficient

AGREE

KENTB

Kendall's tau-b

ALL or MEASURES

LAMCR

lambda asymmetric (C|R)

ALL or MEASURES

LAMDAS

lambda symmetric

ALL or MEASURES

LAMRC

lambda asymmetric (R|C)

ALL or MEASURES

LGOR

adjusted logit odds ratio

ALL or CMH or CMH1 or CMH2

LGRRC1

adjusted column 1 logit relative risk

ALL or CMH or CMH1 or CMH2

LGRRC2

adjusted column 2 logit relative risk

ALL or CMH or CMH1 or CMH2

LRCHI

likelihood-ratio chi-square

ALL or CHISQ

MCNEM

McNemar's test

AGREE

MEASURES

gamma, Kendall's tau-b, Stuart's tau-c, Somers' D(C|R), Somers' D(R|C), Pearson correlation coefficient, Spearman correlation coefficient, lambda asymmetric (C|R), lambda asymmetric (R|C), lambda symmetric, uncertainty coefficient (C|R), uncertainty coefficient (R|C), and symmetric uncertainty coefficient; for 2 × 2 tables, odds ratio and relative risks

ALL or MEASURES

MHCHI

Mantel-Haenszel chi-square

ALL or CHISQ

MHOR

adjusted Mantel-Haenszel odds ratio

ALL or CMH or CMH1 or CMH2

MHRRC1

adjusted column 1 Mantel-Haenszel relative risk

ALL or CMH or CMH1 or CMH2

MHRRC2

adjusted column 2 Mantel-Haenszel relative risk

ALL or CMH or CMH1 or CMH2

N

number of nonmissing subjects for the stratum

 

NMISS

number of missing subjects for the stratum

 

OR

odds ratio

ALL or MEASURES or RELRISK

PCHI

chi-square goodness-of-fit test for one-way tables; for two-way tables, Pearson chi-square

ALL or CHISQ

PCORR

Pearson correlation coefficient

ALL or MEASURES

PHI

phi coefficient

ALL or CHISQ

PLCORR

polychoric correlation coefficient

PLCORR

RDIF1

column 1 risk difference (row 1 - row 2)

RISKDIFF

RDIF2

column 2 risk difference (row 1 - row 2)

RISKDIFF

RELRISK

odds ratio and relative risks for 2 × 2 tables

ALL or MEASURES or RELRISK

RISKDIFF

risks and risk differences

RISKDIFF

RISKDIFF1

column 1 risks and risk difference

RISKDIFF

RISKDIFF2

column 2 risks and risk difference

RISKDIFF

RRC1

column 1 relative risk

ALL or MEASURES or RELRISK

RRC2

column 2 relative risk

ALL or MEASURES or RELRISK

RSK1

column 1 risk (overall)

RISKDIFF

RSK11

column 1 risk, for row 1

RISKDIFF

RSK12

column 2 risk, for row 1

RISKDIFF

RSK2

column 2 risk (overall)

RISKDIFF

RSK21

column 1 risk, for row 2

RISKDIFF

RSK22

column 2 risk, for row 2

RISKDIFF

SCORR

Spearman correlation coefficient

ALL or MEASURES

SMDCR

Somers' D(C|R)

ALL or MEASURES

SMDRC

Somers' D(R|C)

ALL or MEASURES

STUTC

Stuart's tau-c

ALL or MEASURES

TREND

Cochran-Armitage test for trend

TREND

TSYMM

Bowker's test of symmetry

AGREE

U

symmetric uncertainty coefficient

ALL or MEASURES

UCR

uncertainty coefficient (C|R)

ALL or MEASURES

URC

uncertainty coefficient (R|C)

ALL or MEASURES

WTKAP

weighted kappa coefficient

AGREE

Using the TABLES Statement with the OUTPUT Statement

In order to specify that the OUTPUT data set contain a particular statistic, you must have PROC FREQ compute the statistic by using the corresponding option in the TABLES statement or the EXACT statement. For example, you cannot specify the option PCHI (Pearson chi-square) in the OUTPUT statement without also specifying a TABLES statement option or an EXACT statement option to compute the Pearson chi-square. The TABLES statement option ALL or CHISQ computes the Pearson chi-square. Additionally, if you have only one TABLES statement, the EXACT statement option CHISQ or PCHI computes the Pearson chi-square.
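As a sketch of this pairing, the following statements request the chi-square statistics in the TABLES statement and then write two of them to an output data set. The data set names work.trial and work.chistats and the variables are hypothetical.

```sas
/* Hypothetical cell counts for a 2 x 2 table */
data work.trial;
   input treatment $ response $ count;
   datalines;
A yes 12
A no 8
B yes 5
B no 15
;
run;

proc freq data=work.trial;
   weight count;
   /* CHISQ makes PROC FREQ compute the Pearson and        */
   /* likelihood-ratio chi-square statistics               */
   tables treatment*response / chisq;
   /* PCHI and LRCHI can be output only because the TABLES */
   /* statement above computes them                        */
   output out=work.chistats pchi lrchi;
run;

/* Inspect the statistics that were written */
proc print data=work.chistats;
run;
```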

TABLES Statement

  • TABLES requests < / options > ;

The TABLES statement requests one-way to n-way frequency and crosstabulation tables and statistics for those tables.

If you omit the TABLES statement, PROC FREQ generates one-way frequency tables for all data set variables that are not listed in the other statements.

The following argument is required in the TABLES statement.

requests

  • specify the frequency and crosstabulation tables to produce. A request is composed of one variable name or several variable names separated by asterisks. To request a one-way frequency table, use a single variable. To request a two-way crosstabulation table, use an asterisk between two variables. To request a multiway table (an n-way table, where n > 2), separate the desired variables with asterisks. The unique values of these variables form the rows, columns, and strata of the table.

    For two-way to multiway tables, the values of the last variable form the crosstabulation table columns, while the values of the next-to-last variable form the rows. Each level (or combination of levels) of the other variables forms one stratum. PROC FREQ produces a separate crosstabulation table for each stratum. For example, a specification of A*B*C*D in a TABLES statement produces k tables, where k is the number of different combinations of values for A and B. Each table lists the values for C down the side and the values for D across the top.

    You can use multiple TABLES statements in the PROC FREQ step. PROC FREQ builds all the table requests in one pass of the data, so that there is essentially no loss of efficiency. You can also specify any number of table requests in a single TABLES statement. To specify multiple table requests quickly, use a grouping syntax by placing parentheses around several variables and joining other variables or variable combinations. For example, the following statements illustrate grouping syntax.

    Table 2.8: Grouping Syntax

    Request                Equivalent to
    tables A*(B C);        tables A*B A*C;
    tables (A B)*(C D);    tables A*C B*C A*D B*D;
    tables (A B C)*D;      tables A*D B*D C*D;
    tables A -- C;         tables A B C;
    tables (A -- C)*D;     tables A*D B*D C*D;
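The grouping syntax can be tried on a small data set. The sketch below assumes a hypothetical data set work.abcd whose variables are stored in the order A, B, C, D, which is what the A -- C variable list relies on.

```sas
/* Hypothetical data; the variables are created in the order A B C D */
data work.abcd;
   input A B C D;
   datalines;
1 1 0 1
1 2 1 0
2 1 1 1
2 2 0 0
;
run;

proc freq data=work.abcd;
   tables A*(B C);       /* same tables as: tables A*B A*C;         */
   tables (A -- C)*D;    /* same tables as: tables A*D B*D C*D;     */
   tables A*B*C;         /* one B-by-C table for each level of A    */
run;
```

In the last request A is the stratification variable, B forms the rows, and C forms the columns.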

Without Options

If you request a one-way frequency table for a variable without specifying options, PROC FREQ produces frequencies, cumulative frequencies, percentages of the total frequency, and cumulative percentages for each value of the variable. If you request a two-way or an n-way crosstabulation table without specifying options, PROC FREQ produces crosstabulation tables that include cell frequencies, cell percentages of the total frequency, cell percentages of row frequencies, and cell percentages of column frequencies. The procedure excludes observations with missing values from the table but displays the total frequency of missing observations below each table.

Options

The following table lists the options available with the TABLES statement. Descriptions follow in alphabetical order.

Table 2.9: TABLES Statement Options

Option       Description

Control Statistical Analysis
AGREE        requests tests and measures of classification agreement
ALL          requests tests and measures of association produced by CHISQ, MEASURES, and CMH
ALPHA=       sets the confidence level for confidence limits
BDT          requests Tarone's adjustment for the Breslow-Day test
BINOMIAL     requests binomial proportion, confidence limits, and test for one-way tables
BINOMIALC    requests BINOMIAL statistics with a continuity correction
CHISQ        requests chi-square tests and measures of association based on chi-square
CL           requests confidence limits for the MEASURES statistics
CMH          requests all Cochran-Mantel-Haenszel statistics
CMH1         requests the CMH correlation statistic, and adjusted relative risks and odds ratios
CMH2         requests CMH correlation and row mean scores (ANOVA) statistics, and adjusted relative risks and odds ratios
CONVERGE=    specifies convergence criterion to compute polychoric correlation
FISHER       requests Fisher's exact test for tables larger than 2 × 2
JT           requests Jonckheere-Terpstra test
MAXITER=     specifies maximum number of iterations to compute polychoric correlation
MEASURES     requests measures of association and their asymptotic standard errors
MISSING      treats missing values as nonmissing
PLCORR       requests polychoric correlation
RELRISK      requests relative risk measures for 2 × 2 tables
RISKDIFF     requests risks and risk differences for 2 × 2 tables
RISKDIFFC    requests RISKDIFF statistics with a continuity correction
SCORES=      specifies the type of row and column scores
TESTF=       specifies expected frequencies for a one-way table chi-square test
TESTP=       specifies expected proportions for a one-way table chi-square test
TREND        requests Cochran-Armitage test for trend

Control Additional Table Information
CELLCHI2     displays each cell's contribution to the total Pearson chi-square statistic
CUMCOL       displays the cumulative column percentage in each cell
DEVIATION    displays the deviation of the cell frequency from the expected value for each cell
EXPECTED     displays the expected cell frequency for each cell
MISSPRINT    displays missing value frequencies
SPARSE       lists all possible combinations of variable levels even when a combination does not occur
TOTPCT       displays percentage of total frequency on n-way tables when n > 2

Control Displayed Output
CONTENTS=    specifies the HTML contents link for crosstabulation tables
CROSSLIST    displays crosstabulation tables in ODS column format
FORMAT=      formats the frequencies in crosstabulation tables
LIST         displays two-way to n-way tables in list format
NOCOL        suppresses display of the column percentage for each cell
NOCUM        suppresses display of cumulative frequencies and cumulative percentages in one-way frequency tables and in list format
NOFREQ       suppresses display of the frequency count for each cell
NOPERCENT    suppresses display of the percentage, row percentage, and column percentage in crosstabulation tables, or percentages and cumulative percentages in one-way frequency tables and in list format
NOPRINT      suppresses display of tables but displays statistics
NOROW        suppresses display of the row percentage for each cell
NOSPARSE     suppresses zero cell frequencies in the list display and in the OUT= data set when ZEROS is specified
NOWARN       suppresses log warning message for the chi-square test
PRINTKWT     displays kappa coefficient weights
SCOROUT      displays the row and the column scores

Create an Output Data Set
OUT=         specifies an output data set to contain variable values and frequency counts
OUTCUM       includes the cumulative frequency and cumulative percentage in the output data set for one-way tables
OUTEXPECT    includes the expected frequency of each cell in the output data set
OUTPCT       includes the percentage of column frequency, row frequency, and two-way table frequency in the output data set

You can specify the following options in a TABLES statement.

AGREE < (WT=FC) >

  • requests tests and measures of classification agreement for square tables. The AGREE option provides McNemar's test for 2 × 2 tables and Bowker's test of symmetry for tables with more than two response categories. The AGREE option also produces the simple kappa coefficient, the weighted kappa coefficient, the asymptotic standard errors for the simple and weighted kappas, and the corresponding confidence limits. When there are multiple strata, the AGREE option provides overall simple and weighted kappas as well as tests for equal kappas among strata. When there are multiple strata and two response categories, PROC FREQ computes Cochran's Q test. For more information, see the section "Tests and Measures of Agreement" on page 127.

    The (WT=FC) specification requests that PROC FREQ use Fleiss-Cohen weights to compute the weighted kappa coefficient. By default, PROC FREQ uses Cicchetti-Allison weights. See the section Weighted Kappa Coefficient on page 130 for more information. You can specify the PRINTKWT option to display the kappa coefficient weights.

    AGREE statistics are computed only for square tables, where the number of rows equals the number of columns. If your table is not square due to observations with zero weights, you can use the ZEROS option in the WEIGHT statement to include these observations. For more details, see the section Tables with Zero Rows and Columns on page 133.
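A minimal sketch of the AGREE option on a square 3 × 3 table follows. The data set work.ratings and its variables are hypothetical; every cell has a nonzero count, so the ZEROS option is not needed here.

```sas
/* Hypothetical ratings by two raters on a three-point scale */
data work.ratings;
   input rater1 rater2 count;
   datalines;
1 1 10
1 2 2
1 3 1
2 1 3
2 2 12
2 3 2
3 1 1
3 2 4
3 3 9
;
run;

proc freq data=work.ratings;
   weight count;
   /* AGREE requests Bowker's test and the kappa coefficients;   */
   /* (WT=FC) switches the weighted kappa to Fleiss-Cohen        */
   /* weights, and PRINTKWT displays the weights that were used  */
   tables rater1*rater2 / agree(wt=fc) printkwt;
run;
```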

ALL

  • requests all of the tests and measures that are computed by the CHISQ, MEASURES, and CMH options. The number of CMH statistics computed can be controlled by the CMH1 and CMH2 options.

ALPHA= α

  • specifies the level of confidence limits. The value of the ALPHA= option must be between 0 and 1, and the default is 0.05. A confidence level of α produces 100(1 − α)% confidence limits. The default of ALPHA=0.05 produces 95% confidence limits.

    ALPHA= applies to confidence limits requested by TABLES statement options. There is a separate ALPHA= option in the EXACT statement that sets the level of confidence limits for Monte Carlo estimates of exact p-values, which are requested in the EXACT statement.

BDT

  • requests Tarone's adjustment in the Breslow-Day test for homogeneity of odds ratios. (You must specify the CMH option to compute the Breslow-Day test.) See the section "Breslow-Day Test for Homogeneity of the Odds Ratios" on page 142 for more information.

BINOMIAL < (P= value) | (LEVEL= level-number | level-value) >

  • requests the binomial proportion for one-way tables. The BINOMIAL option also provides the asymptotic standard error, asymptotic and exact confidence intervals, and the asymptotic test for the binomial proportion. To request an exact test for the binomial proportion, use the BINOMIAL option in the EXACT statement.

    To specify the null hypothesis proportion for the test, use P=. If you omit P=value, PROC FREQ uses 0.5 as the default for the test. By default, BINOMIAL computes the proportion of observations for the first variable level that appears in the output. To specify a different level, use LEVEL=level-number or LEVEL=level-value, where level-number is the variable level's number or order in the output, and level-value is the formatted value of the variable level.

    To include a continuity correction in the asymptotic confidence interval and test, use the BINOMIALC option instead of the BINOMIAL option.

    See the section Binomial Proportion on page 118 for more information.
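    The following statements sketch a binomial analysis of a one-way table. The data set work.flips, its variables, and the null proportion of 0.4 are assumptions chosen for illustration.

```sas
/* Hypothetical counts for a two-level variable */
data work.flips;
   input side $ count;
   datalines;
heads 14
tails 26
;
run;

proc freq data=work.flips;
   weight count;
   /* Tests H0: proportion = 0.4 for the first level (heads); */
   /* ALPHA=0.1 produces 90% confidence limits                */
   tables side / binomial(p=0.4) alpha=0.1;
   /* Adds the exact test for the binomial proportion         */
   exact binomial;
run;
```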

BINOMIALC < (P= value ) (LEVEL= level-number level-value ) >

  • requests the BINOMIAL option statistics for one-way tables, and includes a continuity correction in the asymptotic confidence interval and the asymptotic test. The BINOMIAL option statistics include the binomial proportion, the asymptotic standard error, asymptotic and exact confidence intervals, and the asymptotic test for the binomial proportion. To request an exact test for the binomial proportion, use the BINOMIAL option in the EXACT statement.

    To specify the null hypothesis proportion for the test, use P=. If you omit P= value , PROC FREQ uses 0.5 as the default for the test. By default, BINOMIALC computes the proportion of observations for the first variable level that appears in the output. To specify a different level, use LEVEL= level-number or LEVEL= level-value , where level-number is the variable level's number or order in the output, and level-value is the formatted value of the variable level.

    See the section "Binomial Proportion" on page 118 for more information.

CELLCHI2

  • displays each crosstabulation table cell's contribution to the total Pearson chi-square statistic, which is computed as

    (frequency − expected)² / expected

    The CELLCHI2 option has no effect for one-way tables or for tables that are displayed with the LIST option.

CHISQ

  • requests chi-square tests of homogeneity or independence and measures of association based on chi-square. The tests include the Pearson chi-square, likelihood-ratio chi-square, and Mantel-Haenszel chi-square. The measures include the phi coefficient, the contingency coefficient, and Cramer's V. For 2 × 2 tables, the CHISQ option includes Fisher's exact test and the continuity-adjusted chi-square. For one-way tables, the CHISQ option requests a chi-square goodness-of-fit test for equal proportions. If you specify the null hypothesis proportions with the TESTP= option, then PROC FREQ computes a chi-square goodness-of-fit test for the specified proportions. If you specify null hypothesis frequencies with the TESTF= option, PROC FREQ computes a chi-square goodness-of-fit test for the specified frequencies. See the section "Chi-Square Tests and Statistics" on page 103 for more information.

CL

  • requests confidence limits for the MEASURES statistics. If you omit the MEASURES option, the CL option invokes MEASURES. The FREQ procedure determines the confidence coefficient using the ALPHA= option, which, by default, equals 0.05 and produces 95% confidence limits.

    For more information, see the section "Confidence Limits" on page 109.

CMH

  • requests Cochran-Mantel-Haenszel statistics, which test for association between the row and column variables after adjusting for the remaining variables in a multiway table. In addition, for 2 × 2 tables, PROC FREQ computes the adjusted Mantel-Haenszel and logit estimates of the odds ratios and relative risks and the corresponding confidence limits. For the stratified 2 × 2 case, PROC FREQ computes the Breslow-Day test for homogeneity of odds ratios. (To request Tarone's adjustment for the Breslow-Day test, use the BDT option.) The CMH1 and CMH2 options control the number of CMH statistics that PROC FREQ computes. For more information, see the section "Cochran-Mantel-Haenszel Statistics" on page 134.

CMH1

  • requests the Cochran-Mantel-Haenszel correlation statistic and, for 2 × 2 tables, the adjusted Mantel-Haenszel and logit estimates of the odds ratios and relative risks and the corresponding confidence limits. For the stratified 2 × 2 case, PROC FREQ computes the Breslow-Day test for homogeneity of odds ratios. Except for 2 × 2 tables, the CMH1 option requires less memory than the CMH option, which can require an enormous amount for large tables.

CMH2

  • requests the Cochran-Mantel-Haenszel correlation statistic, row mean scores (ANOVA) statistic, and, for 2 × 2 tables, the adjusted Mantel-Haenszel and logit estimates of the odds ratios and relative risks and the corresponding confidence limits. For the stratified 2 × 2 case, PROC FREQ computes the Breslow-Day test for homogeneity of odds ratios. Except for tables with two columns, the CMH2 option requires less memory than the CMH option, which can require an enormous amount for large tables.
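
    As a rough sketch of a stratified 2 × 2 analysis, the step below requests the CMH statistics together with Tarone's adjustment (BDT). The data set work.trial and its counts are hypothetical, made up only for illustration.

```sas
/* Hypothetical two-center trial, entered as cell counts */
data work.trial;
   input center $ treatment $ response $ count;
   datalines;
A active  yes 29
A active  no  16
A placebo yes 14
A placebo no  31
B active  yes 37
B active  no   8
B placebo yes 24
B placebo no  21
;
run;

proc freq data=work.trial;
   weight count;
   /* center is the stratum; treatment*response forms one 2 x 2 table per stratum */
   tables center*treatment*response / cmh bdt;
run;
```

    Replacing CMH with CMH1 or CMH2 in the same statement would limit the CMH statistics computed, as described above.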

CONTENTS= link-text

  • specifies the text for the HTML contents file links to crosstabulation tables. For information on HTML output, refer to the SAS Output Delivery System User's Guide. The CONTENTS= option affects only the HTML contents file, and not the HTML body file.

    If you omit the CONTENTS= option, by default, the HTML link text for crosstabulation tables is "Cross-Tabular Freq Table."

    Note that links to all crosstabulation tables produced by a single TABLES statement use the same text. To specify different text for different crosstabulation table links, request the tables in separate TABLES statements and use the CONTENTS= option in each TABLES statement.

    The CONTENTS= option affects only links to crosstabulation tables. It does not affect links to other PROC FREQ tables. To specify link text for any other PROC FREQ table, you can use PROC TEMPLATE to create a customized table definition. The CONTENTS_LABEL attribute in the DEFINE TABLE statement of PROC TEMPLATE specifies the contents file link for the table. For detailed information, refer to the chapter titled "The TEMPLATE Procedure" in the SAS Output Delivery System User's Guide.

CONVERGE= value

  • specifies the convergence criterion for computing the polychoric correlation when you specify the PLCORR option. The value of the CONVERGE= option must be a positive number; by default, CONVERGE=0.0001. Iterative computation of the polychoric correlation stops when the convergence measure falls below the value of the CONVERGE= option or when the number of iterations exceeds the value specified in the MAXITER= option, whichever happens first.

    See the section "Polychoric Correlation" on page 116 for more information.

CROSSLIST

  • displays crosstabulation tables in ODS column format, instead of the default crosstabulation cell format. In a CROSSLIST table display, the rows correspond to the crosstabulation table cells, and the columns correspond to descriptive statistics such as Frequency, Percent, and so on. See the section "Multiway Tables" on page 152 for details on the contents of the CROSSLIST table.

    The CROSSLIST table displays the same information as the default crosstabulation table, but uses an ODS column format instead of the table cell format. Unlike the default crosstabulation table, the CROSSLIST table has a table definition that you can customize with PROC TEMPLATE. For more information, refer to the chapter titled "The TEMPLATE Procedure" in the SAS Output Delivery System User's Guide.

    You can control the contents of a CROSSLIST table with the same options available for the default crosstabulation table. These include the NOFREQ, NOPERCENT, NOROW, and NOCOL options. You can request additional information in a CROSSLIST table with the CELLCHI2, DEVIATION, EXPECTED, MISSPRINT, and TOTPCT options.

    The FORMAT= option and the CUMCOL option have no effect for CROSSLIST tables. You cannot specify both the LIST option and the CROSSLIST option in the same TABLES statement.

    You can use the NOSPARSE option to suppress display of variable levels with zero frequency in CROSSLIST tables. By default for CROSSLIST tables, PROC FREQ displays all levels of the column variable within each level of the row variable, including any column variable levels with zero frequency for that row. And for multiway tables displayed with the CROSSLIST option, the procedure displays all levels of the row variable for each stratum of the table by default, including any row variable levels with zero frequency for the stratum.

CUMCOL

  • displays the cumulative column percentages in the cells of the crosstabulation table.

DEVIATION

  • displays the deviation of the cell frequency from the expected frequency for each cell of the crosstabulation table. The DEVIATION option is valid for contingency tables but has no effect on tables produced with the LIST option.

EXPECTED

  • displays the expected table cell frequencies under the hypothesis of independence (or homogeneity). The EXPECTED option is valid for crosstabulation tables but has no effect on tables produced with the LIST option.
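
    The CELLCHI2, DEVIATION, and EXPECTED options can be combined in one request. The following is a minimal sketch with a hypothetical data set, work.cells; the NOROW, NOCOL, and NOPERCENT options are added only to keep the cells uncluttered.

```sas
/* Hypothetical 2 x 2 table entered as cell counts */
data work.cells;
   input row $ col $ count;
   datalines;
r1 c1 20
r1 c2 10
r2 c1 15
r2 c2 25
;
run;

proc freq data=work.cells;
   weight count;
   /* each cell shows frequency, expected frequency, deviation, and chi-square contribution */
   tables row*col / expected deviation cellchi2 norow nocol nopercent;
run;
```

    With these counts, the r1, c1 cell has an expected frequency of 30 × 35 / 70 = 15, a deviation of 20 − 15 = 5, and a chi-square contribution of 5² / 15 ≈ 1.67.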

FISHER | EXACT

  • requests Fisher's exact test for tables that are larger than 2 × 2. This test is also known as the Freeman-Halton test. For more information, see the section "Fisher's Exact Test" on page 106 and the "EXACT Statement" section on page 77.

    If you omit the CHISQ option in the TABLES statement, the FISHER option invokes CHISQ. You can also request Fisher's exact test by specifying the FISHER option in the EXACT statement.

    CAUTION: For tables with many rows or columns or with large total frequency, PROC FREQ may require a large amount of time or memory to compute exact p-values. See the section "Computational Resources" on page 145 for more information.

FORMAT= format-name

  • specifies a format for the following crosstabulation table cell values: frequency, expected frequency, and deviation. PROC FREQ also uses this format to display the total row and column frequencies for crosstabulation tables.

    You can specify any standard SAS numeric format or a numeric format defined with the FORMAT procedure. The format length must not exceed 24. If you omit FORMAT=, by default, PROC FREQ uses the BEST6. format to display frequencies less than 1E6, and the BEST7. format otherwise.

    To change formats for all other FREQ tables, you can use PROC TEMPLATE. For information on this procedure, refer to the chapter titled "The TEMPLATE Procedure" in the SAS Output Delivery System User's Guide.

JT

  • performs the Jonckheere-Terpstra test. For more information, see the section "Jonckheere-Terpstra Test" on page 125.

LIST

  • displays two-way to n-way tables in a list format rather than as crosstabulation tables. PROC FREQ ignores the LIST option when you request statistical tests or measures of association.

MAXITER= number

  • specifies the maximum number of iterations for computing the polychoric correlation when you specify the PLCORR option. The value of the MAXITER= option must be a positive integer; by default, MAXITER=20. Iterative computation of the polychoric correlation stops when the number of iterations exceeds the value of the MAXITER= option, or when the convergence measure falls below the value of the CONVERGE= option, whichever happens first. For more information, see the section "Polychoric Correlation" on page 116.

MEASURES

  • requests several measures of association and their asymptotic standard errors (ASE). The measures include gamma, Kendall's tau-b, Stuart's tau-c, Somers' D(C|R), Somers' D(R|C), the Pearson and Spearman correlation coefficients, lambda (symmetric and asymmetric), and uncertainty coefficients (symmetric and asymmetric). To request confidence limits for these measures of association, you can specify the CL option.

    For 2 × 2 tables, the MEASURES option also provides the odds ratio, column 1 relative risk, column 2 relative risk, and the corresponding confidence limits. Alternatively, you can obtain the odds ratio and relative risks, without the other measures of association, by specifying the RELRISK option.

    For more information, see the section "Measures of Association" on page 108.
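
    The interaction of MEASURES, CL, and ALPHA= can be sketched as below. The data set work.ordinal and its counts are hypothetical and serve only as an illustration.

```sas
/* Hypothetical 3 x 3 table of two ordinal variables, entered as cell counts */
data work.ordinal;
   input dose severity count;
   datalines;
1 1 12
1 2  6
1 3  2
2 1  7
2 2  9
2 3  5
3 1  3
3 2  6
3 3 11
;
run;

proc freq data=work.ordinal;
   weight count;
   /* CL adds confidence limits to the measures; ALPHA=0.1 makes them 90% limits */
   tables dose*severity / measures cl alpha=0.1;
run;
```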

MISSING

  • treats missing values as nonmissing and includes them in calculations of percentages and other statistics.

    For more information, see the section "Missing Values" on page 100.

MISSPRINT

  • displays missing value frequencies for all tables, even though PROC FREQ does not use the frequencies in the calculation of statistics. For more information, see the section "Missing Values" on page 100.

NOCOL

  • suppresses the display of column percentages in cells of the crosstabulation table.

NOCUM

  • suppresses the display of cumulative frequencies and cumulative percentages for one-way frequency tables and for crosstabulation tables in list format.

NOFREQ

  • suppresses the display of cell frequencies for crosstabulation tables. This also suppresses frequencies for row totals.

NOPERCENT

  • suppresses the display of cell percentages, row total percentages, and column total percentages for crosstabulation tables. For one-way frequency tables and crosstabulation tables in list format, the NOPERCENT option suppresses the display of percentages and cumulative percentages.

NOPRINT

  • suppresses the display of frequency and crosstabulation tables but displays all requested tests and statistics. Use the NOPRINT option in the PROC FREQ statement to suppress the display of all tables.

NOROW

  • suppresses the display of row percentages in cells of the crosstabulation table.

NOSPARSE

  • requests that PROC FREQ not invoke the SPARSE option when you specify the ZEROS option in the WEIGHT statement. The NOSPARSE option suppresses the display of cells with a zero frequency count in the list output, and it also omits them from the OUT= data set. By default, the ZEROS option invokes the SPARSE option, which displays table cells with a zero frequency count in the LIST output and includes them in the OUT= data set. For more information, see the description of the ZEROS option.

    For CROSSLIST tables, the NOSPARSE option suppresses display of variable levels with zero frequency. By default for CROSSLIST tables, PROC FREQ displays all levels of the column variable within each level of the row variable, including any column variable levels with zero frequency for that row. And for multiway tables displayed with the CROSSLIST option, the procedure displays all levels of the row variable for each stratum of the table by default, including any row variable levels with zero frequency for the stratum.

NOWARN

  • suppresses the log warning message that the asymptotic chi-square test may not be valid. By default, PROC FREQ displays this log message when more than 20 percent of the table cells have expected frequencies less than five.

OUT= SAS-data-set

  • names the output data set that contains variable values and frequency counts. The variable COUNT contains the frequencies and the variable PERCENT contains the percentages. If more than one table request appears in the TABLES statement, the contents of the data set correspond to the last table request in the TABLES statement. For more information, see the section "Output Data Sets" on page 148 and see the following descriptions for the options OUTCUM, OUTEXPECT, and OUTPCT.

OUTCUM

  • includes the cumulative frequency and the cumulative percentage for one-way tables in the output data set when you specify the OUT= option in the TABLES statement. The variable CUM_FREQ contains the cumulative frequency for each level of the analysis variable, and the variable CUM_PCT contains the cumulative percentage for each level. The OUTCUM option has no effect for two-way or multiway tables.

    For more information, see the section "Output Data Sets" on page 148.

OUTEXPECT

  • includes the expected frequency in the output data set for crosstabulation tables when you specify the OUT= option in the TABLES statement. The variable EXPECTED contains the expected frequency for each table cell. The OUTEXPECT option is valid for two-way or multiway tables, and has no effect for one-way tables.

    For more information, see the section "Output Data Sets" on page 148.

OUTPCT

  • includes the following additional variables in the output data set when you specify the OUT= option in the TABLES statement for crosstabulation tables:

    PCT_COL

    the percentage of column frequency

    PCT_ROW

    the percentage of row frequency

    PCT_TABL

    the percentage of stratum frequency, for n-way tables where n > 2

    The OUTPCT option is valid for two-way or multiway tables, and has no effect for one-way tables.

    For more information, see the section "Output Data Sets" on page 148.
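
    To see the variables that OUT=, OUTEXPECT, and OUTPCT produce, a sketch along these lines may help. The data set names work.cells and work.cellout and the counts are hypothetical.

```sas
/* Hypothetical 2 x 2 table entered as cell counts */
data work.cells;
   input row $ col $ count;
   datalines;
r1 c1 20
r1 c2 10
r2 c1 15
r2 c2 25
;
run;

proc freq data=work.cells;
   weight count;
   /* NOPRINT suppresses the displayed table; the output data set is still created */
   tables row*col / out=work.cellout outexpect outpct noprint;
run;

/* Lists COUNT and PERCENT plus EXPECTED, PCT_ROW, and PCT_COL for each cell */
proc print data=work.cellout;
run;
```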

PLCORR

  • requests the polychoric correlation coefficient. For 2 × 2 tables, this statistic is more commonly known as the tetrachoric correlation coefficient, and it is labeled as such in the displayed output. If you omit the MEASURES option, the PLCORR option invokes MEASURES. For more information, see the section "Polychoric Correlation" on page 116 and the descriptions for the CONVERGE= and MAXITER= options in this list.

PRINTKWT

  • displays the weights PROC FREQ uses to compute the weighted kappa coefficient. You must also specify the AGREE option, which requests the weighted kappa coefficient. You can specify (WT=FC) with the AGREE option to request Fleiss-Cohen weights. By default, PROC FREQ uses Cicchetti-Allison weights.

    See the section "Weighted Kappa Coefficient" on page 130 for more information.

RELRISK

  • requests relative risk measures and their confidence limits for 2 × 2 tables. These measures include the odds ratio and the column 1 and 2 relative risks. For more information, see the section "Odds Ratio and Relative Risks for 2 x 2 Tables" on page 122. You can also obtain the RELRISK measures by specifying the MEASURES option, which produces other measures of association in addition to the relative risks.

RISKDIFF

  • requests column 1 and 2 risks (or binomial proportions), risk differences, and their confidence limits for 2 × 2 tables. See the section "Risks and Risk Differences" on page 120 for more information.

RISKDIFFC

  • requests the RISKDIFF option statistics for 2 × 2 tables, and includes a continuity correction in the asymptotic confidence limits. The RISKDIFF option statistics include the column 1 and 2 risks (or binomial proportions), risk differences, and their confidence limits. See the section "Risks and Risk Differences" on page 120 for more information.
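
    A minimal sketch of requesting both sets of 2 × 2 statistics follows. The data set work.twoby2, its level names, and its counts are hypothetical.

```sas
/* Hypothetical 2 x 2 table entered as cell counts */
data work.twoby2;
   input exposure $ disease $ count;
   datalines;
high yes 30
high no  70
low  yes 15
low  no  85
;
run;

proc freq data=work.twoby2;
   weight count;
   /* RELRISK: odds ratio and column 1 and 2 relative risks;
      RISKDIFF: column 1 and 2 risks and risk differences */
   tables exposure*disease / relrisk riskdiff;
run;
```

    With the default ordering, "no" sorts before "yes", so column 1 in this sketch is disease = no. Check which level lands in column 1 before interpreting the column 1 risk.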

SCORES= type

  • specifies the type of row and column scores that PROC FREQ uses with the Mantel-Haenszel chi-square, Pearson correlation, Cochran-Armitage test for trend, weighted kappa coefficient, and Cochran-Mantel-Haenszel statistics, where type is one of the following (the default is SCORES=TABLE):

    • MODRIDIT

    • RANK

    • RIDIT

    • TABLE

    By default, the row or column scores are the integers 1, 2, ... for character variables and the actual variable values for numeric variables. Using other types of scores yields nonparametric analyses. For more information, see the section "Scores" on page 102.

    To display the row and column scores, you can use the SCOROUT option.

SCOROUT

  • displays the row and the column scores. You specify the score type with the SCORES= option. PROC FREQ uses the scores when it calculates the Mantel-Haenszel chi-square, Pearson correlation, Cochran-Armitage test for trend, weighted kappa coefficient, or Cochran-Mantel-Haenszel statistics. The SCOROUT option displays the row and column scores only when statistics are computed for two-way tables. To store the scores in an output data set, use the Output Delivery System.

    For more information, see the section "Scores" on page 102.

SPARSE

  • lists all possible combinations of the variable values for an n-way table when n > 1, even if a combination does not occur in the data. The SPARSE option applies only to crosstabulation tables displayed in list format and to the OUT= output data set. Otherwise, if you do not use the LIST option or the OUT= option, the SPARSE option has no effect.

    When you specify the SPARSE and LIST options, PROC FREQ displays all combinations of variable values in the table listing, including those values with a frequency count of zero. By default, without the SPARSE option, PROC FREQ does not display zero-frequency values in list output. When you use the SPARSE and OUT= options, PROC FREQ includes empty crosstabulation table cells in the output data set. By default, PROC FREQ does not include zero-frequency table cells in the output data set.

    For more information, see the section "Missing Values" on page 100.
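
    The effect of SPARSE on list output and on the OUT= data set can be sketched as follows. The data sets work.gaps and work.gapsout are hypothetical; the combination west*b is deliberately absent from the input.

```sas
/* Hypothetical counts: no observations for region=west, product=b */
data work.gaps;
   input region $ product $ count;
   datalines;
east a 5
east b 3
west a 4
;
run;

proc freq data=work.gaps;
   weight count;
   /* SPARSE adds the empty west*b combination, with frequency 0,
      to the list output and to the OUT= data set */
   tables region*product / list sparse out=work.gapsout;
run;
```

    Removing SPARSE from the TABLES statement leaves only the three combinations that occur in the data.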

TESTF=( values )

  • specifies the null hypothesis frequencies for a one-way chi-square test for specified frequencies. You can separate values with blanks or commas. The sum of the frequency values must equal the total frequency for the one-way table. The number of TESTF= values must equal the number of variable levels in the one-way table. List these values in the order in which the corresponding variable levels appear in the output. If you omit the CHISQ option, the TESTF= option invokes CHISQ.

    For more information, see the section "Chi-Square Test for One-Way Tables" on page 104.

TESTP=( values )

  • specifies the null hypothesis proportions for a one-way chi-square test for specified proportions. You can separate values with blanks or commas. Specify values in probability form as numbers between 0 and 1, where the proportions sum to 1. Or specify values in percentage form as numbers between 0 and 100, where the percentages sum to 100. The number of TESTP= values must equal the number of variable levels in the one-way table. List these values in the order in which the corresponding variable levels appear in the output. If you omit the CHISQ option, the TESTP= option invokes CHISQ.

    For more information, see the section "Chi-Square Test for One-Way Tables" on page 104.
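
    As an illustration of TESTP= and TESTF=, the step below states the same null hypothesis in both forms. The data set work.colors and its counts are hypothetical.

```sas
/* Hypothetical one-way table with a total frequency of 100 */
data work.colors;
   input color $ count;
   datalines;
blue  22
green 31
red   47
;
run;

proc freq data=work.colors;
   weight count;
   /* null proportions in output order: blue, green, red */
   tables color / testp=(0.25 0.25 0.50);
   /* the same hypothesis as frequencies, which sum to the table total */
   tables color / testf=(25 25 50);
run;
```

    Neither TABLES statement names CHISQ, since TESTP= and TESTF= invoke it as described above.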

TOTPCT

  • displays the percentage of total frequency in crosstabulation tables, for n-way tables where n > 2. This percentage is also available with the LIST option or as the PERCENT variable in the OUT= output data set.

TREND

  • performs the Cochran-Armitage test for trend. The table must be 2 × C or R × 2. For more information, see the section "Cochran-Armitage Test for Trend" on page 124.
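
    A sketch of a trend test on an R × 2 table is shown below. The data set work.dose and its counts are hypothetical; SCORES=TABLE is the default and is written out only to make the score type explicit.

```sas
/* Hypothetical dose-response counts: three dose rows, two response columns */
data work.dose;
   input dose response $ count;
   datalines;
1 no  18
1 yes  2
2 no  15
2 yes  5
3 no  11
3 yes  9
;
run;

proc freq data=work.dose;
   weight count;
   /* dose is numeric, so table scores are its actual values 1, 2, 3 */
   tables dose*response / trend scores=table;
run;
```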

TEST Statement

  • TEST options ;

The TEST statement requests asymptotic tests for the specified measures of association and measures of agreement. You must use a TABLES statement with the TEST statement.

options

  • specify the statistics for which to provide asymptotic tests. The available statistics are those measures of association and agreement listed in Table 2.10. The option names are identical to those in the TABLES statement and the OUTPUT statement. You can request all available tests for groups of statistics by using group options MEASURES or AGREE. Or you can request tests individually by using one of the options shown in Table 2.10.

    Table 2.10: TEST Statement Options and Required TABLES Statement Options

    Option    Asymptotic Tests Computed                                Required TABLES Statement Option
    --------  -------------------------------------------------------  --------------------------------
    AGREE     simple kappa coefficient and weighted kappa coefficient  AGREE
    GAMMA     gamma                                                    ALL or MEASURES
    KAPPA     simple kappa coefficient                                 AGREE
    KENTB     Kendall's tau-b                                          ALL or MEASURES
    MEASURES  gamma, Kendall's tau-b, Stuart's tau-c, Somers' D(C|R),  ALL or MEASURES
              Somers' D(R|C), the Pearson correlation, and the
              Spearman correlation
    PCORR     Pearson correlation coefficient                          ALL or MEASURES
    SCORR     Spearman correlation coefficient                         ALL or MEASURES
    SMDCR     Somers' D(C|R)                                           ALL or MEASURES
    SMDRC     Somers' D(R|C)                                           ALL or MEASURES
    STUTC     Stuart's tau-c                                           ALL or MEASURES
    WTKAP     weighted kappa coefficient                               AGREE

    For each measure of association or agreement that you specify, the TEST statement provides an asymptotic test that the measure equals zero. When you request an asymptotic test, PROC FREQ gives the asymptotic standard error under the null hypothesis, the test statistic, and the p-values. Additionally, PROC FREQ reports the confidence limits for that measure. The ALPHA= option in the TABLES statement determines the confidence level, which, by default, equals 0.05 and provides 95% confidence limits. For more information, see the sections "Asymptotic Tests" on page 109 and "Confidence Limits" on page 109, and see "Statistical Computations" beginning on page 102 for sections describing the individual measures.

    In addition to these asymptotic tests, exact tests for selected measures of association and agreement are available with the EXACT statement. See the section "EXACT Statement" on page 77 for more information.
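
    The pairing of TEST options with their required TABLES options can be sketched as follows. The data set work.ratings and its counts are hypothetical; the table is square because the AGREE statistics apply to tables whose row and column variables share the same levels.

```sas
/* Hypothetical 3 x 3 agreement table for two raters, entered as cell counts */
data work.ratings;
   input rater1 rater2 count;
   datalines;
1 1 20
1 2  5
1 3  1
2 1  4
2 2 15
2 3  6
3 1  2
3 2  3
3 3 14
;
run;

proc freq data=work.ratings;
   weight count;
   /* MEASURES is required for GAMMA and KENTB; AGREE is required for KAPPA and WTKAP */
   tables rater1*rater2 / measures agree;
   test gamma kentb kappa wtkap;
run;
```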

WEIGHT Statement

  • WEIGHT variable < / option > ;

The WEIGHT statement specifies a numeric variable with a value that represents the frequency of the observation. The WEIGHT statement is most commonly used to input cell count data. See the "Inputting Frequency Counts" section on page 98 for more information. If you use the WEIGHT statement, PROC FREQ assumes that an observation represents n observations, where n is the value of variable . The value of the weight variable need not be an integer. When a weight value is missing, PROC FREQ ignores the corresponding observation. When a weight value is zero, PROC FREQ ignores the corresponding observation unless you specify the ZEROS option, which includes observations with zero weights. If a WEIGHT statement does not appear, each observation has a default weight of 1. The sum of the weight variable values represents the total number of observations.

If any value of the weight variable is negative, PROC FREQ displays the frequencies (as measured by the weighted values) but does not compute percentages and other statistics. If you create an output data set using the OUT= option in the TABLES statement, PROC FREQ creates the PERCENT variable and assigns a missing value for each observation. PROC FREQ also assigns missing values to the variables that the OUTEXPECT and OUTPCT options create. You cannot create an output data set using the OUTPUT statement since statistics are not computed when there are negative weights.

Option

ZEROS

  • includes observations with zero weight values. By default, PROC FREQ ignores observations with zero weights.

    If you specify the ZEROS option, frequency and crosstabulation tables display any levels corresponding to observations with zero weights. Without the ZEROS option, PROC FREQ does not process observations with zero weights, and so does not display levels that contain only observations with zero weights.

    With the ZEROS option, PROC FREQ includes levels with zero weights in the chi-square goodness-of-fit test for one-way tables. Also, PROC FREQ includes any levels with zero weights in binomial computations for one-way tables. This enables computation of binomial estimates and tests when there are no observations with positive weights in the specified level.

    For two-way tables, the ZEROS option enables computation of kappa statistics when there are levels containing no observations with positive weight. For more information, see the section "Tables with Zero Rows and Columns" on page 133.

    Note that even with the ZEROS option, PROC FREQ does not compute the CHISQ or MEASURES statistics for two-way tables when the table has a zero row or zero column, because most of these statistics are undefined in this case.

    The ZEROS option invokes the SPARSE option in the TABLES statement, which includes table cells with a zero frequency count in the list output and the OUT= data set. By default, without the SPARSE option, PROC FREQ does not include zero frequency cells in the list output or in the OUT= data set. If you specify the ZEROS option in the WEIGHT statement but do not want the SPARSE option, you can specify the NOSPARSE option in the TABLES statement.
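
    A minimal sketch of the ZEROS option on a one-way table follows. The data set work.zeros and its counts are hypothetical; the "maybe" level has a zero count on purpose.

```sas
/* Hypothetical counts in which one level has zero weight */
data work.zeros;
   input answer $ count;
   datalines;
maybe 0
no    9
yes  11
;
run;

proc freq data=work.zeros;
   /* without ZEROS, the "maybe" level would not be processed or displayed */
   weight count / zeros;
   /* the goodness-of-fit test now includes all three levels */
   tables answer / chisq;
   /* BINOMIAL uses the first level in the output, "maybe", which has zero frequency */
   tables answer / binomial;
run;
```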




Base SAS 9.1.3 Procedures Guide (Vol. 3)
Base SAS 9.1 Procedures Guide, Volumes 1, 2, 3 and 4
ISBN: 1590472047
Year: 2004
Pages: 74