Chapter 9: The COMPARE Procedure


Overview: COMPARE Procedure

What Does the COMPARE Procedure Do?

The COMPARE procedure compares the contents of two SAS data sets, selected variables in different data sets, or variables within the same data set.

PROC COMPARE compares two data sets: the base data set and the comparison data set. The procedure determines matching variables and matching observations. Matching variables are variables with the same name or variables that you explicitly pair by using the VAR and WITH statements. Matching variables must be of the same type. Matching observations are observations that have the same values for all ID variables that you specify or, if you do not use the ID statement, that occur in the same position in the data sets. If you match observations by ID variables, then both data sets must be sorted by all ID variables.
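The following step is a minimal sketch of matching by ID variable and of pairing differently named variables. The data sets WORK.OLD and WORK.NEW and the variables ID, SCORE, and RESULT are hypothetical and are used for illustration only:

```sas
/* Hypothetical data sets, for illustration only */
data work.old;
   input id score;
   datalines;
1 85
2 92
3 78
;

data work.new;
   input id result;
   datalines;
3 79
1 85
2 92
;

/* Matching by ID requires both data sets to be sorted by the ID variable */
proc sort data=work.old;
   by id;
run;

proc sort data=work.new;
   by id;
run;

proc compare base=work.old compare=work.new;
   id id;         /* match observations on the value of ID       */
   var score;     /* variable in the base data set               */
   with result;   /* paired variable in the comparison data set  */
run;
```

Because SCORE and RESULT are both numeric, they satisfy the requirement that matching variables be of the same type.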

What Information Does PROC COMPARE Provide?

PROC COMPARE generates the following information about the two data sets that are being compared:

  • whether matching variables have different values

  • whether one data set has more observations than the other

  • what variables the two data sets have in common

  • how many variables are in one data set but not in the other

  • whether matching variables have different formats, labels, or types

  • a comparison of the values of matching observations.

Further, PROC COMPARE can create two kinds of output data sets that give detailed information about the differences between the observations and variables that it compares.
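As a hedged sketch, the two kinds of output data sets are requested with the OUT= and OUTSTATS= options. The step assumes that the PROCLIB libref is assigned and that PROCLIB.ONE and PROCLIB.TWO exist; the names WORK.DIFFS and WORK.STATS are hypothetical:

```sas
/* Assumes the PROCLIB libref is assigned and both data sets exist. */
/* WORK.DIFFS and WORK.STATS are hypothetical output names.         */
proc compare base=proclib.one compare=proclib.two
             out=work.diffs        /* observation-level output data set */
             outnoequal            /* keep only observations that differ */
             outbase outcomp       /* include the base and comparison values */
             outdif                /* include the computed differences */
             outstats=work.stats   /* summary statistics for compared variables */
             noprint;              /* suppress the printed report */
run;

proc print data=work.diffs;
run;

proc print data=work.stats;
run;
```

NOPRINT is used here only to keep the focus on the data sets; omit it to produce the printed report as well.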

The following example compares the data sets PROCLIB.ONE and PROCLIB.TWO, which contain similar data about students:

 data proclib.one(label='First Data Set');
    input student year $ state $ gr1 gr2;
    label year='Year of Birth';
    format gr1 4.1;
    datalines;
 1000 1970 NC 85 87
 1042 1971 MD 92 92
 1095 1969 PA 78 72
 1187 1970 MA 87 94
 ;

 data proclib.two(label='Second Data Set');
    input student $ year $ state $ gr1
          gr2 major $;
    label state='Home State';
    format gr1 5.2;
    datalines;
 1000 1970 NC 84 87 Math
 1042 1971 MA 92 92 History
 1095 1969 PA 79 73 Physics
 1187 1970 MD 87 74 Dance
 1204 1971 NC 82 96 French
 ;

How Can PROC COMPARE Output Be Customized?

PROC COMPARE produces lengthy output. You can use one or more options to determine the kinds of comparisons to make and the degree of detail in the report. For example, in the following PROC COMPARE step, the NOVALUES option suppresses the part of the output that shows the differences in the values of matching variables:

 proc compare base=proclib.one
              compare=proclib.two novalues;
 run;

Output 9.1: Comparison of Two Data Sets
                                  The SAS System                                  1

                                 COMPARE Procedure
                     Comparison of PROCLIB.ONE with PROCLIB.TWO
                                   (Method=EXACT)

                                 Data Set Summary

 Dataset                Created           Modified   NVar    NObs   Label

 PROCLIB.ONE   13MAY98:15:01:42   13MAY98:15:01:42      5       4   First Data Set
 PROCLIB.TWO   13MAY98:15:01:44   13MAY98:15:01:44      6       5   Second Data Set

                                Variables Summary

                Number of Variables in Common: 5.
                Number of Variables in PROCLIB.TWO but not in PROCLIB.ONE: 1.
                Number of Variables with Conflicting Types: 1.
                Number of Variables with Differing Attributes: 3.

                Listing of Common Variables with Conflicting Types

                       Variable   Dataset      Type   Length

                       student    PROCLIB.ONE  Num         8
                                  PROCLIB.TWO  Char        8

              Listing of Common Variables with Differing Attributes

            Variable  Dataset      Type  Length  Format  Label

            year      PROCLIB.ONE  Char       8          Year of Birth
                      PROCLIB.TWO  Char       8
            state     PROCLIB.ONE  Char       8
                      PROCLIB.TWO  Char       8          Home State


                                  The SAS System                                  2

                                 COMPARE Procedure
                     Comparison of PROCLIB.ONE with PROCLIB.TWO
                                   (Method=EXACT)

              Listing of Common Variables with Differing Attributes

            Variable  Dataset      Type  Length  Format  Label

            gr1       PROCLIB.ONE  Num        8  4.1
                      PROCLIB.TWO  Num        8  5.2

                                Observation Summary

                           Observation      Base  Compare

                           First Obs           1        1
                           First Unequal       1        1
                           Last  Unequal       4        4
                           Last  Match         4        4
                           Last  Obs           .        5

          Number of Observations in Common: 4.
          Number of Observations in PROCLIB.TWO but not in PROCLIB.ONE: 1.
          Total Number of Observations Read from PROCLIB.ONE: 4.
          Total Number of Observations Read from PROCLIB.TWO: 5.

          Number of Observations with Some Compared Variables Unequal: 4.
          Number of Observations with All Compared Variables Equal: 0.


                                  The SAS System                                  3

                                 COMPARE Procedure
                     Comparison of PROCLIB.ONE with PROCLIB.TWO
                                   (Method=EXACT)

                             Values Comparison Summary

          Number of Variables Compared with All Observations Equal: 1.
          Number of Variables Compared with Some Observations Unequal: 3.
          Total Number of Values which Compare Unequal: 6.
          Maximum Difference: 20.

                           Variables with Unequal Values

                Variable  Type  Len   Compare Label  Ndif     MaxDif

                state     CHAR    8   Home State        2
                gr1       NUM     8                     2      1.000
                gr2       NUM     8                     2     20.000

"Procedure Output" on page 244 shows the default output for these two data sets. "Example 1" on page 255 shows the complete output for these two data sets.
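As a further sketch of customization, the following step changes both the kind of comparison and the amount of detail. It assumes that the PROCLIB libref is assigned and that the two data sets shown earlier exist. METHOD=ABSOLUTE with CRITERION=1 is an illustrative choice, not a setting taken from this chapter: it judges numeric values as equal when their absolute difference does not exceed 1. BRIEFSUMMARY shortens the report, and the VAR statement restricts the comparison to the grade variables:

```sas
/* Assumes the PROCLIB libref is assigned and both data sets exist. */
proc compare base=proclib.one compare=proclib.two
             method=absolute criterion=1   /* illustrative tolerance: |difference| <= 1 counts as equal */
             briefsummary;                 /* print only a short summary of the comparison */
   var gr1 gr2;                            /* compare only the grade variables */
run;
```

With these settings, small grade differences such as 85 versus 84 should no longer be reported as unequal, while larger ones still are.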




Base SAS 9.1.3 Procedures Guide (Vol. 1)
Base SAS 9.1 Procedures Guide, Volumes 1, 2, 3 and 4
ISBN: 1590472047
Year: 2004
Pages: 260
