Procedure features:
PROC MEANS statement options:
statistic keywords
FW=
VAR statement
This example
specifies the analysis variables
computes the statistics for the specified keywords and displays them in order
specifies the field width of the statistics.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Create the CAKE data set. CAKE contains data from a cake-baking contest: each participant's last name, age, score for presentation, score for taste, cake flavor, and number of cake layers. The number of cake layers is missing for two observations. The cake flavor is missing for another observation.
data cake;
   input LastName $ 1-12 Age 13-14 PresentScore 16-17
         TasteScore 19-20 Flavor $ 23-32 Layers 34;
   datalines;
Orlando     27 93 80  Vanilla    1
Ramey       32 84 72  Rum        2
Goldston    46 68 75  Vanilla    1
Roe         38 79 73  Vanilla    2
Larsen      23 77 84  Chocolate  .
Davis       51 86 91  Spice      3
Strickland  19 82 79  Chocolate  1
Nguyen      57 77 84  Vanilla    .
Hildenbrand 33 81 83  Chocolate  1
Byron       62 72 87  Vanilla    2
Sanders     26 56 79  Chocolate  1
Jaeger      43 66 74             1
Davis       28 69 75  Chocolate  2
Conrad      69 85 94  Vanilla    1
Walters     55 67 72  Chocolate  2
Rossburger  28 78 81  Spice      2
Matthew     42 81 92  Chocolate  2
Becker      36 62 83  Spice      2
Anderson    27 87 85  Chocolate  1
Merritt     62 73 84  Chocolate  1
;
Specify the analyses and the analysis options. The statistic keywords specify the statistics and their order in the output. FW= uses a field width of eight to display the statistics.
proc means data=cake n mean max min range std fw=8;
Specify the analysis variables. The VAR statement specifies that PROC MEANS calculate statistics on the PresentScore and TasteScore variables.
var PresentScore TasteScore;
Specify the title.
title 'Summary of Presentation and Taste Scores';
run;
PROC MEANS lists PresentScore first because this is the first variable that is specified in the VAR statement. A field width of eight truncates the statistics to four decimal places.
               Summary of Presentation and Taste Scores                1

                          The MEANS Procedure

Variable         N      Mean   Maximum   Minimum     Range   Std Dev
---------------------------------------------------------------------
PresentScore    20   76.1500   93.0000   56.0000   37.0000    9.3768
TasteScore      20   81.3500   94.0000   72.0000   22.0000    6.6116
---------------------------------------------------------------------
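As a cross-check outside SAS, the following minimal Python sketch recomputes the six requested statistics from the CAKE score columns. It is not part of the SAS program. The helper name summarize is hypothetical, and the sketch assumes that STD is the sample standard deviation (divisor n - 1), which is what statistics.stdev computes.

```python
import statistics

# (PresentScore, TasteScore) pairs copied from the CAKE datalines.
scores = [
    (93, 80), (84, 72), (68, 75), (79, 73), (77, 84),
    (86, 91), (82, 79), (77, 84), (81, 83), (72, 87),
    (56, 79), (66, 74), (69, 75), (85, 94), (67, 72),
    (78, 81), (81, 92), (62, 83), (87, 85), (73, 84),
]


def summarize(values):
    """Return the statistics in the order used in the PROC MEANS statement."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "max": max(values),
        "min": min(values),
        "range": max(values) - min(values),
        "std": statistics.stdev(values),  # sample standard deviation
    }


present = summarize([p for p, _ in scores])
taste = summarize([t for _, t in scores])

for name, stats in (("PresentScore", present), ("TasteScore", taste)):
    # Four decimal places, to mirror the listing above.
    print(name, " ".join(f"{value:.4f}" for value in stats.values()))
```

The printed values can be compared column by column with the listing above.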
Procedure features:
PROC MEANS statement option:
MAXDEC=
CLASS statement
TYPES statement
This example
analyzes the data for the two-way combination of class variables and across all observations
limits the number of decimal places for the displayed statistics.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Create the GRADE data set. GRADE contains each student's last name, gender, status of either undergraduate (1) or graduate (2), expected year of graduation, class section (A or B), final exam score, and final grade for the course.
data grade;
   input Name $ 1-8 Gender $ 11 Status $ 13 Year $ 15-16
         Section $ 18 Score 20-21 FinalGrade 23-24;
   datalines;
Abbott    F 2 97 A 90 87
Branford  M 1 98 A 92 97
Crandell  M 2 98 B 81 71
Dennison  M 1 97 A 85 72
Edgar     F 1 98 B 89 80
Faust     M 1 97 B 78 73
Greeley   F 2 97 A 82 91
Hart      F 1 98 B 84 80
Isley     M 2 97 A 88 86
Jasper    M 1 97 B 91 93
;
Generate the default statistics and specify the analysis options. Because no statistics are specified in the PROC MEANS statement, all default statistics (N, MEAN, STD, MIN, MAX) are generated. MAXDEC= limits the displayed statistics to three decimal places.
proc means data=grade maxdec=3;
Specify the analysis variable. The VAR statement specifies that PROC MEANS calculate statistics on the Score variable.
var Score;
Specify subgroups for the analysis. The CLASS statement separates the analysis into subgroups. Each combination of unique values for Status and Year represents a subgroup.
class Status Year;
Specify which subgroups to analyze. The TYPES statement requests that the analysis be performed on all the observations in the GRADE data set as well as the two-way combination of Status and Year, which results in four subgroups (because Status and Year each have two unique values).
types () status*year;
Specify the title.
title 'Final Exam Grades for Student Status and Year of Graduation';
run;
PROC MEANS displays the default statistics for all the observations (_TYPE_=0) and the four class levels of the Status and Year combination (Status=1, Year=97; Status=1, Year=98; Status=2, Year=97; Status=2, Year=98).
     Final Exam Grades for Student Status and Year of Graduation      1

                          The MEANS Procedure

                       Analysis Variable : Score

  N
Obs     N      Mean   Std Dev   Minimum   Maximum
--------------------------------------------------
 10    10    86.000     4.714    78.000    92.000
--------------------------------------------------

                       Analysis Variable : Score

                  N
Status   Year   Obs    N      Mean   Std Dev   Minimum   Maximum
-----------------------------------------------------------------
1        97       3    3    84.667     6.506    78.000    91.000
         98       3    3    88.333     4.041    84.000    92.000
2        97       3    3    86.667     4.163    82.000    90.000
         98       1    1    81.000      .       81.000    81.000
-----------------------------------------------------------------
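The two requests in the TYPES statement can be pictured as two groupings of the same observations. The following Python sketch is an illustration only, not SAS. It builds one group that holds every observation, which corresponds to the empty parentheses, and one group for each observed pair of Status and Year values. The variable names are hypothetical.

```python
from collections import defaultdict
import statistics

# (Status, Year, Score) triples copied from the GRADE datalines.
grade = [
    ("2", "97", 90), ("1", "98", 92), ("2", "98", 81), ("1", "97", 85),
    ("1", "98", 89), ("1", "97", 78), ("2", "97", 82), ("1", "98", 84),
    ("2", "97", 88), ("1", "97", 91),
]

# TYPES (): a single group that contains all observations.
overall = [score for _, _, score in grade]

# TYPES status*year: one group per observed (Status, Year) pair.
by_status_year = defaultdict(list)
for status, year, score in grade:
    by_status_year[(status, year)].append(score)

print("all", len(overall), f"{statistics.mean(overall):.3f}")
for key in sorted(by_status_year):
    values = by_status_year[key]
    # A standard deviation needs at least two values; the listing shows
    # a missing value (.) for the subgroup with one observation.
    std = f"{statistics.stdev(values):.3f}" if len(values) > 1 else "."
    print(key, len(values), f"{statistics.mean(values):.3f}", std)
```

The subgroup with a single observation has no standard deviation, which is why the listing shows a period in that cell.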
Procedure features:
PROC MEANS statement option:
statistic keywords
BY statement
CLASS statement
Other features:
SORT procedure
Data set: GRADE on page 561
This example
separates the analysis for the combination of class variables within BY values
shows the sort order requirement for the BY statement
calculates the minimum, maximum, and median.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Sort the GRADE data set. PROC SORT sorts the observations by the variable Section. Sorting is required in order to use Section as a BY variable in the PROC MEANS step.
proc sort data=Grade out=GradeBySection;
   by section;
run;
Specify the analyses. The statistic keywords specify the statistics and their order in the output.
proc means data=GradeBySection min max median;
Divide the data set into BY groups. The BY statement produces a separate analysis for each value of Section.
by Section;
Specify the analysis variable. The VAR statement specifies that PROC MEANS calculate statistics on the Score variable.
var Score;
Specify subgroups for the analysis. The CLASS statement separates the analysis by the values of Status and Year. Because there is no TYPES statement in this program, analyses are performed for each subgroup within each BY group.
class Status Year;
Specify the titles.
title1 'Final Exam Scores for Student Status and Year of Graduation';
title2 ' Within Each Section';
run;
      Final Exam Scores for Student Status and Year of Graduation      1
                          Within Each Section

---------------------------------- Section=A -----------------------------------

                          The MEANS Procedure

                       Analysis Variable : Score

                    N
Status    Year    Obs        Minimum        Maximum         Median
---------------------------------------------------------------------
1         97        1     85.0000000     85.0000000     85.0000000
          98        1     92.0000000     92.0000000     92.0000000
2         97        3     82.0000000     90.0000000     88.0000000
---------------------------------------------------------------------

---------------------------------- Section=B -----------------------------------

                       Analysis Variable : Score

                    N
Status    Year    Obs        Minimum        Maximum         Median
---------------------------------------------------------------------
1         97        2     78.0000000     91.0000000     84.5000000
          98        2     84.0000000     89.0000000     86.5000000
2         98        1     81.0000000     81.0000000     81.0000000
---------------------------------------------------------------------
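BY-group processing works on consecutive runs of the BY variable, which is why the data must be sorted first. Python's itertools.groupby behaves in a comparable way, so the following sketch uses it as an analogy. It is an illustration under that assumption, not a description of how PROC MEANS is implemented, and the function name by_group_stats is hypothetical.

```python
from itertools import groupby
import statistics

# (Section, Status, Year, Score) copied from the GRADE datalines.
grade = [
    ("A", "2", "97", 90), ("A", "1", "98", 92), ("B", "2", "98", 81),
    ("A", "1", "97", 85), ("B", "1", "98", 89), ("B", "1", "97", 78),
    ("A", "2", "97", 82), ("B", "1", "98", 84), ("A", "2", "97", 88),
    ("B", "1", "97", 91),
]


def by_group_stats(rows):
    """Summarize Score per (Status, Year) cell inside each run of Section."""
    results = {}
    for section, run in groupby(rows, key=lambda row: row[0]):
        cells = {}
        for _, status, year, score in run:
            cells.setdefault((status, year), []).append(score)
        for (status, year), values in cells.items():
            results[(section, status, year)] = (
                min(values), max(values), statistics.median(values))
    return results


# Like PROC SORT: order by Section so each section is one consecutive run.
sorted_rows = sorted(grade, key=lambda row: row[0])
summary = by_group_stats(sorted_rows)
for key in sorted(summary):
    print(key, summary[key])

# On unsorted data, groupby yields many short runs instead of two groups.
# (SAS stops with an error in this situation rather than splitting runs.)
unsorted_runs = [section for section, _ in groupby(grade, key=lambda row: row[0])]
print(unsorted_runs)
```

The sorted input produces the six cells shown in the listing, while the unsorted input breaks into alternating runs of A and B.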
Procedure features:
PROC MEANS statement options:
CLASSDATA=
EXCLUSIVE
FW=
MAXDEC=
PRINTALLTYPES
CLASS statement
Data set: CAKE on page 559
This example
specifies the field width and decimal places of the displayed statistics
uses only the values in the CLASSDATA= data set as the levels of the combinations of class variables
calculates the range, median, minimum, and maximum
displays all combinations of the class variables in the analysis.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Create the CAKETYPE data set. CAKETYPE contains the cake flavors and number of layers that must occur in the PROC MEANS output.
data caketype;
   input Flavor $ 1-10 Layers 12;
   datalines;
Vanilla    1
Vanilla    2
Vanilla    3
Chocolate  1
Chocolate  2
Chocolate  3
;
Specify the analyses and the analysis options. The FW= option uses a field width of seven and the MAXDEC= option uses zero decimal places to display the statistics. CLASSDATA= and EXCLUSIVE restrict the class levels to the values that are in the CAKETYPE data set. PRINTALLTYPES displays all combinations of class variables in the output.
proc means data=cake range median min max fw=7 maxdec=0
           classdata=caketype exclusive printalltypes;
Specify the analysis variable. The VAR statement specifies that PROC MEANS calculate statistics on the TasteScore variable.
var TasteScore;
Specify subgroups for analysis. The CLASS statement separates the analysis by the values of Flavor and Layers. Note that these variables, and only these variables, must appear in the CAKETYPE data set.
class flavor layers;
Specify the title.
title 'Taste Score For Number of Layers and Cake Flavor';
run;
PROC MEANS calculates statistics for the 13 chocolate and vanilla cakes. Because the CLASSDATA= data set contains 3 as the value of Layers, PROC MEANS uses 3 as a class value even though the frequency is zero.
            Taste Score For Number of Layers and Cake Flavor           1

                          The MEANS Procedure

                    Analysis Variable : TasteScore

  N
Obs     Range    Median   Minimum   Maximum
--------------------------------------------
 13        22        80        72        94
--------------------------------------------

                    Analysis Variable : TasteScore

            N
Layers    Obs     Range    Median   Minimum   Maximum
------------------------------------------------------
     1      8        19        82        75        94
     2      5        20        75        72        92
     3      0         .         .         .         .
------------------------------------------------------

                    Analysis Variable : TasteScore

               N
Flavor       Obs     Range    Median   Minimum   Maximum
---------------------------------------------------------
Chocolate      8        20        81        72        92
Vanilla        5        21        80        73        94
---------------------------------------------------------

                    Analysis Variable : TasteScore

                        N
Flavor      Layers    Obs     Range    Median   Minimum   Maximum
------------------------------------------------------------------
Chocolate        1      5         6        83        79        85
                 2      3        20        75        72        92
                 3      0         .         .         .         .
Vanilla          1      3        19        80        75        94
                 2      2        14        80        73        87
                 3      0         .         .         .         .
------------------------------------------------------------------
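The N Obs column of the last table can be reproduced with a short Python sketch. It is an illustration only, not SAS. It treats the CAKETYPE rows as the complete list of permitted class combinations, keeps only the observations that match one of them, and then reports every listed combination, including those with a frequency of zero. The variable names are hypothetical.

```python
from collections import Counter
from itertools import product

# (Flavor, Layers) copied from the CAKE datalines; None marks a missing value.
cake = [
    ("Vanilla", 1), ("Rum", 2), ("Vanilla", 1), ("Vanilla", 2),
    ("Chocolate", None), ("Spice", 3), ("Chocolate", 1), ("Vanilla", None),
    ("Chocolate", 1), ("Vanilla", 2), ("Chocolate", 1), (None, 1),
    ("Chocolate", 2), ("Vanilla", 1), ("Chocolate", 2), ("Spice", 2),
    ("Chocolate", 2), ("Spice", 2), ("Chocolate", 1), ("Chocolate", 1),
]

# The six class combinations listed in the CAKETYPE data set.
caketype = list(product(["Vanilla", "Chocolate"], [1, 2, 3]))

# EXCLUSIVE: keep only observations whose class values appear in CAKETYPE.
allowed = set(caketype)
kept = [row for row in cake if row in allowed]

# CLASSDATA=: report every listed combination, even with no matching rows.
counts = Counter(kept)
n_obs = {combo: counts.get(combo, 0) for combo in caketype}

print(len(kept))
for combo, n in n_obs.items():
    print(combo, n)
```

Rows with a missing flavor or a missing number of layers never match a CAKETYPE row, so they drop out in the same step as the rum and spice cakes.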
Procedure features:
PROC MEANS statement options:
statistic keywords
FW=
NONOBS
CLASS statement options:
MLF
ORDER=
TYPES statement
Other features:
FORMAT procedure
FORMAT statement
Data set: CAKE on page 559
This example
computes the statistics for the specified keywords and displays them in order
specifies the field width of the statistics
suppresses the column with the total number of observations
analyzes the data for the one-way combination of cake flavor and the two-way combination of cake flavor and participant's age
assigns user-defined formats to the class variables
uses multilabel formats as the levels of class variables
orders the levels of the cake flavors by the descending frequency count and orders the levels of age by the ascending formatted values.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=64;
Create the $FLVRFMT. and AGEFMT. formats. PROC FORMAT creates user-defined formats to categorize the cake flavors and ages of the participants. MULTILABEL creates a multilabel format for Age. A multilabel format is one in which multiple labels can be assigned to the same value, in this case because of overlapping ranges. Each value is represented in the output for each range in which it occurs.
proc format;
   value $flvrfmt
         'Chocolate'='Chocolate'
         'Vanilla'='Vanilla'
         'Rum','Spice'='Other Flavor';
   value agefmt (multilabel)
         15 - 29='below 30 years'
         30 - 50='between 30 and 50'
         51 - high='over 50 years'
         15 - 19='15 to 19'
         20 - 25='20 to 25'
         25 - 39='25 to 39'
         40 - 55='40 to 55'
         56 - high='56 and above';
run;
Specify the analyses and the analysis options. FW= uses a field width of six to display the statistics. The statistic keywords specify the statistics and their order in the output. NONOBS suppresses the N Obs column.
proc means data=cake fw=6 n min max median nonobs;
Specify subgroups for the analysis. The CLASS statements separate the analysis by values of Flavor and Age. ORDER=FREQ orders the levels of Flavor by descending frequency count. ORDER=FMT orders the levels of Age by ascending formatted values. MLF specifies that multilabel value formats be used for Age.
class flavor/order=freq;
class age /mlf order=fmt;
Specify which subgroups to analyze. The TYPES statement requests the analysis for the one-way combination of Flavor and the two-way combination of Flavor and Age.
types flavor flavor*age;
Specify the analysis variable. The VAR statement specifies that PROC MEANS calculate statistics on the TasteScore variable.
var TasteScore;
Format the output. The FORMAT statement assigns user-defined formats to the Age and Flavor variables for this analysis.
format age agefmt. flavor $flvrfmt.;
Specify the title.
title 'Taste Score for Cake Flavors and Participant''s Age';
run;
The one-way combination of class variables appears before the two-way combination. A field width of six truncates the statistics to two decimal places. For the two-way combination of Age and Flavor, the total number of observations is greater than for the one-way combination of Flavor. This situation arises because of the multilabel format for Age, which maps one internal value to more than one formatted value.
The order of the levels of Flavor is based on the frequency count for each level. The order of the levels of Age is based on the order of the user-defined formats.
           Taste Score for Cake Flavors and Participant's Age          1

                          The MEANS Procedure

                    Analysis Variable : TasteScore

Flavor           N     Min     Max   Median
--------------------------------------------
Chocolate        9   72.00   92.00    83.00
Vanilla          6   73.00   94.00    82.00
Other Flavor     4   72.00   91.00    82.00
--------------------------------------------

                    Analysis Variable : TasteScore

Flavor          Age                   N     Min     Max   Median
-----------------------------------------------------------------
Chocolate       15 to 19              1   79.00   79.00    79.00
                20 to 25              1   84.00   84.00    84.00
                25 to 39              4   75.00   85.00    81.00
                40 to 55              2   72.00   92.00    82.00
                56 and above          1   84.00   84.00    84.00
                below 30 years        5   75.00   85.00    79.00
                between 30 and 50     2   83.00   92.00    87.50
                over 50 years         2   72.00   84.00    78.00
Vanilla         25 to 39              2   73.00   80.00    76.50
                40 to 55              1   75.00   75.00    75.00
                56 and above          3   84.00   94.00    87.00
                below 30 years        1   80.00   80.00    80.00
                between 30 and 50     2   73.00   75.00    74.00
                over 50 years         3   84.00   94.00    87.00
Other Flavor    25 to 39              3   72.00   83.00    81.00
                40 to 55              1   91.00   91.00    91.00
                below 30 years        1   81.00   81.00    81.00
                between 30 and 50     2   72.00   83.00    77.50
                over 50 years         1   91.00   91.00    91.00
-----------------------------------------------------------------
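The reason that the two-way counts exceed the one-way counts can be checked with a small Python sketch of a multilabel lookup. It is an illustration only, not SAS. The function name labels_for is hypothetical, and None stands in for the HIGH keyword of the format.

```python
# Ranges and labels copied from the AGEFMT. format (None stands for HIGH).
agefmt = [
    (15, 29, "below 30 years"),
    (30, 50, "between 30 and 50"),
    (51, None, "over 50 years"),
    (15, 19, "15 to 19"),
    (20, 25, "20 to 25"),
    (25, 39, "25 to 39"),
    (40, 55, "40 to 55"),
    (56, None, "56 and above"),
]


def labels_for(age):
    """Return every label whose range contains age (a multilabel lookup)."""
    return [label for low, high, label in agefmt
            if age >= low and (high is None or age <= high)]


# Ages of the nine participants with chocolate cakes, from the CAKE datalines.
chocolate_ages = [23, 19, 33, 26, 28, 55, 42, 27, 62]

counts = {}
for age in chocolate_ages:
    for label in labels_for(age):
        counts[label] = counts.get(label, 0) + 1

# Sorting the labels as text gives the same order as the listing.
for label in sorted(counts):
    print(label, counts[label])
print(len(chocolate_ages), sum(counts.values()))
```

Each chocolate participant falls into two ranges, so the nine observations contribute eighteen counts to the two-way table.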
Procedure features:
PROC MEANS statement options:
COMPLETETYPES
FW=
MISSING
NONOBS
CLASS statement options:
EXCLUSIVE
ORDER=
PRELOADFMT
WAYS statement
Other features:
FORMAT procedure
FORMAT statement
Data set: CAKE on page 559
This example
specifies the field width of the statistics
suppresses the column with the total number of observations
includes all possible combinations of class variable values in the analysis even if the frequency is zero
considers missing values as valid class levels
analyzes the one-way and two-way combinations of class variables
assigns user-defined formats to the class variables
uses only the preloaded range of user-defined formats as the levels of class variables
orders the results by the value of the formatted data.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=64;
Create the LAYERFMT. and $FLVRFMT. formats. PROC FORMAT creates user-defined formats to categorize the number of cake layers and the cake flavors. NOTSORTED keeps $FLVRFMT unsorted to preserve the original order of the format values.
proc format;
   value layerfmt 1='single layer'
                  2-3='multi-layer'
                  .='unknown';
   value $flvrfmt (notsorted)
         'Vanilla'='Vanilla'
         'Orange','Lemon'='Citrus'
         'Spice'='Spice'
         'Rum','Mint','Almond'='Other Flavor';
run;
Generate the default statistics and specify the analysis options. FW= uses a field width of seven to display the statistics. COMPLETETYPES includes class levels with a frequency of zero. MISSING treats missing values as valid values for all class variables. NONOBS suppresses the N Obs column. Because no specific analyses are requested, all default analyses are performed.
proc means data=cake fw=7 completetypes missing nonobs;
Specify subgroups for the analysis. The CLASS statement separates the analysis by values of Flavor and Layers. PRELOADFMT and EXCLUSIVE restrict the levels to the preloaded values of the user-defined formats. Because PRELOADFMT is in effect, ORDER=DATA orders the levels of Flavor and Layers by the order in which the values are stored in the user-defined formats.
class flavor layers/preloadfmt exclusive order=data;
Specify which subgroups to analyze. The WAYS statement requests one-way and two-way combinations of class variables.
ways 1 2;
Specify the analysis variable. The VAR statement specifies that PROC MEANS calculate statistics on the TasteScore variable.
var TasteScore;
Format the output. The FORMAT statement assigns user-defined formats to the Flavor and Layers variables for this analysis.
format layers layerfmt. flavor $flvrfmt.;
Specify the title.
title 'Taste Score For Number of Layers and Cake Flavors';
run;
The one-way combination of class variables appears before the two-way combination. PROC MEANS reports only the level values that are listed in the preloaded range of user-defined formats even when the frequency of observations is zero (in this case, citrus). PROC MEANS rejects entire observations based on the exclusion of any single class value in a given observation. Therefore, when the number of layers is unknown, statistics are calculated for only one observation. The other observation is excluded because the flavor chocolate was not included in the preloaded user-defined format for Flavor.
The order of the levels is based on the order of the user-defined formats. PROC FORMAT automatically sorted the Layers format and did not sort the Flavor format.
           Taste Score For Number of Layers and Cake Flavors           1

                          The MEANS Procedure

                    Analysis Variable : TasteScore

Layers          N      Mean   Std Dev   Minimum   Maximum
----------------------------------------------------------
unknown         1    84.000      .       84.000    84.000
single layer    3    83.000     9.849    75.000    94.000
multi-layer     6    81.167     7.548    72.000    91.000
----------------------------------------------------------

                    Analysis Variable : TasteScore

Flavor          N      Mean   Std Dev   Minimum   Maximum
----------------------------------------------------------
Vanilla         6    82.167     7.834    73.000    94.000
Citrus          0      .         .         .         .
Spice           3    85.000     5.292    81.000    91.000
Other Flavor    1    72.000      .       72.000    72.000
----------------------------------------------------------

                    Analysis Variable : TasteScore

Flavor          Layers          N      Mean   Std Dev   Minimum   Maximum
--------------------------------------------------------------------------
Vanilla         unknown         1    84.000      .       84.000    84.000
                single layer    3    83.000     9.849    75.000    94.000
                multi-layer     2    80.000     9.899    73.000    87.000
Citrus          unknown         0      .         .         .         .
                single layer    0      .         .         .         .
                multi-layer     0      .         .         .         .
Spice           unknown         0      .         .         .         .
                single layer    0      .         .         .         .
                multi-layer     3    85.000     5.292    81.000    91.000
Other Flavor    unknown         0      .         .         .         .
                single layer    0      .         .         .         .
                multi-layer     1    72.000      .       72.000    72.000
--------------------------------------------------------------------------
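The interplay of preloaded formats, exclusion, and zero-frequency levels can be traced with the following Python sketch. It is an illustration only, not SAS. The two dictionaries stand in for the user-defined formats, None stands for a missing value, and the variable names are hypothetical.

```python
from itertools import product

# Stand-ins for the user-defined formats. The order of labels follows the
# order of levels in the listing above. None stands for a missing value.
flavor_fmt = {
    "Vanilla": "Vanilla", "Orange": "Citrus", "Lemon": "Citrus",
    "Spice": "Spice", "Rum": "Other Flavor", "Mint": "Other Flavor",
    "Almond": "Other Flavor",
}
layer_fmt = {None: "unknown", 1: "single layer", 2: "multi-layer", 3: "multi-layer"}

# (Flavor, Layers, TasteScore) copied from the CAKE datalines.
cake = [
    ("Vanilla", 1, 80), ("Rum", 2, 72), ("Vanilla", 1, 75), ("Vanilla", 2, 73),
    ("Chocolate", None, 84), ("Spice", 3, 91), ("Chocolate", 1, 79),
    ("Vanilla", None, 84), ("Chocolate", 1, 83), ("Vanilla", 2, 87),
    ("Chocolate", 1, 79), (None, 1, 74), ("Chocolate", 2, 75),
    ("Vanilla", 1, 94), ("Chocolate", 2, 72), ("Spice", 2, 81),
    ("Chocolate", 2, 92), ("Spice", 2, 83), ("Chocolate", 1, 85),
    ("Chocolate", 1, 84),
]

# dict.fromkeys keeps first-seen order and removes duplicate labels.
flavor_levels = list(dict.fromkeys(flavor_fmt.values()))
layer_levels = list(dict.fromkeys(layer_fmt.values()))

# COMPLETETYPES: every combination of preloaded levels starts out empty.
cells = {combo: [] for combo in product(flavor_levels, layer_levels)}

# EXCLUSIVE: drop the whole observation if either class value is not
# covered by its format.
for flavor, layers, taste in cake:
    if flavor in flavor_fmt and layers in layer_fmt:
        cells[(flavor_fmt[flavor], layer_fmt[layers])].append(taste)

for combo, scores in cells.items():
    print(combo, len(scores))
```

Only ten of the twenty observations survive, because every chocolate cake and the cake with a missing flavor fall outside the Flavor format.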
Procedure features:
PROC MEANS statement options:
ALPHA=
FW=
MAXDEC=
CLASS statement
This example
specifies the field width and number of decimal places of the statistics
computes a two-sided 90 percent confidence limit for the mean values of MoneyRaised and HoursVolunteered for the three years of data.
If this data is representative of a larger population of volunteers, then the confidence limits provide ranges of likely values for the true population means.
Create the CHARITY data set. CHARITY contains information about high-school students' volunteer work for a charity. The variables give the name of the high school, the year of the fund-raiser, the first name of each student, the amount of money each student raised, and the number of hours each student volunteered. A DATA step on page 1392 creates this data set.
data charity;
   input School $ 1-7 Year 9-12 Name $ 14-20 MoneyRaised 22-26
         HoursVolunteered 28-29;
   datalines;
Monroe  1992 Allison 31.65 19
Monroe  1992 Barry   23.76 16
Monroe  1992 Candace 21.11  5

   ... more data lines ...

Kennedy 1994 Sid     27.45 25
Kennedy 1994 Will    28.88 21
Kennedy 1994 Morty   34.44 25
;
Specify the analyses and the analysis options. FW= uses a field width of eight and MAXDEC= uses two decimal places to display the statistics. ALPHA=0.1 specifies a 90% confidence limit, and the CLM keyword requests two-sided confidence limits. MEAN and STD request the mean and the standard deviation, respectively.
proc means data=charity fw=8 maxdec=2 alpha=0.1 clm mean std;
Specify subgroups for the analysis. The CLASS statement separates the analysis by values of Year.
class Year;
Specify the analysis variables. The VAR statement specifies that PROC MEANS calculate statistics on the MoneyRaised and HoursVolunteered variables.
var MoneyRaised HoursVolunteered;
Specify the titles.
title 'Confidence Limits for Fund Raising Statistics';
title2 '1992-94';
run;
PROC MEANS displays the lower and upper confidence limits for both variables for each year.
             Confidence Limits for Fund Raising Statistics             1
                                1992-94

                          The MEANS Procedure

          N                          Lower 90%     Upper 90%
Year    Obs   Variable             CL for Mean   CL for Mean      Mean   Std Dev
---------------------------------------------------------------------------------
1992     31   MoneyRaised                25.21         32.40     28.80     11.79
              HoursVolunteered           17.67         23.17     20.42      9.01
1993     32   MoneyRaised                25.17         31.58     28.37     10.69
              HoursVolunteered           15.86         20.02     17.94      6.94
1994     46   MoneyRaised                26.73         33.78     30.26     14.23
              HoursVolunteered           19.68         22.63     21.15      5.96
---------------------------------------------------------------------------------
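A two-sided confidence limit for a mean is commonly computed as the mean plus or minus a Student's t critical value times the standard error. The following Python sketch applies that textbook formula to the 1992 MoneyRaised row as a rough cross-check. It is not SAS, and it rests on several assumptions: the critical value 1.6973 (the 0.95 quantile of t with 30 degrees of freedom) is taken from a standard t table, N is assumed to equal the 31 shown under N Obs, and the inputs are the rounded values from the listing, so the result can differ from the listing in the last digit. The function name confidence_limits is hypothetical.

```python
import math


def confidence_limits(mean, std, n, t_critical):
    """Two-sided limits for a mean: mean -/+ t_critical * std / sqrt(n)."""
    half_width = t_critical * std / math.sqrt(n)
    return mean - half_width, mean + half_width


# 1992 MoneyRaised row of the listing: N Obs = 31, Mean = 28.80, Std Dev = 11.79.
# Assumption: 1.6973 is the 0.95 quantile of Student's t with 30 degrees of
# freedom; ALPHA=0.1 leaves 5 percent in each tail.
lower, upper = confidence_limits(28.80, 11.79, 31, 1.6973)
print(f"{lower:.2f} {upper:.2f}")
```

The computed limits agree with the displayed 25.21 and 32.40 to within the rounding of the inputs.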
Procedure features:
PROC MEANS statement option:
NOPRINT
CLASS statement
OUTPUT statement options
statistic keywords
IDGROUP
LEVELS
WAYS
Other features:
PRINT procedure
Data set: GRADE on page 561
This example
suppresses the display of PROC MEANS output
stores the average final grade in a new variable
stores the name of the student with the best final exam scores in a new variable
stores the number of class variables that are combined in the _WAY_ variable
stores the value of the class level in the _LEVEL_ variable
displays the output data set.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Specify the analysis options. NOPRINT suppresses the display of all PROC MEANS output.
proc means data=Grade noprint;
Specify subgroups for the analysis. The CLASS statement separates the analysis by values of Status and Year.
class Status Year;
Specify the analysis variable. The VAR statement specifies that PROC MEANS calculate statistics on the FinalGrade variable.
var FinalGrade;
Specify the output data set options. The OUTPUT statement creates the SUMSTAT data set and writes the mean value for the final grade to the new variable AverageGrade. IDGROUP writes the name of the student with the top exam score to the variable BestScore and writes the number of the observation that contained the top score to the variable _OBS_. WAYS and LEVELS write information on how the class variables are combined.
output out=sumstat mean=AverageGrade
       idgroup (max(score) obs out (name)=BestScore)
       / ways levels;
run;
Print the output data set WORK.SUMSTAT. The NOOBS option suppresses the observation numbers.
proc print data=sumstat noobs;
   title1 'Average Undergraduate and Graduate Course Grades';
   title2 'For Two Years';
run;
The first observation contains the average course grade and the name of the student with the highest exam score over the two-year period. The next four observations contain values for each class variable value. The remaining four observations contain values for the Year and Status combination. The variables _WAY_, _TYPE_, and _LEVEL_ show how PROC MEANS created the class variable combinations. The variable _OBS_ contains the observation number in the GRADE data set that contained the highest exam score.
           Average Undergraduate and Graduate Course Grades            1
                             For Two Years

                                                     Average    Best
Status    Year    _WAY_    _TYPE_    _LEVEL_    _FREQ_     Grade    Score       _OBS_

                    0         0         1         10     83.0000    Branford       2
          97        1         1         1          6     83.6667    Jasper        10
          98        1         1         2          4     82.0000    Branford       2
1                   1         2         1          6     82.5000    Branford       2
2                   1         2         2          4     83.7500    Abbott         1
1         97        2         3         1          3     79.3333    Jasper        10
1         98        2         3         2          3     85.6667    Branford       2
2         97        2         3         3          3     88.0000    Abbott         1
2         98        2         3         4          1     71.0000    Crandell       3
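The _WAY_ and _TYPE_ values in this listing follow a pattern that a short Python sketch can reproduce. The sketch is an illustration only, not SAS. It assumes that _TYPE_ behaves as a bit mask in which the rightmost variable in the CLASS statement owns the lowest bit, which is consistent with the values 1 for Year, 2 for Status, and 3 for both. The function name type_value is hypothetical.

```python
from itertools import combinations

class_vars = ["Status", "Year"]  # order of the CLASS statement


def type_value(subset):
    """_TYPE_ as a bit mask: the rightmost CLASS variable is the lowest bit."""
    bits = 0
    for position, name in enumerate(reversed(class_vars)):
        if name in subset:
            bits |= 1 << position
    return bits


rows = []
for way in range(len(class_vars) + 1):
    # _WAY_ is the number of class variables that are combined.
    for subset in combinations(class_vars, way):
        rows.append((way, type_value(subset), subset))

for way, type_, subset in sorted(rows, key=lambda row: row[1]):
    print(way, type_, subset)
```

The four printed rows correspond to the overall observation, the Year levels, the Status levels, and the Status and Year combinations, in the same order as the listing.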
Procedure features:
PROC MEANS statement options:
DESCEND
NOPRINT
CLASS statement
OUTPUT statement options:
statistic keywords
Other features:
PRINT procedure
WHERE= data set option
Data set: GRADE on page 561
This example
suppresses the display of PROC MEANS output
stores the statistics for the class level and combinations of class variables that are specified by WHERE= in the output data set
orders observations in the output data set by descending _TYPE_ value
stores the mean exam scores and mean final grades without assigning new variable names
stores the median final grade in a new variable
displays the output data set.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Specify the analysis options. NOPRINT suppresses the display of all PROC MEANS output. DESCEND orders the observations in the OUT= data set by descending _TYPE_ value.
proc means data=Grade noprint descend;
Specify subgroups for the analysis. The CLASS statement separates the analysis by values of Status and Year.
class Status Year;
Specify the analysis variables. The VAR statement specifies that PROC MEANS calculate statistics on the Score and FinalGrade variables.
var Score FinalGrade;
Specify the output data set options. The OUTPUT statement writes the mean for Score and FinalGrade to variables of the same name. The median final grade is written to the variable MedianGrade. The WHERE= data set option restricts the observations in SUMDATA. One observation contains overall statistics (_TYPE_=0). The remainder must have a status of 1.
output out=Sumdata (where=(status='1' or _type_=0))
       mean= median(finalgrade)=MedianGrade;
run;
Print the output data set WORK.SUMDATA.
proc print data=Sumdata;
   title 'Exam and Course Grades for Undergraduates Only';
   title2 'and for All Students';
run;
The first three observations contain statistics for the class variable levels with a status of 1. The last observation contains the statistics for all the observations (no subgroup). Score contains the mean test score and FinalGrade contains the mean final grade.
            Exam and Course Grades for Undergraduates Only             1
                         and for All Students

                                                        Final     Median
Obs    Status    Year    _TYPE_    _FREQ_     Score     Grade      Grade

 1       1        97        3         3      84.6667   79.3333      73
 2       1        98        3         3      88.3333   85.6667      80
 3       1                  2         6      86.5000   82.5000      80
 4                          0        10      86.0000   83.0000      83
Procedure features:
PROC MEANS statement options:
CHARTYPE
NOPRINT
NWAY
CLASS statement options:
ASCENDING
MISSING
ORDER=
OUTPUT statement
Other features:
PRINT procedure
Data set: CAKE on page 559
This example
suppresses the display of PROC MEANS output
considers missing values as valid level values for only one class variable
orders observations in the output data set by the ascending frequency for a single class variable
stores observations for only the highest _TYPE_ value
stores _TYPE_ as binary character values
stores the maximum taste score in a new variable
displays the output data set.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Specify the analysis options. NWAY limits the output data set to the observations with the highest _TYPE_ value. NOPRINT suppresses the display of all PROC MEANS output.
proc means data=cake nway noprint;
Specify subgroups for the analysis. The CLASS statements separate the analysis by Flavor and Layers. ORDER=FREQ and ASCENDING order the levels of Flavor by ascending frequency. MISSING uses missing values of Layers as a valid class level value.
class flavor /order=freq ascending;
class layers /missing;
Specify the analysis variable. The VAR statement specifies that PROC MEANS calculate statistics on the TasteScore variable.
var TasteScore;
Specify the output data set options. The OUTPUT statement creates the CAKESTAT data set and outputs the maximum value for the taste score to the new variable HighScore.
output out=cakestat max=HighScore;
run;
Print the output data set WORK.CAKESTAT.
proc print data=cakestat;
   title 'Maximum Taste Score for Flavor and Cake Layers';
run;
The CAKESTAT output data set contains only observations for the combination of both class variables, Flavor and Layers. Therefore, the value of _TYPE_ is 3 for all observations. The observations are ordered by ascending frequency of Flavor. The missing value in Layers is a valid value for this class variable. PROC MEANS excludes the observation with the missing flavor because it is an invalid value for Flavor.
            Maximum Taste Score for Flavor and Cake Layers             1

                                                  High
Obs    Flavor       Layers    _TYPE_    _FREQ_    Score

 1     Rum             2         3         1        72
 2     Spice           2         3         2        83
 3     Spice           3         3         1        91
 4     Vanilla         .         3         1        84
 5     Vanilla         1         3         3        94
 6     Vanilla         2         3         2        87
 7     Chocolate       .         3         1        84
 8     Chocolate       1         3         5        85
 9     Chocolate       2         3         3        92
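The row order and contents of CAKESTAT can be traced with the following Python sketch. It is an illustration only, not SAS. It drops the observation with a missing flavor, keeps a missing number of layers as its own level, keeps only the cells that combine both class variables, and orders the flavors by ascending frequency. The names sort_key and ordered are hypothetical, and None stands for a missing value.

```python
from collections import Counter

# (Flavor, Layers, TasteScore) copied from the CAKE datalines.
cake = [
    ("Vanilla", 1, 80), ("Rum", 2, 72), ("Vanilla", 1, 75), ("Vanilla", 2, 73),
    ("Chocolate", None, 84), ("Spice", 3, 91), ("Chocolate", 1, 79),
    ("Vanilla", None, 84), ("Chocolate", 1, 83), ("Vanilla", 2, 87),
    ("Chocolate", 1, 79), (None, 1, 74), ("Chocolate", 2, 75),
    ("Vanilla", 1, 94), ("Chocolate", 2, 72), ("Spice", 2, 81),
    ("Chocolate", 2, 92), ("Spice", 2, 83), ("Chocolate", 1, 85),
    ("Chocolate", 1, 84),
]

# Flavor has no MISSING option, so a missing flavor excludes the observation.
valid = [row for row in cake if row[0] is not None]

# ORDER=FREQ with ASCENDING: the least frequent flavor comes first.
flavor_freq = Counter(flavor for flavor, _, _ in valid)

# NWAY: keep only the cells that use both class variables.
cells = {}
for flavor, layers, taste in valid:
    cells.setdefault((flavor, layers), []).append(taste)


def sort_key(cell):
    flavor, layers = cell
    # A missing Layers value sorts before any number.
    return (flavor_freq[flavor], -1 if layers is None else layers)


ordered = sorted(cells, key=sort_key)
for cell in ordered:
    print(cell, len(cells[cell]), max(cells[cell]))
```

The printed cells, frequencies, and maximum scores can be compared row by row with the listing above.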
Procedure features:
CLASS statement
OUTPUT statement options:
statistic keyword
MAXID
Other features:
PRINT procedure
Data set: CHARITY on page 574
This example
identifies the observations with maximum values for two variables
creates new variables for the maximum values
displays the output data set.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Specify the analyses. The statistic keywords specify the statistics and their order in the output. CHARTYPE writes the _TYPE_ values as character strings of binary digits in the output data set.
proc means data=Charity n mean range chartype;
Specify subgroups for the analysis. The CLASS statement separates the analysis by School and Year.
class School Year;
Specify the analysis variables. The VAR statement specifies that PROC MEANS calculate statistics on the MoneyRaised and HoursVolunteered variables.
var MoneyRaised HoursVolunteered;
Specify the output data set options. The OUTPUT statement writes the new variables, MostCash and MostTime, which contain the names of the students who collected the most money and volunteered the most time, respectively, to the PRIZE data set.
output out=Prize maxid(MoneyRaised(name) HoursVolunteered(name))= MostCash MostTime max= ;
Specify the title.
title 'Summary of Volunteer Work by School and Year'; run;
Print the WORK.PRIZE output data set.
proc print data=Prize; title 'Best Results: Most Money Raised and Most Hours Worked'; run;
The first page of output shows the output from PROC MEANS with the statistics for six class levels: three for Kennedy High, one for each of the years 1992, 1993, and 1994, and three for Monroe High for the same three years.
                 Summary of Volunteer Work by School and Year                  1

                              The MEANS Procedure

                    N
School     Year   Obs   Variable             N           Mean          Range
------------------------------------------------------------------------------
Kennedy    1992    15   MoneyRaised         15     29.0800000     39.7500000
                        HoursVolunteered    15     22.1333333     30.0000000

           1993    20   MoneyRaised         20     28.5660000     23.5600000
                        HoursVolunteered    20     19.2000000     20.0000000

           1994    18   MoneyRaised         18     31.5794444     65.4400000
                        HoursVolunteered    18     24.2777778     15.0000000

Monroe     1992    16   MoneyRaised         16     28.5450000     48.2700000
                        HoursVolunteered    16     18.8125000     38.0000000

           1993    12   MoneyRaised         12     28.0500000     52.4600000
                        HoursVolunteered    12     15.8333333     21.0000000

           1994    28   MoneyRaised         28     29.4100000     73.5300000
                        HoursVolunteered    28     19.1428571     26.0000000
------------------------------------------------------------------------------
The output from PROC PRINT shows the maximum MoneyRaised and HoursVolunteered values and the names of the students who are responsible for them. The first observation contains the overall results, the next three contain the results by year, the next two contain the results by school, and the final six contain the results by School and Year.
            Best Results: Most Money Raised and Most Hours Worked            2

                                           Most       Most        Money        Hours
Obs   School     Year   _TYPE_   _FREQ_    Cash       Time       Raised    Volunteered

  1                 .     00       109     Willard    Tonya       78.65         40
  2              1992     01        31     Tonya      Tonya       55.16         40
  3              1993     01        32     Cameron    Amy         65.44         31
  4              1994     01        46     Willard    L.T.        78.65         33
  5   Kennedy       .     10        53     Luther     Jay         72.22         35
  6   Monroe        .     10        56     Willard    Tonya       78.65         40
  7   Kennedy    1992     11        15     Thelma     Jay         52.63         35
  8   Kennedy    1993     11        20     Bill       Amy         42.23         31
  9   Kennedy    1994     11        18     Luther     Che-Min     72.22         33
 10   Monroe     1992     11        16     Tonya      Tonya       55.16         40
 11   Monroe     1993     11        12     Cameron    Myrtle      65.44         26
 12   Monroe     1994     11        28     Willard    L.T.        78.65         33
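Because CHARTYPE stores _TYPE_ as a character string, one binary digit per class variable, a WHERE statement can select a single level of the analysis from the PRIZE data set. The following is a minimal sketch, not part of the original example, that prints only the School and Year combinations. The title text is illustrative.

```sas
/* Sketch: keep only the rows for the School-by-Year combinations.     */
/* With CHARTYPE, _TYPE_ is character, so the value is quoted.         */
proc print data=Prize noobs;
   where _type_ = '11';
   var School Year MostCash MostTime MoneyRaised HoursVolunteered;
   title 'Best Results by School and Year';
run;
```

Comparing against '10' or '01' instead would select the results by school or by year, respectively, as shown in the _TYPE_ column of the output above.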
Procedure features:
PROC MEANS statement option:
NOPRINT
CLASS statement
OUTPUT statement options:
statistic keywords
AUTOLABEL
AUTONAME
IDGROUP
TYPES statement
Other features:
FORMAT procedure
FORMAT statement
PRINT procedure
RENAME = data set option
Data set: CHARITY on page 574
This example
suppresses the display of PROC MEANS output
analyzes the data for the one-way combination of the class variables and across all observations
stores the total and average amount of money raised in new variables
stores in new variables the top three amounts of money raised, the names of the three students who raised the money, the years when it occurred, and the schools the students attended
automatically resolves conflicts in the variable names when names are assigned to the new variables in the output data set
appends the statistic name to the label of the variables in the output data set that contain statistics that were computed for the analysis variable
assigns a format to the analysis variable so that the statistics that are computed from this variable inherit the attribute in the output data set
renames the _FREQ_ variable in the output data set
displays the output data set and its contents.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=60;
Create the YRFMT. and $SCHFMT. formats. PROC FORMAT creates user-defined formats that assign the value of All to the missing levels of the class variables.
proc format; value yrFmt . = " All"; value $schFmt ' ' = "All "; run;
Generate the default statistics and specify the analysis options. NOPRINT suppresses the display of all PROC MEANS output.
proc means data=Charity noprint;
Specify subgroups for the analysis. The CLASS statement separates the analysis by values of School and Year.
class School Year;
Specify which subgroups to analyze. The TYPES statement requests the analysis across all the observations and for each one-way combination of School and Year.
types () school year;
Specify the analysis variable. The VAR statement specifies that PROC MEANS calculate statistics on the MoneyRaised variable.
var MoneyRaised;
Specify the output data set options. The OUTPUT statement creates the TOP3LIST data set. RENAME= renames the _FREQ_ variable, which contains the frequency count for each class level. SUM= and MEAN= specify that the sum and mean of the analysis variable (MoneyRaised) are written to the output data set. IDGROUP writes 12 variables that contain the top three amounts of money raised and the three corresponding students, schools, and years. AUTOLABEL appends the statistic name to the label of the output variables that contain the sum and mean. AUTONAME resolves naming conflicts for these variables.
output out=top3list(rename=(_freq_=NumberStudents)) sum= mean= idgroup(max(moneyraised) out[3] (moneyraised name school year)=) / autolabel autoname;
Format the output. The LABEL statement assigns a label to the analysis variable MoneyRaised. The FORMAT statement assigns user-defined formats to the Year and School variables and a SAS dollar format to the MoneyRaised variable.
label MoneyRaised='Amount Raised'; format year yrfmt. school $schfmt. moneyraised dollar8.2; run;
Print the output data set WORK.TOP3LIST.
proc print data=top3list; title1 'School Fund Raising Report'; title2 'Top Three Students'; run;
Display information about the TOP3LIST data set. PROC DATASETS displays the contents of the TOP3LIST data set. NOLIST suppresses the directory listing for the WORK data library.
proc datasets library=work nolist; contents data=top3list; title1 'Contents of the PROC MEANS Output Data Set'; run;
The output from PROC PRINT shows the top three values of MoneyRaised, the names of the students who raised these amounts, the schools the students attended, and the years when the money was raised. The first observation contains the overall results, the next three contain the results by year, and the final two contain the results by school. The missing class levels for School and Year are replaced with the value All.
The labels for the variables that contain statistics that were computed from MoneyRaised include the statistic name at the end of the label.
                          School Fund Raising Report                           1
                              Top Three Students

                                             Money     Money
                                  Number    Raised_   Raised_    Money      Money      Money
Obs   School     Year   _TYPE_   Students     Sum       Mean    Raised_1   Raised_2   Raised_3

 1    All         All      0        109      92.75      .29       .65        .22        .44
 2    All        1992      1         31       2.92      .80       .16        .76        .63
 3    All        1993      1         32       7.92      .37       .44        .33        .23
 4    All        1994      1         46      91.91      .26       .65        .22        .87
 5    Kennedy     All      2         53      75.95      .73       .22        .63        .89
 6    Monroe      All      2         56      16.80      .87       .65        .44        .87

Obs   Name_1    Name_2    Name_3    School_1   School_2   School_3   Year_1   Year_2   Year_3

 1    Willard   Luther    Cameron   Monroe     Kennedy    Monroe      1994     1994     1993
 2    Tonya     Edward    Thelma    Monroe     Monroe     Kennedy     1992     1992     1992
 3    Cameron   Myrtle    Bill      Monroe     Monroe     Kennedy     1993     1993     1993
 4    Willard   Luther    L.T.      Monroe     Kennedy    Monroe      1994     1994     1994
 5    Luther    Thelma    Jenny     Kennedy    Kennedy    Kennedy     1994     1992     1992
 6    Willard   Cameron   L.T.      Monroe     Monroe     Monroe      1994     1993     1994
                  Contents of the PROC MEANS Output Data Set                   2

                            The DATASETS Procedure

Data Set Name        WORK.TOP3LIST                    Observations          6
Member Type          DATA                             Variables             18
Engine               V9                               Indexes               0
Created              18:59 Thursday, March 14, 2002   Observation Length    144
Last Modified        18:59 Thursday, March 14, 2002   Deleted Observations  0
Protection                                            Compressed            NO
Data Set Type                                         Sorted                NO
Label
Data Representation  WINDOWS
Encoding             wlatin1  Western (Windows)

                      Engine/Host Dependent Information

Data Set Page Size          12288
Number of Data Set Pages    1
First Data Page             1
Max Obs per Page            85
Obs in First Data Page      6
Number of Data Set Repairs  0
File Name                   filename
Release Created             9.0000B0
Host Created                WIN_PRO

                 Alphabetic List of Variables and Attributes

 #    Variable            Type    Len    Format       Label

 7    MoneyRaised_1       Num       8    DOLLAR8.2    Amount Raised
 8    MoneyRaised_2       Num       8    DOLLAR8.2    Amount Raised
 9    MoneyRaised_3       Num       8    DOLLAR8.2    Amount Raised
 6    MoneyRaised_Mean    Num       8    DOLLAR8.2    Amount Raised_Mean
 5    MoneyRaised_Sum     Num       8    DOLLAR8.2    Amount Raised_Sum
10    Name_1              Char      7
11    Name_2              Char      7
12    Name_3              Char      7
 4    NumberStudents      Num       8
 1    School              Char      7    $SCHFMT.
13    School_1            Char      7    $SCHFMT.
14    School_2            Char      7    $SCHFMT.
15    School_3            Char      7    $SCHFMT.
 2    Year                Num       8    YRFMT.
16    Year_1              Num       8    YRFMT.
17    Year_2              Num       8    YRFMT.
18    Year_3              Num       8    YRFMT.
 3    _TYPE_              Num       8
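The subgroups that this example requests with the TYPES statement can also be requested with the WAYS statement, which selects class-variable combinations by the number of variables they contain. The following is a minimal sketch, not part of the original example. With two class variables, WAYS 0 1 should request the same three types as TYPES () school year. The output data set name TOP3WAYS is hypothetical, and the IDGROUP, AUTOLABEL, and RENAME= specifications are omitted for brevity.

```sas
/* Sketch: WAYS 0 1 requests the overall (0-way) analysis and every    */
/* one-way analysis of the class variables School and Year.            */
/* TOP3WAYS is a hypothetical output data set name.                    */
proc means data=Charity noprint;
   class School Year;
   ways 0 1;
   var MoneyRaised;
   output out=top3ways sum= mean= / autoname;
run;

proc print data=top3ways;
   title 'Sum and Mean of Money Raised: Overall and One-Way Results';
run;
```

TYPES names each combination explicitly, which is convenient when only some of the one-way analyses are needed. WAYS is shorter when all combinations of a given size are wanted.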
See the TEMPLATE procedure in The Complete Guide to the SAS Output Delivery System for an example of how to create a custom table definition for this output data set.