Procedure features:
PLOT statement
plotting symbol in plot request
This example expands on Output 33.1 by specifying a different plotting symbol.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. NUMBER enables printing of the page number. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate number pageno=1 linesize=80 pagesize=35;
Create the DJIA data set. DJIA contains the high and low closing marks for the Dow Jones Industrial Average from 1954 to 1994. A DATA step on page 1397 creates this data set.
data djia;
   input Year @7 HighDate date7. High @24 LowDate date7. Low;
   format highdate lowdate date7.;
   datalines;
1954  31DEC54   404.39 11JAN54   279.87
1955  30DEC55   488.40 17JAN55   388.20
... more data lines ...
1993  29DEC93  3794.33 20JAN93  3241.95
1994  31JAN94  3978.36 04APR94  3593.35
;
Create the plot. The plot request plots the values of High on the vertical axis and the values of Year on the horizontal axis. It also specifies an asterisk as the plotting symbol.
proc plot data=djia; plot high*year='*';
Specify the titles.
title 'High Values of the Dow Jones Industrial Average'; title2 'from 1954 to 1994'; run;
PROC PLOT determines the tick marks and the scale of both axes.
[Output, page 1: Plot of High*Year, titled "High Values of the Dow Jones Industrial Average" and "from 1954 to 1994". Symbol used is '*'. Vertical axis: High, 0 to 4000 in increments of 1000. Horizontal axis: Year, 1950 to 2000 in increments of 10.]
Procedure features:
PLOT statement options:
HAXIS=
VREF=
Data set: DJIA on page 647
This example specifies values for the horizontal axis and draws a reference line from the vertical axis.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=35;
Create the plot. The plot request plots the values of High on the vertical axis and the values of Year on the horizontal axis. It also specifies an asterisk as the plotting symbol.
proc plot data=djia; plot high*year='*'
Customize the horizontal axis and draw a reference line. HAXIS= specifies that the horizontal axis will show the values 1950 to 1995 in five-year increments. VREF= draws a reference line that extends from the value 3000 on the vertical axis.
/ haxis=1950 to 1995 by 5 vref=3000;
Specify the titles.
title 'High Values of Dow Jones Industrial Average'; title2 'from 1954 to 1994'; run;
[Output, page 1: Plot of High*Year, titled "High Values of Dow Jones Industrial Average" and "from 1954 to 1994". Symbol used is '*'. Vertical axis: High, 0 to 4000 in increments of 1000, with a reference line at 3000. Horizontal axis: Year, 1950 to 1995 in increments of 5.]
Procedure features:
PLOT statement options
BOX
OVERLAY
Data set: DJIA on page 647
This example overlays two plots and puts a box around the plot.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=64 pagesize=30;
Create the plot. The first plot request plots High on the vertical axis, plots Year on the horizontal axis, and specifies an asterisk as a plotting symbol. The second plot request plots Low on the vertical axis, plots Year on the horizontal axis, and specifies an o as a plotting symbol. OVERLAY superimposes the second plot onto the first. BOX draws a box around the plot. OVERLAY and BOX apply to both plot requests.
proc plot data=djia; plot high*year='*' low*year='o' / overlay box;
Specify the titles.
title 'Plot of Highs and Lows'; title2 'for the Dow Jones Industrial Average'; run;
[Output, page 1: Overlaid plots of High*Year (symbol '*') and Low*Year (symbol 'o') inside a box, titled "Plot of Highs and Lows" and "for the Dow Jones Industrial Average". Vertical axis: High, 0 to 4000 in increments of 2000. Horizontal axis: Year, 1950 to 2000 in increments of 10. NOTE: 7 obs hidden.]
Procedure features:
PROC PLOT statement options
HPERCENT=
VPERCENT=
Data set: DJIA on page 647
This example puts three plots on one page of output.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=120 pagesize=60;
Specify the plot sizes. VPERCENT= specifies that 50% of the vertical space on the page of output is used for each plot. HPERCENT= specifies that 50% of the horizontal space is used for each plot.
proc plot data=djia vpercent=50 hpercent=50;
Create the first plot. This plot request plots the values of High on the vertical axis and the values of Year on the horizontal axis. It also specifies an asterisk as the plotting symbol.
plot high*year='*';
Create the second plot. This plot request plots the values of Low on the vertical axis and the values of Year on the horizontal axis. It also specifies an o as the plotting symbol.
plot low*year='o';
Create the third plot. The first plot request plots High on the vertical axis, plots Year on the horizontal axis, and specifies an asterisk as a plotting symbol. The second plot request plots Low on the vertical axis, plots Year on the horizontal axis, and specifies an o as a plotting symbol. OVERLAY superimposes the second plot onto the first. BOX draws a box around the plot. OVERLAY and BOX apply to both plot requests.
plot high*year='*' low*year='o' / overlay box;
Specify the titles.
title 'Plots of the Dow Jones Industrial Average'; title2 'from 1954 to 1994'; run;
[Output, page 1: Three plots on one page, titled "Plots of the Dow Jones Industrial Average" and "from 1954 to 1994". Top left: Plot of High*Year, symbol '*'. Top right: Plot of Low*Year, symbol 'o'. Bottom left: overlaid plots of High*Year ('*') and Low*Year ('o') inside a box. In every plot the vertical axis runs from 0 to 4000 in increments of 2000 and the horizontal axis (Year) runs from 1950 to 2000 in increments of 10. NOTE: 7 obs hidden.]
Procedure features:
PLOT statement option
HAXIS=
This example uses a DATA step to generate data. The PROC PLOT step shows two plots of the same data: one plot without a horizontal axis specification and one plot with a logarithmic scale specified for the horizontal axis.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=40;
Create the EQUA data set. EQUA contains values of X and Y. Each value of X is calculated as 10 raised to the power Y (X=10**Y).
data equa; do Y=1 to 3 by .1; X=10**y; output; end; run;
Specify the plot sizes. HPERCENT= makes room for two plots side-by-side by specifying that 50% of the horizontal space is used for each plot.
proc plot data=equa hpercent=50;
Create the plots. The plot requests plot Y on the vertical axis and X on the horizontal axis. HAXIS= specifies a logarithmic scale for the horizontal axis for the second plot.
plot y*x; plot y*x / haxis=10 100 1000;
Specify the titles.
title 'Two Plots with Different'; title2 'Horizontal Axis Specifications'; run;
[Output, page 1: Two side-by-side plots of Y*X, titled "Two Plots with Different" and "Horizontal Axis Specifications". Legend for both: A=1, B=2, etc. Vertical axis in both: Y, 1.0 to 3.0 in increments of 0.1. Horizontal axis of the left plot: X, with tick marks at 0, 500, and 1000. Horizontal axis of the right plot: X, with tick marks at 10, 100, and 1000.]
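Because X is generated as 10**Y, the base-10 logarithm of X equals Y, which is why a logarithmic horizontal axis spreads the points evenly. The following is a minimal sketch for checking that relationship; it assumes that the EQUA data set from this example exists, and the names EQUA_CHECK and LogX are illustrative and are not part of the original example.

```sas
/* Sketch: confirm that log10(X) reproduces Y in the EQUA data set. */
/* EQUA_CHECK and LogX are hypothetical names used only here.       */
data equa_check;
   set equa;
   LogX=log10(x);
run;

/* Print a few observations to compare Y with LogX side by side. */
proc print data=equa_check(obs=5) noobs;
   var y x logx;
run;
```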
Procedure features:
PLOT statement option
HAXIS=
This example shows how you can specify date values on an axis.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=120 pagesize=40;
Create the EMERGENCY_CALLS data set. EMERGENCY_CALLS contains the number of telephone calls to an emergency help line for each date.
data emergency_calls;
   input Date : date7. Calls @@;
   label calls='Number of Calls';
   datalines;
1APR94  134  11APR94  384  13FEB94  488
2MAR94  289  21MAR94  201  14MAR94  460
3JUN94  184  13JUN94  152  30APR94  356
4JAN94  179  14JAN94  128  16JUN94  480
5APR94  360  15APR94  350  24JUL94  388
6MAY94  245  15DEC94  150  17NOV94  328
7JUL94  280  16MAY94  240  25AUG94  280
8AUG94  494  17JUL94  499  26SEP94  394
9SEP94  309  18AUG94  248  23NOV94  590
19SEP94 356  24FEB94  201  29JUL94  330
10OCT94 222  25MAR94  183  30AUG94  321
11NOV94 294  26APR94  412  2DEC94   511
27MAY94 294  22DEC94  413  28JUN94  309
;
Create the plot. The plot request plots Calls on the vertical axis and Date on the horizontal axis. HAXIS= uses a monthly time interval for the horizontal axis. The notation '1JAN94'd is a date constant. The value '1JAN95'd ensures that the axis will have enough room for observations from December.
proc plot data=emergency_calls; plot calls*date / haxis='1JAN94'd to '1JAN95'd by month;
Format the DATE values. The FORMAT statement assigns the DATE7. format to Date.
format date date7.;
Specify the titles.
title 'Calls to City Emergency Services Number'; title2 'Sample of Days for 1994'; run;
PROC PLOT uses the variables' labels on the axes.
[Output, page 1: Plot of Calls*Date, titled "Calls to City Emergency Services Number" and "Sample of Days for 1994". Legend: A = 1 obs, B = 2 obs, etc. Vertical axis: Number of Calls, 100 to 600 in increments of 100. Horizontal axis: Date, 01JAN94 to 01JAN95 in monthly increments.]
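A date constant such as '1JAN94'd is stored as a numeric SAS date value, so HAXIS= can step through it by month. The following is a minimal sketch that writes to the log the first day of each month from January 1994 through January 1995, which should correspond to the tick marks on the horizontal axis. The variable names start, i, and tick are illustrative, and the sketch assumes that the YEARCUTOFF= system option places the two-digit year 94 in 1994.

```sas
/* Sketch: list the monthly values implied by                      */
/* HAXIS='1JAN94'd to '1JAN95'd by month.                          */
/* start, i, and tick are hypothetical names used only here.       */
data _null_;
   start='1JAN94'd;
   do i=0 to 12;
      /* INTNX returns the first day of the month i months after start. */
      tick=intnx('month', start, i);
      put tick= date7.;
   end;
run;
```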
Procedure features:
PLOT statement option
CONTOUR=
This example shows how to represent the values of three variables with a two-dimensional plot by setting one of the variables as the CONTOUR variable. The variables X and Y appear on the axes, and Z is the contour variable. Program statements are used to generate the observations for the plot, and the following equation describes the contour surface:
$$z = 46.2 + 0.09x - 0.0005x^{2} + 0.1y - 0.0005y^{2} + 0.0004xy$$
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=64 pagesize=25;
Create the CONTOURS data set.
data contours;
   format Z 5.1;
   do X=0 to 400 by 5;
      do Y=0 to 350 by 10;
         z=46.2+.09*x-.0005*x**2+.1*y-.0005*y**2+.0004*x*y;
         output;
      end;
   end;
run;
Print the CONTOURS data set. The OBS= data set option limits the printing to only the first 5 observations. NOOBS suppresses printing of the observation numbers.
proc print data=contours(obs=5) noobs; title 'CONTOURS Data Set'; title2 'First 5 Observations Only'; run;
CONTOURS contains observations with values of X that range from 0 to 400 by 5 and with values of Y that range from 0 to 350 by 10.
CONTOURS Data Set                    1
First 5 Observations Only

    Z    X     Y
 46.2    0     0
 47.2    0    10
 48.0    0    20
 48.8    0    30
 49.4    0    40
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page. NOOVP ensures that overprinting is not used in the plot.
options nodate pageno=1 linesize=120 pagesize=60 noovp;
Create the plot. The plot request plots Y on the vertical axis, plots X on the horizontal axis, and specifies Z as the contour variable. CONTOUR=10 specifies that the plot will divide the values of Z into ten increments, and each increment will have a different plotting symbol.
proc plot data=contours; plot y*x=z / contour=10;
Specify the title.
title 'A Contour Plot'; run;
The shadings associated with the values of Z appear at the bottom of the plot. The plotting symbol # shows where high values of Z occur.
[Output, page 1: Contour plot of Y*X, titled "A Contour Plot". Vertical axis: Y, 0 to 350 in increments of 10. Horizontal axis: X, 0 to 400 in increments of 50. The legend at the bottom of the plot lists the following symbols and ranges of z:]

Symbol    z
.....      2.2 -  8.1
           8.1 - 14.0
-----     14.0 - 19.9
=====     19.9 - 25.8
+++++     25.8 - 31.7
OOOOO     31.7 - 37.6
XXXXX     37.6 - 43.5
WWWWW     43.5 - 49.4
*****     49.4 - 55.4
#####     55.4 - 61.3
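The legend ranges run from 2.2 to 61.3, which suggests that CONTOUR=10 divides the observed range of Z into ten levels. The following is a minimal sketch for checking the minimum and maximum of Z; it assumes that the CONTOURS data set from this example exists.

```sas
/* Sketch: report the range of Z that the contour levels cover. */
/* MAXDEC=1 matches the 5.1 format that is assigned to Z.       */
proc means data=contours min max maxdec=1;
   var z;
run;
```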
Procedure features:
PLOT statement option
HREF=
Other features:
BY statement
This example shows BY group processing in PROC PLOT.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=35;
Create the EDUCATION data set. EDUCATION contains educational data about some U.S. states. DropoutRate is the percentage of high school dropouts. Expenditures is the dollar amount the state spends on each pupil. MathScore is the score of eighth-grade students on a standardized math test. Not all states participated in the math test. A DATA step on page 1398 creates this data set.
data education;
   input State $14. +1 Code $ DropoutRate Expenditures MathScore Region $;
   label dropout='Dropout Percentage - 1989'
         expend='Expenditure Per Pupil - 1989'
         math='8th Grade Math Exam - 1990';
   datalines;
Alabama        AL 22.3 3197 252 SE
Alaska         AK 35.8 7716 .   W
more data lines
New York       NY 35.0 .    261 NE
North Carolina NC 31.2 3874 250 SE
North Dakota   ND 12.1 3952 281 MW
Ohio           OH 24.4 4649 264 MW
;
Sort the EDUCATION data set. PROC SORT sorts EDUCATION by Region so that Region can be used as the BY variable in PROC PLOT.
proc sort data=education; by region; run;
Create a separate plot for each BY group. The BY statement creates a separate plot for each value of Region.
proc plot data=education; by region;
Create the plot with a reference line. The plot request plots Expenditures on the vertical axis, plots DropoutRate on the horizontal axis, and specifies an asterisk as the plotting symbol. HREF= draws a reference line that extends from 28.6 on the horizontal axis. The reference line represents the national average.
plot expenditures*dropoutrate='*' / href=28.6;
Specify the title.
title 'Plot of Dropout Rate and Expenditure Per Pupil'; run;
PROC PLOT produces a plot for each BY group. Only the plots for Midwest and Northeast are shown.
[Output, page 1: Region=MW. Plot of Expenditures*DropoutRate, titled "Plot of Dropout Rate and Expenditure Per Pupil". Symbol used is '*'. Vertical axis: Expenditures, 3500 to 5500 in increments of 500. Horizontal axis: Dropout Percentage - 1989, 10 to 30 in increments of 5.]
[Output, page 2: Region=NE. Plot of Expenditures*DropoutRate, titled "Plot of Dropout Rate and Expenditure Per Pupil". Symbol used is '*'. Vertical axis: Expenditures, 4000 to 8000 in increments of 1000. Horizontal axis: Dropout Percentage - 1989, 15 to 35 in increments of 5. NOTE: 1 obs had missing values.]
Procedure features:
PLOT statement
label variable in plot request
Data set: EDUCATION on page 662
This example shows how to modify the plot request to label points on the plot with the values of variables. This example adds labels to the plot shown in Example 8 on page 662.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=35;
Sort the EDUCATION data set. PROC SORT sorts EDUCATION by Region so that Region can be used as the BY variable in PROC PLOT.
proc sort data=education; by region; run;
Create a separate plot for each BY group. The BY statement creates a separate plot for each value of Region.
proc plot data=education; by region;
Create the plot with a reference line and a label for each data point. The plot request plots Expenditures on the vertical axis, plots DropoutRate on the horizontal axis, and specifies an asterisk as the plotting symbol. The label variable specification ( $ state ) in the PLOT statement labels each point on the plot with the name of the corresponding state. HREF= draws a reference line that extends from 28.6 on the horizontal axis. The reference line represents the national average.
plot expenditures*dropoutrate='*' $ state / href=28.6;
Specify the title.
title 'Plot of Dropout Rate and Expenditure Per Pupil'; run;
PROC PLOT produces a plot for each BY group. Only the plots for Midwest and Northeast are shown.
[Output, page 1: Region=MW. Plot of Expenditures*DropoutRate$State, titled "Plot of Dropout Rate and Expenditure Per Pupil". Symbol used is '*'. Each point is labeled with a state name: Michigan, Illinois, Minnesota, Ohio, Nebraska, Kansas, Iowa, Indiana, Missouri, and North Dakota. Vertical axis: Expenditures, 3500 to 5500 in increments of 500. Horizontal axis: Dropout Percentage - 1989, 10 to 30 in increments of 5.]
[Output, page 2: Region=NE. Plot of Expenditures*DropoutRate$State, titled "Plot of Dropout Rate and Expenditure Per Pupil". Symbol used is '*'. Each point is labeled with a state name: New Jersey, Connecticut, Massachusetts, Maryland, Delaware, Maine, and New Hampshire. Vertical axis: Expenditures, 4000 to 8000 in increments of 1000. Horizontal axis: Dropout Percentage - 1989, 15 to 35 in increments of 5. NOTE: 1 obs had missing values.]
Procedure features:
PROC PLOT statement option
NOMISS
Data set: EDUCATION on page 662
This example shows how missing values affect the calculation of the axes.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=80 pagesize=35;
Sort the EDUCATION data set. PROC SORT sorts EDUCATION by Region so that Region can be used as the BY variable in PROC PLOT.
proc sort data=education; by region; run;
Exclude data points with missing values. NOMISS excludes observations that have a missing value for either of the axis variables.
proc plot data=education nomiss;
Create a separate plot for each BY group. The BY statement creates a separate plot for each value of Region.
by region;
Create the plot with a reference line and a label for each data point. The plot request plots Expenditures on the vertical axis, plots DropoutRate on the horizontal axis, and specifies an asterisk as the plotting symbol. The label variable specification ( $ state ) in the PLOT statement labels each point on the plot with the name of the corresponding state. HREF= draws a reference line extending from 28.6 on the horizontal axis. The reference line represents the national average.
plot expenditures*dropoutrate='*' $ state / href=28.6;
Specify the title.
title 'Plot of Dropout Rate and Expenditure Per Pupil'; run;
PROC PLOT produces a plot for each BY group. Only the plot for the Northeast is shown. Because New York has a missing value for Expenditures, the observation is excluded and PROC PLOT does not use the value 35 for DropoutRate to calculate the horizontal axis. Compare the horizontal axis in this output with the horizontal axis in the plot for Northeast in Example 9 on page 665.
[Output, page 1: Region=NE. Plot of Expenditures*DropoutRate$State, titled "Plot of Dropout Rate and Expenditure Per Pupil". Symbol used is '*'. Each point is labeled with a state name: New Jersey, Connecticut, Massachusetts, Maryland, Delaware, Maine, and New Hampshire. Vertical axis: Expenditures, 4000 to 8000 in increments of 1000. Horizontal axis: Dropout Percentage - 1989, 16 to 30 in increments of 2. NOTE: 1 obs had missing values.]
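To see which observations NOMISS excludes, it can help to list the observations that have a missing value for either axis variable. The following is a minimal sketch, assuming that the EDUCATION data set from this example exists and that the Northeast region is coded as NE, as in the data lines shown earlier.

```sas
/* Sketch: list Northeast observations that have a missing value */
/* for either axis variable of the plot request.                 */
proc print data=education noobs;
   where region='NE' and (missing(dropoutrate) or missing(expenditures));
   var state region dropoutrate expenditures;
run;
```

With the data lines shown for this example, New York is the observation expected in the listing, because its value for Expenditures is missing.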
Procedure features:
PLOT statement options
label variable in plot request
LIST=
PLACEMENT=
Other features:
RUN group processing
This example illustrates the default placement of labels and how to adjust the placement of labels on a crowded plot. The labels are values of a variable in the data set.
This example also shows RUN group processing in PROC PLOT.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=120 pagesize=37;
Create the CENSUS data set. CENSUS contains the variables CrimeRate and Density for selected states. CrimeRate is the number of crimes per 100,000 people. Density is the population density per square mile in the 1980 census. A DATA step on page 1391 creates this data set.
data census;
   input Density CrimeRate State $ 14-27 PostalCode $ 29-30;
   datalines;
263.3 4575.3 Ohio           OH
62.1  7017.1 Washington     WA
more data lines
111.6 4665.6 Tennessee      TN
120.4 4649.9 North Carolina NC
;
Create the plot with a label for each data point. The plot request plots Density on the vertical axis, CrimeRate on the horizontal axis, and uses the first letter of the value of State as the plotting symbol. This makes it easier to match the symbol with its label. The label variable specification ( $ state ) in the PLOT statement labels each point with the corresponding state name.
proc plot data=census; plot density*crimerate=state $ state /
Specify plot options. BOX draws a box around the plot. LIST= lists the labels that have penalties greater than or equal to 1. HAXIS= and VAXIS= specify increments only. PROC PLOT uses the data to determine the range for the axes.
box list=1 haxis=by 1000 vaxis=by 250;
Specify the title.
title 'A Plot of Population Density and Crime Rates'; run;
The labels Tennessee, South Carolina, Arkansas, Minnesota, and South Dakota have penalties. The default placement states do not provide enough possibilities for PROC PLOT to avoid penalties given the proximity of the points. Seven label characters are hidden.
[Output, page 1: Plot of Density*CrimeRate$State inside a box, titled "A Plot of Population Density and Crime Rates". Symbol is value of State. Each point is labeled with a state name, and several labels overlap. Vertical axis: Density, 0 to 500 in increments of 250. Horizontal axis: CrimeRate, 2000 to 9000 in increments of 1000. NOTE: 7 label characters hidden.]
A Plot of Population Density and Crime Rates                                        2

List of Point Locations, Penalties, and Placement States

                  Vertical  Horizontal            Starting           Vertical  Horizontal
Label                 Axis        Axis   Penalty  Position   Lines      Shift       Shift
Tennessee           111.60      4665.6         2  Center         1          1          -1
South Carolina      103.40      5161.9         2  Right          1          0           2
Arkansas             43.90      4245.2         6  Right          1          0           2
Minnesota            51.20      4615.8         7  Left           1          0          -2
South Dakota          9.10      2678.0        11  Right          1          0           2
Request a second plot. Because PROC PLOT is interactive, the procedure is still running at this point in the program. It is not necessary to restart the procedure to submit another plot request. LIST=1 produces no output because there are no penalties of 1 or greater.
plot density*crimerate=state $ state / box list=1 haxis=by 1000 vaxis=by 250
Specify placement options. PLACEMENT= gives PROC PLOT more placement states to use to place the labels. PLACEMENT= contains three expressions. The first expression specifies the preferred positions for the label. The first expression resolves to placement states centered above the plotting symbol, with the label on one or two lines. The second and third expressions resolve to placement states that enable PROC PLOT to place the label in multiple positions around the plotting symbol.
placement=((v=2 1 : l=2 1) ((l=2 2 1 : v=0 1 0) * (s=right left : h=2 -2)) (s=center right left * l=2 1 * v=0 1 -1 2 * h=0 1 to 5 by alt));
Specify the title.
title 'A Plot of Population Density and Crime Rates'; run;
No collisions occur in the plot.
[Output, page 3: Plot of Density*CrimeRate$State inside a box, titled "A Plot of Population Density and Crime Rates". Symbol is value of State. Each point is labeled with a state name, with some labels placed above the point or split across two lines. Vertical axis: Density, 0 to 500 in increments of 250. Horizontal axis: CrimeRate, 2000 to 9000 in increments of 1000.]
Procedure features:
PLOT statement options
label variable in plot request
PLACEMENT=
Data set: CENSUS on page 671
This example illustrates the default placement of labels and uses a macro to adjust the placement of labels. The labels are values of a variable in the data set.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=120 pagesize=37;
Use conditional logic to determine placement. The %PLACE macro provides an alternative to using the PLACEMENT= option. The higher the value of n , the more freedom PROC PLOT has to place labels.
%macro place(n);
   %if &n > 13 %then %let n = 13;
   placement=(
      %if &n <= 0 %then (s=center);
      %else (h=2 -2 : s=right left);
      %if &n = 1 %then (v=1 * h=0 -1 to -2 by alt);
      %else %if &n = 2 %then (v=1 -1 * h=0 -1 to -5 by alt);
      %else %if &n > 2 %then (v=1 to 2 by alt * h=0 -1 to -10 by alt);
      %if &n > 3 %then
         (s=center right left * v=0 1 to %eval(&n - 2) by alt *
          h=0 -1 to %eval(-3 * (&n - 2)) by alt *
          l=1 to %eval(2 + (10 * &n - 35) / 30));
      )
   %if &n > 4 %then penalty(7)=%eval((3 * &n) / 2);
%mend;
Create the plot. The plot request plots Density on the vertical axis, CrimeRate on the horizontal axis, and uses the first letter of the value of State as the plotting symbol. The label variable specification ( $ state ) in the PLOT statement labels each point with the corresponding state name.
proc plot data=census; plot density*crimerate=state $ state /
Specify plot options. BOX draws a box around the plot. LIST= lists the labels that have penalties greater than or equal to 1. HAXIS= and VAXIS= specify increments only. PROC PLOT uses the data to determine the range for the axes. The PLACE macro determines the placement of the labels.
box list=1 haxis=by 1000 vaxis=by 250 %place(4);
Specify the title.
title 'A Plot of Population Density and Crime Rates'; run;
No collisions occur in the plot.
[Output, page 1: Plot of Density*CrimeRate$State inside a box, titled "A Plot of Population Density and Crime Rates". Symbol is value of State. Each point is labeled with a state name. Vertical axis: Density, 0 to 500 in increments of 250. Horizontal axis: CrimeRate, 2000 to 9000 in increments of 1000.]
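To make the effect of the macro call concrete, the text that %place(4) generates can be worked out by hand. The following sketch shows the PLOT statement with that hand expansion substituted for the macro call. It is an illustration derived from the macro definition above, not output captured from SAS, and it relies on the fact that %EVAL performs integer arithmetic, so (10 * 4 - 35) / 30 truncates to 0 and the last expression allows labels of 1 to 2 lines. Because n is not greater than 4, no PENALTY(7)= setting is generated.

```sas
/* Sketch: the plot request with %place(4) expanded by hand.        */
/* Derived from the %PLACE definition; not captured SAS output.     */
/* %eval(4 - 2) = 2, %eval(-3 * (4 - 2)) = -6,                      */
/* %eval(2 + (10 * 4 - 35) / 30) = 2 (integer division).            */
proc plot data=census;
   plot density*crimerate=state $ state /
        box list=1 haxis=by 1000 vaxis=by 250
        placement=((h=2 -2 : s=right left)
                   (v=1 to 2 by alt * h=0 -1 to -10 by alt)
                   (s=center right left * v=0 1 to 2 by alt *
                    h=0 -1 to -6 by alt * l=1 to 2));
run;
```

Submitting the macro version with the MPRINT system option in effect is one way to compare the generated statement with this expansion.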
Procedure features:
PLOT statement option
PENALTIES=
Data set: CENSUS on page 671
This example demonstrates how changing a default penalty affects the placement of labels. The goal is to produce a plot whose labels do not detract from how the points are scattered.
Set the SAS system options. The NODATE option suppresses the display of the date and time in the output. PAGENO= specifies the starting page number. LINESIZE= specifies the output line length, and PAGESIZE= specifies the number of lines on an output page.
options nodate pageno=1 linesize=120 pagesize=37;
Create the plot. The plot request plots Density on the vertical axis, CrimeRate on the horizontal axis, and uses the first letter of the value of State as the plotting symbol. The label variable specification ( $ state ) in the PLOT statement labels each point with the corresponding state name.
proc plot data=census; plot density*crimerate=state $ state /
Specify the placement. PLACEMENT= specifies that the preferred placement states are 100 columns to the left and the right of the point, on the same line with the point.
placement=(h=100 to 10 by alt * s=left right)
Change the default penalty. PENALTIES(4)= changes the default penalty for a free horizontal shift to 500, which removes all penalties for a horizontal shift. LIST= shows how far PROC PLOT shifted the labels away from their respective points.
penalties(4)=500 list=0
Customize the axes. HAXIS= creates a horizontal axis long enough to leave space for the labels on the sides of the plot. VAXIS= specifies that the values on the vertical axis be in increments of 100.
haxis=0 to 13000 by 1000 vaxis=by 100;
Specify the title.
title 'A Plot of Population Density and Crime Rates'; run;
Output: A Plot of Population Density and Crime Rates (page 1). Plot of Density*CrimeRate$State. Symbol is value of State. The plot shows Density on the vertical axis (tick marks from 0 to 500 by 100) and CrimeRate on the horizontal axis (tick marks from 0 to 13000 by 1000). The state-name labels appear toward the left and right sides of the plot, away from the points. NOTE: 1 obs hidden.
A Plot of Population Density and Crime Rates                                   2

List of Point Locations, Penalties, and Placement States

                  Vertical  Horizontal           Starting         Vertical  Horizontal
Label                 Axis        Axis  Penalty  Position  Lines     Shift       Shift
Maryland            428.70      5477.6        0  Right         1         0          55
Delaware            307.60      4938.8        0  Right         1         0          59
Pennsylvania        264.30      3163.2        0  Right         1         0          65
Ohio                263.30      4575.3        0  Right         1         0          66
Illinois            205.30      5416.5        0  Right         1         0          56
Florida             180.00      8503.2        0  Left          1         0         -64
California          151.40      6506.4        0  Right         1         0          45
Tennessee           111.60      4665.6        0  Right         1         0          61
North Carolina      120.40      4649.9        0  Right         1         0          46
New Hampshire       102.40      3371.7        0  Right         1         0          52
South Carolina      103.40      5161.9        0  Right         1         0          52
Georgia              94.10      5792.0        0  Left          1         0         -42
West Virginia        80.80      2190.7        0  Right         1         0          76
Alabama              76.60      4451.4        0  Right         1         0          41
Missouri             71.20      4707.5        0  Right         1         0          47
Mississippi          53.40      3438.6        0  Right         1         0          68
Vermont              55.20      4271.2        0  Right         1         0          44
Minnesota            51.20      4615.8        0  Right         1         0          49
Washington           62.10      7017.1        0  Left          1         0         -49
Texas                54.30      7722.4        0  Left          1         0         -49
Arkansas             43.90      4245.2        0  Right         1         0          65
Oklahoma             44.10      6025.6        0  Left          1         0         -43
Idaho                11.50      4156.3        0  Right         1         0          69
Oregon               27.40      6969.9        0  Left          1         0         -53
South Dakota          9.10      2678.0        0  Right         1         0          67
North Dakota          9.40      2833.0        0  Right         1         0          52
Nevada                7.30      6371.4        0  Right         1         0          50
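One way to see the effect of PENALTIES(4)= is to rerun the same plot request without it, so that PROC PLOT uses its default penalty for a free horizontal shift. The following is a minimal sketch, assuming that the CENSUS data set from this example is available. The second title line is a hypothetical addition that tells the two runs apart, and the output of this run is not shown here.

```sas
/* Comparison run: same plot request as above, minus PENALTIES(4)=500. */
/* Assumes the CENSUS data set has already been created.               */
options nodate pageno=1 linesize=120 pagesize=37;

proc plot data=census;
   plot density*crimerate=state $ state /
        placement=(h=100 to 10 by alt * s=left right)
        list=0                      /* list every label with its penalty and shifts */
        haxis=0 to 13000 by 1000
        vaxis=by 100;
   title 'A Plot of Population Density and Crime Rates';
   title2 'Default Penalty for a Free Horizontal Shift';   /* hypothetical title */
run;
```

Comparing the Penalty and Horizontal Shift columns of the two LIST= tables shows how much of the label placement depends on the changed penalty.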
[*] Source: U.S. Department of Education.
[*] Source: U.S. Bureau of the Census and the 1987 Uniform Crime Reports, FBI.