WHERE Statement


Selects observations from SAS data sets that meet a particular condition

Valid: in DATA and PROC steps

Category: Action

Type: Declarative

Syntax

WHERE where-expression-1

  • < logical-operator where-expression-n >;

Arguments

where-expression

  • is an arithmetic or logical expression that generally consists of a sequence of operands and operators.

  • Tip: The operands and operators described in the next several sections are also valid for the WHERE= data set option.

  • Tip: You can specify multiple where-expressions.

logical-operator

  • can be AND, AND NOT, OR, or OR NOT.

Details

General Information Using the WHERE statement may improve the efficiency of your SAS programs because SAS is not required to read all observations from the input data set.

The WHERE statement cannot be executed conditionally; that is, you cannot use it as part of an IF-THEN statement.

WHERE statements can contain multiple WHERE expressions that are joined by logical operators.

Note: Using indexed SAS data sets can significantly improve performance when you use WHERE expressions to access a subset of the observations in a SAS data set. See Understanding SAS Indexes in the SAS Data Files section of SAS Language Reference: Concepts for a complete discussion of WHERE-expression processing with indexed data sets and a list of guidelines to consider before you index your SAS data sets.

In DATA Steps The WHERE statement applies to all data sets in the preceding SET, MERGE, MODIFY, or UPDATE statement, and variables that are used in the WHERE statement must appear in all of those data sets. You cannot use the WHERE statement with the POINT= option in the SET and MODIFY statements.

You can apply OBS= and FIRSTOBS= processing to WHERE processing. For more information, see Processing a Segment of Data That is Conditionally Selected in the WHERE-Expression Processing section of SAS Language Reference: Concepts .

You cannot use the WHERE statement to select records from an external file that contains raw data, nor can you use the WHERE statement within the same DATA step in which you read in-stream data with a DATALINES statement.

For each iteration of the DATA step, the first operation SAS performs in each execution of a SET, MERGE, MODIFY, or UPDATE statement is to determine whether the observation in the input data set meets the condition of the WHERE statement. The WHERE statement takes effect immediately after the input data set options are applied and before any other statement in the DATA step is executed. If a DATA step combines observations using a WHERE statement with a MERGE, MODIFY, or UPDATE statement, SAS selects observations from each input data set before it combines them.

WHERE and BY in a DATA Step If a DATA step contains both a WHERE statement and a BY statement, the WHERE statement executes before BY groups are created. Therefore, BY groups reflect groups of observations in the subset of observations that are selected by the WHERE statement, not the actual BY groups of observations in the original input data set.

For a complete discussion of BY-group processing, see BY- Group Processing in SAS Programs in SAS Language Reference: Concepts .

In PROC Steps You can use the WHERE statement with any SAS procedure that reads a SAS data set. The WHERE statement is useful in order to subset the original data set for processing by the procedure. The Base SAS Procedures Guide documents the action of the WHERE statement only in those procedures for which you can specify more than one data set. In all other cases, the WHERE statement performs as documented here.

Use of Indexes A DATA or PROC step attempts to use an available index to optimize the selection of data when an indexed variable is used in combination with one of the following:

  • the BETWEEN-AND operator

  • the comparison operators, with or without the colon modifier

  • the CONTAINS operator

  • the IS NULL and IS NOT NULL operators

  • the LIKE operator

  • the TRIM function

  • the SUBSTR function, in some cases.

SUBSTR requires the following arguments:

 where substr(  variable  ,  position  ,  length  )   ='  character-string  '; 

An index is used in processing when the arguments of the SUBSTR function meet all of the following conditions:

  • position is equal to 1

  • length is less than or equal to the length of variable

  • length is equal to the length of character-string .

Operands Used in WHERE Expressions Operands include

  • constants

  • time and date values

  • values of variables that are obtained from the SAS data sets

  • values created within the WHERE expression itself.

You cannot use variables that are created within the DATA step (for example, FIRST. variable , LAST. variable , _N_, or variables that are created in assignment statements) in a WHERE expression because the WHERE statement is executed before the SAS System brings observations into the DATA or PROC step. When WHERE expressions contain comparisons, the unformatted values of variables are compared.

Use operands in WHERE statements as in the following examples:

  • where score>50;

  • where date>= 01jan1999 d and time>= 9:00 t;

  • where state= Mississippi ;

As in other SAS expressions, the names of numeric variables can stand alone. SAS treats values of 0 or missing as false; other values are true. These examples are WHERE expressions that contain the numeric variables EMPNUM and SSN:

  • where empnum;

  • where empnum and ssn;

Character literals or the names of character variables can also stand alone in WHERE expressions. If you use the name of a character variable by itself as a WHERE expression, SAS selects observations where the value of the character variable is not blank.

Operators Used in the WHERE Expression You can include both SAS operators and special WHERE-expression operators in the WHERE statement. For a complete list of the operators, see Table 7.11 on page 1421. For the rules SAS follows when it evaluates WHERE expressions, see WHERE-Expression Processing in SAS Language Reference: Concepts .

Table 7.11: WHERE Statement Operators

Operator Type

Symbol or Mnemonic

Description

Arithmetic

   
 

*

multiplication

 

/

division

 

+

addition

 

subtraction

 

**

exponentiation

Comparison [ 4]

   
 

= or EQ

equal to

 

^=, =, ~=, or NE [ 1]

not equal to

 

> or GT

greater than

 

< or LT

less than

 

>= or GE

greater than or equal to

 

<= or LE

less than or equal to

 

IN

equal to one of a list

Logical (Boolean)

   
 

& or AND

logical and

 

or OR [ 2]

logical or [ 1]

 

~,^ , , or NOT [ 1]

logical not

Other

   
 

[ 3]

concatenation of character variables

 

()

indicate order of evaluation

 

+ prefix

positive number

 

“ prefix

negative number

WHERE Expression Only

   
 

BETWEEN “AND

an inclusive range

 

? or CONTAINS

a character string

 

IS NULL or IS MISSING

missing values

 

LIKE

match patterns

 

=*

sounds-like

 

SAME-AND

add clauses to an existing WHERE statement without retyping original one

[ 4] You can use the colon modifier (:) with any of the comparison operators in order to compare only a specified prefix of a character string.

[ 1] The caret (^), tilde (~), and the not sign () all indicate a logical not. Use the character available on your keyboard, or use the mnemonic equivalent.

[ 2] The OR symbol (), broken vertical bar (), and exclamation point (!) all indicate a logical or. Use the character available on your keyboard, or use the mnemonic equivalent.

[ 3] Two OR symbols ( ), two broken vertical bars ( ), or two exclamation points (!!) indicate concatenation. Use the character available on your keyboard.

Comparisons

  • You can use the WHERE command in SAS/FSP software to subset data for editing and browsing. You can use both the WHERE statement and WHERE= data set option in windowing procedures and in conjunction with the WHERE command.

  • To select observations from individual data sets when a SET, MERGE, MODIFY, or UPDATE statement specifies more than one data set, apply a WHERE= data set option to each data set. In the DATA step, if a WHERE statement and a WHERE= data set option apply to the same data set, SAS uses the data set option and ignores the statement.

  • The most important differences between the WHERE statement in the DATA step and the subsetting IF statement are as follows:

    • The WHERE statement selects observations before they are brought into the program data vector, making it a more efficient programming technique. The subsetting IF statement works on observations after they are read into the program data vector.

    • The WHERE statement can produce a different data set from the subsetting IF when a BY statement accompanies a SET, MERGE, or UPDATE statement. The different data set occurs because SAS creates BY groups before the subsetting IF statement selects but after the WHERE statement selects.

    • The WHERE statement cannot be executed conditionally as part of an IF statement, but the subsetting IF statement can.

    • The WHERE statement selects observations in SAS data sets only, whereas the subsetting IF statement selects observations from an existing SAS data set or from observations that are created with an INPUT statement.

    • The subsetting IF statement cannot be used in SAS windowing procedures to subset observations for browsing or editing.

  • Do not confuse the WHERE statement with the DROP or KEEP statement. The DROP and KEEP statements select variables for processing. The WHERE statement selects observations.

Examples

Example 1: Basic WHERE Statement Usage

This DATA step produces a SAS data set that contains only observations from data set CUSTOMER in which the value for NAME begins with Mac and the value for CITY is Charleston or Atlanta .

 data testmacs;     set customer;     where substr(name,1,3)='Mac' and           (city='Charleston' or city='Atlanta');  run; 

Example 2: Using Operators Available Only in the WHERE Statement

  • Using BETWEEN-AND:

     where empnum between 500 and 1000; 
  • Using CONTAINS:

     where company ? 'bay';  where company contains 'bay'; 
  • Using IS NULL and IS MISSING:

     where name is null;  where name is missing; 
  • Using LIKE to select all names that start with the letter D:

     where name like 'D%'; 
  • Using LIKE to match patterns from a list of the following names:

    • Diana

    • Diane

    • Dianna

    • Dianthus

    • Dyan

    WHERE Statement

    Name Selected

    where name like 'D_an';

    Dyan

    where name like 'D_an_';

    Diana, Diane

    where name like 'D_an__';

    Dianna

    where name like 'D_an%';

    all names from list

  • Using the Sounds-like Operator to select names that sound like Smith :

     where lastname=*'Smith'; 
  • Using SAME-AND:

     where year>1991;    more SAS statements    where same and year<1999; 
  • In this example, the second WHERE statement is equivalent to the following WHERE statement:

     where year>1991 and year<1999; 

See Also

Data Set Option:

  • WHERE= Data Set Option on page 61

Statement:

  • IF Statement, Subsetting on page 1212

SAS SQL Query Window User s Guide

SAS/IML User s Guide

Base SAS Procedures Guide

SAS Indexes in SAS Language Reference: Concepts

WHERE-Expression Processing in SAS Language Reference: Concepts

BY-Group Processing in SAS Language Reference: Concepts

Beatrous, S. & Clifford, W. (1998), Sometimes You Do Get What You Want: SAS I/O Enhancements in Version 7, Proceedings of the Twenty “third Annual SAS Users Group International Conference , 23.




SAS 9.1 Language Reference Dictionary, Volumes 1, 2 and 3
SAS 9.1 Language Reference Dictionary, Volumes 1, 2 and 3
ISBN: N/A
EAN: N/A
Year: 2004
Pages: 704

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net