Kinds of Data


Definitions

data values
   are character or numeric values.

numeric value
   contains only numbers, and sometimes a decimal point, a minus sign, or both. When they are read into a SAS data set, numeric values are stored in the floating-point format native to the operating environment. Nonstandard numeric values can contain other characters in addition to digits; you can use formatted input to enable SAS to read them.

character value
   is a sequence of characters.

standard data
   are character or numeric values that can be read with list, column, formatted, or named input. Examples of standard data include:

  • ARKANSAS

  • 1166.42

nonstandard data
   is data that can be read only with the aid of informats. Examples of nonstandard data include numeric values that contain commas, dollar signs, or blanks; date and time values; and hexadecimal and binary values.

Numeric Data

Numeric data can be represented in several ways. SAS can read standard numeric values without any special instructions. To read nonstandard values, SAS requires special instructions in the form of informats. Table 21.1 shows standard, nonstandard, and invalid numeric data values and the special tools, if any, that are required to read them. For complete descriptions of all SAS informats, see SAS Language Reference: Dictionary.

Table 21.1: Reading Different Types of Numeric Data

| Example of Numeric Data | Description | Solution Required to Read |
|---|---|---|
| Standard Numeric Data | | |
| 23 | input right aligned | None needed |
| 23 | input not aligned | None needed |
| 23 | input left aligned | None needed |
| 00023 | input with leading zeros | None needed |
| 23.0 | input with decimal point | None needed |
| 2.3E1 | in E-notation, 2.30x10^1 | None needed |
| 230E-1 | in E-notation, 230x10^-1 | None needed |
| -23 | minus sign for negative numbers | None needed |
| Nonstandard Numeric Data | | |
| 2 3 | embedded blank | COMMA. or BZ. informat |
| - 23 | embedded blank | COMMA. or BZ. informat |
| 2,341 | comma | COMMA. informat |
| (23) | parentheses | COMMA. informat |
| C4A2 | hexadecimal value | HEX. informat |
| 1MAR90 | date value | DATE. informat |
| Invalid Numeric Data | | |
| 23 - | minus sign follows number | Put minus sign before number or solve programmatically. [1] |
| .. | double instead of single periods | Code missing values as a single period or use the ?? modifier in the INPUT statement to code any invalid input value as a missing value. |
| J23 | not a number | Read as a character value, or edit the raw data to change it to a valid number. |

[1] It might be possible to use the S370FZDTw.d informat, but positive values require the trailing plus sign (+).
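The nonstandard and invalid rows of Table 21.1 can be illustrated with a short sketch. The data set names (nonstandard, invalid), the variable names, and the column positions below are hypothetical and chosen only for illustration; the informat widths are assumptions that fit this sample data.

```sas
/* Formatted input: each informat reads a fixed-width field. */
data nonstandard;
   input @1  amount   comma7.    /* commas and parentheses, as in 2,341 and (23) */
         @9  hexval   hex4.      /* hexadecimal value, as in C4A2                */
         @14 saledate date7.;    /* date value, as in 1MAR90                     */
   format saledate date9.;
   datalines;
2,341   C4A2 1MAR90
(23)    00FF 15JUN04
;
run;

/* The ?? modifier stores invalid values as missing and suppresses */
/* the invalid-data messages in the log.                           */
data invalid;
   input value ??;
   datalines;
23
..
J23
;
run;

proc print data=nonstandard;
run;

proc print data=invalid;
run;
```

In this sketch the COMMA. informat should read (23) as a negative value, and the second and third records of the invalid data set should produce missing values. How the two-digit years are interpreted depends on the YEARCUTOFF= system option.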

Remember the following rules for reading numeric data:

  • Parentheses or a minus sign preceding the number (without an intervening blank) indicates a negative value.

  • Leading zeros and the placement of a value in the input field do not affect the value assigned to the variable. Leading zeros and leading and trailing blanks are not stored with the value. Unlike some languages, SAS does not read trailing blanks as zeros by default. To cause trailing blanks to be read as zeros, use the BZ. informat described in SAS Language Reference: Dictionary.

  • Numeric data can have leading and trailing blanks but cannot have embedded blanks (unless they are read with a COMMA. or BZ. informat).

  • To read decimal values from input lines that do not contain explicit decimal points, indicate where the decimal point belongs by using a decimal parameter with column input or an informat with formatted input. See the full description of the INPUT statement in SAS Language Reference: Dictionary for more information. An explicit decimal point in the input data overrides any decimal specification in the INPUT statement.
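The rules above can be sketched in one DATA step. The data set name (rules), the variable names, and the column layout are hypothetical; the same four columns are read twice, once with the standard numeric informat and once with BZ., to show the difference.

```sas
data rules;
   input @1  plain 4.         /* trailing blanks are ignored                     */
         @1  zeros bz4.       /* BZ. reads the blanks in the field as zeros      */
             price 6-10 .2    /* column input with a decimal parameter           */
         @12 neg   comma6.;   /* leading minus sign or parentheses mean negative */
   datalines;
23   12345 (23)
0023 12.45 -23
;
run;

proc print data=rules;
run;
```

For the first record, plain should be 23 while zeros should be 2300, and price should be 123.45 because the decimal parameter supplies the implied decimal point. In the second record the explicit decimal point in 12.45 overrides the decimal parameter, and the leading zeros in 0023 do not change the value.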

Character Data

A value that is read with an INPUT statement is assumed to be a character value if one of the following is true:

  • A dollar sign ($) follows the variable name in the INPUT statement.

  • A character informat is used.

  • The variable has been previously defined as character: for example, in a LENGTH statement, in the RETAIN statement, by an assignment statement, or in an expression.
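A minimal sketch of the three conditions, using hypothetical data set and variable names: state is character because of the dollar sign, code because of a character informat, and region because a LENGTH statement defined it as character before the INPUT statement runs.

```sas
data chars;
   length region $ 8;           /* previously defined as character   */
   input state $                /* dollar sign after the name        */
         @10 code $char3.       /* character informat                */
         region;                /* no $ needed: type is already set  */
   datalines;
ARKANSAS A12 SOUTH
;
run;

proc contents data=chars;
run;
```

PROC CONTENTS is used here only to confirm that all three variables are stored as character.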

Input data that you want to store in a character variable can include any character. Use the guidelines in the following table when your raw data includes leading blanks and semicolons.

Table 21.2: Reading Instream Data and External Files Containing Leading Blanks and Semicolons

| Characters in the Data | What to Use | Reason |
|---|---|---|
| leading or trailing blanks that you want to preserve | formatted input and the $CHARw. informat | List input trims leading and trailing blanks from a character value before the value is assigned to a variable. |
| semicolons in instream data | DATALINES4 or CARDS4 statements and four semicolons (;;;;) to mark the end of the data | With the normal DATALINES and CARDS statements, a semicolon in the data prematurely signals the end of the data. |
| delimiters, blank characters, or quoted strings | DSD option, with DELIMITER= option on the INFILE statement | The DSD option enables SAS to read a character value that contains a delimiter within a quoted string; it can also treat two consecutive delimiters as a missing value and remove quotation marks from character values. |
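Each row of Table 21.2 can be sketched as a small DATA step. The data set names, variable names, and sample records are hypothetical; the informat widths and the LENGTH value are assumptions that fit this sample data.

```sas
/* Row 1: $CHARw. keeps the leading blanks that list input would trim. */
data keepblanks;
   input @1 name $char10.;
   datalines;
   Smith
;
run;

/* Row 2: DATALINES4 lets the data contain semicolons; */
/* four semicolons mark the end of the data.           */
data semis;
   input text $char20.;
   datalines4;
first; second
;;;;
run;

/* Row 3: DSD keeps a delimiter inside a quoted string, removes the */
/* quotation marks, and treats two consecutive delimiters as a      */
/* missing value.                                                   */
data delimited;
   infile datalines dsd delimiter=',';
   length name $ 20;
   input name $ city $ amount;
   datalines;
"Smith, John",Cary,100
Jones,,250
;
run;
```

In the last step, the first record should yield the name Smith, John without quotation marks, and the second record should yield a missing value for city.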

Remember the following when reading character data:

  • In a DATA step, when you place a dollar sign ($) after a variable name in the INPUT statement, character data that is read from data lines remains in its original case. If you want SAS to read data from data lines as uppercase, use the CAPS system option or the $UPCASE informat.

  • If the value is shorter than the length of the variable, SAS adds blanks to the end of the value to give the value the specified length. This process is known as padding the value with blanks.
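Both points can be sketched with hypothetical names as follows. The $UPCASE informat is applied through the colon modifier so that list input still stops at the first blank, and LENGTHC (which counts trailing blanks) is compared with LENGTH (which does not) to make the padding visible.

```sas
data cases;
   length padded $ 10;
   input asis $ upper : $upcase8.;   /* asis keeps its case; upper is uppercased */
   padded      = asis;               /* shorter value is padded with blanks      */
   len_stored  = lengthc(padded);    /* counts the padding blanks                */
   len_trimmed = length(padded);     /* ignores trailing blanks                  */
   datalines;
Cary cary
;
run;

proc print data=cases;
run;
```

Here len_stored should equal the declared length of 10, while len_trimmed should equal 4, the number of characters actually read.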




SAS 9.1.3 Language Reference: Concepts, Third Edition, Volumes 1 and 2
ISBN: 1590478401
Year: 2004
Pages: 258
