Defines elements of an array
Valid: in a DATA step
Category: Information
Type: Declarative
ARRAY array-name { subscript } <$> <length>
<array-elements> <(initial-value-list)>;
array-name
names the array.
Restriction: Array-name must be a SAS name that is not the name of a SAS variable in the same DATA step.
CAUTION:
Using the name of a SAS function as an array name can cause unpredictable results. If you inadvertently use a function name as the name of the array, SAS treats parenthetical references that involve the name as array references, not function references, for the duration of the DATA step. A warning message is written to the SAS log.
{ subscript }
describes the number and arrangement of elements in the array by using an asterisk, a number, or a range of numbers. Subscript has one of these forms:
{dimension-size(s)}
indicates the number of elements in each dimension of the array. Dimension-size is a numeric representation of either the number of elements in a one-dimensional array or the number of elements in each dimension of a multidimensional array.
Tip: You can enclose the subscript in braces ({ }), brackets ([ ]), or parentheses (( )).
Example: An array with one dimension can be defined as
array simple{3} red green yellow;
This ARRAY statement defines an array that is named SIMPLE that groups together three variables that are named RED, GREEN, and YELLOW.
Example: An array with more than one dimension is known as a multidimensional array. You can have any number of dimensions in a multidimensional array. For example, a two-dimensional array provides row and column arrangement of array elements. This statement defines a two-dimensional array with five rows and three columns:
array x{5,3} score1-score15;
SAS places variables into a two-dimensional array by filling all rows in order, beginning at the upper-left corner of the array (known as row-major order).
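The row-major fill order described above can be checked with a short sketch. The second array FLAT and the variables K, R2C1, and R5C3 are hypothetical names introduced only for this illustration; FLAT is simply a one-dimensional view of the same fifteen variables.

```sas
data _null_;
   array x{5,3} score1-score15;
   /* Hypothetical one-dimensional view of the same variables */
   array flat{15} score1-score15;

   /* Store each variable's position in the list as its value */
   do k = 1 to 15;
      flat{k} = k;
   end;

   /* Row 2, column 1 is the 4th variable; row 5, column 3 is the 15th */
   r2c1 = x{2,1};
   r5c3 = x{5,3};
   put r2c1= r5c3=;   /* writes r2c1=4 r5c3=15 to the log */
run;
```

Because the first row consumes SCORE1 through SCORE3, the element in row 2, column 1 is SCORE4.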
{<lower:>upper<, ...<lower:>upper>}
are the bounds of each dimension of an array, where lower is the lower bound of that dimension and upper is the upper bound.
Range: In most explicit arrays, the subscript in each dimension of the array ranges from 1 to n, where n is the number of elements in that dimension.
Example: In the following example, the value of each dimension is by default the upper bound of that dimension.
array x{5,3} score1-score15;
As an alternative, the following ARRAY statement is a longhand version of the previous example:
array x{1:5,1:3} score1-score15;
Tip: For most arrays, 1 is a convenient lower bound; thus, you do not need to specify the lower and upper bounds. However, specifying both bounds is useful when the array dimensions have a convenient beginning point other than 1.
Tip: To reduce the computational time that is needed for subscript evaluation, specify a lower bound of 0.
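As an illustration of a lower bound other than 1, the following sketch indexes an array by calendar year. The array name YR, the variables SALES2001-SALES2005, and the index variable Y are assumptions made for this example; the LBOUND and HBOUND functions return the bounds of the dimension.

```sas
data _null_;
   /* Subscripts run from 2001 to 2005 instead of 1 to 5 */
   array yr{2001:2005} sales2001-sales2005;

   /* Loop over whatever bounds the array was declared with */
   do y = lbound(yr) to hbound(yr);
      yr{y} = y - 2000;
   end;

   put sales2001= sales2005=;   /* writes sales2001=1 sales2005=5 */
run;
```

Looping from LBOUND to HBOUND keeps the DO statement correct if the declared bounds change later.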
{ * }
indicates that SAS is to determine the subscript by counting the variables in the array. When you specify the asterisk, also include array-elements.
Restriction: You cannot use the asterisk with _TEMPORARY_ arrays or when you define a multidimensional array.
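A minimal sketch of the asterisk form, using the DIM function to report the size that SAS determined by counting the listed variables. The variable N is a hypothetical name used only here.

```sas
data _null_;
   /* SAS counts the five listed variables to size the array */
   array month{*} jan feb jul oct nov;

   n = dim(month);
   put n=;   /* writes n=5 */
run;
```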
$
indicates that the elements in the array are character elements.
Tip: The dollar sign is not necessary if the elements have been previously defined as character elements.
length
specifies the length of elements in the array that have not been previously assigned a length.
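The dollar sign and length can be combined, as in this sketch. The array name NAME and the variables GIVEN, MIDDLE, FAMILY, and LEN are assumptions for illustration; the VLENGTH function reports the declared length of a variable.

```sas
data _null_;
   /* Three character elements, each with a length of 12 bytes */
   array name{3} $ 12 given middle family;

   given = 'Alexandria';

   /* VLENGTH returns the declared length, not the length of the current value */
   len = vlength(given);
   put len=;   /* writes len=12 */
run;
```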
array-elements
names the elements that make up the array. Array-elements must be either all numeric or all character, and they can be listed in any order. The elements can be
variables
lists variable names.
Range: The names must be either variables that you define in the ARRAY statement or variables that SAS creates by concatenating the array name and a number. For instance, when the subscript is a number (not the asterisk), you do not need to name each variable in the array. Instead, SAS creates variable names by concatenating the array name and the numbers 1, 2, 3, ... n.
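The automatic naming can be seen in a short sketch. The output data set WORK.DEMO and the index variable I are hypothetical; because no variables are listed, SAS creates QBX1 through QBX10.

```sas
data work.demo;
   /* No element list: SAS creates qbx1, qbx2, ..., qbx10 */
   array qbx{10};

   do i = 1 to dim(qbx);
      qbx{i} = i * 10;
   end;

   put qbx1= qbx10=;   /* writes qbx1=10 qbx10=100 */
   drop i;
run;
```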
Tip: These SAS variable lists enable you to reference variables that have been previously defined in the same DATA step:
_NUMERIC_
indicates all numeric variables.
_CHARACTER_
indicates all character variables.
_ALL_
indicates all variables.
Restriction: If you use _ALL_, all the previously defined variables must be of the same type.
Featured in: Example 1 on page 1104
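One common use of the _NUMERIC_ list is to apply the same operation to every numeric variable that has been defined so far. The following is a sketch under stated assumptions: the data sets WORK.RAW and WORK.CLEAN, the variables A, B, and C, and the choice to replace missing values with 0 are all hypothetical.

```sas
/* Hypothetical input data with some missing numeric values */
data work.raw;
   input a b c;
   datalines;
1 . 3
. 5 .
;
run;

data work.clean;
   set work.raw;

   /* Groups a, b, and c: the numeric variables defined before this statement */
   array nums{*} _numeric_;

   /* i is created after the ARRAY statement, so it is not an array element */
   do i = 1 to dim(nums);
      if missing(nums{i}) then nums{i} = 0;
   end;
   drop i;
run;
```

Placing the ARRAY statement after the SET statement matters here, because the list includes only variables that are already defined at that point in the step.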
_TEMPORARY_
creates a list of temporary data elements.
Range: Temporary data elements can be numeric or character.
Tip: Temporary data elements behave like DATA step variables with these exceptions:
They do not have names. Refer to temporary data elements by the array name and dimension.
They do not appear in the output data set.
You cannot use the special subscript asterisk (*) to refer to all the elements.
Temporary data element values are always automatically retained, rather than being reset to missing at the beginning of the next iteration of the DATA step.
Tip: Arrays of temporary elements are useful when the only purpose for creating an array is to perform a calculation. To preserve the result of the calculation, assign it to a variable. You can improve performance time by using temporary data elements.
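A temporary array is often used as a small lookup table. This sketch assumes hypothetical names (RATE, TIER, FEE) and made-up rate values; the result is assigned to an ordinary variable because the temporary elements themselves are never written to an output data set.

```sas
data _null_;
   /* Unnamed elements that exist only for the duration of the step */
   array rate{3} _temporary_ (0.05 0.10 0.15);

   do tier = 1 to 3;
      /* Preserve the calculation in a regular variable */
      fee = 200 * rate{tier};
      put tier= fee=;
   end;
run;
```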
( initial-value-list )
gives initial values for the corresponding elements in the array. The values for elements can be numbers or character strings. You must enclose all character strings in quotation marks. To specify one or more initial values directly, use the following format:
( initial-value(s) )
To specify an iteration factor and nested sublists for the initial values, use the following format:
<constant-iter-value*> <(>constant value | constant-sublist<)>
Restriction: If you specify both an initial-value-list and array-elements, then array-elements must be listed before initial-value-list in the ARRAY statement.
Tip: You can assign initial values to both variables and temporary data elements.
Tip: Elements and values are matched by position. If there are more array elements than initial values, the remaining array elements receive missing values and SAS issues a warning.
Featured in: Example 2 on page 1104, and Example 3 on page 1104
Tip: You can separate the values in the initial value list with either a comma or a blank space.
Tip: You can also use a shorthand notation for specifying a range of sequential integers. The increment is always +1.
Tip: If you have not previously specified the attributes of the array elements (such as length or type), the attributes of any initial values that you specify are automatically assigned to the corresponding array element.
Note: Initial values are retained until a new value is assigned to the array element.
Tip: When any (or all) elements are assigned initial values, all elements behave as if they were named on a RETAIN statement.
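The retain behavior can be illustrated with a running total. This is a sketch with hypothetical names (WORK.THREE, WORK.RUNNING, ID, ACC, RUNNING_TOTAL); the one-element array exists only to show that an initial value causes the element to keep its value across iterations.

```sas
/* Hypothetical input: three observations with id = 1, 2, 3 */
data work.three;
   do id = 1 to 3;
      output;
   end;
run;

data work.running;
   set work.three;

   /* The initial value 0 is assigned once; the element is then retained */
   array acc{1} running_total (0);

   /* running_total is 1, 3, 6 across the three observations */
   acc{1} = acc{1} + id;
run;
```

Without the initial value, RUNNING_TOTAL would be reset to missing at the start of each iteration and the sum would be missing in every observation.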
Examples: The following examples show how to use the iteration factor and nested sublists. All of these ARRAY statements contain the same initial value list:
ARRAY x{10} x1-x10 (10*5);
ARRAY x{10} x1-x10 (5*(5 5));
ARRAY x{10} x1-x10 (5 5 3*(5 5) 5 5);
ARRAY x{10} x1-x10 (2*(5 5) 5 5 2*(5 5));
ARRAY x{10} x1-x10 (2*(5 2*(5 5)));
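To confirm that a nested list expands as expected, one option is to sum the elements after initialization. This sketch uses the last statement above; the variable TOTAL is a hypothetical name.

```sas
data _null_;
   /* 2*(5 2*(5 5)) expands to two copies of (5 5 5 5 5), that is, ten 5s */
   array x{10} x1-x10 (2*(5 2*(5 5)));

   total = sum(of x{*});
   put total=;   /* writes total=50 */
run;
```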
The ARRAY statement defines a set of elements that you plan to process as a group. You refer to elements of the array by the array name and subscript. Because you usually want to process more than one element in an array, arrays are often referenced within DO groups.
Arrays in the SAS language are different from those in many other languages. A SAS array is simply a convenient way of temporarily identifying a group of variables. It is not a data structure, and array-name is not a variable.
An ARRAY statement defines an array. An array reference uses an array element in a program statement.
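The distinction between defining and referencing an array can be sketched as follows. The initial values and the variables I and TOTAL are assumptions made for this illustration.

```sas
data _null_;
   /* ARRAY statement: defines the array (made-up initial values) */
   array rain{5} janr febr marr aprr mayr (1 2 3 4 5);

   total = 0;
   do i = 1 to dim(rain);
      /* Array reference: uses one element in a program statement */
      total = total + rain{i};
   end;

   put total=;   /* writes total=15 */
run;
```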
array rain {5} janr febr marr aprr mayr;
array days{7} d1-d7;
array month{*} jan feb jul oct nov;
array x{*} _NUMERIC_;
array qbx{10};
array meal{3};
array test{4} t1 t2 t3 t4 (90 80 70 70);
array test{4} t1-t4 (90 80 2*70);
array test{4} _TEMPORARY_ (90 80 70 70);
array test2{*} $ a1 a2 a3 ('a','b','c');
array new{2:5} green jacobs denato fetzer;
array x{5,3} score1-score15;
array test{3:4,3:7} test1-test10;
array temp{0:999} _TEMPORARY_;
ARRAY x{10} (2*1:5);
The following example shows that you can create a range of variable names that have leading zeroes. Each variable name has a length of three characters, and the names sort correctly (A01, A02, ... A10). Without leading zeroes, the variable names would sort in the following order: A1, A10, A2, ... A9.
options pageno=1 nodate ps=64 ls=80;
data test (drop=i);
   array a(10) A01-A10;
   do i=1 to 10;
      a(i)=i;
   end;
run;
proc print noobs data=test;
run;
                              The SAS System                              1

A01    A02    A03    A04    A05    A06    A07    A08    A09    A10

  1      2      3      4      5      6      7      8      9     10
Statement:
'Array Reference Statement' on page 1105
'Array Processing' in SAS Language Reference: Concepts