E.4 Macro Processing | Itanium® Architecture for Programmers: Understanding 64-Bit Processors and EPIC Principles

The principles of good programming and the motivation to enhance programmer productivity and accuracy emphasize such concepts as abstraction, encapsulation, and controlled adaptation. Facilities accessible to the assembly language programmer include subroutines, procedures, functions, repeat blocks, macros, and lexical preprocessing.

In Chapter 7, we studied the design and uses of subroutines, procedures, and functions at the assembly language level. Only one copy of such a routine exists in the linked executable program. Control passes to the routine whenever it is called, and it produces effects that can be mediated by the particular parameters passed. Because these routines conform to a prescribed calling standard and exist in only one copy, subroutines, procedures, and functions are sometimes called closed routines (see Levy and Eckhouse).

Earlier in this appendix, we also studied the design and uses of repeat blocks, which merely replicate code segments with, at most, one substitutable parameter (using .irp or .irpc).

Macros offer capabilities midway between those of callable routines and those of repeat blocks. Like callable routines, macros may have numerous parameters. Like repeat blocks, they insert explicit code expansion inline. Because macros expand into explicit machine instructions at the place of invocation and require no linkage mechanism to pass values or control, macros are sometimes called open routines (see Levy and Eckhouse).

Macros can be very simple, consisting of just a few lines, or they can be highly complex, encompassing dozens of parameters and hundreds of lines with nesting. The extensive applications of macros include the following purposes:

to make the program code more readable (the same motivation as for some of the proceduralization in high-level languages);
to implement a "missing instruction" through emulation in an appropriate and potentially conditional sequence of actual machine instructions supported by the architecture;
to define an emulation of another machine's instructions (this may be rather difficult); and
to create data structures and initialize them statically.

We will illustrate these and other applications throughout the text portion of this chapter and its exercises.

E.4.1 Defining a Macro

A macro must be defined before it can be used. A macro definition begins with the .macro directive and ends with the .endm directive. All intervening lines comprise the body of the macro, which may contain arbitrarily many occurrences of formal parameters that are specified with the .macro directive:

 .macro  name     formal_parameter_list
         ...
         < sequence of statements containing the parameters >
         ...
 .endm

where name is any legal symbol and the formal parameter list consists of symbols, separated by commas, which are to be replaced one-for-one by actual parameters when the macro is used or invoked. Although the macro name is optional on the .endm directive, the assembler provides this facility to verify and enforce the correct nesting of macros.

The range of lines bracketed by .macro and .endm will be subjected to text substitution of the actual parameters wherever their counterpart formal parameters occur in any field of instructions or directives. Each occurrence is subjected to the same text substitution. The process resembles the find and replace operation within a word processor.

When the assembler comes upon a .macro directive, it adds the macro name to an internal table and stores the verbatim text of the macro body up to the matching .endm directive. Further processing is deferred until the macro is expanded when invoked.

Macro names do not need to be unique from other symbols in a program, because the assembler stores them in a dedicated internal table. Similarly, the formal parameter names are associated with a particular macro name and thus do not need to be unique from other symbols or from the parameters for other macros.

When macros are nested, the inner ones are not defined until the outer one is actually invoked and expanded with text-substitution. If a macro invokes itself (recursively), its body must contain conditional tests and one or more .mexit directives that will prevent infinite recursion.

E.4.2 Invoking a Macro

A macro is used, or invoked, by putting its name in the operator field along with values called actual parameters in the specifier field, like this:

 name    actual_parameter_list

where the specified actual parameters will be substituted for the counterpart formal parameters. This substitution proceeds much like the find and replace operation of a word processor.

Before presenting further details about macro parameters and the text substitution process, we show a simple example of a macro. Suppose that an experienced programmer has been accustomed to architectures having clr as a machine instruction for clearing registers, i.e., ensuring that a register contains all 0 bits. Although the Itanium architecture lacks such an instruction mnemonic, we have already encountered situations in our sample programs where registers needed to be initialized to 0. At first we used mov rn=0. Then we learned that mov is only a pseudoinstruction for which the assembler adapts some appropriate machine instruction, perhaps based upon the properties of register r0. A macro can supply the missing clr mnemonic in the same spirit. One such macro would be:

 .macro  clr     REG
         add     \REG = r0,r0     // \REG <-- 0
 .endm

A particular invocation of this macro would be

 clr     r14

which would expand to the Itanium instruction

 add     r14 = r0,r0     // r14 <-- 0

where the text substitution of the actual parameter r14 has occurred for the formal parameter REG in this case.
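The store-then-substitute behavior described above can be imitated in a few lines of Python. This is only a toy model under stated assumptions, not the assembler's actual implementation: the names macro_table, define_macro, and expand_macro are hypothetical, and each formal parameter is assumed to appear in the body with a leading backslash, as in the clr example.

```python
import re

# Hypothetical internal table: macro name -> (formal names, stored body lines).
macro_table = {}

def define_macro(name, formals, body_lines):
    """Store the body verbatim; no substitution happens at definition time."""
    macro_table[name] = (list(formals), list(body_lines))

def expand_macro(name, actuals):
    """Replace every \\FORMAL in the stored body with its actual parameter."""
    formals, body = macro_table[name]
    bindings = dict(zip(formals, actuals))
    pattern = re.compile(r"\\([A-Za-z_][A-Za-z0-9_]*)")

    def substitute(match):
        # Names that are not formal parameters are left untouched.
        return bindings.get(match.group(1), match.group(0))

    return [pattern.sub(substitute, line) for line in body]

define_macro("clr", ["REG"], [r"add     \REG = r0,r0     // \REG <-- 0"])
print("\n".join(expand_macro("clr", ["r14"])))
```

Both occurrences of the formal parameter, including the one inside the comment, are replaced, which mirrors the find-and-replace character of the process.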

We can elaborate this macro, in order to zero more than one register, by allowing the parameter to be a list processed by an inner indefinite repeat block:

 .macro    clr     REGLIST
 .irp      REG, \REGLIST
           add     \REG = r0,r0     // \REG <-- 0
 .endr
 .endm

The stored body of the macro would be a repeat block that takes the form

 .irp    REG, \REGLIST
         add     \REG = r0,r0     // \REG <-- 0
 .endr

When we invoke this somewhat more complicated macro as, for instance,

 clr     "r14,r15,r16"

the formal parameter REGLIST is given the text string "r14,r15,r16" (without the quotation marks) as an actual value. The repeat block will then iterate its range as its single parameter REG takes on successive values in the specified set (first r14, then r15, and finally r16):

 add     r14 = r0,r0      // r14 <-- 0
 add     r15 = r0,r0      // r15 <-- 0
 add     r16 = r0,r0      // r16 <-- 0
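The two stages of this expansion (macro substitution first, then iteration of the repeat block) can be traced with a small Python sketch. It is a simplified model rather than the assembler's algorithm; the helper names substitute, expand_irp, and expand_clr are hypothetical.

```python
import re

def substitute(line, bindings):
    """Replace each \\NAME in line with bindings[NAME]; unknown names are kept."""
    return re.sub(r"\\([A-Za-z_]\w*)",
                  lambda m: bindings.get(m.group(1), m.group(0)), line)

def expand_irp(symbol, values, body):
    """Model of .irp: emit the body once per value, binding symbol each time."""
    out = []
    for value in values:
        out.extend(substitute(line, {symbol: value}) for line in body)
    return out

def expand_clr(reglist_argument):
    # Stage 1: the macro call removes the enclosing quotation marks ...
    reglist = reglist_argument.strip('"')
    # ... and \REGLIST is substituted into the stored .irp line.
    irp_line = substitute(r".irp REG, \REGLIST", {"REGLIST": reglist})
    values = [v.strip() for v in irp_line.split(",")[1:]]
    # Stage 2: the repeat block iterates over the resulting value list.
    return expand_irp("REG", values, [r"add     \REG = r0,r0     // \REG <-- 0"])

for line in expand_clr('"r14,r15,r16"'):
    print(line)
```

Note that \REG is deliberately left alone during the first stage, because REG is a parameter of the repeat block and not of the macro.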

If macros or repeat blocks are nested, invoking the outermost macro or repeat block causes all of the inner structures to expand in accordance with parameter substitution and subject to any conditionals that they may contain.

E.4.3 Processing of Positional Parameters

Formal parameters in a macro definition and actual parameters in a macro call may be separated by commas or spaces. The actual parameter values supplied when a macro is invoked are substituted strictly in accord with their positional relationships to the corresponding formal parameters in the macro definition. The first actual parameter replaces all occurrences of the first formal parameter, the second replaces all occurrences of the second, etc. Null values can be passed by simply using adjacent commas, like this:

 .macro  count   ONE, TWO, THREE  // definition
         count   25, , 30         // invocation

In this schematic call, ONE will be text substituted with 25, THREE will be text substituted with 30, and TWO will be text substituted with an empty string.

Null parameters

Considerable care must be taken with parameters that may be null, unless default values are provided (Section E.4.4). In the example just given, an empty string must be acceptable to the assembler in all the positions where it "substitutes" for formal parameter TWO. For instance, if TWO were used as the specifier for a data8 directive in the macro body, then no data entry would be created by an expansion that yielded data8 with no value list, and the location counter in the data segment would not advance.
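A minimal Python sketch of positional binding makes the null-parameter hazard concrete. It assumes, for simplicity, that actual parameters are separated only by commas; bind_positional is a hypothetical helper, not an assembler facility.

```python
def bind_positional(formals, call_text):
    """Pair comma-separated actuals with formals by position.

    Omitted or empty actuals are bound to the empty string.
    """
    actuals = [a.strip() for a in call_text.split(",")]
    actuals += [""] * (len(formals) - len(actuals))
    return dict(zip(formals, actuals))

bindings = bind_positional(["ONE", "TWO", "THREE"], "25, , 30")
print(bindings)

# A body line that uses TWO as the value list of a data8 directive
# expands to a directive with nothing after it.
expanded = r"data8   \TWO".replace(r"\TWO", bindings["TWO"])
print(repr(expanded))
```

The expanded line contains the directive name and nothing else, which is the situation the text warns about.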

E.4.4 Processing of Default Values and Keyword Parameters

When a macro has numerous parameters, and especially when some of these are optional or usually have default values, the positional formalism becomes awkward. Keyword formalism sometimes offers greater convenience.

Default values

Default values are specified simply by putting "=value" after the name of any formal parameter in a macro definition. Suppose that we modify the simple clr macro:

 .macro  clr     REG=r14
         add     \REG = r0,r0     // \REG <-- 0
 .endm

With this modification, invoking the macro without and with an explicit actual parameter,

 clr
 clr     r16

would result in these expansions:

 add     r14 = r0,r0      // r14 <-- 0
 add     r16 = r0,r0      // r16 <-- 0

In summary, the capability to associate a default value with one or more of the formal parameters when a macro is defined leads to simplification of the macro call but results in the same expansion as if the default value had been explicitly given at the time of the macro call. While the use of such defaults offers convenience and enforces a certain level of standardization, the default values are "invisible" if one is only reading the macro invocation statement itself.
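The effect of a default can be sketched in Python as follows. The helpers parse_formals and bind_with_defaults are hypothetical, and the sketch assumes that an omitted or empty actual parameter falls back to the default given in the definition.

```python
def parse_formals(formal_text):
    """Split 'REG=r14, OTHER' into (name, default) pairs; default is '' if absent."""
    formals = []
    for item in formal_text.split(","):
        name, _, default = item.strip().partition("=")
        formals.append((name.strip(), default.strip()))
    return formals

def bind_with_defaults(formals, actuals):
    """Use the supplied actual when present, otherwise the formal's default."""
    bindings = {}
    for i, (name, default) in enumerate(formals):
        supplied = actuals[i] if i < len(actuals) else ""
        bindings[name] = supplied if supplied else default
    return bindings

def expand_clr(actuals):
    bindings = bind_with_defaults(parse_formals("REG=r14"), actuals)
    return r"add     \REG = r0,r0".replace(r"\REG", bindings["REG"])

print(expand_clr([]))        # call with no actual parameter
print(expand_clr(["r16"]))   # call with an explicit register
```

In this model the expansion is identical whether r14 comes from the default or is written out in the call, which is the point made in the summary above.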

Keyword parameters

Keyword parameters are specified by putting "=value" after the name of any formal parameter in a macro invocation. Values for parameters can be specified in any order when keywords are used, like this:

 .macro  count  ONE, TWO, THREE  // definition
         count  THREE=30, ONE=25 // invocation

In this schematic call, ONE will be text substituted with 25, THREE will be text substituted with 30, but TWO will be text substituted with an empty string. Here again, it is important to ensure that any null values will have the desired results.

Keyword and positional parameters may be mixed, but this practice may reduce the readability of a program. Usually the first few positional parameters are mandatory, and then numerous additional parameters may be optionally designated by keyword (in any order) when the macro is used.
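One way to picture mixed keyword and positional binding is the Python sketch below. It is an illustrative simplification with a hypothetical helper, bind_arguments: items of the form NAME=value that name a formal parameter are bound by keyword, all other items are bound by position, and the sketch does not try to reproduce how a real assembler resolves conflicts between the two styles.

```python
def bind_arguments(formals, call_text):
    """Bind actuals to formals: NAME=value by keyword, the rest by position."""
    bindings = {name: "" for name in formals}   # unbound formals stay null
    position = 0
    for item in call_text.split(","):
        item = item.strip()
        name, equals, value = item.partition("=")
        if equals and name.strip() in bindings:
            bindings[name.strip()] = value.strip()
        else:
            if position < len(formals):
                bindings[formals[position]] = item
            position += 1
    return bindings

formals = ["ONE", "TWO", "THREE"]
print(bind_arguments(formals, "THREE=30, ONE=25"))  # keywords in any order
print(bind_arguments(formals, "25, THREE=30"))      # positional, then keyword
```

Both calls leave TWO bound to the empty string, so the caution about null values applies equally to the keyword style.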

E.4.5 Processing of String Parameters

Simple strings consisting exclusively of alphanumeric characters can be passed as actual macro parameters without special concern. The GCC assembler preserves upper- or lowercase alphabetic characters.

Strings that contain any of the characters that normally separate actual macro parameters (commas, tabs, spaces) must be enclosed within quotation marks when they are being passed as actual parameters to a macro. For example, a macro defined as:

 .macro  asc     STR
         stringz "\STR"
 .endm

could be called in these ways:

 asc     Abc
 asc     "d,e f"

with these results:

 stringz "Abc"
 stringz "d,e f"

That is, the quotation marks enclosing an argument in a call are removed. The quotation marks within the macro definition are needed for the syntax of the stringz storage directive.
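The role of the quotation marks can be illustrated with a small Python tokenizer. This is a sketch under simplifying assumptions, with hypothetical helpers split_actuals and expand_asc: separators are commas, spaces, and tabs, quotation marks protect separators and are then discarded, and null (empty) unquoted parameters are ignored here for brevity.

```python
def split_actuals(call_text):
    """Split on commas, spaces, and tabs outside quotation marks.

    Quotation marks group characters into one actual and are then dropped.
    Empty unquoted parameters are not modeled in this sketch.
    """
    actuals, current = [], []
    in_quotes = False
    seen = False          # True once the current actual has any content
    for ch in call_text:
        if ch == '"':
            in_quotes = not in_quotes
            seen = True
        elif ch in ", \t" and not in_quotes:
            if seen:
                actuals.append("".join(current))
            current, seen = [], False
        else:
            current.append(ch)
            seen = True
    if seen:
        actuals.append("".join(current))
    return actuals

def expand_asc(call_text):
    (string,) = split_actuals(call_text)   # asc takes exactly one parameter
    return r'stringz "\STR"'.replace(r"\STR", string)

print(expand_asc("Abc"))
print(expand_asc('"d,e f"'))
```

Without the quotation marks, the second call would be split into three separate actual parameters (d, e, and f), and only the first would be bound to STR.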