3.4 Assembler Statement Types

Learning a new language, whether a human language or a computer language, involves, at the outset, a large amount of new terminology. When we first introduced the SQUARES program (Figure 1-3), we left many details unexplained. Although we will now proceed somewhat more systematically, you will still need some patience when we introduce several new concepts all at once and then return to discuss them individually at a later time.

We can think of each line in an assembly language program as being a statement that may be imperative, declarative, or controlling:

Imperative statements represent machine instructions in symbolic form. For example, add means an addition instruction and mov denotes a data transfer instruction in SQUARES (Figure 1-3).
Declarative statements control allocation of storage or perform various naming functions. That is, such statements are not actual machine instructions; rather, they reserve space, define symbols, or assign particular initial contents to memory locations. For example, sq1: .skip 8 in SQUARES reserves room in memory to store a quad word (8 bytes) and relates that storage to the symbolic address sq1.
Control statements allow the programmer to have some control over certain portions of the assembly process. For example, .align in SQUARES informs the assembler that the following data or instructions should begin at the next appropriate addressing boundary.

Declarative statements, which usually have names beginning with a dot (.), are only directives to the assembler program; they do not generate machine instructions to be executed at runtime. Control statements may indirectly produce machine instructions (perhaps to save or restore register contents).

3.4.1 Statement Format

Unlike high-level languages, where the syntax for various statement types may differ sharply, assembly language uses the same general format for every line, with only a few elements in a standard order: label, predicate, operator, specifiers, comment. The style of assembly language statements differs from one architecture to another, and indeed sometimes from one programming environment to another. For example, the punctuation character that introduces in-line comments varies, just as it does among high-level languages. For Itanium assembly language, the generic statement format is as follows:

 Label:  (Predicate)  Operator  Specifier1, ...  // Comment

where

Label is a symbolic address in the form of a character string that most assemblers expect to be terminated by a colon (:). Some assemblers can use a double colon (::) to define the symbol as global.
(Predicate) optionally names one of the Itanium predicate registers (Appendix D.3) whose current Boolean value (0 or 1) controls whether or not this instruction is executed.
Operator names a specific statement, which may be one of the three types. This is usually a single word, although many Itanium assembly language instructions use completers, e.g., .ret in the final branch instruction in SQUARES. A space or tab character marks the end of this field.
Specifier is a symbolic name, value, or expression that fills out the statement. For a machine instruction, a specifier is an operand, such as the name of a processor register or a symbolic memory address. For a directive, it is some kind of argument or parameter. Multiple specifiers are separated by commas; the last specifier is followed either by the end of the line or, preferably, by one or more spaces and then the double slashes that begin a comment. A double semicolon (;;) is used in Itanium assembly language to mark breaks where subsequent instructions may not be able to execute in parallel with the current instruction because of resource conflicts and dependencies.
Comment is a human-language description of the line. A comment begins with a double slash (//) and ends at the end of the line.

The label, predicate, and comment fields are syntactically optional. The number of specifiers depends on the statement type. With these few restrictions, the statement format is free form and not bound to specific columns (in contrast to COBOL). Spaces and tabs are generally interchangeable. We encourage keeping the fields neatly lined up for legibility.
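
To make the field order concrete, the following Python sketch splits one line into the five fields of the generic format. It is only an illustration under stated assumptions, not how any real Itanium assembler works: the names STATEMENT and parse_statement are hypothetical, the sample lines are modeled loosely on SQUARES rather than copied from it, and we assume that the = sign in an instruction separates specifiers in the same way a comma does.

```python
import re

# Hypothetical pattern for:  Label:  (Predicate)  Operator  Specifier1, ...
STATEMENT = re.compile(
    r"^\s*"
    r"(?:(?P<label>[A-Za-z_.$][\w.$]*)::?\s*)?"  # optional label ending in ':' or '::'
    r"(?:\((?P<predicate>p\d+)\)\s*)?"           # optional predicate register in parentheses
    r"(?P<operator>[.\w]+)?"                     # operator, including any completers
    r"\s*(?P<rest>.*)$"                          # remaining text holds the specifiers
)


def parse_statement(line):
    """Split one assembly line into its fields (illustrative sketch only)."""
    # A comment runs from the double slash to the end of the line.
    code, _, comment = line.partition("//")
    code = code.strip()
    # A trailing double semicolon is recorded separately from the specifiers.
    stop = code.endswith(";;")
    if stop:
        code = code[:-2].rstrip()
    match = STATEMENT.match(code)
    # Assumption: '=' and ',' both separate specifiers.
    specifiers = [s.strip() for s in re.split(r"[=,]", match.group("rest")) if s.strip()]
    return {
        "label": match.group("label"),
        "predicate": match.group("predicate"),
        "operator": match.group("operator"),
        "specifiers": specifiers,
        "stop": stop,
        "comment": comment.strip() or None,
    }


if __name__ == "__main__":
    # Hypothetical sample lines in the style of SQUARES.
    for text in [
        "sq1:    .skip   8              // reserve a quad word",
        "(p6)    add     r22 = r21, r20 ;;",
        "done:   br.ret.sptk.many b0 ;;  // return to caller",
    ]:
        print(parse_statement(text))
```

Because the label, predicate, and comment fields are optional, each of them comes back as None when absent; the free-form layout means the sketch never inspects column positions.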

In this book we will introduce only some of the capabilities of assemblers available for the Itanium architecture. Additional details are given in the sources cited at the end of this and later chapters.

3.4.2 Symbolic Addresses

Labels assign symbolic names to data locations like sq3 in SQUARES (Figure 1-3). The assembler and linker keep track of the numeric values corresponding to such symbolic names. We can also assign symbolic names to particular points in a program, such as the statement marked with the label done. Doing so defines logical flow and makes those names accessible during a debugging session. When symbolic names are referenced in other statements, such as branches or procedure calls, the assembler and linker keep track of the actual numeric addresses.

We should perhaps emphasize that the labels that we assign to data locations represent the address of the storage location, not the value stored there. That is, the st8 instruction really means "store the specified quad word value into the specified quad word location in memory."
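
The distinction between an address and the value stored there can be sketched in Python. This is a simplified model under stated assumptions, not the actual bookkeeping of any assembler or linker: the function assign_addresses is hypothetical, the offsets are relative to an assumed section start of 0, the label sq2 is assumed to sit between sq1 and sq3, and only the .skip and .align directives are modeled.

```python
def assign_addresses(statements, start=0):
    """Return {label: offset} for a list of (label, directive, argument) tuples.

    Simplified sketch: '.skip n' reserves n bytes and '.align n' rounds the
    location counter up to the next multiple of n.
    """
    symbols = {}
    location = start
    for label, directive, argument in statements:
        if directive == ".align":
            # Round up to the requested boundary before placing anything.
            location = (location + argument - 1) // argument * argument
        if label is not None:
            # The label names the current location, not its contents.
            symbols[label] = location
        if directive == ".skip":
            location += argument
    return symbols


if __name__ == "__main__":
    # Hypothetical data section in the style of SQUARES.
    data_section = [
        (None, ".align", 8),
        ("sq1", ".skip", 8),
        ("sq2", ".skip", 8),
        ("sq3", ".skip", 8),
    ]
    symbols = assign_addresses(data_section)
    print(symbols)  # {'sq1': 0, 'sq2': 8, 'sq3': 16}

    # Memory modeled as offset -> quad word value; a store writes a value
    # at the address that the label stands for.
    memory = {}
    memory[symbols["sq1"]] = 1
    print(memory)  # {0: 1}
```

Here symbols["sq1"] is the address that the label represents, while memory[symbols["sq1"]] is the value stored at that address.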

3.4.3 Classes of Assembly Language Operators

Our sample program illustrates several of the items that can appear in the operator field of a line in an Itanium assembly language program. Table 3-1 organizes these into three classes: opcodes, pseudo-operations, and assembler directives.

Names for opcodes like add are usually chosen as mnemonic references to the various fundamental instructions supported by an architecture. Assemblers also predefine certain pseudo-operation (pseudo-op) codes like mov, which may resemble fundamental operations in other architectures but can be accomplished as special cases of certain Itanium machine instructions and thus need not be redundantly implemented at the hardware level. Directives to assemblers frequently have names that begin with a dot character. Some programming environments also provide system-defined macros whose names may conventionally begin with a dollar sign.

Table 3-1. Classes of Operators in SQUARES
Class	Operator	Purpose
Itanium opcodes	`add`	Quad word addition
	`addl`	Addition of a 22-bit constant
	`st8`	Copy quad word from register to memory
	`br`	Branch, call, or return
Itanium pseudo-operations	`mov`	Copy item on right into register on left
Assembler directives	`.data`	Switch to memory region for data
	`.skip`	Allocate bytes of memory storage
	`.text`	Switch to memory region for instructions
	`.align`	Round up to a specified address granularity
	`.global`	Make a symbol globally accessible
	`.proc`	Mark entry of a procedure
	`.body`	Mark the program body
	`.endp`	Mark end of procedure coding
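
The classification in Table 3-1 can be expressed as a small lookup, sketched here in Python. The names OPERATOR_CLASSES and operator_class are hypothetical, the table covers only the operators that appear in SQUARES, and we assume that completers such as .ret are set off from the base opcode by dots and can be ignored for classification.

```python
# Classes of operators taken from Table 3-1 (SQUARES only).
OPERATOR_CLASSES = {
    "add": "opcode",
    "addl": "opcode",
    "st8": "opcode",
    "br": "opcode",
    "mov": "pseudo-operation",
    ".data": "directive",
    ".skip": "directive",
    ".text": "directive",
    ".align": "directive",
    ".global": "directive",
    ".proc": "directive",
    ".body": "directive",
    ".endp": "directive",
}


def operator_class(operator):
    """Return the Table 3-1 class of an operator, or 'unknown'."""
    if operator.startswith("."):
        # Directive names keep their leading dot.
        key = operator
    else:
        # Assumption: anything after the first dot is a completer.
        key = operator.split(".", 1)[0]
    return OPERATOR_CLASSES.get(key, "unknown")


if __name__ == "__main__":
    for name in ["add", "mov", ".skip", "br.ret.sptk.many"]:
        print(name, "->", operator_class(name))
```

An operator that is not listed in the table is reported as unknown rather than guessed at, since the table does not attempt to cover the full Itanium instruction set.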
