8.6 Predication Based on Floating-Point Values

Comparisons between pairs of data values lie close to the heart of truly useful CPU operations from a programmer's perspective, as we said in Chapter 5. The outcome of a comparison is a Boolean true or false condition that directs the selective execution of portions of an algorithm that require choices, or branching.

We saw that there is a close relationship between comparisons and the concept called predication. The Itanium architecture implements predication by storing the Boolean result of a comparison in a pair of predicate registers. In general, a true outcome sets the first predicate register to 1 and the second to 0; a false outcome instead sets the first to 0 and the second to 1. In special cases where only one predicate register is needed, the permanently true predicate register Pr0 can be substituted for the other position.

The Itanium ISA provides a full complement of instruction variants that compare pairs of full-precision floating-point numbers in register format; these instructions set a pair of predicate registers as outlined. The Itanium ISA also provides parallel forms of compare instructions that make multiple simultaneous comparisons of data elements packed as single-precision floating-point numbers; those instructions capture the multiple independent Boolean true or false outcomes into a floating-point register.
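As a rough illustration of the parallel form, the following Python sketch packs two element-wise outcomes into a single 64-bit value. The function name fpcmp_lt and the encoding of each outcome as a 32-bit field of all ones or all zeros are assumptions made for illustration; the text above states only that the outcomes are captured in a floating-point register.

```python
def fpcmp_lt(f2_pair, f3_pair):
    """Hypothetical model: element-wise 'less than' on two packed pairs.

    Each argument is a (high, low) pair standing in for the two
    single-precision elements packed into one floating-point register.
    """
    result = 0
    for a, b in zip(f2_pair, f3_pair):
        mask = 0xFFFFFFFF if a < b else 0x00000000  # assumed mask encoding
        result = (result << 32) | mask
    return result


# High elements: 1.0 < 2.0 is true; low elements: 5.0 < 3.0 is false.
print(hex(fpcmp_lt((1.0, 5.0), (2.0, 3.0))))  # 0xffffffff00000000
```

Unlike the scalar compare, nothing here touches a predicate register; a program would have to inspect or apply the resulting mask with further instructions.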

We also include in this context the Itanium floating-point class instruction, which can distinguish among floating-point contents of negative, NaN, NaTVal, and other values.

8.6.1 Floating-Point Compare Instruction

Almost all modern computer architectures provide not only for detecting whether two floating-point quantities are equal, but also for determining how two unequal quantities actually differ. In algebraic notation, there are six useful cases altogether: equal (=), not equal (!=), less than (<), less than or equal (<=), greater than or equal (>=), and greater than (>).

The syntax and behavior of fcmp, the Itanium floating-point compare instruction, are similar to those of cmp, the integer comparison instruction, but with only one basic form and more comparison types:

 fcmp.fcrel.fctype p1,p2=f2,f3  // always two registers 

where two predicate registers (Pr0-Pr63) must always be specified, and Pr0 (the permanently true predicate register) may be used in either position. In their most common use, these instructions can be read in pseudo-English from left to right.

The fcmp instruction sets p1 true and p2 false if f2 fcrel f3 is true; otherwise, it sets p1 false and p2 true.
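The pseudo-English reading can be modelled in a few lines of Python. This is a minimal sketch, not Itanium code: the function name fcmp and the use of Python's operator functions to stand in for fcrel are illustrative assumptions.

```python
import operator


def fcmp(rel, f2, f3):
    """Return the predicate pair (p1, p2) for the test `f2 rel f3`."""
    outcome = bool(rel(f2, f3))
    # A true outcome gives (1, 0); a false outcome gives (0, 1).
    return (1, 0) if outcome else (0, 1)


# "Set p6 and p7 according to whether 1.5 is less than 2.0."
p6, p7 = fcmp(operator.lt, 1.5, 2.0)
print(p6, p7)  # 1 0

# The same test with the operands reversed gives the opposite pair.
p6, p7 = fcmp(operator.lt, 2.0, 1.5)
print(p6, p7)  # 0 1
```

Because the two predicates are always complementary in this case, one of them can guard the "then" path of an algorithm and the other the "else" path.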

The most frequently used codes for fcrel (the conditional relationship completer) are obvious and easily remembered: eq, neq, lt, le, ge, and gt, in the same order as above. Itanium assemblers implement additional completers for the negated relations: nlt, nle, nge, and ngt.

Comparisons involving plus or minus infinity execute as expected, but comparisons involving zero ignore the sign: +0.0 tests as equal to -0.0.
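These rules come from the IEEE standard rather than from anything peculiar to Itanium, so they can be observed in any language with IEEE arithmetic. The short Python check below is offered only as an illustration; it assumes Python floats are IEEE double-precision values, which differ in width from the Itanium register format.

```python
import math

pos_inf = float("inf")
neg_inf = float("-inf")

# Infinities order as expected against the largest finite magnitudes.
print(neg_inf < -1.0e308 < 1.0e308 < pos_inf)  # True

# The two zeros compare as equal even though their sign bits differ.
print(0.0 == -0.0)                # True
print(math.copysign(1.0, -0.0))   # -1.0
```

A program that must distinguish the two zeros therefore cannot rely on a comparison alone and has to examine the sign separately.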

Additionally, the IEEE standard defines a special "unordered" relation that is true if one or both operands are NaN (not a number). The Itanium floating-point compare instruction uses the completer unord (unordered) to test for this relationship and the completer ord (ordered) for the Boolean opposite.
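A small Python sketch shows why a separate relation is needed: every ordinary ordering test involving a NaN is false. The helper names unord and ord_ are chosen here to echo the completers and are not part of any library.

```python
import math


def unord(f2, f3):
    """True if one or both operands are NaN."""
    return math.isnan(f2) or math.isnan(f3)


def ord_(f2, f3):
    """Boolean opposite of unord (underscore avoids Python's built-in ord)."""
    return not unord(f2, f3)


nan = float("nan")
print(unord(nan, 1.0), unord(1.0, 2.0))   # True False
print(ord_(nan, nan), ord_(1.0, 2.0))     # False True

# A NaN is neither equal to, less than, nor greater than anything.
print(nan == nan, nan < 1.0, nan >= 1.0)  # False False False
```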

All these comparisons can be implemented as variations of only a few fundamental digital logic operations at the hardware level. The designers of the Itanium architecture chose eq, lt, le, and unord as the necessary and sufficient set.
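One way to see that four relations suffice is to derive the others by swapping the two source operands, swapping the two target predicates, or both. The Python sketch below works through that logic; the table name DERIVED and the function fcmp are illustrative, and the sketch makes no claim about how any particular assembler encodes these forms.

```python
import math


def eq(a, b): return a == b
def lt(a, b): return a < b
def le(a, b): return a <= b
def unord(a, b): return math.isnan(a) or math.isnan(b)


# relation -> (fundamental relation, swap operands?, swap predicates?)
DERIVED = {
    "eq":    (eq,    False, False),
    "lt":    (lt,    False, False),
    "le":    (le,    False, False),
    "unord": (unord, False, False),
    "gt":    (lt,    True,  False),  # f2 > f3  is  f3 < f2
    "ge":    (le,    True,  False),  # f2 >= f3 is  f3 <= f2
    "neq":   (eq,    False, True),   # not equal
    "nlt":   (lt,    False, True),
    "nle":   (le,    False, True),
    "ngt":   (lt,    True,  True),
    "nge":   (le,    True,  True),
    "ord":   (unord, False, True),
}


def fcmp(rel, f2, f3):
    """Return (p1, p2) using only the four fundamental relations."""
    base, swap_operands, swap_predicates = DERIVED[rel]
    a, b = (f3, f2) if swap_operands else (f2, f3)
    p1 = int(base(a, b))
    p2 = 1 - p1
    return (p2, p1) if swap_predicates else (p1, p2)


nan = float("nan")
print(fcmp("gt", 2.0, 1.0))   # (1, 0)
print(fcmp("ge", nan, 1.0))   # (0, 1)
print(fcmp("nlt", nan, 1.0))  # (1, 0)
```

The last two lines show why the negated relations are not redundant: with a NaN operand, "not less than" is true while "greater than or equal" is false.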

Two choices of fctype (the comparison type) are offered. The default, with no completer at all, gives a standard comparison as just described; like any other predicated instruction, it leaves both predicate registers unchanged if it is itself subject to a predication of false. The other choice, unc, gives an unconditional comparison: if the compare instruction itself is subject to a predication of false, both predicate outcomes are set to 0; if it is subject to a predication of true, they are set to 1 and 0 (or 0 and 1) as already described.
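The difference between the two types shows up only when the compare instruction is itself predicated off. The following Python sketch models that case for a "less than" test; the function name fcmp_lt and its parameters (qp for the predicate guarding the instruction, old_p1 and old_p2 for the prior contents of the target predicates) are hypothetical.

```python
def fcmp_lt(qp, f2, f3, old_p1, old_p2, unc=False):
    """Model of `(qp) fcmp.lt[.unc] p1,p2=f2,f3`; returns the new (p1, p2)."""
    if not qp:
        # Predicated off: unc clears both targets, while the default
        # type does nothing and the old values survive.
        return (0, 0) if unc else (old_p1, old_p2)
    return (1, 0) if f2 < f3 else (0, 1)


# Guard true: both types behave identically.
print(fcmp_lt(1, 1.0, 2.0, 1, 1))            # (1, 0)
print(fcmp_lt(1, 1.0, 2.0, 1, 1, unc=True))  # (1, 0)

# Guard false: the default keeps stale values, unc forces both to 0.
print(fcmp_lt(0, 1.0, 2.0, 1, 1))            # (1, 1)
print(fcmp_lt(0, 1.0, 2.0, 1, 1, unc=True))  # (0, 0)
```

Clearing both predicates is what makes unc useful for nested conditions: code guarded by either result is then skipped whenever the outer condition was false.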

8.6.2 Floating-Point Class Instruction

Since the Itanium floating-point registers can contain a variety of represented quantities, the fclass instruction provides a way for a program to determine the nature of the current contents of any floating-point register:

 fclass.fcrel.fctype p1,p2=f2,fclass9  // is  f2 as expected? 

where two predicate registers (Pr0-Pr63) must always be specified, and fclass9 is a bit-encoded pattern of characteristics to be sought in the actual contents of register f2.

The fclass instruction sets p1 true and p2 false if f2 fcrel fclass9 is true; otherwise, it sets p1 false and p2 true.

Two choices are offered for fcrel (the conditional relationship completer): m (is a member) and nm (is not a member).

The classification pattern fclass9 may be specified to an assembler with the mnemonics for each bit position given in Table 8-3; those mnemonics can be ORed together using the | operator (Table 3-4).

Appropriate AND and OR relations are applied behind the scenes. A number is said to agree with the pattern fclass9 if any one of three conditions is true: it is NaTVal, and @nat is sought; it is not a number, and @qnan or @snan is sought as appropriate; or its sign agrees with @pos or @neg, if specified, and the type of the number agrees with the remaining specification (@zero OR @unorm OR @norm OR @inf).

A value of 0x1ff for fclass9 is therefore equivalent to a test of whether the current value in the floating-point register corresponds to any supported type. A test for zero can simply use @zero alone in order to accept both +0.0 and -0.0.
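The three conditions can be sketched in Python using the bit values listed in Table 8-3. Several assumptions are made here and should not be read as Itanium behavior: a sentinel object NATVAL stands in for NaTVal, every Python NaN is treated as quiet, double-precision subnormals stand in for un-normalized values, and the phrase "if specified" is read to mean that a pattern naming neither @pos nor @neg accepts either sign. The function names classify, agrees, and fclass are illustrative.

```python
import math
import sys

NAT, QNAN, SNAN = 0x100, 0x080, 0x040
POS, NEG = 0x001, 0x002
ZERO, UNORM, NORM, INF = 0x004, 0x008, 0x010, 0x020
SIGN_BITS = POS | NEG
TYPE_BITS = ZERO | UNORM | NORM | INF

NATVAL = object()  # hypothetical stand-in; a Python float cannot hold NaTVal


def classify(x):
    """Return the class bits that describe x."""
    if x is NATVAL:
        return NAT
    if math.isnan(x):
        return QNAN  # assumption: no attempt to detect signaling NaNs
    sign = NEG if math.copysign(1.0, x) < 0 else POS
    if x == 0:
        kind = ZERO
    elif math.isinf(x):
        kind = INF
    elif abs(x) < sys.float_info.min:
        kind = UNORM  # assumption: subnormal double models "un-normalized"
    else:
        kind = NORM
    return sign | kind


def agrees(x, fclass9):
    """True if x agrees with the pattern fclass9."""
    actual = classify(x)
    if actual & (NAT | QNAN | SNAN):
        return bool(actual & fclass9)
    sought_sign = fclass9 & SIGN_BITS
    sign_ok = not sought_sign or bool(actual & sought_sign)
    type_ok = bool(actual & fclass9 & TYPE_BITS)
    return sign_ok and type_ok


def fclass(rel, x, fclass9):
    """Return (p1, p2) for rel 'm' (member) or 'nm' (not a member)."""
    member = agrees(x, fclass9)
    outcome = member if rel == "m" else not member
    return (1, 0) if outcome else (0, 1)


print(agrees(-0.0, ZERO), agrees(-0.0, POS | ZERO))  # True False
print(agrees(NATVAL, 0x1FF), agrees(1.0, 0x1FF))     # True True
print(fclass("nm", float("nan"), QNAN | SNAN))       # (0, 1)
```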

Table 8-3. Mnemonic Specifiers for Floating-Point Classes

 Floating-Point Class    Assembler Mnemonic    Bit Value in fclass9
 NaTVal                  @nat                  0x100
 Quiet NaN               @qnan                 0x080
 Signaling NaN           @snan                 0x040
 Positive                @pos                  0x001
 Negative                @neg                  0x002
 Zero                    @zero                 0x004
 Un-normalized           @unorm                0x008
 Normalized              @norm                 0x010
 Infinity                @inf                  0x020

Only two choices are offered for fctype (the comparison type completer). The default, with no completer at all, gives a standard test as just described. The other choice, unc, gives an unconditional test: if the fclass instruction itself is subject to a predication of false, both predicate outcomes are set to 0; if it is subject to a predication of true, they are set to 1 and 0 (or 0 and 1) as already described.



Itanium® Architecture for Programmers: Understanding 64-Bit Processors and EPIC Principles
Year: 2003
Pages: 223
