Appendix C. Itanium Instruction Set

In the body of this book, we have discussed only a carefully chosen subset of the Itanium instruction set. We have elected to present those instructions that have analogues in other computer architectures, as well as selected instructions that are distinctive to this ISA. We have biased our selection towards the manipulation of full-width data, both integer and floating-point.

We acknowledge that this ISA also offers a rich set of SIMD (single instruction, multiple data) instructions for byte, word, double word, and single-precision floating-point data. Using such instructions effectively requires a degree of algorithm development and analysis that likely deserves a more specialized book.

In addition to native instructions, the initial Itanium processor implementations can execute the full IA-32 instruction set, equivalent to that of the Pentium III processor and including the MMX and Streaming SIMD Extensions (SSE) instructions. We do not detail the IA-32 instruction set in this book.

In this appendix, we tabulate essentially all of the native Itanium instructions in two alphabetic listings, by function and by assembler opcode.

For each opcode, we indicate its class (A, B, F, I, M, X), which must be considered by the assembler when bundling the instruction with others, using an appropriate template. Instruction classes are discussed in Section 4.1.3 of the main text.
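To make the role of the class concrete, the following Python sketch checks which bundle templates can hold a given sequence of instruction classes. It is an illustration under stated assumptions, not the algorithm of any particular assembler: the names `TEMPLATES`, `SLOT_ACCEPTS`, `fits`, and `candidate_templates` are hypothetical, stops within a bundle are ignored, and the only rules modeled are that a class A instruction may occupy either an I or an M slot and that a class X instruction consumes both the L and X slots of an MLX bundle.

```python
# Slot types of the Itanium bundle templates (stop positions omitted).
TEMPLATES = ["MII", "MLX", "MMI", "MFI", "MMF", "MIB", "MBB", "BBB", "MMB", "MFB"]

# Instruction classes that each slot type can hold.
# Class A instructions can issue on either an I-unit or an M-unit.
SLOT_ACCEPTS = {
    "M": {"M", "A"},
    "I": {"I", "A"},
    "F": {"F"},
    "B": {"B"},
}


def fits(classes, template):
    """Return True if the instruction classes fill the template exactly, in order.

    A class X instruction is listed once in `classes` but occupies
    two slots (L and X) of an MLX bundle.
    """
    slots = list(template)
    i = 0  # index of the next free slot
    for cls in classes:
        if cls == "X":
            if slots[i:i + 2] != ["L", "X"]:
                return False
            i += 2
        else:
            if i >= len(slots) or cls not in SLOT_ACCEPTS.get(slots[i], set()):
                return False
            i += 1
    return i == len(slots)


def candidate_templates(classes):
    """List every template whose slots can hold the given classes."""
    return [t for t in TEMPLATES if fits(classes, t)]
```

Under these assumptions, `candidate_templates(["M", "A", "I"])` returns `["MII", "MMI"]`, because the class A instruction in the middle can take either an I or an M slot, while `candidate_templates(["M", "X"])` returns only `["MLX"]`.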

We mark with a dagger in the Class column those instructions that can be executed only with elevated privileges. Such instructions are used within operating systems. We have seldom discussed these in the body of this book, since our focus has been on the understanding of basic principles, with illustrative programs in user mode.

For each instruction that we have chosen to discuss in any detail, we indicate the section in the body of this book where you may find further information about its syntax and uses.

The comprehensive descriptions in "Instruction Set Reference," volume 3 of the Intel Itanium Architecture Software Developer's Manual, provide information about assembler syntax, completers and special cases, and a pseudocode explanation of the execution details of each instruction.

Here we list basic information for Itanium instructions, indexed by function (Table C-1) and by opcode (Table C-2).

Table C-1. Instructions Listed by Function
Function	Assembler Opcode	Class^[*]	Section
Add	`add, addl, adds`	A	4.2.1
Add Pointer	`addp4`	A	4.6.3
Allocate Stack Frame	`alloc`^[]	M	7.3.3
And Complement	`andcm`	A	6.1.2
Bank Switch	`bsw`^[]	B^[]
Branch	`br`	B	5.3.1
Branch Long	`brl`	X	5.3.1
Branch Predict	`brp`^[]	B	13.3.6
Break	`break.b`	B
Break	`break.f`	F
Break	`break.i`	I
Break	`break.m`	M
Break	`break.x`	X
Break (pseudo-op)	`break`
Clear RRB	`clrrrb`^[]	B
Compare	`cmp`	A	5.2
Compare Double Words	`cmp4`	A	5.2
Compare and Exchange	`cmpxchg1, cmpxchg2, cmpxchg4, cmpxchg8, cmp8xchg16`	M	12.6.2
Compute Zero Index	`czx1, czx2`	I	12.2
Convert Floating-Point to Integer	`fcvt.fx`	F	8.7.1
Convert Parallel Floating-Point to Integer	`fpcvt.fx, fpcvt.fxu`	F	12.5
Convert Signed Integer to Floating-Point	`fcvt.xf`	F	8.7.1
Convert Unsigned Integer to Floating-Point	`fcvt.xuf`	F	8.7.1
Counted Branch	`br.cloop`^[]	B	5.6
Cover Stack Frame	`cover`^[]	B
Deposit	`dep, dep.z`	I	6.3.4
Enter Privileged Code	`epc`^[]	B
Exchange (byte, word, double word, quad word)	`xchg1, xchg2, xchg4, xchg8`	M	12.6.2
Exclusive Or	`xor`	A	6.1.2
Extract	`extr, extr.u`	I	6.3.4
Fetch and Add Immediate	`fetchadd4, fetchadd8`	M	12.6.2
Fixed-Point Multiply (pseudo-op)	`xmpy.l, xmpy.h`		8.7.2
Fixed-Point Multiply Add (low and high parts)	`xma.l, xma.h`	F	8.7.2
Floating-Point Absolute Value (pseudo-op)	`fabs`		8.3.4
Floating-Point Absolute Maximum	`famax`	F	8.4.4
Floating-Point Absolute Minimum	`famin`	F	8.4.4
Floating-Point Add (pseudo-op)	`fadd`		8.4.1
Floating-Point And Complement	`fandcm`	F	8.7.4
Floating-Point Check Flags	`fchkf`	F
Floating-Point Class	`fclass`	F	8.6.2
Floating-Point Clear Flags	`fclrf`	F
Floating-Point Compare	`fcmp`	F	8.6.1
Floating-Point Exclusive Or	`fxor`	F	8.7.4
Floating-Point Load	`ldfd, ldfe, ldfs`	M	8.3.2
Floating-Point Load (with fill)	`ldf.fill`	M	8.3.2
Floating-Point Load (integer)	`ldf8`	M	8.3.2
Floating-Point Load Pair	`ldfpd, ldfps`	M	8.3.3
Floating-Point Load Pair (integer)	`ldfp8`	M	8.3.3
Floating-Point Logical And	`fand`	F	8.7.4
Floating-Point Logical Or	`for`	F	8.7.4
Floating-Point Maximum	`fmax`	F	8.4.4
Floating-Point Merge	`fmerge`	F	8.3.5
Floating-Point Minimum	`fmin`	F	8.4.4
Floating-Point Mix	`fmix`	F	12.5
Floating-Point Multiply (pseudo-op)	`fmpy`		8.4.1
Floating-Point Multiply Add	`fma`	F	8.4.2
Floating-Point Multiply Subtract	`fms`	F	8.4.2
Floating-Point Negate (pseudo-op)	`fneg`		8.3.4
Floating-Point Negate Absolute Value (pseudo-op)	`fnegabs`		8.3.4
Floating-Point Negative Multiply (pseudo-op)	`fnmpy`		8.4.1
Floating-Point Negative Multiply Add	`fnma`	F	8.4.2
Floating-Point Normalize (pseudo-op)	`fnorm`		8.4.3
Floating-Point Pack	`fpack`	F	12.5
Floating-Point Parallel Absolute Maximum	`fpamax`	F	12.5
Floating-Point Parallel Absolute Minimum	`fpamin`	F	12.5
Floating-Point Parallel Absolute Value (pseudo-op)	`fpabs`		12.5
Floating-Point Parallel Compare	`fpcmp`	F	12.5
Floating-Point Parallel Maximum	`fpmax`	F	12.5
Floating-Point Parallel Merge	`fpmerge`	F	12.5
Floating-Point Parallel Minimum	`fpmin`	F	12.5
Floating-Point Parallel Multiply (pseudo-op)	`fpmpy`		12.5
Floating-Point Parallel Multiply Add	`fpma`	F	12.5
Floating-Point Parallel Multiply Subtract	`fpms`	F	12.5
Floating-Point Parallel Negate (pseudo-op)	`fpneg`		12.5
Floating-Point Parallel Negate Absolute Value (pseudo-op)	`fpnegabs`		12.5
Floating-Point Parallel Negative Multiply (pseudo-op)	`fpnmpy`		12.5
Floating-Point Parallel Negative Multiply Add	`fpnma`	F	12.5
Floating-Point Parallel Reciprocal Approximation	`fprcpa`	F	12.5
Floating-Point Parallel Reciprocal Square Root Approximation	`fprsqrta`	F	12.5
Floating-Point Reciprocal Approximation	`frcpa`	F	8.8.1
Floating-Point Reciprocal Square Root Approximation	`frsqrta`	F	8.8.2
Floating-Point Select	`fselect`	F	8.7.4
Floating-Point Set Controls	`fsetc`	F
Floating-Point Sign Extend	`fsxt`	F	12.5
Floating-Point Store	`stfd, stfe, stfs`	M	8.3.1
Floating-Point Store (with spill)	`stf.spill`	M	8.3.1
Floating-Point Subtract (pseudo-op)	`fsub`		8.4.1
Floating-Point Swap	`fswap`	F	12.5
Flush Cache	`fc, fc.i`	M
Flush Register Stack	`flushrs`^[]	M
Flush Write Buffers	`fwb`	M
Get Floating-Point Exponent	`getf.exp`	M	8.7.1
Get Floating-Point Significand	`getf.sig`	M	8.7.1
Get Floating-Point Value	`getf.d, getf.s`	M	8.7.1
Hint	`hint.b`	B
Hint	`hint.f`	F
Hint	`hint.i`	I
Hint	`hint.m`	M
Hint	`hint.x`	X
Insert Translation Cache	`itc`	M^[]
Insert Translation Register	`itr`	M^[]
Invalidate ALAT	`invala`	M	10.3.2
Line Prefetch	`lfetch`	M	10.6.5
Load (1, 2, 4, 8, 16 bytes)	`ld1, ld2, ld4, ld8, ld16`	M	4.5.3
Load (with fill)	`ld8.fill`	M	4.5.3
Load Register Stack	`loadrs`^[]	M
Logical And	`and`	A	6.1.2
Logical Or	`or`	A	6.1.2
Memory Fence	`mf`	M
Memory Synchronization	`sync.i`	M
Mix	`mix1, mix2, mix4`	I	12.2
Move (pseudo-op)	`mov`		3.4.3
Move Application Register	`mov.i`	I	4.5.6
Move Application Register	`mov.m`	M	4.5.6
Move Branch Register	`(use mov)`	I	5.3.5
Move Branch Register (return)	`mov.ret`	I	4.5.6
Move Control Register	`mov cr`	M^[]	4.5.6
Move Floating-Point Register (pseudo-op)	`mov fr`		8.3.4
Move General Register (pseudo-op)	`mov gr`		3.4.3
Move Immediate (pseudo-op)	`mov imm`
Move Indirect Register	`mov indirect`	M	4.5.6
Move Instruction Pointer	`mov ip`	I	4.5.6
Move Long Immediate	`movl`	X	4.5.4
Move Predicates	`mov pr`	I	10.5
Move Processor Status Register	`mov psr`	M^[]	4.5.6
Move User Mask	`mov um`	M	4.5.6
Mux (bytes, words)	`mux1, mux2`	I	12.2
No Operation (pseudo-op)	`nop`		3.5.7
No Operation	`nop.b`	B	10.3.1
No Operation	`nop.f`	F	10.3.1
No Operation	`nop.i`	I	10.3.1
No Operation	`nop.m`	M	10.3.1
No Operation	`nop.x`	X	10.3.1
Pack (into smaller units)	`pack2, pack4`	I	12.2
Parallel Add (bytes, words, double words)	`padd1, padd2, padd4`	A	12.2
Parallel Average (bytes, words)	`pavg1, pavg2`	A	12.2
Parallel Average Subtract (bytes, words)	`pavgsub1, pavgsub2`	A	12.2
Parallel Compare (bytes, words, double words)	`pcmp1, pcmp2, pcmp4`	A	12.2
Parallel Maximum (bytes, words)	`pmax1, pmax2`	I	12.2
Parallel Minimum (bytes, words)	`pmin1, pmin2`	I	12.2
Parallel Multiply (words)	`pmpy2`	I	4.2.5
Parallel Multiply and Shift Right (words)	`pmpyshr2`	I	12.2
Parallel Shift Left (words, double words)	`pshl2, pshl4`	I	12.2
Parallel Shift Left and Add (words)	`pshladd2`	A	12.2
Parallel Shift Right (words, double words)	`pshr2, pshr4`	I	12.2
Parallel Shift Right and Add (words)	`pshradd2`	A	12.2
Parallel Subtract (bytes, words, double words)	`psub1, psub2, psub4`	A	12.2
Parallel Sum of Absolute Differences (bytes)	`psad1`	I	12.2
Population Count	`popcnt`	I	12.2
Probe Access	`probe`	M
Purge Global Translation Cache	`ptc.g`	M^[]
Purge Local Translation Cache	`ptc.l`	M^[]
Purge Translation Cache Entry	`ptc.e`	M^[]
Purge Translation Register	`ptr`	M^[]
Reset System Mask	`rsm`	M^[]
Reset User Mask	`rum`	M
Return from Interruption	`rfi`^[]	B^[]
Serialize	`srlz`	M
Set Floating-Point Exponent	`setf.exp`	M	8.7.1
Set Floating-Point Significand	`setf.sig`	M	8.7.1
Set Floating-Point Value (double)	`setf.d`	M	8.7.1
Set Floating-Point Value (single)	`setf.s`	M	8.7.1
Set System Mask	`ssm`	M^[]
Set User Mask	`sum`	M
Shift Left	`shl`	I	6.3.1
Shift Left and Add	`shladd`	A	4.2.3
Shift Left and Add Pointer	`shladdp4`	A	4.6.3
Shift Right (arithmetic)	`shr`	I	6.3.1
Shift Right (logical)	`shr.u`	I	6.3.1
Shift Right Pair	`shrp`	I	6.3.3
Sign Extend (byte, word, double word)	`sxt1, sxt2, sxt4`	I	4.6.1
Speculation Check (data)	`chk.a`	M	10.3.2
Speculation Check (pseudo-op)	`chk.s`		10.3.3
Speculation Check (control)	`chk.s.i`	I	10.3.3
Speculation Check (control)	`chk.s.m`	M	10.3.3
Store (1, 2, 4, 8, 16 bytes)	`st1, st2, st4, st8, st16`	M	4.5.2
Store with spill	`st8.spill`	M	4.5.2
Store Floating-Point (single, double, extended)	`stfs, stfd, stfe`	M	8.3.1
Store Floating-Point (integer)	`stf8`	M	8.3.1
Store Floating-Point with spill	`stf.spill`	M	8.3.1
Subtract	`sub`	A	4.2.1
Test Bit	`tbit`	I	6.1.4
Test NaT	`tnat`	I	10.3.3
Translate to Physical Address	`tpa`	M^[]
Translation Access Key	`tak`	M^[]
Translation Hashed Entry Address	`thash`	M
Translation Hashed Entry Tag	`ttag`	M
Unpack (bytes, words, double words)	`unpack1, unpack2, unpack4`	I	12.2
Zero Extend (byte, word, double word)	`zxt1, zxt2, zxt4`	I	4.6.2

^[*] Specific Itanium processor implementations may require that certain instructions be targeted to a specific port, such as I0 rather than any available I-unit.

^[] (following an opcode) The instruction cannot be predicated (does not take a qualifying predicate).

^[] (following a class) The instruction can only be executed at a privileged level.

Table C-2. Instructions Listed by Assembler Opcode
Assembler Opcode	Function	Class^[*]	Section
`add`	Add	A	4.2.1
`addl`	Add (imm22)	A	4.2.1
`addp4`	Add Pointer	A	4.6.3
`adds`	Add (imm14)	A	4.2.1
`alloc`^[]	Allocate Stack Frame	M	7.3.3
`and`	Logical And	A	6.1.2
`andcm`	And Complement	A	6.1.2
`br`	Branch	B	5.3.1
`br.cloop`^[]	Counted Branch	B	5.6
`break`	Break (pseudo-op)
`break.b`	Break	B
`break.f`	Break	F
`break.i`	Break	I
`break.m`	Break	M
`break.x`	Break	X
`brl`	Branch Long	X	5.3.1
`brp`^[]	Branch Predict	B	13.3.6
`bsw`^[]	Bank Switch	B^[]
`chk.a`	Speculation Check (data)	M	10.3.2
`chk.s`	Speculation Check (pseudo-op)		10.3.3
`chk.s.i`	Speculation Check (control)	I	10.3.3
`chk.s.m`	Speculation Check (control)	M	10.3.3
`clrrrb`^[]	Clear RRB	B
`cmp`	Compare	A	5.2
`cmp4`	Compare Double Words	A	5.2
`cmpxchg1`	Compare and Exchange	M	12.6.2
`cmpxchg2`	Compare and Exchange	M	12.6.2
`cmpxchg4`	Compare and Exchange	M	12.6.2
`cmpxchg8`	Compare and Exchange	M	12.6.2
`cmp8xchg16`	Compare and Exchange	M	12.6.2
`cover`^[]	Cover Stack Frame	B
`czx1`	Compute Zero Index	I	12.2
`czx2`	Compute Zero Index	I	12.2
`dep`	Deposit	I	6.3.4
`dep.z`	Deposit (zero form)	I	6.3.4
`epc`^[]	Enter Privileged Code	B
`extr`	Extract	I	6.3.4
`extr.u`	Extract (unsigned)	I	6.3.4
`fabs`	Floating-Point Absolute Value (pseudo-op)		8.3.4
`fadd`	Floating-Point Add (pseudo-op)		8.4.1
`famax`	Floating-Point Absolute Maximum	F	8.4.4
`famin`	Floating-Point Absolute Minimum	F	8.4.4
`fand`	Floating-Point Logical And	F	8.7.4
`fandcm`	Floating-Point And Complement	F	8.7.4
`fc, fc.i`	Flush Cache	M
`fchkf`	Floating-Point Check Flags	F
`fclass`	Floating-Point Class	F	8.6.2
`fclrf`	Floating-Point Clear Flags	F
`fcmp`	Floating-Point Compare	F	8.6.1
`fcvt.fx`	Convert Floating-Point to Integer	F	8.7.1
`fcvt.xf`	Convert Signed Integer to Floating-Point	F	8.7.1
`fcvt.xuf`	Convert Unsigned Integer to Floating-Point	F	8.7.1
`fetchadd4`	Fetch and Add Immediate	M	12.6.2
`fetchadd8`	Fetch and Add Immediate	M	12.6.2
`flushrs`^[]	Flush Register Stack	M
`fma`	Floating-Point Multiply Add	F	8.4.2
`fmax`	Floating-Point Maximum	F	8.4.4
`fmerge`	Floating-Point Merge	F	8.3.5
`fmin`	Floating-Point Minimum	F	8.4.4
`fmix`	Floating-Point Mix	F	12.5
`fmpy`	Floating-Point Multiply (pseudo-op)		8.4.1
`fms`	Floating-Point Multiply Subtract	F	8.4.2
`fneg`	Floating-Point Negate (pseudo-op)		8.3.4
`fnegabs`	Floating-Point Negate Absolute Value (pseudo-op)		8.3.4
`fnma`	Floating-Point Negative Multiply Add	F	8.4.2
`fnmpy`	Floating-Point Negative Multiply (pseudo-op)		8.4.1
`fnorm`	Floating-Point Normalize (pseudo-op)		8.4.3
`for`	Floating-Point Logical Or	F	8.7.4
`fpabs`	Floating-Point Parallel Absolute Value (pseudo-op)		12.5
`fpack`	Floating-Point Pack	F	12.5
`fpamax`	Floating-Point Parallel Absolute Maximum	F	12.5
`fpamin`	Floating-Point Parallel Absolute Minimum	F	12.5
`fpcmp`	Floating-Point Parallel Compare	F	12.5
`fpcvt.fx`	Convert Parallel Floating-Point to Integer	F	12.5
`fpcvt.fxu`	Convert Parallel Floating-Point to Integer (unsigned)	F	12.5
`fpma`	Floating-Point Parallel Multiply Add	F	12.5
`fpmax`	Floating-Point Parallel Maximum	F	12.5
`fpmerge`	Floating-Point Parallel Merge	F	12.5
`fpmin`	Floating-Point Parallel Minimum	F	12.5
`fpmpy`	Floating-Point Parallel Multiply (pseudo-op)		12.5
`fpms`	Floating-Point Parallel Multiply Subtract	F	12.5
`fpneg`	Floating-Point Parallel Negate (pseudo-op)		12.5
`fpnegabs`	Floating-Point Parallel Negate Absolute Value (pseudo-op)		12.5
`fpnma`	Floating-Point Parallel Negative Multiply Add	F	12.5
`fpnmpy`	Floating-Point Parallel Negative Multiply (pseudo-op)		12.5
`fprcpa`	Floating-Point Parallel Reciprocal Approximation	F	12.5
`fprsqrta`	Floating-Point Parallel Reciprocal Square Root Approximation	F	12.5
`frcpa`	Floating-Point Reciprocal Approximation	F	8.8.1
`frsqrta`	Floating-Point Reciprocal Square Root Approximation	F	8.8.2
`fselect`	Floating-Point Select	F	8.7.4
`fsetc`	Floating-Point Set Controls	F
`fsub`	Floating-Point Subtract (pseudo-op)		8.4.1
`fswap`	Floating-Point Swap	F	12.5
`fsxt`	Floating-Point Sign Extend	F	12.5
`fwb`	Flush Write Buffers	M
`fxor`	Floating-Point Exclusive Or	F	8.7.4
`getf.d`	Get Floating-Point Value (double)	M	8.7.1
`getf.exp`	Get Floating-Point Exponent	M	8.7.1
`getf.s`	Get Floating-Point Value (single)	M	8.7.1
`getf.sig`	Get Floating-Point Significand	M	8.7.1
`hint.b`	Hint	B
`hint.f`	Hint	F
`hint.i`	Hint	I
`hint.m`	Hint	M
`hint.x`	Hint	X
`invala`	Invalidate ALAT	M	10.3.2
`itc`	Insert Translation Cache	M^[]
`itr`	Insert Translation Register	M^[]
`ld1`	Load (byte)	M	4.5.3
`ld2`	Load (word)	M	4.5.3
`ld4`	Load (double word)	M	4.5.3
`ld8`	Load (quad word)	M	4.5.3
`ld8.fill`	Load (with fill)	M	4.5.3
`ld16`	Load (16-byte form)	M	12.6.2
`ldf8`	Floating-Point Load (integer)	M	8.3.2
`ldfd`	Floating-Point Load (double)	M	8.3.2
`ldfe`	Floating-Point Load (extended)	M	8.3.2
`ldfs`	Floating-Point Load (single)	M	8.3.2
`ldf.fill`	Floating-Point Load (with fill)	M	8.3.2
`ldfp8`	Floating-Point Load Pair (integer)	M	8.3.3
`ldfpd`	Floating-Point Load Pair (double)	M	8.3.3
`ldfps`	Floating-Point Load Pair (single)	M	8.3.3
`lfetch`	Line Prefetch	M	10.6.5
`loadrs`^[]	Load Register Stack	M
`mf`	Memory Fence	M
`mix1`	Mix (bytes)	I	12.2
`mix2`	Mix (words)	I	12.2
`mix4`	Mix (double words)	I	12.2
`mov`	Move (pseudo-op)		3.4.3
`mov.i`	Move Application Register	I	4.5.6
`mov.m`	Move Application Register	M	4.5.6
`mov.ret`	Move Branch Register (return)	I	4.5.6
`mov cr`	Move Control Register	M^[]	4.5.6
`mov fr`	Move Floating-Point Register (pseudo-op)		8.3.4
`mov gr`	Move General Register (pseudo-op)		3.4.3
`mov imm`	Move Immediate (pseudo-op)
`mov indirect`	Move Indirect Register	M	4.5.6
`mov ip`	Move Instruction Pointer	I	4.5.6
`mov pr`	Move Predicates	I	10.5
`mov psr`	Move Processor Status Register	M^[]	4.5.6
`mov um`	Move User Mask	M	4.5.6
`movl`	Move Long Immediate	X	4.5.4
`mux1`	Mux (bytes)	I	12.2
`mux2`	Mux (words)	I	12.2
`nop`	No Operation (pseudo-op)		3.5.7
`nop.b`	No Operation	B	10.3.1
`nop.f`	No Operation	F	10.3.1
`nop.i`	No Operation	I	10.3.1
`nop.m`	No Operation	M	10.3.1
`nop.x`	No Operation	X	10.3.1
`or`	Logical Or	A	6.1.2
`pack2`	Pack (words to bytes)	I	12.2
`pack4`	Pack (double words to words)	I	12.2
`padd1`	Parallel Add (bytes)	A	12.2
`padd2`	Parallel Add (words)	A	12.2
`padd4`	Parallel Add (double words)	A	12.2
`pavg1`	Parallel Average (bytes)	A	12.2
`pavg2`	Parallel Average (words)	A	12.2
`pavgsub1`	Parallel Average Subtract (bytes)	A	12.2
`pavgsub2`	Parallel Average Subtract (words)	A	12.2
`pcmp1`	Parallel Compare (bytes)	A	12.2
`pcmp2`	Parallel Compare (words)	A	12.2
`pcmp4`	Parallel Compare (double words)	A	12.2
`pmax1`	Parallel Maximum (bytes)	I	12.2
`pmax2`	Parallel Maximum (words)	I	12.2
`pmin1`	Parallel Minimum (bytes)	I	12.2
`pmin2`	Parallel Minimum (words)	I	12.2
`pmpy2`	Parallel Multiply (words)	I	4.2.5
`pmpyshr2`	Parallel Multiply and Shift Right (words)	I	12.2
`popcnt`	Population Count	I	12.2
`probe`	Probe Access	M
`psad1`	Parallel Sum of Absolute Differences (bytes)	I	12.2
`pshl2`	Parallel Shift Left (words)	I	12.2
`pshl4`	Parallel Shift Left (double words)	I	12.2
`pshladd2`	Parallel Shift Left and Add (words)	A	12.2
`pshr2`	Parallel Shift Right (words)	I	12.2
`pshr4`	Parallel Shift Right (double words)	I	12.2
`pshradd2`	Parallel Shift Right and Add (words)	A	12.2
`psub1`	Parallel Subtract (bytes)	A	12.2
`psub2`	Parallel Subtract (words)	A	12.2
`psub4`	Parallel Subtract (double words)	A	12.2
`ptc.e`	Purge Translation Cache Entry	M^[]
`ptc.g`	Purge Global Translation Cache	M^[]
`ptc.l`	Purge Local Translation Cache	M^[]
`ptr`	Purge Translation Register	M^[]
`rfi`^[]	Return from Interruption	B^[]
`rsm`	Reset System Mask	M^[]
`rum`	Reset User Mask	M
`setf.d`	Set Floating-Point Value (double)	M	8.7.1
`setf.exp`	Set Floating-Point Exponent	M	8.7.1
`setf.s`	Set Floating-Point Value (single)	M	8.7.1
`setf.sig`	Set Floating-Point Significand	M	8.7.1
`shl`	Shift Left	I	6.3.1
`shladd`	Shift Left and Add	A	4.2.3
`shladdp4`	Shift Left and Add Pointer	A	4.6.3
`shr`	Shift Right (arithmetic)	I	6.3.1
`shr.u`	Shift Right (logical)	I	6.3.1
`shrp`	Shift Right Pair	I	6.3.3
`srlz`	Serialize	M
`ssm`	Set System Mask	M^[]
`st1`	Store (byte)	M	4.5.2
`st2`	Store (word)	M	4.5.2
`st4`	Store (double word)	M	4.5.2
`st8`	Store (quad word)	M	4.5.2
`st8.spill`	Store with spill	M	4.5.2
`st16`	Store (16-byte form)	M	12.6.2
`stf.spill`	Store Floating-Point with spill	M	8.3.1
`stf8`	Store Floating-Point (integer)	M	8.3.1
`stfd`	Store Floating-Point (double)	M	8.3.1
`stfe`	Store Floating-Point (extended)	M	8.3.1
`stfs`	Store Floating-Point (single)	M	8.3.1
`sub`	Subtract	A	4.2.1
`sum`	Set User Mask	M
`sxt1`	Sign Extend (byte)	I	4.6.1
`sxt2`	Sign Extend (word)	I	4.6.1
`sxt4`	Sign Extend (double word)	I	4.6.1
`sync.i`	Memory Synchronization	M
`tak`	Translation Access Key	M^[]
`tbit`	Test Bit	I	6.1.4
`thash`	Translation Hashed Entry Address	M
`tnat`	Test NaT	I	10.3.3
`tpa`	Translate to Physical Address	M^[]
`ttag`	Translation Hashed Entry Tag	M
`unpack1`	Unpack (bytes)	I	12.2
`unpack2`	Unpack (words)	I	12.2
`unpack4`	Unpack (double words)	I	12.2
`xchg1`	Exchange (byte)	M	12.6.2
`xchg2`	Exchange (word)	M	12.6.2
`xchg4`	Exchange (double word)	M	12.6.2
`xchg8`	Exchange (quad word)	M	12.6.2
`xma.h`	Fixed-Point Multiply Add (high part)	F	8.7.2
`xma.l`	Fixed-Point Multiply Add (low part)	F	8.7.2
`xmpy.h`	Fixed-Point Multiply (pseudo-op) (high part)		8.7.2
`xmpy.l`	Fixed-Point Multiply (pseudo-op) (low part)		8.7.2
`xor`	Exclusive Or	A	6.1.2
`zxt1`	Zero Extend (byte)	I	4.6.2
`zxt2`	Zero Extend (word)	I	4.6.2
`zxt4`	Zero Extend (double word)	I	4.6.2

^[*] Specific Itanium processor implementations may require that certain instructions be targeted to a specific port, such as I0 rather than any available I-unit.

^[] (following an opcode) The instruction cannot be predicated (does not take a qualifying predicate).

^[] (following a class) The instruction can only be executed at a privileged level.
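Tables C-1 and C-2 present essentially the same records under two sort keys. As a minimal sketch of that relationship, the Python fragment below builds both indexes from a handful of rows copied from Table C-1. The names `ROWS`, `by_function`, and `by_opcode` are hypothetical, and the sketch ignores the per-opcode qualifiers, such as "(imm22)", that Table C-2 adds to some function names.

```python
# A few rows copied from Table C-1: (function, opcodes, class, section).
# An empty section string means the instruction is not discussed in the body.
ROWS = [
    ("Add", "add, addl, adds", "A", "4.2.1"),
    ("Branch Long", "brl", "X", "5.3.1"),
    ("Deposit", "dep, dep.z", "I", "6.3.4"),
    ("Memory Fence", "mf", "M", ""),
]

# Index by function, as in Table C-1: one entry per row.
by_function = {
    function: (opcodes.split(", "), cls, section)
    for function, opcodes, cls, section in ROWS
}

# Index by opcode, as in Table C-2: one entry per individual opcode.
by_opcode = {}
for function, opcodes, cls, section in ROWS:
    for opcode in opcodes.split(", "):
        by_opcode[opcode] = (function, cls, section)

# Alphabetic listing of opcodes, matching the ordering idea of Table C-2.
opcode_listing = sorted(by_opcode)
```

With these rows, `by_opcode["dep.z"]` yields `("Deposit", "I", "6.3.4")`, and `by_function["Add"][0]` lists the three opcodes `add`, `addl`, and `adds`.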
