Alpha is a little-endian CPU with alignment requirements for quadword, longword, and word-sized memory references. If a memory reference is not aligned, the CPU raises a data-alignment exception. Extensive information regarding the Alpha CPU and Alpha assembler can be obtained from this book's Web site (www.wiley.com/compbooks/koziol) in the folders for this chapter. There you will find the Alpha Architecture Handbook and the Tru64 Unix Assembly Language Programmer's Guide.
This chapter will not require you to know complex details about the Alpha CPU; basic concepts such as memory alignment, the simple instruction set, and endianness are enough for developing exploits for the Tru64 OS. The following sections describe the parts of the Alpha CPU that we will need to know.
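To make the alignment and endianness rules concrete, here is a minimal Python sketch. It assumes the natural-alignment rule (an address must be a multiple of the access size) and models only the arithmetic, not the CPU itself. The helper names `is_aligned` and `quadword_bytes` and the sample addresses are ours, chosen purely for illustration.

```python
import struct

# Access sizes in bytes for the reference types named in the text.
SIZES = {"word": 2, "longword": 4, "quadword": 8}

def is_aligned(address: int, kind: str) -> bool:
    """Return True if address is a multiple of the access size (assumed rule)."""
    return address % SIZES[kind] == 0

def quadword_bytes(value: int) -> bytes:
    """Return the little-endian memory image of a 64-bit value."""
    return struct.pack("<Q", value)

if __name__ == "__main__":
    # Hypothetical addresses, used only to exercise the helpers.
    print(is_aligned(0x11FFFE008, "quadword"))   # True
    print(is_aligned(0x11FFFE00C, "quadword"))   # False
    print(is_aligned(0x11FFFE00C, "longword"))   # True
    # The least significant byte is stored first (lowest address).
    print(quadword_bytes(0x120001234).hex())     # 3412002001000000
```

The last line shows why byte order matters when laying out addresses in a payload: the low-order bytes of a pointer come first in memory.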
Alpha CPUs have two types of registers: integer registers and floating-point registers. We will concentrate only on the integer registers, since we can use them for pointer operations (pointer operations are the main vector for obtaining execution control). Floating-point registers are not used for shellcode development or system call invocation. They might be handy from time to time for esoteric shellcode tricks, but are mostly useless in exploit development (see http://archives.neohapsis.com/archives/vuln-dev/2003-q2/0334.html).
There are 32 integer registers, each one 64 bits wide. We will refer to the integer registers as general registers or general purpose registers, which are common terms for many RISC CPUs. The general registers are named from $0 to $31. As a shortcut, if we include the <alpha/regdef.h> header file in assembly language programs, we can use special names for certain general registers, such as $sp (stack pointer), $ra (return address), and $fp (frame pointer).
Table 12.1 lists the general purpose registers, their symbolic (software) names and most common or mandatory usage.
Register(s) | Symbolic Name | Usage |
---|---|---|
$0 | v0 | Holds the return value upon function return. (Result from the invoked function.) |
$1–$8 | t0–t7 | Temporary registers. |
$9–$14 | s0–s5 | Saved registers. Preserved across procedure calls. |
$15 | fp or s6 | Frame pointer (if used) or seventh saved register. |
$16–$21 | a0–a5 | These registers are used to pass the first six arguments to functions (much like the incoming registers in SPARC). |
$22–$25 | t8–t11 | Temporary registers. |
$26 | ra | Return address. Preserved across procedure calls. |
$27 | t12 | Contains procedure value (loader specific). |
$28 | AT | Reserved for assembler. |
$29 | gp | Global pointer. |
$30 | sp | Stack pointer. |
$31 | zero | Hardwired zero value. |
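When reading disassembly that prints raw register numbers, it can help to have Table 12.1 available as a lookup. The following sketch builds that mapping in Python; the function name `build_register_names` is our own, and the mapping is nothing more than the table above restated as data.

```python
def build_register_names() -> dict:
    """Map general register numbers ($0..$31) to the symbolic names in Table 12.1."""
    names = {
        0: "v0", 15: "fp", 26: "ra", 27: "t12",
        28: "at", 29: "gp", 30: "sp", 31: "zero",
    }
    for i in range(8):            # $1-$8   -> t0-t7
        names[1 + i] = f"t{i}"
    for i in range(6):            # $9-$14  -> s0-s5
        names[9 + i] = f"s{i}"
    for i in range(6):            # $16-$21 -> a0-a5
        names[16 + i] = f"a{i}"
    for i in range(4):            # $22-$25 -> t8-t11
        names[22 + i] = f"t{8 + i}"
    return names

if __name__ == "__main__":
    regs = build_register_names()
    for number in (0, 16, 26, 30):
        print(f"${number} = {regs[number]}")
```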
Table 12.2 presents the Alpha instructions we used to assemble our payload components. We used only a small subset of the total instruction set; these instructions are enough to set up stack frames and pointers and to invoke system calls. In reality, compiler-generated code for a similar payload would not be much different from our handcrafted payload.
Instruction | Operands | Description |
---|---|---|
addq, addl | sreg, val, dreg; sreg1, sreg2, dreg | Compute the sum of two quadword (or longword) values and place it in dreg (destination register). |
stq, stl, stw, stb | sreg, address | Store the contents of the sreg (source register) in the memory location specified by the effective address. (stX -> quadword, longword, word, byte.) |
mov | sreg, dreg; val, dreg | Move the contents of the sreg or the value into dreg. |
bis | sreg, val, dreg; sreg1, sreg2, dreg | Compute the logical OR (logical sum) of two values. |
bic | sreg, val, dreg; sreg1, sreg2, dreg | Compute the logical ANDNOT. Useful for operations such as addr & ~(PAGESZ-1). |
subq, subl | sreg, val, dreg; sreg1, sreg2, dreg | Compute the difference of two quadword (or longword) values and place it in dreg. |
beq, bne | sreg, label | beq: branch if the content of the sreg is equal to zero. bne: branch if the content of the sreg is not equal to zero. |
blt, bgt, ble, bge | sreg, label | blt: branch if the content of the sreg is less than zero. bgt: branch if it is greater than zero. ble and bge: branch if it is less than or equal to, or greater than or equal to, zero. |
bsr | label; dreg, label | Branch unconditionally to the label and store the return address in dreg. If dreg is not specified, store it in the ra register. |
lda | dreg, address | Load the dreg with the effective address of the referenced data item. |
ldq, ldl, ldw, ldb | dreg, address | Load the dreg with the contents of the quadword (longword, word, byte) specified by the effective address. |
xor | sreg, val, dreg; sreg1, sreg2, dreg | Compute the logical difference (exclusive OR) of two values. Good for "zeroing out" a register. |
sll | sreg, val, dreg; sreg1, sreg2, dreg | Shift the contents of the register left, placing zeros in the vacated bits. |
srl | sreg, val, dreg; sreg1, sreg2, dreg | Shift the contents of the register right, placing zeros in the vacated bits. |
PAL_callsys | | System call invocation (we will cover this instruction later). |
PAL_imb | | I-cache flush (we will cover this instruction later). |
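The arithmetic and logical instructions in Table 12.2 can be modeled in a few lines of Python, which is handy for checking constants before committing them to a payload. This is only a sketch of the 64-bit semantics: the function names mirror the mnemonics but are our own helpers, the shift helpers assume that only the low six bits of the shift count are used, and the 8 KB `PAGESZ` value is an assumption made for the example.

```python
MASK64 = (1 << 64) - 1   # general registers are 64 bits wide
PAGESZ = 8192            # assumed page size, for illustration only

def addq(a: int, b: int) -> int:
    return (a + b) & MASK64

def subq(a: int, b: int) -> int:
    return (a - b) & MASK64

def bis(a: int, b: int) -> int:
    """Logical OR."""
    return (a | b) & MASK64

def bic(a: int, b: int) -> int:
    """Logical ANDNOT: clear in a every bit that is set in b."""
    return a & ~b & MASK64

def xor(a: int, b: int) -> int:
    return (a ^ b) & MASK64

def sll(a: int, n: int) -> int:
    """Shift left, filling vacated bits with zero."""
    return (a << (n & 63)) & MASK64

def srl(a: int, n: int) -> int:
    """Shift right, filling vacated bits with zero."""
    return (a & MASK64) >> (n & 63)

if __name__ == "__main__":
    addr = 0x120003ABC                      # hypothetical address
    print(hex(bic(addr, PAGESZ - 1)))       # 0x120002000 (page-aligned base)
    print(xor(addr, addr))                  # 0 ("zeroing out" a register)
    print(hex(subq(0, 1)))                  # 0xffffffffffffffff (wraps at 64 bits)
```

The `bic` call is the `addr & ~(PAGESZ-1)` idiom from the table: it rounds an address down to the start of its page.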
Calling conventions are an important concept for stack-based overflows that overwrite return addresses. Because we have already covered the calling conventions for the x86 and SPARC architectures in previous chapters, we will not go into details of the Alpha stack frame layout for procedure calls; it is similar enough to x86 calling conventions (see Chapters 2 and 3). The only difference we will encounter is that Alpha has no saved frame pointer on the stack; it might be thought of as an x86 calling convention for programs compiled with -fomit-frame-pointer. The quadword just above the stack pointer (above as in higher memory) is the saved return address. In the prolog of a nonleaf function (a function with stack usage), the return address (the ra register) is saved on the stack with the following instruction:
stq ra, 0(sp)
In the function epilog, the return address is loaded from the stack, and the ret instruction takes the execution flow back to the caller.
ldq ra, 0(sp)
...
ret zero, (ra), 1
This simplistic calling convention (unlike that of the SPARC CPU) makes it trivial to exploit stack buffer overflows, because the return address is located right above the stack storage. On the other hand, it makes exploitation quite interesting in the case of off-by-one vulnerabilities. In these vulnerabilities, the least significant byte of the return address is overwritten (Alpha is little-endian, so this is the byte at the lowest address and the first one an overflow reaches), and the callee returns to a different location than the caller expected, which can lead to interesting exploitation scenarios. This makes off-by-ones in Tru64 completely application specific, and most likely it will not be all that feasible to achieve anything interesting in terms of exploitation.
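The effect of both kinds of overflow can be simulated with a byte array. The sketch below is a deliberately simplified model, not a real Tru64 stack frame: it assumes a 16-byte stack storage area followed directly by the saved 64-bit return address, and every name and address in it (`BUF_SIZE`, `make_stack`, `0x120001234`, and so on) is hypothetical.

```python
import struct

BUF_SIZE = 16  # hypothetical size of the local stack storage

def make_stack(return_address: int) -> bytearray:
    """Stack storage followed directly by the saved return address (little-endian)."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<Q", return_address))

def copy_into_buffer(stack: bytearray, data: bytes) -> None:
    """Unchecked copy starting at the lowest address of the buffer."""
    stack[0:len(data)] = data

def saved_return_address(stack: bytearray) -> int:
    """Read back the quadword stored right above the buffer."""
    return struct.unpack_from("<Q", stack, BUF_SIZE)[0]

if __name__ == "__main__":
    # Off-by-one: a single extra byte lands on the least significant byte.
    stack = make_stack(0x120001234)
    copy_into_buffer(stack, b"A" * BUF_SIZE + b"\x00")
    print(hex(saved_return_address(stack)))   # 0x120001200

    # Full overflow: the whole saved return address is replaced.
    stack = make_stack(0x120001234)
    copy_into_buffer(stack, b"A" * BUF_SIZE + struct.pack("<Q", 0x11FFFE000))
    print(hex(saved_return_address(stack)))   # 0x11ffe000
```

In the off-by-one case the attacker controls only the low byte, so the new return target stays within 256 bytes of the original one. That is why the outcome depends entirely on what code happens to live near the legitimate return site in the specific application.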