The Alpha Architecture

Alpha is a little-endian CPU with alignment requirements for quadword, longword, and word-sized memory references. If a memory reference is not aligned, the CPU raises a data-alignment exception. Extensive information about the Alpha CPU and the Alpha assembler can be obtained from this book's Web site (www.wiley.com/compbooks/koziol) in the folder for this chapter. There you will find the Alpha Architecture Handbook and the Tru64 Unix Assembly Language Programmer's Guide.
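As a rough illustration of the two properties just described, the following Python sketch checks natural alignment and shows the little-endian byte order of a quadword. The helper names and the address 0x140000F08 are assumptions made up for this example; they do not come from the Alpha documentation.

```python
import struct

# Natural alignment sizes, in bytes, for the data types named above.
SIZES = {"word": 2, "longword": 4, "quadword": 8}

def is_aligned(address: int, kind: str) -> bool:
    """Return True if address is naturally aligned for the given data type."""
    return address % SIZES[kind] == 0

def pack_quadword(value: int) -> bytes:
    """Encode a 64-bit value the way a little-endian CPU lays it out in memory."""
    return struct.pack("<Q", value)

# Hypothetical address, chosen only for illustration.
addr = 0x140000F08
print(is_aligned(addr, "quadword"))      # True: multiple of 8
print(is_aligned(addr + 4, "quadword"))  # False: a quadword access here would fault
print(is_aligned(addr + 4, "longword"))  # True: still fine for a longword
print(pack_quadword(addr).hex())         # 080f004001000000 (low byte first)
```

The last line is the detail that matters most later on: the least significant byte of a stored pointer sits at the lowest address.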

This chapter will not require you to know complex details about the Alpha CPU; basic concepts such as memory alignment, the basic instruction set, and endianness are enough for developing exploits for the Tru64 OS. The following sections describe all the bits and pieces of the Alpha CPU that we will need to know.

Alpha Registers

Alpha CPUs have two types of registers: integer registers and floating-point registers. We will concentrate only on the integer registers since we can use them for pointer operations (pointer operations are the main vector of obtaining execution control). Floating-point registers are not used for shellcode development or system call invocation. They might be handy from time to time for esoteric shellcode tricks, but are mostly useless in exploit development (see http://archives.neohapsis.com/archives/vuln-dev/2003-q2/0334.html ).

There are 32 integer registers, each one 64 bits wide. We will refer to the integer registers as general registers or general purpose registers, which are common terms for many RISC CPUs. The general registers are named $0 through $31. As a shortcut, if we include the <alpha/regdef.h> header file in assembly language programs, we can use special names for certain general registers, such as $sp (stack pointer), $ra (return address), and $fp (frame pointer).

Table 12.1 lists the general purpose registers, their symbolic (software) names and most common or mandatory usage.

Table 12.1: Tru64 General Purpose Registers
Register(s)	Symbolic Name	Usage
$0	v0	Holds the return value upon function return. (Result from the invoked function.)
$1-$8	t0-t7	Temporary registers.
$9-$14	s0-s5	Saved registers. Preserved across procedure calls.
$15	fp or s6	Frame pointer (if used) or seventh saved register.
$16-$21	a0-a5	These registers are used to pass the first six arguments to functions (similar to the incoming registers on SPARC).
$22-$25	t8-t11	Temporary registers.
$26	ra	Return address. Preserved across procedure calls.
$27	t12	Contains procedure value (loader specific).
$28	AT	Reserved for assembler.
$29	gp	Global pointer.
$30	sp	Stack pointer.
$31	zero	Hardwired zero value.
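When reading disassembly that mixes numeric and symbolic register names, it can help to have the mapping in Table 12.1 in executable form. The following Python sketch is only a convenience lookup built from the table; `REGS` and `regnum` are hypothetical names invented for this example, not part of any Tru64 tool.

```python
# Symbolic names from Table 12.1 mapped to hardware register numbers.
REGS = {"v0": 0, "fp": 15, "s6": 15, "ra": 26, "t12": 27,
        "AT": 28, "gp": 29, "sp": 30, "zero": 31}
REGS.update({f"t{i}": 1 + i for i in range(8)})       # t0-t7  -> $1-$8
REGS.update({f"s{i}": 9 + i for i in range(6)})       # s0-s5  -> $9-$14
REGS.update({f"a{i}": 16 + i for i in range(6)})      # a0-a5  -> $16-$21
REGS.update({f"t{i}": 14 + i for i in range(8, 12)})  # t8-t11 -> $22-$25

def regnum(name: str) -> int:
    """Resolve '$sp', 'sp', or '$30' to a register number (hypothetical helper)."""
    name = name.lstrip("$")
    return int(name) if name.isdigit() else REGS[name]

print(regnum("$sp"), regnum("ra"), regnum("$16"), regnum("a0"))  # 30 26 16 16
```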

Instruction Set

Table 12.2 presents the Alpha instructions we used to assemble our payload components. We used only a small subset of the total instruction set; these instructions are enough to set up stack frames and pointers and to invoke system calls. In reality, compiler-generated code for a similar payload would not be much different from our handcrafted payload.

Table 12.2: Alpha Instructions for Assembling Payload Components
Instruction	Operands	Description
addq, addl	sreg, val, dreg; sreg1, sreg2, dreg; ...	Compute the sum of two quadword (or longword) values and place it in dreg (destination register).
stq, stl, stw, stb	sreg, address	Store the contents of the sreg (source register) in the memory location specified by the effective address. (stX -> quadword, longword, word, byte ...)
mov	sreg, dreg; val, dreg	Move the content of the sreg or the value into dreg.
bis	sreg, val, dreg; sreg1, sreg2, dreg; ...	Compute the logical OR (logical sum) of two values.
bic	sreg, val, dreg; sreg1, sreg2, dreg; ...	Compute the logical ANDNOT. Good for things like addr & ~(PAGESZ-1).
subq, subl	sreg, val, dreg; ...	Compute the difference of two quadword (or longword) values and place it in dreg.
beq	sreg, label	Branch if the content of the sreg is equal to zero.
bne	sreg, label	Branch if the content of the sreg is not equal to zero.
blt	sreg, label	Branch if the content of the sreg is less than zero.
bgt	sreg, label	Branch if the content of the sreg is greater than zero.
ble, bge	sreg, label	Branch if the content of the sreg is less than or equal to zero (ble) or greater than or equal to zero (bge).
bsr	label; dreg, label	Branch unconditionally to the label and store the return address in dreg. If dreg is not specified, store it in the ra register.
lda	dreg, address	Load the dreg with the effective address of the referenced data item.
ldq, ldl, ldw, ldb	dreg, address	Load the dreg with the contents of the quadword (longword, word, byte) specified by the effective address.
xor	sreg, val, dreg; sreg1, sreg2, dreg; ...	Compute the logical difference (exclusive OR) of two values. Good for "zeroing out" a register.
sll	sreg, val, dreg; sreg1, sreg2, dreg	Shift the contents of the register left, placing zeros in the vacated bits.
srl	sreg, val, dreg; sreg1, sreg2, dreg	Shift the contents of the register right, placing zeros in the vacated bits.
PAL_callsys		System call invocation (we will cover this instruction later).
PAL_imb		I-cache flush (we will cover this instruction later).
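To make the descriptions in Table 12.2 concrete, here is a small Python model of a few of the logical and arithmetic instructions operating on 64-bit values. This is only a sketch of the arithmetic, not an emulator: the function names mirror the mnemonics for readability, and the 8KB `PAGESZ` and the sample address are assumptions chosen for the example.

```python
MASK64 = (1 << 64) - 1  # general registers are 64 bits wide

def addq(a, b): return (a + b) & MASK64          # quadword add wraps at 64 bits
def bis(a, b):  return (a | b) & MASK64          # logical OR
def bic(a, b):  return a & ~b & MASK64           # a AND (NOT b)
def xor(a, b):  return (a ^ b) & MASK64          # exclusive OR
def sll(a, n):  return (a << (n & 63)) & MASK64  # zeros enter from the right
def srl(a, n):  return (a & MASK64) >> (n & 63)  # zeros enter from the left

PAGESZ = 8192             # assumed page size for this example
addr = 0x140003F08        # hypothetical address

print(hex(bic(addr, PAGESZ - 1)))  # 0x140002000: addr & ~(PAGESZ-1)
print(xor(addr, addr))             # 0: xor of a register with itself zeroes it
print(hex(sll(1, 63)))             # 0x8000000000000000
print(srl(sll(1, 63), 63))         # 1
print(addq(MASK64, 1))             # 0: wraps around
```

The `bic` line is the page-rounding idiom mentioned in the table, and the `xor` line is the usual way to clear a register without a literal zero.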

Calling Conventions

Calling conventions are an important concept for stack-based overflows that overwrite return addresses. Because we have already covered the calling conventions for the x86 and SPARC architectures in previous chapters, we will not go into the details of the Alpha stack frame layout for procedure calls; it is similar enough to the x86 calling convention (see Chapters 2 and 3). The only difference we will encounter is that Alpha has no saved frame pointer on the stack; it can be thought of as the x86 calling convention for programs compiled with omit-frame-pointer. The quadword just above the stack pointer (above as in higher memory) is the saved return address. In the prolog of a nonleaf function (a function with stack usage), the return address (the ra register) is saved on the stack with the following instruction:

 stq     ra, 0(sp)

In the function epilog, the return address is loaded from the stack and the ret instruction takes the execution flow back to the caller.

 ldq     ra, 0(sp)
 ...
 ret     zero, (ra), 1

This simplistic calling convention (unlike that of the SPARC CPU) makes it trivial to exploit stack buffer overflows, because the return address is located right above the stack storage. On the other hand, it changes the picture for off-by-one vulnerabilities. In these vulnerabilities, the least significant byte of the return address is overwritten, and the callee returns to a different location than the caller expected. Whether anything useful lies at that location depends entirely on the code surrounding the original return site. This makes off-by-ones in Tru64 completely application specific, and most likely it will not be all that feasible to achieve anything interesting in terms of exploitation.
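The effect of both kinds of overflow can be sketched in a few lines of Python. The model below is deliberately simplified and rests on several assumptions: a 16-byte buffer sits directly below a saved return address, the copy routine does no bounds checking, and all addresses are invented for illustration. The names `make_frame`, `copy_into_buffer`, and `saved_ra` are hypothetical helpers, not Tru64 functions.

```python
import struct

BUF_SIZE = 16  # hypothetical size of the stack buffer

def make_frame(ra: int) -> bytearray:
    """Buffer followed in higher memory by a little-endian saved return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<Q", ra))

def copy_into_buffer(frame: bytearray, data: bytes) -> None:
    """Unchecked copy that starts at the beginning of the buffer."""
    frame[0:len(data)] = data

def saved_ra(frame: bytearray) -> int:
    """Read back the quadword stored right above the buffer."""
    return struct.unpack_from("<Q", frame, BUF_SIZE)[0]

# Off-by-one: one extra zero byte lands on the low byte of the return address.
frame = make_frame(0x120001A3C)
copy_into_buffer(frame, b"A" * BUF_SIZE + b"\x00")
print(hex(saved_ra(frame)))  # 0x120001a00

# Full overflow: the whole return address is replaced.
frame = make_frame(0x120001A3C)
copy_into_buffer(frame, b"A" * BUF_SIZE + struct.pack("<Q", 0x11FFFE000))
print(hex(saved_ra(frame)))  # 0x11fffe000
```

In the off-by-one case the attacker controls only the low byte, so the new return target stays within the same 256-byte window of code as the original one, which is why the outcome depends so heavily on the application.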