We can rewrite the DECNUM program (Figure 6-5) using a stack that stores each digit from right to left as individual bytes (most significant digit on the bottom of the stack). Each element is computed as a remainder, converted into an ASCII character, and then transferred byte-by-byte into the order (left to right) required for printing. The new DECNUM2 program is shown in Figure 7-3. Figure 7-3 DECNUM2: An illustration of stack usage// DECNUM2 Convert integer to ASCII // This program converts the positive number at X2 // into a string of ASCII-encoded decimal digits at A2 W = 64 // W = number width LEN = 20 // Allowance for 20 digits DOT8 = 0xcccccccccccccccd // 0.8 (approx.) .data // Declare storage .align 8 // Desired alignment X2: data8 0x12345678 // Number to convert A2: .skip 80 // ASCII output area STACK: .skip LEN // User-defined stack .text // Section for code .align 32 // Desired alignment .global main // These three lines .proc main // mark the mandatory main: // 'main' program entry .prologue// Leaf procedure can save .save ar.lc, r31 // the caller's ar.lc mov r31 = ar.lc // in a scratch register .body // Now we really begin... first: add r11 = @gprel(STACK),gp // r11 -> stack movl r21 = DOT8 // 0.8 reciprocal factor new: add r15 = @gprel(X2),gp;;// r15 -> input ld8 r9 = [r15] // Get number to convert st1 [r11] = r0,1;; // Push 0 as a flag again: mov r20 = r9 // Save previous quotient mov ar.lc = W-1 // Traversals minus one booth: mov r19 = 0 // Set bit n-1 to zero mov r8 = r21 // Get reciprocal factor (R) mov r9 = 0;; // Set L to zero cycle: and r22 = 0x1,r8;; // Isolate lowest bit of R xor r23 = r19,r22;; // r23 = whether to act cmp.ne p6,p0 = 0,r23 // p6 = whether to act mov r19 = r22;; // Bit n-1 for next time (p6) cmp.eq.unc p7,p8 = 0,r22;; // Add, subtract, nop? (p7) add r9 = r9,r20 // Add X to L (p8) sub r9 = r9,r20;; // Subtract X from L shrp r8 = r9,r8,1 // New R of shifted LR shr r9 = r9,1 // New L of shifted LR br.cloop.sptk.few cycle;; // More cycles? nosign: add r9 = r9,r20;; // Add X to L shr.u r9 = r9,3;; // r9 = quotient = r17/10 shladd r3 = r9,2,r9;; // r3 = 5*quotient add r3 = r3,r3;; // r3 = 10*quotient now sub r3 = r20,r3;; // r3 = remainder now or r3 = 0x30,r3;; // Make into ASCII char st1 [r11] = r3,1 // Store the character cmp.ne p6,p0 = r9,r0 // If quotient not zero, (p6) br.cond.sptk.few again;; // repeat the cycle add r11 = -2,r11 // Skip latest char add r16 = @gprel(A2),gp;; // r16 -> output move: st1 [r16] = r3,1 // Move one character ld1 r3 = [r11],-1;; // Get another character cmp.ne p6,p0 = r3,r0 // If it is not 0 flag (p6) br.cond.sptk.few move // then move it to A2 area st1 [r16] = r0 // Make A2 into stringz add r11 = 1,r11 // Fix stack pointer done: mov r8 = 0 // Signal all is normal mov ar.lc = r31 // Restore caller's ar.lc br.ret.sptk.many b0;; // Back to command line .endp main // Mark end of procedure We have incorporated a stack that builds toward increasing memory addresses using register r11 as its pointer. The loop is designed for reuse: At its end, it resets this pointer so that a repeat call could return to the label new. Since the magnitude of the number is not known ahead of time, we could determine the number of digits pushed onto the stack with a counter. Alternatively, we might preserve the value of the stack pointer before the loop is entered. Then the stack-relinquishing loop would be controlled by an address comparison, requiring one register to hold the original pointer value, an unsigned compare instruction, and a predicated branch instruction. A more efficient choice would involve a special flag that is never part of the actual data. String data uses the NUL character (0x00) to terminate a string. Such a technique with a flag often requires fewer registers and instructions for loop control. The algorithm of DECNUM2 is unchanged from DECNUM, except that each generated ASCII character is now placed onto a memory stack. A small byte-move loop has been added after the loop where modulo computations take place. The expected answer is 30541989610 for 0x12345678. We can use the debugger to verify the correct functioning of this program. Inspection with gdb would occur as follows: (gdb) break done Breakpoint 1 at 0x4000000000000640 (gdb) run Starting program: /house/user/decnum2 Breakpoint 1, 0x4000000000000640 in done () (gdb) x/s &A2 0x6000000000000808 <A2>: "305419896" (gdb) We could extend this program to use the C input/output routines from Section 6.7 to display the formatted number. When the program executes, it builds the sequence of ASCII characters going forward from label STACK in the order NUL-6-9-8-9-1-4-5-0-3, where each character is formed from the computed remainder of a modulo 10 operation. In the output loop, these characters are peeled off in a last-in first-out fashion to form the string 305419896, with the ASCII code NUL (0x00) serving as the stop indicator. This application hints that stacks are extraordinarily versatile data structures with numerous applications, such as parsing and evaluating arithmetic expressions that contain parentheses. For example, the values for an arithmetic expression are temporarily held in a stack in an RPN (reverse Polish notation) calculator until the operators are entered. In assembly language programming, stacks can hold copies of the data in registers so that those registers can be reused. When a procedure is called, a stack can be used to preserve the caller's context, to furnish the called routine's requirements for temporary storage, and to pass arguments when they may outnumber the available registers. |