1.7 SQUARES: A First Programming Example

Beginnings are hard. Just think back to the very first program that you wrote in any computer language. It was probably oversimplified, and the hardest task may have been getting data in or out. In this book, each illustrative example in assembly language is kept as simple as possible in order to help you focus your reading and study on the matter at hand, though you must also attend to the details.

In this section, we present a simple but complete program that has served well as a teaching example. We are going to express the algorithm in three high-level languages and in assembly language for the Itanium architecture.

Statement of the problem:

Write a program that will produce in memory a table of the squares of the first three integers without using multiplication instructions.

Presentation of the algorithm:

We begin by writing down the first several integers, N, their squares, N², and finally the first and second tabular differences in Figure 1-2.

Figure 1-2. Computation of squares by tabular differences

 N     N²     1st tabular difference     2nd tabular difference

 1      1
                        3
 2      4                                          2
                        5
 3      9                                          2
                        7
 4     16                                          2
                        9
 5     25

Successive values of the first tabular difference are computed by adding the constant second tabular difference each time. Then successive values of the squares can be computed by adding the appropriate value of the first difference to the already known previous square.
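A line of algebra shows why the second difference stays constant. The symbols Δ₁ and Δ₂ below are notation introduced here for the first and second tabular differences; they do not appear in Figure 1-2.

```latex
\begin{align*}
\Delta_1(N) &= (N+1)^2 - N^2 = 2N + 1, \\
\Delta_2(N) &= \Delta_1(N+1) - \Delta_1(N) = \bigl(2(N+1) + 1\bigr) - (2N + 1) = 2.
\end{align*}
```

For example, the first difference following N = 3 is 2·3 + 1 = 7, so the next square is 9 + 7 = 16, as in the figure.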

1.7.1 C, FORTRAN, and COBOL

The enumeration for computing successive squares is readily expressed in a standard 3GL programming language such as C:

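    #include <stdio.h>

    int main(void)
    {
        long long sq1, sq2, sq3;
        long long temp, diff1, diff2;

        diff1 = 1;              /* first tabular difference */
        diff2 = 2;              /* constant second tabular difference */
        temp = 1;               /* first square */
        sq1 = temp;
        diff1 = diff2 + diff1;  /* adjust first difference */
        temp = diff1 + temp;    /* second square */
        sq2 = temp;
        diff1 = diff2 + diff1;  /* adjust first difference */
        temp = diff1 + temp;    /* third square */
        sq3 = temp;
        printf("%lld\t%lld\t%lld\n", sq1, sq2, sq3);
        return 0;
    }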

or FORTRAN:

      PROGRAM SQUARES
      INTEGER*8 SQ1, SQ2, SQ3
      INTEGER*8 TEMP, DIFF1, DIFF2
      DIFF1 = 1
      DIFF2 = 2
      TEMP = 1
      SQ1 = TEMP
      DIFF1 = DIFF2 + DIFF1
      TEMP = DIFF1 + TEMP
      SQ2 = TEMP
      DIFF1 = DIFF2 + DIFF1
      TEMP = DIFF1 + TEMP
      SQ3 = TEMP
      PRINT *, SQ1, SQ2, SQ3
      END

or COBOL:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. SQUARES.
      *
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 OUTPUT-FIELD.
      * Note that COMP for 18 digits equates to a quad word.
      * CR1,2,3 are 0x0d = 13 = CR, for on-screen display
           05  SQ1     PIC  9(18)        VALUE 0.
           05  CR1     PIC  X(1)         VALUE X"0D".
           05  SQ2     PIC  9(18)        VALUE 0.
           05  CR2     PIC  X(1)         VALUE X"0D".
           05  SQ3     PIC  9(18)        VALUE 0.
           05  CR3     PIC  X(1)         VALUE X"0D".
       01 CALCULATION-FIELD.
           05  DIFF1   PIC  9(18)        VALUE 1.
           05  DIFF2   PIC  9(18)        VALUE 2.
           05  TEMP    PIC  9(18)        VALUE 1.
      *
       PROCEDURE DIVISION.
       CALCULATE-SQUARES SECTION.
           MOVE TEMP TO SQ1.
           ADD DIFF2 TO DIFF1 GIVING DIFF1.
           ADD DIFF1 TO TEMP GIVING TEMP.
           MOVE TEMP TO SQ2.
           ADD DIFF2 TO DIFF1 GIVING DIFF1.
           ADD DIFF1 TO TEMP GIVING TEMP.
           MOVE TEMP TO SQ3.
       DISPLAY-RESULTS SECTION.
           DISPLAY OUTPUT-FIELD.
      *
           EXIT PROGRAM.
       END PROGRAM SQUARES.

It should be evident that the pattern of first adjusting diff1, then using diff1 to adjust temp, and finally storing temp as the next square could be iterated any desired number of times to compute sq4, etc.
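The iteration can be sketched as a loop. This is a minimal illustration, not part of the original listings: the function name squares_by_differences and its parameters are hypothetical, while diff1, diff2, and temp keep the roles they have in the C program above.

```c
#include <assert.h>
#include <stddef.h>

/* Fill out[0..count-1] with 1*1, 2*2, 3*3, ... using additions only.
   The function name and signature are hypothetical (illustration only). */
void squares_by_differences(long long *out, size_t count)
{
    long long diff1 = 1;        /* first tabular difference */
    const long long diff2 = 2;  /* constant second tabular difference */
    long long temp = 1;         /* first square */

    assert(out != NULL || count == 0);
    if (count == 0) {
        return;
    }

    out[0] = temp;              /* sq1 */
    for (size_t i = 1; i < count; i++) {
        diff1 = diff2 + diff1;  /* adjust the first difference */
        temp = diff1 + temp;    /* use it to reach the next square */
        out[i] = temp;          /* store temp as the next square */
    }
}
```

Calling squares_by_differences(table, 3) on a three-element array reproduces sq1, sq2, and sq3; a larger count continues the table with sq4 and beyond.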

1.7.2 Assembly Language for Itanium Architecture

Now let us transform the expression of this algorithm from a 3GL implementation into a 2GL equivalent in Itanium assembly language. This listing will appear quite new to you even if you are familiar with the IA-32 or PA-RISC architectures. Stark differences are very common when attempting to move from one assembly language to another.

The algorithm for the SQUARES program (Figure 1-2), as written in Itanium assembly language, is shown in Figure 1-3. We will not fully explain the language elements used here. For the present, it is enough to appreciate that the three columns at the left make up the actual program instructions. The phrases in mixed case to the right of two slash characters (//) are explanatory comments that annotate the programmer's intended relationship between those instructions and the algorithm.

We shall show how to run SQUARES in Chapter 3, after we have introduced the symbolic debugger. For now, you may focus your attention on the substance of the algorithm from first to done. In later chapters we shall also explain the purpose of the lines preceding main and following done.

The Itanium add instruction reads left to right like an algebraic expression in a high-level language, but with a comma instead of a plus sign. An Itanium processor has 128 integer registers, Gr0 … Gr127, that are addressed as r0 through r127 in assembly language.

Figure 1-3. SQUARES program for Itanium architecture

// SQUARES       Table of Squares
         .data                    // Declare storage
         .align  8                // Desired alignment
sq1:     .skip   8                // To store 1 squared
sq2:     .skip   8                // To store 2 squared
sq3:     .skip   8                // To store 3 squared
                                  // etc.
         .text                    // Section for code
         .align  32               // Desired alignment
         .global main             // These three lines
         .proc   main             //  mark the mandatory
main:                             //   'main' program entry
         .body                    // Now we really begin...
first:   mov     r21 = 1;;        // Gr21 = first difference
         mov     r22 = 2;;        // Gr22 = 2nd difference
         mov     r20 = 1;;        // Gr20 = first square
         addl    r14 = @gprel(sq1),gp;;  // Point to storage
         st8     [r14] = r20;;    //         for sq1
         add     r21 = r22,r21;;  // Adjust first difference
         add     r20 = r21,r20;;  // Gr20 = second square
         addl    r14 = @gprel(sq2),gp;;  // Point to storage
         st8     [r14] = r20;;    //         for sq2
         add     r21 = r22,r21;;  // Adjust first difference
         add     r20 = r21,r20;;  // Gr20 = third square
         addl    r14 = @gprel(sq3),gp;;  // Point to storage
         st8     [r14] = r20;;    //         for sq3
                                  // etc.
done:    mov     r8 = 0;;         // Signal all is normal
         br.ret.sptk.many b0;;    // Back to command line
         .endp   main             // Mark end of procedure

Unlike many older assembly languages, Itanium assembly language does not support direct symbolic addressing of a data location, such as sq1 where we want to store the first computed square. It instead requires two steps. First, we calculate the address of sq1 in register r14 by adding an offset @gprel(sq1) computed by the assembler onto the address contained in register gp, the global pointer. This pointer gets a value when the system loads the program. We then store the computed 8-byte value in register r20 into memory at the address given by register r14, using the assembler syntax [r14] with a store (st8) instruction. Similarly, we copy the value of each successive square computed in register r20 into the appropriate memory location.
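For readers more at home in C, the two-step pattern resembles computing a pointer from a base address plus an offset and then storing through it. The sketch below is only an analogy under stated assumptions: struct data_section and store8_relative are hypothetical names, and the base used here is simply the start of the structure, whereas the value actually held in gp is set by the system when it loads the program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the three 8-byte slots in the .data section. */
struct data_section {
    long long sq1;
    long long sq2;
    long long sq3;
};

/* Store an 8-byte value at base + offset, loosely mirroring
       addl r14 = @gprel(sym),gp   followed by   st8 [r14] = r20
   The function name is hypothetical; base plays the role of gp. */
void store8_relative(void *base, size_t offset, long long value)
{
    assert(base != NULL);
    unsigned char *addr = (unsigned char *)base + offset; /* like r14 = offset + gp */
    memcpy(addr, &value, sizeof value);                   /* like st8 [r14] = r20 */
}
```

A call such as store8_relative(&data, offsetof(struct data_section, sq2), 4) corresponds to the pair of instructions that store the second square, with offsetof standing in for the assembler-computed @gprel offset.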

The double semicolons shown in Figure 1-3 mark stops, which inform an Itanium assembler that we have not analyzed potential timing interdependencies among the machine instructions. The assembler can produce from this format a valid program free from such complications, as we shall show later in this book. This simple SQUARES program does not illustrate the parallelism or predication features of the EPIC architecture.



Itanium® Architecture for Programmers: Understanding 64-Bit Processors and EPIC Principles
Year: 2003
Pages: 223
