Exercises

1:

What are the chief causes of pipeline stalling?

2:

What is meant by instruction-level parallelism? What are the characteristics of modern approaches towards its attainment?

3:

The initial Itanium processor implementation contains only two M-units and two I-units. Which of the Itanium templates (Table 10-2) cannot be paired as the two bundles of instructions that would execute completely in parallel?

4:

Prepare an illustration similar to Figure 10-2 showing how an Itanium 2 processor would execute the four bundles of instructions leading up to label done in the SCANFILE program as prepared by the particular assembler that you are using.

5:

Show how to use the tnat instruction and predication in order to pull the recovery load for the first control speculation illustration in Section 10.3.3 inline, avoiding the backward branch.

6:

What are the advantages and disadvantages of traditional loop unrolling? Describe the alternative approach that the Itanium architecture makes possible.

7:

Express as functions of N, the number of traversals of the programmer's loop, the estimated number of machine cycles for programs using the programming techniques of DOTCLOOP (Figure 5-2), DOTCTOP (Figure 10-3), and DOTCTOP2 (Figure 10-4). Assume hits in L1 D-cache, assume Itanium 2 processor latencies, and exclude the additional timing effects when the respective branch instructions are taken.

8:

Recast the DOTCTOP program to use the br.wtop instruction on the assumption that the dimensionality N is not explicitly given, but instead an address pointer to the word address immediately following the highest-subscripted component of V is known. If you had the choice, would you prefer this method of loop control or that in DOTCTOP? Explain your preference.

9:

In what ways does the Itanium architecture call into question some of the historically important factors affecting program optimization?

10:

The HEXNUM program (Figure 4-3) is highly sequential; i.e., most instruction groups in the loop contain only one useful instruction. Explore rewriting this program to take better advantage of the full parallelism of Itanium processors.

11:

How many internal self-calls of FIB1 are required to compute the value of the sixth member of the sequence of Fibonacci numbers?

12:

Modify the FIB1 function by allocating an information unit in the data section that can be incremented every time that FIB1 is entered. How many times is FIB1 called in the course of computing F6? F35?

13:

Show that the total number of passages through the FIB1 function to compute the nth member of the sequence of Fibonacci numbers equals the sum of Fn − 1 passages (that is, one fewer than Fn) that fall through the conditional branch instruction (i.e., full passages) and Fn passages that take the branch (i.e., abbreviated passages returning 1 as the value).
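
Before attempting the proof, it can help to check the claim numerically. The following is a minimal sketch in Python, not the FIB1 assembly itself: it assumes FIB1 follows the plain recursive scheme with F1 = F2 = 1, and the names fib1_counted and passages are hypothetical helpers introduced only for this illustration.

```python
def fib1_counted(n, counts):
    """Recursive Fibonacci modeled on the assumed FIB1 scheme (F1 = F2 = 1)."""
    if n <= 2:
        # Models the passage that takes the branch and returns 1 at once.
        counts["abbreviated"] += 1
        return 1
    # Models the passage that falls through and makes two self-calls.
    counts["full"] += 1
    return fib1_counted(n - 1, counts) + fib1_counted(n - 2, counts)


def passages(n):
    """Return (Fn, full passages, abbreviated passages) for one top-level call."""
    counts = {"full": 0, "abbreviated": 0}
    value = fib1_counted(n, counts)
    return value, counts["full"], counts["abbreviated"]


if __name__ == "__main__":
    # Tabulate n, Fn, and both passage counts for small n.
    for n in range(1, 11):
        value, full, abbreviated = passages(n)
        print(n, value, full, abbreviated)
```

A table of this kind only confirms the pattern for the values tried; the exercise still asks for an argument that holds for every n.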

14:

Investigate replacing the N = 1 or 2 exit in the FIB1 function with the instruction "(p6) br.ret.spnt.many b0" and removing the label done. Does this improve the performance for large N? Suggest why or why not.

15:

Use FIB2 alone with a modified calling program to find out the point in the sequence of Fibonacci numbers where the computed value can no longer be represented as an unsigned 64-bit value. Explain your method.
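
One possible method is to watch for the carry out of the 64-bit addition. The sketch below models that idea in Python under stated assumptions: it does not reproduce FIB2, it assumes the sequence starts with F1 = F2 = 1, and first_overflow_index is a hypothetical name. Masking with MASK64 imitates what a 64-bit register would hold after an add.

```python
MASK64 = (1 << 64) - 1  # largest unsigned 64-bit value


def first_overflow_index():
    """Return the smallest n for which Fn does not fit in 64 unsigned bits."""
    prev, curr = 1, 1  # F1 and F2
    n = 2
    while True:
        # Keep only the low 64 bits, as a 64-bit register would.
        total = (prev + curr) & MASK64
        n += 1
        # An unsigned sum that wrapped around is smaller than either operand.
        if total < curr:
            return n
        prev, curr = curr, total


if __name__ == "__main__":
    print(first_overflow_index())
```

The same comparison (sum less than an operand) can be made with an unsigned compare in a calling program, which avoids any need for arithmetic wider than 64 bits.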

16:

(Individual or group project) Develop and thoroughly analyze Itanium functions for the factorial function, both based on the well-known recurrence relation n! = n × (n − 1)! and based on a simple nonrecursive program loop.
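
A small reference model can be useful for checking the results of the two Itanium functions. The sketch below is such a model in Python, not an Itanium implementation; the names fact_recursive and fact_loop are hypothetical, and the convention 0! = 1 is assumed.

```python
def fact_recursive(n):
    """Factorial by the recurrence n! = n * (n - 1)!, with 0! = 1 assumed."""
    if n <= 1:
        return 1
    return n * fact_recursive(n - 1)


def fact_loop(n):
    """Factorial by a simple nonrecursive loop."""
    product = 1
    for k in range(2, n + 1):
        product *= k
    return product


if __name__ == "__main__":
    # The two formulations should agree for every n tried.
    for n in range(0, 15):
        assert fact_recursive(n) == fact_loop(n)
        print(n, fact_loop(n))
```

Python integers do not overflow, so this model says nothing about where a 64-bit Itanium result stops being representable; that limit is part of the analysis the exercise asks for.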

17:

(Individual or group project) Look up the czx instruction (compute zero index) in Intel's "Instruction Set Reference" and explore its potential for converting the SCANTEXT program (Section 6.4) into an improved SCANFAST program, possibly incorporating other programming techniques from this present chapter.



Itanium® Architecture for Programmers: Understanding 64-Bit Processors and EPIC Principles
Year: 2003
Pages: 223