Summary

We have discussed a broad range of hardware and software performance considerations in this chapter, all from a programmer's perspective. An architecture imposes some performance limitations, but the relative weighting of the various factors is highly implementation-dependent. Ideally, software strategies are consonant with hardware capabilities; when they are not, the degradation in performance can be severe.

We sketched a programmer's view of hardware pipelines and discussed concepts such as hazards and dependencies with specific reference to the Itanium 2 processor implementation. RISC, VLIW, and EPIC architectures take different approaches to instruction-level parallelism to get the most out of superscalar CPU designs.
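As a reminder of what the dependency terminology means, the following is a minimal sketch in Python that classifies the register dependencies between two instructions. The names `Instr` and `classify_dependency` are hypothetical, introduced only for this illustration. The sketch models each instruction as nothing more than the sets of registers it reads and writes.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    """Hypothetical model of an instruction: only its register usage."""
    reads: frozenset
    writes: frozenset


def classify_dependency(first: Instr, second: Instr) -> set:
    """Return the kinds of register dependency of `second` on `first`."""
    kinds = set()
    if first.writes & second.reads:
        kinds.add("RAW")   # read after write (true dependency)
    if first.reads & second.writes:
        kinds.add("WAR")   # write after read (anti-dependency)
    if first.writes & second.writes:
        kinds.add("WAW")   # write after write (output dependency)
    return kinds


# add r1 = r2, r3  followed by  add r5 = r1, r4
producer = Instr(reads=frozenset({"r2", "r3"}), writes=frozenset({"r1"}))
consumer = Instr(reads=frozenset({"r1", "r4"}), writes=frozenset({"r5"}))
print(classify_dependency(producer, consumer))
```

In this example the second instruction reads r1, which the first one writes, so the pair has a RAW dependency. On Itanium, such a pair generally cannot share an instruction group and must be separated by a stop.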

Software for Itanium systems must analyze instruction sequences, mark explicit stops between groups of mutually independent instructions, and then fit the variable-length groups into bundles of three instructions with appropriate templates. Different compilers and assemblers may produce physically different but logically equivalent bundles by using different mixtures of no-op instructions and templates, and by assigning type A instructions to either the M- or the I-units for execution.

The Itanium architecture provides cache line prefetch instructions, as well as speculative and advanced load instructions, that compilers can insert to hide some of the delays associated with retrieving data from the memory hierarchy, yet in ways that remain logically safe with respect to data and control dependencies.

Register rotation in Itanium systems provides robust support for organizing and optimizing the operation of time-critical innermost loops in programs through modulo scheduling, instead of the traditional loop unrolling used by compilers for many RISC architectures.
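The idea behind modulo scheduling can be sketched in Python, under loud simplifying assumptions: each "cycle" of the loop below performs the store stage of one iteration, the compute stage of the next, and the load stage of the one after that. The function names and the three-stage split are hypothetical, chosen only for illustration. The explicit `if` guards play the role that stage predicates play on Itanium, and the hand-written variables `loaded` and `computed` stand in for values that rotating registers would carry from one stage to the next.

```python
def scale_plain(src, k):
    """Reference version: one complete iteration at a time."""
    return [x * k for x in src]


def scale_pipelined(src, k):
    """Software-pipelined version with three overlapped stages.

    In cycle c the loop stores the result of iteration c-2, computes
    iteration c-1, and loads the operand for iteration c.
    """
    n = len(src)
    dst = [None] * n
    loaded = None     # operand loaded in the previous cycle
    computed = None   # result computed in the previous cycle
    for cycle in range(n + 2):          # n kernel cycles plus 2 to drain
        if cycle >= 2:                  # stage 3: store
            dst[cycle - 2] = computed
        if 1 <= cycle <= n:             # stage 2: compute
            computed = loaded * k
        if cycle < n:                   # stage 1: load
            loaded = src[cycle]
    return dst


print(scale_pipelined([1, 2, 3, 4], 10) == scale_plain([1, 2, 3, 4], 10))
```

The stages run in reverse order inside each cycle so that every stage consumes the value produced in the previous cycle before it is overwritten. A single loop body covers the fill, steady state, and drain phases, which is what keeps the code compact compared with unrolling.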

We outlined several historically prominent program optimization factors and assessed their contemporary pertinence, especially to the Itanium architecture. We concluded the chapter with different algorithms for computing a particular number in the Fibonacci sequence. This example emphasized that the choice of algorithm can have a greater impact than incremental optimization strategies.
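The point about algorithm choice can be illustrated with a short Python sketch. These three functions are not the chapter's own implementations; they are generic textbook variants, and their names are chosen here for illustration. The naive recursion takes exponentially many calls, the iterative loop takes a number of steps proportional to n, and the fast-doubling version needs only about log2(n) levels of recursion.

```python
def fib_recursive(n):
    """Naive recursion: exponential number of calls."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n):
    """Simple loop: n additions."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def _fib_pair(n):
    """Return (F(n), F(n+1)) using the fast-doubling identities
    F(2m) = F(m) * (2*F(m+1) - F(m)) and F(2m+1) = F(m)^2 + F(m+1)^2."""
    if n == 0:
        return (0, 1)
    a, b = _fib_pair(n // 2)
    c = a * (2 * b - a)
    d = a * a + b * b
    return (c, d) if n % 2 == 0 else (d, c + d)


def fib_doubling(n):
    """Fast doubling: about log2(n) recursive steps."""
    return _fib_pair(n)[0]


print(fib_recursive(20), fib_iterative(20), fib_doubling(20))
```

All three return the same values, but no amount of instruction-level tuning of the recursive version is likely to close the gap with the other two as n grows.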



Itanium® Architecture for Programmers: Understanding 64-Bit Processors and EPIC Principles
Year: 2003
Pages: 223