7.7 Integer Quotients and Remainders

Having previously noted that the Itanium architecture lacks a hardware instruction for performing integer divide operations, we proposed in Chapter 6 a partial remedy using multiplication by the reciprocal of a number. Given that many interesting algorithms require division, modulo, and remainder operations, a modern programming environment should not expect the programmer to design and implement division each time it is needed. A substitute, in the form of software, must be provided.

7.7.1 Routines Used by a High-Level Language

Software support for division can be at least as varied as the languages available for each programming environment. High-level languages usually provide an option to view the assembly language that they generate. Compiling a short test program with the -S option of several Itanium compilers reveals how each one handles the C division operation:

  • gcc for Linux calls a special internal routine, __divdi3

  • ecc for Linux inserts an inline instruction sequence

  • aCC for HP-UX calls a special internal routine, __milli_divI

  • cc_bundled for HP-UX calls a special internal routine, __milli_divI

None of these routines has a published interface. They are intended exclusively as support routines for C programs. A program that calls them from outside that environment risks breaking whenever the undocumented interface changes.
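As an illustration, a test file of the kind described above might look like the following minimal sketch. The file name divtest.c and the function names quot64 and rem64 are our own assumptions; the original test program is not reproduced here. The long long type is chosen so that the operands are 64 bits wide.

```c
/* divtest.c: hypothetical test file for inspecting how a compiler
 * implements integer division. The names here are illustrative only. */

/* Signed 64-bit quotient. C99 and later truncate toward zero. */
long long quot64(long long dividend, long long divisor)
{
    return dividend / divisor;
}

/* Signed 64-bit remainder. The result takes the sign of the dividend. */
long long rem64(long long dividend, long long divisor)
{
    return dividend % divisor;
}
```

Compiling with, for example, gcc -S divtest.c writes the generated assembly language to divtest.s, where each division should appear either as a call to a support routine or as an inline instruction sequence.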

7.7.2 Open-Source Routines from Intel Corporation

Intel Corporation has written and made freely available some open-source Itanium compiler support routines for:

  • String and memory functions

  • Divide, square root, and remainder

  • Optimized mathematical functions

See Appendix B.2.1 for information about access to this software. Of immediate interest to us are the routines for calculating quotients and remainders. The routines support different data sizes for both signed and unsigned integer operands. Variants are optimized for minimum latency (for an isolated calculation) or maximum sustained throughput (for a calculation repeated inside a loop). The naming convention of the source files containing these variants is easily understood:

 [u]int[8,16,64]_[div,rem]_[min_lat,max_thr].s
 [u]int32_[div,rem].s

The appropriate entry point name for a br.call instruction is the same as the name of the source file (without .s). For example, uint64_rem_min_lat is the entry point of the unsigned 64-bit remainder function optimized for minimum latency in a single use.
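To make the naming pattern concrete, the following C sketch composes an entry point name from its parts. The helper entry_point_name is hypothetical and is not part of the Intel package; it only encodes the pattern quoted above, including the fact that the 32-bit routines have no min_lat or max_thr suffix.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the Intel package): composes an entry
 * point name following the file-naming pattern quoted in the text.
 * Returns 0 on success, or -1 if the arguments fall outside the pattern
 * or the buffer is too small. */
int entry_point_name(char *buf, size_t size, int is_unsigned, int bits,
                     const char *op, const char *variant)
{
    const char *prefix = is_unsigned ? "u" : "";
    int n;

    /* Only quotient ("div") and remainder ("rem") routines exist. */
    if (op == NULL || (strcmp(op, "div") != 0 && strcmp(op, "rem") != 0))
        return -1;

    if (bits == 32) {
        /* 32-bit routines come in a single variant; variant is ignored. */
        n = snprintf(buf, size, "%sint32_%s", prefix, op);
    } else if (bits == 8 || bits == 16 || bits == 64) {
        /* Other sizes require a latency or throughput suffix. */
        if (variant == NULL ||
            (strcmp(variant, "min_lat") != 0 &&
             strcmp(variant, "max_thr") != 0))
            return -1;
        n = snprintf(buf, size, "%sint%d_%s_%s", prefix, bits, op, variant);
    } else {
        return -1;
    }

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Called with an unsigned 64-bit remainder and the min_lat variant, this helper produces the name used in the example above; appending .s gives the corresponding source file name.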

These routines follow all conventions for register use and for argument passing that we have outlined in this chapter. The calling routine should put the dividend onto the register stack as out0 and the divisor as out1; the routine returns the result in register r8. The next section illustrates linking one of these routines.



Itanium® Architecture for Programmers: Understanding 64-Bit Processors and EPIC Principles
Year: 2003
Pages: 223
