We now present an example of linking an external function written in assembly language to a main program written in a high-level language. This function, which produces successive "ran dom" numbers, will also illustrate the calling of routines to compute the remainder from integer division as well as to obtain a nonrepeating initialization value. High-level languages have not always specified functions for producing pseudo-random values, though modern compilers tend to include an extension to languages lacking such a standard. A large body of work exists on the challenging topic of how to generate and test sequences of random numbers (see Ferrenberg, Hayes, Knuth, Park). Sampling of measurements on natural systems can provide random numbers (per Haahr), but computational methods are by far the most common and perhaps commonly abused. Regardless of the method of generation, testing is required to ensure that a sequence of supposedly random numbers does not exhibit autocorrelation. For computer-generated integers, the best attainable sequences will show no ability to infer Ri+j from Ri for a typical j < 2w , where w is the bit-width used in the calculations, except for the value(s) of j used by the algorithm itself. For example, if a computer has a 16-bit datapath, software running on it should be able to produce integer sequences with a periodicity of 216 = 65,536, not some smaller interval of repetition. 7.8.1 Choosing an AlgorithmKnuth provides guidelines in Seminumerical Algorithms for selecting constants for the linear congruential method, in which successive values are generated by the simple relationship:
where m is a "well-chosen" constant multiplier, a is a constant additive term, and d is a constant divisor. On a binary computer, d is usually selected as 2w , where w is the natural integer size in bits, because the natural retention of only a fixed number of least significant bits from multiplication and addition is tantamount to a modulo operation. If d is a power of 2, then 1 is a suitable value for a. In addition, Knuth has shown that m should end with e 21 (decimal), with e being an even digit. The "seed" value R0 is arbitrary. If it is a built-in constant, the routine will always produce the same sequence. Repeatability, while seemingly undesirable, is useful for debugging or studying simulation algorithms; system-provided "black box" random-number generators are not open to inspection and control. We will derive a seed value from the date and time obtained from the operating system, thereby ensuring a different sequence each time the routine presented below is called. The construction of a callable RANDOM function based on the linear congruential method for producing sequences of pseudorandom integers will bring together many of our previous topics. We will then call the RANDOM function from brief test programs written in several high-level languages. 7.8.2 RANDOM: Developing the FunctionTo develop our RANDOM function, we need to augment the basic Itanium instruction set with methods for multiplication and obtaining the remainder from integer division. A traditional nonrepeating seed value is obtained from a system call that returns an integer representing the current system time. A versatile RANDOM function should return integer values in some interval instead of the full 64-bit value that it generates. Accordingly, it needs two ins for the low and high ends of the interval. RANDOM will return a value in that interval in ret0 (register r8) each time it is called. The function must also be self-initializing; on the first call it will obtain an external seed value, and in subsequent calls it will use the previously generated value as the next seed. The seed value needs to be stored in a persistent location, outside of the register and memory stacks, which are relinquished and reset with each call. RANDOM utilizes a read-write data section that is maintained between invocations of the function; a seed value initialized to zero indicates that the function has not been previously called and that the function should obtain an initial seed from the system. The following pseudocode shows the overall conceptual design for a complete RANDOM function: make register stack claim save caller's resources from being overwritten if first-time-called (i.e., SEED is 0) then get SEED from system clock (use gettimeofday) store SEED in data section endif get SEED from data section multiply by some chosen m (use BOOTH) add some chosen a perform modulo 2^64 operation (easy) store result as next SEED perform modulo RANGE operation (use uint64_rem_min_lat) add low end of caller's specified interval ensure final result is in r8 restore caller's resources return to caller Figure 7-9 shows the complete Itanium assembly language function based upon this outline. In deciding on a method to pass arguments in and out of the RANDOM function, we followed the pattern typical of mathematical functions in the C language: RANDOM expects two ins representing the values of LOW and HIGH, and produces a scaled random number as the return value. Figure 7-9 RANDOM: Illustrating a function that calls other functions// RANDOM Random Number Generator // Function RANDOM(LOW,HIGH) returns a pseudorandom // integer between LOW and HIGH. The random integer is // generated by a multiplier/adder method, with "MOD" used // to bring the integer into the range [LOW --> HIGH]. // Seed is initialized using the system time. MULT = 3141592653589793221 // Multiplier TEN6 = 1000000 // 1,000,000 (one million) .global booth, uint64_rem_min_lat, gettimeofday .data // Declare storage .align 8 // Desired alignment SEED: data8 0 // Seed for algorithm TIME: // For gettimeofday SECS: data8 0 // Seconds USEC: data8 0 // Microseconds VOID: data4 0 // (extra time zone) data4 0 // (info not needed) .text // Section for code .align 32 // Desired alignment .global random, random_ // These three lines .proc random, random_ // mark the mandatory random: // function entry random_: // FORTRAN needs underscore .prologue 12,r32 // Mask for rp, ar.pfs only alloc loc1 = ar.pfs,2,5,2,0 // ins, locals, outs .save rp,loc0 // Must save return address mov loc0 = b0 // to our caller .body // Now we really begin... first: add loc2 = @gprel(SEED),gp;; // loc2 -> SEED ld8 loc3 = [loc2];; // loc3 = SEED mov loc4 = gp // Save gp cmp.ne p6,p0 = 0,loc3 // First time called? (p6) br.cond.sptk.many gen // Not usually! add out0 = @gprel(TIME),gp // out0 -> TIME add out1 = @gprel(VOID),gp // out1 -> VOID br.call.sptk.many b0 = gettimeofday // System time mov gp = loc4;; // Restore gp add r2 = @gprel(SECS),gp;; // r2 -> SECS ld8 out0 = [r2] // SECS is multiplicand movl out1 = TEN6 // TEN6 is multiplier br.call.sptk.many b0 = booth // r9,r8 = out0 * out1 mov gp = loc4;; // Restore gp add r2 = @gprel(USEC),gp;; // r2 -> USEC ld8 loc3 = [r2];; // USEC add loc3 = loc3,r8;; // Total microseconds gen: mov out0 = loc3 // SEED is multiplicand movl out1 = MULT // MULT is multiplier br.call.sptk.many b0 = booth // r9,r8 = out0 * out1 mov gp = loc4 // Restore gp add r8 = 1,r8;; // Product + 1 modulo 2^64 st8 [loc2] = r8 // Store as next seed mov out0 = r8 // RND sub out1 = in1,in0;; // HIGH - LOW add out1 = 1,out1 // RANGE br.call.sptk.many b0 = uint64_rem_min_lat // Modulo mov gp = loc4 // Restore gp add r8 = r8,in0 // LOW + (RND mod RANGE) done: mov b0 = loc0 // Restore return address mov ar.pfs = loc1 // Restore caller's ar.pfs br.ret.sptk.many b0;; // Back to caller .endp random // Mark end of procedure .include "uint64_rem_min_lat.s" The gettimeofday function returns time-related information from the system. The first argument is the address of a structure consisting of two successive quad words to receive the number of seconds and microseconds since 00:00 on January 1, 1970 UTC (Coordinated Universal Time, formerly Greenwich Mean Time or GMT). The second argument is the address of a second structure, returning time zone information; we do not use that information or a status code that may be returned in register r8. RANDOM uses just the low-order portion of the full 128-bit product from calculating the number of microseconds since 1970. Rather than link the Intel remainder routine at compile time, we included it using the .include assembler directive. Some assemblers may produce a warning message because the Intel file contains an additional .text directive. We had to provide an additional global symbol random_ with a trailing underscore in order to make the RANDOM function compatible with the way that FORTRAN sets up its external calls. 7.8.3 High-Level Language Calling ProgramsIn order to demonstrate the generality and capability of a fully defined calling standard, we have written very brief calling programs for the RANDOM function in two standard high-level languages, C and FORTRAN. We have attempted to make these programs as similar in form as possible without straying beyond the standard syntax rules for either. These languages differ in their defaults for argument passing (Table 7-1), and we have included a few comments with each program. The main point is to show that, with correct argument passing, a single version of RANDOM, or any assembly language routine, is callable from different languages. Once the function has been tested and documented, the writer of a calling program needs only to ensure that the arguments are formulated properly and that any obligatory "external" declarations are furnished as required by other languages. We used the very simplest method of output afforded by each language, with no regard for left or right justification or the presence of leading zeros, because the appearance of the output was not of concern. We also chose common syntax for each language that was acceptable to the broadest range of compilers. Calling from a C programFigure 7-10 contains a very simple program written in the C language to call the RANDOM function. Appropriate command lines for compiling and linking the components into a complete executable program are as follows: L> gcc -O0 -o bin/random randc.c random.s booth.s L> ecc -O0 -o bin/random randc.c random.s booth.s H> aCC -O0 +DD64 -Ae -o bin/random randc.c random.s booth.s H> cc -O0 +DD64 -o bin/random randc.c random.s booth.s H> cc_bundled +DD64 -o bin/random randc.c random.s booth.s Each time this program runs, it prints a different sequence of 20 small integers. Figure 7-10 Calling program in C// This C calling program will be linked to the // function that generates random numbers. #include <stdio.h> int main () { unsigned long long i, low=1, high=6; if ( sizeof(long int) != 8 ) { printf( "RANDC requires compiling with +DD64\n" ); return(1); } // In C, scalars are passed by value by default i=0; while ( i<20 ) { printf( "%llu\n",random(low,high) ); i++; }; return 0; } Figure 7-11 Calling program in FORTRANPROGRAM randf * This FORTRAN calling program will be linked to the * function that generates random numbers. EXTERNAL random INTEGER*8 random, I * In FORTRAN, scalars are passed by reference by default. DO 100 I = 1, 20 PRINT *, random( %VAL(1), %VAL(6) ) 100 CONTINUE END Calling from a FORTRAN programFigure 7-11 contains a very simple program written in the FORTRAN language to call the RANDOM function. The required symbolic label random_ in the assembly language source is recognized for the function name random (without _) in FORTRAN. By default, FORTRAN passes arguments by reference, while RANDOM expects them to be passed by value. The %VAL intrinsic function is an extension to the language recognized by most modern compilers. Appropriate command lines for compiling and linking the components into a complete executable program are as follows: L> g77 -O0 -o bin/random randf.f random.s booth.s L> efc -O0 -o bin/random randf.f random.s booth.s H> f90 -O0 +DD64 -o bin/random randf.f random.s booth.s where efc is the command for Intel's FORTRAN compiler for Linux. Each time that this program runs, it prints a different sequence of 20 small integers. |