Main Methods of Optimizing Mathematical Calculations

In the C++ .NET programming environment, as in other high-level languages, you can use assembly language to optimize the critical fragments of a program. The gains in performance and code size vary from case to case. Below, we consider several examples that are used successfully in practice; each uses assembly language to improve the quality of the program code. We focus on how mathematical operations can be optimized, and there are several methods for doing so.

Method 1. Using Integer Instructions

When a floating-point value is only moved and not used in a calculation, integer instructions can replace the FPU commands. Data transfer operations in particular can be replaced by more efficient integer commands. For example, commands such as

 fld  QWORD PTR [ESI]
 fstp QWORD PTR [EDI]

can be replaced by these:

 mov EAX, [ESI]
 mov EBX, [ESI+4]
 mov [EDI], EAX
 mov [EDI+4], EBX
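The same idea can be sketched in portable C++, without inline assembly. The helper name copy_double_as_ints is hypothetical and not part of the original example. It copies a 64-bit value as two 32-bit integers, so the bit pattern is transferred unchanged and the FPU is never involved.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

// Hypothetical helper: copies one double as two 32-bit integer moves,
// mirroring the four mov instructions above.
void copy_double_as_ints(const double* src, double* dst) {
    assert(src != nullptr && dst != nullptr);
    const unsigned char* s = reinterpret_cast<const unsigned char*>(src);
    unsigned char* d = reinterpret_cast<unsigned char*>(dst);
    std::uint32_t first, second;
    std::memcpy(&first, s, 4);       // mov EAX, [ESI]
    std::memcpy(&second, s + 4, 4);  // mov EBX, [ESI+4]
    std::memcpy(d, &first, 4);       // mov [EDI], EAX
    std::memcpy(d + 4, &second, 4);  // mov [EDI+4], EBX
}
```

Because only integers are moved, the copy is bit-exact for any value, including special ones such as negative zero.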

Method 2. Optimizing Zero Test for Floating-point Numbers

To check whether a floating-point number is equal to zero (0), you can use integer instructions as well. First, consider code that implements the zero test with a sequence of FPU commands. Here is the source code of a simple C++ .NET console application using this test (Listing 2.10).

Listing 2.10: Testing if the floating-point number is equal to zero (0)

 // CHANGE_FPU_INT.cpp : Defines the entry point for the console
 // application
 #include "stdafx.h"
 #include <stdio.h>

 int _tmain(int argc, _TCHAR* argv[])
 {
   float f1 = 0;
   float *pf1 = &f1;
   int   isZero;   // 32-bit flag: the assembly code stores a full DWORD

   while (true)
   {
     printf("\nEnter float value: ");
     scanf("%f", pf1);
     _asm {
           mov      ECX, 1                 ; assume the value is zero
           mov      EBX, DWORD PTR pf1     ; EBX = address of f1
           finit
           fld      DWORD PTR [EBX]        ; ST(0) = f1
           ftst                            ; compare ST(0) with 0.0
           fstsw    AX                     ; copy the FPU status word to AX
           sahf                            ; transfer C3 to ZF
           jz       ex
           mov      ECX, 0
      ex:
           mov      DWORD PTR isZero, ECX
           fwait
     }
     if (isZero) printf("Entered value is equal to 0\n");
     else        printf("Entered value is not equal to 0\n");
   }
   return 0;
 }

An equivalent variant of the program that uses only ordinary integer instructions may look like this (only the assembly code fragment is shown):

 . . .
 _asm {
       mov       ECX, 1
       mov       EBX, DWORD PTR pf1
       mov       EAX, DWORD PTR [EBX]     ; load the bit pattern of f1
       add       EAX, EAX                 ; discard the sign bit; ZF = 1 for +0 and -0
       jz        ex
       mov       ECX, 0
   ex:
       mov       DWORD PTR isZero, ECX
 }
 . . .

To optimize the zero test further, try to eliminate branching from the code. To do this, find an equivalent for the jz ex conditional jump command. For example, you can use the cmov command with the corresponding condition. With these changes, the assembly code fragment looks like this:

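 . . .
 _asm {
       mov       ECX, 1
       mov       EBX, DWORD PTR pf1
       mov       EAX, DWORD PTR [EBX]
       add       EAX, EAX                 ; ZF = 1 if the value is +0 or -0
       cmovz     EAX, ECX                 ; zero:     EAX = 1
       mov       ECX, 0                   ; mov does not change the flags
       cmovnz    EAX, ECX                 ; non-zero: EAX = 0
       mov       DWORD PTR isZero, EAX
 }
 . . .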
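For comparison, here is a minimal C++ sketch of the same integer zero test without inline assembly. The names is_zero_bits and check_against_compare are hypothetical, and the sketch assumes a 32-bit IEEE 754 float. Whether the compiler produces branch-free code for the comparison depends on the compiler and its settings.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: true for +0.0f and -0.0f, false for any other value.
bool is_zero_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // read the bit pattern of f
    // Shifting left by one plays the role of add EAX, EAX:
    // the sign bit is discarded, and the rest must be all zeros.
    return static_cast<std::uint32_t>(bits << 1) == 0u;
}

// Usage sketch: the integer test should agree with the ordinary comparison.
void check_against_compare(float f) {
    assert(is_zero_bits(f) == (f == 0.0f));
}
```

Calling check_against_compare with a few values, such as 0.0f, -0.0f, and 1.5f, is a quick way to confirm that the integer test and the floating-point comparison give the same answer.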

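Take into account that the latter two code fragments work with operands in double-word format, that is, with single-precision numbers. For double-precision numbers (QWORD), it is usually enough to test the upper 32 bits (bits 32–63, counting from 0), again ignoring the sign bit. If these bits are equal to 0, then the number is treated as 0. Strictly speaking, this shortcut also reports as zero the extremely small denormalized values whose non-zero bits all lie in the lower double word.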
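The following C++ sketch contrasts a test of the upper double word with a test of all 64 bits. The function names are hypothetical, and the sketch assumes a 64-bit IEEE 754 double. The two tests differ only for denormalized values whose upper 32 bits are all zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Exact test: every bit except the sign bit must be zero.
bool is_zero_all_bits(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return static_cast<std::uint64_t>(bits << 1) == 0u;
}

// Shortcut: inspect only the upper double word (sign, exponent,
// and the top 20 bits of the fraction), ignoring the sign bit.
bool is_zero_high_dword(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    const std::uint32_t high = static_cast<std::uint32_t>(bits >> 32);
    return static_cast<std::uint32_t>(high << 1) == 0u;
}

// Usage sketch: the exact test should agree with the ordinary comparison.
void check_exact_test(double d) {
    assert(is_zero_all_bits(d) == (d == 0.0));
}
```

If denormalized inputs can occur and must not be reported as zero, the lower double word has to be tested as well.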

Fig. 2.7 shows the application window for all three modifications of the program code.

Fig. 2.7: Application that tests whether the given floating-point number is equal to zero (0)

Method 3. Using the LEA Commands for Optimization

To optimize mathematical calculations, you can also use the lea (load effective address) command, which computes an address expression without accessing memory. For example, a command such as

 lea EAX, 3[-100][EDX+ECX]

can replace a whole group of the following commands:

 mov EAX, ECX
 add EAX, EDX
 sub EAX, 100
 add EAX, 3
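As a rough check of the arithmetic, the C++ sketch below reads the displacement in the lea operand as 3 + (-100) = -97 and compares a single expression with the same sum built one operation at a time. The function names are hypothetical. Unsigned arithmetic is used internally so that the result wraps around modulo 2^32, as the processor's does.

```cpp
#include <cassert>
#include <cstdint>

// Single expression, as lea evaluates it: EDX + ECX + 3 - 100.
std::int32_t lea_style(std::int32_t ecx, std::int32_t edx) {
    const std::uint32_t sum = static_cast<std::uint32_t>(edx)
                            + static_cast<std::uint32_t>(ecx)
                            + 3u - 100u;
    return static_cast<std::int32_t>(sum);
}

// The same value computed one operation at a time.
std::int32_t step_by_step(std::int32_t ecx, std::int32_t edx) {
    std::uint32_t eax = static_cast<std::uint32_t>(ecx);  // copy the first operand
    eax += static_cast<std::uint32_t>(edx);               // add the second operand
    eax -= 100u;                                          // subtract 100
    eax += 3u;                                            // add 3
    return static_cast<std::int32_t>(eax);
}

// Usage sketch: both forms should always produce the same result.
void check_same(std::int32_t ecx, std::int32_t edx) {
    assert(lea_style(ecx, edx) == step_by_step(ecx, edx));
}
```

An optimizing compiler targeting x86 may itself emit a single lea for an expression of this shape, although that depends on the compiler and its settings.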

Listing 2.11 shows the source code of a small console application that uses this command.

Listing 2.11: The use of the LEA instruction as an arithmetic command

 // LEAEX.cpp : Defines the entry point for the console application
 #include "stdafx.h"
 #include <stdio.h>

 int _tmain(int argc, _TCHAR* argv[])
 {
   int i1, i2, ires;

   while (true)
   {
     printf("\nEnter i1 > ECX: ");
     scanf("%d", &i1);
     printf("\nEnter i2 > EDX: ");
     scanf("%d", &i2);
     _asm {
           mov ECX, DWORD PTR i1
           mov EDX, DWORD PTR i2
           lea EAX, 3[-100][EDX+ECX]      ; EAX = EDX + ECX + 3 - 100
           mov DWORD PTR ires, EAX
     }
     printf("Calculated result = %d\n", ires);
   }
   return 0;
 }

The source code of the program is simple and does not require further explanation. Fig. 2.8 shows the window with this application running.

Fig. 2.8: Using the LEA (address loading) command for performing mathematical operations

We have considered only a small part of the mathematical capabilities available in assembly language. These examples should encourage you to use the full range of FPU features in your assembly language programs. If necessary, you can use assembly language to modify existing mathematical algorithms, or even to write your own algorithms that have no counterpart in high-level languages.