| < Free Open Study > |
26.4. ExpressionsMuch of the work in a program is done inside mathematical or logical expressions. Complicated expressions tend to be expensive, so this section looks at ways to make them cheaper.
Cross-Reference For more information on expressions, see Section 19.1, "Boolean Expressions." Exploit Algebraic IdentitiesYou can use algebraic identities to replace costly operations with cheaper ones. For example, the following expressions are logically equivalent: not a and not b not (a or b) If you choose the second expression instead of the first, you can save a not operation.
Although the savings from avoiding a single
not
operation are probably inconsequential, the general principle is powerful. Jon Bentley describes a program that
Use Strength ReductionAs mentioned earlier, strength reduction means replacing an expensive operation with a cheaper one. Here are some possible substitutions:
Suppose you have to evaluate a polynomial. If you're rusty on
Visual Basic Example of Evaluating a Polynomialvalue = coefficient( 0 ) For power = 1 To order value = value + coefficient( power ) * x^power Next
If you're thinking about strength reduction, you'll look at the exponentiation operator with a jaundiced eye. One solution would be to replace the exponentiation with a multiplication on each pass through the loop, which is analogous to the strength-reduction case a few sections ago in which a multiplication was
Visual Basic Example of a Reduced-Strength Method of Evaluating a Polynomialvalue = coefficient( 0 ) powerOfX = x For power = 1 to order value = value + coefficient( power ) * powerOfX powerOfX = powerOfX * x Next
This produces a noticeable advantage if you're working with second-order polynomials—that is, polynomials in which the highest-power
If you're serious about strength reduction, you still won't care for those two floatingpoint multiplications. The strength-reduction principle suggests that you can further reduce the strength of the operations in the loop by
Visual Basic Example of Further Reducing the Strength Required to Evaluate a Polynomialvalue = 0 For power = order to 1 Step -1 value = ( value + coefficient( power ) ) * x Next value = value + coefficient( 0 )
This method eliminates the extra
powerOfX
variable and
This is a good example of theory not holding up very well to practice. The code with reduced strength seems like it should be faster, but it isn't. One possibility is that
Initialize at Compile TimeIf you're using a named constant or a magic number in a routine call and it's the only argument, that's a clue that you could precompute the number, put it into a constant, and avoid the routine call. The same principle applies to multiplications, divisions, additions, and other operations.
I once needed to compute the base-two logarithm of an integer, truncated to the
log(x)base = log(x) / log(base) Given this identity, I could write a routine like this one: C++ Example of a Log-Base-Two Routine Based on System Routines
unsigned int Log2( unsigned int x ) {
return (unsigned int) ( log( x ) / log( 2 ) );
}
Cross-Reference
For details on binding
This routine was really slow, and because the value of log(2) never changed, I replaced log(2) with its computed value, 0.69314718 , like this: C++ Example of a Log-Base-Two Routine Based on a System Routine and a Constant
const double LOG2 = 0.69314718;
...
unsigned int Log2( unsigned int x ) {
return (unsigned int) ( log( x ) / LOG2 );
}
Since log() tends to be an expensive routine—much more expensive than type conversions or division—you'd expect that cutting the calls to the log() function by half would cut the time required for the routine by about half. Here are the measured results:
In this case, the
System routines are expensive and provide accuracy that's often
In the previous example, the
Log2()
routine returned an integer value but used a floating-point
log()
routine to compute it. That was overkill for an integer result, so after my first attempt, I wrote a series of integer tests that were
C++ Example of a Log-Base-Two Routine Based on Integers
unsigned int Log2( unsigned int x ) {
if ( x < 2 ) return 0 ;
if ( x < 4 ) return 1 ;
if ( x < 8 ) return 2 ;
if ( x < 16 ) return 3 ;
if ( x < 32 ) return 4 ;
if ( x < 64 ) return 5 ;
if ( x < 128 ) return 6 ;
if ( x < 256 ) return 7 ;
if ( x < 512 ) return 8 ;
if ( x < 1024 ) return 9 ;
...
if ( x < 2147483648 ) return 30;
return 31 ;
}
This routine uses integer operations, never converts to floating point, and blows the doors off both floating-point versions:
Most of the so-called "
Another option is to take advantage of the fact that a right-shift operation is the same as dividing by two. The number of times you can divide a number by two and still have a nonzero value is the same as the log2 of that number. Here's how code based on that observation looks:
{% if main.adsdop %}{% include 'adsenceinline.tpl' %}{% endif %} To non-C++ programmers, this code is particularly hard to read. The complicated expression in the while condition is an example of a coding practice you should avoid unless you have a good reason to use it. This routine takes about 350 percent longer than the longer version above, executing in 2.4 seconds rather than 0.66 seconds. But it's faster than the first approach, and it adapts easily to 32-bit, 64-bit, and other environments.
Use the Correct Type of Constants
Use named constants and literals that are the same type as the variables they're assigned to. When a constant and its
A less advanced compiler or an interpreter generates code for a run-time conversion, so you might be stuck. Here are some differences in performance between the initializations of a floating-point variable x and an integer variable i in two cases. In the first case, the initializations look like this: x = 5 i = 3.14
and require type conversions,
x = 3.14 i = 5 and don't require type conversions. Here are the results, and the variation among compilers is once again notable:
Precompute ResultsA common low-level design decision is the choice of whether to compute results on the fly or compute them once, save them, and look them up as needed. If the results are used many times, it's often cheaper to compute them once and look them up the rest of the time.
This choice
In a space-wars video game, for example, the programmers initially computed gravity coefficients for different distances from the sun. The computation for the gravity coefficients was expensive and affected performance. The program recognized relatively few distinct distances from the sun, however, so the programmers were able to precompute the gravity coefficients and store them in a 10-element array. The array lookup was much faster than the expensive computation.
Cross-Reference For more on using data in tables instead of complex logic, see Chapter 18, "Table-Driven Methods." Suppose you have a routine that computes payment amounts on automobile loans. The code for such a routine would look like this: Java Example of a Complex Computation That Could Be Precomputed
double ComputePayment(
long loanAmount,
int months,
double interestRate
) {
return loanAmount /
(
( 1.0 - Math.pow( ( 1.0 + ( interestRate / 12.0 ) ), -months ) ) /
( interestRate / 12.0 )
);
}
The formula for computing loan payments is complicated and
How big would the table be? The widest-
Most of the computation doesn't depend on loanAmount , however, so you can put the really ugly part of the computation (the denominator of the larger expression) into a table that's indexed by interestRate and months . You recompute the loanAmount part each time:
Java Example of
|
|
Language |
Straight Time |
Code-Tuned Time |
Time Savings |
Performance Ratio |
|---|---|---|---|---|
|
Java |
2.97 |
0.251 |
92% |
10:1 |
|
Python |
3.86 |
4.63 |
-20% |
1:1 |
Depending on your circumstances, you would need to precompute the
loanDivisor
array at program initialization time or read it from a disk file. Alternatively, you could initialize it to
, compute each element the first time it's requested, store it, and look it up each time it's
You don't have to create a table to take advantage of the performance gains you can achieve by precomputing an expression. Code similar to the code in the previous examples raises the possibility of a different kind of precomputation. Suppose you have code that computes payments for many loan amounts, as shown here:
double ComputePayments(
int months,
double interestRate
) {
for ( long loanAmount = MIN_LOAN_AMOUNT; loanAmount < MAX_LOAN_AMOUNT;
loanAmount++ ) {
payment = loanAmount / (
( 1.0 - Math.pow( 1.0+(interestRate/12.0), - months ) ) /
( interestRate/12.0 )
);
...
<-- 1
}
}
(1) The following code would do something with payment here; for this example's point, it doesn't matter what.
Even without precomputing a table, you can precompute the complicated part of the expression outside the loop and use it inside the loop. Here's how it would look:
double ComputePayments(
int months,
double interestRate
) {
long loanAmount;
double divisor = ( 1.0 - Math.pow( 1.0+(interestRate/12.0). - months ) ) /
<-- 1
( interestRate/12.0 );
<-- 1
for ( long loanAmount = MIN_LOAN_AMOUNT; loanAmount <= MAX_LOAN_AMOUNT;
loanAmount++ ) {
payment = loanAmount / divisor;
...
}
}
(1) Here's the part that's precomputed.
This is similar to the techniques suggested earlier of putting array references and pointer dereferences outside a loop. The results for Java in this case are comparable to the results of using the precomputed table in the first optimization:
|
Language |
Straight Time |
Code-Tuned Time |
Time Savings |
Performance Ratio |
|---|---|---|---|---|
|
Java |
7.43 |
0.24 |
97% |
30:1 |
|
Python |
5.00 |
1.69 |
66% |
3:1 |
Python improved here, but not in the first optimization attempt. Many times when one optimization does not produce the desired results, a seemingly similar optimization will work as expected.
Optimizing a program by precomputation can take several forms:
Computing results before the program executes, and wiring them into constants that are assigned at compile time
Computing results before the program executes, and hard-coding them into variables used at run time
Computing results before the program executes, and putting them into a file that's loaded at run time
Computing results once, at program startup, and then referencing them each time they're needed
Computing as much as possible before a loop begins, minimizing the work done inside the loop
Computing results the first time they're needed, and storing them so that you can retrieve them when they're needed again
If you find an expression that's repeated several times, assign it to a variable and refer to the variable rather than recomputing the expression in several places. The loan-calculation example has a common subexpression that you could eliminate. This is the original code:
payment = loanAmount / (
( 1.0 -- Math.pow( 1.0 + ( interestRate / 12.0 ), -months ) ) /
( interestRate / 12.0 )
);
In this sample, you can assign
interestRate/12.0
to a variable that is then referenced twice rather than computing the expression twice. If you have
monthlyInterest = interestRate / 12.0;
payment = loanAmount / (
( 1.0 -- Math.pow( 1.0 + monthlyInterest, -months ) ) /
monthlyInterest
);
The savings in this case don't seem impressive:
|
Language |
Straight Time |
Code-Tuned Time |
Time Savings |
|---|---|---|---|
|
Java |
2.94 |
2.83 |
4% |
|
Python |
3.91 |
3.94 |
-1% |
It appears that the
Math.pow()
routine is so costly that it overshadows the savings from subexpression
| < Free Open Study > |