25.4. Measurement | Code Complete: A Practical Handbook of Software Construction, Second Edition

< Free Open Study >

Because small parts of a program usually consume a disproportionate share of the run time, measure your code to find the hot spots. Once you've found the hot spots and optimized them, measure the code again to assess how much you've improved it. Many aspects of performance are counterintuitive. The earlier case in this chapter, in which 10 lines of code were significantly faster and smaller than one line, is one example of the ways that code can surprise you.

Experience doesn't help much with optimization either. A person's experience might have come from an old machine, language, or compiler when any of those things changes, all bets are off. You can never be sure about the effect of an optimization until you measure the effect.

Many years ago now I wrote a program that summed the elements in a matrix. The original code looked like this:

C++ Example of Straightforward Code to Sum the Elements in a Matrix

sum = 0; for ( row = 0; row < rowCount; row++ ) {    for ( column = 0; column < columnCount; column++ ) {       sum = sum + matrix[ row ][ column ];    } }

This code was straightforward, but performance of the matrix-summation routine was critical, and I knew that all the array accesses and loop tests had to be expensive. I knew from computer-science classes that every time the code accessed a two-dimensional array, it performed expensive multiplications and additions. For a 100-by-100 matrix, that totaled 10,000 multiplications and additions, plus the loop overhead. By converting to pointer notation, I reasoned, I could increment a pointer and replace 10,000 expensive multiplications with 10,000 relatively cheap increment operations. I carefully converted the code to pointer notation and got this:

C++ Example of an Attempt to Tune Code to Sum the Elements in a Matrix

sum = 0; elementPointer = matrix; lastElementPointer = matrix[ rowCount - 1 ][ columnCount - 1 ] + 1; while ( elementPointer < lastElementPointer ) {    sum = sum + *elementPointer++; }

Further Reading

Jon Bentley reported a similar experience in which converting to pointers hurt performance by about 10 percent. The same conversion had in another setting improved performance more than 50 percent. See "Software Exploratorium: Writing Efficient C Programs" (Bentley 1991).

Even though the code wasn't as readable as the first code, especially to programmers who aren't C++ experts, I was magnificently pleased with myself. For a 100-by-100 matrix, I calculated that I had saved 10,000 multiplications and a lot of loop overhead. I was so pleased that I decided to measure the speed improvement, something I didn't always do back then, so that I could pat myself on the back more quantitatively.

Do you know what I found? No improvement whatsoever. Not with a 100-by-100 matrix. Not with a 10-by-10 matrix. Not with any size matrix. I was so disappointed that I dug into the assembly code generated by the compiler to see why my optimization hadn't worked. To my surprise, it turned out that I was not the first programmer who ever needed to iterate through the elements of an array the compiler's optimizer was already converting the array accesses to pointers. I learned that the only result of optimization you can usually be sure of without measuring performance is that you've made your code harder to read. If it's not worth measuring to know that it's more efficient, it's not worth sacrificing clarity for a performance gamble.

No programmer has ever been able to predict or analyze where performance bottlenecks are without data. No matter where you think it's going, you will be surprised to discover that it is going somewhere else.
Joseph M. Newcomer

Measurements Need to Be Precise

Performance measurements need to be precise. Timing your program with a stopwatch or by counting "one elephant, two elephant, three elephant" isn't precise. Profiling tools are useful, or you can use your system's clock and routines that record the elapsed times for computing operations.

Cross-Reference

For a discussion of profiling tools, see "Code Tuning" in Section 30.3.

Whether you use someone else's tool or write your own code to make the measurements, make sure that you're measuring only the execution time of the code you're tuning. Use the number of CPU clock ticks allocated to your program rather than the time of day. Otherwise, when the system switches from your program to another program, one of your routines will be penalized for the time spent executing another program. Likewise, try to factor out measurement overhead and program-startup overhead so that neither the original code nor the tuning attempt is unfairly penalized.

< Free Open Study >