25.5. Iteration

 < Free Open Study > 

Once you've identified a performance bottleneck, you'll be amazed at how much you can improve performance by code tuning. You'll rarely get a 10-fold improvement from one technique, but you can effectively combine techniques; so keep trying, even after you find one that works.

I once wrote a software implementation of the Data Encryption Standard (DES). Actually, I didn't write it once I wrote it about 30 times. Encryption according to DES encodes digital data so that it can't be unscrambled without a password. The encryption algorithm is so convoluted that it seems like it's been used on itself. The performance goal for my DES implementation was to encrypt an 18K file in 37 seconds on an original IBM PC. My first implementation executed in 21 minutes and 40 seconds, so I had a long row to hoe.

Even though most individual optimizations were small, cumulatively they were significant. To judge from the percentage improvements, no three or even four optimizations would have met my performance goal. But the final combination was effective. The moral of the story is that if you dig deep enough, you can make some surprising gains.

The code tuning I did in this case is the most aggressive code tuning I've ever done. At the same time, the final code is the most unreadable, unmaintainable code I've ever written. The initial algorithm is complicated. The code resulting from the high-level language transformation was barely readable. The translation to assembler produced a single 500-line routine that I'm afraid to look at. In general, this relationship between code tuning and code quality holds true. Here's a table that shows a history of the optimizations:

Optimization

Benchmark Time

Improvement

Implement initially straightforward

21:40

Convert from bit fields to arrays

7:30

65%

Unroll innermost for loop

6:00

20%

Remove final permutation

5:24

10%

Combine two variables

5:06

5%

Use a logical identity to combine the first two steps of the DES algorithm

4:30

12%

Make two variables share the same memory to reduce data shuttling in inner loop

3:36

20%

Make two variables share the same memory to reduce data shuttling in outer loop

3:09

13%

Unfold all loops and use literal array subscripts

1:36

49%

Remove routine calls and put all the code in line

0:45

53%

Rewrite the whole routine in assembler

0:22

51%

Final

0:22

98%

Note: The steady progress of optimizations in this table doesn't imply that all optimizations work. I haven't shown all the things I tried that doubled the run time. At least two-thirds of the optimizations I tried didn't work.


Cross-Reference

The techniques listed in this table are described in Chapter 26, "Code-Tuning Techniques."


 < Free Open Study > 


Code Complete
Code Complete: A Practical Handbook of Software Construction, Second Edition
ISBN: 0735619670
EAN: 2147483647
Year: 2003
Pages: 334

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net