Flylib.com

Books Software

 
 
 

Chapter 12: Optimization


Chapter 12: Optimization

Optimization is the process of making your application run more effectively. You can optimize for many things - speed, memory space usage, disk space usage, etc. This chapter, however, focuses on speed optimization.

When to Optimize

It is better to not optimize at all than to optimize too soon. When you optimize, your code generally becomes less clear, because it becomes more complex. Readers of your code will have more trouble discovering why you did what you did which will increase the cost of maintenance of your project. Even when you know how and why your program runs the way it does, optimized code is harder to debug and extend. It slows the development process down considerably, both because of the time it takes to optimize the code, and the time it takes to modify your optimized code.

Compounding this problem is that you don't even know beforehand where the speed issues in your program will be. Even experienced programmers have trouble predicting which parts of the program will be the bottlenecks which need optimization, so you will probably end up wasting your time optimizing the wrong parts . the Section called Where to Optimize will discuss how to find the parts of your program that need optimization.

While you develop your program, you need to have the following priorities:

  • Everything is documented

  • Everything works as documented

  • The code is written in an modular, easily modifiable form

{% if main.adsdop %}{% include 'adsenceinline.tpl' %}{% endif %}

Documentation is essential, especially when working in groups. The proper functioning of the program is essential. You'll notice application speed was not anywhere on that list. Optimization is not necessary during early development for the following reasons:

  • Minor speed problems can be usually solved through hardware, which is often much cheaper than a programmer's time.

  • Your application will change dramatically as you revise it, therefore wasting most of your efforts to optimize it. [1]

  • Speed problems are usually localized in a few places in your code - finding these is difficult before you have most of the program finished.

Therefore, the time to optimize is toward the end of development, when you have determined that your correct code actually has performance problems.

In a web-based e-commerce project I was involved in, I focused entirely on correctness. This was much to the dismay of my colleagues, who were worried about the fact that each page took twelve seconds to process before it ever started loading (most web pages process in under a second). However, I was determined to make it the right way first, and put optimization as a last priority. When the code was finally correct after 3 months of work, it took only three days to find and eliminate the bottlenecks, bringing the average processing time under a quarter of a second. By focusing on the correct order, I was able to finish a project that was both correct and efficient.

[1] Many new projects often have a first code base which is completely rewritten as developers learn more about the problem they are trying to solve. Any optimization done on the first codebase is completely wasted .



Where to Optimize

Once you have determined that you have a performance issue you need to determine where in the code the problems occur. You can do this by running a profiler. A profiler is a program that will let you run your program, and it will tell you how much time is spent in each function, and how many times they are run. gprof is the standard GNU/Linux profiling tool, but a discussion of using profilers is outside the scope of this text. After running a profiler, you can determine which functions are called the most or have the most time spent in them. These are the ones you should focus your optimization efforts on.

If a program only spends 1 % of its time in a given function, then no matter how much you speed it up you will only achieve a maximum of a 1% overall speed improvement. However, if a program spends 20% of its time in a given function, then even minor improvements to that functions speed will be noticeable. Therefore, profiling gives you the information you need to make good choices about where to spend your programming time.

In order to optimize functions, you need to understand in what ways they are being called and used. The more you know about how and when a function is called, the better position you will be in to optimize it appropriately.

There are two main categories of optimization - local optimizations and global optimizations. Local optimizations consist of optimizations that are either hardware specific - such as the fastest way to perform a given computation - or program-specific - such as making a specific piece of code perform the best for the most often-occuring case. Global optimization consist of optimizations which are structural. For example, if you were trying to find the best way for three people in different cities to meet in St. Louis, a local optimization would be finding a better road to get there, while a global optimization would be to decide to teleconference instead of meeting in person. Global optimization often involves restructuring code to avoid performance problems, rather than trying to find the best way through them.