Prevention | Preventative Programming Techniques: Avoid and Correct Common Mistakes (Charles River Media Programming)


The problem is determining when the optimization is required. The simple answer is at the very end of development, but this is not enough information for practical purposes on a real-world project. We must come up with better metrics for when to optimize if we are to prevent ourselves from optimizing too early. Without guidelines, the temptation will always be there, and in reality there are some cases where optimizations will be needed before the end of the project. Even more common is the need to avoid coding practices that make later optimizations difficult.

The Simplest Technique: Do Not Optimize

The simplest prevention technique is not to optimize. Resist the temptations, and wait until the end of development. Then, only optimize to achieve your performance goals; otherwise, you will be wasting time and money when the application could be in use already. This is an easy goal to say, but much harder to realize. The best tool at your disposal for achieving this goal is discipline. Think about every algorithm in terms of flexibility and clarity, and then decide whether you have accidentally considered performance. If performance was a motivating factor, revisit the other options and see if one of them is clearer and easier to modify. When possible, it is also very useful to have input from another programmer who is not directly involved with the algorithm in question. Just make sure that the other programmer understands the importance of avoiding premature optimization, and that you are looking for clear and robust code as opposed to performance-oriented code.

There is an exception to the rule of no premature optimization, but it should only be used after very careful consideration. If it is not possible to run and test the application due to extremely low performance, the only option might be optimization. In the “Cures” section, we will talk more about how to properly optimize the code even when it has to be done early. The important thing to remember about this exception is to take it very seriously and to only optimize enough to continue development for a reasonable amount of time. Do not try to optimize up to performance specifications, or guess how much optimization is necessary to take you to the end of the project. Chances are that the guess will be wrong. Instead, estimate how much performance is necessary for another one to three months of development before optimization needs to be revisited. Often, this will last longer than expected, and you will not have wasted as much time on optimization that might be thrown away by the end of the project anyway. Again, always do this with great caution, or you could be creating problems for later development.

K.I.S.S. (Keep It Simple, Stupid)

Keep it simple, stupid. Not the most politically correct of statements, it nevertheless applies to many disciplines, including software engineering. Human beings have a penchant for making tasks more complicated than they need to be, and programmers are especially good at this. Therefore, while it might seem obvious that finding a simple solution is often the best course of action, it is necessary to keep reminding everyone of this to prevent straying from the proper course.

When it comes to optimization, keeping it simple means waiting until the end of development to add the complications inherent in most optimizations. Avoiding complicated code serves two purposes. The obvious effect is improved development speed and flexibility. Less obvious is the fact that simplicity can allow better optimizations than would otherwise be made. This follows from the fact that higher-level optimizations provide greater performance improvements on average than lower-level optimizations, and these higher-level optimizations are much easier to achieve if the underlying code is simple to understand and modify.

High-Level Languages

Computer languages create a large amount of controversy among software engineers, with many taking this almost to the level of a religion. While there are many valid arguments out there for this language or that language, one claim that never holds states that to truly optimize an application, it must be written in as low level a language as possible. This fallacy leads many to scorn the use of higher-level languages such as C++ and Java. Do not do this; you will be trading off development speed and flexibility for an uncertain gain in performance and a more likely outcome of a canceled project.

Assembly Only

One story involves a programmer evangelizing the use of assembly for the creation of applications. As proof, he offered a sound editing application that he had written. The application performed well and had a reasonable number of features. When he asked why more applications were not written in this manner, another programmer responded with a question of his own: “How long did it take to make this application?” The assembly programmer responded, “Eight years, why?” The simple fact was that an application that takes eight years to complete is extremely unlikely to be viable as a commercial application that will turn a profit. If your project has unlimited time and no concerns about money, you might be able to write it all in assembly, and even then, the complexity of writing assembly might prevent you from fully optimizing the application.

The complexity of modern computer architectures provides another motivation for using higher-level languages. Many modern processors have reasonably large instruction sets and odd constraints caused by pipelining, timing, and other quirks. It would be impossible to train everyone on a team in the particulars of each processor they are to work with, especially if the team moves from one platform to another on a regular basis. Higher-level languages solve this problem by allowing an expert in the architecture to write a compiler that uses the higher-level information to optimize the low-level instructions generated during compilation. This frees the rest of the development community to focus on issues specific to their applications. Along with the benefit of hiding the complexity of a processor’s architecture, the higher-level language allows the same code to be compiled on multiple architectures with only minor compatibility changes. With the common occurrence of multiplatform applications in recent years, this is a very important feature. Without this ability, it would be prohibitively expensive to develop for more than one platform.

Encapsulation and Abstraction

Surprisingly, one of the most important coding practices for proper optimization can have a negative impact on application speed. Encapsulation is the collection of related code into a single unit such as a class or module, which can then use abstraction to hide the details of the data and implementation behind an interface that is subjected to fewer changes than the internals (Figure 1.5). Object-oriented languages have language features that can enforce these design decisions. Even if the language you are using does not directly allow enforcement of this design, proper discipline and written standards can still be used to follow these practices. No matter which approach is taken, usually overhead is involved, primarily through the indirection that results from proper encapsulation and abstraction. If extra overhead results from using encapsulation and abstraction, how could these practices result in better optimization? The answer to this seemingly contradictory situation results from important optimization principles.

Figure 1.5: Relationship between encapsulation and abstraction.

You do not want to optimize until you have determined what would most benefit from optimization. Proper encapsulation assists in this by allowing profiling information to show the correct location of slow code. Profilers, discussed in more detail later, have their own restrictions on performance and information detail. Organizing the code into discrete units allows profiling efforts to be concentrated on the most likely targets for optimization.

Once an attempt at an optimization is made, the performance gain must be measured to determine if the desired result was achieved or if performance was actually reduced. Encapsulation allows for more focused optimizations and better tracking of the results of individual optimizations. Several different algorithms can be under consideration for an optimization, without a clear indication of which one would be most beneficial. Abstraction reduces the implementation time of testing multiple algorithms, thus providing an advantage in achieving the best optimization.

Optimizations should be saved until the end of development as much as possible, but you still want to make as many optimizations as possible when the optimization phase does come along. Abstraction allows optimizations to be made only on encapsulated data and implementation without affecting the remainder of the code base. By limiting the amount of code that is required to change, more optimizations can be made without large costs in development and testing time.
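As a minimal sketch of this idea, consider the hypothetical ScoreBoard class below (the class and its member names are illustrative assumptions, not taken from the text). Callers depend only on add() and best(), so the simple linear scan could later be replaced by a cached maximum or a heap without touching any calling code.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical example class: callers see only add() and best().
class ScoreBoard {
public:
    void add(std::string name, int score) {
        entries_.emplace_back(std::move(name), score);
    }

    // Returns the name with the highest score, or std::nullopt if empty.
    // Clear first version: scan every entry on each call.
    std::optional<std::string> best() const {
        if (entries_.empty()) return std::nullopt;
        auto it = std::max_element(
            entries_.begin(), entries_.end(),
            [](const Entry& a, const Entry& b) { return a.second < b.second; });
        return it->first;
    }

private:
    using Entry = std::pair<std::string, int>;
    // Hidden representation: free to change during the optimization phase.
    std::vector<Entry> entries_;
};
```

If profiling later shows best() to be a hot spot, only the private section and the method bodies need to change, which keeps both the development and the testing cost of the optimization small.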

Strategy Pattern

One of the most useful methods of encapsulation and abstraction for optimization purposes is the Strategy Design Pattern [GoF95]. The basic idea of the Strategy Pattern is to abstract the interface to an algorithm so that multiple algorithms for the same functionality can be encapsulated in interchangeable objects. This pattern is described in detail in the Design Patterns book by the Gang of Four, so here we will just examine how it applies to optimization.

Use the Strategy Pattern whenever possible for encapsulating any reasonably sized algorithm. This will aid greatly during optimization by allowing an algorithm to be switched by changing a single object creation call. By combining this with the Factory Pattern (Figure 1.6), which allows the creation of related objects without hard-coding the concrete class, several algorithms can be exchanged without recompiling to test for the one that provides optimal performance. Furthermore, by limiting the interaction to the interface of the Strategy, the risk from changing algorithms is greatly reduced.

Figure 1.6: Use the Factory Pattern to create different concrete Strategy instances for testing. Each concrete Strategy implements the abstract Strategy interface and is therefore interchangeable.
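The arrangement in Figure 1.6 might be sketched in C++ as follows. The names SortStrategy, InsertionSort, StdSort, and make_sort_strategy are hypothetical, chosen only for illustration, and sorting is just a convenient stand-in for any algorithm with more than one candidate implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Abstract Strategy: the only interface the rest of the code depends on.
class SortStrategy {
public:
    virtual ~SortStrategy() = default;
    virtual void sort(std::vector<int>& data) const = 0;
};

// Concrete Strategy 1: a simple insertion sort.
class InsertionSort : public SortStrategy {
public:
    void sort(std::vector<int>& data) const override {
        for (std::size_t i = 1; i < data.size(); ++i) {
            int key = data[i];
            std::size_t j = i;
            while (j > 0 && data[j - 1] > key) {
                data[j] = data[j - 1];
                --j;
            }
            data[j] = key;
        }
    }
};

// Concrete Strategy 2: delegate to the standard library.
class StdSort : public SortStrategy {
public:
    void sort(std::vector<int>& data) const override {
        std::sort(data.begin(), data.end());
    }
};

// Factory: maps a name (for example, read from a settings file or the
// command line) to a concrete Strategy. Returns nullptr for unknown names.
std::unique_ptr<SortStrategy> make_sort_strategy(const std::string& name) {
    if (name == "insertion") return std::make_unique<InsertionSort>();
    if (name == "std") return std::make_unique<StdSort>();
    return nullptr;
}
```

Because the name passed to the factory can come from outside the program, each candidate can be timed in turn without rebuilding; a caller would write something like make_sort_strategy(name)->sort(values) after checking for a null result.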

Another potential benefit of the Strategy Pattern is the ability to dynamically change the algorithm that is used for a particular operation. This allows for the possibility of using different algorithms based on cues from the input data set in real time. While this technique is limited in the number of places it can be used, without using the Strategy Pattern it would not be possible at all.
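One way such run-time selection could look is sketched below, using the size of the input as the cue. The SearchStrategy interface, its two implementations, and the choose_search function are hypothetical names, and the threshold of 16 is an arbitrary placeholder that would have to be tuned by profiling.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

class SearchStrategy {
public:
    virtual ~SearchStrategy() = default;
    virtual std::string name() const = 0;
    // Precondition: data is sorted in ascending order.
    virtual bool contains(const std::vector<int>& data, int key) const = 0;
};

// Scans every element; low overhead, reasonable for very small inputs.
class LinearSearch : public SearchStrategy {
public:
    std::string name() const override { return "linear"; }
    bool contains(const std::vector<int>& data, int key) const override {
        return std::find(data.begin(), data.end(), key) != data.end();
    }
};

// Halves the search range each step; relies on the sorted precondition.
class BinarySearch : public SearchStrategy {
public:
    std::string name() const override { return "binary"; }
    bool contains(const std::vector<int>& data, int key) const override {
        return std::binary_search(data.begin(), data.end(), key);
    }
};

// Picks a strategy at run time from a cue in the input: its size.
// The threshold of 16 is an assumption, not a measured value.
const SearchStrategy& choose_search(std::size_t size) {
    static const LinearSearch linear{};
    static const BinarySearch binary{};
    if (size < 16) return linear;
    return binary;
}
```

The caller never names a concrete class: choose_search(data.size()).contains(data, key) works for either choice, so the selection rule can be refined later without affecting the rest of the code.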

Editors: Tools of the Trade

Earlier we discussed a form of optimization that should never be performed: optimizing the amount of text in the source code. While the reasons for avoiding this should be obvious, many programmers still fall into this trap because of the common human failing called laziness. Certainly, no one wants to do a lot of extra work that is unnecessary, but what appears to be laziness merely creates more work for us later.

Fortunately, a solution now exists that will allow us to maintain our laziness with only a small amount of up-front cost. This solution is to find one of the most important tools of the modern programmer, the editor, and learn how to use it to increase your code writing efficiency without sacrificing clarity. Here we will discuss several tools available in most of the powerful code editors available and how to take proper advantage of them.

One of the largest timesaving features that an editor designed specifically for your language of choice can offer is auto-completion of names. Auto-completion allows you to type only a portion of a name or statement and then press a single key or key combination to complete the rest of the name or statement. Every year, the accuracy and speed of auto-completion technology improve, leading to less and less typing while allowing longer and more descriptive names to be used. Pop-up lists are usually available to help you look for a name if you do not remember it exactly. Pop-up comments can further help to remind you of the purpose of a code element. These features are also becoming increasingly context sensitive, allowing the choices to be narrowed after fewer characters have been provided. One particular product, Visual Assist from Whole Tomato Software, even uses an intelligent guessing algorithm to provide a best guess before all ambiguities are resolved (Figure 1.7). Using this, you often only have to type one or two characters before pressing the Auto-Completion key.

Figure 1.7: After typing only m, Visual Assist guesses from the surrounding context that m_ClampIndex is what the user wants. The user only needs to press Tab to complete the entire word.

Beyond simply using this feature, you can improve upon its efficiency with careful coding standards that help the editor resolve ambiguities faster. Try to differentiate the beginning of names as much as possible so that you can type fewer characters to reach a unique name. Choosing a similar prefix for related names can also ease recall, especially if a pop-up list is available for your editor. Finally, if your language supports namespaces, placing names in the appropriate namespaces can reduce the clutter of names to choose from in a particular context. Figure 1.8 demonstrates how namespaces combined with auto-completion can save typing. If you are working in a team environment, an official naming convention is recommended to enforce this across the entire code base.

Figure 1.8: Using distinct namespaces and proper naming conventions, utility::example_registry::instance().run(); can be entered into Visual Assist by typing only u<Tab>::↓↓<Tab>::i<Tab>.run();
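The code behind the call in Figure 1.8 is not shown, but a layout along the following lines would produce it. Only the names utility, example_registry, instance, and run come from the caption; the add member, the return value of run, and the graphics namespace are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace utility {

// Hypothetical registry; only its name, instance(), and run() come
// from the figure caption.
class example_registry {
public:
    // Single shared instance, created on first use.
    static example_registry& instance() {
        static example_registry registry;
        return registry;
    }

    void add(const std::string& name) { names_.push_back(name); }

    // Stand-in for real work: reports how many examples are registered.
    std::size_t run() const { return names_.size(); }

private:
    example_registry() = default;
    std::vector<std::string> names_;
};

}  // namespace utility

namespace graphics {

// Unrelated names live in their own namespace, so they do not appear
// in the completion list offered after typing "utility::".
class example_renderer {};

}  // namespace graphics
```

Once utility:: has been typed, the editor only has to choose among the names in that namespace, which is why so few keystrokes are enough to select example_registry.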

Some editors also offer a feature that allows you to create your own shortcut character sequences that can then be used with the Auto-Completion key to provide user-specific code templates (Figure 1.9). This can save a lot of typing if you place your most commonly used statements in these templates and create shortcuts that are meaningful to you. Be careful to choose shortcuts that will not interfere with other auto-completion names in your regular code base. File templates are also available in many editors to simplify creating new files and classes with less typing. These can be set to follow the coding standard of the team with correct copyright information and other standard layout conventions (Figure 1.10).

Figure 1.9: Special auto-completion sequences in IntelliJ’s IDEA editor. Typing itco followed by the key sequence for auto-completion expands to the text currently in the edit box, where text surrounded by $ has special meaning.

Figure 1.10: File template in IntelliJ’s IDEA editor that places standard copyright information and file creation information in a comment at the top of new Java files.

There are many other features in specific editors that can help with code entry. Even if a feature is not available, all the major editors offer some form of extension language for adding your own functionality to the editor: Microsoft Visual C++ uses Visual Basic, SlickEdit uses Slick-C, and Emacs uses a form of LISP. Spending some extra effort writing utilities to perform common functions in your editor can save large amounts of production time throughout the current project and into future projects.
