15.1 Unit Testing for Existing Software

Very few teams get to start a software development project from scratch. A successful project normally spends most of its life span in maintenance mode. If we want to introduce a new technique such as unit testing, or even test-first development, into such a project, we have to deal with a few difficulties:

The existing code base was most likely developed without much thought for testability, which makes it difficult to add unit tests afterwards.
The refactoring required to improve testability (and the design) cannot be done safely without tests. But because poor testability is the very reason there are no tests, we are stuck in a chicken-and-egg problem.
The quality of the existing classes is unknown. Test cases that merely check the existing functionality, rather than the functionality actually desired, risk cementing errors in the program.
The older the code base, the higher the probability that it contains a lot of unused functionality. Naturally, we don't want to write test cases for such functionality.

So it is no wonder that Kent Beck [00a] considers testing one of the most frustrating areas for a development team migrating to XP. Beck recommends resisting the temptation to retrofit unit tests for all existing code. Depending on the project size, this would tie the team down for weeks or even months without delivering any additional functionality to the customer.

Another approach is more promising and, above all, less risky: new parts of the system are developed strictly test-first, while test cases for existing classes are added one by one as needed:

As soon as legacy code has to be changed, for example, because of a new requirement or a program error, unit tests are written around that code. From then on, it even becomes possible to refactor the legacy code.
Every time a problem emerges in legacy classes, no matter how small it may be or how easily it can be fixed, a corresponding test is added.
Whenever new code uses untested legacy classes, tests that check the functionality expected of the legacy code should be added first.
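The first recommendation can be illustrated with a minimal sketch. Everything in it is hypothetical: the class LegacyPriceCalculator, its method totalInCents, and the discount rule are invented for illustration and do not come from a real code base. To keep the listing self-contained, the checks are written in plain Java without a test framework; in a real project they would normally be test methods in a JUnit test case.

```java
// Hypothetical legacy class: stands in for untested code that now has to change.
class LegacyPriceCalculator {
    // Existing behavior: orders of ten or more items get 5% off (integer arithmetic).
    int totalInCents(int unitPriceInCents, int quantity) {
        int total = unitPriceInCents * quantity;
        if (quantity >= 10) {
            total = total - total / 20;
        }
        return total;
    }
}

// Tests written around the legacy code before it is touched.
// They record the behavior that callers rely on today.
class LegacyPriceCalculatorTest {

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static void testSmallOrderHasNoDiscount() {
        LegacyPriceCalculator calculator = new LegacyPriceCalculator();
        check(calculator.totalInCents(200, 9) == 1800,
              "nine items should cost the full price");
    }

    static void testDiscountStartsAtTenItems() {
        LegacyPriceCalculator calculator = new LegacyPriceCalculator();
        check(calculator.totalInCents(200, 10) == 1900,
              "ten items should be discounted by 5%");
    }

    static void runAll() {
        testSmallOrderHasNoDiscount();
        testDiscountStartsAtTenItems();
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("All legacy tests passed");
    }
}
```

Once such tests pass against the unchanged code, the new requirement or bug fix can be expressed as an additional failing test, and the legacy method can be changed or refactored with the existing tests as a safety net. Whether the recorded behavior is also the desired behavior still has to be decided case by case.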

If you follow these recommendations, the first thing you will notice is that development slows down. After some time, however, test coverage will reach a satisfactory level even for the old code base, and the differences between old and new code will shrink. Another positive effect is that exactly those parts of the program that are run or changed most often end up being the most heavily tested.

Testing around Legacy Code

As plausible as this step-by-step approach may sound, it has one big drawback: only in rare cases is writing unit tests around legacy code as easy as writing tests before the implementation. Often the old code base is in such bad shape that unit tests can be created only alongside concurrent refactoring. Michael Feathers [02a] discusses the issue and some techniques to tackle it in Working Effectively with Legacy Code.

It is, however, impossible to formulate a universal recipe for solving the problem in a specific case. Most cases require a sound mixture of caution and instinct to secure the most problematic parts of the legacy code with test cases and to make them accessible to refactoring or complete replacement. Nevertheless, there are a few proven heuristics: ^[1]

First of all, we have to identify the component around which we can tie our test cases. The better the object-oriented design of the old code, the easier it will be to identify a small, isolated unit. If, in the worst case, we cannot find a separable unit at all, we have to rely on a complete suite of function (or acceptance) tests.
Test cases should be written only for those parts of the identified component that are actually used in other parts of the program. When in doubt, a static code analysis can provide additional, though not absolute, certainty.
When creating test cases, we should initially concentrate on a unit's externally visible behavior. Existing design documents can be helpful here, but may point in the wrong direction if they are outdated.
Implementing fine-grained white-box unit tests for existing code is worthwhile only if that code is of high quality and in a stable condition, which is normally the case only after some refactoring work.
Refactoring without the safety net of unit tests is risky, so we should do it only in pairs. However, some restructuring steps are relatively unproblematic or can even be carried out by suitable tools, such as extracting a method, renaming a method, or deleting wrong comments. Here again, we should be careful to move forward only in very small steps.
Encapsulation of the old components into a facade facilitates unit testing for new system parts and allows subsequent replacement of this unit by a new development.
The easiest unit tests to write are usually those for the low-level functions of the old code base. Sometimes, we can conquer legacy code from below.
It may happen that existing code is in such a bad state that only complete rewriting promises lasting success. In this case, we should invest sufficient time and energy into the creation of acceptance tests and encapsulation into a facade.
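The facade heuristic from the list above can be sketched as follows. All names (LegacyCustomerDb, CustomerDirectory, GreetingService, and the stub) are hypothetical and chosen only for illustration; the record layout returned by the legacy component is likewise an assumption. The checks are again kept free of a test framework so that the listing stands on its own.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical legacy component with an awkward, hard-to-test API.
class LegacyCustomerDb {
    static String[] lookupRecord(int id) {
        // Imagine slow database or file access here.
        return new String[] { String.valueOf(id), "MILLER", "ANNA" };
    }
}

// Facade: the only view of the legacy component that new code gets to see.
interface CustomerDirectory {
    String fullNameOf(int customerId);
}

// Facade implementation that delegates to the old code.
// It can later be replaced by a newly developed implementation.
class LegacyCustomerDirectory implements CustomerDirectory {
    public String fullNameOf(int customerId) {
        String[] record = LegacyCustomerDb.lookupRecord(customerId);
        return capitalize(record[2]) + " " + capitalize(record[1]);
    }

    private static String capitalize(String s) {
        return s.substring(0, 1).toUpperCase(Locale.ROOT)
             + s.substring(1).toLowerCase(Locale.ROOT);
    }
}

// New system part: depends on the facade only, never on the legacy class.
class GreetingService {
    private final CustomerDirectory directory;

    GreetingService(CustomerDirectory directory) {
        this.directory = directory;
    }

    String greet(int customerId) {
        return "Hello, " + directory.fullNameOf(customerId) + "!";
    }
}

// Stub used in unit tests of the new code instead of the legacy component.
class StubCustomerDirectory implements CustomerDirectory {
    private final Map<Integer, String> names = new HashMap<>();

    void add(int customerId, String fullName) {
        names.put(customerId, fullName);
    }

    public String fullNameOf(int customerId) {
        return names.get(customerId);
    }
}
```

A unit test for GreetingService can now create a StubCustomerDirectory, register a customer, and check the greeting without ever touching the old code. The facade implementation itself is covered by a few coarser tests against the real legacy component, and these same tests remain valid if the unit behind the facade is later rewritten.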

Retrofitting unit tests is a tricky and challenging business, mainly because testability was not a criterion in the design of the code. In some cases this means that enormous effort is required to bring existing software into a permanently maintainable condition. In the worst case, frustration with after-the-fact testing causes the entire testing approach to collapse. These risks and investments should be weighed against the cost of a complete or partial new development. More often than initially expected, redevelopment from scratch turns out to be the cheaper variant.

A team that is nevertheless entrusted with the task of maintaining "testless" software should avoid one thing above all: simply diving straight in without first building a reassuring safety net of test cases. If the customer remains stubborn, insisting on "quick changes," then there is still the option of just not telling them about these indispensable quality measures.

^[1]The following summary corresponds essentially to discussions at [URL:YahooXP] and [URL:WikiUTFLC].