18.2. Unit Testing Concepts

Before we dive into the details of testing Java code using JUnit and testing J2EE components using Cactus, let's get acquainted with the concepts and issues underlying unit testing in general.

18.2.1. Units and Dependencies

The most obvious question to ask about unit testing is, "What's a unit?" The most conservative definition of a testable "unit" is an entirely independent module of code that can be tested autonomously, separate from any other code modules. The tangible interpretation of this idealistic definition depends on the programming environment being used. In function-based programming languages, such as C, Perl, and Fortran, a unit is typically defined as a single function. Similarly, a unit in object-oriented languages like C++, Java, and C# would normally be a class.

Of course, achieving this goal of a totally independent unit of code is not entirely realistic. Virtually any code module depends on other code modules to function properly. Functional decomposition encourages algorithms to be broken down into subfunctions that are invoked by higher-order functions to solve a problem. Object-oriented design has the analogous principle of delegation of responsibilities to subordinate classes used by higher-order classes to implement particular behaviors. These common divide-and-conquer approaches to software architecture naturally lead to interdependencies among code elements.

Given this fact, achieving the goal of unit testing (i.e., verifying the valid operation of each unit in the system) takes some forethought and planning in terms of defining your test suite. The dependencies between units need to be understood at a design level, and your tests need to be crafted to exercise low-level units first and then progress to higher-level units. This "reverse pyramid" approach to sequencing your tests will help to identify base issues early in a test suite and provide a basic sense of progress in terms of system stability (the farther you go without failing a test, the more stable the system, in a loose sense).
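
With JUnit (covered in detail later in this chapter), one way to sketch this ordering is a composite test suite that registers tests for the lowest-level units first and works up from there. The test classes named here are hypothetical placeholders for tests of progressively higher-level units.

import junit.framework.Test;
import junit.framework.TestSuite;

// A suite that exercises low-level units before the higher-level units
// that depend on them. All the test classes here are hypothetical.
public class OrderProcessingTestSuite {
    public static Test suite() {
        TestSuite suite = new TestSuite("Order processing tests");
        // Base utilities first: nothing else can work if these fail
        suite.addTestSuite(PriceCalculatorTest.class);
        suite.addTestSuite(TaxTableTest.class);
        // Then the units that build on them
        suite.addTestSuite(ShoppingCartTest.class);
        // And finally the highest-level unit
        suite.addTestSuite(CheckoutServiceTest.class);
        return suite;
    }
}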

18.2.2. Unit Testing Versus Integration Testing

It's important to note at this point the distinction between unit testing and integration testing. While unit testing strives to verify the correct operation of individual units of code in isolation, integration testing attempts to verify the correct operation of multiple units working together in a cooperative fashion, integrated together as they would be in the real runtime environment. In the remainder of this tutorial chapter on unit testing, we will occasionally stray into the realm of integration testing with some of our examples. We'd ask the testing experts reading this to forgive us for this breach of orthodoxy, and we'd ask those who are new to unit testing to remember that the distinction between unit and integration testing is important and to keep it in mind when designing tests.

18.2.3. White Box Versus Black Box Testing

It's important to recognize what kind of testing you're attempting when defining a specific test or even a suite of tests. Black box testing, as mentioned earlier, refers to the testing of the functional specifications of a unit of code from the standpoint of an external interface specification. White box testing refers to validating the functionality of a unit from the standpoint of its internal implementation.

To demonstrate the difference, imagine we are testing a shopping cart module built into an online commerce system. The shopping cart has well-specified behavior that it must implement: items that are added stay in the cart until removed, total prices add up correctly, taxable and nontaxable items are reflected in the overall price, and so forth. In our implementation, we happen to know that the cart is implemented using an efficient list structure that allocates space for items in blocks of 25. Once 25 items are added to the cart, space for an additional 25 items is added to the list structure. When those are filled, space for another 25 items is added, and so on.

If we were writing black box tests for our shopping cart, we would simply encode the behavioral specifications into tests that exercise that behavior and check for expected results. Our tests would do things like add an item and verify that it is recorded in the shopping cart, add various combinations of types of items, and check that the number and price of the items in the cart are reported correctly.
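
A black box test class for this cart might look something like the following sketch. The ShoppingCart and Item classes and their methods (addItem, getItemCount, getTotalPrice) are assumed here purely for illustration.

import junit.framework.TestCase;

// Black box tests written purely against the cart's specified behavior.
// The ShoppingCart and Item classes are hypothetical.
public class ShoppingCartBlackBoxTest extends TestCase {

    public void testAddedItemStaysInCart() {
        ShoppingCart cart = new ShoppingCart();
        cart.addItem(new Item("Widget", 9.99, false));
        assertEquals(1, cart.getItemCount());
    }

    public void testTotalAddsUpForNontaxableItems() {
        ShoppingCart cart = new ShoppingCart();
        cart.addItem(new Item("Book", 20.00, false));
        cart.addItem(new Item("Pen", 2.00, false));
        assertEquals(22.00, cart.getTotalPrice(), 0.001);
    }

    public void testTaxableItemIncreasesTotal() {
        ShoppingCart cart = new ShoppingCart();
        cart.addItem(new Item("Book", 20.00, false));   // nontaxable
        cart.addItem(new Item("Gadget", 10.00, true));  // taxable
        // The total should be the list prices plus tax on the taxable item
        assertTrue(cart.getTotalPrice() > 30.00);
    }
}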

If we were engaging in white box testing, we would focus on the internal implementation details of the variable-sized list structure and verify that the shopping cart behaves correctly when the list's boundary conditions are encountered. Our white box tests would do things like add 25 items, then add the 26th item and verify that it is recorded correctly, then remove an item so that the item count falls below 25 again, and verify that the cart data is intact after the extra block of list items is removed internally.
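
A corresponding white box test class might look like this sketch, which deliberately targets the 25-item block boundary described above. Again, the ShoppingCart and Item classes and the removeItem method are assumed for illustration.

import junit.framework.TestCase;

// White box tests tied to the cart's internal list structure, which (in the
// hypothetical implementation described above) grows in blocks of 25 items.
public class ShoppingCartWhiteBoxTest extends TestCase {

    private static final int BLOCK_SIZE = 25;

    public void testItemRecordedAcrossBlockBoundary() {
        ShoppingCart cart = new ShoppingCart();
        for (int i = 0; i < BLOCK_SIZE; i++) {
            cart.addItem(new Item("Item " + i, 1.00, false));
        }
        // The 26th item forces the cart to allocate a second block internally
        cart.addItem(new Item("Item " + BLOCK_SIZE, 1.00, false));
        assertEquals(BLOCK_SIZE + 1, cart.getItemCount());
    }

    public void testCartIntactAfterFallingBelowBlockBoundary() {
        ShoppingCart cart = new ShoppingCart();
        for (int i = 0; i <= BLOCK_SIZE; i++) {           // 26 items: two blocks
            cart.addItem(new Item("Item " + i, 1.00, false));
        }
        // Remove items until the count falls below 25, so the extra
        // block can be released internally
        cart.removeItem("Item " + BLOCK_SIZE);
        cart.removeItem("Item " + (BLOCK_SIZE - 1));
        assertEquals(BLOCK_SIZE - 1, cart.getItemCount());
        assertEquals(24.00, cart.getTotalPrice(), 0.001);
    }
}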

In practical testing scenarios, you will find yourself creating a mix of black box and white box tests, sometimes referred to as gray box testing. An initial suite of tests may be strictly based on the written specifications for a unit and therefore contain only black box tests. Later, as issues are encountered with the implementation of the unit, they will be fixed and then tests will be added to the test suite to verify that these issues do not return. This is referred to as regression testing, ensuring that future changes to the code do not cause the system to regress and exhibit prior bad behavior. In the end, the test suite for a unit will typically contain both black box and white box tests. There's no problem with this as long as you're careful to track which white box tests are coupled with specific implementation details of the code module. If the implementation details change, you want to be able to easily review these tests to ensure that they are still relevant, to prune tests that are no longer relevant, and to add new white box tests specific to the new implementation, if needed.
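
As a small sketch of this idea, a regression test typically pins down one specific, previously fixed defect so that it cannot quietly reappear. The bug described below, and the cart API it uses, are purely hypothetical.

import junit.framework.TestCase;

// Regression test pinning a hypothetical, previously fixed bug: removing the
// last item from the cart used to leave a stale total price behind.
public class ShoppingCartRegressionTest extends TestCase {

    public void testRemovingLastItemResetsTotal() {
        ShoppingCart cart = new ShoppingCart();
        cart.addItem(new Item("Widget", 9.99, false));
        cart.removeItem("Widget");
        // Before the fix, getTotalPrice() still reflected the removed item
        assertEquals(0, cart.getItemCount());
        assertEquals(0.00, cart.getTotalPrice(), 0.001);
    }
}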

18.2.4. Keep the Tests Simple

If you haven't written tests before (and if you haven't divined it from the discussion so far), you'll soon discover the main dilemma with testing: it involves more work. On top of sorting out the requirements, designing the architecture, and writing the code, there's the additional effort of writing tests for the code.[1] The benefits of testing, as discussed earlier in this chapter, are conceptually obvious but sometimes difficult to remember when you're faced with a deadline.

[1] Before the software engineering pundits cry foul, please note that there is no implied order to these steps in the process. I'm not suggesting that testing must follow coding (to the ire of XP advocates everywhere) nor am I assuming a waterfall approach to software development (to the ire of nearly all software engineers).

We mention this here to underscore the value of testing (short-term additional effort leads to big long-term payoffs) but also to emphasize the importance of keeping your tests as simple as possible in terms of the structure of the code that drives the tests. You want to maximize the benefits of testing while minimizing the overhead involved in writing the tests. An overengineered test suite with an excess of support code will take time to write and more time to extend in the future. And each new engineer who comes along will need to take some time to learn how the whole test structure works, on top of figuring out how the actual application code works. At best, you'll be spending more time on overhead and less time writing application code, but an even worse possibility is that you and your team members will neglect the unit tests because of frustration or the squeeze of deadlines. Simple tests obviate these problems to a large degree.

Simple tests also help keep your test results clean and unambiguous. You want to avoid complicating the picture when it comes to evaluating the test results. If there's a lot of logic tied up in the code that drives the tests, you run the risk of bugs in the test code masking bugs in the application code being tested.

18.2.5. Test Coverage

There's an adage in software engineering that states, "If you didn't test it, it doesn't work." In essence, this implies that you should assume that any code that hasn't been tested will fail at some point. This principle leads you naturally to the issue of determining what code has been tested and what code hasn't. In a real-world application, even one of moderate size, it can be difficult to assess test coverage. There are tools available that can help; some insert themselves into the runtime environment, for example, monitor what code is executed during a given test run, and report on coverage in the source code. Good, understandable, modular architectures also help enormously in terms of managing the coverage of test suites. A good object-oriented design will have well-defined module interfaces, clean delegation of responsibilities among the modules, and obvious dependencies between the modules, all of which help when defining a test suite that exercises the full expected functionality of the code under test.

Having said all this, it's also important to be realistic about test coverage. It's unreasonable to expect to have 100% coverage of your code in your test suite. A software unit, whether it's a function, an object, a component, or a web service, is made up of code with many possible logical pathways. The code takes input data, does work, and generates some output data. In order to exhaustively test a given code unit, you would need to provide it with every possible permutation of input data and invoke the unit so that every single possible logical pathway was traversed and, therefore, every possible output was generated. Not only is it unreasonable to attempt this for every unit in your system, it's also not terribly efficient, because of all the possible permutations of data inputs and code paths, only a subset (often a small subset) is relevant to the real usage of the unit in practice. Determining this critical subset is a context-dependent task, but it's an important element of effective unit testing.
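
For instance, rather than enumerating every possible combination of prices and item types for the shopping cart, a test can exercise a small set of representative inputs: zero-price, typical, and unusually large values, both taxable and nontaxable. The particular values in this sketch are illustrative only, and the cart API is again assumed.

import junit.framework.TestCase;

// Instead of every permutation of price and taxability, exercise a small,
// representative subset of inputs drawn from the cart's expected real usage.
public class ShoppingCartRepresentativeInputTest extends TestCase {

    public void testRepresentativePricePoints() {
        double[] prices   = { 0.00, 9.99, 10000.00 };   // zero, typical, large
        boolean[] taxable = { false, true };

        for (int p = 0; p < prices.length; p++) {
            for (int t = 0; t < taxable.length; t++) {
                ShoppingCart cart = new ShoppingCart();
                cart.addItem(new Item("Sample", prices[p], taxable[t]));
                assertEquals(1, cart.getItemCount());
                // The total should never be less than the item's list price
                assertTrue(cart.getTotalPrice() >= prices[p]);
            }
        }
    }
}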

Also in the category of test coverage is the management of evolving code. Once you accept the goal of a test suite that covers the critical functionality of your code, you then have to accept the reality that the test suite must evolve to match the code. If new functionality is added, tests for it need to be added as well. If functionality is changed or removed, tests need to be altered or removed to suit.


