3.2 Dependencies

So far, in specifying our example program, we have ignored the part that deals with the dictionary file. Considering that the file format is not precisely specified, we must define one ourselves. Inspired by the numerous Java property files in other projects, we decide that each translation entry should be written to a separate line in the following form:^[3]

 <German word>=<translation>

Multiple entries are allowed. The first attempt at a test looks like this:

 public void testSimpleFile() {
    dict = new Dictionary("C:\\temp\\simple.dic");
    assertTrue(! dict.isEmpty());
 }

But now we find that we have made ourselves dependent on the content of an external file. We could delete this file at the beginning of the test case and create a new one with the desired content. However, this would still leave an undesirable dependency on a file path and on platform-specific details. The solution is to accept any java.io.InputStream instance instead of a file; a file can later be wrapped in such a stream without effort. This means that our test becomes independent of files:

 import java.io.*;
 ...
 public void testTwoTranslationsFromStream() {
    String dictText = "Buch=book\n" + "Auto=car";
    InputStream in = new StringBufferInputStream(dictText);
    dict = new Dictionary(in);
    assertTrue(! dict.isEmpty());
 }

At this point, we leave it up to our readers as a practical exercise to find the simplest way to the green bar.^[4] Instead, we continue adding more assertions to the test:

 public void testTwoTranslationsFromStream() {
    String dictText = "Buch=book\n" + "Auto=car";
    InputStream in = new StringBufferInputStream(dictText);
    dict = new Dictionary(in);
    assertFalse("dict not empty", dict.isEmpty());
    assertEquals("translation Buch", "book",
       dict.getTranslation("Buch"));
    assertEquals("translation Auto", "car",
       dict.getTranslation("Auto"));
 }

There is one somewhat unpleasant detail here: the StringBufferInputStream class has been deprecated since JDK 1.1. But for the time being we are short of a better idea and have to live with this little annoyance. Remembering the close relationship of our file format with Java property files leads to a simple implementation:

 import java.util.*;
 import java.io.*;
 public class Dictionary {
    ...
    public Dictionary(InputStream in) throws IOException {
       this.readTranslations(in);
    }
    private void readTranslations(InputStream in) throws
       IOException {
       Properties props = new Properties();
       props.load(in);
       Iterator i = props.keySet().iterator();
       while (i.hasNext()) {
          String german = (String) i.next();
          String trans = props.getProperty(german);
          this.addTranslation(german, trans);
       }
    }
 }

Because the constructor declares "throws IOException", every test method that calls it must declare the exception as well. This is our way of entrusting JUnit with the exception handling: if the exception occurs, JUnit registers it as an error.

Starting from the existing tests, we build another one along the same lines:

 public void testTranslationsWithTwoEntriesFromStream()
       throws IOException {
    String dictText = "Buch=book\n" + "Buch=volume";
    InputStream in = new StringBufferInputStream(dictText);
    dict = new Dictionary(in);
    String trans = dict.getTranslation("Buch");
    assertEquals("book, volume", trans);
 }

At this point, it suddenly becomes obvious that implementing the dictionary with the Properties class has led us to a dead end: if a key occurs more than once, the load(InputStream) method keeps only the last entry and silently overwrites the earlier ones, whereas our specification states that all potential translations of a word should be read from the file. In addition, the Properties class behaves undesirably in another respect: it treats the hash character (#) at the beginning of a line as the start of a comment.
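The two Properties pitfalls can be reproduced in a few lines. The following is only a sketch; the class name PropertiesPitfalls and its helper method are made up for illustration, and a ByteArrayInputStream stands in for the deprecated StringBufferInputStream:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class: shows why java.util.Properties does not fit
// the dictionary file format described in the text.
class PropertiesPitfalls {

    // Loads dictionary text through Properties.load(InputStream).
    static Properties load(String dictText) {
        // Properties.load(InputStream) expects ISO 8859-1 encoded bytes.
        InputStream in = new ByteArrayInputStream(
                dictText.getBytes(StandardCharsets.ISO_8859_1));
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties duplicates = load("Buch=book\nBuch=volume");
        // Only one entry survives: the later line replaces the earlier one.
        System.out.println(duplicates.size());
        System.out.println(duplicates.getProperty("Buch"));

        Properties withHash = load("#Auto=car\nBuch=book");
        // The line starting with '#' is read as a comment, not as an entry.
        System.out.println(withHash.getProperty("Auto"));
        System.out.println(withHash.getProperty("Buch"));
    }
}
```

With duplicate keys the first translation is lost, and an entry whose line starts with # never reaches the dictionary at all.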

We begin to realize that parsing the dictionary file really is a complex matter, so we eventually decide to move this functionality to a separate class, DictionaryParser. This decision is strongly backed up by an important object-oriented design heuristic: the Single Responsibility Principle [Martin02b]. This rule says that a single class should have one and only one responsibility in order to facilitate independent changes and to reduce coupling.

We have arrived at a point where we cannot or do not want to carry the current test to a successful end without doing some restructuring work first. This is the right point in time to take a step back. We temporarily remove the open test case from the test suite^[5] to be able to refactor with all tests running. A test case is called open when the behavior it specifies has not been (fully) implemented. In this example we want to move the parsing of InputStream objects to a class called DictionaryParser; the interface of this class allows iteration over all translation entries of a stream. And while restructuring things, we replace java.io.InputStream with java.io.Reader; this class is better suited for reading text because it also takes care of the correct conversion between bytes and characters.
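To make the remark about byte-to-character conversion concrete, here is a small sketch. The class ReaderVersusStream and its methods are hypothetical and not part of the example program. It assumes that code written against Reader never sees bytes, and that the decision about the character set is made exactly once, where a byte stream is wrapped:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a Reader delivers characters, so the parsing
// code does not need to know how the text was encoded.
class ReaderVersusStream {

    // Reads the first line from any Reader.
    static String firstLine(Reader reader) {
        try {
            return new BufferedReader(reader).readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Wraps a byte stream in a Reader; the charset must match the one
    // that was used when the bytes were written.
    static Reader asReader(InputStream in, Charset charset) {
        return new InputStreamReader(in, charset);
    }

    public static void main(String[] args) {
        String entry = "\u00dcbung=exercise"; // starts with U umlaut

        // In a test, a StringReader supplies characters directly.
        System.out.println(entry.equals(firstLine(new StringReader(entry))));

        byte[] utf8 = entry.getBytes(StandardCharsets.UTF_8);
        // Decoding with the matching charset restores the text.
        System.out.println(entry.equals(firstLine(
                asReader(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8))));
        // Decoding with a different charset garbles the umlaut.
        System.out.println(entry.equals(firstLine(
                asReader(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1))));
    }
}
```

The parser can therefore be tested with a StringReader, while production code is free to wrap a file or network stream with whatever character set applies.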

To improve our readers' independent unit testing capabilities, we do not describe every single refactoring and testing step in detail at this point, but limit ourselves to presenting the result: test suites and implementation. First the test class:

 import java.io.*;
 import junit.framework.TestCase;
 public class DictionaryParserTest extends TestCase {
    public DictionaryParserTest(String name) {...}
    private DictionaryParser parser;
    // Builds a parser on top of an in-memory Reader.
    private DictionaryParser createParser(String dictText)
       throws IOException {
       Reader reader = new StringReader(dictText);
       return new DictionaryParser(reader);
    }
    // Advances the parser by one entry and checks both parts of it.
    private void assertNextTranslation(String german, String trans)
       throws Exception {
       assertTrue(parser.hasNextTranslation());
       parser.nextTranslation();
       assertEquals(german, parser.currentGermanWord());
       assertEquals(trans, parser.currentTranslation());
    }
    public void testEmptyReader() throws Exception {
       parser = this.createParser("");
       assertFalse(parser.hasNextTranslation());
    }
    public void testOneLine() throws Exception {
       String dictText = "Buch=book";
       parser = this.createParser(dictText);
       this.assertNextTranslation("Buch", "book");
       assertFalse(parser.hasNextTranslation());
    }
    public void testThreeLines() throws Exception {
       String dictText = "Buch=book\n" +
          "Auto=car\n" +
          "Buch=volume";
       parser = this.createParser(dictText);
       this.assertNextTranslation("Buch", "book");
       this.assertNextTranslation("Auto", "car");
       this.assertNextTranslation("Buch", "volume");
       assertFalse(parser.hasNextTranslation());
    }
 }

We can see that we also avoided code duplication in the test code and moved common functionality to private methods. In general, test classes are also part of the system, so they should follow the same principles of simplicity and readability. A few more things got changed:

We got rid of the deprecated class StringBufferInputStream and replaced it with StringReader.
We changed all throws-clauses so that they now throw a generic Exception instead of the more specific IOException. That way, no essential information gets lost and test case maintenance is simplified since we won't have to change the throws clause any more.

And now the corresponding implementation:

 import java.io.*;
 public class DictionaryParser {
    private BufferedReader reader;
    private String nextLine;
    private String currentGermanWord;
    private String currentTranslation;
    public DictionaryParser(Reader unbufferedReader) throws
       IOException {
       reader = new BufferedReader(unbufferedReader);
       this.readNextLine();
    }
    public String currentTranslation() {
       return currentTranslation;
    }
    public String currentGermanWord() {
       return currentGermanWord;
    }
    public boolean hasNextTranslation() {
       return nextLine != null;
    }
    public void nextTranslation() throws IOException {
       int index = nextLine.indexOf('=');
       currentGermanWord = nextLine.substring(0, index);
       currentTranslation = nextLine.substring(index + 1);
       this.readNextLine();
    }
    private void readNextLine() throws IOException {
       nextLine = reader.readLine();
    }
 }

The interface of the parser class is iterator-like. Each programmer will probably arrive at a different result as to how the best DictionaryParser interface should look. But this is not a major problem because the tests document the interface for other developers, and refactoring can be done in future changes if necessary. All that's left to do now is to integrate the parser into the dictionary:

 public class Dictionary {
    ...
    private void readTranslations(Reader reader)
       throws IOException {
       DictionaryParser parser = new DictionaryParser(reader);
       while (parser.hasNextTranslation()) {
          parser.nextTranslation();
          String german = parser.currentGermanWord();
          String trans = parser.currentTranslation();
          this.addTranslation(german, trans);
       }
    }
 }

Now we are ready to reactivate the test testTranslationsWithTwoEntriesFromStream() and rename it to testTranslationsWithTwoEntriesFromReader(). And, well, it runs perfectly. We could now implement another constructor, Dictionary(String filename), if desired. This is yet another good exercise that we leave for the ambitious reader to solve ;-)
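For readers who want to compare notes afterwards, the file-name constructor could take roughly the following shape. This is a sketch under several assumptions, not the book's solution: the class FileDictionarySketch is a hypothetical stand-in for Dictionary, the parsing loop is inlined instead of delegated to DictionaryParser so that the block stands on its own, and the storage of multiple translations as a string joined by ", " is guessed from the expected value in the test above.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the Dictionary class of the text.
class FileDictionarySketch {
    // German word -> translations joined by ", " (assumed representation).
    private final Map<String, String> translations = new LinkedHashMap<>();

    // Reader-based constructor: independent of any file, easy to test.
    FileDictionarySketch(Reader reader) throws IOException {
        readTranslations(reader);
    }

    // File-based constructor: maps the file name to a Reader and reuses
    // the same reading logic. FileReader uses the platform default charset.
    FileDictionarySketch(String filename) throws IOException {
        try (Reader reader = new FileReader(filename)) {
            readTranslations(reader);
        }
    }

    String getTranslation(String german) {
        return translations.get(german);
    }

    private void addTranslation(String german, String translation) {
        translations.merge(german, translation,
                (existing, added) -> existing + ", " + added);
    }

    private void readTranslations(Reader reader) throws IOException {
        BufferedReader lines = new BufferedReader(reader);
        String line;
        while ((line = lines.readLine()) != null) {
            int index = line.indexOf('=');
            if (index < 0) {
                continue; // assumption: lines without '=' are skipped
            }
            addTranslation(line.substring(0, index), line.substring(index + 1));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("simple", ".dic");
        try {
            Files.write(file, "Buch=book\nAuto=car\nBuch=volume"
                    .getBytes(StandardCharsets.US_ASCII));
            FileDictionarySketch dict = new FileDictionarySketch(file.toString());
            System.out.println(dict.getTranslation("Buch"));
        } finally {
            Files.delete(file);
        }
    }
}
```

The file-based constructor contains no parsing logic of its own, so only the thin mapping from file name to Reader remains to be covered by a test that touches the file system.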

Taking a look back, what happened in the previous section? Our very normal approach—small tests and small implementation steps—has shown at some point that we had better move some functionality from the CUT (Dictionary) to another class (DictionaryParser). For this reason, we were not yet able to complete a full test, but had to deal first with the new object or new class, respectively. The open test had to migrate to a sort of "inactive test stack" for the time being, from where it was removed as soon as the implementation of the new class was completed, and was then "activated."
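Renaming a test method so that it no longer starts with "test" is one way to park an open test, because JUnit 3.x collects test methods by name: public, void, no parameters, and a name beginning with "test". The sketch below imitates that naming rule with plain reflection so that it runs without JUnit; the classes OpenTestDemo and DictionaryTestSketch are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo: imitates the JUnit 3.x naming convention to show
// why an underscore prefix takes a test out of the suite.
class OpenTestDemo {

    // Stand-in for a test class; it does not extend TestCase so that the
    // sketch has no dependency on JUnit.
    public static class DictionaryTestSketch {
        public void testTwoTranslationsFromStream() { }
        // Open test, parked until the refactoring is finished.
        public void _testTranslationsWithTwoEntriesFromStream() { }
    }

    // Returns the names of all methods that follow the naming rule:
    // public, no parameters, return type void, name starts with "test".
    static List<String> activeTests(Class<?> testClass) {
        List<String> names = new ArrayList<>();
        for (Method method : testClass.getMethods()) { // public methods only
            boolean looksLikeTest = method.getName().startsWith("test")
                    && method.getParameterCount() == 0
                    && method.getReturnType() == void.class;
            if (looksLikeTest) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Only the method without the underscore prefix is listed.
        System.out.println(activeTests(DictionaryTestSketch.class));
    }
}
```

Removing the underscore later is all it takes to move the test from the inactive stack back into the suite.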

If you are afraid you may forget such open test cases, you may want to make some notes; this also applies to other code particularities we cannot deal with right away. Later, in Chapter 6, we will learn another possibility, specifically, the use of dummy implementations to avoid the temporary deactivation of test cases under implementation. And sometimes a dummy implementation evolves into a proper one over time.

This approach is necessary whenever an implementation needs subordinate objects that do not yet exist. Theoretically, the dependency tree can have an arbitrary depth and an arbitrary width so that our "stack" can become arbitrarily confusing. However, in practice there are rarely more than two test cases in the "open, but deactivated" state. If you find that you are getting deeper into a long chain of dependencies, then it is normally because the top test on the stack covers too big a chunk of functionality. If this happens, the best approach is to go back to the last 100% state and try to write a smaller test.

It is interesting that one would have worked exactly the opposite way in a program designed in dry dock.^[6] While we progressed in a top-down approach, that is, starting from the general functionality, it is much easier with a previously designed system to develop and test in a bottom-up approach, since a top-down approach requires the development of stub or dummy objects (see Chapter 6, Dummy and Mock Objects for Independence). Figure 3.1 shows a simple class diagram to highlight the difference:^[7]

Figure 3.1: Top-down versus bottom-up approaches.

Top-down. Programming and the class tests begin with class A; B and C are simulated by dummies. Next, B and C are dealt with, where the former needs another dummy for class D. In object-oriented systems, this normally means that we work our way from the outer (public) interface of a (sub) system towards the (private) inner life. Therefore, top-down is also an outside-in approach.
Bottom-up. We begin with the nondependent classes and use them to compose the more complex, dependent objects. For example, the development order would be D - B - C - A.
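The top-down variant can be sketched as follows, assuming that class A reaches B only through an interface. All names in the block (TopDownSketch, lookup, describe, DummyB) are invented for illustration; Chapter 6 treats dummy objects properly.

```java
// Hypothetical sketch of top-down development: A is written and tested
// first, while B exists only as an interface plus a dummy.
class TopDownSketch {

    // The role that class B plays for A.
    interface B {
        String lookup(String key);
    }

    // Class under test: depends on B only through the interface.
    static class A {
        private final B b;

        A(B b) {
            this.b = b;
        }

        String describe(String key) {
            String value = b.lookup(key);
            return value == null
                    ? key + " is unknown"
                    : key + " means " + value;
        }
    }

    // Dummy for B: returns canned answers until the real B is developed.
    static class DummyB implements B {
        @Override
        public String lookup(String key) {
            return "Buch".equals(key) ? "book" : null;
        }
    }

    public static void main(String[] args) {
        A a = new A(new DummyB());
        System.out.println(a.describe("Buch"));
        System.out.println(a.describe("Auto"));
    }
}
```

In a bottom-up order the real B would be finished before A is started, and no dummy would be needed.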

If two (or more) classes depend on one another, then the bottom-up approach fails, i.e., all objects have to be developed en bloc; here again, the test-first development has an advantage.

Both the top-down and the bottom-up approach can be problematic during testing as soon as the distance^[8] between the class under test and the classes required for the test increases. The bigger the distance, the more difficult the creation of the required test fixture. This problem can be mitigated to some extent by using a centralized test object generation mechanism.^[9] The cost of creating and adapting these object meshes increases in line with the size of the application, and the dependencies between test cases and remote objects become more and more unclear. In addition, it becomes more difficult to attribute errors occurring in the tests to the CUT; they may be due to faulty behavior of one of the classes used. Chapter 6 will discuss in detail techniques to avoid such cases.

There remains the question of the right point in time for separation of a new class during the development of another one. We would surely have been able to accommodate the functionality of the DictionaryParser in the Dictionary class. But then, a refactoring would probably have produced this or a similar class at some point in time, just later.

The time of relocation obeys two strategies: Grow then split or split then grow [Feathers00]. The right way is normally somewhere in the middle, since we first have to grow a little to understand the necessity of a new class. This may happen earlier or later, depending on the situation, but the important thing to remember is to eventually achieve a simple design.

Splitting off a new object often leads to redundancy in test cases. Boundary and error cases that we already tackled in our tests for the higher-level object suddenly belong to the responsibilities of the new class, and thus a new test suite. For this reason, it is worth looking at existing tests to see whether or not we still need them. Our unit tests are aimed at verifying the functionality of a single unit, normally that of a single CUT. Any behavior pattern that this CUT delegates is not an integral part of that class.

What still has to be verified is that the delegation itself is correct. In our dictionary example, this would mean considering if the two test*FromReader test cases are redundant. Both test goals (testing for two lines and testing for a duplicate entry) are also verified in DictionaryParserTest. For example, it would be meaningful to replace the two test cases by a single one that would then cover a more complex scenario. One argument against this approach is the documenting character of the test cases for the users of the Dictionary class. There is no ideal method, but redundant tests are better than too few tests.

Another case of dependence occurred within the scope of the previous implementation: files. Dependence of our code on files or other external resources makes testing more difficult, because we suddenly have to stick to the rules of the game established by others. While the programming language and the local environment can be fully controlled, here we have to deal with proprietary protocols, access restrictions, time dependencies, and other unpredictable situations, making the outcome of a test nondeterministic or at least its implementation more difficult. In the case of the dictionary file, the trick was to modify the interface so that its testability was improved. Of course, we cannot do without any file tests at all; this topic will be discussed in Chapter 6.

Intuitively, many developers shrink back from changing their "correct" code only to be able to test it better. Making a concession in the application code to facilitate testing is by no means reprehensible; it has a desirable effect: letting the tests drive the code forces us to design testable objects. "Testable" means in most cases simpler. "Simpler" means in most cases less dependent. As we know, reducing dependencies, both within a system and to the outside, is a central objective of good software design.

^[3]Any modern young programmer would choose XML and not be mistaken. We confess to being somewhat old-fashioned, and we wanted to forgo the introduction of yet another Java API here.

^[4]Hint: The passed InputStream instance can be totally ignored here.

^[5]For example, by preceding the name of the test method with an underscore.

^[6]Also known as UML.

^[7]The direction of the arrows determines the navigability and thus the dependence.

^[8]In this context, "distance" means the number of objects in between.

^[9]Such a mechanism was introduced by the term ObjectMother [Schuh01].