18. | Bug Patterns In Java

About This Bug Pattern

In this bug pattern, several steps are necessary in order to initialize an instance of a class. As a result, initialization is more prone to error. If one of the fields is not initialized, a NullPointerException can result.

The Symptoms and the Cause

This pattern is indicated by a NullPointerException at the point where one of the uninitialized fields is accessed, in a program containing a class whose constructors don't initialize all fields directly. For example, consider the code in Listing 19-1.

Unfortunately, the initialization sequence for an instance of this class is prone to bugs. You may have noticed that an exception is thrown in the second initialization step, the call to setValue(). As a result, the value field that should have been set by that step is never set.

But a handler for the thrown exception may not know that the field was not set. If, in the process of recovering from the exception, it accesses the value field of the RestrictedInt in question, it may trip over a NullPointerException itself.

If that happens, we are in worse shape than we would be if the handler weren't there at all. At least the checked exception contained some clue about its cause. But NullPointerExceptions are notoriously difficult to diagnose because they (necessarily) contain very little information as to why a value was set to null in the first place. Furthermore, they occur only when the uninitialized field is accessed. That access will probably occur far away from the cause of the bug—that is, from the failure to initialize the field in the first place.
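The failure described above can be reproduced in a few lines. The following is a minimal sketch, not the author's code: it restates the classes of Listing 19-1 in condensed form, and the HandlerDemo class and its recover() method are hypothetical names introduced only for illustration.

```java
// Condensed restatement of Listing 19-1 (same behavior, fewer branches).
class RestrictedInt {
    public Integer value;            // stays null until setValue() succeeds
    public boolean canTakeZero;

    public RestrictedInt(boolean canTakeZero) {
        this.canTakeZero = canTakeZero;
    }

    public void setValue(int newValue) throws CantTakeZeroException {
        if (newValue == 0 && !canTakeZero) {
            throw new CantTakeZeroException(this);
        }
        value = Integer.valueOf(newValue);
    }
}

class CantTakeZeroException extends Exception {
    public final RestrictedInt ri;

    public CantTakeZeroException(RestrictedInt ri) {
        super("RestrictedInt can't take zero");
        this.ri = ri;
    }
}

// Hypothetical client whose handler touches the half-initialized object.
class HandlerDemo {
    // Returns a description of what happened inside the handler.
    static String recover() {
        try {
            RestrictedInt ri = new RestrictedInt(false);
            ri.setValue(0);                  // second initialization step fails
            return "no exception";
        } catch (CantTakeZeroException e) {
            try {
                // The handler assumes value was set; it is still null.
                int current = e.ri.value.intValue();
                return "recovered with " + current;
            } catch (NullPointerException npe) {
                return npe.getClass().getSimpleName();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(recover());
    }
}
```

In a real program the inner catch would not be there, so the NullPointerException would escape and replace the far more informative CantTakeZeroException.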

There are, of course, other errors that can occur from run-on initialization bugs. For instance:

The programmer writing the initialization code may forget to put in one of the initialization steps.
There may be an order-based dependence in the initialization steps that is unknown to the programmer, who therefore executes the statements out of order.
The class being initialized might change. New fields might be added, or old ones removed. As a result, all the initialization code in every client must be modified to set the fields appropriately. Much of the modified code will be similar, but if just one copy is missed, a bug is introduced. For this reason, run-on initializers can easily become rogue tiles (see Chapter 7).

Because of all the problems involved with Run-On Initialization, it's much better to define constructors that initialize all fields. In Listing 19-1, the constructor for RestrictedInt should take an int to initialize its value field. There's never a good reason to include a constructor for a class that leaves any of the fields uninitialized. When writing classes from scratch, that's not a difficult principle to follow.

Listing 19-1: A Simple Run-On Initialization

    class RestrictedInt {
      public Integer value;        // remains null until setValue() succeeds
      public boolean canTakeZero;

      public RestrictedInt(boolean _canTakeZero) {
        canTakeZero = _canTakeZero;
      }

      public void setValue(int _value) throws CantTakeZeroException {
        if (_value == 0) {
          if (canTakeZero) {
            value = Integer.valueOf(_value);
          }
          else {
            throw new CantTakeZeroException(this);
          }
        }
        else {
          value = Integer.valueOf(_value);
        }
      }
    }

    class CantTakeZeroException extends Exception {
      public RestrictedInt ri;

      public CantTakeZeroException(RestrictedInt _ri) {
        super("RestrictedInt can't take zero");
        ri = _ri;
      }
    }

    class Client {
      public static void initialize() throws CantTakeZeroException {
        RestrictedInt ri = new RestrictedInt(false);
        ri.setValue(0);            // throws, so ri.value is never set
      }
    }

Tip

There's never a good reason to include a constructor for a class that leaves any of the fields uninitialized.
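One way to apply this principle to RestrictedInt is sketched below. It is only an illustration under stated assumptions: the fields are made private and final, the exception no longer carries a reference to the instance (passing this out of a constructor would leak a half-built object), and the ConstructorDemo helper is a hypothetical name.

```java
class CantTakeZeroException extends Exception {
    public CantTakeZeroException() {
        super("RestrictedInt can't take zero");
    }
}

class RestrictedInt {
    private final int value;
    private final boolean canTakeZero;

    // Every field is set here, or no instance is created at all.
    public RestrictedInt(int value, boolean canTakeZero)
            throws CantTakeZeroException {
        if (value == 0 && !canTakeZero) {
            throw new CantTakeZeroException();
        }
        this.value = value;
        this.canTakeZero = canTakeZero;
    }

    public int intValue() {
        return value;
    }

    public boolean canTakeZero() {
        return canTakeZero;
    }
}

// Hypothetical helper showing both outcomes of construction.
class ConstructorDemo {
    // True if the constructor refuses the given combination.
    static boolean rejects(int value, boolean canTakeZero) {
        try {
            new RestrictedInt(value, canTakeZero);
            return false;
        } catch (CantTakeZeroException e) {
            return true;
        }
    }
}
```

Because construction either completes or throws, no exception handler can ever see a RestrictedInt whose value is missing.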

Cures and Preventions

Let's look at several things you can do to help eliminate this type of initializer.

For Legacy Code

What if you must work with a large codebase in which a class doesn't initialize all of its fields in the constructors—a codebase littered with run-on initializers?

Unfortunately, programmers find themselves working with such legacy codebases more often than they'd like. If the codebase is large and the offending class has many clients, you may not want to modify the constructor signatures, especially if the unit tests over the code are scant; you would inevitably break undocumented invariants.

Often the best thing to do is to throw out that legacy code and start fresh! That may sound like crazy talk, but the time you'll spend patching up bugs in code like that can easily dwarf the time it would take to rewrite it. Many times, I have struggled to work with large bases of legacy code with problems like this, and ultimately I have come away wishing I had just started fresh.

But if throwing the code away is not an option, we can still attempt to control the potential for errors by incorporating the following simple practices:

Initialize the fields to non-null, default values.
Include extra constructors.
Include an isInitialized() method in the class.
Construct special classes to represent the default values.

Let's take a look at why we should follow these practices.

Initialize the Fields to Non-Null, Default Values

By filling in the fields with default values, you help to ensure that instances of your class will be in a well-defined state at all times. This practice is particularly important for fields of reference type that will take on the null value unless you specify otherwise.

Why? Because gratuitous uses of null values inevitably result in NullPointerExceptions. And NullPointerExceptions are bad. For one thing, they provide very little information about the true cause of a bug. For another, they tend to be thrown very far away from the actual cause of the bug. Avoid them at all costs.

And if you decide you want to use null values so that you can signal that the class is not yet completely initialized, see Chapter 10 for assistance.
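As a small illustration of non-null defaults, consider the sketch below. The Report class and its members are hypothetical and do not come from the text; the point is only that the field initializers run before any setter is called.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical legacy-style class that clients still initialize step by step.
class Report {
    private String title = "";                             // instead of null
    private final List<String> lines = new ArrayList<>();  // instead of null

    public Report() {
        // Nothing to do: the field initializers above have already run.
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public void addLine(String line) {
        lines.add(line);
    }

    // Safe to call even if no initialization step has been executed yet.
    public String summary() {
        return title.trim() + " (" + lines.size() + " lines)";
    }
}
```

A client that forgets setTitle() gets an empty title rather than a NullPointerException thrown from somewhere inside summary().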

Tip

Remember, gratuitous uses of null values inevitably result in NullPointerExceptions. And NullPointerExceptions are bad.

Include Extra Constructors

When you include additional constructors, you can use them in new contexts, where you don't have to include new run-on initializations. Just because some contexts are forced to use this bad code, other contexts shouldn't have to pay the price.
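A sketch of this practice for RestrictedInt might look like the following. The two-argument constructor and the NewClient class are additions made for illustration, and the exception is simplified to take no arguments; none of this is from the original listings.

```java
class CantTakeZeroException extends Exception {
    public CantTakeZeroException() {
        super("RestrictedInt can't take zero");
    }
}

class RestrictedInt {
    public Integer value;
    public boolean canTakeZero;

    // Legacy constructor, kept so existing run-on clients still compile.
    public RestrictedInt(boolean canTakeZero) {
        this.canTakeZero = canTakeZero;
    }

    // Extra constructor for new code: one step, nothing left unset.
    public RestrictedInt(boolean canTakeZero, int initialValue)
            throws CantTakeZeroException {
        this(canTakeZero);
        setValue(initialValue);
    }

    // final, so the constructor above can never call an overridden version.
    public final void setValue(int newValue) throws CantTakeZeroException {
        if (newValue == 0 && !canTakeZero) {
            throw new CantTakeZeroException();
        }
        value = Integer.valueOf(newValue);
    }
}

// Hypothetical new client that never performs a run-on initialization.
class NewClient {
    static RestrictedInt make(int initialValue) {
        try {
            return new RestrictedInt(true, initialValue);
        } catch (CantTakeZeroException e) {
            // Cannot happen here because canTakeZero is true.
            throw new IllegalStateException(e);
        }
    }
}
```

Old clients are untouched, while every new client receives either a fully initialized instance or an exception.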

Place an isInitialized() Method in the Class

You can include an isInitialized() method in the class to allow for quick determination as to whether an instance has been initialized. Such a method is almost always a good idea when working with classes that require run-on initialization.

In cases in which you don't maintain these classes yourself, you can even put such isInitialized() methods into your own utility class. After all, if there is a consequence of an instance not being initialized that is observable from the outside, you can write a method to check for this consequence (even if it entails using the usually ill-advised practice of catching a RuntimeException).
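Both variants can be sketched briefly. In the code below, RestrictedInt gains an isInitialized() method of its own, while LegacyGauge stands in for a class you cannot edit; LegacyGauge, LegacyGauges, and their methods are hypothetical names used only for this example.

```java
// Variant 1: the class is yours, so the check lives inside it.
class RestrictedInt {
    public Integer value;
    public boolean canTakeZero;

    public RestrictedInt(boolean canTakeZero) {
        this.canTakeZero = canTakeZero;
    }

    public boolean isInitialized() {
        return value != null;
    }
}

// Variant 2: a class maintained by someone else (hypothetical).
class LegacyGauge {
    private Integer reading;              // null until calibrate() is called

    public void calibrate(int initialReading) {
        reading = Integer.valueOf(initialReading);
    }

    public int reading() {
        return reading.intValue();        // NullPointerException if uncalibrated
    }
}

// Your own utility class, probing for the externally observable consequence.
final class LegacyGauges {
    private LegacyGauges() {}

    static boolean isInitialized(LegacyGauge gauge) {
        try {
            gauge.reading();
            return true;
        } catch (NullPointerException e) {
            // Usually ill-advised, but the only signal this class exposes.
            return false;
        }
    }
}
```

The probe in LegacyGauges is only safe because reading() has no side effects; a check built on a method that changes state would need more care.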

Construct Special Classes to Represent the Default Values

Instead of allowing the fields to be filled in with null values, construct special classes (most likely Singletons) to represent the default values. Then assign instances of these classes to your fields in the default constructor. Not only will you decrease the chances of a NullPointerException, but you will be able to control precisely which error does occur if these fields are accessed inappropriately. (For more on Singletons, see Resources and "The Composite and Singleton Design Patterns" in Chapter 9.)

For example, we could modify the RestrictedInt class as follows:

Listing 19-2: RestrictedInts with NonValues

    class RestrictedInt implements SimpleInteger {
      public SimpleInteger value;
      public boolean canTakeZero;

      public RestrictedInt(boolean _canTakeZero) {
        canTakeZero = _canTakeZero;
        value = NonValue.ONLY;     // non-null placeholder until setValue()
      }

      public void setValue(int _value) throws CantTakeZeroException {
        if (_value == 0) {
          if (canTakeZero) {
            value = new DefaultSimpleInteger(_value);
          }
          else {
            throw new CantTakeZeroException(this);
          }
        }
        else {
          value = new DefaultSimpleInteger(_value);
        }
      }

      public int intValue() {
        // Throws ClassCastException if value is still NonValue.ONLY.
        return ((DefaultSimpleInteger)value).intValue();
      }
    }

    interface SimpleInteger {
    }

    class NonValue implements SimpleInteger {
      public static final NonValue ONLY = new NonValue();

      private NonValue() {}
    }

    class DefaultSimpleInteger implements SimpleInteger {
      private int value;

      public DefaultSimpleInteger(int _value) {
        value = _value;
      }

      public int intValue() {
        return value;
      }
    }

Now, if any of your client classes that access this field were to perform an intValue() operation on the resulting element, they would first have to cast to a DefaultSimpleInteger, since NonValues don't support that operation.

The advantage of this approach is that the compiler will remind you, with an error at every point where you forgot the cast, that this method call doesn't work on the default value. Also, if at runtime you access this field while it still contains the default value, you'll get a ClassCastException, which is much more informative than a NullPointerException. The ClassCastException tells you not only what was actually there but also what the program expected to be there, and it occurs when the cast is attempted rather than at some later point in the execution when you dereference the value.

The disadvantage is that you'll pay in performance. Every time the field is accessed, the program will also have to perform a cast.
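To make the trade-off concrete, here is a minimal client-side sketch. It restates the marker types of Listing 19-2 compactly; the CastDemo class and its methods are hypothetical and exist only to show where the cast sits and what happens when the field still holds the default value.

```java
interface SimpleInteger {
}

final class NonValue implements SimpleInteger {
    static final NonValue ONLY = new NonValue();

    private NonValue() {}
}

final class DefaultSimpleInteger implements SimpleInteger {
    private final int value;

    DefaultSimpleInteger(int value) {
        this.value = value;
    }

    int intValue() {
        return value;
    }
}

// Hypothetical client code reading a SimpleInteger field.
class CastDemo {
    // Without the cast this line would not compile, because
    // SimpleInteger declares no intValue() method.
    static int read(SimpleInteger field) {
        return ((DefaultSimpleInteger) field).intValue();
    }

    static String describe(SimpleInteger field) {
        try {
            return "value " + read(field);
        } catch (ClassCastException e) {
            // Raised at the cast itself, not at some later dereference.
            return "not initialized: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new DefaultSimpleInteger(7)));
        System.out.println(describe(NonValue.ONLY));
    }
}
```

The exact wording of the ClassCastException message varies between JVM versions, so the sketch reports only the exception type.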

If you're willing to do without the compilation error messages, another solution is to include the intValue() method in interface SimpleInteger. You can then implement this method in the default class with a method that throws whatever error you'd like (and you can include any information that you'd like in the error). To illustrate this, look at the following example:

Listing 19-3: NonValues That Throw Exceptions

    class RestrictedInt implements SimpleInteger {
      public SimpleInteger value;
      public boolean canTakeZero;

      public RestrictedInt(boolean _canTakeZero) {
        canTakeZero = _canTakeZero;
        value = NonValue.ONLY;     // non-null placeholder until setValue()
      }

      public void setValue(int _value) throws CantTakeZeroException {
        if (_value == 0) {
          if (canTakeZero) {
            value = new DefaultSimpleInteger(_value);
          }
          else {
            throw new CantTakeZeroException(this);
          }
        }
        else {
          value = new DefaultSimpleInteger(_value);
        }
      }

      public int intValue() {
        return value.intValue();   // no cast needed
      }
    }

    interface SimpleInteger {
      public int intValue();
    }

    class NonValue implements SimpleInteger {
      public static final NonValue ONLY = new NonValue();

      private NonValue() {}

      public int intValue() {
        throw new
          RuntimeException("Attempt to access an int from a NonValue");
      }
    }

    class DefaultSimpleInteger implements SimpleInteger {
      private int value;

      public DefaultSimpleInteger(int _value) {
        value = _value;
      }

      public int intValue() {
        return value;
      }
    }

This solution can provide even better error diagnostics than the ClassCastException. It's also more efficient, because no cast is required at runtime. But this solution won't require you to think about the possible values of the field at every access point.

Which solution is best? That depends partly on your personal style and partly on the performance constraints of your project.

Including Methods That Only Throw Exceptions

The last solution in the previous section may, at first glance, seem completely wrong: we added a method that does nothing but throw an exception whenever it's called.

After all, a class should only contain methods that actually make sense to perform on the data, right? Including methods such as these can be particularly confusing when you are teaching programmers about object-oriented programming.

For example, consider the two possible ways to define a class hierarchy for Lists, shown in Listings 19-4 and 19-5:

Listing 19-4: Lists with No Universal getters

    abstract class List {}

    class Empty extends List {}

    class Cons extends List {
      Object first;
      List rest;

      Cons(Object _first, List _rest) {
        first = _first;
        rest = _rest;
      }

      public Object getFirst() {
        return first;
      }

      public List getRest() {
        return rest;
      }
    }

Listing 19-5: Lists with getters in the Interface

    abstract class List {
      public abstract Object getFirst();
      public abstract List getRest();
    }

    class Empty extends List {
      public Object getFirst() {
        throw new RuntimeException("Attempt to take first of an empty list");
      }

      public List getRest() {
        throw new RuntimeException("Attempt to take rest of an empty list");
      }
    }

    class Cons extends List {
      Object first;
      List rest;

      Cons(Object _first, List _rest) {
        first = _first;
        rest = _rest;
      }

      public Object getFirst() {
        return first;
      }

      public List getRest() {
        return rest;
      }
    }

For a programmer new to object-oriented languages, the motivations behind the first version of List (the one with no universal getters) will be less confusing. Your intuition tells you that a class shouldn't contain a method unless that method does real work. But the considerations discussed above for default classes apply equally well to this example.

It can be quite cumbersome to continually insert casts in your code; the code can become quite wordy. Additionally, the class casts can have significant repercussions in terms of performance, especially for an often-called utility class like List.
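The difference shows up as soon as a client walks a list. The sketch below places both hierarchies side by side; the Plain and Getter prefixes are assumptions added only so that the two versions can coexist in one file, and the ListClients class with its length methods is hypothetical.

```java
// Listing 19-4 style: getters exist only on the Cons class.
abstract class PlainList {}

class PlainEmpty extends PlainList {}

class PlainCons extends PlainList {
    final Object first;
    final PlainList rest;

    PlainCons(Object first, PlainList rest) {
        this.first = first;
        this.rest = rest;
    }

    Object getFirst() { return first; }
    PlainList getRest() { return rest; }
}

// Listing 19-5 style: getters are declared on the base class.
abstract class GetterList {
    abstract Object getFirst();
    abstract GetterList getRest();
}

class GetterEmpty extends GetterList {
    Object getFirst() {
        throw new RuntimeException("Attempt to take first of an empty list");
    }

    GetterList getRest() {
        throw new RuntimeException("Attempt to take rest of an empty list");
    }
}

class GetterCons extends GetterList {
    final Object first;
    final GetterList rest;

    GetterCons(Object first, GetterList rest) {
        this.first = first;
        this.rest = rest;
    }

    Object getFirst() { return first; }
    GetterList getRest() { return rest; }
}

// Hypothetical client code: the same traversal against each hierarchy.
class ListClients {
    static int length(PlainList list) {
        int count = 0;
        while (list instanceof PlainCons) {
            list = ((PlainCons) list).getRest();   // cast at every step
            count++;
        }
        return count;
    }

    static int length(GetterList list) {
        int count = 0;
        while (!(list instanceof GetterEmpty)) {
            list = list.getRest();                 // no cast required
            count++;
        }
        return count;
    }
}
```

The second traversal is shorter, but nothing stops a careless client from calling getRest() on an empty list and discovering the mistake only at runtime.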

As with all design practices, this practice is best applied with a consideration for the underlying motivation of the practice. The motivation won't always be applicable; when it isn't, the practice shouldn't be used.