What We've Learned

In this chapter on the Saboteur Data bug pattern we've learned the following:

  • Saboteur Data is often responsible when either the syntactic or the semantic constraints of complex input data or legacy data are violated.

  • The unpredictable nature of this bug lies in the fact that some actions touch the corrupt bits of data while other actions don't.

  • This saboteur data can stay in the system indefinitely, much like a sleeper spy, causing no trouble until the particularly troublesome bit of data is accessed.

  • The bug can come at you in a variety of ways, as many ways as there are data-input avenues.

  • A syntactic data error can occur by manual editing or automated generation of a file.

  • With a syntactic error, a simple parse—like splitting a text line into two Strings, one each before and after the first space—wouldn't catch a minor data corruption such as a missing comma separator between entries.

  • The results of such a syntactic error depend on the program: if it expects entries to be separated with a space and a comma, it may crash; if not, it may accept the two entries as one and propagate the error.

  • In a semantic error, expectations of elements can be violated. If an expectation of the data in one table is that every element in each set is a domain entry in another table, and we violate this expectation, we may throw an exception when we try to read an element in the second table that isn't there.

  • The best defense against this pattern is one that is universally employed by compiler and interpreter developers. Because the input data to these programs is so complex, developers perform as thorough an integrity check as possible when first reading the input rather than later.

  • The practice of parsing input is a way of eliminating bugs. In fact, any program that reads data—no matter how simple—should parse it.

  • Type checking, another elimination method, is an example of a semantic-level check on the integrity of a program.

  • If you suspect an occurrence of this bug with data that has already been stored, you can iterate over the data, accessing each bit as it would be accessed in the deployed application and ensuring that everything works as expected. When the data is stored in an immutable database or finite store, such an offline integrity check can also serve as a performance optimization.

  • One caveat: offline checking of data integrity should replace online checks only when the data is immutable and only when there is no chance that the data will be corrupted while reading it from storage.

  • A saboteur might be undetectable because the data necessary to perform all the checks won't be available until after the saboteurs are stored away and inaccessible offline.

  • A saboteur might be undetectable because the complete set of constraints is not even computable (as is the case for many programming languages).

  • A saboteur might be undetectable because the constraints are computable, but the resources required to check them are beyond the access of the program.
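The semantic constraint described above (every element in one table must be a domain entry in another) lends itself to an offline check that walks the stored data before the deployed application ever touches it. What follows is only a minimal sketch under stated assumptions: the class name ReferenceChecker, the method findDanglingReferences, and the use of in-memory maps to stand in for the two tables are all hypothetical and are not taken from the chapter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical offline integrity check; the two maps stand in for stored tables. */
final class ReferenceChecker {

    /**
     * Returns one message per element of the referencing table that is not
     * a key of the referenced table. An empty list means the constraint holds.
     */
    static List<String> findDanglingReferences(Map<String, Set<String>> referencing,
                                               Map<String, ?> referenced) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : referencing.entrySet()) {
            for (String element : row.getValue()) {
                if (!referenced.containsKey(element)) {
                    problems.add("row '" + row.getKey()
                            + "' refers to missing entry '" + element + "'");
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Made-up sample data: "grace" is referenced but never defined.
        Map<String, Set<String>> teams = Map.of("compilers", Set.of("ada", "grace"));
        Map<String, String> people = Map.of("ada", "Ada Lovelace");
        findDanglingReferences(teams, people).forEach(System.out::println);
    }
}
```

Collecting every violation, rather than stopping at the first, is a design choice that suits an offline pass: one run reports all of the saboteurs it can see. As the caveat in the list notes, a check like this can stand in for online checks only when the data cannot change afterward.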

The Golden Rule for eliminating data saboteurs: Any program that reads data should parse the data. Good luck in stamping them out!
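To make the rule concrete, here is a minimal sketch contrasting a simple split with a parse that validates each entry. The line format ("name number" pairs separated by a comma and a space), the class name EntryParser, and both method names are assumptions made for illustration; they do not come from the chapter.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical parser for lines such as "alice 30, bob 25". */
final class EntryParser {

    // Assumed entry shape: letters, one space, then one to nine digits.
    private static final Pattern ENTRY = Pattern.compile("[A-Za-z]+ [0-9]{1,9}");

    /**
     * Simple parse: splits each entry at its first space and checks nothing.
     * A missing comma goes unnoticed, and an entry with no space at all
     * fails with a StringIndexOutOfBoundsException.
     */
    static Map<String, String> parseLeniently(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : line.split(", ")) {
            int space = entry.indexOf(' ');
            result.put(entry.substring(0, space), entry.substring(space + 1));
        }
        return result;
    }

    /** Validating parse: rejects any entry that does not match the assumed shape. */
    static Map<String, Integer> parseStrictly(String line) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String entry : line.split(", ", -1)) {
            if (!ENTRY.matcher(entry).matches()) {
                throw new IllegalArgumentException("Malformed entry: '" + entry + "'");
            }
            int space = entry.indexOf(' ');
            result.put(entry.substring(0, space),
                       Integer.valueOf(entry.substring(space + 1)));
        }
        return result;
    }

    public static void main(String[] args) {
        String corrupted = "alice 30 bob 25"; // the comma separator is missing
        // The lenient parse silently stores "30 bob 25" as alice's value.
        System.out.println(parseLeniently(corrupted));
        try {
            parseStrictly(corrupted);
        } catch (IllegalArgumentException e) {
            // The strict parse stops the saboteur at the point of entry.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With the lenient version, the corrupted line is accepted and the bad value surfaces only when some later action reads it; the strict version reports the problem at the moment the data is read.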

In Chapter 14 we'll examine the Broken Dispatch—when an overloaded method suddenly breaks code you haven't modified.



Bug Patterns in Java
ISBN: 1590590619
Pages: 95
Authors: Eric Allen
