The purpose of programming guidelines is to bring uniformity, clarity, correctness, and optimization to programs. Following these guidelines increases productivity for both the development and maintenance of software. Common programming errors, such as forgetting to close a file, can be avoided by knowing and following these guidelines. Software maintenance is far easier when the code is readable, logically structured, and well-formatted in the first place.

General Coding Practices

Following standard coding practices is very important for software maintenance. It is very likely that many months after the code was first written, someone will have to fix bugs or add features to the software. Code that is well written makes this task much easier. Java developers have come to expect a common coding style when viewing Java source code. This style was first introduced by Sun and can be viewed on the Internet at http://java.sun.com/docs/codeconv. Virtually all source code found at the java.sun.com Web site follows these coding conventions. This section provides a list of general and specific rules that should be followed when writing production-grade Java programs:
Performance and Optimization Guidelines

There are many facets to optimization: system tuning, database design, optimizing the logical flow within a program, and so on. When doing J2EE development, the performance and scalability of the system can be tuned at the JVM, OS, database, and application server level without any application code change. Industrial-strength J2EE application servers such as WebLogic Server provide many tuning parameters to improve the performance and scalability of J2EE applications. Therefore, do not get caught up in performance optimization early in the project development life cycle. Developers should focus on the functionality of the system. During performance tuning, if the bottlenecks are narrowed down to the application code, then the code needs to be rewritten. This section covers recommended optimization practices for the Java programming language. These practices, once learned, should become part of your everyday coding habits. We recommend coding choices that promote efficiency, but not at the expense of readability. There are a number of obscure techniques for making code execute faster, but often the real benefits are marginal. First, the Java compiler's optimizer can perform a number of these programming tricks for you.

Note: With Java Hotspot, the optimizations are more likely to be done at runtime. See "Java Compiler and JVM Optimizations" later in this chapter for more details on Hotspot.

Although your code optimization may be significant for a few lines of code, it is often very small within the context of the entire program [iii]. Optimization techniques for the following Java elements are discussed in the following sections:
Collections

The three areas to be addressed for Java collections are synchronization, memory allocation, and storage efficiency.

Synchronization

If thread safety is not an issue, don't use collection objects that are synchronized. In particular, use Vector only when you need synchronization. If you do not need synchronization, use an ArrayList. If you later need synchronization, you can still use an ArrayList by passing it to the Collections.synchronizedList method. Other Java collection classes carry the same warning about the unnecessary use of synchronization. For example, HashMap is not synchronized, but Hashtable is. HashMap can be turned into a synchronized structure by calling the Collections.synchronizedMap method.

Memory Allocation

Improper use of the Vector, ArrayList, and HashMap objects can cause unnecessary CPU usage and waste memory. If adding objects to a collection causes it to grow, it does so by copying itself into a newly allocated block of memory. If the collection objects are initially created large enough, these resources are not wasted. In the case of a Vector object, if the capacity increment is not set, its size is doubled each time the object needs to expand. Accepting the default size for a Vector that needs a large amount of storage can be particularly bad.

Note: The default allocation is 10 for both Vector and ArrayList.

The ArrayList object can also be problematic because its capacity increases by about half for each expansion. This reduces the amount of memory wasted, but at the expense of CPU consumption. Vector and ArrayList objects behave efficiently when elements are added to the end, but adding to (or removing from) the middle causes arrays of objects to be copied for each operation. You can control the memory allocation of HashMap (and Hashtable) by setting the initial capacity and the load factor attributes. The default initial capacity for HashMap is 101 and the default load factor is 75%.
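The synchronization and pre-allocation advice above can be sketched as follows. This is a minimal illustration; the helper names and collection sizes are invented for the example, not recommendations:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CollectionTuning {
    // Pre-size the list so no intermediate growth copies are made,
    // and add synchronization only when threads actually share it.
    static List buildList(int expected) {
        List raw = new ArrayList(expected);        // unsynchronized by default
        return Collections.synchronizedList(raw); // thread-safe wrapper on demand
    }

    // Choosing capacity so that capacity * loadFactor >= expected
    // entries means the map never has to rehash while filling up.
    static Map buildMap(int expected) {
        return new HashMap((int) (expected / 0.75f) + 1, 0.75f);
    }

    public static void main(String[] args) {
        List names = buildList(1000);
        names.add("WebLogic");
        Map lookup = buildMap(1000);
        lookup.put("server", "WebLogic");
        System.out.println(names.size() + " " + lookup.size()); // prints "1 1"
    }
}
```

The point of both helpers is the same: size the collection once, up front, instead of paying for repeated grow-and-copy cycles as elements are added.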
Exceeding the capacity of the table causes a new HashMap storage array to be allocated and all entries to be copied into it. Therefore, it is important to allocate a sufficient number of entries for HashMap objects.

Storage Efficiency

If you use exact keyed lookups in a collection, use a HashMap. If you need data sorted sequentially, with keyed lookup access and the capability to find entries based on a partial key, use a TreeMap. If you need indexed lookup, use an ArrayList, a Vector, or an array. Just because a method exists for a type of interface doesn't mean the implementation is efficient. For example, the LinkedList class has a get method with an index value as its only parameter. What appears to be an ordinary indexed lookup is really a search for an object, starting at the beginning of the linked list and counting forward until the index number is reached.

Note: If the index entry is closer to the end, the search starts at the end and counts backward.

Methods

Method modifiers affect the speed of execution for methods. Table 4.2 is a relative comparison of the cost of making method calls based on the modifier type.

Table 4.2. The Method Modifier Directly Affects the Execution Speed
Note: See http://www.protomatter.com/nate/java-optimization/ for more information on method call performance.

As you can see, synchronized methods require considerably more overhead than static methods because of locking. Static, private, and final method calls are not polymorphic and are more easily inlined by the Java compiler or Hotspot.

Note: For more details on Hotspot, see the "Java Compiler and JVM Optimizations" section later in this chapter.

One technique for making a method easier to read is to decompose it into a group of simpler methods. An optimizing compiler may inline these methods because they are private and are used only once. Many of your application methods are never going to be overridden. If this is the case, make the methods final. If necessary, the restriction can be removed later.

Objects

There are several issues regarding performance that you should be aware of when using objects:
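The decomposition and final advice above can be sketched like this. The class and its pricing rules are invented for illustration: the public method is final because nothing is meant to override it, and the single-call-site private helpers are non-polymorphic, so an optimizing compiler or Hotspot is free to inline them.

```java
public class OrderProcessor {
    // final: the call is non-polymorphic, which makes it easier to inline
    public final double process(double price, int quantity) {
        return applyDiscount(subtotal(price, quantity));
    }

    // private helpers used once are prime candidates for inlining
    private double subtotal(double price, int quantity) {
        return price * quantity;
    }

    private double applyDiscount(double amount) {
        // hypothetical rule: half price on orders of 100 or more
        return amount >= 100.0 ? amount * 0.5 : amount;
    }
}
```

Starting with final and relaxing it later, as the text suggests, is an easy change; removing final from a method that subclasses already override is not an option you want to need.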
Arrays

Use arrays instead of Vector or ArrayList objects when the number of elements is not going to change; array access is faster. When copying arrays, use the System.arraycopy method because it is 2 to 10 times faster than copying the elements in a loop. There are a number of small optimization techniques, such as placing an array element into a temporary variable and referencing the temporary variable instead of referencing the array element by its index. Directly referencing a temporary variable can be two or three times faster than referencing an array element through its index; however, the savings may be wasted effort because an optimizing compiler can do this for you. Lastly, know what the Arrays class has to offer: a set of static methods that perform utility functions on arrays. There are four types of functions:
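The array-copy advice above can be sketched as follows; both methods produce the same result, and the speed claim is the text's, with System.arraycopy being the bulk operation it recommends:

```java
public class ArrayCopyDemo {
    // bulk copy via System.arraycopy, the recommended approach
    static int[] fastCopy(int[] src) {
        int[] dest = new int[src.length];
        System.arraycopy(src, 0, dest, 0, src.length);
        return dest;
    }

    // equivalent element-by-element loop, shown only for comparison
    static int[] loopCopy(int[] src) {
        int[] dest = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dest[i] = src[i];
        }
        return dest;
    }

    public static void main(String[] args) {
        int[] data = { 3, 1, 4, 1, 5 };
        System.out.println(java.util.Arrays.equals(fastCopy(data), loopCopy(data))); // prints true
    }
}
```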
Note: The Arrays class, along with the Collections class, is in the java.util package.

Loops

Loops can be a source of poor performance because they may contain other programming inefficiencies repeated many times. The following are some recommendations for using loops:
A straightforward demonstration of the last recommendation is as follows. Suppose you optimize some code and, using a cryptic series of instructions, are able to make part of the code execute twice as fast as before. If these operations consume only 1% of the total processing in the loop, the overall performance gain is only 0.5%. You have not gained much by adding the difficult-to-maintain code. There are a couple of less straightforward techniques for improving performance, but these may be of dubious value. Here are some techniques we do not recommend:
Although the loop variable computation in the second construct is up to two times faster than in the first, it is hard to read. More importantly, the processing in the loop body, not the comparison and index increment, is much more likely to determine how long the loop takes to finish.

Strings

String usage is one of the easiest places to program inefficiently because strings are so easy to use. We have four recommendations for optimizing the usage of strings.

Strings Are Immutable

The most important thing to remember when using String objects is that they are immutable, but StringBuffer objects are not. Therefore, when adding multiple (more than two) strings together or changing the value of a string, use a StringBuffer. For example, if you are adding together three String objects (s1, s2, and s3) and their values are not known at compile time, it is more efficient to use the following construct:

String s4 = new StringBuffer().append(s1).append(s2).append(s3).toString();

instead of

s4 = s1 + s2 + s3;

Note: Use the first construct only if the string concatenation is a performance bottleneck, because the second form is easier to read. There is no advantage to using this technique for adding two String objects because the compiler implicitly uses a StringBuffer object. String buffers should not be used when adding final or literal strings because the compiler concatenates them at compile time.

Creating a String Using Literals

Don't initialize String objects by passing a literal value to the constructor that takes a String object as an argument. Passing a literal value to the constructor creates two String objects: one for the literal on the heap and one created by the constructor. Instead, initialize your String objects by assigning the literal directly.

Pre-Allocate Your String Buffers

String buffers should be allocated large enough to hold the resultant string. The default initial size of a StringBuffer object is 16 characters.
Each time the string buffer has to expand, it doubles its size and copies its contents into a new, larger buffer. You want to avoid this unnecessary overhead; poorly sized long strings are especially expensive to construct.

Switch Statements for Strings

The usual way of comparing a string with many possible values is to create a sieve of if statements and compare each string until a match is found. When comparing four or more strings, it is much faster to use a HashMap and a switch statement. This construct not only provides a dramatic increase in performance, but the execution time is nearly independent of the number of comparisons. Compare this with a sieve of if statements for string variables, where the execution time increases linearly with each additional comparison. An example outlining this technique is given in Listing 4.8.

Listing 4.8 Example of Using a HashMap for a String switch Statement

    // HashMap for mapping String values to int constants
    private static final HashMap rules;

    // Strings for comparison and their int values
    private static final String[] _validOnes = { "valid#1", "valid#2" /* etc. */ };
    public static final int VALID01 = 0;
    public static final int VALID02 = 1;
    // etc.

    // initialize the HashMap
    static {
        rules = new HashMap();
        rules.put(_validOnes[VALID01], new Integer(VALID01));
        rules.put(_validOnes[VALID02], new Integer(VALID02));
        // etc.
    }

    // Switch construct
    // Get String s1 from elsewhere, a method call or an argument
    int val = ((Integer)rules.get(s1)).intValue();
    switch (val) {
        case VALID01: {
            // do something
            break;
        }
        case VALID02: {
            // do something
            break;
        }
        // additional cases follow
    }

Synchronization

Synchronization is a mechanism that allows multithreaded programs to perform multiple sequential operations on shared data structures as a single atomic operation. Synchronization is necessary to prevent another thread from changing the value of the object's attributes while an atomic operation is being performed.
However, synchronization degrades program performance because of the overhead of locking objects and the side effect of placing threads in wait states, preventing other useful work from being done. Although synchronization is simple to set up, it is difficult to get right. That is why Enterprise JavaBeans (EJB) uses a single-threaded model. This section discusses some important considerations when using synchronization:
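A minimal sketch of the trade-off described above, using an invented counter class: the synchronized methods make each read-modify-write atomic, so two threads cannot interleave between the read and the write, at the cost of acquiring the lock on every call.

```java
public class SafeCounter {
    private int count = 0;

    // synchronized: the increment is atomic with respect to other
    // synchronized methods on the same instance
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        final SafeCounter c = new SafeCounter();
        Runnable task = new Runnable() {
            public void run() {
                for (int i = 0; i < 10000; i++) {
                    c.increment();
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(c.get()); // prints 20000
    }
}
```

Without the synchronized keyword, the two-thread run above could report fewer than 20000 increments, because count++ is a read, an add, and a write that another thread can interleave with.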
Serialization

The Java I/O system provides the capability to transfer objects across a data stream through serialization. This is the mechanism that RMI uses to transfer parameters and return values between the client and the remote object. It is very easy to make a class serializable: simply indicate that it implements the Serializable interface in its class declaration. The Serializable interface is unusual in that it does not define any methods to be implemented; it simply allows the class to be cast to a Serializable object. Directly related to serialization is the Java keyword transient. During serialization, all non-transient attributes of the object are converted into a data stream. During deserialization, an object is created and the non-transient attributes are initialized from the data stream. Any attributes that do not need to be stored persistently, because they are always initialized in some other manner, should be declared transient. This can greatly improve the speed and efficiency of the serialization process. Serialization can be controlled by the class itself by implementing readObject() and/or writeObject() methods. The ObjectInputStream uses reflection to determine whether the object has implemented readObject() or writeObject(). This mechanism can be used to initialize transient attributes when the object is deserialized. In the same manner, ObjectOutputStream will call the object's writeObject() method if it exists. The readObject() method typically calls defaultReadObject() followed by any additional initialization that the class requires. The writeObject() method calls defaultWriteObject() followed by any additional serialization that the class requires.

Note: If the default method is the only method being called, there is no reason to provide these methods.

An example showing the implementation of serialization is shown in Listing 4.9.
Listing 4.9 The Class Must Explicitly Initialize Transient Attributes During Serialization

    import java.io.*;

    public class TransientExample implements Serializable {
        private String str;
        private transient String transString;

        // default constructor
        public TransientExample() {
            str = "hello";
            initTransients();
        }

        // initialize transients
        private void initTransients() {
            transString = "this string is transient";
        }

        // control serialization
        private void readObject( ObjectInputStream stream )
                throws IOException, ClassNotFoundException {
            System.out.println( "TransientExample is being de-serialized" );
            stream.defaultReadObject();
            initTransients();
        }

        // if writeObject() only calls defaultWriteObject()
        // there is no need to provide this function
        private void writeObject( ObjectOutputStream stream ) throws IOException {
            System.out.println( "TransientExample is being serialized" );
            stream.defaultWriteObject();
        }
    }

Garbage Collection

There are very few recommendations directly related to garbage collection (GC) because wasted memory is often the result of sloppy programming. Assuming those types of problems are fixed, there are still a couple of things you can do to improve memory usage. First, assign object references to null when they are no longer being used.

Tip: Don't go overboard with this. If a variable is soon going to be eligible for garbage collection anyway, such as when a method is about to exit, don't bother setting the object reference to null.

Local variables will not be eligible for garbage collection until the method they reside in exits; going out of scope within a method does not place an object in the GC bin. Object references for important resources should also be set to null to speed up reclamation. Second, if you expect your program to be idle, request that the GC run by calling the System.gc method.

Java Compiler and JVM Optimizations

When compiling your program to generate classes for performance testing or deployment, use the -O (optimize) switch.
The -O option directs the compiler to try to generate faster code by inlining static, final, and private methods. The code is optimized strictly for speed of execution, not memory usage. This option may slow down compilation, produce larger class files, and/or make the code more difficult to debug. The -O option implicitly turns on -depend and turns off -g. The -depend option causes the recompilation of any source files on which a class may recursively depend; without it, the Java compiler automatically recompiles only the source files that are directly depended upon. The -O option also informs the compiler that all compiled class files are guaranteed to be delivered and upgraded as a unit, enabling optimizations that might otherwise break binary compatibility. Use this option with discretion.

Note: Peter Haggar seems to indicate that the -O switch for javac from Sun does nothing; however, this is hard to believe. He is probably right in the sense that it does less than you might think it would. For more details, see http://www-106.ibm.com/developerworks/java/library/praxis/pr29.html.

The Java compiler performs constant folding and eliminates dead code, and the javac documentation claims it inlines static, final, and private methods (at its discretion!). The major code optimizations occur in the JVM. Sun's Hotspot JVM combines a dynamic compiler with a JIT framework that turns commonly referenced sections of bytecode into compiled native code. Most inlining occurs during the JIT phase because the JVM knows what is used and inlines only what is necessary. Hotspot even inlines calls across virtual method invocations, something static optimizers cannot do. Hotspot uses generational garbage collection, which speeds up and smooths out program execution. Hotspot and other JIT-based compilers can improve the execution speed of Java code 10 to 50 times.
Note: This does not necessarily mean your Java programs will run 10 to 50 times faster, as I/O waits, synchronization locks, and so on can decrease the effectiveness of the optimizer.

The JVM can be configured to support applications with large memory requirements. The maximum size and the initial size of the garbage collection heap can be set using the JVM -Xmx and -Xms options, respectively. The default value for both of these is 6MB. The stack size per thread can also be controlled using the -Xss option; the default thread stack size is 128KB. When adjusting the memory options of the JVM, confirm that the physical memory space allowed for processes by the operating system is consistent with the Java memory parameters. Otherwise, you may experience big performance degradations due to swapping. In summary, when targeting your software for a production environment, follow these rules: