The purpose of programming guidelines is to bring uniformity, clarity, correctness, and optimization to programs. Following these guidelines increases productivity for both the development and maintenance of software. Common programming errors, such as forgetting to close a file, can be avoided by knowing and following these guidelines. Software maintenance is far easier when the code is readable, logically structured, and well-formatted in the first place.

General Coding Practices

Following standard coding practices is very important for software maintenance. It is very likely that many months after the code was first written, someone will have to fix bugs or add features to the software. Code that is well written makes this task much easier. Java developers have come to expect a common coding style when viewing Java source code. This style was first introduced by Sun and can be viewed on the Internet at http://java.sun.com/docs/codeconv. Virtually all source code found at the java.sun.com Web site follows these coding conventions. This section provides a list of general and specific rules that should be followed when writing production-grade Java programs:
Performance and Optimization Guidelines

There are many facets to optimization: system tuning, database design, optimizing the logical flow within a program, and so on. When doing J2EE development, the performance and scalability of the system can be tuned at the JVM, OS, database, and application server level without any application code change. Industrial-strength J2EE application servers such as WebLogic Server provide many tuning parameters to improve the performance and scalability of J2EE applications. Therefore, do not get caught up in performance optimization early in the project development life cycle. Developers should focus on the functionality of the system. During performance tuning, if the bottlenecks are narrowed down to the application code, then the code needs to be rewritten. This section covers recommended optimization practices for the Java programming language. These practices, once learned, should become part of your everyday coding habits. We recommend coding choices that promote efficiency, but not at the expense of readability. There are a number of obscure techniques for making code execute faster, but often the real benefits are marginal. First, the Java compiler's optimizer can perform a number of these programming tricks for you.

Note: With Java Hotspot, the optimizations are more likely to be done at runtime. See "Java Compiler and JVM Optimizations" later in this chapter for more details on Hotspot.

Although your code optimization may be significant for a few lines of code, it is often very small within the context of the entire program [iii]. Optimization techniques for the following Java elements are discussed in the following sections:
Collections

The three areas to be addressed for Java collections are synchronization, memory allocation, and storage efficiency.

Synchronization

If thread safety is not an issue, don't use collection objects that are synchronized. In particular, use Vector only when you need synchronization. If you do not need synchronization, use an ArrayList. If you later need synchronization, you can still use an ArrayList by passing it to the Collections.synchronizedList method. Other Java collection classes carry the same warning about the unnecessary use of synchronization. For example, HashMap is not synchronized, but Hashtable is. HashMap can be turned into a synchronized structure by calling the Collections.synchronizedMap method.

Memory Allocation

Improper use of the Vector, ArrayList, and HashMap objects can cause unnecessary CPU usage and waste memory. If adding objects to a collection causes it to grow, it does so by copying itself into a newly allocated block of memory. If the collection objects are initially created large enough, these resources are not wasted. In the case of a Vector object, if the capacity increment is not set, its size is doubled each time the object needs to expand. Accepting the default size for a Vector that needs a large amount of storage can be particularly bad.

Note: The default allocation is 10 for both Vector and ArrayList.

The ArrayList object can also be problematic because its capacity increases by about half for each expansion. This reduces the amount of memory wasted, but at the expense of CPU consumption. Vector and ArrayList objects behave efficiently when elements are added to the end, but adding to (or removing from) the middle causes arrays of objects to be copied for each operation. You can control the memory allocation of HashMap (and Hashtable) by setting the initial capacity and the load factor attributes. The default initial capacity for HashMap is 101 and the default load factor is 75%.
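The synchronization and pre-allocation advice above can be sketched as follows. This is a minimal illustration; the helper names and collection sizes are invented for the example, not recommendations:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CollectionTuning {
    // Pre-size the list so no intermediate growth copies are made,
    // and add synchronization only when threads actually share it.
    static List buildList(int expected) {
        List raw = new ArrayList(expected);        // unsynchronized by default
        return Collections.synchronizedList(raw); // thread-safe wrapper on demand
    }

    // Choosing capacity so that capacity * loadFactor >= expected
    // entries means the map never has to rehash while filling up.
    static Map buildMap(int expected) {
        return new HashMap((int) (expected / 0.75f) + 1, 0.75f);
    }

    public static void main(String[] args) {
        List names = buildList(1000);
        names.add("WebLogic");
        Map lookup = buildMap(1000);
        lookup.put("server", "WebLogic");
        System.out.println(names.size() + " " + lookup.size()); // prints "1 1"
    }
}
```

The point of both helpers is the same: size the collection once, up front, instead of paying for repeated grow-and-copy cycles as elements are added.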
Exceeding the capacity of the table causes a new HashMap storage array to be allocated and all entries to be copied into it. Therefore, it is important to allocate a sufficient number of entries for HashMap objects.

Storage Efficiency

If you use exact keyed lookups in a collection, use a HashMap. If you need data sorted sequentially, with keyed lookup access and the capability to find entries based on a partial key, use a TreeMap. If you need indexed lookup, use an ArrayList, a Vector, or an array. Just because a method exists for a type of interface doesn't mean the implementation is efficient. For example, the LinkedList class has a get method with an index value as its only parameter. What appears to be an ordinary indexed lookup is really a search for an object, starting at the beginning of the linked list and counting forward until the index number is reached.

Note: If the index entry is closer to the end, the search starts at the end and counts backward.

Methods

Method modifiers affect the speed of execution for methods. Table 4.2 is a relative comparison of the cost of making method calls based on the modifier type.

Table 4.2. The Method Modifier Directly Affects the Execution Speed
Note: See http://www.protomatter.com/nate/java-optimization/ for more information on method call performance.

As you can see, synchronized methods require considerably more overhead than static methods because of locking. Static, private, and final method calls are not polymorphic and are more easily inlined by the Java compiler or Hotspot.

Note: For more details on Hotspot, see the "Java Compiler and JVM Optimizations" section later in this chapter.

One technique for making a method easier to read is to decompose it into a group of simpler methods. An optimizing compiler may inline these methods because they are private and are used only once. Many of your application methods are never going to be overridden. If this is the case, make the methods final. If necessary, the restriction can be removed later.

Objects

There are several issues regarding performance that you should be aware of when using objects:
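The decomposition and final advice above can be sketched like this. The class and its pricing rules are invented for illustration: the public method is final because nothing is meant to override it, and the single-call-site private helpers are non-polymorphic, so an optimizing compiler or Hotspot is free to inline them.

```java
public class OrderProcessor {
    // final: the call is non-polymorphic, which makes it easier to inline
    public final double process(double price, int quantity) {
        return applyDiscount(subtotal(price, quantity));
    }

    // private helpers used once are prime candidates for inlining
    private double subtotal(double price, int quantity) {
        return price * quantity;
    }

    private double applyDiscount(double amount) {
        // hypothetical rule: half price on orders of 100 or more
        return amount >= 100.0 ? amount * 0.5 : amount;
    }
}
```

Starting with final and relaxing it later, as the text suggests, is an easy change; removing final from a method that subclasses already override is not an option you want to need.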
Arrays

Use arrays instead of Vector or ArrayList objects when the number of elements is not going to change; array access is faster. When copying arrays, use the System.arraycopy method because it is 2 to 10 times faster than copying the elements in a loop. There are a number of small optimization techniques, such as placing an array element into a temporary variable and referencing the temporary variable instead of referencing the array element by its index. Directly referencing a temporary variable can be two or three times faster than referencing an array element through its index; however, the savings may be wasted effort because an optimizing compiler can do this for you. Lastly, know what the Arrays class has to offer: a set of static methods that perform utility functions on arrays. There are four types of functions:
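The array-copy advice above can be sketched as follows; both methods produce the same result, and the speed claim is the text's, with System.arraycopy being the bulk operation it recommends:

```java
public class ArrayCopyDemo {
    // bulk copy via System.arraycopy, the recommended approach
    static int[] fastCopy(int[] src) {
        int[] dest = new int[src.length];
        System.arraycopy(src, 0, dest, 0, src.length);
        return dest;
    }

    // equivalent element-by-element loop, shown only for comparison
    static int[] loopCopy(int[] src) {
        int[] dest = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dest[i] = src[i];
        }
        return dest;
    }

    public static void main(String[] args) {
        int[] data = { 3, 1, 4, 1, 5 };
        System.out.println(java.util.Arrays.equals(fastCopy(data), loopCopy(data))); // prints true
    }
}
```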
Note: The Arrays class, along with the Collections class, is in the java.util package.

Loops

Loops can be a source of poor performance because they may contain other programming inefficiencies repeated many times. The following are some recommendations for using loops:
A straightforward demonstration of the last recommendation is as follows. Suppose you optimize some code and, using a cryptic series of instructions, are able to make part of the code execute twice as fast as before. If these operations consume only 1% of the total processing in the loop, the overall performance gain is only 0.5%. You have not gained much by adding the difficult-to-maintain code. There are a couple of less straightforward techniques for improving performance, but these may be of dubious value. Here are some techniques we do not recommend:
Although the loop variable computation in the second construct is up to two times faster than in the first, it is hard to read. More importantly, the processing in the loop body, not the comparison and index increment, is much more likely to determine how long the loop takes to finish.

Strings

String usage is one of the easiest places to program inefficiently because strings are so easy to use. We have four recommendations for optimizing the usage of strings.

Strings Are Immutable

The most important thing to remember when using String objects is that they are immutable, but StringBuffer objects are not. Therefore, when adding multiple (more than two) strings together or changing the value of a string, use a StringBuffer. For example, if you are adding together three String objects (s1, s2, and s3) and their values are not known at compile time, it is more efficient to use the following construct:

String s4 = new StringBuffer().append(s1).append(s2).append(s3).toString();

instead of

s4 = s1 + s2 + s3;

Note: Use the first construct only if the string concatenation is a performance bottleneck, because the second form is easier to read. There is no advantage to using this technique for adding two String objects because the compiler implicitly uses a StringBuffer object. String buffers should not be used when adding final or literal strings because the compiler concatenates them at compile time.

Creating a String Using Literals

Don't initialize String objects by passing a literal value to the constructor that takes a String object as an argument. Passing a literal value to the constructor creates two String objects: one for the literal on the heap and one created by the constructor. Instead, initialize your String objects by assigning the literal directly.

Pre-Allocate Your String Buffers

String buffers should be allocated large enough to hold the resultant string. The default initial size of a StringBuffer object is 16 characters.
Each time the string buffer has to expand, it doubles its size and copies its contents into a new, larger buffer. You want to avoid this unnecessary overhead; poorly sized long strings are especially expensive to construct.

Switch Statements for Strings

The usual way of comparing a string with many possible values is to create a sieve of if statements and compare each string until a match is found. When comparing four or more strings, it is much faster to use a HashMap and a switch statement. This construct not only provides a dramatic increase in performance, but the execution time is nearly independent of the number of comparisons. Compare this with a sieve of if statements for string variables, where the execution time increases linearly with each additional comparison. An example outlining this technique is given in Listing 4.8.

Listing 4.8 Example of Using a HashMap for a String switch Statement

    // HashMap for mapping String values to int constants
    private static final HashMap rules;

    // Strings for comparison and their int values
    private static final String[] _validOnes = { "valid#1", "valid#2" /* etc. */ };
    public static final int VALID01 = 0;
    public static final int VALID02 = 1;
    // etc.

    // initialize the HashMap
    static {
        rules = new HashMap();
        rules.put(_validOnes[VALID01], new Integer(VALID01));
        rules.put(_validOnes[VALID02], new Integer(VALID02));
        // etc.
    }

    // Switch construct
    // Get String s1 from elsewhere, a method call or an argument
    int val = ((Integer)rules.get(s1)).intValue();
    switch (val) {
        case VALID01: {
            // do something
            break;
        }
        case VALID02: {
            // do something
            break;
        }
        // additional cases follow
    }

Synchronization

Synchronization is a mechanism that allows multithreaded programs to perform multiple sequential operations on shared data structures as a single atomic operation. Synchronization is necessary to prevent another thread from changing the value of the object's attributes while an atomic operation is being performed.
However, synchronization degrades program performance because of the overhead of locking objects and the side effect of placing threads in wait states, preventing other useful work from being done. Although synchronization is simple to set up, it is difficult to get right. That is why Enterprise JavaBeans (EJB) uses a single-threaded model. This section discusses some important considerations when using synchronization:
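A minimal sketch of the trade-off described above, using an invented counter class: the synchronized methods make each read-modify-write atomic, so two threads cannot interleave between the read and the write, at the cost of acquiring the lock on every call.

```java
public class SafeCounter {
    private int count = 0;

    // synchronized: the increment is atomic with respect to other
    // synchronized methods on the same instance
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        final SafeCounter c = new SafeCounter();
        Runnable task = new Runnable() {
            public void run() {
                for (int i = 0; i < 10000; i++) {
                    c.increment();
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(c.get()); // prints 20000
    }
}
```

Without the synchronized keyword, the two-thread run above could report fewer than 20000 increments, because count++ is a read, an add, and a write that another thread can interleave with.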
Serialization

The Java I/O system provides the capability to transfer objects across a data stream through serialization. This is the mechanism that RMI uses to transfer parameters and return values between the client and the remote object. It is very easy to make a class serializable: simply indicate that it implements the Serializable interface in its class declaration. The Serializable interface is unusual in that it does not define any methods to be implemented; it simply allows the class to be cast to a Serializable object. Directly related to serialization is the Java keyword transient. During serialization, all non-transient attributes of the object are converted into a data stream. During deserialization, an object is created and the non-transient attributes are initialized from the data stream. Any attributes that do not need to be stored persistently, because they are always initialized in some other manner, should be declared transient. This can greatly improve the speed and efficiency of the serialization process. Serialization can be controlled by the class itself by implementing readObject() and/or writeObject() methods. The ObjectInputStream uses reflection to determine whether the object has implemented readObject() or writeObject(). This mechanism can be used to initialize transient attributes when the object is deserialized. In the same manner, ObjectOutputStream will call the object's writeObject() method if it exists. The readObject() method typically calls defaultReadObject() followed by any additional initialization that the class requires. The writeObject() method calls defaultWriteObject() followed by any additional serialization that the class requires.

Note: If the default method is the only method being called, there is no reason to provide these methods.

An example showing the implementation of serialization is shown in Listing 4.9.
Listing 4.9 The Class Must Explicitly Initialize Transient Attributes During Serialization

    import java.io.*;

    public class TransientExample implements Serializable {
        private String str;
        private transient String transString;

        // default constructor
        public TransientExample() {
            str = "hello";
            initTransients();
        }

        // initialize transients
        private void initTransients() {
            transString = "this string is transient";
        }

        // control serialization
        private void readObject( ObjectInputStream stream )
                throws IOException, ClassNotFoundException {
            System.out.println( "TransientExample is being de-serialized" );
            stream.defaultReadObject();
            initTransients();
        }

        // if writeObject() only calls defaultWriteObject()
        // there is no need to provide this function
        private void writeObject( ObjectOutputStream stream ) throws IOException {
            System.out.println( "TransientExample is being serialized" );
            stream.defaultWriteObject();
        }
    }

Garbage Collection

There are very few recommendations directly related to garbage collection (GC) because wasted memory is often the result of sloppy programming. Assuming those types of problems are fixed, there are still a couple of things you can do to improve memory usage. First, assign object references to null when they are no longer being used.

Tip: Don't go overboard with this. If a variable is soon going to be eligible for garbage collection anyway, such as when a method is about to exit, don't bother setting the object reference to null.

Local variables will not be eligible for garbage collection until the method they reside in exits; going out of scope within a method does not place an object in the GC bin. Object references for important resources should also be set to null to speed up reclamation. Second, if you expect your program to be idle, request that the GC run by calling the System.gc method.

Java Compiler and JVM Optimizations

When compiling your program to generate classes for performance testing or deployment, use the -O (optimize) switch.
The -O option directs the compiler to try to generate faster code by inlining static, final, and private methods. The code is optimized strictly for speed of execution, not memory usage. This option may slow down compilation, produce larger class files, and/or make the code more difficult to debug. The -O option implicitly turns on -depend and turns off -g. The -depend option causes the recompilation of any source files on which a class may recursively depend; without it, the Java compiler automatically recompiles only the source files that are directly depended upon. The -O option also informs the compiler that all compiled class files are guaranteed to be delivered and upgraded as a unit, enabling optimizations that might otherwise break binary compatibility. Use this option with discretion.

Note: Peter Haggar seems to indicate that the -O switch for javac from Sun does nothing; however, this is hard to believe. He is probably right in the sense that it does less than you might think it would. For more details, see http://www-106.ibm.com/developerworks/java/library/praxis/pr29.html.

The Java compiler performs constant folding and eliminates dead code, and the javac documentation claims it inlines static, final, and private methods (at its discretion!). The major code optimizations occur in the JVM. Sun's Hotspot JVM combines a dynamic compiler with a JIT framework that turns commonly referenced sections of bytecode into compiled native code. Most inlining occurs during the JIT phase because the JVM knows what is used and inlines only what is necessary. Hotspot even inlines calls across virtual method invocations, something static optimizers cannot do. Hotspot uses generational garbage collection, which speeds up and smooths out program execution. Hotspot and other JIT-based compilers can improve the execution speed of Java code 10 to 50 times.
Note: This does not necessarily mean your Java programs will run 10 to 50 times faster, as I/O waits, synchronization locks, and so on can decrease the effectiveness of the optimizer.

The JVM can be configured to support applications with large memory requirements. The maximum size and the initial size of the garbage collection heap can be set using the JVM -Xmx and -Xms options, respectively. The default value for both of these is 6MB. The stack size per thread can also be controlled using the -Xss option; the default thread stack size is 128KB. When adjusting the memory options of the JVM, confirm that the physical memory space allowed for processes by the operating system is consistent with the Java memory parameters. Otherwise, you may experience big performance degradations due to swapping. In summary, when targeting your software for a production environment, follow these rules: