Java Coding Techniques | Performance Analysis for Java™ Websites

Previously, we addressed how to avoid memory leaks by understanding how our application impacts Java's memory management strategy. Likewise, good Java programmers understand how their application interacts with the Java compiler and the Java runtime environment. Programming with the compiler and the runtime environment in mind often makes the application faster and more memory efficient without compromising the application's readability or portability. This section covers some programming techniques for improving application performance. We focus, of course, on those techniques most beneficial to the web application environment.

Minimizing Object Creation

One of the best mechanisms for minimizing time spent in garbage collection is, as mentioned previously, to minimize the amount of garbage created. Let's discuss some common techniques for reducing the number of objects an application creates, along with some common object creation mistakes.

String Usage

If you want to build a string from several fragments, you might concatenate them using code similar to this:

 1 final static String a = "<p>";
 2 final static String b = "</p>";
 3 String c = a + "some text" + b;
 4 c += a + " this is some more text." + b;
 5 c += a + " and another paragraph" + b;
 6 return c;

Taking the code at face value, we expect it to create object c just once. As subsequent statements append string fragments to c, we expect the code to add these fragments to our existing object c.

In Java, however, objects of class String are immutable. That is, once we create a String object, we cannot change it. So, in line 4, we cannot append more characters to the object c we created in line 3. Instead, Java creates a new object under the covers and places the contents of the existing object c plus the newly appended text into this object. This new object becomes c, and the old c object is discarded. The same process repeats for line 5: Java creates a new c object to hold the contents of the old c object plus the newly appended text. The old c object (created just previously in line 4) is discarded. Obviously, whether we know it or not, our code generates a lot of "garbage" objects. Extrapolate this example to a web application receiving 20 user requests per second. If each request generates 50 "garbage" objects, we generate 1,000 discarded objects per second. This directly impacts our garbage collection overhead.
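The rebinding described above can be observed directly. The following is a minimal sketch, not part of the original example; the class and method names are hypothetical. It keeps a second reference to the original object and checks that the append produced a different object while leaving the original untouched.

```java
class StringImmutabilityDemo {
    /** Returns true if += produced a new object and left the original unchanged. */
    static boolean originalSurvivesAppend() {
        String c = "<p>some text</p>";
        String before = c;               // second reference to the same String object
        c += "<p> more text</p>";        // builds a NEW String and rebinds c to it
        // The old object still holds the old contents, and c now points elsewhere.
        return before.equals("<p>some text</p>") && c != before;
    }

    public static void main(String[] args) {
        System.out.println(originalSurvivesAppend()); // prints "true"
    }
}
```

Once the last reference to the old object goes away, that object becomes garbage, which is the per-request cost the paragraph above extrapolates.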

A better solution uses the StringBuffer class. Unlike objects of the String class, these objects are mutable, so they change as their contents change. If we use StringBuffer objects, the previous example looks like this:

 static final String a = "<p>";
 static final String b = "</p>";
 // A StringBuffer cannot be assigned from a String; construct it from the first fragment.
 StringBuffer c = new StringBuffer(a + "some text" + b);
 c.append(a + " this is some more text." + b);
 c.append(a + " and another paragraph" + b);
 return c.toString();

Note that we still use the string concatenation operator, +, within a line. This is safe because the Java compiler internally uses a StringBuffer object just to construct this part of the string. ^[3] If you wish, you can use StringBuffer.append() for all the fragments, but this technique makes the code more difficult to read without improving the performance.

^[3] See the discussion on String and StringBuffer usage at java.lang.StringBuffer . Retrieved January 16, 2002, from the World Wide Web: <http://java.sun.com/products/jdk/1.1/docs/apijava.lang.StringBuffer.html>
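The StringBuffer approach pays off most when the number of fragments is not known in advance, for example when a request builds one paragraph per result row. Below is a minimal sketch under that assumption; the class name ParagraphRenderer and the method render() are hypothetical. It follows the convention described above: + within a statement, append() across statements.

```java
class ParagraphRenderer {
    static final String OPEN = "<p>";
    static final String CLOSE = "</p>";

    /** Wraps each fragment in paragraph tags, accumulating into one mutable buffer. */
    static String render(String[] fragments) {
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < fragments.length; i++) {
            // One append per fragment; the buffer object itself is reused throughout.
            out.append(OPEN + fragments[i] + CLOSE);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(new String[] {"some text", " and another paragraph"}));
    }
}
```

With += on a String, each loop iteration would discard the previous accumulated string; here only the single buffer grows.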

Also, remember that string literals are String objects too. If you use string literals throughout your program, consider declaring them all as static final String constants (as we did in the example above). This declaration assures only one instance of the literal string in your application, and this instance never becomes a candidate for garbage collection. Other techniques for managing string literals include placing them in singleton objects or storing them in resource bundles. (However, if you place them in resource bundles, consider caching them to avoid repeated reads from disk.)
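As an illustration, the sketch below (hypothetical class and method names) checks that a static final String constant refers to one shared instance, whereas explicitly constructing a String produces a separate object. The == comparisons are used here only to inspect object identity; normal code should compare string contents with equals().

```java
class LiteralDemo {
    static final String OPEN = "<p>";

    /** True when the constant and an identical literal are the same object. */
    static boolean constantIsShared() {
        String again = "<p>";            // identical literal elsewhere in the program
        return OPEN == again;            // same interned instance
    }

    /** True when new String(...) creates a distinct object with equal contents. */
    static boolean constructedCopyIsDistinct() {
        String copy = new String("<p>"); // explicit allocation: a separate object
        return copy != OPEN && copy.equals(OPEN);
    }

    public static void main(String[] args) {
        System.out.println(constantIsShared() + " " + constructedCopyIsDistinct());
    }
}
```

The practical benefit of the constant is mostly that every use names the same value in one place; avoiding new String(...) around literals is what prevents the extra allocations.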

Unused Local Variables

Over time, programs change. If you refactor your code, clean up unnecessary local variables. Otherwise, your methods allocate these "dead" variables for each method call. Some compilers recognize unused variables and avoid them automatically. Many compilers, however, do not, so check with your vendor for details. If your compiler does not provide this feature, consider using a profiling tool to identify unused variables. These tools often prove most useful for large or unfamiliar applications.

Pooling

Chapter 2 discussed the specifics of thread pooling. However, you may benefit from pooling other objects as well. First, consider your application's objects, and determine if any fit the following pattern:

Creating the object is expensive, either because of the object's size or initialization complexity.
The application uses this object frequently, but only for brief, limited activities.
After the application finishes with the object, it becomes a candidate for garbage collection.
The object is stateless.
If the object does change with use, the object reinitializes to its original state quickly and easily.
The application never places the object inside an HTTP session or similar long-lived object.

Basically, if you create and destroy the same objects frequently, these objects make good candidates for a pooling strategy. However, pool management also incurs overhead. Before investing in a custom pool strategy, be sure the benefits outweigh the costs.

Of course, using custom pools works the same as using connection pools or a custom thread pool: The application requests an object from the pool, uses it briefly, and returns it to the pool. This pattern allows multiple web application threads to share expensive objects and to reduce the create/delete overhead associated with these objects. If you want a custom object pool, check out the information in Chapter 2 regarding thread pools. Many of the hints and cautions mentioned there apply to any type of pool. Again, let us caution you to first try a small sample application using the pool. Profile this sample to determine if pooling truly provides benefits for your application or if the pool management overhead negates any potential pooling benefits.
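The request/use/return cycle can be sketched as follows. This is a minimal illustration, not a production pool: the class SimplePool, its methods borrow() and release(), and the maxIdle limit are all hypothetical names chosen for this example. It assumes the caller resets an object to its original state before returning it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxIdle;

    SimplePool(Supplier<T> factory, int maxIdle) {
        this.factory = factory;
        this.maxIdle = maxIdle;
    }

    /** Hands out an idle object if one exists; otherwise creates a new one. */
    T borrow() {
        synchronized (idle) {
            T pooled = idle.pollFirst();
            if (pooled != null) {
                return pooled;
            }
        }
        return factory.get();            // expensive creation happens outside the lock
    }

    /** Returns an object to the pool; surplus objects are left for the garbage collector. */
    void release(T obj) {
        synchronized (idle) {
            if (idle.size() < maxIdle) {
                idle.addFirst(obj);
            }
        }
    }

    int idleCount() {
        synchronized (idle) {
            return idle.size();
        }
    }
}
```

A caller might write `StringBuffer sb = pool.borrow(); ... sb.setLength(0); pool.release(sb);`. Only the hand-off in and out of the pool is synchronized; the borrowed object is used without holding any lock.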

Multi-Threading Issues

Java designers built multi-threading into the base language. J2EE also embraces multi-threading: J2EE containers create new threads to handle the current request burden. (As noted in Chapter 2, we suggest setting a hard maximum on the number of threads a container may create to avoid resource congestion.) Multi-threading allows a web server to support multiple, simultaneous requests. However, programmers more familiar with the "one user, one application," single-threaded world of the thick client frequently experience difficulties in moving to a multi-threaded design. Usually, these folks trip over issues such as how to maintain state information properly and how to use shared resources correctly.

To complicate matters, multi-threading issues usually come to light only under load testing. While the developers write their code and perform simple single-user tests against it, everything works beautifully. However, when we run 20 simultaneous users through the same application, things begin to break. (That is, hopefully things begin to break. Some race conditions may only appear after very long runs, and under extremely specific scenarios.)

This section covers the basics of multi-threaded programming for a web application, including good programming practices for servlets and the proper use of synchronization in your application. However, keep in mind that good programming does not always eliminate multi-threading problems. You must also check any third-party code your web application references for multi-threading support. Also, you must verify that your remote systems (content servers, databases, and so on) accept simultaneous traffic. Good multi-threading support in your web application does not guarantee excellent performance. However, improper multi-threading support (particularly the improper use of synchronization) frequently creates performance bottlenecks. Also, an awareness of multi-threading issues usually proves useful to the performance test team. Remember, threading problems only emerge under load, so the performance team may be the first to discover these problems in an application.

Multi-Threaded Servlets

As stated in Chapter 2, multi-threaded servlets give Java web sites the best performance. However, one of the drawbacks of multi-threaded servlets is their inability to maintain state information in instance or class variables. By way of review, recall that the container only creates a single instance of a multi-threaded servlet and allows multiple threads to execute against this instance. Therefore, we cannot keep state information in instance or class variables because all execution threads have access to these variables, and they may try to update them simultaneously. This potentially leads to terrible error states, such as one visitor receiving another visitor's financial or medical information in the returned HTML.

Here's a poorly coded servlet ready to create just such an error state.

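 import java.io.IOException;
 import javax.servlet.ServletException;
 import javax.servlet.http.HttpServlet;
 import javax.servlet.http.HttpServletRequest;
 import javax.servlet.http.HttpServletResponse;

 public class MyServlet extends HttpServlet {
   // Instance variable shared by every request thread: this is the bug.
   private String acctNumber = null;

   public void doGet(HttpServletRequest request, HttpServletResponse response)
       throws ServletException, IOException {
     acctNumber = request.getParameter("account");
     if (acctNumber == null || "".equals(acctNumber)) {
       // do error page
     }
     AccountInfo info = doDatabaseCall(acctNumber);
     request.setAttribute("acctInfo", info);
     // Paths given to ServletContext.getRequestDispatcher() must begin with "/".
     getServletContext().getRequestDispatcher("/foo.jsp").forward(request, response);
   } // doGet
 } // class MyServlet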

Let's walk through this servlet and observe the multi-threading problems it contains. First, a request comes in, and executes through the following statement:

 acctNumber = request.getParameter("account");

(Let's say the new value of acctNumber is "1234" after the first request executes the statement.) Let's assume another request arrives simultaneously and executes through the same line just slightly after the first request, also updating the value in the variable acctNumber. Let's say the second request sets the value of acctNumber to "5678".

As the first thread continues, it executes the database call to get the account info:

 AccountInfo info = doDatabaseCall(acctNumber);

However, because the first thread and the second thread share the same instance of the variable acctNumber, the database call retrieves the account information of the second requester! (The database retrieves account data for account number "5678", the variable's current value, instead of "1234".) The servlet then sends the wrong person's account information to a JSP for display to the first requester!

To solve this problem, move the declaration of acctNumber inside the doGet() method. Method variables are unique to the method and thread on which they were invoked.
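The interleaving described in this walkthrough can be reproduced without a servlet container. The sketch below is a stand-in, not servlet code: Handler, UnsafeHandler, SafeHandler, and seenByFirstRequest() are hypothetical names, and the "database call" is simulated by building a string. Two latches force the exact ordering from the text (the second request stores "5678" before the first request performs its lookup).

```java
import java.util.concurrent.CountDownLatch;

class SharedFieldDemo {
    interface Handler {
        String handle(String accountParam, Runnable beforeLookup);
    }

    /** Mirrors the flawed servlet: state lives in a field shared by all threads. */
    static class UnsafeHandler implements Handler {
        private String acctNumber;
        public String handle(String accountParam, Runnable beforeLookup) {
            acctNumber = accountParam;
            beforeLookup.run();                    // time passes before the lookup
            return "info-for-" + acctNumber;       // simulated database call
        }
    }

    /** Mirrors the fix: state lives in a local variable, private to each thread. */
    static class SafeHandler implements Handler {
        public String handle(String accountParam, Runnable beforeLookup) {
            String acctNumber = accountParam;
            beforeLookup.run();
            return "info-for-" + acctNumber;
        }
    }

    /** Runs request "1234", lets request "5678" store its value, then finishes "1234". */
    static String seenByFirstRequest(Handler handler) {
        CountDownLatch firstStored = new CountDownLatch(1);
        CountDownLatch secondStored = new CountDownLatch(1);
        String[] result = new String[1];
        Thread first = new Thread(() -> result[0] = handler.handle("1234", () -> {
            firstStored.countDown();
            await(secondStored);
        }));
        Thread second = new Thread(() -> {
            await(firstStored);
            handler.handle("5678", secondStored::countDown);
        });
        first.start();
        second.start();
        join(first);
        join(second);
        return result[0];
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

With UnsafeHandler, the first request comes back with the data for "5678"; with SafeHandler, it receives the data for "1234" as intended. In a real application the ordering is left to chance, which is why such defects tend to surface only under load.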

Synchronization

Because many programmers feel uncomfortable with multi-threaded programming, they overuse or misuse the synchronized statement in their servlet code. The synchronized statement allows only one thread to enter a code block at a time. Other threads trying to access the code block must wait until the thread inside the block exits. If you must use the synchronized statement, minimize the code inside the synchronized block. The longer the synchronized code requires to execute, the longer other threads wait to enter the block. Limit the code inside the block to the essential elements requiring synchronization. For example, synchronize moving objects in and out of a pool, but not the use of an object itself after it leaves the pool. (The pool is actually the shared resource, not the object obtained from the pool.)
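As a small illustration of keeping the synchronized block short, consider the sketch below. The HitCounter class and its methods are hypothetical and not taken from the text. The per-request string work operates only on local data and stays outside the block; only the update of the shared map is guarded.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class HitCounter {
    private final Map<String, Integer> hits = new HashMap<>();

    void record(String rawPath) {
        // Local, per-thread work: no lock needed.
        String key = rawPath.trim().toLowerCase(Locale.ROOT);
        synchronized (hits) {
            // Only the shared map update is inside the synchronized block.
            hits.merge(key, 1, Integer::sum);
        }
    }

    int count(String key) {
        synchronized (hits) {
            return hits.getOrDefault(key, 0);
        }
    }

    /** Starts several threads that record the same path and returns the final count. */
    static int hammer(int threads, int perThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.record(" /Home ");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.count("/home");
    }
}
```

Calling `HitCounter.hammer(4, 1000)` yields 4000 because every update to the shared map is guarded, while threads never wait on each other during the string normalization.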

For example, let's assume we create a sharing crisis in our code similar to the one in the previous example:

 public class MyServlet extends HttpServlet {
   private String someString;
   public void service(HttpServletRequest req, HttpServletResponse res) {
     someString = req.getParameter("myParameter");
     ...
   }
 }

This code is not threadsafe (that is, it does not support multi-threading). Just as in our previous example, as multiple threads execute through the service() method, they overwrite each other's value of the variable someString. So, for multiple simultaneous requests, this code behaves in a nondeterministic manner. (Given a fixed set of inputs, we do not get predictable output.)

Sometimes, when the developers discover this problem, they look to synchronization to solve their threading problem. Surprisingly, as a result, we sometimes find this very example solved by marking the entire service() method as synchronized! (See the code sample below.) Of course, this strategy makes the method threadsafe, but it also forces requests through the servlet serially rather than in parallel. In short, this technique turns a multi-threaded servlet into a single-threaded servlet. The synchronized statement acts as a bottleneck, and for web sites of almost any size, this results in terrible performance.

 public class MyServlet extends HttpServlet {
   private String someString;
   public synchronized void service(HttpServletRequest req, HttpServletResponse res) {
     someString = req.getParameter("myParameter");
     ...
   }
 }

As we discussed in the previous section, the best solution removes the instance variables, and places them within the scope of the method, as shown in the code sample below. This eliminates the threading problems.

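 public class MyServlet extends HttpServlet {
   // No synchronized keyword: the method no longer touches shared state.
   public void service(HttpServletRequest req, HttpServletResponse res) {
     // Local variable: each request thread gets its own copy.
     String someString = req.getParameter("myParameter");
     ...
   }
 }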

This works nicely for simple problems, but what if you're calling code that isn't threadsafe, and you need to implement some thread safety yourself? You might use the following synchronization technique to protect a class ( MyClass in the following example) that isn't threadsafe.

 1 public class MyServlet extends HttpServlet {
 2   public void service(HttpServletRequest req, HttpServletResponse res) {
 3     String stuff;
 4     synchronized(this) {
 5       MyClass myObject = MyClass.getObject();
 6       myObject.performTask();
 7       stuff = myObject.getSomething();
 8     }
 9     ...
 10   }
 11 }

This protects the instance of the single-threaded class (myObject) and allows the service() method to use the contents of the variable stuff after line 8. However, synchronized(this) locks the servlet instance itself. While one thread is inside the block, no other thread can enter any other method or block of MyServlet that synchronizes on the same instance. If many threads use this code block, we in effect make this servlet single-threaded.

The following solution provides a better alternative with minimal synchronization:

 public class MyServlet extends HttpServlet {
   private final Object lock = new Object();
   public void service(HttpServletRequest req, HttpServletResponse res) {
     String stuff;

     synchronized(lock) {
       MyClass myObject = MyClass.getObject();
       myObject.performTask();
       stuff = myObject.getSomething();
     }
     ...
   }
 }

Here we provide a shared object for synchronization rather than the servlet instance itself. This allows other threads to continue executing other methods of the MyServlet servlet, including any that synchronize on the servlet instance, while another thread is inside the synchronized block. ^[4]

^[4] Double-checked locking inside a singleton requires more consideration. See http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html for more details.
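The difference between the two locking choices shown above can be checked with Thread.holdsLock(), which reports whether the current thread owns a given object's monitor. The sketch below is illustrative only; the class LockScopeDemo and its methods are hypothetical names.

```java
class LockScopeDemo {
    private final Object lock = new Object();

    /** Returns {holds private lock, holds instance monitor} while inside synchronized(lock). */
    boolean[] insidePrivateLock() {
        synchronized (lock) {
            return new boolean[] { Thread.holdsLock(lock), Thread.holdsLock(this) };
        }
    }

    /** Returns {holds private lock, holds instance monitor} while inside synchronized(this). */
    boolean[] insideThisLock() {
        synchronized (this) {
            return new boolean[] { Thread.holdsLock(lock), Thread.holdsLock(this) };
        }
    }

    public static void main(String[] args) {
        LockScopeDemo demo = new LockScopeDemo();
        boolean[] p = demo.insidePrivateLock();
        boolean[] t = demo.insideThisLock();
        System.out.println(p[0] + " " + p[1]); // prints "true false"
        System.out.println(t[0] + " " + t[1]); // prints "false true"
    }
}
```

Inside synchronized(lock), the instance monitor stays free, so code elsewhere that synchronizes on the servlet instance is not held up. The private lock is still a serialization point for the guarded block itself, so the block should stay as short as possible.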