5.3 Synchronization


My shelves are overflowing with books, including many duplicate books, out-of-date books, and books I haven't looked at for 10 years and probably never will again. Over the years, these books have cost me tens of thousands of dollars, maybe more, to acquire. By contrast, two blocks down the street from my apartment, you'll find the Central Brooklyn Public Library. Its shelves are also overflowing with books, and over its 150 years, it's spent millions on its collection. But the difference is that its books are shared among all the residents of Brooklyn, and consequently the books have very high turnover. Most books in the collection are used several times a year. Although the public library spends a lot more money buying and storing books than I do, the cost per page read is much lower at the library than for my personal shelves. That's the advantage of a shared resource.

Of course, there are disadvantages to shared resources, too. If I need a book from the library, I have to walk over there. I have to find the book I'm looking for on the shelves. I have to stand in line to check the book out, or else I have to use it right there in the library rather than bringing it home with me. Sometimes, somebody else has checked the book out, and I have to fill out a reservation slip requesting that the book be saved for me when it's returned. And I can't write notes in the margins, highlight paragraphs, or tear pages out to paste on my bulletin board. (Well, I can, but if I do, it significantly reduces the usefulness of the book for future borrowers; and if the library catches me, I may lose my borrowing privileges.) There's a significant time and convenience penalty associated with borrowing a book from the library rather than purchasing my own copy, but it does save me money and storage space.

A thread is like a borrower at a library; the thread borrows from a central pool of resources. Threads make programs more efficient by sharing memory, file handles, sockets, and other resources. As long as two threads don't want to use the same resource at the same time, a multithreaded program is much more efficient than the multiprocess alternative, in which each process has to keep its own copy of every resource. The downside of a multithreaded program is that if two threads want the same resource at the same time, one of them will have to wait for the other to finish. If one of them doesn't wait, the resource may get corrupted.

Let's look at a specific example. Consider the run( ) method of Example 5-1 and Example 5-2. As previously mentioned, the method builds the result as a String, and then prints the String on the console using one call to System.out.println( ). The output looks like this:

 DigestThread.java: 69 101 80 -94 -98 -113 29 -52 -124 -121 -38 -82 39 -4 8 -38 119 96 -37 -99
 DigestRunnable.java: 61 116 -102 -120 97 90 53 37 -14 111 -60 -86 -112 124 -54 111 114 -42 -36 -111
 DigestThread.class: -62 -99 -39 -19 109 10 -91 25 -54 -128 -101 17 13 -66 119 25 -114 62 -21 121
 DigestRunnable.class: 73 15 7 -122 96 66 -107 -45 69 -36 86 -43 103 -104 25 -128 -97 60 14 -76

Four threads run in parallel to produce this output. Each writes one line to the console. The order in which the lines are written is unpredictable because thread scheduling is unpredictable, but each line is written as a unified whole. Suppose, however, we used this variation of the run( ) method, which, rather than storing intermediate parts of the result in the String variable result, simply prints them on the console as they become available:

  public void run( ) {
    try {
      FileInputStream in = new FileInputStream(input);
      MessageDigest sha = MessageDigest.getInstance("SHA");
      DigestInputStream din = new DigestInputStream(in, sha);
      int b;
      while ((b = din.read( )) != -1) ;
      din.close( );
      byte[] digest = sha.digest( );
      System.out.print(input + ": ");
      for (int i = 0; i < digest.length; i++) {
        System.out.print(digest[i] + " ");
      }
      System.out.println( );
    }
    catch (IOException ex) {
      System.err.println(ex);
    }
    catch (NoSuchAlgorithmException ex) {
      System.err.println(ex);
    }
  }

When you run the program on the same input, the output looks something like this:

 DigestRunnable.class: 73 15 7 -122 96 66 -107 -45 69 -36 86 -43 103 -104 25 -128 DigestRunnable.java: DigestThread.class: DigestThread.java: 61 -62 69 116 -99 101 -102 -39 80 -120 -19 -94 97 109 -98 90 -97 10 -113 53  60 -91 29 37 14 25 -52 -14 -76 -54 -124 111 -128 -121 -60 -101 -38 -86 17 -82 -112 13 39 124 -66 -4 -54 119 8 111 25 -38  114 -114 119 -42 62 96 -36 -21 -37 -111 121 -99 

The digests of the different files are all mixed up! There's no telling which number belongs to which digest. Clearly, this is a problem.

The reason this mix-up occurs is that System.out is shared between the four different threads. When one thread starts writing to the console through several System.out.print( ) statements, it may not finish all its writes before another thread breaks in and starts writing its output. The exact order in which one thread preempts the other threads is indeterminate. You'll probably see slightly different output every time you run this program.

We need a way to assign exclusive access to a shared resource to one thread for a specific series of statements. In this example, that shared resource is System.out, and the statements that need exclusive access are:

  System.out.print(input + ": ");
  for (int i = 0; i < digest.length; i++) {
    System.out.print(digest[i] + " ");
  }
  System.out.println( );

5.3.1 Synchronized Blocks

Java's means of assigning exclusive access to an object is the synchronized keyword. To indicate that these five lines of code should be executed together, wrap them in a synchronized block that synchronizes on the System.out object, like this:

  synchronized (System.out) {
    System.out.print(input + ": ");
    for (int i = 0; i < digest.length; i++) {
      System.out.print(digest[i] + " ");
    }
    System.out.println( );
  }

Once one thread starts printing out the values, all other threads will have to stop and wait for it to finish before they can print out their values. Synchronization is only a partial lock on an object. Other threads can still use the synchronized object if they do so blindly, without attempting to synchronize on the object. For instance, in this case, there's nothing to prevent an unrelated thread from printing on System.out if it doesn't also try to synchronize on System.out. Java provides no means to stop all other threads from using a shared resource. It can only prevent other threads that synchronize on the same object from using the shared resource.

In fact, the PrintStream class internally synchronizes most methods on the PrintStream object, System.out in this example. In other words, every other thread that calls System.out.println( ) will be synchronized on System.out and will have to wait for this code to finish. PrintStream is unique in this respect. Most other OutputStream subclasses do not synchronize themselves.
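To see the synchronized block at work without tying up the real console, here is a minimal sketch. It assumes a PrintStream wrapped around a ByteArrayOutputStream as a stand-in for System.out, and the class name SynchronizedPrintDemo, the method names, and the file labels are all invented for illustration. Every thread writes its line with 21 separate print calls, but because each thread synchronizes on the same stream, no line should be split by another thread's output.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Arrays;

class SynchronizedPrintDemo {

  // Writes a label and a series of numbers with separate print calls,
  // all inside one synchronized block on the shared stream.
  static void printLine(PrintStream out, String label, int[] values) {
    synchronized (out) {
      out.print(label + ": ");
      for (int i = 0; i < values.length; i++) {
        out.print(values[i] + " ");
      }
      out.println();
    }
  }

  // Starts one thread per label, waits for all of them to finish, and
  // returns everything that was written to the shared stream.
  static String runDemo(String[] labels) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    PrintStream shared = new PrintStream(buffer); // stand-in for System.out
    Thread[] threads = new Thread[labels.length];
    for (int t = 0; t < labels.length; t++) {
      final String label = labels[t];
      final int[] values = new int[20];
      Arrays.fill(values, t); // thread t prints the number t twenty times
      threads[t] = new Thread(() -> printLine(shared, label, values));
      threads[t].start();
    }
    for (Thread thread : threads) {
      try {
        thread.join();
      } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", ex);
      }
    }
    shared.flush();
    return buffer.toString();
  }

  public static void main(String[] args) {
    String[] labels = {"a.java", "b.java", "a.class", "b.class"};
    System.out.print(runDemo(labels));
  }
}
```

The order of the four lines still varies from run to run. Synchronization keeps each line whole; it does nothing to control which thread goes first.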

Synchronization must be considered any time multiple threads share resources. These threads may be instances of the same Thread subclass or use the same Runnable class, or they may be instances of completely different classes. The key is the resources they share, not what classes they are. In Java, all resources are represented by objects that are instances of particular classes. Synchronization becomes an issue only when two threads both possess references to the same object. In the previous example, the problem was that several threads had access to the same PrintStream object, System.out. In this case, it was a static class variable that led to the conflict. However, instance variables can also have problems.

For example, suppose your web server keeps a log file. The log file may be represented by a class like the one shown in Example 5-12. This class itself doesn't use multiple threads. However, if the web server uses multiple threads to handle incoming connections, then each of those threads will need access to the same log file and consequently to the same LogFile object.

Example 5-12. LogFile
  import java.io.*;
  import java.util.*;

  public class LogFile {

    private Writer out;

    public LogFile(File f) throws IOException {
      FileWriter fw = new FileWriter(f);
      this.out = new BufferedWriter(fw);
    }

    public void writeEntry(String message) throws IOException {
      Date d = new Date( );
      out.write(d.toString( ));
      out.write('\t');
      out.write(message);
      out.write("\r\n");
    }

    public void close( ) throws IOException {
      out.flush( );
      out.close( );
    }

    protected void finalize( ) {
      try {
        this.close( );
      }
      catch (IOException ex) {
      }
    }
  }

In this class, the writeEntry( ) method finds the current date and time, then writes into the underlying file using four separate invocations of out.write( ). A problem occurs if two or more threads each have a reference to the same LogFile object and one of those threads interrupts another in the process of writing the data. One thread may write the date and a tab, then the next thread might write three complete entries; then, the first thread could write the message, a carriage return, and a linefeed. The solution, once again, is synchronization. However, here there are two good choices for which object to synchronize on. The first choice is to synchronize on the Writer object out. For example:

  public void writeEntry(String message) throws IOException {
    synchronized (out) {
      Date d = new Date( );
      out.write(d.toString( ));
      out.write('\t');
      out.write(message);
      out.write("\r\n");
    }
  }

This works because all the threads that use this LogFile object also use the same out object that's part of that LogFile. It doesn't matter that out is private. Although it is used by the other threads and objects, it's referenced only within the LogFile class. Furthermore, although we're synchronizing here on the out object, it's the writeEntry( ) method that needs to be protected from interruption. The Writer classes all have their own internal synchronization, which protects one thread from interfering with a write( ) method in another thread. (This is not true of input and output streams, with the exception of PrintStream. It is possible for a write to an output stream to be interrupted by another thread.) Each Writer class has a lock field that specifies the object on which writes to that writer synchronize.
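The lock field is protected, so a Writer subclass can synchronize on it directly. The following is only a sketch of that idea: EntryWriter and EntryWriterDemo are hypothetical classes invented here, and a StringWriter stands in for the log file. The subclass extends FilterWriter, whose constructor hands the wrapped writer to Writer(Object lock), so the inherited lock ends up being the wrapped writer itself.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical Writer subclass that groups several writes into one unit
// by synchronizing on the protected lock field inherited from Writer.
class EntryWriter extends FilterWriter {

  EntryWriter(Writer out) {
    super(out); // FilterWriter passes out along as the lock object
  }

  void writeEntry(String label, String message) throws IOException {
    synchronized (lock) {
      out.write(label);
      out.write('\t');
      out.write(message);
      out.write("\r\n");
    }
  }

  // Reports whether this writer synchronizes on the given object.
  boolean locksOn(Object candidate) {
    return lock == candidate;
  }
}

class EntryWriterDemo {

  // Several threads write entries through one EntryWriter; returns the text.
  static String runDemo(int threadCount, int entriesPerThread) {
    StringWriter sink = new StringWriter(); // stand-in for a log file
    EntryWriter writer = new EntryWriter(sink);
    Thread[] threads = new Thread[threadCount];
    for (int t = 0; t < threadCount; t++) {
      final String label = "thread-" + t;
      threads[t] = new Thread(() -> {
        try {
          for (int i = 0; i < entriesPerThread; i++) {
            writer.writeEntry(label, "entry " + i);
          }
        } catch (IOException ex) {
          throw new UncheckedIOException(ex);
        }
      });
      threads[t].start();
    }
    for (Thread thread : threads) {
      try {
        thread.join();
      } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", ex);
      }
    }
    return sink.toString();
  }

  public static void main(String[] args) {
    System.out.print(runDemo(3, 2));
  }
}
```

Because every entry goes through writeEntry( ), and writeEntry( ) always takes the same lock, each entry should come out as one unbroken line no matter how the threads are scheduled.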

The second possibility is to synchronize on the LogFile object itself. This is simple enough to arrange with the this keyword. For example:

  public void writeEntry(String message) throws IOException {
    synchronized (this) {
      Date d = new Date( );
      out.write(d.toString( ));
      out.write('\t');
      out.write(message);
      out.write("\r\n");
    }
  }

5.3.2 Synchronized Methods

Since synchronizing the entire method body on the object itself is such a common thing to do, Java provides a shortcut. You can synchronize an entire method on the current object (the this reference) by adding the synchronized modifier to the method declaration. For example:

  public synchronized void writeEntry(String message)
   throws IOException {
    Date d = new Date( );
    out.write(d.toString( ));
    out.write('\t');
    out.write(message);
    out.write("\r\n");
  }

Simply adding the synchronized modifier to all methods is not a catchall solution for synchronization problems. For one thing, it exacts a severe performance penalty in many VMs (though more recent VMs have improved greatly in this respect), potentially slowing down your code by a factor of three or more. Second, it dramatically increases the chances of deadlock. Third, and most importantly, it's not always the object itself you need to protect from simultaneous modification or access, and synchronizing on the instance of the method's class may not protect the object you really need to protect. For instance, in this example, what we're really trying to prevent is two threads simultaneously writing onto out. If some other class had a reference to out completely unrelated to the LogFile, this attempt would fail. However, in this example, synchronizing on the LogFile object is sufficient because out is a private instance variable. Since we never expose a reference to this object, there's no way for any other object to invoke its methods except through the LogFile class. Therefore, synchronizing on the LogFile object has the same effect as synchronizing on out.
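The difference between locking the enclosing object and locking the resource it wraps can be checked directly with Thread.holdsLock( ). This is a small sketch rather than part of the LogFile example: LockOwnerDemo and its method names are invented, and the Writer is passed in from outside precisely so that it could, in principle, be shared with unrelated code.

```java
import java.io.StringWriter;
import java.io.Writer;

class LockOwnerDemo {

  private final Writer out;

  LockOwnerDemo(Writer out) {
    this.out = out;
  }

  // A synchronized method locks this object, not the writer it uses.
  synchronized boolean[] locksHeldInSynchronizedMethod() {
    return new boolean[] {Thread.holdsLock(this), Thread.holdsLock(out)};
  }

  // A synchronized block on out locks the writer, not this object.
  boolean[] locksHeldInBlockOnOut() {
    synchronized (out) {
      return new boolean[] {Thread.holdsLock(this), Thread.holdsLock(out)};
    }
  }

  public static void main(String[] args) {
    LockOwnerDemo demo = new LockOwnerDemo(new StringWriter());
    boolean[] inMethod = demo.locksHeldInSynchronizedMethod();
    boolean[] inBlock = demo.locksHeldInBlockOnOut();
    System.out.println("synchronized method: this=" + inMethod[0] + " out=" + inMethod[1]);
    System.out.println("synchronized (out):  this=" + inBlock[0] + " out=" + inBlock[1]);
  }
}
```

If a second object held a reference to the same Writer, its own synchronized methods would lock that second object, so the two could still write at the same time. Only code that synchronizes on the shared writer is guaranteed to take turns.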

5.3.3 Alternatives to Synchronization

Synchronization is not always the best solution to the problem of inconsistent behavior caused by thread scheduling. There are a number of techniques that avoid the need for synchronization entirely. The first is to use local variables instead of fields wherever possible. Local variables do not have synchronization problems. Every time a method is entered, the virtual machine creates a completely new set of local variables for the method. These variables are invisible from outside the method and are destroyed when the method exits. As a result, it's impossible for one local variable to be used in two different threads. Every thread has its own separate set of local variables.

Method arguments of primitive types are also safe from modification in separate threads because Java passes arguments by value rather than by reference. A corollary of this is that methods such as Math.sqrt( ) that simply take zero or more primitive data type arguments, perform some calculation, and return a value without ever interacting with the fields of any class are inherently thread-safe. These methods often either are or should be declared static.
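As a rough illustration of both points, the sketch below calls a static method that touches nothing but its primitive argument and its own local variables from several threads at once. The class LocalVariableDemo and its methods are invented for this example. No synchronization is used inside sumTo( ) because each call works on its own copies of n, total, and i.

```java
class LocalVariableDemo {

  // Uses only a primitive argument and local variables, so every call,
  // on whatever thread, works with its own private copies.
  static long sumTo(int n) {
    long total = 0;
    for (int i = 1; i <= n; i++) {
      total += i;
    }
    return total;
  }

  // Calls sumTo from several threads at once and collects the results.
  static long[] runDemo(int threadCount, int n) {
    long[] results = new long[threadCount];
    Thread[] threads = new Thread[threadCount];
    for (int t = 0; t < threadCount; t++) {
      final int slot = t; // each thread writes only to its own slot
      threads[t] = new Thread(() -> results[slot] = sumTo(n));
      threads[t].start();
    }
    for (Thread thread : threads) {
      try {
        thread.join();
      } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", ex);
      }
    }
    return results;
  }

  public static void main(String[] args) {
    for (long result : runDemo(4, 1000)) {
      System.out.println(result);
    }
  }
}
```

Had total been a static or instance field instead of a local variable, the threads would have been updating one shared value, and the results could no longer be trusted without synchronization.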

Method arguments of object types are a little trickier because the actual argument passed by value is a reference to the object. Suppose, for example, you pass a reference to an array into a sort( ) method. While the method is sorting the array, there's nothing to stop some other thread that also has a reference to the array from changing the values in the array.
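One common way to sidestep the problem, offered here as a sketch rather than as the only answer, is to copy the array into a local variable and sort the copy. DefensiveSort and sortedCopy( ) are names made up for this example. Once the copy exists, no other thread has a reference to it, so the sort itself can't be disturbed. The copy step is still unsynchronized, so a thread that changes the original during the copy could leave the copy holding a mix of old and new values.

```java
import java.util.Arrays;

class DefensiveSort {

  // Sorts a private copy of the array and leaves the caller's array alone.
  static int[] sortedCopy(int[] values) {
    int[] copy = values.clone(); // only this method call can see copy
    Arrays.sort(copy);
    return copy;
  }

  public static void main(String[] args) {
    int[] original = {3, 1, 2};
    int[] sorted = sortedCopy(original);
    System.out.println(Arrays.toString(original) + " -> " + Arrays.toString(sorted));
  }
}
```

The price is the extra memory and time for the copy, which is usually a fair trade for small arrays but may not be for very large ones.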

String arguments are safe because they're immutable; that is, once a String object has been created, it cannot be changed by any thread. An immutable object never changes state. The values of its fields are set once when the constructor runs and never altered thereafter. StringBuffer arguments are not safe because they're not immutable; they can be changed after they're created.

A constructor normally does not have to worry about issues of thread safety. Until the constructor returns, no thread has a reference to the object, so it's impossible for two threads to have a reference to the object. (The most likely issue is if a constructor depends on another object in another thread that may change while the constructor runs, but that's uncommon. There's also a potential problem if a constructor somehow passes a reference to the object it's creating into a different thread, but this is also uncommon.)

You can take advantage of immutability in your own classes. It's often the easiest way to make a class thread-safe, often much easier than determining exactly which methods or code blocks to synchronize. To make an object immutable, simply declare all its fields private and don't write any methods that can change them. A lot of classes in the core Java library are immutable, for instance, java.lang.String, java.lang.Integer, java.lang.Double, and many more. This makes these classes less useful for some purposes, but it does make them a lot more thread-safe.
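Here is what such a class might look like. Endpoint is a hypothetical class invented for this sketch. It follows the recipe above and goes one step further by also marking the class and its fields final, which is my addition rather than something the recipe requires. A method that appears to change the object actually returns a new one.

```java
// Hypothetical immutable value class: private fields, no setters.
final class Endpoint {

  private final String host;
  private final int port;

  Endpoint(String host, int port) {
    this.host = host;
    this.port = port;
  }

  String getHost() {
    return host;
  }

  int getPort() {
    return port;
  }

  // Returns a new object instead of changing this one.
  Endpoint withPort(int newPort) {
    return new Endpoint(host, newPort);
  }

  public static void main(String[] args) {
    Endpoint web = new Endpoint("www.example.com", 80);
    Endpoint alternate = web.withPort(8080);
    System.out.println(web.getHost() + ":" + web.getPort());
    System.out.println(alternate.getHost() + ":" + alternate.getPort());
  }
}
```

Any number of threads can hold a reference to the same Endpoint and read it freely, since nothing any of them does can change what the others see.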

A third technique is to use a thread-unsafe class but only as a private field of a class that is thread-safe. As long as the containing class accesses the unsafe class only in a thread-safe fashion and as long as it never lets a reference to the private field leak out into another object, the class is safe. An example of this technique might be a web server that uses an unsynchronized LogFile class but gives each separate thread its own separate log so no resources are shared between the individual threads.
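As a sketch of the first half of that idea, the hypothetical RequestHistory class below keeps an ArrayList, which is not thread-safe by itself, in a private field and touches it only from synchronized methods. The class and method names are invented for illustration. The reference to the list never leaves the class; callers who want the contents get a copy.

```java
import java.util.ArrayList;
import java.util.List;

class RequestHistory {

  // ArrayList is not thread-safe; it is reachable only through this class.
  private final List<String> entries = new ArrayList<>();

  synchronized void add(String entry) {
    entries.add(entry);
  }

  synchronized int size() {
    return entries.size();
  }

  // Hands out a copy so the private list itself never leaks.
  synchronized List<String> snapshot() {
    return new ArrayList<>(entries);
  }

  // Several threads add entries at once; returns the final count.
  static int runDemo(int threadCount, int entriesPerThread) {
    RequestHistory history = new RequestHistory();
    Thread[] threads = new Thread[threadCount];
    for (int t = 0; t < threadCount; t++) {
      final String label = "thread-" + t;
      threads[t] = new Thread(() -> {
        for (int i = 0; i < entriesPerThread; i++) {
          history.add(label + " request " + i);
        }
      });
      threads[t].start();
    }
    for (Thread thread : threads) {
      try {
        thread.join();
      } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", ex);
      }
    }
    return history.size();
  }

  public static void main(String[] args) {
    System.out.println(runDemo(4, 1000));
  }
}
```

Returning entries directly from snapshot( ) would undo the whole arrangement, because the caller could then modify the list without going through the synchronized methods.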

Java Network Programming, Third Edition
ISBN: 0596007213
EAN: 2147483647
Year: 2003
Pages: 164
