Object Streams

   


Using a fixed-length record format is a good choice if you need to store data of the same type. However, objects that you create in an object-oriented program are rarely all of the same type. For example, you may have an array called staff that is nominally an array of Employee records but contains objects that are actually instances of a subclass such as Manager.

If we want to save files that contain this kind of information, we must first save the type of each object and then the data that define the current state of the object. When we read this information back from a file, we must

  • Read the object type;

  • Create a blank object of that type;

  • Fill it with the data that we stored in the file.

It is entirely possible (if very tedious) to do this by hand, and in the first edition of this book we did exactly this. However, Sun Microsystems developed a powerful mechanism that allows this to be done with much less effort. As you will soon see, this mechanism, called object serialization, almost completely automates what was previously a very tedious process. (You will see later in this chapter where the term "serialization" comes from.)

Storing Objects of Variable Type

To save object data, you first need to open an ObjectOutputStream object:

 ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("employee.dat")); 

Now, to save an object, you simply use the writeObject method of the ObjectOutputStream class as in the following fragment:

 Employee harry = new Employee("Harry Hacker", 50000, 1989, 10, 1);
 Manager boss = new Manager("Carl Cracker", 80000, 1987, 12, 15);
 out.writeObject(harry);
 out.writeObject(boss);

To read the objects back in, first get an ObjectInputStream object:

 ObjectInputStream in = new ObjectInputStream(new FileInputStream("employee.dat")); 

Then, retrieve the objects in the same order in which they were written, using the readObject method.

 Employee e1 = (Employee) in.readObject();
 Employee e2 = (Employee) in.readObject();

When reading back objects, you must carefully keep track of the number of objects that were saved, their order, and their types. Each call to readObject reads in another object of the type Object. You therefore will need to cast it to its correct type.

If you don't need the exact type or you don't remember it, then you can cast it to any superclass or even leave it as type Object. For example, e2 is an Employee object variable even though it actually refers to a Manager object. If you need to dynamically query the type of the object, you can use the getClass method that we described in Chapter 5.
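The following sketch illustrates the point about casts. It is a minimal round trip through an in-memory byte array rather than employee.dat, and the classes Person and Boss are hypothetical stand-ins for Employee and Manager.

```java
import java.io.*;

class TypeQueryDemo
{
   // Hypothetical stand-ins for Employee and Manager
   static class Person implements Serializable
   {
      String name;
      Person(String n) { name = n; }
   }

   static class Boss extends Person
   {
      Boss(String n) { super(n); }
   }

   // Writes one object to a byte array and reads it back as a plain Object
   static Object roundTrip(Object obj)
   {
      try
      {
         ByteArrayOutputStream bytes = new ByteArrayOutputStream();
         try (ObjectOutputStream out = new ObjectOutputStream(bytes))
         {
            out.writeObject(obj);
         }
         try (ObjectInputStream in = new ObjectInputStream(
               new ByteArrayInputStream(bytes.toByteArray())))
         {
            return in.readObject();
         }
      }
      catch (IOException | ClassNotFoundException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      Object restored = roundTrip(new Boss("Carl Cracker"));
      Person p = (Person) restored;   // casting to a superclass is enough here
      System.out.println(p.name);
      System.out.println(restored.getClass().getName());   // still the Boss class
   }
}
```

The cast to Person succeeds because a Boss is a Person, and getClass still reports the subclass that was actually written.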

You can write and read only objects with the writeObject/readObject methods. For primitive type values, you use methods such as writeInt/readInt or writeDouble/readDouble. (The object stream classes implement the DataInput/DataOutput interfaces.) Of course, numbers inside objects (such as the salary field of an Employee object) are saved and restored automatically. Recall that, in Java, strings and arrays are objects and can, therefore, be processed with the writeObject/readObject methods.
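As a small illustration of mixing the two families of methods, the sketch below writes two primitive values, a string, and an array to one object stream and reads them back in the same order. The in-memory byte array and the class name MixedStreamDemo are assumptions made to keep the example self-contained.

```java
import java.io.*;
import java.util.Arrays;

class MixedStreamDemo
{
   // Writes an int, a double, a String, and an int[] to a byte array
   static byte[] write()
   {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes))
      {
         out.writeInt(3);                       // primitive: DataOutput method
         out.writeDouble(50000.0);              // primitive: DataOutput method
         out.writeObject("Harry Hacker");       // strings are objects
         out.writeObject(new int[] { 1, 2 });   // arrays are objects, too
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
      return bytes.toByteArray();
   }

   // Reads the values back in the order in which they were written
   static String read(byte[] data)
   {
      try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data)))
      {
         int count = in.readInt();
         double salary = in.readDouble();
         String name = (String) in.readObject();
         int[] values = (int[]) in.readObject();
         return count + " " + salary + " " + name + " " + Arrays.toString(values);
      }
      catch (IOException | ClassNotFoundException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(read(write()));   // prints 3 50000.0 Harry Hacker [1, 2]
   }
}
```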

There is, however, one change you need to make to any class that you want to save and restore in an object stream. The class must implement the Serializable interface:

 class Employee implements Serializable { . . . } 

The Serializable interface has no methods, so you don't need to change your classes in any way. In this regard, it is similar to the Cloneable interface that we also discussed in Chapter 6. However, to make a class cloneable, you still had to override the clone method of the Object class. To make a class serializable, you do not need to do anything else.
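To see what the marker interface buys you, consider this minimal sketch. The classes Plain and Marked are hypothetical; they are identical except for the implements clause. Writing an object of a class without the marker fails with a NotSerializableException.

```java
import java.io.*;

class MarkerInterfaceDemo
{
   static class Plain { int value = 1; }                            // no marker interface
   static class Marked implements Serializable { int value = 1; }   // marker only

   // Returns true if the object can be written to an object stream
   static boolean canSerialize(Object obj)
   {
      try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream()))
      {
         out.writeObject(obj);
         return true;
      }
      catch (NotSerializableException e)
      {
         return false;
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(canSerialize(new Marked()));   // prints true
      System.out.println(canSerialize(new Plain()));    // prints false
   }
}
```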

Example 12-4 is a test program that writes an array containing two employees and one manager to disk and then restores it. Writing an array is done with a single operation:

 Employee[] staff = new Employee[3];
 . . .
 out.writeObject(staff);

Similarly, reading in the result is done with a single operation. However, we must apply a cast to the return value of the readObject method:

 Employee[] newStaff = (Employee[]) in.readObject(); 

Once the information is restored, we print each employee because you can easily distinguish employee and manager objects by their different toString results. This should convince you that we did restore the correct types.

Example 12-4. ObjectFileTest.java

import java.io.*;
import java.util.*;

class ObjectFileTest
{
   public static void main(String[] args)
   {
      Manager boss = new Manager("Carl Cracker", 80000, 1987, 12, 15);
      boss.setBonus(5000);

      Employee[] staff = new Employee[3];

      staff[0] = boss;
      staff[1] = new Employee("Harry Hacker", 50000, 1989, 10, 1);
      staff[2] = new Employee("Tony Tester", 40000, 1990, 3, 15);

      try
      {
         // save all employee records to the file employee.dat
         ObjectOutputStream out = new ObjectOutputStream(
            new FileOutputStream("employee.dat"));
         out.writeObject(staff);
         out.close();

         // retrieve all records into a new array
         ObjectInputStream in = new ObjectInputStream(
            new FileInputStream("employee.dat"));
         Employee[] newStaff = (Employee[]) in.readObject();
         in.close();

         // print the newly read employee records
         for (Employee e : newStaff)
            System.out.println(e);
      }
      catch (Exception e)
      {
         e.printStackTrace();
      }
   }
}

class Employee implements Serializable
{
   public Employee() {}

   public Employee(String n, double s, int year, int month, int day)
   {
      name = n;
      salary = s;
      // GregorianCalendar months are zero-based
      GregorianCalendar calendar = new GregorianCalendar(year, month - 1, day);
      hireDay = calendar.getTime();
   }

   public String getName()
   {
      return name;
   }

   public double getSalary()
   {
      return salary;
   }

   public Date getHireDay()
   {
      return hireDay;
   }

   public void raiseSalary(double byPercent)
   {
      double raise = salary * byPercent / 100;
      salary += raise;
   }

   public String toString()
   {
      return getClass().getName()
         + "[name=" + name
         + ",salary=" + salary
         + ",hireDay=" + hireDay
         + "]";
   }

   private String name;
   private double salary;
   private Date hireDay;
}

class Manager extends Employee
{
   /**
      @param n the employee's name
      @param s the salary
      @param year the hire year
      @param month the hire month
      @param day the hire day
   */
   public Manager(String n, double s, int year, int month, int day)
   {
      super(n, s, year, month, day);
      bonus = 0;
   }

   public double getSalary()
   {
      double baseSalary = super.getSalary();
      return baseSalary + bonus;
   }

   public void setBonus(double b)
   {
      bonus = b;
   }

   public String toString()
   {
      return super.toString()
         + "[bonus=" + bonus
         + "]";
   }

   private double bonus;
}


 java.io.ObjectOutputStream 1.1 

  • ObjectOutputStream(OutputStream out)

    creates an ObjectOutputStream so that you can write objects to the specified OutputStream.

  • void writeObject(Object obj)

    writes the specified object to the ObjectOutputStream. This method saves the class of the object, the signature of the class, and the values of any nonstatic, nontransient field of the class and its superclasses.


 java.io.ObjectInputStream 1.1 

  • ObjectInputStream(InputStream is)

    creates an ObjectInputStream to read back object information from the specified InputStream.

  • Object readObject()

    reads an object from the ObjectInputStream. In particular, this method reads back the class of the object, the signature of the class, and the values of the nontransient and nonstatic fields of the class and all its superclasses. Deserialization also restores shared references, so that multiple references to the same object are recovered.

Understanding the Object Serialization File Format

Object serialization saves object data in a particular file format. Of course, you can use the writeObject/readObject methods without having to know the exact sequence of bytes that represents objects in a file. Nonetheless, we found studying the data format to be extremely helpful for gaining insight into the object streaming process. We did this by looking at hex dumps of various saved object files. However, the details are somewhat technical, so feel free to skip this section if you are not interested in the implementation.

Every file begins with the two-byte "magic number"

 AC ED 

followed by the version number of the object serialization format, which is currently

 00 05 

(We use hexadecimal numbers throughout this section to denote bytes.) Then, it contains a sequence of objects, in the order that they were saved.

String objects are saved as

 74   two-byte length   characters


For example, the string "Harry" is saved as

 74 00 05 Harry 

The Unicode characters of the string are saved in "modified UTF-8" format.
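If you want to check the header and the string encoding yourself, a sketch along the following lines should do. It serializes a single object into memory and prints the bytes in hexadecimal. The helper name hexDump is our own invention and not part of the library.

```java
import java.io.*;

class StreamHeaderDemo
{
   // Serializes one object to memory and returns the bytes as hex pairs
   static String hexDump(Object obj)
   {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes))
      {
         out.writeObject(obj);
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
      StringBuilder result = new StringBuilder();
      for (byte b : bytes.toByteArray())
      {
         if (result.length() > 0) result.append(' ');
         result.append(String.format("%02X", b));
      }
      return result.toString();
   }

   public static void main(String[] args)
   {
      // magic number, version, string code 74, length 00 05, then the bytes of "Harry"
      System.out.println(hexDump("Harry"));
      // prints AC ED 00 05 74 00 05 48 61 72 72 79
   }
}
```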

When an object is saved, the class of that object must be saved as well. The class description contains

  1. The name of the class;

  2. The serial version unique ID, which is a fingerprint of the data field types and method signatures;

  3. A set of flags describing the serialization method; and

  4. A description of the data fields.

Java gets the fingerprint by

  1. Ordering descriptions of the class, superclass, interfaces, field types, and method signatures in a canonical way;

  2. Then applying the so-called Secure Hash Algorithm (SHA) to that data.

SHA is a fast algorithm that gives a "fingerprint" to a larger block of information. This fingerprint is always a 20-byte data packet, regardless of the size of the original data. It is created by a clever sequence of bit operations on the data that makes it essentially 100 percent certain that the fingerprint will change if the information is altered in any way. SHA is a U.S. standard, recommended by the National Institute of Standards and Technology (NIST). (For more details on SHA, see, for example, Cryptography and Network Security: Principles and Practice, by William Stallings [Prentice Hall, 2002].) However, Java uses only the first 8 bytes of the SHA code as a class fingerprint. It is still very likely that the class fingerprint will change if the data fields or methods change in any way.

Java can then check the class fingerprint to protect us from the following scenario: An object is saved to a disk file. Later, the designer of the class makes a change, for example, by removing a data field. Then, the old disk file is read in again. Now the data layout on the disk no longer matches the data layout in memory. If the data were read back in its old form, it could corrupt memory. Java takes great care to make such memory corruption close to impossible. Hence, it checks, using the fingerprint, that the class definition has not changed when it restores an object. It does this by comparing the fingerprint on disk with the fingerprint of the current class.
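You can ask the serialization mechanism for the fingerprint that it associates with a class. The sketch below uses ObjectStreamClass.lookup and getSerialVersionUID from java.io; the class name FingerprintDemo and the helper fingerprint are ours. For java.util.Date, it prints the same eight bytes that appear in the Date class descriptor of the commented hex dump later in this chapter.

```java
import java.io.ObjectStreamClass;
import java.util.Date;

class FingerprintDemo
{
   // Returns the 8-byte class fingerprint as 16 hexadecimal digits
   static String fingerprint(Class<?> cl)
   {
      ObjectStreamClass descriptor = ObjectStreamClass.lookup(cl);
      if (descriptor == null)   // lookup yields null for nonserializable classes
         throw new IllegalArgumentException(cl.getName() + " is not serializable");
      return String.format("%016x", descriptor.getSerialVersionUID());
   }

   public static void main(String[] args)
   {
      System.out.println(fingerprint(Date.class));   // prints 686a81014b597419
   }
}
```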

NOTE

Technically, as long as the data layout of a class has not changed, it ought to be safe to read objects back in. But Java is conservative and checks that the methods have not changed either. (After all, the methods describe the meaning of the stored data.) Of course, in practice, classes do evolve, and it may be necessary for a program to read in older versions of objects. We discuss this later in the section entitled "Versioning."


Here is how a class identifier is stored:

72

2-byte length of class name

class name

8-byte fingerprint

1-byte flag

2-byte count of data field descriptors

data field descriptors

78 (end marker)

superclass type (70 if none)

The flag byte is composed of three bit masks, defined in java.io.ObjectStreamConstants:

 static final byte SC_WRITE_METHOD = 1;
    // class has writeObject method that writes additional data
 static final byte SC_SERIALIZABLE = 2;
    // class implements Serializable interface
 static final byte SC_EXTERNALIZABLE = 4;
    // class implements Externalizable interface

We discuss the Externalizable interface later in this chapter. Externalizable classes supply custom read and write methods that take over the output of their instance fields. The classes that we write implement the Serializable interface and will have a flag value of 02. The java.util.Date class defines its own readObject/writeObject methods and has a flag of 03.

Each data field descriptor has the format:

1-byte type code

2-byte length of field name

field name

class name (if field is an object)

The type code is one of the following:

 B   byte
 C   char
 D   double
 F   float
 I   int
 J   long
 L   object
 S   short
 Z   boolean
 [   array


When the type code is L, the field name is followed by the field type. Class and field name strings do not start with the string code 74, but field types do. Field types use a slightly different encoding of their names, namely, the format used by native methods. (See Volume 2 for native methods.)

For example, the salary field of the Employee class is encoded as:

 D 00 06 salary 
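The same information is available programmatically. As a sketch, the following program lists the field descriptors of a hypothetical class Worker that has the same three fields as Employee. It relies on ObjectStreamClass.getFields and on the getTypeCode, getName, and getTypeString methods of ObjectStreamField; the class and helper names are our own.

```java
import java.io.*;
import java.util.Date;

class FieldDescriptorDemo
{
   // Hypothetical class with the same three fields as Employee
   static class Worker implements Serializable
   {
      private String name;
      private double salary;
      private Date hireDay;
   }

   // Lists type code, field name, and (for objects) the field type, one field per line
   static String describe(Class<?> cl)
   {
      StringBuilder result = new StringBuilder();
      for (ObjectStreamField f : ObjectStreamClass.lookup(cl).getFields())
      {
         result.append(f.getTypeCode()).append(' ').append(f.getName());
         if (!f.isPrimitive())
            result.append(' ').append(f.getTypeString());
         result.append('\n');
      }
      return result.toString();
   }

   public static void main(String[] args)
   {
      System.out.print(describe(Worker.class));
   }
}
```

The fields should come out as salary, hireDay, and name, in that order, because primitive fields are listed before object fields and each group is sorted by name.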

Here is the complete class descriptor of the Employee class:

 72 00 08 Employee                      New class, string length, class name
   E6 D2 86 7D AE AC 18 1B 02           Fingerprint and flags
   00 03                                Number of instance fields
   D 00 06 salary                       Instance field type and name
   L 00 07 hireDay                      Instance field type and name
   74 00 10 Ljava/util/Date;            Instance field class name: Date
   L 00 04 name                         Instance field type and name
   74 00 12 Ljava/lang/String;          Instance field class name: String
   78                                   End marker
   70                                   No superclass


These descriptors are fairly long. If the same class descriptor is needed again in the file, then an abbreviated form is used:

 71   4-byte serial number


The serial number refers to the previous explicit class descriptor. We discuss the numbering scheme later.

An object is stored as

 73   class descriptor   object data


For example, here is how an Employee object is stored:

 40 E8 6A 00 00 00 00 00                salary field value: double
 73                                     hireDay field value: new object
   71 00 7E 00 08                       Existing class java.util.Date
   77 08 00 00 00 91 1B 4E B1 80 78     External storage (details later)
 74 00 0C Harry Hacker                  name field value: String


As you can see, the data file contains enough information to restore the Employee object.

Arrays are saved in the following format:

 75   class descriptor   4-byte number of entries   entries


The array class name in the class descriptor is in the same format as that used by native methods (which is slightly different from the format of class names in other class descriptors). In this format, class names start with an L and end with a semicolon.

For example, an array of three Employee objects starts out like this:

 75                                     Array
   72 00 0B [LEmployee;                 New class, string length, class name Employee[]
     FC BF 36 11 C5 91 11 C7 02         Fingerprint and flags
     00 00                              Number of instance fields
     78                                 End marker
     70                                 No superclass
   00 00 00 03                          Number of array entries


Note that the fingerprint for an array of Employee objects is different from a fingerprint of the Employee class itself.

Of course, studying these codes can be about as exciting as reading the average phone book. But it is still instructive to know that the object stream contains a detailed description of all the objects that it contains, with sufficient detail to allow reconstruction of both objects and arrays of objects.

Solving the Problem of Saving Object References

We now know how to save objects that contain numbers, strings, or other simple objects. However, there is one important situation that we still need to consider. What happens when one object is shared by several objects as part of its state?

To illustrate the problem, let us make a slight modification to the Manager class. Let's assume that each manager has a secretary, implemented as an instance variable secretary of type Employee. (It would make sense to derive a class Secretary from Employee for this purpose, but we don't do that here.)

 class Manager extends Employee
 {
    . . .
    private Employee secretary;
 }

Having done this, you must keep in mind that the Manager object now contains a reference to the Employee object that describes the secretary, not a separate copy of the object.

In particular, two managers can share the same secretary, as is the case in Figure 12-6 and the following code:

 Employee harry = new Employee("Harry Hacker", . . .);
 Manager carl = new Manager("Carl Cracker", . . .);
 carl.setSecretary(harry);
 Manager tony = new Manager("Tony Tester", . . .);
 tony.setSecretary(harry);

Now, suppose we write the employee data to disk. What we don't want is for the Manager to save its information according to the following logic:

  • Save employee data;

  • Save secretary data.

Then, the data for harry would be saved three times. When reloaded, the objects would have the configuration shown in Figure 12-7.

Figure 12-7. Here, Harry is saved three times


This is not what we want. Suppose the secretary gets a raise. We would not want to hunt for all other copies of that object and apply the raise as well. We want to save and restore only one copy of the secretary. To do this, we must copy and restore the original references to the objects. In other words, we want the object layout on disk to be exactly like the object layout in memory. This is called persistence in object-oriented circles.

Of course, we cannot save and restore the memory addresses for the secretary objects. When an object is reloaded, it will likely occupy a completely different memory address than it originally did.

Figure 12-6. Two managers can share a mutual employee


Instead, Java uses a serialization approach. Hence, the name object serialization for this mechanism. Here is the algorithm:

  • All objects that are saved to disk are given a serial number (1, 2, 3, and so on, as shown in Figure 12-8).

    Figure 12-8. An example of object serialization


  • When saving an object to disk, find out if the same object has already been stored.

  • If it has been stored previously, just write "same as previously saved object with serial number x." If not, store all its data.

When reading back the objects, simply reverse the procedure. For each object that you load, note its sequence number and remember where you put it in memory. When you encounter the tag "same as previously saved object with serial number x," you look up where you put the object with serial number x and set the object reference to that memory address.
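The saving half of this algorithm can be sketched in a few lines. This is a toy model and not the code that object streams actually use: the "file" is simply a list of strings, and an IdentityHashMap (our choice for this sketch) remembers which objects have already been given a serial number. The map must compare references with ==, not equals, because two distinct objects with equal contents still need separate entries.

```java
import java.util.*;

class SerialNumberSketch
{
   // Produces a textual "file" in which repeated objects become back references
   static List<String> save(List<?> objects)
   {
      Map<Object, Integer> serialNumbers = new IdentityHashMap<>();
      List<String> file = new ArrayList<>();
      for (Object obj : objects)
      {
         Integer previous = serialNumbers.get(obj);
         if (previous != null)
            file.add("same as #" + previous);     // already stored: write a reference
         else
         {
            int serial = serialNumbers.size() + 1;   // next serial number: 1, 2, 3, ...
            serialNumbers.put(obj, serial);
            file.add("#" + serial + " " + obj);   // first occurrence: store the data
         }
      }
      return file;
   }

   public static void main(String[] args)
   {
      StringBuilder harry = new StringBuilder("Harry");
      StringBuilder tony = new StringBuilder("Tony");
      System.out.println(save(Arrays.asList(harry, tony, harry)));
      // prints [#1 Harry, #2 Tony, same as #1]
   }
}
```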

Note that the objects need not be saved in any particular order. Figure 12-9 shows what happens when a manager occurs first in the staff array.

Figure 12-9. Objects saved in random order


All of this sounds confusing, and it is. Fortunately, when object streams are used, the process is also completely automatic. Object streams assign the serial numbers and keep track of duplicate objects. The exact numbering scheme is slightly different from that used in the figures; see the next section.

NOTE

In this chapter, we use serialization to save a collection of objects to a disk file and retrieve it exactly as we stored it. Another very important application is the transmittal of a collection of objects across a network connection to another computer. Just as raw memory addresses are meaningless in a file, they are also meaningless when communicating with a different processor. Because serialization replaces memory addresses with serial numbers, it permits the transport of object collections from one machine to another. We study that use of serialization when discussing remote method invocation in Volume 2.


Example 12-5 is a program that saves and reloads a network of Employee and Manager objects (some of which share the same employee as a secretary). Note that the secretary object is unique after reloading: when newStaff[1] gets a raise, that is reflected in the secretary fields of the managers.

Example 12-5. ObjectRefTest.java

import java.io.*;
import java.util.*;

class ObjectRefTest
{
   public static void main(String[] args)
   {
      Employee harry = new Employee("Harry Hacker", 50000, 1989, 10, 1);
      Manager boss = new Manager("Carl Cracker", 80000, 1987, 12, 15);
      boss.setSecretary(harry);

      Employee[] staff = new Employee[3];

      staff[0] = boss;
      staff[1] = harry;
      staff[2] = new Employee("Tony Tester", 40000, 1990, 3, 15);

      try
      {
         // save all employee records to the file employee.dat
         ObjectOutputStream out = new ObjectOutputStream(
            new FileOutputStream("employee.dat"));
         out.writeObject(staff);
         out.close();

         // retrieve all records into a new array
         ObjectInputStream in = new ObjectInputStream(
            new FileInputStream("employee.dat"));
         Employee[] newStaff = (Employee[]) in.readObject();
         in.close();

         // raise secretary's salary
         newStaff[1].raiseSalary(10);

         // print the newly read employee records
         for (Employee e : newStaff)
            System.out.println(e);
      }
      catch (Exception e)
      {
         e.printStackTrace();
      }
   }
}

class Employee implements Serializable
{
   public Employee() {}

   public Employee(String n, double s, int year, int month, int day)
   {
      name = n;
      salary = s;
      // GregorianCalendar months are zero-based
      GregorianCalendar calendar = new GregorianCalendar(year, month - 1, day);
      hireDay = calendar.getTime();
   }

   public String getName()
   {
      return name;
   }

   public double getSalary()
   {
      return salary;
   }

   public Date getHireDay()
   {
      return hireDay;
   }

   public void raiseSalary(double byPercent)
   {
      double raise = salary * byPercent / 100;
      salary += raise;
   }

   public String toString()
   {
      return getClass().getName()
         + "[name=" + name
         + ",salary=" + salary
         + ",hireDay=" + hireDay
         + "]";
   }

   private String name;
   private double salary;
   private Date hireDay;
}

class Manager extends Employee
{
   /**
      Constructs a Manager without a secretary
      @param n the employee's name
      @param s the salary
      @param year the hire year
      @param month the hire month
      @param day the hire day
   */
   public Manager(String n, double s, int year, int month, int day)
   {
      super(n, s, year, month, day);
      secretary = null;
   }

   /**
      Assigns a secretary to the manager.
      @param s the secretary
   */
   public void setSecretary(Employee s)
   {
      secretary = s;
   }

   public String toString()
   {
      return super.toString()
         + "[secretary=" + secretary
         + "]";
   }

   private Employee secretary;
}

Understanding the Output Format for Object References

This section continues the discussion of the output format of object streams. If you skipped the previous discussion, you should skip this section as well.

All objects (including arrays and strings) and all class descriptors are given serial numbers as they are saved in the output file. This process is referred to as serialization because every saved object is assigned a serial number. (The count starts at 00 7E 00 00.)

We already saw that a full class descriptor for any given class occurs only once. Subsequent descriptors refer to it. For example, in our previous example, a repeated reference to the Date class was coded as

 71 00 7E 00 08 

The same mechanism is used for objects. If a reference to a previously saved object is written, it is saved in exactly the same way, that is, 71 followed by the serial number. It is always clear from the context whether the particular serial reference denotes a class descriptor or an object.

Finally, a null reference is stored as

 70 
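A quick experiment makes the reference and null codes visible. The sketch below writes the same String object twice, followed by null, and prints the resulting bytes. As before, the helper hexDump is our own and the data go to an in-memory byte array.

```java
import java.io.*;

class BackReferenceDemo
{
   // Serializes the given objects in order and returns the bytes as hex pairs
   static String hexDump(Object... objects)
   {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes))
      {
         for (Object obj : objects)
            out.writeObject(obj);
      }
      catch (IOException e)
      {
         throw new UncheckedIOException(e);
      }
      StringBuilder result = new StringBuilder();
      for (byte b : bytes.toByteArray())
      {
         if (result.length() > 0) result.append(' ');
         result.append(String.format("%02X", b));
      }
      return result.toString();
   }

   public static void main(String[] args)
   {
      String s = "x";
      // header, new string "x", reference to the first serial number, null
      System.out.println(hexDump(s, s, null));
      // prints AC ED 00 05 74 00 01 78 71 00 7E 00 00 70
   }
}
```

The second occurrence of the string takes five bytes (71 followed by the serial number 00 7E 00 00) no matter how long the string is.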

Here is the commented output of the ObjectRefTest program of the preceding section. If you like, run the program, look at a hex dump of its data file employee.dat, and compare it with the commented listing. The important lines toward the end of the output show the reference to a previously saved object.

 AC ED 00 05                            File header
 75                                     Array staff (serial #1)
   72 00 0B [LEmployee;                 New class, string length, class name Employee[] (serial #0)
     FC BF 36 11 C5 91 11 C7 02         Fingerprint and flags
     00 00                              Number of instance fields
     78                                 End marker
     70                                 No superclass
   00 00 00 03                          Number of array entries
   73                                   staff[0]: new object (serial #7)
     72 00 07 Manager                   New class, string length, class name (serial #2)
       36 06 AE 13 63 8F 59 B7 02       Fingerprint and flags
       00 01                            Number of data fields
       L 00 09 secretary                Instance field type and name
       74 00 0A LEmployee;              Instance field class name: String (serial #3)
       78                               End marker
       72 00 08 Employee                Superclass: new class, string length, class name (serial #4)
         E6 D2 86 7D AE AC 18 1B 02     Fingerprint and flags
         00 03                          Number of instance fields
         D 00 06 salary                 Instance field type and name
         L 00 07 hireDay                Instance field type and name
         74 00 10 Ljava/util/Date;      Instance field class name: String (serial #5)
         L 00 04 name                   Instance field type and name
         74 00 12 Ljava/lang/String;    Instance field class name: String (serial #6)
         78                             End marker
         70                             No superclass
     40 F3 88 00 00 00 00 00            salary field value: double
     73                                 hireDay field value: new object (serial #9)
       72 00 0E java.util.Date          New class, string length, class name (serial #8)
         68 6A 81 01 4B 59 74 19 03     Fingerprint and flags
         00 00                          No instance variables
         78                             End marker
         70                             No superclass
       77 08                            External storage, number of bytes
       00 00 00 83 E9 39 E0 00          Date
       78                               End marker
     74 00 0C Carl Cracker              name field value: String (serial #10)
     73                                 secretary field value: new object (serial #11)
       71 00 7E 00 04                   Existing class (use serial #4)
       40 E8 6A 00 00 00 00 00          salary field value: double
       73                               hireDay field value: new object (serial #12)
         71 00 7E 00 08                 Existing class (use serial #8)
         77 08                          External storage, number of bytes
         00 00 00 91 1B 4E B1 80        Date
         78                             End marker
       74 00 0C Harry Hacker            name field value: String (serial #13)
   71 00 7E 00 0B                       staff[1]: existing object (use serial #11)
   73                                   staff[2]: new object (serial #14)
     71 00 7E 00 04                     Existing class (use serial #4)
     40 E3 88 00 00 00 00 00            salary field value: double
     73                                 hireDay field value: new object (serial #15)
       71 00 7E 00 08                   Existing class (use serial #8)
       77 08                            External storage, number of bytes
       00 00 00 94 6D 3E EC 00          Date
       78                               End marker
     74 00 0B Tony Tester               name field value: String (serial #16)


It is usually not important to know the exact file format (unless you are trying to create an evil effect by modifying the data). What you should remember is this:

  • The object stream output contains the types and data fields of all objects.

  • Each object is assigned a serial number.

  • Repeated occurrences of the same object are stored as references to that serial number.

Modifying the Default Serialization Mechanism

Certain data fields should never be serialized, for example, integer values that store file handles or handles of windows that are only meaningful to native methods. Such information is guaranteed to be useless when you reload an object at a later time or transport it to a different machine. In fact, improper values for such fields can actually cause native methods to crash. Java has an easy mechanism to prevent such fields from ever being serialized. Mark them with the keyword transient. You also need to tag fields as transient if they belong to nonserializable classes. Transient fields are always skipped when objects are serialized.
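Here is a minimal sketch of the effect. The class Session and its fileHandle field are hypothetical; the field merely stands in for a value that is meaningful only in the current process. After a round trip through an object stream, the transient field comes back with the default value for its type instead of the value that it had when the object was saved.

```java
import java.io.*;

class TransientDemo
{
   static class Session implements Serializable
   {
      String user = "harry";
      transient int fileHandle = 42;   // hypothetical handle, skipped when saving
   }

   // Saves a Session to memory and reads it back
   static Session roundTrip(Session s)
   {
      try
      {
         ByteArrayOutputStream bytes = new ByteArrayOutputStream();
         try (ObjectOutputStream out = new ObjectOutputStream(bytes))
         {
            out.writeObject(s);
         }
         try (ObjectInputStream in = new ObjectInputStream(
               new ByteArrayInputStream(bytes.toByteArray())))
         {
            return (Session) in.readObject();
         }
      }
      catch (IOException | ClassNotFoundException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      Session copy = roundTrip(new Session());
      System.out.println(copy.user);         // prints harry
      System.out.println(copy.fileHandle);   // prints 0, not 42
   }
}
```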

The serialization mechanism provides a way for individual classes to add validation or any other desired action to the default read and write behavior. A serializable class can define methods with the signature

 private void readObject(ObjectInputStream in)
    throws IOException, ClassNotFoundException;
 private void writeObject(ObjectOutputStream out)
    throws IOException;

Then, the data fields are no longer automatically serialized, and these methods are called instead.

Here is a typical example. A number of classes in the java.awt.geom package, such as Point2D.Double, are not serializable. Now suppose you want to serialize a class LabeledPoint that stores a String and a Point2D.Double. First, you need to mark the Point2D.Double field as transient to avoid a NotSerializableException.

 public class LabeledPoint implements Serializable
 {
    . . .
    private String label;
    private transient Point2D.Double point;
 }

In the writeObject method, we first write the object descriptor and the String field, label, by calling the defaultWriteObject method. This is a special method of the ObjectOutputStream class that can only be called from within a writeObject method of a serializable class. Then we write the point coordinates, using the standard DataOutput calls.

 private void writeObject(ObjectOutputStream out)
    throws IOException
 {
    out.defaultWriteObject();
    out.writeDouble(point.getX());
    out.writeDouble(point.getY());
 }

In the readObject method, we reverse the process:

 private void readObject(ObjectInputStream in)
    throws IOException, ClassNotFoundException
 {
    in.defaultReadObject();
    double x = in.readDouble();
    double y = in.readDouble();
    point = new Point2D.Double(x, y);
 }
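Putting the pieces together, a complete LabeledPoint might look like the following sketch. The constructor, the toString method, and the roundTrip helper are our additions for testing; only the two private methods and the transient field come from the discussion above.

```java
import java.awt.geom.Point2D;
import java.io.*;

class LabeledPointDemo
{
   static class LabeledPoint implements Serializable
   {
      private String label;
      private transient Point2D.Double point;   // saved by hand in writeObject

      LabeledPoint(String label, double x, double y)
      {
         this.label = label;
         this.point = new Point2D.Double(x, y);
      }

      private void writeObject(ObjectOutputStream out) throws IOException
      {
         out.defaultWriteObject();        // writes the nontransient field label
         out.writeDouble(point.getX());
         out.writeDouble(point.getY());
      }

      private void readObject(ObjectInputStream in)
         throws IOException, ClassNotFoundException
      {
         in.defaultReadObject();          // restores label
         double x = in.readDouble();
         double y = in.readDouble();
         point = new Point2D.Double(x, y);
      }

      public String toString()
      {
         return label + "(" + point.getX() + "," + point.getY() + ")";
      }
   }

   // Saves a LabeledPoint to memory and reads it back
   static LabeledPoint roundTrip(LabeledPoint p)
   {
      try
      {
         ByteArrayOutputStream bytes = new ByteArrayOutputStream();
         try (ObjectOutputStream out = new ObjectOutputStream(bytes))
         {
            out.writeObject(p);
         }
         try (ObjectInputStream in = new ObjectInputStream(
               new ByteArrayInputStream(bytes.toByteArray())))
         {
            return (LabeledPoint) in.readObject();
         }
      }
      catch (IOException | ClassNotFoundException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      System.out.println(roundTrip(new LabeledPoint("origin", 1.5, 2.0)));
      // prints origin(1.5,2.0)
   }
}
```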

Another example is the java.util.Date class that supplies its own readObject and writeObject methods. These methods write the date as a number of milliseconds from the epoch (January 1, 1970, midnight UTC). The Date class has a complex internal representation that stores both a Calendar object and a millisecond count, to optimize lookups. The state of the Calendar is redundant and does not have to be saved.

The readObject and writeObject methods only need to save and load their data fields. They should not concern themselves with superclass data or any other class information.

Rather than letting the serialization mechanism save and restore object data, a class can define its own mechanism. To do this, a class must implement the Externalizable interface. This in turn requires it to define two methods:

 public void readExternal(ObjectInput in)
    throws IOException, ClassNotFoundException;
 public void writeExternal(ObjectOutput out)
    throws IOException;

Unlike the readObject and writeObject methods that were described in the preceding section, these methods are fully responsible for saving and restoring the entire object, including the superclass data. The serialization mechanism merely records the class of the object in the stream. When reading an externalizable object, the object stream creates an object with the public no-argument constructor and then calls the readExternal method. Here is how you can implement these methods for the Employee class:

   public void readExternal(ObjectInput s)
      throws IOException
   {
      name = s.readUTF();
      salary = s.readDouble();
      hireDay = new Date(s.readLong());
   }

   public void writeExternal(ObjectOutput s)
      throws IOException
   {
      s.writeUTF(name);
      s.writeDouble(salary);
      s.writeLong(hireDay.getTime());
   }
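As a rough sketch of how these two methods fit into a whole class, the following version adds the public no-argument constructor that the object stream needs when it re-creates the object. The class name ExternalEmployee, the constructor taking a millisecond count, and the roundTrip helper are assumptions made for illustration.

```java
import java.io.*;
import java.util.Date;

// Hypothetical class for illustration; ExternalEmployee is an assumed name.
class ExternalEmployee implements Externalizable {
    private String name;
    private double salary;
    private Date hireDay;

    // The object stream calls this constructor before readExternal.
    public ExternalEmployee() {}

    ExternalEmployee(String name, double salary, long hireMillis) {
        this.name = name;
        this.salary = salary;
        this.hireDay = new Date(hireMillis);
    }

    String getName() { return name; }
    double getSalary() { return salary; }
    long getHireMillis() { return hireDay.getTime(); }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);
        out.writeDouble(salary);
        out.writeLong(hireDay.getTime());
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        name = in.readUTF();               // same order as in writeExternal
        salary = in.readDouble();
        hireDay = new Date(in.readLong());
    }

    // Hypothetical helper: serializes to a byte array and reads the copy back.
    static ExternalEmployee roundTrip(ExternalEmployee e) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(e);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ExternalEmployee) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        ExternalEmployee copy = roundTrip(new ExternalEmployee("Harry Hacker", 50000, 0L));
        System.out.println(copy.getName() + " " + copy.getSalary());
    }
}
```

If the public no-argument constructor is missing, reading the object back fails with an InvalidClassException.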

TIP

Serialization is somewhat slow because the virtual machine must discover the structure of each object. If you are concerned about performance and if you read and write a large number of objects of a particular class, you should investigate the use of the Externalizable interface. The tech tip http://developer.java.sun.com/developer/TechTips/txtarchive/Apr00_Stu.txt demonstrates that in the case of an employee class, using external reading and writing was about 35% to 40% faster than the default serialization.


CAUTION

Unlike the readObject and writeObject methods, which are private and can only be called by the serialization mechanism, the readExternal and writeExternal methods are public. In particular, readExternal potentially permits modification of the state of an existing object.


NOTE

For even more exotic variations of serialization, see http://www.absolutejava.com/serialization.


Serializing Singletons and Typesafe Enumerations

You have to pay particular attention when serializing and deserializing objects that are assumed to be unique. This commonly happens when you are implementing singletons and typesafe enumerations.

If you use the enum construct of JDK 5.0, then you need not worry about serialization; it just works. However, suppose you maintain legacy code that contains an enumerated type such as

   public class Orientation
   {
      public static final Orientation HORIZONTAL = new Orientation(1);
      public static final Orientation VERTICAL = new Orientation(2);
      private Orientation(int v) { value = v; }
      private int value;
   }

This idiom was common before enumerations were added to the Java language. Note that the constructor is private. Thus, no objects can be created beyond Orientation.HORIZONTAL and Orientation.VERTICAL. In particular, you can use the == operator to test for object equality:

 if (orientation == Orientation.HORIZONTAL) . . . 

There is an important twist that you need to remember when a typesafe enumeration implements the Serializable interface. The default serialization mechanism is not appropriate. Suppose we write a value of type Orientation and read it in again:

   Orientation original = Orientation.HORIZONTAL;
   ObjectOutputStream out = . . .;
   out.writeObject(original);
   out.close();
   ObjectInputStream in = . . .;
   Orientation saved = (Orientation) in.readObject();

Now the test

 if (saved == Orientation.HORIZONTAL) . . . 

will fail. In fact, the saved value is a completely new object of the Orientation type and not equal to any of the predefined constants. Even though the constructor is private, the serialization mechanism can create new objects!

To solve this problem, you need to define another special serialization method, called readResolve. If the readResolve method is defined, it is called after the object is deserialized. It must return an object that then becomes the return value of the readObject method. In our case, the readResolve method will inspect the value field and return the appropriate enumerated constant:

   protected Object readResolve() throws ObjectStreamException
   {
      if (value == 1) return Orientation.HORIZONTAL;
      if (value == 2) return Orientation.VERTICAL;
      return null; // this shouldn't happen
   }

Remember to add a readResolve method to all typesafe enumerations in your legacy code and to all classes that follow the singleton design pattern.
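The effect of readResolve can be checked with a small round trip. The sketch below is one possible arrangement, not the only one: the roundTrip helper is an assumption made for illustration, and the sketch throws an InvalidObjectException for an unknown value instead of returning null.

```java
import java.io.*;

class Orientation implements Serializable {
    private static final long serialVersionUID = 1L;

    public static final Orientation HORIZONTAL = new Orientation(1);
    public static final Orientation VERTICAL = new Orientation(2);

    private final int value;

    private Orientation(int v) { value = v; }

    // Called by the serialization mechanism after the object is deserialized;
    // the returned object replaces the freshly created one.
    protected Object readResolve() throws ObjectStreamException {
        if (value == 1) return HORIZONTAL;
        if (value == 2) return VERTICAL;
        // Assumption of this sketch: reject unknown values rather than return null.
        throw new InvalidObjectException("unknown orientation value " + value);
    }

    // Hypothetical helper: serializes to a byte array and reads the value back.
    static Orientation roundTrip(Orientation o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Orientation) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Orientation saved = roundTrip(Orientation.HORIZONTAL);
        System.out.println(saved == Orientation.HORIZONTAL);
    }
}
```

With readResolve in place, the == comparison in main succeeds; if you remove the method, the same comparison yields false because the stream hands back a new object.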

Versioning

In the previous sections, we showed you how to save relatively small collections of objects by means of an object stream. But those were just demonstration programs. With object streams, it helps to think big. Suppose you write a program that lets the user produce a document. This document contains paragraphs of text, tables, graphs, and so on. You can stream out the entire document object with a single call to writeObject:

 out.writeObject(doc); 

The paragraph, table, and graph objects are automatically streamed out as well. One user of your program can then give the output file to another user who also has a copy of your program, and that program loads the entire document with a single call to readObject:

 doc = (Document) in.readObject(); 

This is very useful, but your program will inevitably change, and you will release a version 1.1. Can version 1.1 read the old files? Can the users who still use 1.0 read the files that the new version is now producing? Clearly, it would be desirable if object files could cope with the evolution of classes.

At first glance it seems that this would not be possible. When a class definition changes in any way, then its SHA fingerprint also changes, and you know that object streams will refuse to read in objects with different fingerprints. However, a class can indicate that it is compatible with an earlier version of itself. To do this, you must first obtain the fingerprint of the earlier version of the class. You use the stand-alone serialver program that is part of the JDK to obtain this number. For example, running

 serialver Employee 

prints

 Employee: static final long serialVersionUID = -1814239825517340645L; 

If you start the serialver program with the -show option, then the program brings up a graphical dialog box (see Figure 12-10).

Figure 12-10. The graphical version of the serialver program


All later versions of the class must define the serialVersionUID constant to the same fingerprint as the original.

   class Employee implements Serializable // version 1.1
   {
      . . .
      public static final long serialVersionUID = -1814239825517340645L;
   }

When a class has a static data member named serialVersionUID, the serialization mechanism does not compute the fingerprint itself but instead uses that value.

Once that static data member has been placed inside a class, the serialization system is now willing to read in different versions of objects of that class.
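You can check from inside a program which fingerprint the serialization mechanism associates with a class by asking for its ObjectStreamClass descriptor. The following is a small sketch; the class names VersionedEmployee and SerialVersionDemo are assumptions made for illustration, and the constant is the value from the serialver output quoted above.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical stand-in for a later version of the Employee class.
class VersionedEmployee implements Serializable {
    // Value taken from the serialver output quoted in the text
    private static final long serialVersionUID = -1814239825517340645L;

    private String name;
    private double salary;
}

class SerialVersionDemo {
    // Returns the fingerprint that object streams use for VersionedEmployee.
    static long declaredUid() {
        return ObjectStreamClass.lookup(VersionedEmployee.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(declaredUid());
    }
}
```

Because the class declares the constant, the descriptor reports that value no matter which fields or methods the class currently has.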

If only the methods of the class change, then there is no problem with reading the new object data. However, if data fields change, then you may have problems. For example, the old file object may have more or fewer data fields than the one in the program, or the types of the data fields may be different. In that case, the object stream makes an effort to convert the stream object to the current version of the class.

The object stream compares the data fields of the current version of the class with the data fields of the version in the stream. Of course, the object stream considers only the nontransient and nonstatic data fields. If two fields have matching names but different types, then the object stream makes no effort to convert one type to the other; the objects are incompatible. If the object in the stream has data fields that are not present in the current version, then the object stream ignores the additional data. If the current version has data fields that are not present in the streamed object, the added fields are set to their default (null for objects, zero for numbers, and false for Boolean values).

Here is an example. Suppose we have saved a number of employee records on disk, using the original version (1.0) of the class. Now we change the Employee class to version 2.0 by adding a data field called department. Figure 12-11 shows what happens when a 1.0 object is read into a program that uses 2.0 objects. The department field is set to null. Figure 12-12 shows the opposite scenario: a program using 1.0 objects reads a 2.0 object. The additional department field is ignored.

Figure 12-11. Reading an object with fewer data fields


Figure 12-12. Reading an object with more data fields


Is this process safe? It depends. Dropping a data field seems harmless, since the recipient still has all the data that it knew how to manipulate. Setting a data field to null may not be so safe. Many classes work hard to initialize all data fields in all constructors to non-null values, so that the methods don't have to be prepared to handle null data. It is up to the class designer to implement additional code in the readObject method to fix version incompatibilities or to make sure the methods are robust enough to handle null data.
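One way to repair a missing field in readObject is sketched below. The class name DepartmentEmployee, the fallback value "Unassigned", and the roundTrip helper are assumptions made for illustration. A real version 1.0 stream cannot be produced from a single source file, so the sketch simulates one by writing an object whose department field is null, which is the state defaultReadObject leaves behind when the stream has no data for the field.

```java
import java.io.*;

// Hypothetical newer version of an employee class with an added department field.
class DepartmentEmployee implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name;
    private String department; // assumed to be absent from the older version

    DepartmentEmployee(String name, String department) {
        this.name = name;
        this.department = department;
    }

    String getName() { return name; }
    String getDepartment() { return department; }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // A stream written by the older version carries no department data,
        // so the field is null at this point; substitute a non-null default.
        if (department == null) department = "Unassigned";
    }

    // Hypothetical helper: serializes to a byte array and reads the copy back.
    static DepartmentEmployee roundTrip(DepartmentEmployee e) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(e);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (DepartmentEmployee) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // null stands in for the missing field of an old-version object
        DepartmentEmployee copy = roundTrip(new DepartmentEmployee("Harry Hacker", null));
        System.out.println(copy.getDepartment());
    }
}
```

The alternative mentioned above, making every method tolerate null, avoids the custom readObject but spreads the checks throughout the class.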

Using Serialization for Cloning

There is an amusing (and, occasionally, very useful) use for the serialization mechanism: it gives you an easy way to clone an object provided the class is serializable. (Recall from Chapter 6 that you need to do a bit of work to allow an object to be cloned.)

To clone a serializable object, simply serialize it to an output stream and then read it back in. The result is a new object that is a deep copy of the existing object. You don't have to write the object to a file; you can use a ByteArrayOutputStream to save the data into a byte array.

As Example 12-6 shows, to get clone for free, simply extend the SerialCloneable class, and you are done.

You should be aware that this method, although clever, will usually be much slower than a clone method that explicitly constructs a new object and copies or clones the data fields (as you saw in Chapter 6).

Example 12-6. SerialCloneTest.java
import java.io.*;
import java.util.*;

public class SerialCloneTest
{
   public static void main(String[] args)
   {
      Employee harry = new Employee("Harry Hacker", 35000, 1989, 10, 1);
      // clone harry
      Employee harry2 = (Employee) harry.clone();

      // mutate harry
      harry.raiseSalary(10);

      // now harry and the clone are different
      System.out.println(harry);
      System.out.println(harry2);
   }
}

/**
   A class whose clone method uses serialization.
*/
class SerialCloneable implements Cloneable, Serializable
{
   public Object clone()
   {
      try
      {
         // save the object to a byte array
         ByteArrayOutputStream bout = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(bout);
         out.writeObject(this);
         out.close();

         // read a clone of the object from the byte array
         ByteArrayInputStream bin = new ByteArrayInputStream(bout.toByteArray());
         ObjectInputStream in = new ObjectInputStream(bin);
         Object ret = in.readObject();
         in.close();

         return ret;
      }
      catch (Exception e)
      {
         return null;
      }
   }
}

/**
   The familiar Employee class, redefined to extend the
   SerialCloneable class.
*/
class Employee extends SerialCloneable
{
   public Employee(String n, double s, int year, int month, int day)
   {
      name = n;
      salary = s;
      GregorianCalendar calendar = new GregorianCalendar(year, month - 1, day);
      hireDay = calendar.getTime();
   }

   public String getName()
   {
      return name;
   }

   public double getSalary()
   {
      return salary;
   }

   public Date getHireDay()
   {
      return hireDay;
   }

   public void raiseSalary(double byPercent)
   {
      double raise = salary * byPercent / 100;
      salary += raise;
   }

   public String toString()
   {
      return getClass().getName()
         + "[name=" + name
         + ",salary=" + salary
         + ",hireDay=" + hireDay
         + "]";
   }

   private String name;
   private double salary;
   private Date hireDay;
}


       
    Core Java 2 Volume I - Fundamentals
    Core Java(TM) 2, Volume I--Fundamentals (7th Edition) (Core Series) (Core Series)
    ISBN: 0131482025
    EAN: 2147483647
    Year: 2003
    Pages: 132
