Serialization | Expert C# 2008 Business Objects

Serialization is the process of converting a complex set of data, such as that which is contained in an object, into a single blob that's often called a byte stream. The reverse, deserialization, is the process of unpacking the byte stream to re-create the complex set of data.

Note

The term "byte stream" can refer to many things. It might refer to an XML document in a string variable, a string variable containing comma-delimited values, an actual Stream object (such as a MemoryStream), or any other scheme through which a complex set of data values can be treated as a single entity for transfer across the network.

Serialization is important when we talk about distributing objects. Most objects contain a complex set of data, and that data is typically stored in a set of variables. If we want to transfer that data across the network, we really can't do it one variable at a time, because each variable would incur its own network overhead and performance would become unacceptable. Instead, we need some way to send all the data across the network at once. By serializing the complex data into a single byte stream, we reduce the task to transferring that one stream, so all of the data moves in one network call.

However, serialization is useful for more than just transferring objects across the network. We can also use it to clone objects efficiently. To do this, we simply serialize an object into a memory buffer, and then deserialize it to create an exact clone of the original object. The ASP.NET Session object, for example, uses this cloning technique to store objects between page calls, and to re-create objects as the next page is processed. (This only occurs when the Session object is configured to run outside the ASP.NET process, but it's a practical example of serialization within the .NET Framework itself.)

As you know from Chapters 1 and 2, the remoting subsystem makes use of serialization as well. When the SOAP and binary formatters need to prepare an object for transmission across the remoting channel, they use serialization to convert the object's data into a single byte stream. The "unanchored" objects that we've been discussing can also be correctly called "serializable" objects.

Types of Serialization

Before we discuss how to code for serialization, we need to make a distinction between the two types of serialization employed by remoting and web services, respectively.

Remoting uses a type of serialization that finds all the variables contained within an object, and converts their values into a byte stream. If we wish to do so, we can suppress the serialization of specific variables by marking them with the [NonSerialized()] attribute.

Some objects will contain variables that are references to other objects. When we serialize such an object, all the objects it references are also serialized, and their data is included in the resulting byte stream. When objects reference other objects, the result is called an object graph, and when one of these byte streams is deserialized, the entire object graph is re-created. The original object and its references are effectively cloned.
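The object-graph behavior can be sketched in a few lines. This is a minimal illustration rather than code from our framework: the Order and OrderLine classes and the RoundTrip() helper are hypothetical names, and the sketch assumes the .NET Framework, where BinaryFormatter is available (recent .NET releases have deprecated and removed it).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical child class, used only for this illustration.
[Serializable()]
public class OrderLine
{
  public string Product;
}

// Hypothetical root class. Its reference to the list of lines
// makes it the root of an object graph.
[Serializable()]
public class Order
{
  public List<OrderLine> Lines = new List<OrderLine>();
}

public static class ObjectGraphDemo
{
  // Serializes the whole graph into a memory buffer,
  // then rebuilds a new graph from that buffer.
  public static Order RoundTrip(Order original)
  {
    BinaryFormatter formatter = new BinaryFormatter();
    using (MemoryStream buffer = new MemoryStream())
    {
      formatter.Serialize(buffer, original);
      buffer.Position = 0;
      return (Order)formatter.Deserialize(buffer);
    }
  }

  public static void Main()
  {
    Order original = new Order();
    original.Lines.Add(new OrderLine { Product = "Widget" });

    Order copy = RoundTrip(original);

    // The child object's data traveled with its parent.
    Console.WriteLine(copy.Lines[0].Product);
    // The child in the copy is a new instance, not a shared reference.
    Console.WriteLine(ReferenceEquals(copy.Lines[0], original.Lines[0]));
  }
}
```

The copy should contain its own OrderLine with the same Product value, which is what we mean by saying the original object and its references are effectively cloned.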

Web services use a different serialization scheme. Rather than copying the variables within an object, web-services serialization just scans the object to find any fields that are public in scope, and any public properties that are read-write. Read-only and write-only properties are ignored, as are any nonpublic variables. The web-services serialization scheme will also serialize an object graph, but because it sees only these public members, we can't guarantee that it will capture all the data in our objects.

Web services use the XmlSerializer object, which provides this more limited form of serialization. If we return an object from a web service, we get only the information that's accessible via the object's public read-write properties and fields. Additionally, this information is always generated in XML; there's no option to generate the data in a more compact binary format. Remoting, on the other hand, uses either a SOAP formatter or a binary-formatter object to handle serialization, both of which serialize entire object graphs. The SOAP formatter generates XML output containing the data, while the binary formatter generates much more compact binary output of the same data.
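To make the difference concrete, here is a small sketch of what XmlSerializer picks up and what it skips. The Person class and the ToXml() helper are hypothetical names invented for this illustration; only XmlSerializer and StringWriter come from the .NET Framework.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class mixing the kinds of members discussed above.
// XmlSerializer requires a public class with a public default constructor.
public class Person
{
  private int _secret = 42;          // nonpublic field: skipped

  public string Name { get; set; }   // public read-write property: included

  public int Age;                    // public field: included

  public int Secret                  // read-only property: skipped
  {
    get { return _secret; }
  }
}

public static class XmlSerializationDemo
{
  // Serializes a Person the way web-services serialization would,
  // returning the XML as a string.
  public static string ToXml(Person person)
  {
    XmlSerializer serializer = new XmlSerializer(typeof(Person));
    using (StringWriter writer = new StringWriter())
    {
      serializer.Serialize(writer, person);
      return writer.ToString();
    }
  }

  public static void Main()
  {
    Person person = new Person();
    person.Name = "Rocky";
    person.Age = 30;
    Console.WriteLine(ToXml(person));
  }
}
```

The resulting XML should contain Name and Age elements but nothing for _secret or Secret, so an object deserialized from it would not carry the complete state of the original.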

Note	In this book, we'll be making heavy use of remoting's serialization to implement distributed objects. In Chapter 10, we'll also make use of web-services serialization, as we implement a web-services interface for our sample application.

We do have to write a bit of extra code to make our object support serialization for remoting. This involves applying the [Serializable()] attribute to our class, and possibly also implementing the ISerializable interface. Happily, implementing ISerializable is only necessary if the default behavior of serialization is unacceptable, which isn't usually an issue. In our case, we'll stick with the default serialization behavior, meaning that we can just use the [Serializable()] attribute.

Note

One of my design goals for this book is to decrease the amount of plumbing code a business developer has to write or even see. Although there are some interesting things we could do by implementing ISerializable, it would mean that every business object would include a fairly large amount of nonbusiness code. In fact, it would basically mean that we'd be stuck writing the GetState() and SetState() methods from my Professional Visual Basic 6 Business Objects book. Luckily, the [Serializable()] attribute does all of this work for us, so we don't have to write it manually.

The [Serializable()] Attribute

As we stated previously, applying the [Serializable()] attribute to a class tells the .NET runtime to allow the serialization of our object. Rather than writing a bunch of code by hand, we get the desired serialization behavior with one simple attribute:

  [Serializable()]
  public class MyBusinessClass
  {
  }

We can get the same effect by applying the [Serializable()] attribute and implementing the ISerializable interface along with a special constructor, but it's far easier just to use the [Serializable()] attribute!
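For comparison, the manual alternative might look roughly like the following. This is only a sketch to show how much plumbing is involved; the ManualBusinessClass name and the "name" key are assumptions made for the illustration, and it is not the approach we'll use in this book.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical class that takes over its own serialization.
[Serializable()]
public class ManualBusinessClass : ISerializable
{
  private string _name;

  public ManualBusinessClass(string name)
  {
    _name = name;
  }

  // The "special constructor": a formatter calls this during
  // deserialization to rebuild the object from the stored values.
  protected ManualBusinessClass(SerializationInfo info, StreamingContext context)
  {
    _name = info.GetString("name");
  }

  // Called during serialization; we decide exactly which values
  // go into the byte stream and under which keys.
  public void GetObjectData(SerializationInfo info, StreamingContext context)
  {
    info.AddValue("name", _name);
  }

  public string Name
  {
    get { return _name; }
  }
}
```

Every variable we add to such a class needs a matching line in both the constructor and GetObjectData(), which is exactly the nonbusiness code the attribute-only approach lets us avoid.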

The [NonSerialized()] Attribute

Once a class is marked as [Serializable()], all of its member variables will be automatically serialized into a byte stream when the object is serialized. Sometimes, however, we may have variables that shouldn't be serialized. In most cases, these will be references to other objects.

Consider an Invoice object, for instance. It contains a collection of LineItem objects, and we'd probably want to serialize those along with the Invoice, because they're essentially part of the Invoice object. However, it might also reference a Customer object. The Customer object is used by Invoice, but it isn't part of Invoice, as shown in Figure 3-13.

Figure 3-13: Relationship between Invoice, LineItem, and Customer classes

In such a case, we wouldn't want to serialize the Customer object as part of the Invoice, so our class might look like this:

  [Serializable()]
  public class Invoice
  {
    LineItemCollection _lineItems;

    [NonSerialized()]
    Customer _customer;
  }

The [NonSerialized()] attribute marks the _customer variable so that the serialization process will ignore it. The one caveat with doing this is that any code that interacts with _customer must assume that _customer could be null. If the object is serialized and deserialized, the _customer variable will have the value null in the newly created object.
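The null caveat is easy to demonstrate with a round trip. In this sketch the Buyer and Receipt classes and the RoundTrip() helper are hypothetical stand-ins for the Customer and Invoice classes, and we again assume the .NET Framework, where BinaryFormatter is available.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical stand-in for the Customer class.
[Serializable()]
public class Buyer
{
  public string Name;
}

// Hypothetical stand-in for the Invoice class.
[Serializable()]
public class Receipt
{
  public string Number;

  // Used by Receipt, but not part of it, so it is left out of the byte stream.
  [NonSerialized()]
  public Buyer PaidBy;
}

public static class NonSerializedDemo
{
  // Serializes the receipt to a memory buffer and deserializes a copy.
  public static Receipt RoundTrip(Receipt original)
  {
    BinaryFormatter formatter = new BinaryFormatter();
    using (MemoryStream buffer = new MemoryStream())
    {
      formatter.Serialize(buffer, original);
      buffer.Position = 0;
      return (Receipt)formatter.Deserialize(buffer);
    }
  }

  public static void Main()
  {
    Receipt original = new Receipt();
    original.Number = "R-100";
    original.PaidBy = new Buyer { Name = "Rocky" };

    Receipt copy = RoundTrip(original);

    // The ordinary field survives the round trip.
    Console.WriteLine(copy.Number);

    // The [NonSerialized()] field does not, so guard against null before using it.
    string buyerName = copy.PaidBy != null ? copy.PaidBy.Name : "(not loaded)";
    Console.WriteLine(buyerName);
  }
}
```

In the copy, Number keeps its value while PaidBy comes back as null, so the guard falls through to the placeholder text. Any real business object would need a similar check, or some way to reload the reference after deserialization.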

Serialization and Remoting

From the explanations in this chapter so far, it's clear that serialization and remoting have a close relationship. As stated in Chapters 1 and 2, when an object that's available via remoting has the [Serializable()] attribute applied, the remoting subsystem will automatically copy our object across the network. By applying the attribute, we've created an unanchored object.

To see how this works, let's return to our earlier example application in which we created a TestServer assembly that was exposed via remoting from a TestService website. Add a serializable class named Customer to the TestServer project as follows:

  [Serializable()]
  public class Customer
  {
    string _name;
    int _threadID;

    public Customer()
    {
      _threadID = AppDomain.GetCurrentThreadId();
    }

    public string Name
    {
      get
      {
        return _name;
      }
      set
      {
        _name = value;
      }
    }

    public int CreatedID
    {
      get
      {
        return _threadID;
      }
    }

    public int CurrentID
    {
      get
      {
        return AppDomain.GetCurrentThreadId();
      }
    }
  }

Because this class is marked as [Serializable()], it's unanchored and will be passed across the network by value automatically. Notice that when the object is created, it records the thread ID where it's running, and includes properties to return that value and the thread where the object is currently running. We'll use this to prove that it has been copied back to the client.

We can then add a method to our anchored TestService class so that the client can ask for Customer objects:

  public class TestService : MarshalByRefObject
  {
    public string GetServerName()
    {
      return System.Environment.MachineName;
    }

    public int GetProcessID()
    {
      // GetCurrentProcess is a method, so it needs parentheses.
      return System.Diagnostics.Process.GetCurrentProcess().Id;
    }

    public Customer GetCustomer()
    {
      Customer obj = new Customer();
      obj.Name = "Rockford Lhotka";
      return obj;
    }
  }

Now we have an anchored server-side object, TestService, which returns unanchored Customer objects on request. The Customer object will be created on the server, and then passed by value back to the client. In the end, the Customer object will physically be in the client process.

We can add a button to our TestClient application's form to retrieve the Customer object:

  private void button2_Click(object sender, System.EventArgs e)
  {
    System.Text.StringBuilder output = new System.Text.StringBuilder();
    TestServer.TestService svc = new TestServer.TestService();
    TestServer.Customer cust;

    cust = svc.GetCustomer();
    output.AppendFormat("Got customer: {0}\n", cust.Name);
    output.AppendFormat("Customer created on: {0}\n", cust.CreatedID);
    output.AppendFormat("Customer currently on: {0}\n", cust.CurrentID);
    MessageBox.Show(output.ToString());
  }

When the application is run and the button is clicked, we should get a result similar to Figure 3-14.

Figure 3-14: Example result from running the test application

Notice that the Customer object was created on a different thread from the one where it's currently running. It was created in the server process, passed by value to the client, and is now running in the client process. Remoting serialized the Customer object's data into a byte stream, transferred it to the client, and then deserialized the byte stream into a new Customer object on the client: an exact clone of the original.

Be aware that remoting doesn't transfer the code, but only the data. For this mechanism to work, the DLL containing the Customer class must be on the client machine along with the client application. In order to deserialize the byte stream on the client, remoting loads the DLL containing the Customer class, creates an empty Customer object, and then populates it with the deserialized data.

Thankfully, due to the support for no-touch deployment built into .NET, this isn't as serious a drawback as it might appear. No-touch deployment can be used to ensure that the client has the DLL containing the business class, and that remoting and serialization will work together to copy individual objects from the server to the client, and vice versa. As long as the DLL containing the unanchored class is deployed to the client workstation, remoting can be used to pass objects based on that class back and forth between client and server.

Note	For more information on using no-touch deployment in .NET, please refer to the Appendix.

Manually Invoking Serialization

Though we'll typically rely on remoting to invoke the appropriate serialization on our behalf, there are times when we might want to serialize and deserialize an object manually. One good example of this is that we might want to write a Clone() method for a class, and the easiest way to do it is via serialization as follows:

  using System;
  using System.IO;
  using System.Runtime.Serialization.Formatters.Binary;

  [Serializable()]
  public class TheClass
  {
    String _name;
    Guid _id;

    public object Clone()
    {
      MemoryStream buffer = new MemoryStream();
      BinaryFormatter formatter = new BinaryFormatter();

      // Serialize this object (and its object graph) into the memory buffer.
      formatter.Serialize(buffer, this);
      // Rewind so that deserialization reads from the start of the stream.
      buffer.Position = 0;
      return formatter.Deserialize(buffer);
    }
  }

The Clone() method here uses the binary formatter to serialize the object into a MemoryStream, which is just a memory buffer (a byte stream in memory). Once the object's data is in memory, we reset the "cursor" within the MemoryStream object to the beginning of the stream, so that we can read the data it contains. Finally, we use the formatter to deserialize the byte stream to create a new identical object.

Note

Technically, the resulting object may not be identical to the original. For instance, we may have marked some variables with the [NonSerialized()] attribute, in which case some variable values may not have been copied. Alternatively, had we implemented the ISerializable interface, we could have chosen to serialize some or none of the object's actual data.

The .NET Framework even defines a formal interface for objects that know how to clone themselves, called ICloneable, and we can use this technique to implement it:

  using System;
  using System.IO;
  using System.Runtime.Serialization.Formatters.Binary;

  [Serializable()]
  public class TheClass : ICloneable
  {
    String _name;
    Guid _id;

    public object Clone()
    {
      MemoryStream buffer = new MemoryStream();
      BinaryFormatter formatter = new BinaryFormatter();

      // Serialize this object (and its object graph) into the memory buffer.
      formatter.Serialize(buffer, this);
      // Rewind so that deserialization reads from the start of the stream.
      buffer.Position = 0;
      return formatter.Deserialize(buffer);
    }
  }

Alternatively, we can manually invoke the XmlSerializer object to serialize an object using web-services-style serialization:

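  public string GetData()
  {
    System.IO.MemoryStream buffer = new System.IO.MemoryStream();
    System.Xml.Serialization.XmlSerializer formatter =
      new System.Xml.Serialization.XmlSerializer(this.GetType());

    formatter.Serialize(buffer, this);
    // MemoryStream.ToString() returns only the type name, so decode
    // the buffered bytes (UTF-8 by default) to get the XML text.
    return System.Text.Encoding.UTF8.GetString(buffer.ToArray());
  }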

This is not a clone method. We can't use web-services serialization to clone an object, and because our current class doesn't include any public, read-write properties, this method won't actually serialize any data at all right now! However, if we use this method in an object that does have some public, read-write properties, it will return an XML string containing the values of those properties. This is exactly the XML that a web service would produce if it returned the same object as its result.