Using Value Types as Reference Types | C# Class Design Handbook: Coding Effective Classes

There are occasions when we want to use a value type where a reference type is expected. Perhaps we need to pass an instance of a value type to a method that expects an object parameter; for example, we might need to insert a value instance into a .NET Framework collection.

Boxing and Unboxing

Converting an instance of a value type to an instance of a reference type is called boxing, and the reverse process is called unboxing. These are new concepts to most developers moving to .NET, and in C# they often happen without our even being aware of them. However, it is important to understand what's happening, so we'll spend a little time looking at these processes in detail.

Value Types as Objects

Consider the following code sample:

    // Create an ArrayList object
    System.Collections.ArrayList payroll =
        new System.Collections.ArrayList();
    // Add some Money objects
    payroll.Add(new Money(30500.0));
    payroll.Add(new Money(54000.0));
    payroll.Add(new Money(27900.0));

ArrayList is a standard .NET Framework collection class, and like the other non-generic collection classes in System.Collections, it holds a collection of references to objects. These objects must be reference types, and so must be located on the managed heap. The Add() method of the ArrayList class expects an object as a parameter.

If we try to add an instance of a value type (a Money structure, defined earlier in the chapter in Defining and Using Value Types) to the ArrayList, the CLR automatically creates a boxed copy of the Money instance on the managed heap. In this sense, boxed means that the CLR has added a wrapper around the value instance to make it appear and behave like a reference type. This happens automatically in C#.

The boxed object holds a copy of the data from the value object; if the boxed object is modified, it won't affect the original value object. Likewise, if the original value object is modified, it won't affect the boxed object. This is what we'd expect from a value type.

The following code retrieves an element from the ArrayList:

    Money myCash = (Money)payroll[0];

In C#, the ArrayList class contains an Item property, which serves as the indexer for the class. This allows us to use the [] syntax to return a reference to an object on the managed heap. When we assign this to a Money instance using an explicit cast, the CLR extracts the value from the boxed object and copies it into our local Money instance. Again, we have a copy of the value, and changes to it won't affect the original or the boxed value still on the heap. This is unboxing. The CLR does the copying for us, but because of the type-safe nature of C#, we have to request it with an explicit cast. An InvalidCastException will be thrown at run time if the object in the ArrayList is not of the correct type.
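The copy semantics of boxing and unboxing can be checked with a small sketch. The Money struct below is a hypothetical stand-in with a single public Amount field, since the chapter's own definition is not repeated in this section; the BoxedCopyDemo class and its method names are also illustrative only.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for the chapter's Money struct (an assumption).
public struct Money
{
    public double Amount;
    public Money(double amount) { Amount = amount; }
}

public static class BoxedCopyDemo
{
    // Changes the original value after it has been boxed,
    // then returns the amount held by the boxed copy.
    public static double BoxedAmountAfterOriginalChanges()
    {
        Money salary = new Money(30500.0);
        ArrayList payroll = new ArrayList();
        payroll.Add(salary);      // boxes a copy of salary
        salary.Amount = 1.0;      // affects only the local value
        return ((Money)payroll[0]).Amount;
    }

    // Changes the unboxed copy, then returns the amount
    // still held by the boxed value in the list.
    public static double BoxedAmountAfterUnboxedCopyChanges()
    {
        ArrayList payroll = new ArrayList();
        payroll.Add(new Money(30500.0));
        Money myCash = (Money)payroll[0];  // unboxes into a local copy
        myCash.Amount = 1.0;               // affects only the local copy
        return ((Money)payroll[0]).Amount;
    }

    public static void Main()
    {
        Console.WriteLine(BoxedAmountAfterOriginalChanges());    // prints 30500
        Console.WriteLine(BoxedAmountAfterUnboxedCopyChanges()); // prints 30500
    }
}
```

In both cases the value held in the ArrayList is unchanged, because each conversion works on its own copy of the data.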

Important

While we have shown boxing using an ArrayList, you should be aware that boxing does not occur when value types are stored in a normal C# array defined using the [] syntax. In this situation, memory is allocated on the managed heap to hold the array elements, but the elements themselves are not boxed.
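One observable consequence is that an array element can be modified in place. The following sketch assumes a hypothetical Money struct with a public Amount field; ArrayStorageDemo and ModifyInPlace are illustrative names.

```csharp
using System;

// Hypothetical stand-in for the chapter's Money struct (an assumption).
public struct Money
{
    public double Amount;
    public Money(double amount) { Amount = amount; }
}

public static class ArrayStorageDemo
{
    public static double ModifyInPlace()
    {
        Money[] payroll = new Money[3];   // one heap block holding three Money values
        payroll[0] = new Money(30500.0);  // copies the value into the array; no box is created
        payroll[0].Amount += 500.0;       // the element is a variable, so it changes in place
        return payroll[0].Amount;
    }

    public static void Main()
    {
        Console.WriteLine(ModifyInPlace()); // prints 31000
    }
}
```

The equivalent statement against an ArrayList element, ((Money)payroll[0]).Amount += 500.0, is rejected by the compiler, because the result of an unboxing conversion is a temporary copy rather than a variable.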

C# performs boxing automatically, and apart from supplying an explicit cast when unboxing, we don't need to write any special code to make that happen. This is a mixed blessing, because boxing and unboxing impose an overhead at run time.

Performance Implications

To understand the effect this overhead may have on your application, and the situations in which it can occur, we'll look a little deeper into boxing and unboxing. Understanding these issues will help you decide whether it is better to implement a particular type as a struct or a class. Let's start by seeing what happens in the code snippet above when a Money value is added to an ArrayList.

First, a stack-based Money instance is created as a result of the new operator. The relevant constructor is called to initialize the instance's fields.
The C# compiler recognizes that the Add() method of the ArrayList class requires a reference type instead of a value type, and so emits MSIL code to perform the boxing operation.
Memory is allocated on the managed heap. This includes the memory required for the fields in the Money structure, plus some extra memory required to establish the boxed type as a reference type. For example, memory will be allocated for a method table pointer, which a value type instance stored on the stack does not need.
The fields in the stack-based Money instance, created as a result of the new operator, will be copied byte-by-byte into the newly allocated heap memory.
Finally, the address of the heap-allocated object will be returned and passed to the Add() method of the ArrayList class.
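As a rough illustration of the allocation and copy steps above, the following sketch boxes the same value twice and shows that each conversion produces a separate heap object holding equal data. The Money struct is a hypothetical stand-in, and BoxingAllocationDemo is an illustrative name.

```csharp
using System;

// Hypothetical stand-in for the chapter's Money struct (an assumption).
public struct Money
{
    public double Amount;
    public Money(double amount) { Amount = amount; }
}

public static class BoxingAllocationDemo
{
    // True only if both boxing conversions yield the same heap object.
    public static bool SameBox()
    {
        Money salary = new Money(30500.0);
        object first = salary;    // boxing: allocate on the heap, copy the fields
        object second = salary;   // a second, independent allocation and copy
        return Object.ReferenceEquals(first, second);
    }

    // True if the two boxes hold equal field values.
    public static bool SameValue()
    {
        Money salary = new Money(30500.0);
        object first = salary;
        object second = salary;
        return first.Equals(second);  // ValueType.Equals compares the fields
    }

    public static void Main()
    {
        Console.WriteLine(SameBox());    // prints False
        Console.WriteLine(SameValue());  // prints True
    }
}
```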

This process is repeated three times, once for each Money instance that we add to the ArrayList. When it comes to unboxing, the following actions reverse the process:

First, memory is reserved on the stack for the myCash variable. No constructor needs to run, because the variable is about to be overwritten with the boxed value.
Next, the boxed reference is checked to see if it is null. If it is, a NullReferenceException is thrown.
The type of the boxed object is then checked against the type declared in the explicit cast in the source code. In this example, if the object is not of the Money type, an InvalidCastException is thrown.
Finally, a pointer to the boxed value type on the managed heap is obtained and the contents of the Money fields are copied byte-by-byte into the stack-based variable myCash.
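The null check and the type check in these steps can be observed directly. This is a minimal sketch, assuming a hypothetical Money struct with a public Amount field; UnboxingChecksDemo and UnboxToMoney are illustrative names.

```csharp
using System;

// Hypothetical stand-in for the chapter's Money struct (an assumption).
public struct Money
{
    public double Amount;
    public Money(double amount) { Amount = amount; }
}

public static class UnboxingChecksDemo
{
    // Tries to unbox the argument to Money and reports what happened.
    public static string UnboxToMoney(object boxed)
    {
        try
        {
            Money m = (Money)boxed;   // null check, type check, then field copy
            return "unboxed " + m.Amount;
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    public static void Main()
    {
        Console.WriteLine(UnboxToMoney(new Money(4.0))); // prints unboxed 4
        Console.WriteLine(UnboxToMoney(null));           // prints NullReferenceException
        Console.WriteLine(UnboxToMoney(43));             // prints InvalidCastException
    }
}
```

The last call passes a boxed int, which fails the type check because the box does not contain a Money.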

You can see that if your application performs a lot of boxing and unboxing, the overhead in terms of performance and memory usage can be significant.

Other Boxing Scenarios

The previous example gives one scenario in which a value type is passed as a parameter to a method requiring a reference type. Boxing will also occur whenever you implicitly or explicitly cast an instance of a value type to a reference type. For example, the following code will create a boxed version of an int:

    int number = 43;                   // define a value type
    object o = number;                 // boxes a copy of number
    int anotherNumber = (int)o;        // unboxes the copied value

Boxing also happens when you cast a value type to an interface. By definition interfaces are reference types. So if we wish to obtain the IComparable interface from our Money structure, the following code will cause a boxing operation:

    Money someMoney = new Money(4.0); // define a value type
    IComparable iface = someMoney;    // get reference to an interface
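Because the interface reference points at a boxed copy, changes made through it do not reach the original value. The sketch below makes this visible using a hypothetical IAdjustable interface and a hypothetical Money struct that implements it; neither is taken from the chapter.

```csharp
using System;

// Hypothetical interface used only for this illustration (an assumption).
public interface IAdjustable
{
    void Add(double amount);
    double Current { get; }
}

// Hypothetical stand-in for the chapter's Money struct (an assumption).
public struct Money : IAdjustable
{
    public double Amount;
    public Money(double amount) { Amount = amount; }
    public void Add(double amount) { Amount += amount; }
    public double Current { get { return Amount; } }
}

public static class InterfaceBoxingDemo
{
    // Returns { amount in the original value, amount in the boxed copy }.
    public static double[] Run()
    {
        Money someMoney = new Money(4.0);
        IAdjustable iface = someMoney;  // boxes a copy of someMoney
        iface.Add(1.0);                 // changes the boxed copy only
        return new double[] { someMoney.Amount, iface.Current };
    }

    public static void Main()
    {
        double[] result = Run();
        Console.WriteLine(result[0]);   // prints 4
        Console.WriteLine(result[1]);   // prints 5
    }
}
```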

Another situation, which you may not be aware of, arises when checking an object's type:

    Money someMoney = new Money(4.5); // define a value type
    Type t = someMoney.GetType();     // access type information

In this situation, someMoney is again boxed. This is because Money doesn't implement GetType() itself; the method is inherited from System.Object by way of System.ValueType. In order to resolve this method call, the CLR needs a pointer to the Money method table, which can only be obtained by boxing someMoney.
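When the type is already known at compile time, as it is for a struct variable, the typeof operator yields the same Type object without involving an instance at all. A minimal sketch, again assuming a hypothetical Money struct; TypeLookupDemo is an illustrative name.

```csharp
using System;

// Hypothetical stand-in for the chapter's Money struct (an assumption).
public struct Money
{
    public double Amount;
    public Money(double amount) { Amount = amount; }
}

public static class TypeLookupDemo
{
    // Compares the Type obtained from an instance with the one from typeof.
    public static bool SameType()
    {
        Money someMoney = new Money(4.5);
        Type viaInstance = someMoney.GetType();  // the call is made on a boxed copy
        Type viaTypeof = typeof(Money);          // resolved from the type name, no instance needed
        return viaInstance == viaTypeof;
    }

    public static void Main()
    {
        Console.WriteLine(SameType());  // prints True
    }
}
```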

The best approach is to limit the amount of boxing and unboxing. If you find your application is doing a great deal of it, it may be more efficient to define your data type as a class rather than a struct.

Note

Managed Extensions for C++ makes you work harder to achieve boxing and unboxing. In this environment, you must explicitly box and unbox value instances using the __box and dynamic_cast operators respectively. The code is a little harder to read and write, but at least you know when boxing and unboxing are taking place, and that can have its advantages.