2.9. Reference and Value Types

The Common Language Runtime (CLR) supports two kinds of types: reference types and value types (see Figure 2-2). Reference types include classes, arrays, interfaces, and delegates. Value types include the primitive data types such as int, char, and byte, as well as struct and enum types. Value and reference types are distinguished by their location in the .NET class hierarchy and by the way in which .NET allocates memory for each. We'll look at both, beginning with the class inheritance hierarchy.

Figure 2-2. Hierarchy of common reference and value types

System.Object and System.ValueType

Both reference and value types inherit from the System.Object class. The difference is that reference types inherit from it either directly or through other classes, whereas value types inherit from it by way of the System.ValueType class (enum types do so through System.Enum, which itself derives from System.ValueType).

As the base for all types, System.Object provides a set of methods that you can expect to find on all types. This set includes the ToString method used throughout this chapter, as well as methods to make a shallow copy of an object (MemberwiseClone), generate a hash code for an object (GetHashCode), and compare instances for equality (Equals). Note that a hash code is not guaranteed to be unique. Chapter 4 discusses these methods in detail and describes how to implement them on custom classes.

System.ValueType inherits from System.Object. It does not add any members, but it does override some of the inherited methods to make them more suitable for value types. For example, Equals() is overridden to return true if the values of the two objects' fields match. By definition, all value types implicitly inherit from the ValueType class.
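To make the overridden Equals() behavior concrete, here is a minimal sketch. The Point struct and PointRef class are hypothetical names introduced only for this illustration; they hold the same data so that the only difference is whether Equals() comes from System.ValueType or from System.Object.

```csharp
using System;

// Hypothetical value type: inherits Equals() from System.ValueType.
struct Point
{
    public int X;
    public int Y;
    public Point(int x, int y) { X = x; Y = y; }
}

// Hypothetical reference type with the same fields:
// inherits Equals() from System.Object.
class PointRef
{
    public int X;
    public int Y;
    public PointRef(int x, int y) { X = x; Y = y; }
}

class EqualsDemo
{
    static void Main()
    {
        Point p1 = new Point(3, 4);
        Point p2 = new Point(3, 4);
        // ValueType.Equals compares the field values.
        Console.WriteLine(p1.Equals(p2));   // True

        PointRef r1 = new PointRef(3, 4);
        PointRef r2 = new PointRef(3, 4);
        // Object.Equals compares references unless the class overrides it.
        Console.WriteLine(r1.Equals(r2));   // False
    }
}
```

The two PointRef objects contain identical data but are separate objects on the heap, so the default comparison treats them as unequal.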

Memory Allocation for Reference and Value Types

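The primary difference between value and reference types is the way the CLR handles their memory requirements. Value types declared as local variables are allocated on the runtime stack, and instances of reference types are placed on the managed heap and referenced from the stack. (A value type that is a field of a class is stored inline with its containing object on the heap.)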

Figure 2-3 illustrates how the value and reference types from our example (refer to Figure 2-1) are represented in memory. Let's step through what happens when an instance of a reference type is created and is then assigned to a second variable:

 Apparel myApparel  = new Apparel();
 Apparel myApparel2 = myApparel;

Figure 2-3. Memory layout for value and reference types

1. The CLR allocates memory for the object on the top of the managed heap.
2. Overhead information for the object is added to the heap. This information consists of a pointer to the object's method table and a SyncBlockIndex that is used to synchronize access to the object among multiple threads.
3. The myApparel object is created as an instance of the Apparel class, and its Price and FabType fields are placed on the heap.
4. The reference to myApparel is placed on the stack.
5. When a new reference variable myApparel2 is created, it is placed on the stack and given a pointer to the existing object. Both reference variables myApparel and myApparel2 now point to the same object.

Creating a reference object can be expensive in time and resources because of the multiple steps and required overhead. However, setting additional references to an existing object is quite efficient, because there is no need to make a physical copy of the object. The reverse is true for value types.
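The effect of the final step can be observed directly in code. The following is a sketch only: the Price and FabType field names come from the discussion above, but their types (decimal and string) and the sample values are assumptions made for this illustration.

```csharp
using System;

// Sketch of the Apparel class; the field types are assumptions.
class Apparel
{
    public decimal Price;
    public string FabType;
}

class ReferenceDemo
{
    static void Main()
    {
        Apparel myApparel = new Apparel();   // object allocated on the managed heap
        myApparel.Price = 250m;              // sample values, chosen arbitrarily
        myApparel.FabType = "Wool";

        Apparel myApparel2 = myApparel;      // copies the reference, not the object
        myApparel2.Price = 300m;             // visible through both variables

        Console.WriteLine(myApparel.Price);                                // 300
        Console.WriteLine(Object.ReferenceEquals(myApparel, myApparel2));  // True
    }
}
```

Because no second object is created, the assignment costs no more than copying a pointer, which is why additional references are inexpensive.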

Boxing

.NET contains a special object type that accepts values of any data type. It provides a generic way to pass parameters and assign values when the type of the value being passed or assigned is not tied to a specific data type. Anything assigned to object must be treated as a reference type and stored on the heap. Consider the following statements:

 int age = 17;
 object refAge = age;

The first statement creates the variable age and places its value on the stack; the second assigns the value of age to a reference type. It places the value 17 on the heap, adds the overhead pointers described earlier, and adds a stack reference to it. This process of wrapping a value type so that it is treated as a reference type is known as boxing. Conversely, converting a reference type to a value type is known as unboxing and is performed by casting an object to its original type. Here, we unbox the object created in the preceding example:

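 int newAge = (int) refAge;
 string newAgeText = (string) refAge;   // Fails. InvalidCastException

Note that the type named in the cast must be the same as the type of the value that was originally boxed. A boxed int can be unboxed only as an int; even a cast to another numeric type, such as long, fails.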
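The following short program is a sketch that pulls the boxing and unboxing statements together. The BoxingDemo class name and the wrongType variable are introduced only for this example. It also shows that the boxed value is a copy: changing age after boxing does not affect refAge.

```csharp
using System;

class BoxingDemo
{
    static void Main()
    {
        int age = 17;
        object refAge = age;          // boxing: the value 17 is copied to the heap

        age = 18;                     // changes only the original variable
        int newAge = (int)refAge;     // unboxing: copies the boxed value back
        Console.WriteLine(newAge);    // 17

        try
        {
            // A boxed int cannot be unboxed directly as a long.
            long wrongType = (long)refAge;
            Console.WriteLine(wrongType);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
    }
}
```

To obtain a long from the boxed value, unbox to int first and then convert: (long)(int)refAge.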

In general, boxing can be ignored because the CLR handles the details transparently. However, it should be considered when designing code that stores large amounts of numeric data in memory. To illustrate, consider the System.Array and ArrayList classes mentioned earlier. Both are reference types, but they perform quite differently when used to store simple data values.

The ArrayList methods are designed to work on the generic object type. Consequently, the ArrayList stores all its items as reference types. If the data to be stored is a value type, it must be boxed before it can be stored. The array, on the other hand, can hold both value and reference types. It treats the reference types as the ArrayList does, but does not box value types.

The following code creates an array and an ArrayList of integer values. As shown in Figure 2-4, the values are stored quite differently in memory.

 // Requires: using System.Collections;
 // Create array with four values
 int[] ages = {1, 2, 3, 4};

 // Place four values in ArrayList
 ArrayList agesList = new ArrayList();
 for (int i = 0; i < 4; i++)
 {
    agesList.Add(i); // expects object parameter, so i is boxed
 }

Figure 2-4. Memory layout comparison of `Array` and `ArrayList`

The array stores the values as unboxed int values; the ArrayList boxes each value and then adds the overhead required by reference types. If your application stores large amounts of data in memory and does not require the special features of the ArrayList, the array is the more efficient implementation. If you are using .NET 2.0 or later, the generic List<T> class is usually the better choice because it eliminates boxing while retaining the flexibility of the ArrayList.
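As a rough sketch of the difference in usage, the following compares an ArrayList with a generic List<int>. The variable names are hypothetical. The point to notice is the parameter type of Add: object for ArrayList, int for List<int>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class ListDemo
{
    static void Main()
    {
        ArrayList boxedAges = new ArrayList();
        List<int> typedAges = new List<int>();

        for (int i = 1; i <= 4; i++)
        {
            boxedAges.Add(i);   // parameter is object, so each int is boxed
            typedAges.Add(i);   // parameter is int, so no boxing occurs
        }

        int first = (int)boxedAges[0];   // cast (unboxing) is required
        int second = typedAges[1];       // no cast is needed

        Console.WriteLine(first + second);   // 3
    }
}
```

Besides avoiding the boxing overhead, the generic list is type-safe: an attempt to add a string to typedAges is rejected at compile time rather than surfacing later as an InvalidCastException.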

Summary of Value and Reference Type Differences

Memory Allocation

We have seen that memory allocation is the most significant difference between value and reference types. Reference types are allocated on the heap and value types on the thread or call stack. A reference type variable that has not yet been assigned an object holds the default value null, indicating that it does not point to anything. A value type defaults to zero (0) or its equivalent, such as false for a bool.
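These defaults are easiest to see on the fields of a newly created object. The Defaults class below is a hypothetical example written for this purpose. (C# requires local variables to be assigned explicitly before use, so fields are used here; the compiler may warn that they are never assigned.)

```csharp
using System;

// Hypothetical class whose fields are never explicitly assigned.
class Defaults
{
    public int Count;      // value type: defaults to 0
    public bool IsSet;     // value type: defaults to false
    public string Name;    // reference type: defaults to null
    public int[] Items;    // arrays are reference types: defaults to null
}

class DefaultsDemo
{
    static void Main()
    {
        Defaults d = new Defaults();
        Console.WriteLine(d.Count);           // 0
        Console.WriteLine(d.IsSet);           // False
        Console.WriteLine(d.Name == null);    // True
        Console.WriteLine(d.Items == null);   // True
    }
}
```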

Releasing Memory

Memory on the stack is freed when a variable goes out of scope. Memory on the heap is released by a garbage collection process that runs when a memory threshold is reached. Garbage collection is controlled by .NET and occurs automatically at unpredictable intervals. Chapter 4 discusses it in detail.

Variable Assignments

When a reference type is assigned to a variable, the variable receives a pointer to the original object rather than a copy of the object itself. When a value type is assigned to a variable, a field-by-field copy of the original is made and stored in the new variable.
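A small sketch shows the two assignment behaviors side by side. The Dimensions struct is a hypothetical type defined just for this example; an int array stands in for the reference type.

```csharp
using System;

// Hypothetical value type used to demonstrate copy-on-assignment.
struct Dimensions
{
    public int Width;
    public int Height;
}

class AssignmentDemo
{
    static void Main()
    {
        Dimensions original = new Dimensions();
        original.Width = 10;
        original.Height = 20;

        Dimensions copy = original;   // field-by-field copy of the struct
        copy.Width = 99;              // affects only the copy

        Console.WriteLine(original.Width);   // 10
        Console.WriteLine(copy.Width);       // 99

        int[] data = { 1, 2, 3 };
        int[] alias = data;           // array is a reference type: pointer is copied
        alias[0] = 42;                // affects the single shared array

        Console.WriteLine(data[0]);          // 42
    }
}
```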