Boxing


Because local variable value types are stack based and their interfaces and System.Object are heap based, an important question to consider is what happens when a value type is converted to one of its implemented interfaces or to its root base class, object. The cast is known as boxing and it has special behavior. Casting from a value type to a reference type involves several steps.

  1. First, memory is allocated on the heap that will contain the value type's data and a little overhead (a SyncBlockIndex and method table pointer).

  2. Next, a memory copy occurs from the value type's data on the stack, into the allocated location on the heap.

  3. Finally, the object or interface reference is updated to point at the location on the heap.
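The copy in step 2 can be observed from C# code. The following is a minimal sketch, not part of the original listings; the class and method names are hypothetical. It changes the stack variable after boxing and shows that the boxed copy on the heap is unaffected, and that two separate box operations yield two separate heap objects.

```csharp
using System;

// Hypothetical demo class; the names are assumptions for illustration.
public static class BoxingCopyDemo
{
    // Returns the boxed value after the original variable has been changed.
    public static int BoxedValueAfterChange()
    {
        int number = 42;
        object thing = number;   // Box: heap allocation plus a copy of 42
        number = 100;            // Changes only the stack variable
        return (int)thing;       // Unbox: the heap copy still holds 42
    }

    // Boxes the same value twice and compares the resulting references.
    public static bool TwoBoxesShareReference()
    {
        int number = 42;
        object first = number;   // First box
        object second = number;  // Second, independent box
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange());   // prints 42
        Console.WriteLine(TwoBoxesShareReference());  // prints False
    }
}
```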

The reverse operation is unboxing. By definition, the unbox CIL instruction simply dereferences the data on the heap; it doesn't include the copy from the heap to the stack. In most cases with C#, however, a copy follows unboxing anyway.

Boxing and unboxing are important to consider because boxing has some performance and behavioral implications. Besides learning how to recognize them within C# code, a developer can count the box/unbox instructions in a particular snippet of code by looking through the CIL. Each operation has specific instructions, as shown in Table 8.1.

Table 8.1. Boxing Code in CIL

C# Code

    static void Main()
    {
        int number;
        object thing;

        number = 42;

        // Boxing
        thing = number;

        // Unboxing
        number = (int)thing;

        return;
    }

CIL Code

    .method private hidebysig
        static void Main() cil managed
    {
      .entrypoint
      // Code size       21 (0x15)
      .maxstack  1
      .locals init ([0] int32 number,
               [1] object thing)
      IL_0000: nop
      IL_0001: ldc.i4.s   42
      IL_0003: stloc.0
      IL_0004: ldloc.0
      IL_0005: box        [mscorlib]System.Int32
      IL_000a: stloc.1
      IL_000b: ldloc.1
      IL_000c: unbox.any  [mscorlib]System.Int32
      IL_0011: stloc.0
      IL_0012: br.s       IL_0014
      IL_0014: ret
    } // end of method Program::Main



When boxing occurs in low volume, the performance concerns are irrelevant. However, boxing is sometimes subtle and frequent occurrences can make a difference with performance. Consider Listing 8.5 and Output 8.1.

Listing 8.5. Subtle Box and Unbox Instructions

    using System;

    class DisplayFibonacci
    {
        static void Main()
        {
            int totalCount;
            System.Collections.ArrayList list =
                new System.Collections.ArrayList();

            Console.Write("Enter a number between 2 and 1000:");
            totalCount = int.Parse(Console.ReadLine());

            // Execution-time error:
            // list.Add(0);  // Cast to double or 'D' suffix required
                             // Whether cast or using 'D' suffix,
                             // CIL is identical.
            list.Add((double)0);
            list.Add((double)1);
            for (int count = 2; count < totalCount; count++)
            {
                list.Add(
                    ((double)list[count - 1] +
                    (double)list[count - 2]));
            }

            foreach (double count in list)
            {
                Console.Write("{0}, ", count);
            }
        }
    }

Output 8.1.

    Enter a number between 2 and 1000:42
    0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597,
    2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418,
    317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465,
    14930352, 24157817, 39088169, 63245986, 102334155, 165580141,

The code shown in Listing 8.5, when compiled, produces five box and three unbox instructions in the resulting CIL.

  1. The first two box instructions occur in the initial calls to list.Add(). The signature for the ArrayList method is int Add(object value). As such, any value type passed to this method is boxed.

  2. Next are two unbox instructions in the call to Add() within the for loop. The return from an ArrayList's index operator is always object because that is what ArrayList collects. In order to add the two values, however, you need to cast them back to doubles. This cast back from an object to a value type is an unbox call.

  3. Now you take the result of the addition and place it into the ArrayList instance, which again results in a box operation. Note that the first two unbox instructions and this box instruction occur within a loop.

  4. In the foreach loop, you iterate through each item in ArrayList and assign them to count. However, as you already saw, the items within ArrayList are objects, so assigning them to a double is unboxing each of them.

  5. The signature for the Console.Write() overload that is called within the foreach loop is void Console.Write(string format, object arg). As a result, each call to it invokes a box operation back from double and into object.

Obviously, you can easily improve this code by eliminating many of the boxing operations. Using an object rather than double in the last foreach loop is one improvement you can make. Another would be to change the ArrayList data type to one that supports a concept known as generics (see Chapter 12). The point, however, is that boxing can be rather subtle, so developers need to pay special attention and notice situations where it could potentially occur repeatedly and affect performance.
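As a rough illustration of the generics improvement, the sketch below rewrites the Fibonacci loop with System.Collections.Generic.List<double>. It is an assumption about how the fix might look, not the book's own listing; the class name GenericFibonacci and the method Build are hypothetical. Because List<double> stores doubles directly, Add() and the indexer involve no box or unbox instructions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rewrite of Listing 8.5 for illustration only.
public static class GenericFibonacci
{
    // Builds the sequence for a totalCount of at least 2.
    // List<double> holds doubles directly, so nothing is boxed here.
    public static List<double> Build(int totalCount)
    {
        List<double> list = new List<double>();
        list.Add(0);   // The int literal converts implicitly to double
        list.Add(1);
        for (int count = 2; count < totalCount; count++)
        {
            // The indexer returns double, so no cast or unbox is needed.
            list.Add(list[count - 1] + list[count - 2]);
        }
        return list;
    }

    public static void Main()
    {
        foreach (double value in Build(10))
        {
            // Calling ToString() on the double avoids the box that
            // Console.Write(string, object) would otherwise require.
            Console.Write(value.ToString() + ", ");
        }
        Console.WriteLine();
    }
}
```

Note that list.Add(0) compiles here without a cast: the parameter type is double rather than object, so the compiler applies the implicit int-to-double conversion instead of boxing an int.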

There is another unfortunate runtime-boxing-related problem. If you wanted to change the initial two Add() calls so that they did not use a cast (or a double literal), you would have to insert integers into the array list. Since ints will implicitly cast to doubles, this would appear to be an innocuous modification. However, the casts to double from within the for loop, and again in the assignment to count in the foreach loop, would fail. The problem is that immediately following the unbox operation is an attempt to perform a memory copy of the boxed int into a double. A boxed value can be unboxed only to its underlying type, so without first casting to an int, the code throws an InvalidCastException at execution time. Listing 8.6 shows a similar error commented out and followed by the correct cast.

Listing 8.6. Unboxing Must Be to the Underlying Type

    // ...
    int number;
    object thing;
    double bigNumber;

    number = 42;
    thing = number;
    // ERROR: InvalidCastException
    // bigNumber = (double)thing;
    bigNumber = (double)(int)thing;
    // ...
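The failure can be confirmed at execution time. The following sketch, with hypothetical class and method names, catches the exception thrown by the direct cast and then performs the two-step cast that unboxes to the underlying type first.

```csharp
using System;

// Hypothetical demo class; the names are assumptions for illustration.
public static class UnboxDemo
{
    // Returns true if unboxing a boxed int directly to double throws.
    public static bool DirectUnboxToDoubleThrows()
    {
        object thing = 42;               // Boxed int
        try
        {
            double bad = (double)thing;  // Unbox to the wrong type
            return bad != bad;           // Not reached; always false
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unboxes to the underlying type first, then converts to double.
    public static double UnboxThenConvert(object boxedInt)
    {
        return (double)(int)boxedInt;
    }

    public static void Main()
    {
        Console.WriteLine(DirectUnboxToDoubleThrows());  // prints True
        Console.WriteLine(UnboxThenConvert(42));         // prints 42
    }
}
```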

Advanced Topic: Value Types in the lock Statement

C# supports a lock statement for synchronizing code. The statement compiles down to System.Threading.Monitor's Enter() and Exit() methods. These two methods must be called in pairs. Enter() records the unique reference argument passed to it so that when Exit() is called with the same reference, the lock can be released. The trouble with using value types is the boxing: each time Enter() or Exit() is called, a new boxed value is created on the heap. Comparing the reference of one copy to the reference of a different copy will always return false, so you cannot pair Enter() with the corresponding Exit(). Therefore, value types in the lock statement are not allowed.
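A common way around this restriction is to lock on a dedicated reference-type field. The sketch below is one possible shape for that, not something taken from the chapter; SafeCounter and its members are hypothetical names. It also shows why a boxed value could never serve as the lock target: boxing the same int twice produces two different references.

```csharp
using System;

// Hypothetical example type; the names are assumptions for illustration.
public class SafeCounter
{
    // A dedicated reference-type object to lock on.
    private readonly object _sync = new object();
    private int _count;

    public int Increment()
    {
        // lock (_count) would not compile because int is a value type.
        lock (_sync)
        {
            _count++;
            return _count;
        }
    }

    // Boxing the same value twice produces two distinct heap objects,
    // so Enter() and Exit() would never see the same reference.
    public static bool BoxesShareReference(int value)
    {
        object first = value;
        object second = value;
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(BoxesShareReference(42));  // prints False
        SafeCounter counter = new SafeCounter();
        counter.Increment();
        Console.WriteLine(counter.Increment());      // prints 2
    }
}
```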


Listing 8.7 points out a few more runtime boxing idiosyncrasies and Output 8.2 shows the results.

Listing 8.7. Subtle Boxing Idiosyncrasies

    interface IAngle
    {
        void MoveTo(int hours, int minutes, int seconds);
    }

    struct Angle : IAngle
    {
        // ...

        // NOTE: This makes Angle mutable, against the general
        //       guideline
        public void MoveTo(int hours, int minutes, int seconds)
        {
            _Hours = hours;
            _Minutes = minutes;
            _Seconds = seconds;
        }
    }

    class Program
    {
        static void Main()
        {
            // ...

            Angle angle = new Angle(25, 58, 23);
            object objectAngle = angle;  // Box
            Console.Write(((Angle)objectAngle).Hours);

            // Unbox and discard
            ((Angle)objectAngle).MoveTo(26, 58, 23);
            Console.Write(", " + ((Angle)objectAngle).Hours);

            // Box, modify, and discard
            ((IAngle)angle).MoveTo(26, 58, 23);
            Console.Write(", " + ((Angle)angle).Hours);

            // Modify heap directly
            ((IAngle)objectAngle).MoveTo(26, 58, 23);
            Console.WriteLine(", " + ((Angle)objectAngle).Hours);

            // ...
        }
    }

Output 8.2.

25, 25, 25, 26

Listing 8.7 uses the Angle struct and IAngle interface from Listing 8.1. Note also that the IAngle.MoveTo() interface changes Angle to be mutable. This brings out some of the idiosyncrasies and, in so doing, demonstrates the importance of the guideline to make structs immutable.

In the first two lines, you initialize angle and then box it into a variable called objectAngle. Next, you call MoveTo() in order to change Hours to 26. However, as the output demonstrates, no change actually occurs the first time. The problem is that in order to call MoveTo(), the compiler unboxes objectAngle and (by definition) places a copy on the stack. Although the stack copy is successfully modified at execution time, this copy is discarded and no change occurs on the heap location referenced by objectAngle.

In the next example, a similar problem occurs in reverse. Instead of calling MoveTo() directly, the value is cast to IAngle. The cast invokes a box instruction and the runtime copies the angle data to the heap. Next, MoveTo() modifies the copy on the heap before the call returns. No copy back from the heap to the stack occurs. Instead, the modified heap data is ready for garbage collection while the data in angle remains unmodified.

In the last case, the cast to IAngle occurs with the data on the heap already, so no copy occurs. MoveTo() updates the _Hours value and the code behaves as desired.
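Because Listing 8.7 elides parts of Angle, the three cases can also be reproduced with a smaller, self-contained sketch. The types IResettable and Reading below are hypothetical stand-ins for IAngle and Angle, introduced only for illustration.

```csharp
using System;

// Hypothetical stand-ins for IAngle and Angle.
public interface IResettable
{
    void SetValue(int value);
}

public struct Reading : IResettable
{
    private int _value;
    public Reading(int value) { _value = value; }
    public int Value { get { return _value; } }
    // A mutating method, against the guideline to keep structs immutable.
    public void SetValue(int value) { _value = value; }
}

public static class BoxedMutationDemo
{
    public static string Run()
    {
        Reading reading = new Reading(25);
        object boxed = reading;                       // Box

        // Unbox to a temporary copy, modify the copy, discard it.
        ((Reading)boxed).SetValue(26);
        int afterUnbox = ((Reading)boxed).Value;      // Still 25

        // Box a new copy, modify that box, discard it.
        ((IResettable)reading).SetValue(26);
        int afterBox = reading.Value;                 // Still 25

        // Already on the heap: the interface call modifies it in place.
        ((IResettable)boxed).SetValue(26);
        int afterInterface = ((Reading)boxed).Value;  // Now 26

        return afterUnbox + ", " + afterBox + ", " + afterInterface;
    }

    public static void Main()
    {
        Console.WriteLine(Run());  // prints 25, 25, 26
    }
}
```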

Advanced Topic: Unboxing Avoided

As discussed earlier, the unboxing instruction does not include the copy back to the stack. In fact, some languages support the ability to access value types on the heap directly. This is generally not possible with C#. However, when accessing the boxed value via its interface, no copy is necessary.

Listing 8.7 added an interface implementation to the Angle struct. Listing 8.8 uses the interface to avoid unboxing.

Listing 8.8. Avoiding Unboxing and Copying

    int number;
    object thing;
    number = 42;
    // Boxing
    thing = number;
    // No unbox instruction.
    string text = ((IFormattable)thing).ToString(
        "X", null);
    Console.WriteLine(text);

Interfaces are reference types anyway, so calling one of their members does not even require unboxing. Furthermore, calling a struct's ToString() method (one that overrides object's ToString() method) does not require an unbox. At compile time, it is clear that a struct's overriding ToString() method will always be called because all value types are sealed. The result is that the C# compiler can emit a direct call to the method without unboxing.





Essential C# 2.0
ISBN: 0321150775
EAN: 2147483647
Year: 2007
Pages: 185
