Memory Management Under the Hood

 
Chapter 4 - Advanced C# Topics
by Simon Robinson et al.
Wrox Press 2002

One of the declared advantages of C# programming is that the programmer doesn't need to worry about detailed memory management; in particular, the garbage collector deals with all memory cleanup on your behalf. The result is that you can get something close to the efficiency of languages like C++ without the complexity of having to handle memory management yourself in the way that you need to in C++. However, although you don't have to manage memory manually, it still pays to understand what is going on behind the scenes if you want to write efficient code. In this section we will take a look at what happens in the computer's memory when you allocate variables. I should stress that the precise details of much of the content of this section are undocumented. You should interpret this section as a rather simplified guide to the general principles involved, rather than as a statement of exact implementation.

Value Data Types

We will start off by looking at what happens when you create a variable of a value type. We will examine what happens in memory when you execute these lines of code:

   {
      int nRacingCars = 10;
      double engineSize = 3000.0;
      // do calculations;
   }

In this code we have indicated to the compiler that we need space in memory to store an integer and a double, and that these memory locations are to be referred to respectively as nRacingCars and engineSize. The line that declares each variable indicates the point at which we will start requiring access to this variable, and the closing curly brace indicates the point at which the variables go out of scope.

Windows uses a system known as virtual addressing, in which the mapping from the memory address seen by your program to the actual location in hardware memory or on disk is entirely managed behind the scenes by Windows. The result of this is that each application on a 32-bit processor sees 4GB of available memory, irrespective of how much hardware memory you actually have in your computer (on 64-bit processors this number will be greater). This 4GB of memory is known as the virtual address space or virtual memory. For convenience we will continue referring to it simply as 'memory'.

Each memory location from this 4GB is numbered starting from zero. If you want to indicate a value stored at a particular location in memory, you need to supply the number that represents that memory location. In any high-level language, be it C#, VB, C++, Java or any other similar language, one of the things that the compiler does is convert the human-readable names that you have given your variables into the memory addresses that the processor understands. This 4GB of memory actually contains everything that is a part of the program, including the executable code and the contents of all variables used when the program runs. Any DLLs called up will all be loaded into this same address space; each item of code or data will have its own definite location.

Somewhere inside this memory is an area that is known as the stack. The stack is where value data types are usually stored. When you call a method, the stack is also used to copy any parameters passed in. In order to understand how the stack works, we need to notice the following important fact about the scope of variables in C#. It is always the case that if a variable a goes into scope before variable b, then b will go out of scope first. Look at this code:

   {
      int a;
      // do something
      {
         int b;
         // do something else
      }
   }

First, a gets declared. Then, inside the inner code block, b gets declared. Then the inner code block terminates and b goes out of scope, then a goes out of scope. So, the lifetime of b is entirely contained within the lifetime of a.

Note that if the compiler hits a line like int i, j , the order of coming into scope looks indeterminate. Both variables are declared at the same time and go out of scope at the same time. In this situation, it doesn't matter to us in what order the two variables are removed from memory. The compiler will internally always ensure that the one that was put in memory first is removed last, thus preserving our rule about no crossover of variable lifetimes.

This idea that you always deallocate variables in the reverse order to how you allocate them might look a bit abstract, but it is crucial to the way that the stack works. Let's see what actually happens when we declare the variables nRacingCars and engineSize from our earlier example.

Somewhere in the program is something called the stack pointer. This is simply a variable (or an address in memory) that tells us the address of the next free location in the stack. When the program first starts running, the stack pointer will point to just past the end of the block of memory that is reserved for the stack. The stack actually fills downwards, from high memory addresses to low addresses. As data is put on the stack, the stack pointer will be adjusted accordingly, so it always points to just past the next free location. Now, we don't know exactly where in the address space the stack is - we don't need to know for C# development - but let's say for the sake of argument that, immediately before the above code that allocates the variables is executed, the stack pointer contains the value 800000 (or, in hexadecimal, 0xC3500). We are taking this memory location just for the sake of argument, to keep the numbers simple. In fact, later on when we start running code that uses pointers, we will see that the stack actually starts round about memory location 1243336 (0x12F8C8). However, we can explain the principles of how the stack works just as well using any address, so we may as well pick a simple one. We will also mostly use decimal for addresses, again for simplicity, although it is more common to write memory addresses in hexadecimal format.

The situation is illustrated in the diagram. In the diagram, bold text indicates the contents of memory locations; plain text indicates the address or a description of the location:

At this point, the variable nRacingCars comes into scope, and the value 10 is placed in it. What happens is that the value 10 will be placed in locations 799996-799999, the four bytes just below the location pointed to by the stack pointer - which are the first four free bytes on the stack. Four bytes, because that's how much memory is needed to store an int. To accommodate this, 4 will be subtracted from the stack pointer, so it now points to the location 799996, just after the new first free location.

The next line declares the variable engineSize, a double, and initializes it to the value 3000.0. A double occupies 8 bytes, so the value 3000.0 will be placed in locations 799988-799995 in the stack, in whatever format the processor uses for 8-byte floating-point numbers, and the stack pointer will be decremented by 8, so that once again it points just past the next free location on the stack.

When engineSize goes out of scope, the computer knows that it is no longer needed. Due to the way variable lifetimes are always nested, we can guarantee that, whatever else has happened while engineSize was in scope, the stack pointer will at this time be pointing to the location at which engineSize was stored. The process of removing this variable from scope is simple. The stack pointer is incremented by 8, so that it now points to where engineSize used to be. At this point in our code, we are at the closing curly brace, at which nRacingCars goes out of scope too, so the stack pointer gets incremented again, by 4 this time. If another variable were to come into scope at this point, it would just overwrite memory descending from location 799999, where nRacingCars used to be stored.
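The whole allocate-and-release cycle described above amounts to nothing more than pointer arithmetic. The following sketch simulates it, using the illustrative 800000 starting address from the text (the class and variable names are ours, not part of the chapter's examples):

```csharp
using System;

class StackPointerDemo
{
    static void Main()
    {
        int stackPointer = 800000;       // illustrative starting address, as in the text

        // int nRacingCars = 10;  -> reserve 4 bytes (value goes at 799996-799999)
        stackPointer -= sizeof(int);
        Console.WriteLine(stackPointer); // 799996

        // double engineSize = 3000.0;  -> reserve 8 bytes (value goes at 799988-799995)
        stackPointer -= sizeof(double);
        Console.WriteLine(stackPointer); // 799988

        // engineSize goes out of scope: its 8 bytes are reclaimed first
        stackPointer += sizeof(double);

        // nRacingCars goes out of scope: its 4 bytes are reclaimed last
        stackPointer += sizeof(int);
        Console.WriteLine(stackPointer); // back to 800000
    }
}
```

Note that the two increments happen in the reverse order of the decrements; that reversal is exactly the nested-lifetime rule in action.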

I've gone into a lot of detail about how variables are allocated, although even this discussion hasn't been exhaustive. For example, I've not touched on how the compiler is able to figure out the address of each variable. The thing I really want you to notice is just how fast and efficient the process of allocating variables on the stack is. There is little to be done other than increment the stack pointer - a simple arithmetic operation that will take just a few clock cycles.

Reference Data Types

While the stack gives very high performance, it is not really flexible enough to be used for all variables. The requirement that the lifetimes of variables must be nested is too restrictive for many purposes. Often, you will want to use some method to allocate memory to store some data, and keep that data available long after that method has exited. This possibility exists whenever storage space is requested with the new operator - as is the case for all reference types. That's where the managed heap comes in.

If you have done any coding that required low-level memory management in the past, you will be familiar with the stack and the heap as used in pre-.NET programs. The managed heap is not, however, quite the same as the heap that pre-.NET code such as classic C++ uses. It works under the control of the garbage collector, and this carries significant performance benefits compared to traditional heaps.

The managed heap (or just heap for short) is just another area of memory from that available 4GB. To see how the heap works and how memory is allocated for reference data types, examine this bit of code:

   void DoWork()
   {
      Customer arabel;
      arabel = new Customer();
      Customer mrJones = new Nevermore60Customer();
   }

In this code, we have assumed the existence of two classes, Customer and Nevermore60Customer. These classes are in fact taken from the Mortimer Phones examples developed in Appendix A.

We declare a Customer reference called arabel. The space for this will be allocated on the stack, but remember that this is only a reference, not an actual Customer instance. The amount of space taken by the arabel reference will be 4 bytes - enough to store the address at which an instance of a Customer is actually stored, since 4 bytes can hold any address between 0 and 4GB.
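A quick sketch makes this distinction concrete (the Customer class here is a minimal stand-in, not the one from Appendix A): declaring the reference gives you only the reference; until something is assigned to it, there is no instance behind it at all.

```csharp
using System;

class Customer
{
    public string Name = "unnamed";
}

class ReferenceDemo
{
    static void Main()
    {
        Customer arabel = null;            // only the reference exists; no Customer instance yet
        Console.WriteLine(arabel == null); // True - the reference points at nothing

        // On current runtimes a reference is the size of a machine address:
        // 4 bytes in a 32-bit process (as in this chapter), 8 bytes in a 64-bit one.
        Console.WriteLine(IntPtr.Size == 4 || IntPtr.Size == 8); // True

        arabel = new Customer();           // now an instance exists on the heap
        Console.WriteLine(arabel == null); // False
    }
}
```

The chapter's figure of 4 bytes assumes the 32-bit processes of its era; on a 64-bit runtime the same reference takes 8 bytes, but the principle is unchanged.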

Then we get to the next line:

 arabel = new Customer(); 

This line of code does several things. First, it allocates memory on the heap to store a Customer instance (a real instance, not just an address). Then it sets the variable arabel to store the address of the memory it has just allocated. It will also call the appropriate Customer() constructor to initialize the fields in the class instance, but we won't worry about that part here.
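These three effects of new can be observed directly. In this sketch (again using a hypothetical stand-in Customer, not the Appendix A class), each new yields a distinct heap instance, and the constructor has already run by the time the reference is usable:

```csharp
using System;

class Customer
{
    public int Balance;
    public Customer() { Balance = 100; }  // the constructor runs as part of 'new'
}

class NewDemo
{
    static void Main()
    {
        Customer a = new Customer();  // allocate on the heap, run constructor, store address in a
        Customer b = new Customer();  // a second, entirely separate allocation

        Console.WriteLine(a.Balance);             // 100 - fields were initialized by the constructor
        Console.WriteLine(ReferenceEquals(a, b)); // False - two distinct instances, two addresses
        Console.WriteLine(ReferenceEquals(a, a)); // True - same reference, same address
    }
}
```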

The Customer instance will not be placed on the stack - it will be placed on the heap. Now, we don't know precisely how many bytes a Customer instance occupies, but let's say for the sake of argument it is 32. These 32 bytes contain the instance fields of Customer as well as some information that .NET uses to identify and manage its class instances, including the vtable.

The .NET runtime will look through the heap and grab the first contiguous block of 32 bytes that is unused. For the sake of argument, we will say that this happens to be at address 200000, and that the arabel reference occupies locations 799996-799999 on the stack. (Actually, from experiment, 200000 is nowhere near where the heap really is - that is closer to location 12000000 - but we want to keep the numbers simple.) This means that before the Customer object is instantiated, the contents of memory will look like this.


Note that, unlike the stack, memory in the heap is allocated upwards, so the free space can be found above the used space.

After allocating the object, memory looks like this.


The next line of code that we execute does the same thing again, except that this time space on the stack for the mrJones reference needs to be allocated at the same time as space for the Nevermore60Customer instance is allocated on the heap:

 Customer mrJones = new Nevermore60Customer(); 

This line will result in 4 bytes being allocated on the stack to hold the mrJones reference, stored at locations 799992-799995, while the instance that mrJones refers to will be allocated from location 200032 upwards on the heap.

You can already see from this that the process of setting up a reference variable is more complex than it is for setting up a value variable, and inevitably there will be a performance hit. In fact we have somewhat oversimplified the process too, since the .NET runtime will need to maintain information about the state of the heap, and this information will also need to be updated whenever new data is added to the heap. However, we do now have a more flexible scheme for variable lifetime. To illustrate this, let's look at what happens when our method exits and the arabel and mrJones references go out of scope. In accordance with the normal working of the stack, the stack pointer will be incremented so these variables no longer exist. However, these variables only store addresses, not the actual class instances. The data for those class instances is still sitting there on the heap, where it will remain until either the program terminates, or the garbage collector is called. More importantly from our point of view, it is perfectly possible for us to set other reference variables to point to the same objects - this means that those objects will be available after the arabel and mrJones references have gone out of scope. And this is an important difference between the stack and the heap: objects allocated successively on the heap do not have nested lifetimes.
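This flexibility is easy to demonstrate: a method can create an object on the heap and hand back a reference to it, and the instance stays alive after the method's stack frame has disappeared. (CreateCustomer and the Name field below are illustrative names, not from the chapter's examples.)

```csharp
using System;

class Customer
{
    public string Name;
}

class LifetimeDemo
{
    // The local reference inside this method vanishes from the stack when
    // the method returns, but the Customer instance survives on the heap
    // because the caller receives a reference to it.
    static Customer CreateCustomer()
    {
        Customer local = new Customer();
        local.Name = "Arabel Jones";
        return local;  // hand the address out; the heap object stays reachable
    }

    static void Main()
    {
        Customer survivor = CreateCustomer();
        Console.WriteLine(survivor.Name);  // the object outlived CreateCustomer

        // Two references to one object: lifetimes need not be nested,
        // and both references see the same instance.
        Customer alias = survivor;
        alias.Name = "Mr Jones";
        Console.WriteLine(survivor.Name);  // changed via the other reference
    }
}
```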

That's the power of reference data types, and you will see this feature used extensively in C# code. It means that we have a high degree of control over the lifetime of our data, since it is guaranteed to exist in the heap as long as we are maintaining some reference to it.

The above discussion and diagrams show the managed heap working very much like the stack, to the extent that successive objects are placed next to each other in memory. This means that we can work out where to place the next object very simply, by using a heap pointer that indicates the next free memory location, and which gets adjusted as we add more objects to the heap. However, there appears to be a problem here. When we explained the operation of the stack, we emphasized that it was only possible for the stack to operate so efficiently because of the way that lifetimes of stack variables are nested. The lifetimes of objects on the heap are not nested in this way - the references to them can go out of scope in any order - yet the heap apparently works as if they followed this rule. How is that possible? The answer is that it works thanks to the garbage collector. When the garbage collector runs, it will remove all those objects from the heap that are no longer referenced. Immediately after it has done this, the heap will have objects scattered on it, mixed up with memory that has just been freed, a bit like this:

If the managed heap stayed looking like that, allocating further objects on it would be an awkward process, with the computer having to search through it looking for a block of memory big enough to store each object. However, the garbage collector doesn't leave the heap in this state. As soon as it has freed up all the objects it can, it compacts the others by moving them back together to form one contiguous block again. This means that the heap can continue working just like the stack as far as locating where to store new objects is concerned. Of course, when the objects are moved about, all the references to those objects need to be updated with the correct new addresses, but the garbage collector handles that too.
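The compaction step can be sketched with a toy model. The 'heap' below is just a list of blocks with offsets; collection drops unreferenced blocks and slides the survivors down, so allocation is once again a simple pointer bump. This is a deliberately crude simulation of the idea, not of the real collector:

```csharp
using System;
using System.Collections.Generic;

class Block
{
    public string Name;
    public int Size;
    public bool Referenced;
    public int Address;  // offset from the start of the toy heap
}

class ToyHeap
{
    private readonly List<Block> blocks = new List<Block>();
    public int HeapPointer { get; private set; }  // next free offset - allocation just bumps this

    public Block Allocate(string name, int size, bool referenced)
    {
        var b = new Block { Name = name, Size = size, Referenced = referenced, Address = HeapPointer };
        blocks.Add(b);
        HeapPointer += size;  // pointer-bump allocation, just like the stack
        return b;
    }

    // Collect: drop dead blocks, then slide the survivors into one contiguous
    // run and fix up their addresses - the "reference updating" the text describes.
    public void Collect()
    {
        blocks.RemoveAll(b => !b.Referenced);
        int next = 0;
        foreach (var b in blocks)
        {
            b.Address = next;  // move the block down to close the gap
            next += b.Size;
        }
        HeapPointer = next;    // free space is contiguous again
    }
}

class CompactionDemo
{
    static void Main()
    {
        var heap = new ToyHeap();
        heap.Allocate("a", 32, referenced: true);
        heap.Allocate("b", 16, referenced: false);  // will become garbage
        var c = heap.Allocate("c", 32, referenced: true);

        Console.WriteLine(heap.HeapPointer);  // 80 - three blocks allocated
        heap.Collect();
        Console.WriteLine(heap.HeapPointer);  // 64 - b's 16 bytes reclaimed
        Console.WriteLine(c.Address);         // 32 - c was moved down to close the gap
    }
}
```

After Collect, the next Allocate would again just read HeapPointer and bump it, which is the whole point of compaction.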

This compacting action by the garbage collector is where the managed heap really works differently from old unmanaged heaps. With the managed heap, finding somewhere to put new data is just a question of reading the value of the heap pointer, rather than, say, trawling through a linked list of addresses. For this reason, instantiating reference types under .NET is much faster. Interestingly, accessing them tends to be faster too, since the objects are compacted into the same area of memory on the heap, which means less page swapping may be required. Microsoft believes that these performance gains will compensate, and possibly more than compensate, for the performance penalty that we get whenever the garbage collector needs to do some work and has to change all those references to objects it has moved.

  


Professional C#. 2nd Edition