How the .NET Garbage Collector Works | Advanced .NET Programming

The aim of the garbage collector is to locate those managed objects that are no longer referenced by any variable that is currently in scope in the program. The theory is quite simple: if an object isn't referenced directly or indirectly by any object reference in scope, there is no legitimate way that your code can ever get access to that object ever again, and therefore the object can be deleted. (By 'legitimate' way, I mean by de-referencing managed object references. Theoretically, you could get access to one of these objects by doing some clever unmanaged pointer arithmetic but the assumption is that you are not going to be silly enough to try anything like that.)

When a collection occurs, the garbage collector initially builds a list of all the objects that can be accessed from your code. It does this by systematically trawling through every variable that is in scope, recursively de-referencing every object it finds until it has the complete list. Once it has that list, it can compact up the managed heap so that these objects form a contiguous block(though there may be gaps in the block if any objects have been pinned). That's the basis of the algorithm; there are various complications - in a number of places Microsoft has made the algorithm more sophisticated either for performance reasons or in order to allow for things like finalization.

In this section, we'll go over the principles behind the .NET implementation of the garbage collector - how a collection is actually performed. The full details of the algorithm are not publicly available, since Microsoft reserves the right to tweak details of it in future, but the rough outline has been documented. Understanding how garbage collections are performed can be important, partly because it helps you to understand how to implement finalizers that have as little adverse impact on performance as possible, and partly since there are several concepts that arise from the garbage collector, including generations and resurrection, which may under some situations affect your code.

The details presented here are specifically for Microsoft's implementation of the .NET garbage collector. Bear in mind that this is not the only way of implementing a garbage collector, so don't assume that any other garbage collector that you might encounter in future is implemented the same way. Also, bear in mind that there have been years of research put into optimizing garbage collection algorithms - so although some of the following discussion might give you the impression of a very long, cumbersome task, in practice the actual algorithm and data structures used have been so finely tuned that the process of garbage collection really does consume very little time.

In the following discussion, I'll frequently refer loosely to the managed heap. In fact, there are two managed heaps, because large objects are placed on their own heap, in a separate area of memory. What counts as large is presumably subject to change - tests by one of the reviewers of this chapter suggest the threshold is 80 KB, but you shouldn't rely on this figure staying unchanged through different .NET versions. This separate large object heap exists for performance reasons. As we will see shortly, the garbage collector normally compacts the (small object) managed heap by moving objects that are still in use around so they form one contiguous block. For small objects, the performance gain from having a small and contiguous heap outweighs the cost of performing the compaction. For sufficiently large objects, however, the cost of moving the objects is too great. So the large object heap is not compacted. Obviously, for the vast majority of applications, only a very tiny proportion of objects will be large enough to be placed on the large object heap.

Invoking the GC

In principle, there are two possible ways that the system might automatically instigate a collection. There could be some background thread that continuously monitors memory usage and instigates a collection whenever it deems it appropriate, or alternatively a check can be made whenever an object is allocated (in which case no background thread is necessary). Of these, Microsoft has gone for the second approach.

This means that in .NET, whenever an object is allocated, there is a possibility that a collection might be performed - this will happen if the memory requested pushes the data occupied by objects on the managed heap up to a combined size at which the garbage collector deems it beneficial to perform a garbage collection - and if that is the case then the garbage collector kicks in. At present, the actual test used appears to be that a portion of the heap known as Generation 0 (we'll explain generations later) is full, but we can expect that these details will certainly be tweaked for performance as future versions of .NET are released.

Besides this automatic invocation of the GC, it is also possible to explicitly request a collection in your code, by calling the static method, System.GC.Collect() - we'll examine this scenario later.

Taking Control of the Program

The first thing the garbage collector needs to do once invoked is to take over control of all the other running threads in your program. The collector will be running on whatever thread called new Foo() or GC.Collect(), but it clearly needs to intercept and pause any other threads to prevent them from attempting to access the heap while the GC is doing its job. Precisely how the GC does this is an implementation detail, and is going to be dependent on factors such as whether your application is running on a workstation or server (because of different performance requirements for these platforms), whether the application is running on a multiprocessor system or not, whether you are using multiple threads, and whether any of the threads are executing unmanaged code at the time the collection starts. You don't really need to worry about the details, other than the fact that one way or another the garbage collector will either suspend or take over the threads in the process, causing the application to pause while the collection happens. However, I will briefly mention the main techniques used, just so you can recognize the names if you see them in documentation:

Hijacking. The garbage collector overwrites the return address from a method with the address of one of its own methods. Result: when some method that is being executed returns, control gets transferred to the garbage collector instead of the invoking method.
Safe points. These are points in the code at which the JIT compiler decides that it would be suitable to suspend the thread so that a garbage collection can take place. The JIT compiler will insert calls at these points to a method that quickly checks whether garbage collection is pending, and if so suspends the thread.
Interruptible Code. This is the simplest technique in concept, though probably not in implementation. The garbage collector simply suspends all threads in your application. It then inspects the memory and uses tables produced by the JIT compiler to determine the exact state your application was in when the threads got suspended, and therefore which variables are in scope.

Identifying the Garbage

The garbage collector starts off with what are known as the roots of the program. The roots are those object references that can be immediately and correctly identified as in scope. They can be identified relatively quickly because the JIT compiler creates tables indicating which roots are accessible at specific byte offsets from the start of each method. The program roots consist of the following objects:

All global variables
All static fields of classes that are currently loaded
All variables that are stored on the current stack frame - in other words, the arguments and locals in the current method and in all of its callers, right up to the program entry point
All references that are sitting in an internal garbage collector data structure called the freachable queue (we'll discuss this queue later when we examine finalizers and resurrection)
All member fields of any of the above (where the variables concerned are not primitive types)

Note that only reference types are garbage collected, but value types are still inspected in the search for objects, in case they contain embedded references.

The roots form an initial list of objects that are currently accessible by code in the program. The next task is to identify any other objects that can be accessed indirectly through these roots.

The garbage collector then takes the first root and de-references it (the process is sometimes colloquially called "pointer chasing"). It examines the object so obtained (as well as any embedded value types) to see if there are any embedded references to any other objects among the member fields. The garbage collector will check any such references to make sure they are not already on the list being built of objects in scope - if one is, it means that object has already been examined and there's no need to check it again. For those objects that are not on the list, the garbage collector de-references each in turn. You can see where this is going - the same process happens recursively for every object identified. At the end of this process, when the garbage collector has completed adding all objects referenced indirectly from the last root, it will have a complete list of objects that are accessible by the program. From this list, it can deduce which memory on the heap now contains garbage - it's simply the totality of memory in the managed heap minus the memory occupied by the accessible objects.

Compacting the Heap

Our list now identifies all those objects that have any business existing on the managed heap, so the next task is usually to compact the heap down so it contains only these objects. Note that I say 'usually': the large object heap won't be compacted. It's also possible that, based on its analysis of your application as well as its assessment of memory pressure from other applications, and possibly other factors, the garbage collector might decide there is no point performing a compaction of the main heap either.

Microsoft hasn't documented exactly how the compaction process is implemented, but the logical principle would work like this: new addresses are assigned to each object, so that the objects form a contiguous block starting at the beginning of the managed heap. Each object in turn is then moved to its new location, and any references to that object are updated. (Note that this means that as part of the process of building the list of available objects, the garbage collector will need to have made an additional list, which indicates where all the references to each object are, so it can update them.) In the process, all the objects that are not referenced from anywhere and no longer needed will be lost - quite brutally: the garbage collector will simply stomp over the memory they occupy, writing new data to that memory as it wishes. Although the logical concept of compacting the heap is quite simple, you can probably see that the internal algorithm is going to be fairly complex, since for performance reasons it will need to be optimized to move as few objects as possible. And things could get hairy when you are trying to update references, when those references themselves might have already been moved if they are members of other objects. It's the kind of algorithm that I'm glad I didn't have to code up!

One point about identifying the garbage is that this algorithm clearly shows up the need for objects to be self-describing. In Chapter 3, we saw how the second word in each reference object, the method table, can be used to identify the type of the object and hence the meaning of all the data in it. In principle, the garbage collector has to look up this information for each object it discovers, in order to discover how big the object is, and which words of data in its memory will be references to other objects (in practice, this would be too slow, so a compact GC encoding struct is used). In the case of structs that are located in the stack frame, there is no such object information to identify the type of object that the garbage collector can use. Instead, for this data, the JIT compiler emits various tables of information about the state of the stack frame, which the garbage collector can use to track down the object references it needs.

The two parts of the garbage collection that we've just described, of identifying the garbage and then collecting it, are sometimes referred to as the mark and sweep phases.

Generations

The concept of generations has been introduced for performance reasons. The idea is that you can speed up the process of garbage collecting if, instead of trying to look through everything in the program to identify objects that might not be needed, you only look in the places where you are most likely to be able to find free space most quickly. So where are the places most likely to supply free space? Based on a considerable amount of documented research over many years, on both Windows and on other platforms, it turns out that in most applications, the shortest-lived objects are usually the ones allocated most recently - and this principle is especially true for object-oriented applications. So, at any given point in time, the objects that were most recently allocated are actually the ones most likely to have already moved out of scope.

The garbage collector takes advantage of this principle by assigning each object to a generation, according to how many times it has survived a garbage collection: The first time that a garbage collection happens within the lifetime of a process, all objects in the program space are subject to the collection. At the end of the collection, we have a set of objects that have now survived their first garbage collection. These are regarded as Generation 1. All objects subsequently allocated will be Generation 0. The garbage collector can easily tell which object is in which generation just from its position in the managed heap - the Generation 1 objects will be located before whatever the position of the heap pointer was just after the last collection.

click to expand

The next time a garbage collection happens, only Generation 0 variables will be cleaned up. In other words, the first part of the heap, containing the Generation 1 objects, will not be compacted. This is based on the assumption that the Generation 0 part of the heap is where most free space is to be gained. Compacting Generation 1 is on balance of probabilities likely to lead to fewer gains for the work involved. Only if doing a Generation 0 compact doesn't free up sufficient space will the garbage collector turn to Generation 1. And of course, the next time a collection happens, the Generation 1 objects get moved up to Generation 2, and the Generation 0 objects up to Generation 1. This could continue indefinitely, but Microsoft has imposed a limit on the number of generations. At the time of writing, Generation 2 is the oldest generation permitted, but Microsoft may change that if it decides that performance considerations justify it. Also bear in mind that the concept of generations does not apply to objects on the large object heap (from tests, these appear to be automatically categorized as Generation 2).

Finalizers and Resurrection

The process of removing objects that have finalizers is complicated by the need to execute the finalizers. There are a number of possible ways of dealing with this. The simplest option would be for the garbage collector, after it has constructed a list of objects that can be removed, to check if each object has a finalizer defined and if so to execute the finalizer before compacting the heap. However, Microsoft has chosen not to do this. One reason is that executing the finalizers during the garbage collection and heap compaction has the potential to dramatically increase the time taken for a garbage collection. The garbage collector has no idea how long a given finalizer is going to take to execute - and since finalizers usually do things like closing database connections or network connections, this time could be large. Moreover, the application is for all practical purposes suspended while the collection is happening. The last thing you want is some finalizer taking ages to close a connection (or perhaps taking ages simply because someone coded it up wrong). This could lead to a situation where the garbage collector sits around doing nothing because it's waiting for the finalizer to exit, your whole program sits around doing nothing because it's waiting for the garbage collector to finish and release control of the program threads, and your user sits there clicking a button every few seconds and cursing your application for being so slow. I could go on, but you should get the picture that executing finalizers during a garbage collection is a very bad idea. So instead, the garbage collector uses an algorithm in which removal of finalizable objects is delayed, so that finalizers can be executed on a separate dedicated thread while the program is running normally, in between garbage collections.

Another important reason for running the finalizers on a separate dedicated thread is that the garbage collector executes on whatever thread happened to trigger the collection. If finalizers run on that thread, there would be no way for the developer to predict what thread a finalizer will run on - if that thread happens to own certain locks on objects, there is a risk of deadlocks occurring (see Chapter 9).

In detail what happens is this: when an object is allocated in the first place, the Virtual Execution System (VES) checks whether that object is of a type that has a finalizer defined. (Note that the System.Object.Finalize() method doesn't count here - the VES knows that Object.Finalize() doesn't do anything, and it's only interested in classes that override Object.Finalize()). If an object does have a finalizer, it gets added to a data structure maintained by the garbage collector and known as the Finalization Queue.

Later on, during garbage collection, the garbage collector will obtain its list of live objects. I said earlier that any heap memory not occupied by an object on this list can be overwritten. In fact, that's not quite true: there may be objects that have ceased to be referenced from the program, but which need to have their finalizers executed - and we certainly don't want to remove these objects from memory before that has happened. To take account of this, the garbage collector will cross-reference the list of objects in the finalization queue with its list of live objects. Any object that is not live but is listed in the finalization queue needs to have its finalizer executed: any such object is removed from the finalization queue and placed instead on the freachable queue that we mentioned above (freachable means that the object is reachable but ready to be finalized). We listed the freachable queue above as one of the places where the garbage collector searches for roots. Objects on the freachable queue are regarded as still alive (albeit on .NET's equivalent of death row), along with any objects they refer to directly or indirectly. Such objects will not be garbage-collected this time around.

The garbage collector then sets about doing its work: it compacts the heap, updates all the pointers in the GC roots and referenced objects, and releases control back to the program. Meanwhile, the separate dedicated thread that is devoted to finalization is woken up. This thread sleeps as long as there are no objects on the freachable queue, and is woken when objects get placed there. It processes each of these objects by executing its finalizer, then removing the object from the finalization queue. This means that the next time a collection occurs, that object will be ready to be removed. It all sounds a complex process, but it does mean that the garbage collector doesn't have to worry about finalizers - they'll be executed in time on their dedicated thread, as the next diagram shows:

click to expand

One point to note from this is that if for any reason a finalizer hangs, this won't normally directly affect the rest of the program, but it will mean that no other finalizers will get executed. Ever. Because all finalizers are run on the same thread, each one only gets executed after the previous finalizer has exited. Also, as a minor point, although I've been talking as if the finalizers are executed between garbage collections, the finalization thread runs at a high priority, which means that in most cases each finalizer will be executed as soon as the relevant object is placed on the freachable queue. Only if the finalization code requires the thread to sleep (for example while waiting for a database connection to close) will execution of the finalizer be delayed (unless, of course, your code attaches a higher than normal priority to one of the other threads under its control).

Controlling the Garbage Collector Programmatically

In most cases it's best to leave the garbage collector to do its job, on the basis that the Virtual Execution System can judge a lot better than you when more memory is needed. However, there are occasions when you might feel it is appropriate to force a garbage collection. I should stress that these occasions are very rare, so unless you have a very good reason to force a garbage collection at a particular time, then my strong advice is not to.

In practice, there are normally only two scenarios in which it is appropriate to force a collection:

You have reached a point in your code where you know that your program isn't making heavy demands on the system, but you also know that in a short time your code will be doing some intensive processing that for performance/responsiveness reasons you don't really want to be interrupted. For example, you might be coding up a real-time controller that is about to execute a loop that must repeat exactly every 10 milliseconds.
It is sometimes suggested that calling GC.Collect() can be useful if your code has just stopped using a large number of objects, so that you can be fairly sure that a collection will remove lots of memory, forcing a reduction in the program's virtual memory. Unfortunately, the situation here isn't quite so straightforward: it is certainly true that, provided a compaction actually occurs, this might reduce the number of page faults your application experiences, assuming the objects that would be collected are scattered widely through the heap. However, a collection won't necessarily reduce the program's virtual memory because the garbage collector might choose not to decommit the memory that has been made available. Instead, the GC may decide to hang on to the memory in case it's needed by new objects in the near future, saving on a potentially expensive decommit/commit cycle. The GC will use various heuristic algorithms and statistics to determine in each case whether to decommit memory, so you shouldn't rely on that happening. We'll explain the process of decommitting memory in Chapter 7.

For situations like these, Microsoft has provided a programmatic interface to the garbage collector. This interface is available via the System.GC class, which is defined in mscorlib.dll. We won't go over all the available System.GC methods here, since they are documented in MSDN. But we will show you how to force a garbage collection, and how to control finalization.

Requesting a plain ordinary garbage collection, that is to say one that simply collects Generation 0 objects, can be done like this:

 GC.Collect();

Alternatively, you can explicitly specify the generations you want collected; for example, this code will cause Generations 0 and 1 to be collected:

 GC.Collect(1);

If you are in the situation in which you have just released a large number of objects that you want to be collected, and these objects have had a relatively long lifetime, you are probably better off finding out which generations the objects are now from and then calling GC.Collect(). Here's how you would achieve this:

 int generation = GC.GetGeneration(someObj); someObj = null; GC.Collect(generation);

In this code we assume that someObj refers to one of the objects that you no longer need, and that is representative of the age of the batch you believe has recently died. We call the GC method GetGeneration() to find out what generation this object is from. We then make sure we have no outstanding references to this object, and ask for a collection.

If you need to know the maximum generation number that the current version of the GC allows, the MaxGeneration property will return this. Hence this statement will cause all generations to be collected:

 GC.Collect(GC.MaxGeneratiton);

You can even find out an estimate of how many bytes of memory are currently occupied in the heap using the GC.GetTotalMemory() method:

 Console.WriteLine(GC.GetTotalMemory(true));

Bear in mind, however, that it is not possible to obtain an accurate picture of memory without a performance hit - you can be accurate or quick, but not both. The bool parameter passed in (true in the above snippet) controls this: a value of true means that the collector will take its time, letting finalizers run, and forcing a collection in order to compact the memory - possibly several collections in succession, in order to get a stable value for the amount of memory in use. A value of false will just take a quick estimated value - not much more than reading an internal field in the garbage collector, which may or may not be accurate.

One other useful thing you can do with the garbage collection concerns finalizers. We've already seen that if an object has a finalizer defined, the process of garbage-collecting it is much more complex, and you will not be surprised to learn that this can affect performance if there is a large number of such objects. If a given object is of a type that has a finalizer defined but you know that for some reason there is no reason for the finalizer to execute on this particular object, then you can get this object explicitly removed from the finalization queue by calling the GC.SuppressFinalize() method:

 // obj refers to an object that doesn't now need to be finalized GC.SuppressFinalize(obj);

You would typically do this if your code has just explicitly cleaned up any unmanaged resources referenced by this object, for example using the Dispose() pattern.

You can even get an object added back to the finalization queue if you realize subsequently that it does need to be finalized:

 GC.ReRegisterForFinalize(obj);

Having said that, personally I wouldn't recommend use of this method. You might have a use for it, but it's a messy technique, and if you find that you're using ReRegisterForFinalize() a lot, this suggests that there may be problems with the architecture of your code.