Implementing Dispose() and Finalize() | Advanced .NET Programming

In this section, we will examine the Finalize() and IDisposable.Dispose() methods in more detail. We will look at the situations in which you should implement these methods for a type, and the general principles you should follow when implementing them.

For this section, we will use the term client to mean any other object that instantiates or invokes methods on the class for which we are implementing Dispose() or Finalize(). You shouldn't assume that the client is in a different assembly (though if Dispose() or Finalize() is implemented by a public or protected class in a DLL assembly, it may well be).

Finalize()/Dispose() Semantics

Before we examine how to implement Finalize() and Dispose(), we will briefly review the respective purposes and semantics of these methods.

Finalize()

There is arguably only ever one reason to implement Finalize() in production code, and that is to clean up unmanaged resources. If you define a Finalize() method, the only thing you should do in it is to free any unmanaged resources that are directly referenced by that object, and perform any associated operations (for example, you'd probably wish to flush an IO buffer prior to closing it). Don't do anything else: don't try to display a message to the user, and don't try to reference any other objects, because you have no way of knowing whether those other objects still exist. Nor should you put in any code that is dependent on being called from a certain thread - as we have seen, the finalizer will be executed on a special dedicated GC thread, and it won't be on any thread that is under your control.

The definition for Finalize() in IL looks roughly like this:

    .method family virtual instance void Finalize() cil managed
    {
       // Code
    }

The CLR will recognize the method as a finalizer from its vtable slot - in other words, from the fact that it overrides Object.Finalize(). This means that there is no need for it to have the rtspecialname attribute, which you would normally associate with methods that are treated in some special way by the runtime. This does, however, mean that if coding in IL, you must take care to define the method as virtual - otherwise it won't occupy the correct vtable slot. For correct behavior, the body of the Finalize() method should also call the base class implementation of Finalize(), if there is one. Finally, it's good practice always to define Finalize() as protected, since there is really no reason for outside code to invoke it. Most high-level compilers will take care of all of these points for you when compiling finalizers.

The syntax for declaring finalizers in different high-level languages varies, since most high-level languages wrap their own syntax around finalizers to make things easier for the developer. For example, both C# and C++ use the ~<ClassName> syntax, while if coding a finalizer in VB, you should actually explicitly define the method as Finalize().

    // C# code
    class MyClass
    {
       ~MyClass()
       {
          // Finalization code here
       }
       // etc.
    }

    ' VB Code
    Class SomeClass
       Protected Overrides Sub Finalize()
          'Finalization Code Here
       End Sub
    End Class

Bear in mind that finalizer syntax in different high-level languages is purely an artefact of each language. It's the IL version I quoted earlier that represents the true picture. Both of the above snippets will compile to the correct IL.

Dispose()

If your client code is well-behaved, then it will be in Dispose() that resources will normally be freed up - whereas the finalizer really serves as an emergency last resort, in case client code fails to call Dispose(). This means that, as a good rule of thumb, if you are implementing a finalizer for a class, you should always implement IDisposable for that class too.

Unlike the finalizer, Dispose() will normally clean up managed and unmanaged resources. Why the difference? When the finalizer has been called, the garbage collector will be dealing with managed objects anyway. This isn't the case for Dispose().

Another difference between Dispose() and Finalize() is that, whereas Finalize() is a method that is known and treated specially by the CLR, Dispose() is a plain ordinary method, which just happens to be a member of an interface (IDisposable), which is commonly understood by client code. There is no intrinsic support for Dispose() in the CLR, though there is specific support for Dispose() in the syntax of C# (the C# using statement provides a simple shorthand syntax to ensure that Dispose() is invoked correctly - we'll see using in action shortly).

The fact that Dispose() is defined by the IDisposable interface is important since it gives client code a simple means of checking whether a given object implements Dispose(): just perform a cast to see if it implements IDisposable. The C# using statement also relies on the presence of this interface.
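The cast-based check described above can be sketched as follows. This is only a minimal illustration: the DisposeHelper class and its DisposeIfPossible() method are hypothetical names invented for this example, not part of the framework.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
static class DisposeHelper
{
    // Calls Dispose() on obj if (and only if) it implements IDisposable.
    // Returns true when Dispose() was actually invoked.
    public static bool DisposeIfPossible(object obj)
    {
        IDisposable disposable = obj as IDisposable;   // null if the cast fails
        if (disposable == null)
            return false;
        disposable.Dispose();
        return true;
    }
}

static class Program
{
    static void Main()
    {
        // MemoryStream implements IDisposable; string does not.
        Console.WriteLine(DisposeHelper.DisposeIfPossible(new MemoryStream())); // True
        Console.WriteLine(DisposeHelper.DisposeIfPossible("plain string"));     // False
    }
}
```

The as operator is used here rather than a direct cast so that a type that doesn't implement the interface gives a null reference instead of an InvalidCastException.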

Some classes define a Close() method instead of Dispose(). Close() is intended to perform exactly the same function as Dispose(), except that it tends to be preferred in situations in which traditional terminology talks of closing an object (such as closing a file). There may also be an implied difference in meaning: closing a resource is often understood by developers to indicate that the resource might later be reopened, whereas disposing an object implies that you have finished with that object for good. You may want to bear in mind how users will probably understand your code when you write implementations for Close() and Dispose(). Other than that, Close() is for all practical purposes another name for a method that does the same thing. Personally, I prefer Dispose() because it is defined by the IDisposable interface. If you write a Close() method instead of Dispose(), you are clearly not implementing that interface, which means you don't get any of the benefits of the interface (such as the C# using statement).
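If you do want to offer Close() for the sake of familiar terminology, one possible compromise is to implement IDisposable as well and have Close() simply forward to Dispose(). The sketch below assumes a hypothetical LogFile class whose "resource" is just a bool flag; a real class would release an actual resource at the marked point.

```csharp
using System;

// Hypothetical class, used only for illustration.
class LogFile : IDisposable
{
    private bool open = true;

    public bool IsOpen { get { return open; } }

    // Close() is just the domain-friendly name; the work is done in Dispose().
    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        open = false;   // a real class would release its resource here
    }
}

static class Program
{
    static void Main()
    {
        LogFile viaClose = new LogFile();
        viaClose.Close();
        Console.WriteLine(viaClose.IsOpen);   // False

        // Because the class implements IDisposable, using still works.
        LogFile viaUsing = new LogFile();
        using (viaUsing)
        {
            Console.WriteLine(viaUsing.IsOpen);   // True
        }
        Console.WriteLine(viaUsing.IsOpen);   // False
    }
}
```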

Cleaning up Unmanaged Resources

If your code holds any precious unmanaged resources, it will normally be very important that these resources are freed at the earliest possible moment. This is because so many unmanaged resources either come in a limited supply (for example, database connections from a connection pool) or on a mutual exclusion basis (for example, open file handles). So, for example, a file handle that wasn't freed at the right time could easily be holding up another application that needs to open the same file.

We will start by presenting a sample that illustrates a typical scenario of an application that holds on to an unmanaged resource. The sample is called DisposeUnmanagedResource, and shows how you could write code to ensure that the resource is cleaned up properly in two situations: if the resource is being used locally within a method, or if it is held on a more long-term basis, as a field of a class.

To start off, we need some code that simulates the precious resource itself. This resource could be a database connection, or a file or GDI object, and so on. Of course it's not possible to generalize too much about how a resource is obtained - that will depend very much on the nature of the resource. However, rather than tying the sample down to a specific resource, I want it to be as generic as possible, so I'm going to define a class that represents the programming model that is typically used for those resources that are known to the Windows operating system - which covers quite a wide range. These resources are usually represented by handles, which are integers that normally index into certain internal data structures maintained by Windows, and which identify the particular object being used. Although a handle is just an integer, it is normally represented by the IntPtr structure in managed code - recall that this type is intended as a useful wrapper for integers whose size is dependent on the machine architecture.
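As a brief aside on IntPtr, the following sketch shows its architecture-dependent size and the comparison against IntPtr.Zero. The value 4 is an arbitrary stand-in for a handle, as in the sample that follows. Note that ToInt32() can throw an OverflowException in a 64-bit process if the value doesn't fit into 32 bits, so in code that may run as 64-bit you might prefer comparing against IntPtr.Zero to the ToInt32() tests used in this section's samples.

```csharp
using System;

static class Program
{
    static void Main()
    {
        // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
        Console.WriteLine("IntPtr size in bytes: " + IntPtr.Size);

        IntPtr handle = (IntPtr)4;   // arbitrary stand-in for a real handle

        // Comparing with IntPtr.Zero works regardless of pointer size.
        Console.WriteLine(handle == IntPtr.Zero);        // False
        Console.WriteLine(IntPtr.Zero.ToInt64() == 0);   // True
    }
}
```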

Here's the class which simulates API functions - you should think of each static member of this class as standing in place of some [DllImport] function:

    class APIFunctionSimulator
    {
       public static IntPtr GetResource()
       {
          // Substitute for an API function
          return (IntPtr)4;
       }

       public static void ReleaseResource(IntPtr handle)
       {
          // In a real app this would call something to release
          // the handle
       }

       public static string UseResource(IntPtr handle)
       {
          return "The handle is " + handle.ToString();
       }
    }

The GetResource() method returns a hard-coded number (4), which we are using to represent some handle. We assume that ReleaseResource() frees the precious unmanaged resource represented by this handle, while UseResource() accesses this resource to obtain some information. This model is pretty typical of the way things usually work with Windows API functions, and so you could for example easily envisage replacing GetResource() with a call to some API function such as OpenFile(), CreateCompatibleDC(), or CreateBitmap(), and ReleaseResource() with CloseHandle(), DeleteDC(), or DeleteObject() in real code.

Now for the class that maintains this simulated resource. This class is called ResourceUser, and forms the crux of the sample:

    class ResourceUser : IDisposable
    {
       private IntPtr handle;

       public ResourceUser()
       {
          handle = APIFunctionSimulator.GetResource();
          if (handle.ToInt32() == 0)
             throw new ApplicationException();
       }

       public void Dispose()
       {
          lock(this)
          {
             if (handle.ToInt32() != 0)
             {
                APIFunctionSimulator.ReleaseResource(handle);
                handle = (IntPtr)0;
                GC.SuppressFinalize(this);
             }
          }
       }

       public void UseResource()
       {
          if (handle.ToInt32() == 0)
             throw new ObjectDisposedException(
                  "Handle used in ResourceUser class after object disposed");
          string result = APIFunctionSimulator.UseResource(handle);
          Console.WriteLine("In ResourceUser.UseResource, result is :" +
                            result);
       }

       ~ResourceUser()
       {
          if (handle.ToInt32() != 0)
             APIFunctionSimulator.ReleaseResource(handle);
       }
    }

This class maintains the handle to the resource as a member field, which is initialized in the ResourceUser constructor. Notice that the constructor tests the value of the handle, so that if it is zero, an exception will be thrown and the object won't be created (API functions that return handles normally return 0 as the handle value if the attempt to connect to or create the resource failed). The code that we are interested in is of course the code to clean up the resource. The finalizer simply frees the resource (provided a non-zero handle value indicates we are holding onto a resource) and does nothing else. The Dispose() method is more interesting. We start by locking the ResourceUser instance:

    lock(this)
    {

The point of this is that it ensures that the Dispose() method is thread-safe, by preventing more than one thread from executing it at the same time. If you're not familiar with the C# lock statement, we'll explain it more in Chapter 7. The finalizer doesn't need to worry about thread safety since finalizers are executed on their dedicated thread.

Next we check if we are actually holding on to a precious resource, and release it if we are.

       if (handle.ToInt32() != 0)
       {
          APIFunctionSimulator.ReleaseResource(handle);
          handle = (IntPtr)0;
          GC.SuppressFinalize(this);
       }

Note the call to GC.SuppressFinalize() - that's important. Since the resource is now freed, there is no longer any need for the finalizer of this object to be invoked.

Finally, I've also defined a UseResource() member method of this class, which actually uses the precious resource. Notice that the UseResource() method tests that the handle is not zero, and throws an ObjectDisposedException if a zero handle is detected:

    public void UseResource()
    {
       if (handle.ToInt32() == 0)
          throw new ObjectDisposedException(
               "Handle used in ResourceUser class after object disposed");

In our sample class, we can be sure that a zero handle indicates that some client code has already called Dispose() - I've defined the ResourceUser class in such a way that there is no other situation that can cause the handle to be zero. However, in a more complex class, you might prefer to have a bool variable to test this condition separately, in order that you can throw a more appropriate exception if the handle somehow becomes zero through some other means:

    if (disposed)     // bool disposed set to true in Dispose() method
       throw new ObjectDisposedException(
            "Handle used in ResourceUser class after object disposed");
    else if (handle.ToInt32() == 0)
       throw new MyOwnException("Handle is zero for some reason");

ObjectDisposedException is an exception provided in the framework base class library for this explicit purpose. Notice that the way I've coded up the Dispose() method means that if Dispose() is accidentally called more than once, all subsequent calls simply do nothing and don't throw any exception - this is the behavior that Microsoft recommends.
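The two behaviors just described (repeated Dispose() calls doing nothing, and use after disposal throwing ObjectDisposedException) can be checked with a small self-contained sketch. DemoResourceUser is a simplified stand-in for ResourceUser written for this illustration; its ReleaseCount field is an assumption added purely so the demo can observe how often the resource is released.

```csharp
using System;

// Simplified stand-in for the ResourceUser class; illustrative only.
class DemoResourceUser : IDisposable
{
    private IntPtr handle = (IntPtr)4;   // pretend handle, as in the sample
    public int ReleaseCount = 0;         // counts simulated releases

    public void Dispose()
    {
        lock (this)
        {
            if (handle != IntPtr.Zero)
            {
                ReleaseCount++;          // stands in for ReleaseResource(handle)
                handle = IntPtr.Zero;
                GC.SuppressFinalize(this);
            }
        }
    }

    public void UseResource()
    {
        if (handle == IntPtr.Zero)
            throw new ObjectDisposedException("DemoResourceUser");
    }
}

static class Program
{
    static void Main()
    {
        DemoResourceUser ru = new DemoResourceUser();
        ru.UseResource();
        ru.Dispose();
        ru.Dispose();                         // second call does nothing
        Console.WriteLine(ru.ReleaseCount);   // 1

        try
        {
            ru.UseResource();                 // use after disposal
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("ObjectDisposedException thrown");
        }
    }
}
```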

Now for the "client" class that tests our ResourceUser class. Here's the Main() method:

    static void Main()
    {
       ResourceUser ru = new ResourceUser();
       ru.UseResource();
       ru.UseResource();
       // WRONG! Forgot to call Dispose()

       using (ru = new ResourceUser())
       {
          ru.UseResource();
          ru.UseResource();
       }

       ru = new ResourceUser();
       try
       {
          ru.UseResource();
          ru.UseResource();
       }
       finally
       {
          ru.Dispose();
       }

       UseUnmanagedResource();
    }

This method first instantiates a ResourceUser instance, and calls UseResource() a couple of times, but doesn't call Dispose(). This is bad behavior by the client, but doesn't do so much harm in our case because we've followed good practice in supplying a finalizer - which means the precious resource will be freed in the finalizer eventually (sometime soon after the next garbage collection occurs). Next, the Main() method instantiates another ResourceUser object - but does this in the context of a C# using statement. The using statement causes the compiler to automatically insert code to ensure that the object's IDisposable.Dispose() method is always invoked at the closing curly brace of the using block. This is the recommended way in C# of working with objects that implement IDisposable.

Finally, the Main() method instantiates another ResourceUser object, then uses the object inside a try block, with the Dispose() method being called from inside a finally block, which guarantees that it will be called even in the event of an exception while the unmanaged resource is in use. This code is exactly equivalent to the previous using block, and illustrates (a) what the using block is expanded to by the compiler, and (b) how client code should use objects that define a Dispose() method in other languages that don't have the equivalent of C#'s using block shortcut syntax. This includes VB and C++ - which means that the VB version of this sample on the Wrox Press web site omits that part of the sample.

Finally, we call a method called UseUnmanagedResource(). I've included this method just for completeness - it illustrates how to use a precious resource where the resource is scoped to a method rather than to a member field. Here's the definition of this method:

    static void UseUnmanagedResource()
    {
       IntPtr handle = APIFunctionSimulator.GetResource();
       try
       {
          string result = APIFunctionSimulator.UseResource(handle);
          Console.WriteLine("In EntryPoint.UseUnmanagedResource, result is :" +
                            result);
       }
       catch (Exception e)
       {
          Console.WriteLine("Exception in UseUnmanagedResource: " + e.Message);
       }
       finally
       {
          if (handle.ToInt32() != 0)
          {
             APIFunctionSimulator.ReleaseResource(handle);
          }
       }
    }

Classes that Contain Managed and Unmanaged Resources

We will now extend the previous sample to illustrate the recommended way of implementing Dispose()/Finalize() if a class holds on to both managed and unmanaged resources. I stress that, although for completeness this sample shows you how to implement such a class, we'll see in the following discussion that this is rarely a good design. The new sample is downloadable as the UseBothResources sample, and is obtained by modifying the ResourceUser class from the previous sample as follows. First we add a large array that is a member field of the class:

    class ResourceUser : IDisposable
    {
       private IntPtr handle;
       private int[] bigArray = new int[100000];

Although we won't actually implement any code that makes use of this array, its purpose should be clear: it serves as a large member field to illustrate how to clean up a large member object when we know that the ResourceUser instance is no longer required.

In principle bigArray is a managed object, and so it will be automatically removed when it is no longer referenced, and a garbage collection occurs. However, since the ResourceUser class has a Dispose() method anyway, we may as well use this method to remove the reference to the array with a statement like this:

        bigArray = null;

Doing this will ensure that even if some client maintains its (now dead) reference to the ResourceUser object, then bigArray can still be garbage-collected. The way that we implement this is by defining a new one-parameter Dispose() method, and by modifying the Dispose()/Finalize() methods as follows:

    public void Dispose()
    {
       lock(this)
       {
          Dispose(true);
          GC.SuppressFinalize(this);
       }
    }

    private void Dispose(bool disposing)
    {
       if (handle.ToInt32() != 0)
       {
          APIFunctionSimulator.ReleaseResource(handle);
          handle = (IntPtr)0;
       }
       if (disposing)
          bigArray = null;
    }

    ~ResourceUser()
    {
       Dispose(false);
    }

You should be able to see by following through the logic of this code that the finalizer still does nothing except remove the unmanaged resource - this is important because of the principle that finalizers must not contain any code that references other objects that might have been removed - because there is no way of guaranteeing when the finalizer will execute. It is not the responsibility of finalizers to deal with managed objects - the garbage collector does that automatically. Calling Dispose(false) instead of directly cleaning up the resource from the finalizer gives us an extra method call, but means we can keep all our resource cleanup code in one method, making for easier source code maintenance.

On the other hand, our new implementation of IDisposable.Dispose() cleans up both managed and unmanaged resources - though in the case of managed objects, cleaning up simply means setting references to these objects to null in order to allow the garbage collector potentially to do its work more promptly.

Guidelines for Implementing Dispose() and Finalizers

Let's now see what general principles of good practice we can derive from the above samples.

When to Implement Dispose()

When should you implement Dispose()? The answer is less clear-cut than it is for Finalize(). In the case of Finalize(), it's simple: if your class maintains unmanaged resources, implement Finalize(). If it doesn't, then don't. Period. For Dispose(), however, there is sometimes a balance to strike between the benefit that Dispose() helps the garbage collector out, and the cost that Dispose() makes using your class more complicated, since clients have to remember to call it. And if you have a complicated arrangement of classes that refer to each other, figuring out when to call Dispose() can be a non-trivial issue.

The situation is easiest to understand if you have wrapper objects that contain references to other objects, like this:

[Diagram: object A holds references to objects B and C]

In this diagram we assume that the only references held to objects B and C are held by A - in other words, A serves as the only point of access to B and C. The corollary is that the lifetime of A controls the lifetimes of B and C. With this arrangement, the conditions under which you will probably want A to implement Dispose() are:

1. If either B or C implements Dispose(), then clearly A must implement Dispose() to call B.Dispose() and C.Dispose() - otherwise there's no way for B and C to be disposed of correctly.
2. If A is directly holding on to unmanaged resources, it should implement Dispose() to free those resources. Note that if B or C hold unmanaged resources, A does not need to worry about that directly. But in that case B or C ought to implement Dispose() to clean up those resources - in which case rule 1 above forces A to implement Dispose() anyway.
3. If conditions 1 and 2 don't apply now, but you believe they may apply for some future version of your A class, it's probably a good idea to implement Dispose() now, even if it doesn't do anything. That way people who write client code will know to invoke it, and you won't end up in a situation two years down the line where lots of legacy client code instantiates what's now the new version of A, and doesn't call Dispose() when it should do.
4. If none of the above cases apply, then you might want A to implement Dispose() just to set the references to B and C to null, especially if B and C are very large objects (such as arrays). The benefit of doing this is that if for some reason client code holds on to a reference to A long after calling Dispose(), then at least this reference isn't propagated to B and C, so that the garbage collector would still be able to remove B and C the next time a collection occurs. On the other hand, this benefit is more marginal, and you might feel it's more than offset by the added complexity involved with using A, and therefore choose not to implement Dispose().
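As a rough sketch of the first condition, here is a wrapper A that is the sole owner of two disposable children and therefore disposes them itself. The Child class and its static DisposedCount counter are hypothetical, introduced only so the demo can observe that both children were disposed.

```csharp
using System;

// Hypothetical type standing in for both B and C.
class Child : IDisposable
{
    public static int DisposedCount = 0;   // demo-only bookkeeping
    private bool disposed = false;

    public void Dispose()
    {
        if (!disposed)
        {
            disposed = true;
            DisposedCount++;
        }
    }
}

// A holds the only references to its children, so A's lifetime
// controls theirs and A.Dispose() is responsible for disposing them.
class A : IDisposable
{
    private Child b = new Child();
    private Child c = new Child();

    public void Dispose()
    {
        if (b != null) { b.Dispose(); b = null; }
        if (c != null) { c.Dispose(); c = null; }
    }
}

static class Program
{
    static void Main()
    {
        using (A a = new A())
        {
            // work with a here
        }
        Console.WriteLine(Child.DisposedCount);   // 2
    }
}
```

Setting b and c to null after disposing them also covers the fourth condition: a client that keeps a stale reference to A no longer keeps the children alive.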

That covers the situation in which there is a neat hierarchy of wrapper classes. We still need to discuss the situation where there is some more complex arrangement of interrelated classes. Perhaps, with the above diagram, there are other objects in your code that may hold on to references to B and C, so A can't in any way claim ownership of them. Obviously, you should still implement Dispose() on A if A has any unmanaged resources, but A.Dispose() should certainly not call B.Dispose() or C.Dispose(). This illustrates an important principle, that you should only normally implement Dispose() to assist in getting rid of managed objects if you can identify a clear lifetime-containment relationship for the objects. Otherwise, the programming gets complicated and the code is unlikely to be very robust.

Finalizable Class Architecture

The code samples we've just seen covered two means of implementing finalizers, depending on whether the object that directly referenced an unmanaged resource also referenced other large managed objects. I should stress that, although I presented both cases, in most situations the former example shows the best means of designing your classes: it's generally speaking not a good idea to have a finalizable object that contains references to other managed objects, especially large objects. The reason for this is that those other objects will not be garbage-collected as long as this object is on the freachable queue, which means that those other objects will end up unnecessarily missing a garbage collection - or perhaps many, even hundreds, of collections if the object had been previously promoted to a higher generation.

If you find yourself writing an object that needs a finalizer and references other objects, consider separating it into two objects: a wrapper object that references all the other managed objects, and a small internal object whose sole purpose is to provide an interface to the unmanaged resource, and which doesn't contain references to any other managed objects.

Another related point is that it's generally a good idea to arrange your finalizable classes so that each precious unmanaged resource is clearly associated with just one instance of a finalizable class, giving a one-to-one relationship between the lifetime of this class and the lifetime of the resource. Doing this makes finalizers easier to code and the architecture more robust, since you know that at the time of execution of the finalizer, it's definitely OK to delete the resource.

This means that with the earlier DisposeBothResourcesSample, a more sensible architecture would have involved class definitions similar to the following:

    // This class does NOT implement a finalizer
    class ManagedResourceUser : IDisposable
    {
       private int[] bigArray = new int[100000];
       private UnmanagedResourceUser handleWrapper;
       // etc.
    }

    // This class does implement a finalizer
    class UnmanagedResourceUser : IDisposable
    {
       private IntPtr handle;
       // etc.
    }
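One way the two classes outlined above might be filled in is sketched here. This is an assumption-laden illustration rather than the downloadable sample: the handle value 4 stands in for a real API call, and the IsDisposed and ArrayLength properties are hypothetical additions that exist only so the demo has something to observe.

```csharp
using System;

// Small finalizable object: owns the handle and nothing else.
class UnmanagedResourceUser : IDisposable
{
    private IntPtr handle = (IntPtr)4;   // stand-in for a real API call

    public void Dispose()
    {
        ReleaseHandle();
        GC.SuppressFinalize(this);       // finalizer no longer needed
    }

    ~UnmanagedResourceUser()
    {
        ReleaseHandle();                 // touches no other managed objects
    }

    private void ReleaseHandle()
    {
        if (handle != IntPtr.Zero)
        {
            // A real implementation would call the release API here.
            handle = IntPtr.Zero;
        }
    }
}

// Wrapper: holds the large managed data and has no finalizer, so it
// never sits on the freachable queue keeping bigArray alive.
class ManagedResourceUser : IDisposable
{
    private int[] bigArray = new int[100000];
    private UnmanagedResourceUser handleWrapper = new UnmanagedResourceUser();

    public bool IsDisposed { get { return handleWrapper == null; } }
    public int ArrayLength { get { return bigArray == null ? 0 : bigArray.Length; } }

    public void Dispose()
    {
        if (handleWrapper != null)
        {
            handleWrapper.Dispose();
            handleWrapper = null;
        }
        bigArray = null;
    }
}

static class Program
{
    static void Main()
    {
        ManagedResourceUser user = new ManagedResourceUser();
        Console.WriteLine(user.ArrayLength);   // 100000
        user.Dispose();
        user.Dispose();                        // safe to call twice
        Console.WriteLine(user.IsDisposed);    // True
        Console.WriteLine(user.ArrayLength);   // 0
    }
}
```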

In a more general case, if you have two or more managed objects (say E and F) that need to use an unmanaged resource, the actual maintenance of the resource should be the responsibility of a third object (say Y):

[Diagram: objects E and F both reference object Y, which maintains the unmanaged resource]

A corollary of this is that the class that implements the finalizer should keep its reference to the unmanaged resource private, and not expose it to other objects - otherwise you are going to have potential problems because when the finalizer is executed some other object might still be holding on to the resource.

Incidentally, it can be quite an interesting exercise to do a title-only search for Finalize in the MSDN documentation to see which classes have finalizers defined. You'll see that they are always the classes that, based on what they do, will need internally to wrap an unmanaged resource such as a Windows handle directly. There is quite simply no other conceivable reason for implementing a finalizer. For example, System.Drawing.Icon implements Finalize() (it will presumably contain an HICON), as do Microsoft.Win32.RegistryKey (which will contain an HKEY), and System.IO.IsolatedStorage.IsolatedStorageFile (HFILE).

There are a few Microsoft-defined classes that allow other code to get access to handles - for example, look at the Graphics.GetHdc() method, which returns a device context handle, HDC. This looks like it breaks our principle of having only one wrapper class. But you'll find that where this happens, severe restrictions are imposed on the use of such methods (for example, after calling GetHdc() on a Graphics object, you basically can't use that Graphics object again until you've called ReleaseHdc() to release the device context).

Finalizers and Value Types

We finish with a word about the particular problems you will face if you want to implement a finalizer for a value type. In general, my advice would be: don't. If a type needs to implement a finalizer, then define it as a reference type instead. The reason is that finalizers on value types will not be executed unless the type is boxed, since only references can be placed on the finalization queue. For this reason, many high-level languages, such as C#, will not permit you to define a finalizer on a value type, although doing this is possible if you code directly in IL.