Garbage Collection and Finalization

Memory management is a critical aspect of programming and can be the source of many errors. Whenever a resource is created, memory must be provided for it. And when the resource is no longer needed, the memory should be reclaimed. If the memory is not reclaimed, the amount of memory available is reduced. If such "memory leaks" recur often enough (which can happen in long-running server programs), the program can crash. Another potential bug is to reclaim memory while it is still required by another part of the program.

.NET greatly simplifies the programming of memory management through an automatic garbage collection facility. The CLR tracks the use of memory that is allocated on the managed heap, and any memory that is no longer referenced is marked as "garbage." When memory is low, the CLR traverses its data structure of tracked memory and reclaims all the memory marked as garbage. Thus the programmer is relieved of this responsibility.

Although a good foundation for resource management, garbage collection by itself does not address all issues. Memory allocated from the managed heap is not the only kind of resource needed in programs. Other resources, such as file handles and database connections, are not automatically deallocated, and the programmer may need to write explicit code to perform cleanup. The .NET Framework provides a Finalize method in the Object base class for this purpose. The CLR calls Finalize before the memory allocated for an object is reclaimed.

Another concern with garbage collection is performance. Is there a big penalty from the automated garbage collection? The CLR provides a very efficient multigenerational garbage collection algorithm. In this section we examine garbage collection and finalization in the .NET Framework, and we provide several code examples.

Finalize

System.Object has a protected method Finalize, which is automatically called by the CLR after an object becomes inaccessible. (As we shall see, finalization for an object may be suppressed by a call to the method SuppressFinalize of the System.GC class.) Since Finalize is protected, it can only be called from within the class or a derived class. The default implementation of Finalize does nothing. For any cleanup to be performed, a class must override Finalize. Also, a class's Finalize implementation should call the Finalize of its base class.

C# Destructor Notation

The C# language provides a special tilde notation ~SomeClass to represent the overridden Finalize method, and this special method is called a destructor. The C# destructor automatically calls the base class Finalize. Thus the following C# code

    ~SomeClass()
    {
        // perform cleanup
    }

generates code that could be expressed

    protected override void Finalize()
    {
        // perform cleanup
        base.Finalize();
    }

The second code fragment is actually not legal C# syntax, and you must use the destructor notation.

Although C# uses the same notation and terminology for destructor as C++, the two are very different. The C++ destructor is called deterministically when a C++ object goes out of scope or is deleted. The C# destructor is called during the process of garbage collection, a process which is not deterministic, as discussed below.

Limitations of Finalization

Finalization is nondeterministic. Finalize for a particular object may run at any time during the garbage collection process, and the order of running finalizers for different objects cannot be predicted. Moreover, under exceptional circumstances a finalizer may not run at all (for example, if one finalizer goes into an infinite loop, or if a process aborts without giving the runtime a chance to clean up).

Also, the thread on which a finalizer runs is not specified.

Another issue with finalization is its effect on performance. There is significantly more overhead associated with managing memory for objects with finalizers, both on the allocation side and on the deallocation side. [15]

[15] Finalization internals and other details of garbage collection are discussed in depth in the two-part article "Garbage Collection" by Jeffrey Richter, MSDN Magazine, November and December 2000.

Thus you should not implement a finalizer for a class unless you have a very good reason for doing so. And if you do provide a finalizer, you should probably provide an alternate, deterministic mechanism for a class to perform necessary cleanup. The .NET Framework provides a Dispose design pattern for deterministic cleanup.

Unmanaged Resources and Dispose

The classic case for a finalizer is a class that contains some unmanaged resource, such as a file handle or a database connection. If such resources are not released when they are no longer needed, the scalability of your application can be affected. As a simple illustration, consider a class that wraps a file object. We want to make sure that a file that is opened will eventually be closed. The object itself will be destroyed by garbage collection, but the unmanaged file will remain open unless explicitly closed. Hence we provide a finalizer to close the wrapped file.

But as we discussed, finalization is nondeterministic, so a file belonging to an object that is no longer in use might remain open for a long time. We would like to have a deterministic mechanism for a client program to clean up the wrapper object when it is done with it. The .NET Framework provides the general-purpose IDisposable interface for this purpose.

    public interface IDisposable
    {
        void Dispose();
    }

The design pattern specifies that a client program should call Dispose on the object when it is done with it. In the Dispose method implementation, the class does the appropriate cleanup. As backup assurance, the class should also implement a finalizer, in case Dispose never gets called, perhaps due to an exception being thrown. [16] Since both Dispose and Finalize perform the cleanup, cleanup code can be placed in Dispose, and Finalize can be implemented by calling Dispose. One detail is that once Dispose has been called, the object should not be finalized, because that would involve cleanup being performed twice. The object can be removed from the finalization queue by calling GC.SuppressFinalize. Also, it is a good idea for the class to maintain a boolean flag such as disposeCalled, so that if Dispose is called twice, cleanup will not be performed a second time.

[16] One of the virtues of the exception handling mechanism is that as the call stack is unwound in handling the exception, local objects go out of scope and so can get marked for finalization. We provide a small demo later in this section.
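The client side of the pattern can be sketched with the C# using statement, which calls Dispose when control leaves its block, whether normally or through an exception. This is only an illustrative sketch: the class DemoResource and its members Use and IsDisposed are hypothetical names invented here, and the "release" step is a placeholder comment rather than real resource handling.

```csharp
using System;

// Hypothetical resource wrapper, invented for this sketch only.
public class DemoResource : IDisposable
{
    private bool disposeCalled = false;

    public bool IsDisposed { get { return disposeCalled; } }

    public void Use()
    {
        if (disposeCalled)
            throw new ObjectDisposedException("DemoResource");
        Console.WriteLine("using resource");
    }

    public void Dispose()
    {
        if (disposeCalled)
            return;                    // a second call does nothing
        disposeCalled = true;
        // (release the wrapped resource here)
        GC.SuppressFinalize(this);     // cleanup is done; skip finalization
        Console.WriteLine("resource disposed");
    }

    ~DemoResource()
    {
        Dispose();                     // backup if the client never calls Dispose
    }
}

public class UsingSketch
{
    public static void Main()
    {
        try
        {
            using (DemoResource res = new DemoResource())
            {
                res.Use();
                throw new Exception("error!!");
            }   // Dispose is called here, even though an exception is in flight
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

Run as written, the sketch prints "using resource", then "resource disposed", then "error!!". The using statement is shorthand for a try/finally block whose finally clause calls Dispose, so the client does not depend on the finalizer in the exception case.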

The example program DisposeDemo provides an illustration of finalization and the dispose pattern. The class SimpleLog implements logging to a file, making use of the StreamWriter class (discussed earlier in this chapter).

    // SimpleLog.cs
    using System;
    using System.IO;

    public class SimpleLog : IDisposable
    {
        private StreamWriter writer;
        private string name;
        private bool disposeCalled = false;

        public SimpleLog(string fileName)
        {
            name = fileName;
            writer = new StreamWriter(fileName, false);
            writer.AutoFlush = true;
            Console.WriteLine("logfile " + name + " created");
        }

        public void WriteLine(string str)
        {
            writer.WriteLine(str);
            Console.WriteLine(str);
        }

        public void Dispose()
        {
            if (disposeCalled)
                return;
            writer.Close();
            GC.SuppressFinalize(this);
            Console.WriteLine("logfile " + name + " disposed");
            disposeCalled = true;
        }

        ~SimpleLog()
        {
            Console.WriteLine("logfile " + name + " finalized");
            Dispose();
        }
    }

The class SimpleLog supports the IDisposable interface, and thus implements Dispose. The cleanup code simply closes the StreamWriter object. To make sure that a disposed object will not also be finalized, GC.SuppressFinalize is called. The finalizer simply delegates to Dispose. To help monitor object lifetime, a message is written to the console in the constructor, in Dispose, and in the finalizer. [17]

[17] The Console.WriteLine in the finalizer is provided purely for didactic purposes and should not be done in production code, for reasons we shall discuss shortly.

Here is the code for the test program:

    // DisposeDemo.cs
    using System;
    using System.Threading;

    public class DisposeDemo
    {
        public static void Main()
        {
            SimpleLog log = new SimpleLog(@"log1.txt");
            log.WriteLine("First line");
            Pause();
            log.Dispose();
            log.Dispose();
            log = new SimpleLog(@"log2.txt");
            log.WriteLine("Second line");
            Pause();
            log = new SimpleLog(@"log3.txt");
            log.WriteLine("Third line");
            Pause();
            log = null;
            GC.Collect();
            Thread.Sleep(100);
        }

        private static void Pause()
        {
            Console.Write("Press enter to continue");
            string str = Console.ReadLine();
        }
    }

The SimpleLog object reference log is assigned in turn to three different object instances. The first time, it is properly disposed. The second time, log is reassigned to refer to a third object, before the second object is disposed, resulting in the second object becoming "garbage." The Pause method provides an easy way to pause the execution of this console application, allowing us to investigate the condition of the files log1.txt, log2.txt, and log3.txt at various points in the execution of the program.

Running the program results in the following output:

    logfile log1.txt created
    First line
    Press enter to continue
    logfile log1.txt disposed
    logfile log2.txt created
    Second line
    Press enter to continue
    logfile log3.txt created
    Third line
    Press enter to continue
    logfile log3.txt finalized
    logfile log3.txt disposed
    logfile log2.txt finalized
    logfile log2.txt disposed

After the first pause, the file log1.txt has been created, and you can examine its contents in Notepad. If you try to delete the file, you will get a sharing violation, as illustrated in Figure 8-2.

Figure 8-2. Trying to delete an open file results in a sharing violation.

At the second pause point, log1.txt has been disposed, and you will be allowed to delete it. log2.txt has been created (and is open). At the third pause point, log3.txt has been created. But the object reference to log2.txt has been reassigned, and so there is now no way for the client program to dispose of the second object. [18] If Dispose were the only mechanism to clean up the second object, we would be out of luck. Fortunately, the SimpleLog class has implemented a finalizer, so the next time garbage is collected, the second object will be disposed of properly. We can see the effect of finalization by running the program through to completion. The second object is indeed finalized, and thence disposed. In fact, as the app domain shuts down, Finalize is called on all objects not exempt from finalization, even on objects that are still accessible.

[18] This example illustrates that it is the client's responsibility to help the scalability of the server by cleaning up objects (using Dispose) before reassigning them. Once an object has been reassigned, there is no way to call Dispose, and the object will hang around for an indeterminate period of time until garbage is collected. Effective memory management involves both the server and client.

In our code we explicitly make the third object inaccessible by the assignment log = null, and we then force a garbage collection by a call to GC.Collect. Finally we sleep briefly, to give the garbage collector a chance to run through to completion, before the application domain shuts down. Coding our test program in this way is a workaround for the fact that the order of finalization is nondeterministic. The garbage collector will be called automatically when the program exits and the application domain is shut down. However, at that point, system objects, such as Console, are also being closed. Since you cannot rely on the order of finalizations, you may get an exception from the WriteLine statement within the finalizer. The explicit call to GC.Collect forces a garbage collection while the system objects are still open. If we omitted the last three lines of the Main method, we might well get identical output, but we might also get an exception.

We provide similar code at the end of the Main methods of our other test programs, so that our print statements in finalizers work properly without randomly throwing exceptions.
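The test programs here sleep briefly after GC.Collect to give finalizers time to run. The GC class also provides WaitForPendingFinalizers, which blocks the calling thread until the finalizer queue has been processed. The following is a minimal sketch of that alternative, not the book's own code; the class Tracked and its counter are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical class whose finalizer just counts how often it has run.
public class Tracked
{
    private static int finalizedCount = 0;

    public static int FinalizedCount { get { return finalizedCount; } }

    ~Tracked()
    {
        // Finalizers run on a separate thread, so update the counter atomically.
        Interlocked.Increment(ref finalizedCount);
    }
}

public class WaitForFinalizersSketch
{
    // Allocate in a separate, non-inlined method so that no local variable
    // in Main can keep the object reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate()
    {
        new Tracked();
    }

    public static void Main()
    {
        Allocate();
        GC.Collect();                    // find the unreachable object
        GC.WaitForPendingFinalizers();   // block until queued finalizers finish
        Console.WriteLine("finalizers run so far: " + Tracked.FinalizedCount);
    }
}
```

Compared with a fixed sleep, this does not guess how long finalization will take. Whether the object is actually collected on the first call to GC.Collect still depends on the runtime, so the printed count should be treated as an observation rather than a guarantee.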

Alternate Name for Dispose

The standard name for the method that performs cleanup is Dispose. The convention is that once an object is disposed, it is finished. In some cases, the same object instance may be reused, as in the case of a file. A file may be opened, closed, and then opened again. In such a case, an additional cleanup method should be called Close. In other cases some other natural name may be used.

Our SimpleLog class could plausibly have provided an Open method, and then it would have made sense to name our cleanup method Close. For simplicity, we did not provide an Open method, and so we stuck to the name Dispose.

Garbage Collection and Generations

Using the dispose pattern we can mitigate the issue of nondeterministic finalization, but what about the performance of the garbage collector? It turns out that the overall memory management efficiency of .NET is quite good, thanks to two main points:

  • Allocation is very fast. Space on the managed heap is always contiguous, so allocating a new object is equivalent to incrementing a pointer. (By contrast, an allocation on an unmanaged heap is relatively slow, because a list of data structures must be walked to find a block that is large enough.)

  • The CLR uses generations during garbage collecting, reducing the number of objects that are typically checked for being garbage.

Generations

As an optimization, every object on the managed heap is assigned to a generation. A new object is in generation 0 and is considered a prime candidate for garbage collection. Older objects are in generation 1. Since such an older object has survived for a while, the odds favor its having a longer lifetime than a generation 0 object. Still older objects are assigned to generation 2 and are considered even more likely to survive a garbage collection. The maximum generation number in the current implementation of .NET is 2, as can be confirmed from the GC.MaxGeneration property.

In a normal sweep of the garbage collector, only generation 0 is examined, because this is where memory is most likely to be reclaimed. All surviving generation 0 objects are promoted to generation 1. If not enough memory is reclaimed, a sweep is next performed on generation 1 objects, and the survivors are promoted. Then, if necessary, a sweep of generation 2 is performed, and so on up to MaxGeneration.
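Promotion between generations can be observed with GC.GetGeneration and GC.MaxGeneration. The sketch below is an illustration only, under the assumption that each forced collection promotes a surviving object by one generation; the exact numbers printed depend on the runtime, so no particular output is claimed. The class name GenerationSketch is invented for the example.

```csharp
using System;

public class GenerationSketch
{
    public static void Main()
    {
        Console.WriteLine("MaxGeneration = " + GC.MaxGeneration);

        object survivor = new object();
        Console.WriteLine("after allocation:      generation "
            + GC.GetGeneration(survivor));

        GC.Collect();   // survivor is still referenced, so it is not reclaimed
        Console.WriteLine("after one collection:  generation "
            + GC.GetGeneration(survivor));

        GC.Collect();
        Console.WriteLine("after two collections: generation "
            + GC.GetGeneration(survivor));

        // Keep the reference live up to this point, so the runtime cannot
        // treat the object as garbage before the last query.
        GC.KeepAlive(survivor);
    }
}
```

The generation reported for a live object never exceeds GC.MaxGeneration, which is the point at which promotion stops.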

Finalization and Stack Unwinding

As mentioned earlier, one of the virtues of the exception handling mechanism is that as the call stack is unwound in handling the exception, local objects go out of scope and so can get marked for finalization. The program FinalizeStackUnwind provides a simple illustration. It uses the SimpleLog class discussed previously, which implements finalization.

    // FinalizeStackUnwind.cs
    using System;

    public class FinalizeStackUnwind
    {
        public static void Main()
        {
            try
            {
                SomeMethod();
            }
            catch(Exception e)
            {
                Console.WriteLine(e.Message);
            }
            GC.Collect();
        }

        private static void SomeMethod()
        {
            // local variable
            SimpleLog alpha = new SimpleLog("alpha.txt");
            // force an exception
            throw new Exception("error!!");
        }
    }

A local variable alpha of type SimpleLog is allocated in SomeMethod. Before the method can exit normally, an exception is thrown. As the stack is unwound during exception handling, alpha goes out of scope, so the object it referred to is no longer accessible and becomes eligible for garbage collection. The call to GC.Collect forces a garbage collection, and we see from the output of the program that the finalizer is indeed called.

    logfile alpha.txt created
    error!!
    logfile alpha.txt finalized
    logfile alpha.txt disposed

Controlling Garbage Collection with the GC Class

Normally it is the best practice simply to let the garbage collector perform its work behind the scenes. Sometimes, however, it may be advantageous for the program to intervene. The System namespace contains the class GC, which enables a program to affect the behavior of the garbage collector. We summarize a few of the important methods of the class.

SuppressFinalize

This method requests the system to not call Finalize for the specified object. As we saw previously, you should call this method in your implementation of Dispose, to prevent a disposed object from also being finalized. [19]

[19] You should be careful in the case of an object that might be "closed" (like a file) and later reopened again. In such a case it might be better not to suppress finalization. Once finalization is suppressed, the object can be made eligible for finalization again by calling GC.ReRegisterForFinalize. For a discussion of advanced issues in garbage collection and finalization, refer to the Jeffrey Richter article previously cited.
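One way the close-and-reopen case might be handled is sketched below. This is an assumption-laden illustration, not a recommended implementation: ReopenableLog and all of its members are hypothetical, and the actual acquire and release steps are left as placeholder comments. The sketch tracks whether finalization is currently suppressed, because calling GC.ReRegisterForFinalize on an object that is still registered would cause its finalizer to run more than once.

```csharp
using System;

// Hypothetical class that can be opened and closed repeatedly.
public class ReopenableLog
{
    private bool isOpen = false;
    private bool finalizationSuppressed = false;

    public bool IsOpen { get { return isOpen; } }

    public void Open()
    {
        if (isOpen)
            return;
        // (acquire the underlying resource here)
        isOpen = true;
        if (finalizationSuppressed)
        {
            // The object holds a resource again, so it needs its finalizer back.
            GC.ReRegisterForFinalize(this);
            finalizationSuppressed = false;
        }
    }

    public void Close()
    {
        if (!isOpen)
            return;
        // (release the underlying resource here)
        isOpen = false;
        GC.SuppressFinalize(this);   // nothing left for the finalizer to do
        finalizationSuppressed = true;
    }

    ~ReopenableLog()
    {
        if (isOpen)
        {
            // (backup release of the underlying resource)
            isOpen = false;
        }
    }
}

public class ReopenSketch
{
    public static void Main()
    {
        ReopenableLog log = new ReopenableLog();
        log.Open();
        log.Close();    // finalization suppressed
        log.Open();     // finalization re-registered
        Console.WriteLine("open after reopen: " + log.IsOpen);
        log.Close();
        Console.WriteLine("open after close:  " + log.IsOpen);
    }
}
```

The simpler choice mentioned in the footnote, never suppressing finalization for such a class, avoids this bookkeeping at the cost of always paying the finalization overhead.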

Collect

You can force a garbage collection by calling the Collect method. An optional parameter lets you specify which generations should be collected. Use this method sparingly, since normally the CLR has better information on the current state of memory. A possible use would be a case when your program has just released a number of large objects, and you would like to see all this memory reclaimed right away. Another example was provided in the previous section, where a call to Collect forced a collection while system objects were still valid.

MaxGeneration

This property returns the maximum generation number that the system supports (2 in the current implementation, which means there are three generations: 0, 1, and 2).

GetGeneration

This method returns the current generation number of an object.

GetTotalMemory

This method returns the number of bytes currently allocated. A parameter lets you specify whether the system should perform a garbage collection before returning. If no garbage collection is done, the number reported is probably larger than the number of bytes actually being used by live objects.
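The following sketch shows how GetTotalMemory and the generation parameter of Collect might be used together. It is illustrative only: the class name TotalMemorySketch, the block count, and the block size are arbitrary choices made for the example, and the figures printed will vary from run to run, so none are quoted here.

```csharp
using System;

public class TotalMemorySketch
{
    public static void Main()
    {
        // false: report the current figure without forcing a collection
        long before = GC.GetTotalMemory(false);

        // Allocate roughly one megabyte in small blocks (arbitrary sizes).
        byte[][] blocks = new byte[100][];
        for (int i = 0; i < blocks.Length; i++)
            blocks[i] = new byte[10000];
        long allocated = GC.GetTotalMemory(false);

        // Collect only generation 0; the blocks survive because the
        // array still references them.
        GC.Collect(0);
        GC.KeepAlive(blocks);

        // Drop the only reference, then ask for a collection before measuring.
        blocks = null;
        long after = GC.GetTotalMemory(true);

        Console.WriteLine("before allocation: " + before + " bytes");
        Console.WriteLine("after allocation:  " + allocated + " bytes");
        Console.WriteLine("after collection:  " + after + " bytes");
    }
}
```

Passing true makes the call slower, since it waits for a collection, but the result is closer to the memory held by live objects.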

Sample Program

The program GarbageCollection illustrates using these methods of the GC class. The example is artificial, simply illustrating object lifetime, and the effect of the various GC methods. The class of objects that are allocated is called Member. This class has a string property called Name. Write statements are provided in the constructor, Dispose, and in the destructor. A Committee class maintains an array list of Member instances. The RemoveMember method simply removes the member from the array list. The DisposeMember method also calls Dispose on the member being expunged from the committee. The ShowGenerations method displays the generation number of each Member object. GarbageCollection.cs is a test program to exercise these classes, showing the results of various allocations and deallocations and the use of methods of the GC class. The code and output should be quite easy to understand.

All the memory is allocated locally in a method DemonstrateGenerations. After this method returns and its local memory has become inaccessible, we make an explicit call to GC.Collect. This forces the finalizers to be called before the app domain shuts down, and so we avoid a possible random exception of a stream being closed when a WriteLine method is called in a finalizer. This is the same point mentioned previously for the earlier examples.
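Since the listing for this sample is not reproduced here, the following is a rough sketch of what classes matching the description could look like. It is a reconstruction under assumptions, not the book's actual code: the AddMember method, the Count property, the name-based lookup, the member names, and the class name GarbageCollectionSketch are all invented for illustration.

```csharp
using System;
using System.Collections;

public class Member : IDisposable
{
    private string name;
    private bool disposeCalled = false;

    public Member(string name)
    {
        this.name = name;
        Console.WriteLine("Member " + name + " created");
    }

    public string Name { get { return name; } }

    public void Dispose()
    {
        if (disposeCalled)
            return;
        disposeCalled = true;
        GC.SuppressFinalize(this);
        Console.WriteLine("Member " + name + " disposed");
    }

    ~Member()
    {
        // Console output in a finalizer is for demonstration only.
        Console.WriteLine("Member " + name + " finalized");
    }
}

public class Committee
{
    private ArrayList members = new ArrayList();

    public int Count { get { return members.Count; } }

    public void AddMember(string name) { members.Add(new Member(name)); }

    private Member Find(string name)
    {
        foreach (Member m in members)
            if (m.Name == name) return m;
        return null;
    }

    // Drops the reference only; the object is finalized at some later time.
    public void RemoveMember(string name)
    {
        Member m = Find(name);
        if (m != null) members.Remove(m);
    }

    // Drops the reference and cleans up deterministically.
    public void DisposeMember(string name)
    {
        Member m = Find(name);
        if (m == null) return;
        members.Remove(m);
        m.Dispose();
    }

    public void ShowGenerations()
    {
        foreach (Member m in members)
            Console.WriteLine(m.Name + " is in generation " + GC.GetGeneration(m));
    }
}

public class GarbageCollectionSketch
{
    private static void DemonstrateGenerations()
    {
        Committee comm = new Committee();
        comm.AddMember("Alpha");
        comm.AddMember("Beta");
        comm.AddMember("Gamma");
        comm.ShowGenerations();
        GC.Collect(0);
        comm.ShowGenerations();
        comm.RemoveMember("Beta");      // left for the finalizer
        comm.DisposeMember("Gamma");    // cleaned up immediately
        GC.Collect();
        comm.ShowGenerations();
    }

    public static void Main()
    {
        DemonstrateGenerations();
        GC.Collect();                   // finalize while Console is still usable
        GC.WaitForPendingFinalizers();
    }
}
```

The difference between RemoveMember and DisposeMember is the one the section has been building toward: a removed member is cleaned up whenever its finalizer happens to run, while a disposed member is cleaned up at a known point and never reaches the finalizer.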


Application Development Using C# and .NET
ISBN: 013093383X
Year: 2001
Pages: 158