Garbage Collection and Finalization

Application Development Using Visual Basic and .NET
By Robert J. Oberg, Peter Thorsteinson, Dana L. Wyatt
Chapter 10.  .NET Framework Classes



Memory management is a critical aspect of programming and can be the source of many errors. Whenever a resource is created, memory must be provided for it. And when the resource is no longer needed, the memory should be reclaimed. If the memory is not reclaimed, the amount of memory available for other resources is reduced. If such "memory leaks" recur often enough (which can happen in long-running server programs), the program can crash. Another potential bug is to reclaim memory while it is still required by another part of the program.

.NET greatly simplifies the programming of memory management through an automatic garbage collection facility. The CLR tracks the use of memory that is allocated on the managed heap, and any memory that is no longer referenced is marked as "garbage." When memory is low, the CLR traverses its data structure of tracked memory and reclaims all the memory marked as garbage. Thus the programmer is relieved of this responsibility.

Although a good foundation for resource management, garbage collection by itself does not address all issues. Memory allocated from the managed heap is not the only kind of resource needed in programs. Other resources, such as file handles and database connections, are not automatically deallocated, and the programmer may need to write explicit code to perform cleanup. The .NET Framework provides a Finalize method in the Object base class for this purpose. The CLR calls Finalize on an object before the memory allocated for that object is reclaimed.

Another concern with garbage collection is performance. Is there a big penalty from the automated garbage collection? The CLR provides a very efficient multigenerational garbage collection algorithm. In this section we examine garbage collection and finalization in the .NET Framework, and we provide several code examples.

Finalize

System.Object has a protected method Finalize, which is automatically called by the CLR after an object becomes inaccessible. (As we shall see, finalization for an object may be suppressed by a call to the method SuppressFinalize of the System.GC class.) Since Finalize is protected, it can only be called through the class or a derived class. The default implementation of Finalize does nothing. For any cleanup to be performed, a class must override Finalize. Also, a class's Finalize implementation should call the Finalize of its base class.

Limitations of Finalization

Finalization is non-deterministic. Finalize for a particular object may run at any time during the garbage collection process, and the order of running finalizers for different objects cannot be predicted. Moreover, under exceptional circumstances, a finalizer may not run at all (for example, one finalizer goes into an infinite loop or a process aborts without giving the runtime a chance to clean up). Also, the thread on which a finalizer runs is not specified.

Another issue with finalization is its effect on performance. There is significantly more overhead associated with managing memory for objects with finalizers, both on the allocation side and on the deallocation side. [11]

[11] Finalization internals and other details of garbage collection are discussed in depth in the two-part article "Garbage Collection" by Jeffrey Richter, MSDN Magazine, November and December, 2000.

Thus, you should not implement a finalizer for a class unless you have very good reason for doing so. And if you do provide a finalizer, you should probably provide an alternate, deterministic mechanism for a class to perform necessary cleanup. The .NET Framework provides a Dispose design pattern for deterministic cleanup.

Unmanaged Resources and Dispose

The classic case for a finalizer is a class that contains some unmanaged resource, such as a file handle or a database connection. If such resources are not released when no longer needed, the scalability of your application can be affected. As a simple illustration, consider a class that wraps a file object. We want to make sure that a file that is opened will eventually be closed. The object itself will be destroyed by garbage collection, but the unmanaged file will remain open unless explicitly closed. Hence, we provide a finalizer to close the wrapped file.

But as we discussed, finalization is non-deterministic, so the file belonging to an object that is no longer in use might stay open for a long time. We would like to have a deterministic mechanism for a client program to clean up the wrapper object when it is done with it. The .NET Framework provides the general-purpose IDisposable interface for this purpose.

    Public Interface IDisposable
       Sub Dispose()
    End Interface

The design pattern specifies that a client program should call Dispose on the object when it is done with it. In the Dispose method implementation, the class does the appropriate cleanup. As backup assurance, the class should also implement a finalizer in case Dispose never gets called, perhaps due to an exception being thrown. [12] Since both Dispose and Finalize perform the cleanup, cleanup code can be placed in Dispose, and Finalize can be implemented by calling Dispose. One detail is that once Dispose has been called, the object should not be finalized, because that would involve cleanup being performed twice. The object can be removed from the finalization queue by calling GC.SuppressFinalize. Also, it is a good idea for the class to maintain a Boolean flag, such as disposeCalled, so that if Dispose is called twice, cleanup will not be performed a second time.

[12] One of the virtues of the exception-handling mechanism is that as the call stack is unwound in handling the exception, local objects go out of scope and so can get marked for finalization. We provide a small demo later in this section.

The example program DisposeDemo provides an illustration of finalization and the dispose pattern. The class SimpleLog implements logging to a file, making use of the StreamWriter class (discussed earlier in this chapter).

    ' SimpleLog.vb
    Imports System.IO

    Public Class SimpleLog
       Implements IDisposable

       Private writer As StreamWriter
       Private name As String
       Private disposeCalled As Boolean = False

       Public Sub New(ByVal fileName As String)
          name = fileName
          writer = New StreamWriter(fileName, False)
          writer.AutoFlush = True
          Console.WriteLine("logfile " & name & " created")
       End Sub

       Public Sub WriteLine(ByVal str As String)
          writer.WriteLine(str)
          Console.WriteLine(str)
       End Sub

       Public Sub Dispose() Implements IDisposable.Dispose
          If disposeCalled Then
             Return
          End If
          writer.Close()
          GC.SuppressFinalize(Me)
          Console.WriteLine("logfile " & name & " disposed")
          disposeCalled = True
       End Sub

       Protected Overrides Sub Finalize()
          Console.WriteLine("logfile " & name & " finalized")
          Dispose()
          MyBase.Finalize()
       End Sub
    End Class

The class SimpleLog supports the IDisposable interface, and thus implements Dispose. The cleanup code simply closes the StreamWriter object. To make sure that a disposed object will not also be finalized, GC.SuppressFinalize is called. The finalizer simply delegates to Dispose. To help monitor object lifetime, a message is written to the console in the constructor, in Dispose, and in the finalizer. [13]

[13] The Console.WriteLine in the finalizer is provided purely for didactic purposes and should not be done in production code, for reasons we shall discuss shortly.

Here is the code for the test program:

    ' DisposeDemo.vb
    Module DisposeDemo
       Sub Main()
          Dim log As New SimpleLog("log1.txt")
          log.WriteLine("First line")
          Pause()
          log.Dispose()
          log.Dispose()
          log = New SimpleLog("log2.txt")
          log.WriteLine("Second line")
          Pause()
          log = New SimpleLog("log3.txt")
          log.WriteLine("Third line")
          Pause()
       End Sub

       Private Sub Pause()
          Console.Write("Press enter to continue")
          Dim str As String = Console.ReadLine()
       End Sub
    End Module

The SimpleLog object reference log is assigned in turn to three different object instances. The first time, it is properly disposed. The second time, log is reassigned to refer to a third object before the second object is disposed, resulting in the second object becoming garbage. The Pause method provides an easy way to pause the execution of this console application, allowing us to investigate the condition of the files log1.txt, log2.txt, and log3.txt at various points in the execution of the program.

Running the program results in the following output:

    logfile log1.txt created
    First line
    Press enter to continue
    logfile log1.txt disposed
    logfile log2.txt created
    Second line
    Press enter to continue
    logfile log3.txt created
    Third line
    Press enter to continue
    logfile log3.txt finalized
    logfile log3.txt disposed
    logfile log2.txt finalized
    logfile log2.txt disposed

After the first pause, the file log1.txt has been created, and you can examine its contents in Notepad. If you try to delete the file, you will get a sharing violation, as illustrated in Figure 10-2.

Figure 10-2. Trying to delete an open file results in a sharing violation.


At the second pause point, log1.txt has been disposed, and you will be allowed to delete it. log2.txt has been created (and is open). At the third pause point, log3.txt has been created. But the object reference to log2.txt has been reassigned, and so there is now no way for the client program to dispose of the second object. [14] If Dispose were the only mechanism to clean up the second object, we would be out of luck. Fortunately, the SimpleLog class has implemented a finalizer, so the next time garbage is collected, the second object will be disposed of properly. We can see the effect of finalization by running the program through to completion. The second object is indeed finalized, and thence disposed. In fact, as the application domain shuts down, Finalize is called on all objects not exempt from finalization, even on objects that are still accessible.

[14] This example illustrates that it is the client's responsibility to help the scalability of the server by cleaning up objects (using Dispose) before reassigning them. Once an object has been reassigned, there is no way to call Dispose, and the object will hang around for an indeterminate period of time until garbage is collected. Effective memory management involves both the server and client.
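One way for a client to meet this responsibility is to call Dispose in a Finally block, so that cleanup happens whether or not an exception is thrown. The following is a minimal sketch, not part of the DisposeDemo program: the Resource class, the TryFinallyDemo module, and the UseAndDispose function are hypothetical names introduced only for illustration.

```vb
Imports System

' Hypothetical stand-in for a class such as SimpleLog that owns a resource.
Public Class Resource
   Implements IDisposable

   Private disposeCalled As Boolean = False

   Public ReadOnly Property IsDisposed() As Boolean
      Get
         Return disposeCalled
      End Get
   End Property

   Public Sub Dispose() Implements IDisposable.Dispose
      If disposeCalled Then
         Return
      End If
      ' Real cleanup (closing a file, for example) would go here.
      disposeCalled = True
      GC.SuppressFinalize(Me)
   End Sub
End Class

Public Module TryFinallyDemo
   ' Uses a resource and disposes it even if the work throws.
   Public Function UseAndDispose(ByVal fail As Boolean) As Resource
      Dim res As New Resource()
      Try
         If fail Then
            Throw New InvalidOperationException("simulated failure")
         End If
      Catch e As InvalidOperationException
         Console.WriteLine("caught: " & e.Message)
      Finally
         ' Runs on both the normal and the exceptional path.
         res.Dispose()
      End Try
      Return res
   End Function

   Sub Main()
      Console.WriteLine(UseAndDispose(False).IsDisposed)
      Console.WriteLine(UseAndDispose(True).IsDisposed)
   End Sub
End Module
```

With this structure the finalizer remains only a backup, because the reference is never reassigned or abandoned before Dispose has run.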

In our code we explicitly make the third object inaccessible by the assignment log = Nothing, and we then force a garbage collection by a call to GC.Collect. Finally, we sleep briefly to give the garbage collector a chance to run through to completion before the application domain shuts down. (These three statements close the Main method; they do not appear in the listing shown earlier.) Coding our test program in this way is a workaround for the fact that the order of finalization is non-deterministic. The garbage collector will be called automatically when the program exits and the application domain is shut down. However, at that point, system objects, such as Console, are also being closed. Since you cannot rely on the order of finalizations, you may get an exception from the WriteLine statement within the finalizer. The explicit call to GC.Collect forces a garbage collection while the system objects are still open. If we omitted the last three lines of the Main method, we might well get identical output, but we might also take an exception.
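The three closing statements described above might look like the following sketch. The Tracked class and the ForceCollectDemo module are hypothetical stand-ins so that the example is self-contained, and the sleep duration of 100 milliseconds is an assumption, since the text says only that the program sleeps briefly.

```vb
Imports System
Imports System.Threading

' Hypothetical class with a finalizer, standing in for SimpleLog.
Public Class Tracked
   Public Shared FinalizedCount As Integer = 0

   Protected Overrides Sub Finalize()
      Interlocked.Increment(FinalizedCount)
      MyBase.Finalize()
   End Sub
End Class

Public Module ForceCollectDemo
   Sub Main()
      Dim obj As New Tracked()
      ' ... use obj ...

      ' The three closing statements:
      obj = Nothing       ' make the object inaccessible
      GC.Collect()        ' force a garbage collection
      Thread.Sleep(100)   ' assumed duration; gives the finalizer thread time to run

      Console.WriteLine("finalizers run so far: " & Tracked.FinalizedCount)
   End Sub
End Module
```

The count printed at the end is not guaranteed, because finalization remains non-deterministic. The GC class also offers GC.WaitForPendingFinalizers, which blocks until the finalizer queue has been processed and could be used in place of the sleep.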

We provide similar code at the end of the Main methods of our other test programs so that our print statements in finalizers work properly without randomly throwing exceptions.

Alternate Name for Dispose

The standard name for the method that performs cleanup is Dispose. The convention is that once an object is disposed, it is finished. In some cases, the same object instance may be reused, as in the case of a file. A file may be opened, closed, and then opened again. In such a case the standard naming convention is that the cleanup method should be called Close. In other cases some other natural name may be used.

Our SimpleLog class could plausibly have provided an Open method, and then it would have made sense to name our cleanup method Close. For simplicity, we did not provide an Open method, and so we stuck to the name Dispose.

Garbage Collection and Generations

Using the dispose pattern, we can mitigate the issue of non-deterministic finalization, but what about the performance of the garbage collector? It turns out that the overall memory management efficiency of .NET is quite good, thanks to two main points:

  • Allocation is very fast. Space on the managed heap is always contiguous, so allocating a new object is equivalent to incrementing a pointer. (By contrast, an allocation on an unmanaged heap is relatively slow, because a list of data structures must be walked to find a block that is large enough.)

  • The CLR uses generations during garbage collecting, reducing the number of objects that are typically checked for being garbage.

Generations

As an optimization, every object on the managed heap is assigned to a generation. A new object is in generation 0 and is considered a prime candidate for garbage collection. Older objects are in generation 1. Since such an older object has survived for a while, the odds favor its having a longer lifetime than a generation 0 object. Still older objects are assigned to generation 2 and are considered even more likely to survive a garbage collection. The maximum generation number in the current implementation of .NET is 2, as can be confirmed from the GC.MaxGeneration property.

In a normal sweep of the garbage collector, only generation 0 will be examined, since this is where memory is most likely to be reclaimed. All surviving generation 0 objects are promoted to generation 1. If not enough memory is reclaimed, a sweep will next be performed on generation 1 objects, and the survivors will be promoted. Then, if necessary, a sweep of generation 2 will be performed, and so on up until MaxGeneration.
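Promotion can be observed with GC.GetGeneration. The following is a small sketch rather than the book's sample program: the GenerationDemo module and the TrackGenerations function are hypothetical names. It records the generation of a surviving object before and after each forced collection.

```vb
Imports System

Public Module GenerationDemo
   ' Returns the generation of obj before any collection (index 0)
   ' and after each of the requested forced collections.
   Public Function TrackGenerations(ByVal obj As Object, ByVal collections As Integer) As Integer()
      Dim result(collections) As Integer
      Dim i As Integer
      result(0) = GC.GetGeneration(obj)
      For i = 1 To collections
         GC.Collect()
         result(i) = GC.GetGeneration(obj)
      Next
      Return result
   End Function

   Sub Main()
      Console.WriteLine("MaxGeneration = " & GC.MaxGeneration)
      Dim survivor As New Object()
      Dim gen As Integer
      For Each gen In TrackGenerations(survivor, 3)
         Console.WriteLine("generation " & gen)
      Next
   End Sub
End Module
```

Because the object stays referenced throughout, its generation number should never decrease and should not exceed GC.MaxGeneration. The exact sequence printed depends on the runtime.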

Finalization and Stack Unwinding


As mentioned earlier, one of the virtues of the exception-handling mechanism is that as the call stack is unwound in handling the exception, local objects go out of scope and so can get marked for finalization. The program FinalizeStackUnwind provides a simple illustration. It uses the SimpleLog class discussed previously, which implements finalization.

    ' FinalizeStackUnwind.vb
    Module FinalizeStackUnwind
       Sub Main()
          Try
             SomeMethod()
          Catch e As Exception
             Console.WriteLine(e.Message)
          End Try
          GC.Collect()
       End Sub

       Private Sub SomeMethod()
          ' local variable
          Dim alpha As SimpleLog = New SimpleLog("alpha.txt")
          Throw New Exception("error!!")
       End Sub
    End Module

A local variable alpha of type SimpleLog is allocated in SomeMethod. Before the method exits normally, an exception is thrown. Once the stack has been unwound by the exception-handling mechanism, alpha is no longer accessible, and so the object it referred to becomes eligible for garbage collection. The call to GC.Collect forces a garbage collection, and we see from the output of the program that Finalize is indeed called.

    logfile alpha.txt created
    error!!
    logfile alpha.txt finalized
    logfile alpha.txt disposed

Controlling Garbage Collection with the GC Class

Normally, it is best practice simply to let the garbage collector perform its work behind the scenes. Sometimes, however, it may be advantageous for the program to intervene. The System namespace contains the class GC, which enables a program to affect the behavior of the garbage collector. We summarize a few of the important methods of the class.

SuppressFinalize

This method requests the system to not call Finalize for the specified object. As we saw previously, you should call this method in your implementation of Dispose to prevent a disposed object from also being finalized. [15]

[15] You should be careful in the case of an object that might be "closed" (like a file) and later reopened again. In such a case it might be better not to suppress finalization. Once finalization has been suppressed for an object, the object can be made eligible for finalization again by calling GC.ReRegisterForFinalize. For a discussion of advanced issues in garbage collection and finalization, refer to the article by Jeffrey Richter cited in footnote 11.
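For an object that can be closed and reopened, one possible arrangement is to suppress finalization whenever the object holds no resource and to re-register it each time a resource is acquired. The sketch below assumes a hypothetical ReopenableLog class with Open and Close methods; it is not the book's SimpleLog. Like SimpleLog, it closes a StreamWriter from the finalizer for simplicity, which is not safe in production code because the order of finalization is not guaranteed.

```vb
Imports System
Imports System.IO

' Hypothetical reusable log: Close releases the file, Open acquires it again.
Public Class ReopenableLog
   Private writer As StreamWriter
   Private fileName As String

   Public Sub New(ByVal fileName As String)
      Me.fileName = fileName
      GC.SuppressFinalize(Me)        ' nothing to clean up until Open is called
   End Sub

   Public ReadOnly Property IsOpen() As Boolean
      Get
         Return Not (writer Is Nothing)
      End Get
   End Property

   Public Sub Open()
      If IsOpen Then Return
      writer = New StreamWriter(fileName, True)
      GC.ReRegisterForFinalize(Me)   ' the finalizer is the backup again
   End Sub

   Public Sub Close()
      If Not IsOpen Then Return
      writer.Close()
      writer = Nothing
      GC.SuppressFinalize(Me)        ' already clean, so skip finalization
   End Sub

   Protected Overrides Sub Finalize()
      ' Illustration only: see the caution in the lead-in.
      Close()
      MyBase.Finalize()
   End Sub
End Class

Public Module ReopenDemo
   Sub Main()
      Dim log As New ReopenableLog(Path.GetTempFileName())
      log.Open()
      log.Close()
      log.Open()    ' reopened: eligible for finalization once more
      log.Close()
      Console.WriteLine("open after final Close: " & log.IsOpen)
   End Sub
End Module
```

The calls are kept strictly alternating, so the object is never registered for finalization more than once at a time.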

Collect

You can force a garbage collection by calling the Collect method. An optional parameter lets you specify which generations should be collected. Use this method sparingly, since normally the CLR has better information on the current state of memory. A possible use would be a case when your program has just released a number of large objects, and you would like to see all this memory reclaimed right away. Another example was provided in the previous section, where a call to Collect forced a collection while system objects were still valid.

MaxGeneration

This property returns the highest generation number that the system supports. Since generations are numbered from 0, a value of 2 means that there are three generations.

GetGeneration

This method returns the current generation number of an object.

GetTotalMemory

This method returns the number of bytes currently allocated. A parameter lets you specify whether the system should perform a garbage collection before returning. If no garbage collection is done, the indicated number of bytes is probably larger than the actual number of bytes being used by live objects.
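The members summarized above can be combined in a short experiment. This is a sketch under assumptions: the GcMethodsDemo module and the MeasureReclaimed function are hypothetical names, and the size of the array is arbitrary. It measures the heap before and after releasing a large array and forcing a collection of all generations.

```vb
Imports System

Public Module GcMethodsDemo
   ' Allocates a large array, releases it, and returns the difference in
   ' reported heap size around a forced collection.
   Public Function MeasureReclaimed() As Long
      Dim big() As Byte = New Byte(9999999) {}        ' about 10 million bytes
      Dim before As Long = GC.GetTotalMemory(False)   ' measure without collecting first
      big = Nothing                                   ' release the only reference
      GC.Collect(GC.MaxGeneration)                    ' collect generations 0 through MaxGeneration
      Dim after As Long = GC.GetTotalMemory(True)     ' collect, then measure
      Return before - after
   End Function

   Sub Main()
      Console.WriteLine("MaxGeneration: " & GC.MaxGeneration)
      Console.WriteLine("Bytes reclaimed (approximate): " & MeasureReclaimed())
   End Sub
End Module
```

The figure reported is only approximate and varies from run to run, since other allocations and collections can occur between the two measurements.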

Sample Program


The program GarbageCollection illustrates using these methods of the GC class. The example is artificial, simply illustrating object lifetime and the effect of the various GC methods. The class of objects that are allocated is called Member. This class has a String property called Name. Write statements are provided in the constructor, in Dispose, and in the finalizer. A Committee class maintains an array list of Member instances. The RemoveMember method simply removes the member from the array list. The DisposeMember method also calls Dispose on the member being expunged from the committee. The ShowGenerations method displays the generation number of each Member object. GarbageCollection.vb is a test program to exercise these classes, showing the results of various allocations and deallocations and the use of methods of the GC class. The code and output should be quite easy to understand.

All the memory is allocated locally in a method DemonstrateGenerations. After this method returns and its local memory has become inaccessible, we make an explicit call to GC.Collect. This forces the finalizers to be called before the application domain shuts down, and so we avoid a possible random exception of a stream being closed when a WriteLine method is called in a finalizer. This is the same point mentioned previously for the earlier examples.




Application Development Using Visual Basic and .NET
Year: 2002
Pages: 190
