Memory Management


This section looks at one of the larger underlying elements of managed code. One of the reasons why .NET applications are referred to as “managed” is that memory deallocation is handled automatically by the system. The CLR’s memory management fixes the shortcomings of COM’s memory management. Developers are accustomed to worrying about memory management only in an abstract sense. The basic rule has been that every object created and every section of memory allocated needs to be released (destroyed). The CLR introduces a garbage collector (GC), which simplifies this paradigm. Gone are the days when a misbehaving component (for example, one that failed to properly dispose of its object references, or that allocated and never released memory) could crash a Web server.

However, the use of a GC introduces new questions about when and if objects need to be explicitly cleaned up. There are two elements in manually writing code to allocate and deallocate memory and system resources. The first is the release of any shared resources such as file handles and database connections. This type of activity needs to be managed explicitly and is discussed shortly. The second element of manual memory management involves letting the system know when memory is no longer in use by your application. Visual Basic COM developers, in particular, are accustomed to explicitly disposing of object references by setting variables to Nothing. While you can explicitly show your intent to destroy the object by setting it to Nothing manually, this doesn’t actually free resources under .NET.

.NET uses a GC to automatically manage the cleanup of allocated memory, which means that you don’t need to carry out memory management as an explicit action. Since the system is automatic, it’s not up to you when resources are actually cleaned up; thus, a resource you previously used might sit in memory beyond the end of the method where you used it. Perhaps more important is the fact that the GC will at times reclaim objects in the middle of executing the code in a method. Fortunately, the system ensures that collection only happens as long as your code doesn’t reference the object later in the method. As an example, you could actually end up extending the amount of time an object is kept in memory just by setting that object to Nothing. Thus, setting a variable to Nothing at the end of the method prevents the garbage collection mechanism from proactively reclaiming objects, and therefore is generally discouraged. After all, if the goal is simply to document a developer’s intention, then a comment is more appropriate.

Given this change in paradigms, the next few sections review the challenges of traditional, COM-based memory management, take a look under the covers at how the garbage collector works, and then show how the GC eliminates these challenges from your list of concerns. In particular, you should understand how you can interact with the garbage collector and why the Using command, for example, is recommended over a finalization method in .NET.

Traditional Garbage Collection

The Visual Basic 6 runtime environment provides limited memory management by automatically releasing objects when they are no longer referenced by any application. Once all the references are released on an object, the runtime automatically releases the object from memory. For example, consider the following Visual Basic 6 code that uses the Scripting.FileSystemObject object to write an entry to a log file:

  ' Requires a reference to Microsoft Scripting Runtime (scrrun.dll)
  Sub WriteToLog(strLogEntry As String)
    Dim objFSO As Scripting.FileSystemObject
    Dim objTS As Scripting.TextStream

    ' Object variables must be assigned with Set in Visual Basic 6.
    Set objFSO = New Scripting.FileSystemObject
    Set objTS = objFSO.OpenTextFile("C:\temp\AppLog.log", ForAppending)
    Call objTS.WriteLine(Date & vbTab & strLogEntry)
  End Sub

WriteToLog creates two objects, a FileSystemObject and a TextStream, which are used to create an entry in the log file. Because these are COM objects, they may live either within the current application process or in their own process. Once the routine exits, the Visual Basic runtime recognizes that they are no longer referenced by an active application and dereferences the objects. This results in both objects being deactivated. However, there are situations in which objects that are no longer referenced by an application are not properly cleaned up by the Visual Basic 6 runtime. One cause of this is the circular reference.

Circular References

One of the most common situations in which the Visual Basic runtime is unable to ensure that objects are no longer referenced by the application is when these objects contain a circular reference. An example of a circular reference is when object A holds a reference to object B and object B holds a reference to object A.

Circular references are problematic because the Visual Basic runtime relies on the reference counting mechanism of COM to determine whether an object can be deactivated. Each COM object is responsible for maintaining its own reference count and for destroying itself once the reference count reaches zero. Clients of the object are responsible for updating the reference count appropriately, by calling the AddRef and Release methods on the object’s IUnknown interface. However, in this scenario, object A continues to hold a reference to object B, and vice versa, and thus the internal cleanup logic of these components is not triggered.

In addition, problems can occur if the clients do not properly maintain the COM object’s reference count. For example, an object will never be deactivated if a client forgets to call Release when the object is no longer referenced. To avoid this, the Visual Basic 6 runtime takes care of updating the reference count for you, but the object’s reference count can be an invalid indicator of whether or not the object is still being used by the application. As an example, consider the references that objects A and B hold.

The application can invalidate its references to A and B by setting the associated variables equal to Nothing. However, even though objects A and B are no longer referenced by the application, the Visual Basic runtime cannot ensure that the objects are deactivated because A and B still reference each other. Consider the following (Visual Basic 6) code:

  ' Class: CCircularRef

  ' Reference to another object.
  Dim m_objRef As Object

  Public Sub Initialize(objRef As Object)
    Set m_objRef = objRef
  End Sub

  Private Sub Class_Terminate()
    Call MsgBox("Terminating.")
    Set m_objRef = Nothing
  End Sub

The CCircularRef class implements an Initialize method that accepts a reference to another object and saves it as a member variable. Notice that the class does not release any existing reference in the m_objRef variable before assigning a new value. The following code demonstrates how to use this CCircularRef class to create a circular reference:

  Dim objA As New CCircularRef
  Dim objB As New CCircularRef

  Call objA.Initialize(objB)
  Call objB.Initialize(objA)

  Set objA = Nothing
  Set objB = Nothing

After creating two instances (objA and objB) of CCircularRef, both of which have a reference count of one, the code then calls the Initialize method on each object by passing it a reference to the other. Now each object’s reference count is equal to two: one held by the application and one held by the other object. Next, explicitly setting objA and objB to Nothing decrements each object’s reference count by one. However, since the reference count for both instances of CCircularRef is still greater than zero, the objects are not released from memory until the application is terminated. The CLR garbage collector solves the problem of circular references because it looks for a path of references from the root application or thread to every object, and all objects that cannot be reached this way are marked for deletion, regardless of what other references they might still maintain.

The CLR’s Garbage Collector

The .NET garbage collection mechanism is complex, and the details of its inner workings are beyond the scope of this book. However, it is important to understand the principles behind its operation. The GC is responsible for collecting objects that are no longer referenced. It takes a completely different approach from that of the Visual Basic runtime to accomplish this. At certain times, and based on internal rules, a task will run through all the objects looking for those that no longer have any references from the root application thread or one of the worker threads. Those objects may then be terminated; thus, the garbage is collected.

As long as all references to an object are either implicitly or explicitly released by the application, the GC will take care of freeing the memory allocated to it. Unlike COM objects, managed objects in .NET are not responsible for maintaining their reference count, and they are not responsible for destroying themselves. Instead, the GC is responsible for cleaning up objects that are no longer referenced by the application. The GC periodically determines which objects need to be cleaned up by leveraging the information the CLR maintains about the running application. The GC obtains a list of objects that are directly referenced by the application. Then, the GC discovers all the objects that are referenced (both directly and indirectly) by the application’s “root” objects. Once the GC has identified all the referenced objects, it is free to clean up any remaining objects.

The GC relies on references from an application to objects; thus, when it locates an object that is unreachable from any of the root objects, it can clean up that object. Any other references to that object will be from other objects that are also unreachable. Thus, the GC automatically cleans up objects that contain circular references.
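The behavior described above can be observed with a small console sketch. The Node class and the CreateCycle helper are hypothetical names introduced only for this illustration, and the explicit GC.Collect call is there purely to make the effect visible. The result is typically False, although the exact timing of collection is up to the runtime.

```vbnet
Imports System
Imports System.Runtime.CompilerServices

Module CircularReferenceDemo
    ' Hypothetical class for illustration; not part of the original text.
    Private Class Node
        Public Other As Node
    End Class

    ' Builds two objects that reference each other and returns only a weak
    ' reference, so no strong reference survives in the caller.
    <MethodImpl(MethodImplOptions.NoInlining)> _
    Private Function CreateCycle() As WeakReference
        Dim a As New Node()
        Dim b As New Node()
        a.Other = b
        b.Other = a
        Return New WeakReference(a)
    End Function

    Sub Main()
        Dim tracker As WeakReference = CreateCycle()

        ' Forcing a collection is for demonstration only.
        GC.Collect()
        GC.WaitForPendingFinalizers()

        ' The cycle is unreachable from any root, so it is eligible
        ' for collection even though a and b still reference each other.
        Console.WriteLine("Cycle still alive: {0}", tracker.IsAlive)
    End Sub
End Module
```

A WeakReference is used here because it lets the code ask whether the object still exists without itself keeping the object alive.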

In some environments, such as COM, objects are destroyed in a deterministic fashion. Once the reference count reaches zero, the object destroys itself, which means that you can tell exactly when the object will be terminated. However, with garbage collection, you can’t tell exactly when an object will be destroyed. Just because you eliminate all references to an object doesn’t mean that it will be terminated immediately. It just remains in memory until the garbage collection process gets around to locating and destroying it, a process called nondeterministic finalization.

This nondeterministic nature of CLR garbage collection provides a performance benefit. Rather than expend the effort to destroy objects as they are dereferenced, the destruction process can occur when the application is otherwise idle, often decreasing the impact on the user. Of course, if garbage collection must occur when the application is active, then the system may see a slight performance fluctuation as the collection is accomplished.

It is possible to explicitly invoke the GC by calling the System.GC.Collect method, but this process takes time, so it is not the sort of thing that should be done in a typical application. For example, you could call this method each time you set an object variable to Nothing, so that the object would be destroyed almost immediately. However, this forces the GC to scan all the objects in your application, a very expensive operation in terms of performance.

It’s far better to design applications such that it is acceptable for unused objects to sit in memory for some time before they are terminated. That way, the garbage collector can run based on its optimal rules, collecting many dereferenced objects at the same time. This means you need to design objects that don’t maintain expensive resources in instance variables. For example, database connections, open files on disk, and large chunks of memory (such as an image) are all examples of expensive resources. If you rely on the destruction of the object to release this type of resource, then the system might be keeping the resource tied up for a lot longer than you expect; in fact, on a lightly utilized Web server, it could literally be days.

The first principle is working with object patterns that incorporate cleaning up such pending references before the object is released. Examples of this include calling the close method on an open database connection or file handle. In most cases, it’s possible for applications to create classes that do not risk keeping these handles open. However, certain requirements, even with the best object design, can create a risk that a key resource will not be cleaned up correctly. In such an event, there are two occasions when the object could attempt to perform this cleanup: when the final reference to the object is released and immediately before the GC destroys the object.

One option is to implement the IDisposable interface. When implemented, this interface is used to ensure that persistent resources are released. This is the preferred method for releasing resources. The second option is to add a method to your class that the system runs immediately before an object is destroyed. This option is not recommended for several reasons, including the fact that many developers fail to remember that the garbage collector is nondeterministic, meaning that you can’t, for example, reference a SqlConnection object from your custom object’s finalizer.

Finally, as part of .NET 2.0, Visual Basic introduced the Using command. The Using command is designed to change the way that you think about object cleanup. Instead of encapsulating your cleanup logic within your object, the Using command creates a window around the code that is referencing an instance of your object. When your application’s execution reaches the end of this window, the system automatically calls the Dispose method of the IDisposable interface for your object to ensure that it is cleaned up correctly.

The Finalize Method

Conceptually, the GC calls an object’s Finalize method immediately before it collects an object that is no longer referenced by the application. Classes can override the Finalize method to perform any necessary cleanup. The basic concept is to create a method that acts as what, in other object-oriented languages, is referred to as a destructor. The Class_Terminate event available in previous versions of Visual Basic does not have a direct functional equivalent in .NET. Instead, it is possible to create a Finalize method that is recognized by the GC and that prevents a class from being cleaned up until after the finalization method is completed, as shown in the following example:

  Protected Overrides Sub Finalize()
    ' clean up code goes here
    MyBase.Finalize()
  End Sub

This code uses both Protected scope and the Overrides keyword. Notice that not only does custom cleanup code go here (as indicated by the comment), but this method also calls MyBase.Finalize(), which causes any finalization logic in the base class to be executed as well. Any class implementing a custom Finalize method should always call the base class’s Finalize method.

Be careful, however, not to treat the Finalize method as if it were a destructor. A destructor is based on a deterministic system, whereby the method is called when the object’s last reference is removed. In the GC system, there are key differences in how a finalizer works:

  • Because the GC is optimized to clean up memory only when necessary, there is a delay between the time when the object is no longer referenced by the application and when the GC collects it. Therefore, the same expensive resources that are released in the Finalize method may stay open longer than they need to be.

  • The GC doesn’t actually run Finalize methods. When the GC finds a Finalize method, it queues the object up for the finalizer to execute the object’s method. This means that an object is not cleaned up during the current GC pass. Because of how the GC is optimized, this can result in the object remaining in memory for a much longer period.

  • The GC is usually triggered when available memory is running low. As a result, execution of the object’s Finalize method is likely to incur performance penalties. Therefore, the code in the Finalize method should be as short and quick as possible.

  • There’s no guarantee that a service you require is still available. For example, if the system is closing and you have a file open, then .NET may have already unloaded the object required to close the file, and thus a Finalize method can’t reference an instance of any other .NET object.
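The point that finalizers run separately from your own code can be seen in a rough sketch like the following. The Tracked class and the CreateAndDrop helper are hypothetical, and writing to the console from a finalizer is done here only for demonstration; it is not a recommended practice. On most runtimes the two thread identifiers printed will differ, but that is an observation rather than a guarantee.

```vbnet
Imports System
Imports System.Threading
Imports System.Runtime.CompilerServices

Module FinalizerThreadDemo
    ' Hypothetical class for illustration only.
    Private Class Tracked
        Protected Overrides Sub Finalize()
            Try
                ' Demonstration only: real finalizers should be short
                ' and avoid relying on other objects.
                Console.WriteLine("Finalize on thread {0}", _
                                  Thread.CurrentThread.ManagedThreadId)
            Finally
                MyBase.Finalize()
            End Try
        End Sub
    End Class

    ' Creates an object and lets its only reference go out of scope.
    <MethodImpl(MethodImplOptions.NoInlining)> _
    Private Sub CreateAndDrop()
        Dim t As New Tracked()
    End Sub

    Sub Main()
        Console.WriteLine("Main on thread {0}", _
                          Thread.CurrentThread.ManagedThreadId)
        CreateAndDrop()

        ' Forced collection, for demonstration only.
        GC.Collect()
        GC.WaitForPendingFinalizers()
    End Sub
End Module
```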

All cleanup activities should be placed in the Finalize method, but objects that require timely cleanup should implement a Dispose method that can then be called by the client application just before setting the reference to Nothing:

  Class DemoDispose
    Private m_disposed As Boolean = False

    Public Sub Dispose()
      If (Not m_disposed) Then
        ' Call cleanup code in Finalize.
        Finalize()
        ' Record that object has been disposed.
        m_disposed = True
        ' Finalize does not need to be called.
        GC.SuppressFinalize(Me)
      End If
    End Sub

    Protected Overrides Sub Finalize()
      ' Perform cleanup here ...
    End Sub
  End Class

The DemoDispose class overrides the Finalize method and implements the code to perform any necessary cleanup. This class places the actual cleanup code within the Finalize method. To ensure that the Dispose method only calls Finalize once, the value of the private m_disposed field is checked. Once Finalize has been run, this value is set to True. The class then calls GC.SuppressFinalize to ensure that the GC does not call the Finalize method on this object when the object is collected. If you need to implement a Finalize method, this is the preferred implementation pattern.

This example implements all of the object’s cleanup code in the Finalize method to ensure that the object is cleaned up properly before the GC collects it. The Finalize method still serves as a safety net in case the Dispose or Close methods were not called before the GC collects the object.

The IDisposable Interface

In some cases, the Finalize behavior is not acceptable. For an object that is using an expensive or limited resource, such as a database connection, a file handle, or a system lock, it is best to ensure that the resource is freed as soon as the object is no longer needed.

One way to accomplish this is to implement a method to be called by the client code to force the object to clean up and release its resources. This is not a perfect solution, but it is workable. This cleanup method must be called directly by the code using the object or via the use of the Using statement. The Using statement enables you to encapsulate an object’s life span within a limited range, and automate the calling of the IDisposable interface.

The .NET Framework provides the IDisposable interface to formalize the declaration of cleanup logic. Be aware that implementing the IDisposable interface also implies that the object has overridden the Finalize method. Since there is no guarantee that the Dispose method will be called, it is critical that Finalize trigger your cleanup code if it was not already executed.

Having a custom finalizer ensures that, once released, the garbage collection mechanism will eventually find and terminate the object by running its Finalize method. However, when handled correctly, the IDisposable interface ensures that any cleanup is executed immediately, so resources are not consumed beyond the time they are needed.

Note that any class that derives from System.ComponentModel.Component automatically inherits the IDisposable interface. This includes all of the forms and controls used in a Windows Forms UI, as well as various other classes within the .NET Framework. Since this interface is inherited, let’s review a custom implementation of the IDisposable interface based on the Person class defined in the preceding chapters. The first step involves adding a reference to the interface to the top of the class:

  Public Class Person
    Implements IDisposable

The IDisposable interface itself defines a single method, Dispose, that needs to be implemented in the class; the recommended pattern pairs it with an override of Finalize. Visual Studio automatically inserts both these methods into your code:

  Private disposed As Boolean = False

  ' IDisposable
  Private Overloads Sub Dispose(ByVal disposing As Boolean)
    If Not Me.disposed Then
      If disposing Then
        ' TODO: put code to dispose managed resources
      End If

      ' TODO: put code to free unmanaged resources here
    End If
    Me.disposed = True
  End Sub

#Region " IDisposable Support "
  ' This code added by Visual Basic to correctly implement the disposable pattern.
  Public Overloads Sub Dispose() Implements IDisposable.Dispose
    ' Do not change this code.
    ' Put cleanup code in Dispose(ByVal disposing As Boolean) above.
    Dispose(True)
    GC.SuppressFinalize(Me)
  End Sub

  Protected Overrides Sub Finalize()
    ' Do not change this code.
    ' Put cleanup code in Dispose(ByVal disposing As Boolean) above.
    Dispose(False)
    MyBase.Finalize()
  End Sub
#End Region

Notice the use of the Overloads and Overrides keywords. The automatically inserted code is following a best-practice design pattern for implementation of the IDisposable interface and the Finalize method. The idea is to centralize all cleanup code into a single method that is called by either the Dispose() method or the Finalize() method as appropriate.

Accordingly, you can add the cleanup code as noted by the TODO: comments in the inserted code. As mentioned in Chapter 13, the TODO: keyword is recognized by Visual Studio’s text parser, which triggers an entry in the task list to remind you to complete this code before the project is complete. Because this code frees a managed object (the Hashtable), it appears as shown here:

  Private Overloads Sub Dispose(ByVal disposing As Boolean)
    If Not Me.disposed Then
      If disposing Then
        ' TODO: put code to dispose managed resources
        mPhones = Nothing
      End If

      ' TODO: put code to free unmanaged resources here
    End If
    Me.disposed = True
  End Sub

In this case, we’re using this method to release a reference to the object that the mPhones variable points to. While not strictly necessary, this illustrates how code can release other objects when the Dispose method is called. Generally, it is up to our client code to call this method at the appropriate time to ensure that cleanup occurs. Typically, this should be done as soon as the code is done using the object.

This is not always as easy as it might sound. In particular, an object may be referenced by more than one variable, and just because code in one class is dereferencing the object from one variable doesn’t mean that it has been dereferenced by all the other variables. If the Dispose method is called while other references remain, then the object may become unusable and cause errors when invoked via those other references. There is no easy solution to this problem, so careful design is required if you choose to use the IDisposable interface.
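One common way to make this failure mode easier to diagnose is to have the object check its own disposed flag and throw an ObjectDisposedException when it is used after Dispose. The sketch below assumes a hypothetical Resource class with a DoWork method; it does not solve the shared-reference design problem, it only makes the error explicit.

```vbnet
Imports System

' Hypothetical class for illustration; the member names are assumptions.
Public Class Resource
    Implements IDisposable

    Private disposed As Boolean = False

    Public Sub DoWork()
        ' Fail fast if another reference already disposed this object.
        If disposed Then
            Throw New ObjectDisposedException("Resource")
        End If
        Console.WriteLine("Working...")
    End Sub

    Public Sub Dispose() Implements IDisposable.Dispose
        disposed = True
    End Sub
End Class

Module SharedReferenceDemo
    Sub Main()
        Dim first As New Resource()
        Dim second As Resource = first   ' second reference to the same object

        ' Disposing through one variable affects every reference.
        first.Dispose()

        Try
            second.DoWork()
        Catch ex As ObjectDisposedException
            Console.WriteLine("Caught: object already disposed")
        End Try
    End Sub
End Module
```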

Using IDisposable

One way to work with the IDisposable interface is to manually insert the calls to the interface implementation everywhere you reference the class. For example, in an application’s Form1 code, you can override the form’s OnLoad method. You can use the custom implementation of this method to create an instance of the Person object. Then you create a custom handler for the form’s Closed event, and make sure to clean up by disposing of the Person object. To do this, add the following code to the form:

  Private Sub Form1_Closed(ByVal sender As Object, _
      ByVal e As System.EventArgs) Handles MyBase.Closed
    CType(mPerson, IDisposable).Dispose()
  End Sub

The Closed event handler runs as the form is being closed, so it is an appropriate place to do cleanup work. Note that because the Dispose method is part of a secondary interface, the CType() function is used to access that specific interface in order to call the method.

This solution works fine for patterns where the object implementing IDisposable is used within a form. However, it is less useful for other patterns, such as when the object is used as part of a Web Service. In fact, even for forms, this pattern is somewhat limited in that it requires the form to define the object when the form is created, as opposed to either having the object created prior to the creation of the form or some other scenario that occurs only on other events within the form.

For these situations, Visual Basic 2005 introduces a new command keyword: Using. The Using keyword is a way to quickly encapsulate the lifecycle of an object that implements IDisposable, and ensure that the Dispose method is called correctly:

  Dim mPerson As New Person()
  Using (mPerson)
    ' insert custom method calls
  End Using

The preceding statements allocate a new instance of the Person class and assign it to mPerson. The Using command then instructs the compiler to automatically call this object’s Dispose method when the End Using command is reached. The result is a much cleaner way to ensure that the IDisposable interface is called.
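To make the effect of Using more concrete, the following sketch places a Using block next to a roughly equivalent Try...Finally block. The Person class here is a minimal hypothetical stand-in whose Dispose method only writes a message; it is not the Person class developed in earlier chapters. The Try...Finally form is an approximation of what the compiler arranges, not a literal transcription of its output.

```vbnet
Imports System

' Hypothetical stand-in for the Person class; only Dispose matters here.
Public Class Person
    Implements IDisposable

    Public Sub Dispose() Implements IDisposable.Dispose
        Console.WriteLine("Person disposed")
    End Sub
End Class

Module UsingDemo
    Sub Main()
        ' Declaring the variable in the Using statement limits its scope
        ' to the block; Dispose is called when the block exits.
        Using p As New Person()
            Console.WriteLine("Inside Using block")
        End Using

        ' Roughly the same behavior written out by hand.
        Dim q As New Person()
        Try
            Console.WriteLine("Inside Try block")
        Finally
            If q IsNot Nothing Then
                q.Dispose()
            End If
        End Try
    End Sub
End Module
```

Because the cleanup sits in a Finally block, Dispose runs even if the code inside the block throws an exception.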

Faster Memory Allocation for Objects

The CLR introduces the concept of a managed heap. Objects are allocated on the managed heap, and the CLR is responsible for controlling access to these objects in a type-safe manner. One of the advantages of the managed heap is that memory allocations on it are very efficient. When unmanaged code (such as Visual Basic 6 or C++) allocates memory on the unmanaged heap, it typically scans through some sort of data structure in search of a free chunk of memory that is large enough to accommodate the allocation. The managed heap maintains a reference to the end of the most recent heap allocation. When a new object needs to be created on the heap, the CLR allocates memory on top of memory that has previously been allocated and then increments the reference to the end of heap allocations accordingly. Figure 5-4 is a simplification of what takes place in the managed heap for .NET.

Figure 5-4

  • State 1 - A compressed memory heap with a reference to the end point on the heap

  • State 2 - Object B, although no longer referenced, remains in its current memory location. The memory has not been freed and does not alter the allocation of memory or of other objects on the heap.

  • State 3 - Even though there is now a gap between the memory allocated for object A and object C, the memory allocation for D still occurs on the top of the heap. The unused fragment of memory on the managed heap is ignored at allocation time.

  • State 4 - After one or more allocations, before there is an allocation failure, the garbage collector runs. It reclaims the memory that was allocated to B and repositions the remaining valid objects. This compresses the active objects to the bottom of the heap, creating more space for additional object allocations, as shown in Figure 5-4.

This is where the power of the GC really shines. The GC is invoked before the CLR reaches the point where it can no longer allocate memory on the managed heap. The GC not only collects objects that are no longer referenced by the application, but also has a second task: compacting the heap. This is important because if all the GC did was clean up objects, then the heap would become progressively more fragmented. When heap memory becomes fragmented, you can wind up with the common problem of having a memory allocation fail, not because there isn’t enough free memory, but because there isn’t enough free memory in a contiguous section of memory. Thus, not only does the GC reclaim the memory associated with objects that are no longer referenced, it also compacts the remaining objects. The GC effectively squeezes out all of the spaces between the remaining objects, freeing up a large section of managed heap for new object allocations.
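The four states described for Figure 5-4 can be mimicked with a toy model. This is only a sketch of the idea of pointer-bump allocation followed by compaction; the Block class, the Live flag, and the sizes are all invented for illustration, and the real managed heap works on actual memory with far more sophistication.

```vbnet
Imports System
Imports System.Collections.Generic

Module HeapSketch
    ' Toy model only: each block records where it sits on the pretend heap.
    Private Class Block
        Public Name As String
        Public Offset As Integer
        Public Size As Integer
        Public Live As Boolean = True
    End Class

    Private blocks As New List(Of Block)()
    Private nextFree As Integer = 0   ' end of the most recent allocation

    ' Allocation just bumps the pointer; no search for a free chunk.
    Private Function Allocate(ByVal name As String, ByVal size As Integer) As Block
        Dim b As New Block()
        b.Name = name
        b.Offset = nextFree
        b.Size = size
        blocks.Add(b)
        nextFree += size
        Return b
    End Function

    ' Collection drops dead blocks and slides the survivors down.
    Private Sub Compact()
        Dim survivors As New List(Of Block)()
        Dim offset As Integer = 0
        For Each b As Block In blocks
            If b.Live Then
                b.Offset = offset
                offset += b.Size
                survivors.Add(b)
            End If
        Next
        blocks = survivors
        nextFree = offset
    End Sub

    Sub Main()
        Allocate("A", 16)
        Dim objB As Block = Allocate("B", 32)
        Allocate("C", 8)
        objB.Live = False     ' State 2: B is no longer referenced
        Allocate("D", 16)     ' State 3: D still goes on top of the heap
        Console.WriteLine("Next free before compaction: {0}", nextFree)
        Compact()             ' State 4: the gap left by B is squeezed out
        Console.WriteLine("Next free after compaction: {0}", nextFree)
    End Sub
End Module
```

With the sizes chosen here, the end-of-heap marker moves from 72 down to 40 after compaction, which is the space recovered from B.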

Garbage Collector Optimizations

The GC uses a concept known as generations, the primary purpose of which is to improve its performance. The theory behind generations is that objects that have been recently created tend to have a higher probability of being garbage-collected than objects that have existed on the system for a longer time.

To understand generations, consider the analogy of a mall parking lot where cars represent objects created by the CLR. People have different shopping patterns when they visit the mall. Some people spend a good portion of their day in the mall, and others stop only long enough to pick up an item or two. Applying the theory of generations to trying to find an empty parking space for a car yields a situation in which the highest probability of finding a parking space is a place where other cars have recently parked. In other words, a space that was occupied recently is more likely to be held by someone who just needed to quickly pick up an item or two. The longer a car has been parked, the higher the probability that its owner is an all-day shopper and the lower the probability that the parking space will be freed up anytime soon.

Generations provide a means for the GC to identify recently created objects versus long-lived objects. An object’s generation is basically a counter that indicates how many times it has successfully avoided garbage collection. An object’s generation counter starts at zero and can have a maximum value of two, after which the object’s generation remains at this value regardless of how many times it is checked for collection.

You can put this to the test with a simple Visual Basic application. From the File menu, select either File > New > Project or, if you have an open solution, File > Add > New Project. This opens the Add New Project dialog box. Select a Console application, provide a name and directory for your new project, and click OK. After you create your new project, you will have a code module that looks similar to the code that follows. Within the Main module, add the highlighted code below. Right-click your second project, and select the Set as Startup Project option so that when you run your solution, your new project is automatically started.

  Module Module1
    Sub Main()
      Dim myObject As Object = New Object()
      Dim i As Integer
      For i = 0 To 3
        Console.WriteLine(String.Format("Generation = {0}", _
                          GC.GetGeneration(myObject)))
        GC.Collect()
        GC.WaitForPendingFinalizers()
      Next i
    End Sub
  End Module

Regardless of the project you use, this code sends its output to the .NET console. For a Windows application, this console defaults to the Visual Studio Output window. When you run this code, it creates an instance of an object and then iterates through a loop four times. For each loop, it displays the current generation count of myObject and then calls the GC. The GC.WaitForPendingFinalizers method blocks the current thread until the finalizer thread has run all pending finalizers.

As shown in Figure 5-5, each time the GC was run, the generation counter was incremented for myObject, up to a maximum of 2.

Figure 5-5

Each time the GC is run, the managed heap is compacted, and the reference to the end of the most recent memory allocation is updated. After compaction, objects of the same generation are grouped together. Generation-2 objects are grouped at the bottom of the managed heap, and generation-1 objects are grouped next. Since new generation-0 objects are placed on top of the existing allocations, they are grouped together as well.

This is significant because recently allocated objects have a higher probability of having shorter lives. Since objects on the managed heap are ordered according to generations, the GC can opt to collect newer objects. Running the GC over a limited portion of the heap is quicker than running it over the entire managed heap.

It’s also possible to invoke the GC with an overloaded version of the Collect method that accepts a generation number. The GC will then collect all objects no longer referenced by the application that belong to the specified (or younger) generation. The version of the Collect method that accepts no parameters collects objects that belong to all generations.
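As a brief sketch of the generation-specific overload, the following console module collects generation 0 only and uses GC.CollectionCount to show that a generation-0 collection took place. The module name is arbitrary, and the actual counts printed depend on what the runtime has already done before Main runs.

```vbnet
Imports System

Module GenerationCollectDemo
    Sub Main()
        ' The highest generation number the runtime supports.
        Console.WriteLine("Highest generation: {0}", GC.MaxGeneration)

        Dim before As Integer = GC.CollectionCount(0)

        ' Collect generation 0 only; older generations are left alone.
        GC.Collect(0)

        Dim after As Integer = GC.CollectionCount(0)
        Console.WriteLine("Gen 0 collections before: {0}, after: {1}", _
                          before, after)
    End Sub
End Module
```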

Another hidden GC optimization is that a reference to an object may implicitly go out of scope, at which point the object can be collected by the GC. This is difficult to illustrate because the optimization occurs only if there are no additional references to the object and the object does not have a finalizer. However, if an object is declared and used at the top of a module and not referenced again in a method, then in release mode the metadata will indicate that the variable is not referenced in the later portion of the code. Once the last reference to the object is made, its logical scope ends, and if the garbage collector runs, the memory for that object, which will no longer be referenced, can be reclaimed before it has gone out of its physical scope.
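When early reclamation of this kind would be a problem, the framework's GC.KeepAlive method can be used to extend an object's logical scope. The sketch below is a minimal illustration with arbitrary variable names: because marker is passed to GC.KeepAlive after the collection, it remains reachable at the point where the collection runs.

```vbnet
Imports System

Module KeepAliveDemo
    Sub Main()
        Dim marker As New Object()
        Dim tracker As New WeakReference(marker)

        ' Without the KeepAlive call below, an optimized (release) build
        ' would be free to treat marker as dead from this point on.
        GC.Collect()
        Console.WriteLine("Alive after collect: {0}", tracker.IsAlive)

        ' KeepAlive counts as a reference, so marker stays reachable
        ' up to this line.
        GC.KeepAlive(marker)
    End Sub
End Module
```

Removing the GC.KeepAlive line and running a release build may produce a different result, since the runtime is then allowed, but not required, to reclaim the object at the forced collection.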




Professional VB 2005 with .NET 3.0 (Programmer to Programmer)
ISBN: 0470124709
EAN: 2147483647
Year: 2004
Pages: 267
