Configuring the CLR Garbage Collector


The CLR hosting APIs offer two more interfaces that hosts can use to monitor and configure how the CLR uses memory. These two interfaces, ICLRGCManager and IHostGCManager, comprise the hosting interface's garbage collection manager. The ICLRGCManager interface enables a host to initiate garbage collections, gather various statistics related to collections, and partition the garbage collector's heap for optimal performance. Hosts can receive notifications about the timing of garbage collections by providing the CLR with an interface of type IHostGCManager. I discuss the role of IHostGCManager in more detail later in the chapter in the section entitled "Receiving Notifications Through the IHostGCManager Interface."

CLR hosts obtain an interface pointer of type ICLRGCManager through the standard mechanism used to obtain hosting interfaces from the CLR, that is, by calling the GetCLRManager method on the CLR's implementation of ICLRControl, as shown in the following sample main program. (Refer to Chapter 2 for a detailed discussion of how both the host and the CLR exchange pointers to the various interfaces in the hosting API.)

int wmain(int argc, wchar_t* argv[])
{
    HRESULT hr = S_OK;
    // Start .NET Framework version 2.0 of the CLR.
    ICLRRuntimeHost *pCLR = NULL;
    hr = CorBindToRuntimeEx(
        L"v2.0.41013",
        L"wks",
        STARTUP_CONCURRENT_GC,
        CLSID_CLRRuntimeHost,
        IID_ICLRRuntimeHost,
        (PVOID*) &pCLR);
    assert(SUCCEEDED(hr));
    // Get the CLRControl object. Use this to get the pointer of
    // type ICLRGCManager.
    ICLRControl *pCLRControl = NULL;
    hr = pCLR->GetCLRControl(&pCLRControl);
    assert(SUCCEEDED(hr));
    // Get a pointer to an ICLRGCManager.
    ICLRGCManager *pCLRGCManager = NULL;
    hr = pCLRControl->GetCLRManager(IID_ICLRGCManager,
                                    (void **)&pCLRGCManager);
    assert(SUCCEEDED(hr));
    // The ICLRGCManager pointer is now ready to use...
    // Remember to release it.
    pCLRGCManager->Release();
    // The rest of the host's code is omitted...
}

Once the host has a pointer of type ICLRGCManager, it can use the methods on that interface to customize various aspects of how the garbage collector works. Table 13-3 provides an overview of the methods on ICLRGCManager.

Table 13-3. The Methods on the ICLRGCManager Interface

Method               Description
Collect              Enables a host to initiate a garbage collection
GetStats             Returns various statistics about the garbage collections that have occurred so far in the process
SetGCStartupLimits   Enables a host to partition the garbage collection heap to optimize overall performance for its specific scenario


The next several sections describe how to use the methods in Table 13-3.

Partitioning the Garbage Collector's Heap

The CLR garbage collector uses the notion of generations to optimize the collector based on the expected lifetime of managed objects in the heap. Many other texts give a complete description of how the garbage collector uses generations, so I won't repeat it here. If you're looking for a great reference on the CLR's garbage collector, refer to Chapter 19 in Applied Microsoft .NET Framework Programming by Jeffrey Richter (Microsoft Press, 2002).

The SetGCStartupLimits method on ICLRGCManager can be used to specify how much of the garbage collection heap is to be used for generations 0 and 1. The garbage collection heap is divided into segments. The managed objects in generations 0 and 1 are stored in the same segment. The objects in generation 2 are sometimes stored in the same segment as generations 0 and 1, but not always. The decision about where the objects in generation 2 and the objects in the large object heap live is a CLR implementation detail that can change over time. The relationship between segments and generations is shown in Figure 13-2.

Figure 13-2. The garbage collection heap is partitioned into segments.


SetGCStartupLimits enables you to supply two values that control how the garbage collection heap is partitioned, as shown in the following definition from mscoree.idl:

interface ICLRGCManager : IUnknown
{
    // Other methods omitted...
    HRESULT SetGCStartupLimits([in] DWORD SegmentSize,
                               [in] DWORD MaxGen0Size);
}

The SegmentSize parameter to SetGCStartupLimits controls the size of the segments in the garbage collection heap. The value you supply for this parameter must be a multiple of 1 MB and at least 4 MB. The MaxGen0Size parameter specifies the size of the space used to store objects in generation 0. MaxGen0Size must be at least 64 KB. Both SegmentSize and MaxGen0Size are specified in bytes. Given values for SegmentSize and MaxGen0Size, the CLR computes the amount of space to use for generation 1 as shown in Figure 13-2. Both values you supply through SetGCStartupLimits can be set only once; subsequent calls are ignored. The following sample call to SetGCStartupLimits sets the segment size to 8 MB and the maximum size of generation 0 to 128 KB:

hr = pCLRGCManager->SetGCStartupLimits(8*1024*1024,128*1024);
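Because the limits can be set only once, it can be worth checking candidate values against the stated constraints before making the call. The following is a minimal sketch of such a check. The helper names and constants are hypothetical and are not part of the hosting API; the sketch encodes only the rules given above (SegmentSize a multiple of 1 MB and at least 4 MB, MaxGen0Size at least 64 KB, both in bytes).

```cpp
#include <cassert>
#include <cstdint>

// Constraints taken from the text; both values are expressed in bytes.
constexpr std::uint32_t kOneMB          = 1024u * 1024u;
constexpr std::uint32_t kMinSegmentSize = 4u * kOneMB;
constexpr std::uint32_t kMinGen0Size    = 64u * 1024u;

// Hypothetical helper (not part of the hosting API): returns true when the
// pair of values satisfies the constraints described for SetGCStartupLimits.
bool AreValidStartupLimits(std::uint32_t segmentSize, std::uint32_t maxGen0Size)
{
    const bool segmentOk = segmentSize >= kMinSegmentSize &&
                           segmentSize % kOneMB == 0;
    const bool gen0Ok    = maxGen0Size >= kMinGen0Size;
    return segmentOk && gen0Ok;
}

// Hypothetical debug-build guard a host could run just before the real call.
void AssertValidStartupLimits(std::uint32_t segmentSize, std::uint32_t maxGen0Size)
{
    assert(AreValidStartupLimits(segmentSize, maxGen0Size));
}
```

A host might call AssertValidStartupLimits(8*1024*1024, 128*1024) immediately before the SetGCStartupLimits call shown above, so that a bad value is caught during development rather than discovered later through unexpected heap behavior.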

Now that you know how to partition the garbage collection heap using SetGCStartupLimits, take a look at how you might use the statistics returned from ICLRGCManager::GetStats to determine which values for SegmentSize and MaxGen0Size might work best for your application.

Gathering Garbage Collection Statistics

The ability to partition the garbage collection heap isn't of much use if you don't know which values for SegmentSize and MaxGen0Size make sense in your application. Settling on the right values will likely take several iterations, but the GetStats method on ICLRGCManager can help you get started. GetStats returns a structure that contains various statistics about how the garbage collector is performing in your process. By looking at the values returned from GetStats, you can start to establish patterns that can help you optimize the performance of the garbage collector by adjusting how the heap is partitioned using SetGCStartupLimits.

GetStats returns a structure of type COR_GC_STATS as shown in the following definition from mscoree.idl:

interface ICLRGCManager : IUnknown
{
    HRESULT GetStats([in][out] COR_GC_STATS *pStats);
}

The COR_GC_STATS structure contains fields that report both the number of collections that have occurred and the current status of the memory used by the garbage collector. COR_GC_STATS is defined in gchost.idl in the .NET Framework SDK:

typedef struct _COR_GC_STATS
{
    ULONG  Flags;
    SIZE_T ExplicitGCCount;
    SIZE_T GenCollectionsTaken[3];
    SIZE_T CommittedKBytes;
    SIZE_T ReservedKBytes;
    SIZE_T Gen0HeapSizeKBytes;
    SIZE_T Gen1HeapSizeKBytes;
    SIZE_T Gen2HeapSizeKBytes;
    SIZE_T LargeObjectHeapSizeKBytes;
    SIZE_T KBytesPromotedFromGen0;
    SIZE_T KBytesPromotedFromGen1;
} COR_GC_STATS;

Notice from the definition of GetStats that the pStats parameter is marked as both an in and an out parameter. When calling GetStats, you must first populate the Flags field of COR_GC_STATS to indicate which of the statistics you'd like populated. Based on the flags you set, the CLR fills in the appropriate fields of the structure that you passed in. The valid values for the Flags field are given by the COR_GC_STAT_TYPES enumeration from gchost.idl:

typedef enum
{
    COR_GC_COUNTS      = 0x00000001,
    COR_GC_MEMORYUSAGE = 0x00000002,
} COR_GC_STAT_TYPES;

If COR_GC_COUNTS is added to the Flags field of COR_GC_STATS, the CLR populates the fields of COR_GC_STATS that describe the number of collections that have occurred so far in the process. These fields are ExplicitGCCount and GenCollectionsTaken. ExplicitGCCount indicates the number of times that a garbage collection has been explicitly initiated either through a call to ICLRGCManager::Collect or through the Collect method on the System.GC class in the .NET Framework class libraries. The GenCollectionsTaken array describes the number of collections that have occurred per generation. Element 0 of GenCollectionsTaken contains the number of collections done in generation 0, element 1 contains the number of collections done in generation 1, and so on.

Setting COR_GC_MEMORYUSAGE in the Flags field of COR_GC_STATS causes the CLR to return the values that provide insight into how the garbage collector is using memory in the process. The fields returned when COR_GC_MEMORYUSAGE is set are as follows:

  • CommittedKBytes

  • ReservedKBytes

  • Gen0HeapSizeKBytes

  • Gen1HeapSizeKBytes

  • Gen2HeapSizeKBytes

  • LargeObjectHeapSizeKBytes

  • KBytesPromotedFromGen0

  • KBytesPromotedFromGen1

These fields describe the total amount of memory that has been committed and reserved by the garbage collector, the amount of memory currently used to store the objects in each generation and in the large object heap, and the amount of memory promoted from generation 0 to generation 1 and from generation 1 to generation 2. As the field names indicate, all of these values are reported in kilobytes.
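To make the memory-usage fields easier to reason about, a host can reduce them to a couple of summary numbers. The following is a minimal sketch under stated assumptions: GcMemorySnapshot is a hypothetical stand-in that mirrors some of the COR_GC_STATS memory fields so that the example compiles without the SDK headers, and the two helper functions are illustrative, not part of the hosting API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the memory-usage fields of COR_GC_STATS.
// A real host would copy these values out of the structure that
// ICLRGCManager::GetStats fills in. All values are in kilobytes.
struct GcMemorySnapshot
{
    std::size_t committedKBytes;
    std::size_t reservedKBytes;
    std::size_t gen0HeapSizeKBytes;
    std::size_t gen1HeapSizeKBytes;
    std::size_t gen2HeapSizeKBytes;
    std::size_t largeObjectHeapSizeKBytes;
};

// Sum of the space reported for the three generations and the large object heap.
std::size_t TotalHeapKBytes(const GcMemorySnapshot& s)
{
    return s.gen0HeapSizeKBytes + s.gen1HeapSizeKBytes +
           s.gen2HeapSizeKBytes + s.largeObjectHeapSizeKBytes;
}

// Fraction of the reserved memory that is currently committed.
// Returns 0.0 when nothing has been reserved, to avoid dividing by zero.
double CommittedFraction(const GcMemorySnapshot& s)
{
    if (s.reservedKBytes == 0)
        return 0.0;
    return static_cast<double>(s.committedKBytes) /
           static_cast<double>(s.reservedKBytes);
}
```

Logging these two numbers at regular intervals is one simple way to start looking for the patterns mentioned earlier; which thresholds matter is something only testing in your own scenario can tell you.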

The following sample call sets both COR_GC_COUNTS and COR_GC_MEMORYUSAGE to return all of the statistics available from GetStats:

COR_GC_STATS stats;
stats.Flags = COR_GC_COUNTS | COR_GC_MEMORYUSAGE;
hr = pCLRGCManager->GetStats(&stats);
// The stats structure now contains values for the full set
// of garbage collection statistics.

Using the statistics returned from GetStats to tune the CLR garbage collector requires several iterations and an extensive amount of testing. It's very easy to hurt performance instead of help it if you're not careful. If you see that an excessive number of generation 0 collections are happening in your particular scenario, you might try adjusting the amount of space the CLR is using to store generation 0 objects. However, be sure you follow up with enough benchmarking to ensure you aren't inadvertently making matters worse.
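Because the collection counts are cumulative for the process, a single reading says little about a particular workload; the difference between two readings is usually more telling. The following is a minimal sketch of that idea. GcCountSnapshot is a hypothetical stand-in for the counting fields of COR_GC_STATS so that the example compiles without the SDK headers, and the notion of a "work item" is an assumption standing in for whatever unit of work your host measures.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the counting fields of COR_GC_STATS.
// A real host would copy ExplicitGCCount and GenCollectionsTaken out of
// the structure filled in by ICLRGCManager::GetStats.
struct GcCountSnapshot
{
    std::size_t explicitGCCount;
    std::array<std::size_t, 3> genCollectionsTaken;
};

// Collections that occurred between two snapshots taken in the same process.
// The counters only grow, so 'after' must not be smaller than 'before'.
GcCountSnapshot Delta(const GcCountSnapshot& before, const GcCountSnapshot& after)
{
    GcCountSnapshot d{};
    assert(after.explicitGCCount >= before.explicitGCCount);
    d.explicitGCCount = after.explicitGCCount - before.explicitGCCount;
    for (std::size_t gen = 0; gen < 3; ++gen)
    {
        assert(after.genCollectionsTaken[gen] >= before.genCollectionsTaken[gen]);
        d.genCollectionsTaken[gen] =
            after.genCollectionsTaken[gen] - before.genCollectionsTaken[gen];
    }
    return d;
}

// Generation 0 collections per unit of host-defined work in the interval.
// Returns 0.0 when no work was done, to avoid dividing by zero.
double Gen0CollectionsPerWorkItem(const GcCountSnapshot& delta, std::size_t workItems)
{
    if (workItems == 0)
        return 0.0;
    return static_cast<double>(delta.genCollectionsTaken[0]) /
           static_cast<double>(workItems);
}
```

Comparing this rate across runs with different MaxGen0Size values gives you a repeatable number to benchmark against, rather than a general impression that collections are happening too often.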

Initiating Garbage Collections

The CLR uses several heuristics to determine when to initiate a garbage collection. For example, a collection is done when generation 0 is full. These heuristics work great for the vast majority of application scenarios, and the general guidance is to leave it up to the CLR to determine the optimal time to do a collection. That said, available APIs enable you to force a garbage collection to happen. The CLR hosting API offers the ability to initiate a garbage collection by calling the Collect method on ICLRGCManager. Collect takes a single parameter that identifies the generation you'd like collected, as shown in the following definition from mscoree.idl:

interface ICLRGCManager : IUnknown
{
    HRESULT Collect([in] LONG Generation);
    // Other methods omitted...
}

To force a collection for a particular generation, simply pass the number of that generation (0, 1, or 2) as the Generation parameter. You can force all generations to be collected by passing -1.

As with all techniques available to configure the CLR garbage collector, use the ability to initiate collections programmatically with care. Just the act of preparing for a garbage collection can be an expensive operation. To prepare, the CLR must bring all threads to a known safe state and ensure that several internal data structures cannot be modified while the collection is performed. Calling Collect too often can easily degrade performance by causing the CLR unnecessarily to bring itself to a state in which it's safe to perform a collection. Again, be sure to test thoroughly to make sure you're not hurting performance when you intend to make it better.
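One way to guard against calling Collect too often is to put a simple rate limit in front of it. The following is a minimal sketch of that idea, not a recommendation from the CLR itself: the CollectThrottle class is hypothetical, the minimum interval is a value you would have to choose through testing, and timestamps are passed in as milliseconds from whatever monotonic clock the host already uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the hosting API) that limits how often
// a host goes ahead with an explicit call to ICLRGCManager::Collect.
class CollectThrottle
{
public:
    explicit CollectThrottle(std::uint64_t minIntervalMs)
        : minIntervalMs_(minIntervalMs) {}

    // Returns true if enough time has passed since the last approved
    // collection, and records 'nowMs' as the new last-collection time.
    // 'nowMs' must come from a monotonic clock, so it never goes backward.
    bool ShouldCollect(std::uint64_t nowMs)
    {
        assert(!hasCollected_ || nowMs >= lastMs_);
        if (hasCollected_ && nowMs - lastMs_ < minIntervalMs_)
            return false;
        hasCollected_ = true;
        lastMs_ = nowMs;
        return true;
    }

private:
    std::uint64_t minIntervalMs_;
    std::uint64_t lastMs_ = 0;
    bool hasCollected_ = false;
};
```

A host would consult ShouldCollect at the point where it is tempted to force a collection and call Collect only when the function returns true. The first request is always approved; later requests are refused until the interval has elapsed.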

Note

The CLR hosting APIs from .NET Framework 1.0 and .NET Framework 1.1 include an interface in gchost.idl called IGCHost. This interface has many of the same capabilities that ICLRGCManager now has. IGCHost is now deprecated and should not be used going forward. Always use ICLRGCManager instead.


Receiving Notifications Through the IHostGCManager Interface

In Chapter 14, you'll see that the CLR hosting APIs provide a set of interfaces that enable a host to integrate the CLR with custom task-scheduling schemes. To schedule tasks most efficiently, a host must know when the CLR suspends a thread either to do a garbage collection or for other activities. A host can use the knowledge of when a thread is about to be suspended to avoid scheduling any tasks on that thread until the CLR is ready to let the thread run again.

The CLR notifies the host when a thread is about to be suspended (and when it resumes) by calling methods on the host's implementation of IHostGCManager. The methods on IHostGCManager are shown in Table 13-4.

Table 13-4. The Methods on the IHostGCManager Interface

Method                          Description
ThreadIsBlockingForSuspension   Notifies the host that the thread making this call is about to block for a garbage collection or other activity. At this point, the host should not schedule any managed code to run on the thread.
SuspensionStarting              Notifies the host that a thread suspension is beginning.
SuspensionEnding                Notifies the host that a thread suspension is ending. SuspensionEnding includes a parameter that indicates for which generation garbage was collected while the thread was suspended.


Hosts provide the CLR with an implementation of IHostGCManager using the standard technique employed for all of the hosting interfaces implemented by the host. Specifically, a host must do the following:

  1. Define a class that derives from IHostGCManager.

  2. Return an instance of that class when the CLR calls the host's implementation of IHostControl::GetHostManager, passing in the interface identifier for IHostGCManager (IID_IHostGCManager).
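The shape of such a class can be sketched as follows. This is a simplified illustration under stated assumptions, not a drop-in implementation: so that it compiles without the SDK headers, it uses a hypothetical stand-in interface (IHostGCManagerSketch) in place of the real IHostGCManager, a stand-in result type in place of HRESULT, and it omits the IUnknown methods that a real host object must also implement. The TrackingGCManager class name and its accessor methods are likewise hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using HResult = std::int32_t;   // stand-in for HRESULT
constexpr HResult kOk = 0;      // stand-in for S_OK

// Hypothetical stand-in for IHostGCManager. The real interface derives
// from IUnknown and is declared in the hosting headers.
struct IHostGCManagerSketch
{
    virtual ~IHostGCManagerSketch() = default;
    virtual HResult ThreadIsBlockingForSuspension() = 0;
    virtual HResult SuspensionStarting() = 0;
    virtual HResult SuspensionEnding(std::uint32_t generation) = 0;
};

// Records the notifications so that a scheduler can query them later.
class TrackingGCManager : public IHostGCManagerSketch
{
public:
    HResult ThreadIsBlockingForSuspension() override
    {
        ++blockingNotifications_;   // a thread is about to block
        return kOk;
    }

    HResult SuspensionStarting() override
    {
        suspended_.store(true);
        return kOk;
    }

    HResult SuspensionEnding(std::uint32_t generation) override
    {
        lastGeneration_.store(generation);
        suspended_.store(false);
        return kOk;
    }

    bool IsSuspended() const { return suspended_.load(); }
    std::uint32_t LastGeneration() const { return lastGeneration_.load(); }
    std::uint32_t BlockingNotifications() const { return blockingNotifications_.load(); }

private:
    std::atomic<bool> suspended_{false};
    std::atomic<std::uint32_t> lastGeneration_{0};
    std::atomic<std::uint32_t> blockingNotifications_{0};
};
```

The members are atomic because the CLR can deliver these notifications on different threads than the one on which the host's scheduler reads them. A real scheduler would likely do more than record state, for example by holding back tasks while IsSuspended returns true.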

I provide many more details on how to integrate the CLR with a custom scheduler in Chapter 14.



    Customizing the Microsoft  .NET Framework Common Language Runtime
    ISBN: 735619883
    EAN: N/A
    Year: 2005
    Pages: 119
