Introduction to the Profiling API | Debugging Applications for Microsoft® .NET and Microsoft Windows® (Pro-Developer)

The documentation and examples for the .NET Profiling API are not available from MSDN, but they are on your machine if you've installed Visual Studio .NET. The magic place is <Visual Studio .NET Installation Dir>\SDK\v1.1\Tools Developers Guide. In that directory, you'll find the Docs directory, which contains all the Word documents that describe everything from the Profiling API, to the Debugging API, to the Metadata API, as well as the complete ECMA specifications for the Common Language Infrastructure (CLI). The Samples directory contains examples of .NET compilers, Profiling API examples, and an assembly dependency walker. There are many hidden gems among the documents and examples, and if you're at all curious as to how things work in .NET, the Samples directory is an excellent place to start your research. The document that describes the Profiling API is, appropriately enough, Profiling.DOC.

There are two ways to do profiling. The first way is through a process called sampling, in which the profiler peeks at the profilee at a fixed interval of some number of milliseconds and checks what's running (hence the name sampling profiler). The other method is nonsampling, where the profiler monitors every call and return synchronously so that it can track everything that occurs in the profilee. The .NET Profiling API handles both types of profiling very easily. As I mentioned in the introduction to this chapter, the Profiling API allows you to do much more than simple profiling. Table 10-1 provides the complete list of items you can be notified about when you write a program using the Profiling API. It's relatively trivial to get these notifications, so you'll probably see all sorts of very neat tools in the future.

Table 10-1: Profiling API Support
Item	Notification Types
Run time	Managed execution (all threads) suspended and resumed, individual managed thread suspend and resume
AppDomain	Startup, shutdown
Assembly	Load, unload
Module	Load, unload, attach
Class	Load, unload
Function	JIT Compilation, cache function search, pitched (removed from memory), inlined, unload
Thread	Created, destroyed, assigned to an OS thread
Remoting	Client invocation, client message sending, client receiving reply, server receiving message, invocation, server sending reply
Transitions	Managed to unmanaged, unmanaged to managed, COM VTable creation, COM VTable destruction
Run time suspension	Suspend, suspend aborted, resume, thread suspended, thread resumed
Garbage collection	Object allocated, allocations by class, moved reference, object references, root references
Exception	Thrown, search, filter, catcher entered, catcher found, call OS handler, unwind function, unwind finally, CLR catcher found, CLR catcher executed

The interface you'll implement to write a profiler is ICorProfilerCallback. Although writing profilers in managed code would be wonderful, because of the architecture supported by the Profiling API, you can't. Your profiler runs in the address space of the managed application you're profiling. If you could use managed code, you'd end up in all sorts of extremely dangerous situations. For example, if you were notified of a garbage collection operation taking place and you needed to allocate managed memory to store the items being collected, you'd end up triggering recursive garbage collection. Needless to say, Microsoft's architects chose the smarter route, which allows for minimal impact. To support managed profilers, all the notifications would have to occur cross-process, which would really slow down the profilee.

Since profilers are just COM DLLs, anyone who's been doing Windows development since 2000 should be familiar with the concepts. I encapsulated all the drudge work into a library, ProfilerLib, that allows you to concentrate on the important stuff instead of messing around with the COM goo. I'll discuss ProfilerLib more in the next section. One key COM point I do want to make is that your profiler's COM code will be called in a completely free-threaded model, so you'll need to do all the work to protect your data structures from multithreaded corruption.
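To make the free-threading point concrete, here is a minimal sketch of a data structure that tolerates notifications arriving on several threads at once. The CallCounter class and its method names are hypothetical and are not part of ProfilerLib. The sketch also assumes a compiler with std::mutex; code of this era would more likely wrap a Win32 CRITICAL_SECTION, but the locking discipline is the same.

```cpp
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical helper (not part of ProfilerLib): counts how often each
// ID, for example a FunctionID, is reported to a notification method.
class CallCounter
{
public:
    // Safe to call from whichever thread delivers the notification.
    void Record(std::uintptr_t id)
    {
        std::lock_guard<std::mutex> lock(m_lock);
        ++m_counts[id];
    }

    // Returns 0 for an ID that has never been recorded.
    std::uint64_t CountFor(std::uintptr_t id) const
    {
        std::lock_guard<std::mutex> lock(m_lock);
        auto it = m_counts.find(id);
        return (it == m_counts.end()) ? 0 : it->second;
    }

private:
    // mutable so that the const reader can take the lock too.
    mutable std::mutex m_lock;
    std::map<std::uintptr_t, std::uint64_t> m_counts;
};
```

Every access to the map, reads included, goes through the same lock. Keeping the work done under the lock short matters, because the profilee's thread is stalled inside your callback while it waits.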

In the ICorProfilerCallback interface, the only two methods that are always required are Initialize and Shutdown. Initialize is the very first method called. You are passed an IUnknown interface, at which point you'll immediately query for the ICorProfilerInfo interface and store the returned interface so that you can request information about the profilee.

Many of the ICorProfilerCallback methods are passed an ID of some kind. You'll use your stored ICorProfilerInfo interface to change the ID into a useful value. For example, the ICorProfilerCallback::ModuleLoadFinished method is passed a ModuleID value, which is the ID of the module just loaded. To determine the module name as well as other useful information such as the load address and the assembly ID, you call the ICorProfilerInfo::GetModuleInfo method. Additional tasks you can perform with the interface methods include getting the metadata interfaces, forcing garbage collection, and starting up in-process debugging. I won't discuss everything about the ICorProfilerInfo interface but rather will refer you to the Profiling.DOC file for the whole scoop.

After saving the ICorProfilerInfo interface in your ICorProfilerCallback::Initialize method, your next step is to tell the CLR which notifications you're interested in seeing. The beauty of the ICorProfilerCallback system is that you'll get notified for only the items you request, so the CLR can minimize resource usage and run the profilee as fast as possible. To indicate which items you'd like notifications on, you'll call the ICorProfilerInfo::SetEventMask method, which takes a bit field indicating the particular items you're interested in.

Table 10-2 lists each bit flag you can set. Some values, those with a Yes entry in the Immutable column, can be set only during the ICorProfilerCallback::Initialize method call. If a notification bit flag is not immutable, you can toggle the notifications at any time while your profiler is running. If you want to see which notification flags are turned on, you can call the ICorProfilerInfo::GetEventMask method. Most flags are self-explanatory, but COR_PRF_ENABLE_OBJECT_ALLOCATED and COR_PRF_MONITOR_OBJECT_ALLOCATED need a little explanation. The former is set in your ICorProfilerCallback::Initialize method to indicate that you want the CLR to set itself up to monitor object allocation. The latter toggles the notification on and off.

Table 10-2: SetEventMask Notification Flags
Flag^[*]	Immutable	Description
ALL	Yes	Turn on all notification flags.
APPDOMAIN_LOADS	No	Notify on each AppDomain load or unload.
ASSEMBLY_LOADS	No	Notify on each assembly load or unload.
CACHE_SEARCHES	No	Notify whenever the install-time code finds functions that have been run through Native Image Generator (NGEN).
CCW	No	Notify on each COM-callable wrapper.
CLASS_LOADS	No	Notify on each class load or unload.
CLR_EXCEPTIONS	No	Notify on each internal CLR exception handling.
CODE_TRANSITIONS	Yes	Notify on each transition from managed to unmanaged code or the reverse.
DISABLE_INLINING	Yes	Turn off method inlining for the entire process. If left enabled (i.e., not set), inlining notifications come through the ICorProfilerCallback::JITInlining notification.
DISABLE_OPTIMIZATIONS	Yes	Force the JIT compiler to disable optimizations.
ENABLE_IN_PROC_DEBUGGING	Yes	Enable in-process debugging to be used with the Profiling API.
ENABLE_JIT_MAPS	Yes	Enable JIT-map tracking.
ENABLE_OBJECT_ALLOCATED	Yes	Set up the CLR to monitor object allocation from the garbage collected heap so that OBJECT_ALLOCATED notifications can be toggled on.
ENABLE_REJIT	Yes	Force rejitting of install-time (NGEN) code generation so that JIT notifications are enabled for those functions.
ENTERLEAVE	No	Call function entry and exit hooks.
EXCEPTIONS	No	Notify on each non-CLR exception (i.e., all general exceptions).
FUNCTION_UNLOADS	No	Notify when functions are being unloaded.
GC	Yes	Notify when a garbage collection is about to occur.
JIT_COMPILATION	No	Notify on each function just before and after it's JIT-compiled.
MODULE_LOADS	No	Notify on each module load and unload.
NONE	No	Send no notifications.
OBJECT_ALLOCATED	No	Notify on each object being allocated on the garbage collected heap.
REMOTING	Yes	Notify on each remoting context crossing.
REMOTING_ASYNC	Yes	Notify on each remoting asynchronous event.
REMOTING_COOKIE	Yes	Generate cookies so that the profiler can pair remoting callbacks.
SUSPENDS	No	Notify when the CLR is suspended.
THREADS	No	Notify on each thread creation and destruction.
^[*]COR_PRF_ or COR_PRF_MONITOR_ have been removed from flag names for clarity.
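The mutable/immutable split in Table 10-2 boils down to simple bit arithmetic. The following sketch models it with stand-in constants: the names and numeric values are placeholders I made up for illustration, and a real profiler would use the COR_PRF_ flags from the SDK's CORPROF.H header and pass the resulting mask to ICorProfilerInfo::SetEventMask. How the CLR actually reports an attempt to change an immutable flag is not modeled here.

```cpp
#include <cstdint>

using EventMask = std::uint32_t;

// Placeholder values for illustration only; the real ones come from CORPROF.H.
constexpr EventMask kModuleLoads           = 0x01; // stands in for COR_PRF_MONITOR_MODULE_LOADS
constexpr EventMask kExceptions            = 0x02; // stands in for COR_PRF_MONITOR_EXCEPTIONS
constexpr EventMask kObjectAllocated       = 0x04; // stands in for COR_PRF_MONITOR_OBJECT_ALLOCATED
constexpr EventMask kCodeTransitions       = 0x08; // stands in for COR_PRF_MONITOR_CODE_TRANSITIONS
constexpr EventMask kEnableObjectAllocated = 0x10; // stands in for COR_PRF_ENABLE_OBJECT_ALLOCATED

// The flags in this sketch that Table 10-2 marks as immutable.
constexpr EventMask kImmutableBits = kCodeTransitions | kEnableObjectAllocated;

// Returns true if changing the mask from current to requested leaves every
// immutable bit untouched, which is the rule that applies after Initialize.
inline bool IsLegalAfterInitialize(EventMask current, EventMask requested)
{
    return ((current ^ requested) & kImmutableBits) == 0;
}

// Turns allocation notifications on or off without disturbing other flags.
inline EventMask WithAllocationNotifications(EventMask current, bool enable)
{
    return enable ? (current | kObjectAllocated)
                  : (current & ~kObjectAllocated);
}
```

In Initialize you would set the immutable enable flag once; afterward you could read the current mask with GetEventMask, flip only the mutable monitor bit, and hand the result back to SetEventMask.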

Once you return S_OK from your ICorProfilerCallback::Initialize method, you'll receive the notifications you requested through the appropriate ICorProfilerCallback method. I'll discuss more about what you'll do with those in a moment because I want to make a point about the only other required method, the ICorProfilerCallback::Shutdown method.

If the process you're profiling starts life as a managed application, your Shutdown method will always be called. However, if your application starts running as a native application that loads the CLR, such as Visual Studio .NET, your Shutdown method will never be called. To fully handle your profiler being stopped, you'll need to process the DLL_PROCESS_DETACH flag in your profiler's DllMain and check whether your Shutdown method has been called. If it hasn't, you'll need to manually clean up, keeping in mind that because the application is ending, you need to be cognizant of what operations you perform. For an example of how to handle this situation, see the ExceptionMon code.
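One way to track whether Shutdown has already run is a small once-only guard shared by Shutdown and DllMain. This is a sketch under my own assumptions, not the ExceptionMon implementation: the CleanupGuard class is hypothetical, and it uses std::atomic where code of this era would more likely use InterlockedExchange.

```cpp
#include <atomic>

// Hypothetical helper: lets exactly one caller perform cleanup, whether
// Shutdown or the DLL_PROCESS_DETACH handler gets there first.
class CleanupGuard
{
public:
    // Returns true only for the first caller; that caller does the cleanup.
    bool TryBeginCleanup()
    {
        return !m_started.exchange(true);
    }

    // Reports whether some caller has already claimed the cleanup.
    bool CleanupStarted() const
    {
        return m_started.load();
    }

private:
    std::atomic<bool> m_started{false};
};

// Intended use (FullCleanup and MinimalCleanup are hypothetical functions):
//   In Shutdown:                     if (g_guard.TryBeginCleanup()) FullCleanup();
//   In DllMain, DLL_PROCESS_DETACH:  if (g_guard.TryBeginCleanup()) MinimalCleanup();
```

The detach path gets its own, smaller cleanup routine because the process is going away and only a limited set of operations is safe at that point.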

Other than the specific algorithms necessary to implement your particular profiler, the bulk of your work will be looking up the values passed to the different ICorProfilerCallback notification methods. Many of the notification methods are passed an ID value that you can use to retrieve the particular object information. These IDs, which are unique to the Profiling API, are simply the memory addresses of the items. Fortunately, the ICorProfilerInfo interface offers methods to help turn these IDs into real values. This generally involves calling the appropriate ICorProfilerInfo method, getting the metadata interface directly related to the ID, and using the metadata interface to do the heavy lifting.

Metadata refers to the data that describes each .NET object. Making objects self-describing with metadata is the crux of .NET. When doing managed development, the metadata is accessible through reflection. When doing native development that needs to access metadata, there's a reader interface, IMetaDataImport, and a writer interface, IMetaDataEmit. Most of the work you'll be doing with your profilers will involve reading data with IMetaDataImport. The IMetaDataEmit interface is what compilers use to create the metadata in a .NET compiled binary. The metadata interfaces are discussed in detail in the Metadata Unmanaged API.DOC file, so I'll refer you there because much of the metadata manipulation is pure grunt work.

Probably the best way to show you how to deal with the IDs and metadata is to show how to return the class and method name from a function ID. Function ID values are passed to numerous ICorProfilerCallback methods such as ExceptionUnwindFunctionEnter (to indicate which function is being unwound), JITCompilationFinished (to indicate which function was just JIT compiled), and ManagedToUnmanagedTransition (to indicate which function is transitioning to native code). The code in Listing 10-1 shows the GetClassAndMethodFromFunctionId method from ProfilerLib, which takes care of getting the class and method name from a function ID. As you can see, it's just a matter of grinding through the metadata interface.

Listing 10-1: GetClassAndMethodFromFunctionId

BOOL CBaseProfilerCallback ::
            GetClassAndMethodFromFunctionId ( FunctionID uiFunctionId ,
                                              LPWSTR     szClass      ,
                                              UINT       uiClassLen   ,
                                              LPWSTR     szMethod     ,
                                              UINT       uiMethodLen   )
{
    // The magic of metadata is how I'll find this information.

    // The return value.
    BOOL bRet = FALSE ;

    // The token for the function id.
    mdToken MethodMetaToken = 0 ;
    // The metadata interface.
    IMetaDataImport * pIMetaDataImport = NULL ;

    // Ask ICorProfilerInfo for the metadata interface for this
    // functionID
    HRESULT hr = m_pICorProfilerInfo->
                GetTokenAndMetaDataFromFunction ( uiFunctionId        ,
                                                  IID_IMetaDataImport ,
                                     (IUnknown**) &pIMetaDataImport   ,
                                                  &MethodMetaToken    );
    ASSERT ( SUCCEEDED ( hr ) ) ;
    if ( SUCCEEDED ( hr ) )
    {
        // The token for the class.
        mdTypeDef ClassMetaToken ;
        // The total chars copied.
        ULONG ulCopiedChars ;

        // Look up the method information from the metadata.
        hr = pIMetaDataImport->GetMethodProps ( MethodMetaToken ,
                                                &ClassMetaToken ,
                                                szMethod        ,
                                                uiMethodLen     ,
                                                &ulCopiedChars  ,
                                                NULL            ,
                                                NULL            ,
                                                NULL            ,
                                                NULL            ,
                                                NULL             ) ;
        ASSERT ( SUCCEEDED ( hr ) ) ;
        ASSERT ( ulCopiedChars < uiMethodLen ) ;
        if ( ( SUCCEEDED ( hr )             ) &&
             ( ulCopiedChars < uiMethodLen )   )
        {
            // Armed with the class meta data token, I can look up the
            // class.
            hr = pIMetaDataImport->GetTypeDefProps ( ClassMetaToken ,
                                                     szClass        ,
                                                     uiClassLen     ,
                                                     &ulCopiedChars ,
                                                     NULL           ,
                                                     NULL            ) ;
            ASSERT ( SUCCEEDED ( hr ) ) ;
            ASSERT ( ulCopiedChars < uiClassLen ) ;
            if ( ( SUCCEEDED ( hr )           ) &&
                 ( ulCopiedChars < uiClassLen )   )
            {
                bRet = TRUE ;
            }
            else
            {
                bRet = FALSE ;
            }
        }
        else
        {
            bRet = FALSE ;
        }
        pIMetaDataImport->Release ( ) ;
    }
    else
    {
        bRet = FALSE ;
    }

    return ( bRet ) ;
}

Getting Your Profiler Started

Up to now I've been discussing how profilers work but still haven't mentioned how you can get them started. Unfortunately, this process is the weakest link of the whole profiling system.

Two environment variables determine which profiler is loaded. The first environment variable you need to set to a nonzero value is Cor_Enable_Profiling, which tells the CLR that it's supposed to turn on profiling. The second environment variable is Cor_Profiler, which you'll set to either the CLSID or the ProgID of your profiler. The following shows how to set up the ExceptionMon profiler from the command line:

set Cor_Enable_Profiling=0x1
set COR_PROFILER={F6F3B5B7-4EEC-48f6-82F3-A9CA97311A1D}

Setting environment variables works great for Windows Forms and .NET console applications, but what about profiling Microsoft ASP.NET applications? Ah, there's the big rub. Since we're forced to set environment variables, you'll need to set the two environment variables in the system environment as shown in Figure 10-1 because that's where Microsoft Internet Information Services (IIS) and, in turn, ASPNET_WP.EXE/W3WP.EXE, will read the environment variables.

Figure 10-1: Setting the system environment variables

With Visual Studio .NET 2003 and the .NET Framework 1.1, you can restart IIS so that the new instance of ASPNET_WP.EXE picks up the new global environment variables. To restart IIS, bring up the Internet Information Services console, right-click on the machine name, point to All Tasks, and select Restart IIS from the shortcut menu. In the Start/Stop/Reboot dialog box, choose Restart Internet Services On <computer name> from the drop-down list and click OK. ASPNET_WP.EXE/W3WP.EXE won't start until you ask IIS for an ASP.NET application. To ensure your environment variables are set, you can use Process Explorer, which I introduced in Chapter 3: double-click on ASPNET_WP.EXE/W3WP.EXE in the upper window, and in the Properties dialog box, move to the Environment tab.

By setting the system environment variables, you'll run into another issue. Because these are system-wide, any process that loads the CLR is automatically profiled. That might be your intention, but with more and more processes loading the CLR, you can really get into trouble quickly. For example, if you have a bug in your profiler (which I know won't happen, but just humor me), and you go to debug your profiler with Visual Studio .NET, you'll also get your profiler loaded into Visual Studio .NET, which can end up ruining your debugging experience. One option is to use remote debugging, but the option I prefer is to set up another environment variable to indicate which process or processes you want your profiler to run in. That way you can check a specific environment variable at startup and determine whether you're supposed to run in that particular process. If you don't want to run in the process, simply call ICorProfilerInfo::SetEventMask, passing COR_PRF_MONITOR_NONE as the mask in your ICorProfilerCallback::Initialize method.

Since checking to see whether you want your profiler running in a particular process is such a common operation, I implemented the method CBaseProfilerCallback::SupposedToProfileThisProcess in ProfilerLib to do the check for you. Pass the name of the environment variable to check as the parameter, and the function will return TRUE under any of the following conditions:

The environment variable is not set, which assumes you want to profile all processes.
The value of the environment variable matches the current process drive, path, and name completely.
The value of the environment variable matches just the filename of the current process.
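The three conditions above can be sketched as a small, portable function. This is my own illustration of the matching rules, not the ProfilerLib source: the function names are hypothetical, the comparison is assumed to be case insensitive because Windows paths are, an empty value is treated the same as an unset variable, and only a single process name per variable is handled.

```cpp
#include <cctype>
#include <string>

// Case-insensitive string equality.
inline bool EqualsNoCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size())
        return false;
    for (std::string::size_type i = 0; i < a.size(); ++i)
    {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Returns the part of the path after the last slash or backslash.
inline std::string FileNameOf(const std::string& fullPath)
{
    std::string::size_type pos = fullPath.find_last_of("\\/");
    return (pos == std::string::npos) ? fullPath : fullPath.substr(pos + 1);
}

// envValue is the value of the controlling environment variable, or
// nullptr if it is not set. processPath is the full path of the
// current process.
inline bool ShouldProfileProcess(const char* envValue,
                                 const std::string& processPath)
{
    // Condition 1: variable not set (or empty), so profile everything.
    if (envValue == nullptr || *envValue == '\0')
        return true;

    const std::string wanted(envValue);
    // Condition 2: full drive, path, and name match.
    // Condition 3: just the filename matches.
    return EqualsNoCase(wanted, processPath) ||
           EqualsNoCase(wanted, FileNameOf(processPath));
}
```

In a real profiler the two inputs would come from something like std::getenv and the Win32 GetModuleFileName function. Taking them as parameters keeps the matching logic easy to test outside a profiled process.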

At this point, I want to end the basic introduction to the Profiling API. You can do quite a bit more with it, which I didn't cover. However, instead of making your eyes glaze over with detail after detail and leaving you to wonder about how you'd apply the Profiling API, I think the best approach to explaining it is to show you some of the more advanced features as they apply to solving problems. By the end of these two chapters that touch on the Profiling API, you'll have a much more in-depth understanding than you'd have from just the Profiling.DOC.