Using the Team Developer Profiler | Professional Visual Studio 2005 Team System (Programmer to Programmer)

The Team System developers have done a good job making the profiler in Team Developer easy to use. You follow four basic steps to profile your application:

1. Create a performance session, selecting a profiling method (sampling or instrumentation) and its target(s).
2. Use the Performance Explorer to view and set the session's properties.
3. Launch the session, executing the application and profiler.
4. Review the collected data as presented in performance reports.

Each step is described in the following sections.

Creating a sample application

Before describing how to profile an application, we'll create a sample application that you can use to work through the content of this chapter. Of course, this is only for demonstration, and you can certainly use your own existing applications instead.

Create a new C# Console Application and name it DemoConsole. This application will demonstrate some differences between using a simple class and a structure. First, add a new class file called WidgetClass.cs with the following class definition:

      namespace DemoConsole
      {
          public class WidgetClass
          {
              private string _name;
              private int _id;

              public int ID
              {
                  get { return _id; }
                  set { _id = value; }
              }

              public string Name
              {
                  get { return _name; }
                  set { _name = value; }
              }

              public WidgetClass(int id, string name)
              {
                  _id = id;
                  _name = name;
              }
          }
      }

Now we'll slightly modify that class to make it a value type. Make a copy of the WidgetClass.cs file named WidgetValueType.cs and open it. To make WidgetClass into a structure, change the word class to struct. Now rename the two places you see WidgetClass to WidgetValueType and save the file.

You should have a Program.cs already created for you by Visual Studio. Open that file and add the following code:

      using System;
      using System.Collections;

      namespace DemoConsole
      {
          class Program
          {
              static void Main(string[] args)
              {
                  ProcessClasses(2000000);
                  ProcessValueTypes(2000000);
              }

              public static void ProcessClasses(int count)
              {
                  ArrayList widgets = new ArrayList();
                  for (int i = 0; i < count; i++)
                      widgets.Add(new WidgetClass(i, "Test"));

                  string[] names = new string[count];
                  for (int i = 0; i < count; i++)
                      names[i] = ((WidgetClass)widgets[i]).Name;
              }

              public static void ProcessValueTypes(int count)
              {
                  ArrayList widgets = new ArrayList();
                  for (int i = 0; i < count; i++)
                      widgets.Add(new WidgetValueType(i, "Test"));

                  string[] names = new string[count];
                  for (int i = 0; i < count; i++)
                      names[i] = ((WidgetValueType)widgets[i]).Name;
              }
          }
      }

You now have a simple application that performs many identical operations on a class and a similar structure. For each type, it creates an ArrayList and adds 2,000,000 instances (WidgetClass in ProcessClasses, WidgetValueType in ProcessValueTypes). It then reads through the ArrayList, reading the Name property of each instance and storing that name in a string array. You'll see how the seemingly minor differences between the class and the structure affect the speed of the application, the amount of memory it uses, and its effect on the .NET garbage collection process.
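One difference worth keeping in mind while reading the profiling results: ArrayList stores its elements as object, so a structure added to it is boxed into a separate heap copy, and casting it back out copies it again. The following is a minimal sketch of that behavior, not part of the DemoConsole project. The PointValue and PointClass types are hypothetical stand-ins for WidgetValueType and WidgetClass.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for WidgetValueType.
public struct PointValue
{
    public int X;
    public PointValue(int x) { X = x; }
}

// Hypothetical stand-in for WidgetClass.
public class PointClass
{
    public int X;
    public PointClass(int x) { X = x; }
}

public static class BoxingDemo
{
    // Returns the X stored in the list after the original struct is changed.
    public static int StoredStructValue()
    {
        ArrayList list = new ArrayList();
        PointValue p = new PointValue(1);
        list.Add(p);                     // boxes: a copy of p goes on the heap
        p.X = 99;                        // changes only the local copy
        return ((PointValue)list[0]).X;  // unboxes the stored copy
    }

    // Returns the X stored in the list after the original object is changed.
    public static int StoredClassValue()
    {
        ArrayList list = new ArrayList();
        PointClass p = new PointClass(1);
        list.Add(p);                     // stores a reference to the same object
        p.X = 99;                        // visible through the list as well
        return ((PointClass)list[0]).X;
    }

    public static void Main()
    {
        Console.WriteLine(StoredStructValue()); // prints 1
        Console.WriteLine(StoredClassValue());  // prints 99
    }
}
```

The extra heap copies made for the structure are the kind of hidden allocation that the memory profiling options described later can surface.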

Creating a performance session

To begin profiling an application, you first need to create a performance session. This is normally done using the Performance Wizard, which walks you through the most common settings. You may also create a blank performance session or base a new performance session on a unit test result. Each of these methods is described below.

Using the Performance Wizard

The easiest way to create a new performance session is to use the Performance Wizard. Select Tools > Performance Tools > Performance Wizard. A three-step wizard will guide you through the creation of your session.

The first step, shown in Figure 12-1, is to select the target for your profiling session.

Figure 12-1

By default, the first entries in the list will be any projects in your current solution. You may also choose an executable, an assembly, or a local ASP.NET application. Select the DemoConsole application and click Next.

The second step, shown in Figure 12-2, is to select the profiling method you wish to use, Sampling or Instrumentation. You will usually want to begin with sampling for your applications, so select that for the DemoConsole profiling session.

Figure 12-2

The final step is a simple confirmation of your selections. Click Finish to complete the wizard and create your new performance session. Although you can now run your performance session, you may want to change some settings. We describe these settings in the section "Setting general session properties."

Adding a blank performance session

There may be times — for example, when you're profiling a Windows Service — when manually specifying all of the properties of your session is useful or necessary. In those cases, you can skip the Performance Wizard and manually create a performance session.

Create a blank performance session by selecting Tools > Performance Tools > New Performance Session. You will see a new performance session, named "Performance1," in the Performance Explorer window. This window is described in detail in the section "Performance Explorer" later in this chapter.

After creating the blank performance session, you will need to manually specify the profiling mode, target(s), and settings for the session. We describe performance session settings in the section "Setting general session properties."

Creating a performance session from a unit test

The third option for creating a new performance session is from a unit test. Refer to Chapter 14 for a full description of the unit testing features in Team System.

There may be times when you have a unit test that verifies the processing speed (perhaps relative to another method or a timer) of a target method. Perhaps a test is failing due to system memory issues. In such cases, you might want to use the profiler to determine what code is causing problems.

To create a profiling session from a unit test, first run the unit test. Then, in the Test Results window, right-click on the test and choose Create Performance Session from the context menu, as shown in Figure 12-3. A unit test does not have to fail for you to create a performance session from it.

Figure 12-3

Visual Studio will then create a new performance session with the selected unit test automatically assigned as the session's target. When you run this performance session, the unit test will be executed as normal, but the profiler will be activated and collect metrics on its performance.

Performance Explorer

Once you have created your performance session, you can view it using the Performance Explorer. The Performance Explorer, shown in Figure 12-4, is used to configure and execute performance sessions and to view the results of those sessions.

Figure 12-4

The Performance Explorer features two folders for each session: Targets and Reports. Targets specify which application(s) will be profiled when the session is launched. Reports list the results from each of the current session's runs. We describe these reports in detail later in this chapter.

Performance Explorer also supports multiple sessions. For example, you might have one session configured for sampling and another for instrumentation. We recommend you rename them from the default "Performance X" names for easier identification.

If you accidentally close a session in Performance Explorer, you can reopen it by using the Open option of the File menu. You will likely find the session file (ending with .psess) in your solution's folder.

Setting general session properties

Whether you used the Performance Wizard to create your session or added a blank one, you may want to review and modify the session's settings. Right-click on the session name (e.g., Performance Session.psess) and choose Properties (refer to Figure 12-4). You will see the Property Pages dialog for the session. It features several sections, described next.

Note

In this section, we focus on the property pages that are applicable to either type of profiling session. These include the General, Launch, Counters, and Events pages. The other pages each apply only to one type of profiling. The Sampling page is described in the section "Configuring a sampling session," and the Binary, Instrumentation, and Advanced pages are described in the section "Configuring an instrumentation session," later in the chapter.

General property page

The General page of the Property Pages dialog is shown in Figure 12-5.

Figure 12-5

The "Profiling collection" panel of this dialog reflects your chosen profiling type (i.e., Instrumentation or Sampling).

The ".NET memory profiling collection" panel enables the tracking of managed types. When the first option, "Collect .NET object allocation information," is enabled, the profiling system collects details about the managed types that are created during the application's execution. The profiler tracks the number of instances, the amount of memory used by those instances, and which members created the instances. Selecting the first option makes the second option, "Also collect .NET object lifetime information," available. If you select it, the profiler also collects details about how long each managed type instance remains in memory. This enables you to view further impacts of your application, such as its effect on the .NET garbage collector.

The options in the memory profiling panel are off by default. Turning them on adds substantial overhead and will cause both the profiling and report generation processes to take additional time to complete. When the first option is selected, the Allocation View of the session's report is available for review. The second option enables display of the Objects Lifetime View. These reports are described in the section "Reading and interpreting session reports."

Finally, you can use the Report panel to set the name and location of the reports that are generated after each profiling session. By default, a timestamp is used after the report name so you can easily see the date of the session run. Another default appends a number after each subsequent run of that session on a given day. (You can see the effect of these settings in Figure 12-11 later in the chapter, where multiple report sessions were run on the same day.)

Figure 12-11

For example, the settings in Figure 12-5 will run an instrumented profile without managed type allocation profiling. If run on November 19, 2005, it will produce a report named "SampleApp051119.vsp." Another run on the same day would produce a report named "SampleApp051119(1).vsp."
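The naming pattern in this example can be sketched in a few lines of C#. This is only an illustration of the pattern described above (base name, a yyMMdd timestamp, and a counter for repeat runs). The NextReportName helper is hypothetical and is not how Visual Studio itself generates the names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ReportNameSketch
{
    // Hypothetical helper: builds "<base><yyMMdd>.vsp" and appends "(n)"
    // when a report with that name already exists.
    public static string NextReportName(string baseName, DateTime runDate,
                                        ICollection<string> existing)
    {
        string stamp = runDate.ToString("yyMMdd", CultureInfo.InvariantCulture);
        string candidate = baseName + stamp + ".vsp";
        int n = 1;
        while (existing.Contains(candidate))
        {
            candidate = baseName + stamp + "(" + n + ").vsp";
            n++;
        }
        return candidate;
    }

    public static void Main()
    {
        List<string> existing = new List<string>();
        DateTime day = new DateTime(2005, 11, 19);

        string first = NextReportName("SampleApp", day, existing);
        existing.Add(first);
        string second = NextReportName("SampleApp", day, existing);

        Console.WriteLine(first);   // SampleApp051119.vsp
        Console.WriteLine(second);  // SampleApp051119(1).vsp
    }
}
```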

Launch property page

While our sample application has only one binary to execute and analyze, your projects may have multiple targets. In those cases, use the Launch property page to specify which targets should be executed when the profiling session is started or "launched." You can set the order in which targets will be executed using the Move Up and Move Down arrow buttons.

Targets are described in the section "Configuring session targets" later in this chapter.

Counters property page

The Counters property page, shown in Figure 12-6, is used to enable the collecting of CPU-related performance counters as your profiling sessions run. Enable the counters by checking Collect Chip Performance Counter Data. Then, select the counters you wish to track from the Available Counters list and click the right-pointing arrow button to add them to the Selected Counters list.

Figure 12-6

Events property page

The Events property page enables you to collect additional trace information from a variety of event providers. This can include items from Windows, such as disk and file I/O, as well as from the .NET Framework. If you're profiling an ASP.NET application, for example, you can collect information from IIS and ASP.NET.

Configuring session targets

If you used the Performance Wizard to create your session, you will already have a target specified. You can modify your session's targets with the Performance Explorer. Simply right-click on the Targets folder and choose Add Target Binary, or, if you have a valid candidate project in your current solution, Add Target Project. You can also add an ASP.NET website target by selecting Add Existing Web Site.

Each session target can be configured independently. Right-click on any target and you will see a context menu like the one shown in Figure 12-7.

Figure 12-7

Note

The properties of a target are different from those of the overall session, so be careful to right-click on a target and not the performance session's root node.

If the session's mode is instrumentation, an Instrument option will also be available. This indicates that when you run this session, that target will be included and observed. This option is not shown if your collection mode is set to sampling because sampling automatically observes any executing code.

The other option is Set as Launch. When you have multiple targets in a session, you should indicate which of the targets will be started when the session is launched. For example, you could have several assembly targets, each with launch disabled (unchecked), but one application EXE that uses those assemblies. In that case, you would mark the application's target with the Set as Launch property. When this session is launched, the application will be run and data will be collected from the application and the other target assemblies.

If you select the Properties option, you will see a Property Pages dialog for the selected target, as shown in Figure 12-8. Remember that these properties only affect the currently selected target, not the overall session.

Figure 12-8

If you choose Override Project Settings, you can manually specify the path and name of an executable to launch. You can provide additional arguments to the executable and specify the working directory for that executable as well.

Note

If the selected target is an ASP.NET application, this page will instead contain a "Url to launch" field.

The Instrumentation property page, shown in Figure 12-9, enables you to optionally indicate executables to run before and/or after the instrumentation process occurs for the current target. You may exclude the specified executable from instrumentation as well.

Figure 12-9

Note

Because instrumenting an assembly changes it, instrumenting signed assemblies will break them because the assembly will no longer match the signature originally generated. In order to work with signed assemblies, you need to add a post-instrument event, which calls to the strong naming tool, sn.exe. In the Command-line field, call sn.exe, supplying the assembly to sign and the keyfile to use for signing. You will also need to check the Exclude from Instrumentation option. Adding this step will sign those assemblies again, allowing them to be used as expected.

The Advanced property page is identical to the one under the General project settings. It is used to supply further command-line options to VSInstr.exe, the utility used by Visual Studio to instrument assemblies when running an instrumentation profiling session. You can see the available switches in the "Command-line Execution" section later in this chapter.

Configuring a sampling session

Sampling is a very lightweight method of investigating an application's performance characteristics. Sampling causes the profiler to periodically interrupt the execution of the target application, noting which code is executing and taking a snapshot of the call stack. When sampling completes, the report will include data such as the number of samples in which each function appeared. You can use this information to determine which functions might be bottlenecks or critical paths for your application, and then create an instrumentation session targeting those areas.

Because you are taking periodic snapshots of your application, the resulting view might be inaccurate if the duration of your sampling session is too short. For development purposes, you could set the sampling frequency very high, enabling you to obtain an acceptable view in a shorter time. However, if you are sampling against an application running in a production environment, you might wish to lower the sampling frequency to reduce the impact of profiling on the performance of your system. Of course, doing so will require a longer profiling session run to obtain accurate results.

By default, a sampling session will interrupt the target application every 10,000,000 clock cycles. If you open the session property pages and click the Sampling page, as shown in Figure 12-10, you may select other options as well.

Figure 12-10

You can use the Sampling Interval field to adjust the number of clock cycles between snapshots. Again, you may want a higher value, resulting in less frequent sampling, when profiling an application running in production, or a lower value for more frequent snapshots in a development environment. The exact value you should use will vary depending on your specific hardware and the performance of the application you are profiling.
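To get a feel for what a given interval means in practice, you can convert it to samples per second. The short sketch below does that arithmetic. The 2 GHz clock speed is an assumption chosen for illustration, so substitute the clock rate of your own hardware.

```csharp
using System;

public static class SamplingRateSketch
{
    // Samples per second = clock cycles per second / cycles between samples.
    public static double SamplesPerSecond(double clockHz, long intervalCycles)
    {
        return clockHz / intervalCycles;
    }

    public static void Main()
    {
        double clockHz = 2.0e9; // assumption: a 2 GHz processor

        // Default interval of 10,000,000 cycles.
        Console.WriteLine(SamplesPerSecond(clockHz, 10000000));   // 200
        // Lower interval: more frequent samples (development).
        Console.WriteLine(SamplesPerSecond(clockHz, 1000000));    // 2000
        // Higher interval: less frequent samples (production).
        Console.WriteLine(SamplesPerSecond(clockHz, 100000000));  // 20
    }
}
```

Under that assumption, the default interval yields about 200 samples per second, so a run of only a few seconds collects relatively few samples.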

Three other sampling methods are available. If you have an application that is memory intensive, you may try a session based on page faults. This causes sampling to occur when memory pressure triggers a page fault. From this, you will be able to get a good idea of which code is causing those memory allocations.

You can also sample based on system calls. In these cases, samples will be taken after the specified number of system calls (as opposed to normal user-mode calls) have been made. You may also sample based on a specific CPU performance counter (such as mispredicted branches or cache misses).

Note

These alternative sampling methods are used to identify very specific conditions, so sampling based on clock cycles is what you need most of the time.

Configuring an instrumentation session

Instrumentation is the act of inserting probes or markers in a target binary, which when hit during normal program flow cause the logging of data about the application at that point. This is a more invasive way of profiling an application, but because you are not relying on periodic snapshots, it is also more accurate.
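The idea of a probe can be illustrated by hand. The sketch below adds entry and exit calls to a method in source code and records a call count and elapsed ticks. This is only an analogy: the profiler inserts its probes into the compiled binary, and the ProbeSketch class, its Enter and Exit methods, and the Square function are all hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class ProbeSketch
{
    private static readonly Dictionary<string, int> CallCounts =
        new Dictionary<string, int>();
    private static readonly Dictionary<string, long> ElapsedTicks =
        new Dictionary<string, long>();

    // Hypothetical entry probe: counts the call and returns a start timestamp.
    public static long Enter(string function)
    {
        int count;
        CallCounts.TryGetValue(function, out count);
        CallCounts[function] = count + 1;
        return Stopwatch.GetTimestamp();
    }

    // Hypothetical exit probe: adds the time since Enter to the function's total.
    public static void Exit(string function, long start)
    {
        long total;
        ElapsedTicks.TryGetValue(function, out total);
        ElapsedTicks[function] = total + (Stopwatch.GetTimestamp() - start);
    }

    public static int Calls(string function)
    {
        int count;
        CallCounts.TryGetValue(function, out count);
        return count;
    }

    // A method "instrumented" by hand: probe on entry, probe on every exit path.
    public static int Square(int x)
    {
        long start = Enter("Square");
        try { return x * x; }
        finally { Exit("Square", start); }
    }

    public static void Main()
    {
        for (int i = 0; i < 1000; i++)
            Square(i);

        Console.WriteLine("Square calls: " + Calls("Square"));
        Console.WriteLine("Square ticks: " + ElapsedTicks["Square"]);
    }
}
```

Because every call passes through both probes, the call count is exact rather than estimated, but the probes themselves add work to each call, which is the overhead trade-off described above.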

Important

Instrumentation can quickly generate a large amount of data, so you should begin by sampling an application to find potential problem areas, or hotspots. Then, based on those results, instrument specific areas of code that require further analysis.

When you're configuring an instrumentation session, three additional property pages can be of use: Instrumentation, Binary, and Advanced. The Instrumentation tab is identical to the Instrumentation property page that is available on a per-target basis, shown in Figure 12-9. The difference is that the target settings are specific to a single target, whereas the session's settings specify executables to run before/after all targets have been instrumented.

The Binary property page is used to manage the location of your instrumented binaries. When you check Relocate Instrumented Binaries and specify a folder, Team System takes the original target binaries, instruments them, and places them in the specified folder.

For instrumentation profiling runs, Team System automatically calls the VSInstr.exe utility to instrument your binaries. Use the Advanced property page to supply additional options and arguments (such as /VERBOSE) to that utility. The available switches are described in the section "Command-line Execution" later in this chapter.

Executing a performance session

Once you have configured your performance session and assigned targets, you can execute, or launch, that session. Use the Performance Explorer window (refer to Figure 12-4), right-click on a specific session, and choose Launch.

Note

Before you launch your performance session, ensure that your project and any dependent assemblies have been generated in Release Configuration mode. Profiling a Debug build will not be as accurate because such builds are not optimized for performance and will have additional overhead.

Because Performance Explorer can hold more than one session, you will designate one of those sessions as the current session. By default, the first session is marked as current. This enables you to click the green launch button at the top of the Performance Explorer window to invoke that current session.

You may also run a performance session from the command line. For details, see the section "Command-line Execution" later in this chapter.

When a session is launched, you can monitor its status via the output window. You will see the output from each of the utilities invoked for you. If the target application is interactive, you can use the application as normal. When the application completes, the profiler will shut down and generate a report.

When profiling an ASP.NET application, an instance of Internet Explorer is launched, with a target URL as specified in the target's "Url to launch" setting. Use the application as normal through this browser instance and Team System will monitor the application's performance. Once the Internet Explorer window is closed, Team System will stop collecting data and generate the profiling report.

Important

You are not required to use the browser for interaction with the ASP.NET application. If you have other forms of testing for that application, such as web and load tests (described in Chapter 15), simply minimize the Internet Explorer window and execute those tests. When you're done, return to the browser window and close it. The profiling report will then be generated and will include usage data resulting from those web and load tests.

Managing session reports

When a session run is complete, a new session report will be added to the Reports folder for the executed session. For details about how to modify how these report files are generated, see the General property page description in the "Setting general session properties" section.

As shown in Figure 12-11, the Reports folder holds all of the reports for the executions of that session.

Double-click on a report file to generate and view the report. You can right-click on a report and select Export Report, which will display the Export Report dialog box shown in Figure 12-12. You can then select one or more sections of the report to send to a target file in XML or comma-delimited format. This can be useful if you have another tool that parses this data, or if you want to transform it via XSL into a custom report view.

Figure 12-12

The items contained in the Reports folders are simply data files. The generation of the reports from that data is performed as you open each file, so expect a delay when viewing a report, especially if it came from a long or highly sampled run.

Reading and interpreting session reports

A performance session report is composed of a number of different views. These views offer different ways to inspect the large amount of data collected during the profiling process. The data in many views are interrelated, and you will see that entries in one view can lead to further detail in another view. Note that some views will have content only if you have enabled optional settings before running the session.

The amount and kinds of data collected and displayed by a performance session report can be difficult to understand and interpret at first. In the following sections, we'll walk through each section of a report, describing its meaning and how to interpret the results.

In any of the report views, you can select which columns appear and their order by right-clicking in the report and selecting Choose Columns. Select the columns you wish to see and how you want to order them using the move buttons.

Report statistic types

The specific information displayed by each view will depend on the settings used to generate the performance session. Sampling and instrumentation will produce different contents for most views, and including .NET memory profiling options will affect the display as well. Before describing the individual views that make up a report, it is important to understand some key terms.

Elapsed time includes all of the time spent between the beginning and end of a given function. Application time is an estimate of the actual time spent executing your code, subtracting system events. Should your application be interrupted by another during a profiling session, elapsed time will include the time spent executing that other application, but application time will exclude it.

Inclusive time combines the time spent in the current function with time spent in any other functions that it may call. Exclusive time will remove the time spent in other functions called from the current function.
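The relationship between inclusive and exclusive time can be sketched with a small call tree. The function names and millisecond values below are invented for illustration, and the CallNode type is hypothetical. The only rule taken from the definitions above is that exclusive time is inclusive time minus the time spent in called functions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record of one function in a call tree.
public class CallNode
{
    public string Name;
    public double InclusiveMs;  // time from entry to exit, callees included
    public List<CallNode> Children = new List<CallNode>();

    public CallNode(string name, double inclusiveMs)
    {
        Name = name;
        InclusiveMs = inclusiveMs;
    }

    // Exclusive time = inclusive time minus inclusive time of direct callees.
    public double ExclusiveMs()
    {
        double childTotal = 0;
        foreach (CallNode child in Children)
            childTotal += child.InclusiveMs;
        return InclusiveMs - childTotal;
    }
}

public static class TimeSketch
{
    // Builds an invented tree: Main -> Process -> (Add, GetItem).
    public static CallNode BuildExample()
    {
        CallNode main = new CallNode("Main", 100);
        CallNode process = new CallNode("Process", 80);
        process.Children.Add(new CallNode("Add", 30));
        process.Children.Add(new CallNode("GetItem", 20));
        main.Children.Add(process);
        return main;
    }

    public static void Main()
    {
        CallNode main = BuildExample();
        CallNode process = main.Children[0];

        Console.WriteLine(main.ExclusiveMs());                 // 20
        Console.WriteLine(process.ExclusiveMs());              // 30
        Console.WriteLine(process.Children[0].ExclusiveMs());  // 30
    }
}
```

A function that calls nothing else, such as Add in this tree, has identical inclusive and exclusive times.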

Note

If you forget these definitions, hover your mouse pointer over the column headers and a tool tip will give you a brief description of the column.

Summary View

When you view a report, Summary View is displayed by default. There are two types of summary reports, depending on whether you ran a sampling or instrumented profile. Figure 12-13 shows a Summary View from a sampling profile of the DemoConsole application.

Figure 12-13

The sampled Summary View has two sections: Top Inclusive Sampled Functions and Top Exclusive Sampled Functions. The Samples column shows the number of times the application was actively executing the given function when samples were taken.

A function is considered exclusive if it is the actively running function when a sampling occurs. The functions that are higher than the current function on the call stack — in other words, the functions that called the current function — are counted as inclusive functions.
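The counting rule can be made concrete with a small sketch. Each sample below is a call stack listed from the outermost caller to the executing function. The stacks are invented for illustration, and the sketch assumes that the executing function also counts toward its own inclusive total, which is an assumption about the report rather than something stated above.

```csharp
using System;
using System.Collections.Generic;

public static class SampleCountSketch
{
    private static void Increment(Dictionary<string, int> counts, string key)
    {
        int current;
        counts.TryGetValue(key, out current);
        counts[key] = current + 1;
    }

    // Tallies inclusive and exclusive sample counts from a set of call stacks.
    public static void Count(IEnumerable<string[]> samples,
                             Dictionary<string, int> inclusive,
                             Dictionary<string, int> exclusive)
    {
        foreach (string[] stack in samples)
        {
            if (stack.Length == 0) continue;

            // The last entry was actively running: one exclusive sample.
            Increment(exclusive, stack[stack.Length - 1]);

            // Every distinct function on the stack gets one inclusive sample.
            HashSet<string> seen = new HashSet<string>();
            foreach (string function in stack)
                if (seen.Add(function))
                    Increment(inclusive, function);
        }
    }

    // Four invented samples, outermost caller first.
    public static List<string[]> ExampleSamples()
    {
        List<string[]> samples = new List<string[]>();
        samples.Add(new string[] { "Main", "ProcessClasses", "Add" });
        samples.Add(new string[] { "Main", "ProcessClasses" });
        samples.Add(new string[] { "Main", "ProcessValueTypes", "Add" });
        samples.Add(new string[] { "Main", "ProcessValueTypes", "Add" });
        return samples;
    }

    public static void Main()
    {
        Dictionary<string, int> inclusive = new Dictionary<string, int>();
        Dictionary<string, int> exclusive = new Dictionary<string, int>();
        Count(ExampleSamples(), inclusive, exclusive);

        Console.WriteLine("Main inclusive: " + inclusive["Main"]);  // 4
        Console.WriteLine("Add inclusive: " + inclusive["Add"]);    // 3
        Console.WriteLine("Add exclusive: " + exclusive["Add"]);    // 3
        Console.WriteLine("Main has exclusive samples: " + exclusive.ContainsKey("Main")); // False
    }
}
```

In this invented data, Main appears in every sample but is never the executing function, so it ranks high on the inclusive list and is absent from the exclusive list.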

Note

Notice that several of the entries aren't function names but names of DLLs, for example [mscorwks.dll]. This occurs when a function is sampled for which the system does not have debugging symbols. This frequently happens when running sampling profiles and occasionally with instrumented profiles. We describe this issue and how to correct it in the section "Common Profiling Issues" later in the chapter.

For the DemoConsole application, this view isn't showing much of interest. At this point, you would normally investigate the other views, but because the DemoConsole application is trivially small, sampling to find hotspots will not be as useful as the information you can gather using instrumentation. Let's change the profiling type to instrumentation and see what information is revealed.

In the Performance Explorer window, find the drop-down field on the toolbar that currently reads Sampling and change it to Instrumentation. Click the Launch button on the same toolbar to execute the profiling session, this time using instrumentation. When profiling and report generation are complete, you will see a Summary View similar to that shown in Figure 12-14.

Figure 12-14

The Summary View of an instrumented session has three sections. The first lists the three most commonly called functions, ordered by the number of calls. Next are the functions with the most individual work. These are the functions that required the most time, not counting the time spent in any other functions they called, similar to the concept of exclusive time. The final list is of the functions that took the longest, including all activity and system time, similar to elapsed inclusive time.

In this view, you can begin to see some interesting results. Note that ArrayList.Add and ArrayList.get_Item were each called 4,000,000 times. This makes sense because ProcessValueTypes and ProcessClasses each call those methods 2,000,000 times. However, in the Functions Taking Longest section, there is a noticeable difference between the amount of time spent in ProcessValueTypes and the amount spent in ProcessClasses. Remember that the code for each is basically the same; the only difference is that one works with structures and the other with classes. You can use the other views to investigate further.

Right-click on any function and you will be able to go to that function's source, view it in Functions View, or see the function in Caller/Callee View. You can double-click on any function to switch to Functions View. You can also select one or more functions, right-click, and choose Copy to add the name, time, and percentage of time to the Clipboard for pasting to other documents.

The Summary View has an alternate layout that is used when the ".NET memory profiling collection" options are enabled on the General page of the session properties. You can see this view in Figure 12-15.

Figure 12-15

Notice that the three main sections are different. The first section, Functions Allocating Most Memory, shows the top three functions in terms of bytes allocated to managed types. The second section, Types With Most Memory Allocated, shows the top three types by bytes allocated without regard to the functions involved. Finally, Types With Most Instances shows the top three types in terms of number of instances, without regard to the size of those instances.

Finally, notice in Figure 12-15 that a warning symbol appears at the bottom of the view, warning you that a large number of objects were collected in generation 2, leading to possible inefficiency. You'll see what this means in the section "Objects Lifetime View" later in this chapter.

Using these three views, you can quickly get a sense for the highest-use functions and types within your application. You'll see in the following sections how to use the other views to dive into further detail.

Functions View

From the Summary View of the instrumented profiling session, choose the DemoConsole.Program.ProcessValueTypes method and switch to Functions View. The Functions View, shown in Figure 12-16, lists all functions sampled or instrumented during the session. For instrumentation, this will be functions in targets that were instrumented and called during the session. For sampling, this will include any other members/assemblies accessed by the application.

Figure 12-16

As with most of the views, you can click on a column heading to sort by that column. This is especially useful for the four time columns. Right-clicking in the Functions View and selecting Group By Module will cause the functions to be grouped under their containing binary.

Reviewing the data for ProcessValueTypes and ProcessClasses, it seems ProcessValueTypes is clearly slower. However, there are some other interesting differences in performance here. Notice that both WidgetClass and WidgetValueType have a get_Name method. This is the Name property get accessor for each. Interestingly, according to Elapsed Exclusive and Application Exclusive time, the system spent over three times longer in the class's version than the structure's. You may wonder why these methods have identical values in the inclusive time columns and the exclusive time columns. Inclusive time is a measure of time spent in a function in addition to the time spent in child functions. Because the Name property get accessors call no other instrumented functions, the time is identical.

Another difference between the value type and class processing can be seen in the last column, Application Inclusive Time. This column shows all time spent in that function as well as functions that it calls, minus system overhead. In this column, the ProcessClasses method takes substantially longer than ProcessValueTypes. The difference is not explained solely by the Name property. The Application Inclusive Time column makes it clear that the constructor for the class also takes longer than the value type's constructor.

To reveal more information, double-click on the ProcessClasses method or right-click and choose Show in Caller/Callee View.

Caller/Callee View

Caller/Callee View, shown in Figure 12-17, displays a particular function in the middle, with the function(s) that call into it in the section above it and any functions that it calls in the bottom section.

Figure 12-17

This is particularly useful for tracing the execution flow of your application and identifying hotspots. In Figure 12-17, the ProcessClasses method is in focus, and the view shows that its only caller is the Main method. You can also see that ProcessClasses directly calls five functions. The sum of the times in the caller list will match the time shown for the function in focus (the "set" function). For example, select the ArrayList.get_Item accessor by double-clicking it or by right-clicking and choosing Set Function. The resulting window will then display a table similar to what is shown in Figure 12-18.

Figure 12-18

You saw ArrayList.get_Item in the main Functions View, but couldn't tell how much of its time resulted from calls by ProcessValueTypes and how much from calls by ProcessClasses. Caller/Callee View enables you to see this detail. Notice that there are two callers for this function, and that the sum of their times equals the time of the function itself. In this table, you can see that the ArrayList.get_Item method actually took about 67 percent longer to process the 2,000,000 requests from ProcessValueTypes than those from ProcessClasses.

What could account for this difference in performance? The only real difference between WidgetClass and WidgetValueType is that one is a reference type and one is a value type. Remember that ArrayList works by treating everything it contains as an object reference. In order for a value type to behave like an object, it must be boxed. Boxing allocates a new object on the managed heap and copies the value type's data into it. After boxing, the value looks like an object and can be used anywhere an object reference is necessary, such as when adding members to an ArrayList. To read items back from an ArrayList, the process must be reversed. This step, called unboxing, copies the data out of the boxed object so that the original value type can be accessed.

The performance impact of boxing and unboxing is typically minor, but when performed many times, such as with a large collection, the impact can be substantial, as you can see in Figure 12-18. In fact, the cost associated with boxing was a motivating factor for the addition of generics with .NET 2.0.
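As a rough sketch of where the boxing and unboxing happen, the following compares an ArrayList with a generic List<T>. SampleValue is a hypothetical stand-in for a structure such as WidgetValueType, and BoxingDemo and its methods are illustrative names only.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for a value type such as WidgetValueType.
public struct SampleValue
{
    public int ID;

    public SampleValue(int id)
    {
        ID = id;
    }
}

public static class BoxingDemo
{
    // ArrayList stores object references, so every Add boxes the
    // structure and every read must cast (unbox) it back.
    public static int SumWithArrayList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            list.Add(new SampleValue(i));       // boxing: copied into a heap object
        }

        int sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += ((SampleValue)list[i]).ID;   // unboxing: copied back out
        }
        return sum;
    }

    // List<SampleValue> stores the structures directly, so no boxing occurs.
    public static int SumWithGenericList(int count)
    {
        List<SampleValue> list = new List<SampleValue>();
        for (int i = 0; i < count; i++)
        {
            list.Add(new SampleValue(i));
        }

        int sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += list[i].ID;
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithArrayList(1000));   // prints 499500
        Console.WriteLine(SumWithGenericList(1000)); // prints 499500
    }
}
```

Both methods produce the same result; the difference lies in the extra heap allocations and copies performed by the ArrayList version, which is the kind of cost a profiling session can bring to light.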

Call Tree View

The Call Tree View, shown in Figure 12-19, shows a hierarchical view of the calls executed by your application. The concept is somewhat similar to the Caller/Callee View, but in this view a given function may appear more than once if it is called by different functions. For example, in Figure 12-19, System.Collections.ArrayList.Add appears under both the ProcessValueTypes and ProcessClasses nodes. If that same method were viewed in Caller/Callee View, it would appear once, with both parent functions listed at the top. You can quickly switch to Caller/Callee View by right-clicking on a function and choosing Show in Caller/Callee View. The same option is available for Functions View.

Figure 12-19

By default, the view will have a root (the function at the top of the list) of the entry point of the instrumented application. To quickly expand the details for any node, right-click and choose Expand All. Any function with dependent calls can be set as the new root for the view by right-clicking and choosing Set Root. This will modify the view to show that function at the top, followed by any functions that were called directly or indirectly by that function. To revert the view to the default, right-click and choose Reset Root.

Allocation View

If you configured your session for managed allocation profiling by choosing "Collect .NET object allocation information" on the General property page for your session, you will have access to the Allocation View, shown in Figure 12-20. This view displays the managed types that were created during the execution of the profiled application.

Figure 12-20

For each managed type, you can quickly see how many instances were created, the total bytes of memory used by those instances, and the percentage of overall bytes they consumed.

Expand any type to see the functions that caused the instantiations of that type. You will see the breakdown of instances by function as well, so if more than one function created instances of that type, you can quickly determine which created the most. This view is most useful when sorted by Total Bytes Allocated or Percent of Total Bytes. This quickly tells you which types are consuming the most memory when your application runs.

Note

An instrumented profiling session will track and report only the types allocated directly by the instrumented code. A sampling session may show other types of objects. This is because samples can be taken at any time, even while processing system functions, such as security. Try comparing the allocations from sampling and instrumentation sessions for the same project. You will likely notice more object types in the sampling session.

As with the other report views, you can also right-click on any function to switch to an alternative view such as source code, Functions View, or Caller/Callee View.

In the case of DemoConsole, the details in Figure 12-20 don't indicate any major discrepancies or items of concern. The bytes consumed and instances allocated by both branches of the application seem to be the same. However, we can dig a little deeper into how those instances affected the system by using the Objects Lifetime View, described next.

Objects Lifetime View

The Objects Lifetime View, shown in Figure 12-21, is available only if you have selected the "Also collect .NET object lifetime information" option on the General property page for your session. This option is available only if you have also selected the "Collect .NET object allocation information" option.

Figure 12-21

Important

The information in this view becomes more accurate the longer the application is run. If you are concerned about results you see, increase the duration of your session run to help ensure that the trend is accurate.

Several of the columns are identical to the Allocation View table, including Instances, Total Bytes Allocated, and Percent of Total Bytes. However, in this view you can't break down the types to show which functions created them. The value in this view lies in the details about how long the managed type instances existed and their effect on garbage collection.

The first three columns show the number of instances of each type that were collected during specific generations of the garbage collector. With COM, objects were immediately destroyed and memory freed when the count of references to that instance became zero. However, .NET relies on a process called garbage collection to periodically inspect all object instances to determine whether the memory they consume can be released. Objects are placed into groups, called generations, according to how many garbage collections each instance has survived. Generation zero contains new instances, generation one contains instances that have survived a collection, and generation two contains the oldest instances. New objects are more likely to be temporary or short in scope than objects that have survived previous collections, so having objects organized into generations enables .NET to more efficiently find objects to release when additional memory is needed.
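To watch an object move through the generations, you can ask the runtime which generation it currently reports for a given instance. The following is a minimal sketch using GC.GetGeneration and GC.Collect; GenerationDemo and TrackGenerations are hypothetical names, and forcing collections this way is for demonstration only. The exact values printed depend on the runtime, although a surviving object is typically promoted after each collection.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation reported for one object when it is created
    // and again after each of two forced collections.
    public static int[] TrackGenerations()
    {
        object survivor = new object();
        int[] generations = new int[3];

        generations[0] = GC.GetGeneration(survivor);

        GC.Collect();   // forced collection, for demonstration only
        generations[1] = GC.GetGeneration(survivor);

        GC.Collect();
        generations[2] = GC.GetGeneration(survivor);

        // Keep the reference alive so the object is not collected early.
        GC.KeepAlive(survivor);
        return generations;
    }

    public static void Main()
    {
        int[] g = TrackGenerations();
        Console.WriteLine("Generations observed: {0}, {1}, {2} (highest is {3})",
            g[0], g[1], g[2], GC.MaxGeneration);
    }
}
```

An object that stays referenced across collections, like the items held in a long-lived collection, follows the same path and ends up in the oldest generation.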

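The next column is Large Object Heap Instances Collected. Large objects (roughly 85,000 bytes or more) are allocated on a separate heap and receive different treatment in the garbage collection process: they are reclaimed only during generation-two collections, and the memory they occupy is not normally compacted.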

The last two columns in Figure 12-21 are Instances Alive At End and Instances. The latter is the total count of instances of that type over the life of the profiling session. The former indicates how many instances of that type were still in memory when the profiling session terminated. This might be because the references to those instances were held by other objects. It may also occur if the instances were released right before the session ended, before the garbage collector acted to remove them. Having values in this column does not necessarily indicate a problem; it is simply another data item to consider as you evaluate your system.

It is normal to see a large number of instances collected in generation zero, fewer in generation one, and the fewest in generation two. Any other pattern suggests there might be an opportunity to reduce the scope of some variables. For example, a class field that is used by only one of that class's methods could be changed to a local variable inside that method. The object it references would then need to remain alive only while that method is executing.
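The field-to-local change might look like the following sketch. The class names are hypothetical and chosen only for illustration; both versions produce the same output, but in the second the StringBuilder becomes eligible for collection as soon as Build returns.

```csharp
using System;
using System.Text;

// Before: the buffer is a field, so it remains reachable for as long as
// the containing instance does, even though only Build uses it.
public class ReportBuilderWithField
{
    private StringBuilder _buffer = new StringBuilder();

    public string Build(string[] lines)
    {
        _buffer.Length = 0;   // clear any content left from a previous call
        foreach (string line in lines)
        {
            _buffer.AppendLine(line);
        }
        return _buffer.ToString();
    }
}

// After: the buffer is a local variable, so nothing references it
// once Build has returned.
public class ReportBuilderWithLocal
{
    public string Build(string[] lines)
    {
        StringBuilder buffer = new StringBuilder();
        foreach (string line in lines)
        {
            buffer.AppendLine(line);
        }
        return buffer.ToString();
    }
}

public static class ScopeDemo
{
    public static void Main()
    {
        string[] lines = new string[] { "first", "second" };
        string a = new ReportBuilderWithField().Build(lines);
        string b = new ReportBuilderWithLocal().Build(lines);
        Console.WriteLine(a == b);   // prints True
    }
}
```

Whether such a change is worthwhile depends on how long the containing object lives; the profiler data is what tells you whether it matters in practice.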

Looking at the data generated from the DemoConsole application, you can see a number of interesting things. First, the WidgetClass instances are all collected in generation two. By itself, this doesn't indicate a problem, but it does mean that the WidgetClass instances survived collection at least twice. This is likely because the instances of WidgetClass were being referenced by the ArrayList throughout the first part of the program's execution. Once the program began processing the value types, the garbage collector could begin reclaiming the memory allocated to the WidgetClass instances.

Second, note that the WidgetValueType instances were collected in small numbers in generations zero and one, and more were collected in generation two. Most importantly, notice that about 75 percent of them were never collected by the garbage collector and were alive at the end of the session. The instances counted here are the boxed copies created when the value types were added to the ArrayList; unboxed value types live on the stack or inline within their containing object, not as separate objects on the managed heap where the garbage collector does its work. The survivors are probably boxed copies that the ArrayList was still referencing when the session ended.

As with the data shown in the other report views, treat the data in this view not as a definitive indicator of problems, but as a pointer to places where improvements might be realized. Also, keep in mind that with small or quickly executing programs, allocation tracking might not have enough data to provide truly meaningful results.