Advanced Rule Development | Debugging Microsoft .NET 2.0 Applications

As it turns out with rule development, you can approach your rule in several different ways. In this section, I'll show you those ways. What's amazed me about rule development is how little code you end up writing to produce some very complicated rules.

DoNotUseTraceAssertRule and CallAssertMethodsWithMessageParametersRule Rules

The two easiest-to-understand rules are the DoNotUseTraceAssertRule and CallAssertMethodsWithMessageParametersRule from the Wintellect.FxCop.UsageRules.dll assembly. As I discussed in the section "Assertions in .NET", in Chapter 3, you never want to call Trace.Assert in your application because it will remain in release builds, and having a customer staring at an assertion is a recipe for disaster. I also mentioned in that chapter how you should always use Debug. Assert with the message parameters so you can get more information in the assertion reporting than just that the assertion failed. Because those are on my list of rules that should be added to the Design Guidelines, I wanted a Code Analysis rule to ensure that you didn't violate them.

I started with the CallAssertMethodsWithMessageParametersRule because I thought it might be the easiest to develop. When you override the Check method that takes a Member parameter, you can convert the parameter into a Method, which has an Instructions property containing all the IL instructions as an array through the InstructionList class. As I was prototyping up the rule, I saw that each Instruction has a Value property defined as an Object. Using the great feature in the Watch window that shows the real type for anything in an Object value, for CALL and CALLVIRT instructions, the Value field was a Method. That turns out to be the method being called, and just as viewing the disassembly in ILDASM, its FullName property contains the complete name of the method. In about five minutes, I had the Check method implementing the rule as shown here:

public override ProblemCollection Check ( Member member ) {     // Safely convert the member into a method.     Method method = member as Method;     if ( null != method )     {         // It's simple enough to walk the instructions.         for ( int i = 0 ; i < method.Instructions.Length ; i++ )         {             OpCode currOp = method.Instructions [ i ].OpCode;             // Is this a call?             if ( ( currOp == OpCode.Call ) ||                  ( currOp == OpCode.Callvirt ) )             {                 Method callee = method.Instructions [ i ].                                                         Value as Method;                 if ( null != callee )                 {                     // Is this a call to Debug.Assert with a single                     // parameter?                     if ( callee.FullName ==                      "System.Diagnostics.Debug.Assert(System.Boolean)" )                     {                         // Yep. Report the error.                         Resolution res = GetResolution (                                   RuleUtilities.Format ( method ) );                         Problem prob = new Problem ( res );                         Problems.Add ( prob );                     }                 }             }         }     }     return ( Problems ); }

When I turned to the DoNotUseTraceAssertRule, it was obvious that the nearly identical approach would work. In fact, the only difference is that instead of looking for just a single overload, I needed to look for all System.Diagnostics.Trace.Assert methods. Fortunately, the Method type inherits from Member, which has a DeclaringType field, which will give you the namespace, and the Name field, which is an Identifier, will give you just the name of the method. This rule took all of two minutes to develop.

DoNotLockOnPublicFields, DoNotLockOnThisOrMe, DoNotLockOnTypes, and DoNotUseMethodImplAttributeWithSynchronized Rules

If you read my good friend Jeffrey Richter's seminal book, CLR via C#, Second Edition (Microsoft Press, 2005), you'd probably agree that his discussion in the section "The Monitor class and SyncBlocks" in Chapter 24, "Thread Synchronization," is both funny and sad at the same time. He discusses the "Great Idea" the CLR team had for how the designers thought SyncBlocks should work. Unfortunately, the way they were implemented was botched, and Jeffrey exposes all sorts of problems if you use the SyncBlocks as Microsoft intended. The way Jeffrey recommends to fix these issues is to do your SyncBlock locking with a private instance field in the class. When I read that section of the chapter, I immediately saw that all the problems Jeffrey mentioned should be Code Analysis rules to ensure that you don't accidentally expose those issues with your synchronization.

I first tackled the DoNotUseMethodImplAttributeWithSynchronized rule to look for the bad case of using MethodImplAttribute with the MethodImplAttributes.Synchronized. That's bad because static methods cause you to lock on all instances of that type, and instance methods will lock on the this/Me pointer. I coded up the rule to look for MethodImplAttribute attributes on the method using the techniques I used in the AssembliesHave*AttributesRules, but it didn't work. Poking around a bit with Reflector and ILDASM, I realized that there's something special about how MethodImplAttribute is handled by the compiler and the CLR because it is not through standard attributes. Fortunately, I also found that the Method type has a property, ImplFlags, which holds the Microsof.Cci.MethodImplFlags, which map directly to the real MethodImplAttribute. I simply ANDed the Method.ImplFlags with MethodImplFlags.Synchronized and would know if the invalid attribute was used so I could report the error.

When I turned to testing the rule on as much code as I could find, it turned out that there was a major problem with the rule. There was no bug in my code, but there's a problem in some automatically generated code. When you declare an event handler in your class, under the covers, you're actually generating two methods, add and remove, for your event that allow callers to hook up their event handlers. That's what makes it trivial to use the C# += operator on an event.

Sadly, that automatically generated code has the MethodImplFlags.Synchronized value plastered on both of those methods to do the synchronization. In an ideal world, everyone would implement their own add and remove methods for events and properly handle their synchronization. Alas, that's not going to happen in reality. Even though the auto-generated code is an error, seeing several dozen events always being flagged will cause people to turn off the rule, which is the worst thing that can happen in a diagnostic tool.

To make the rule more usable, I had to add code that specifically looked for the automatically generated event add and remove methods. The pattern is a simple regular expression: (add_|remove_).+\(.+EventHandler. If I see that pattern in the method's full name, I won't flag the method as an error. If you want more background information on why you want to explicitly do your own event add and remove methods, see the section "Explicitly Controlling Event Registration and Unregistration" in Chapter 10 of CLR via C#, Second Edition.

Whereas looking for an attribute is relatively easy, I figured that because the other three rules would require figuring out the parameter to Monitor.Enter, they would take a little more work, which they certainly did. My first thought was to look at the possibilities of "visiting" the appropriate methods. The BaseIntrospectionRule derives from the Microsoft.Cci.StandardVisitor, which has methods that enable you to ask the Introspection engine to walk the IL instructions for you. For example, if you were interested only in binary expressions, you'd override the virtual VisitBinaryExpression method, and your code gets called back when encountering them. With the Visit* methods, you can save yourself a huge amount of code because you don't have to do the manual analysis of every single instruction.

I coded up a prototype that overrode VisitMethodCall to see if I could tease out the parameter passed to Monitor.Enter in order to do the analysis on it. Unfortunately, the parameter data showed only that the data was an Object, which is the exact parameter that Monitor.Enter takes. When I looked at the IL in Reflector, I realized that the VisitMethodCall was working as it should because the code prior to the call had already done whatever conversion was necessary to force the data into an Object type. That's when I realized that I was going to have to do some IL analysis all on my own to determine the exact parameter types.

I whipped up some code in C# and Visual Basic that did all the synchronization wrong to see if I could find a pattern to the IL generation so I could look for that to produce the errors. After compiling the code every possible way to ensure that there weren't any optimization differences, I found the patterns. For example, if you your code makes a call to lock(this), the following code is produced:

The C# IL that indicates the this/Me pointer is passed is the following: 1. ldarg.0 2. dup 3. stloc.0 4. call  void [mscorlib]System.Threading.Monitor::Enter(object) For Visual Basic, the IL looks like the following: 1. ldarg.0 2. stloc.0 3. ldloc.0 4. call void [mscorlib]System.Threading.Monitor::Enter(object)

The actual instruction that puts the this pointer on the stack is LDARG.0. It turns out that in the case of lock(this) and lock(<public field>), three instructions before the CALL to Monitor.Enter is the key data about what's being passed to the Monitor.Enter call. A call to lock(<type>) was slightly different code in that the third instruction before is the CALL to Type.GetTypeFromHandle, and the fourth instruction before the CALL to Monitor.Enter is loading the token to the actual class on the stack.

Because these patterns are so similar, I developed a base class, BaseLockAnalysisRule, in .\Wintellect.FxCop.UsageRules\UsageRules.cs that does all the work to look for calls to Monitor .Enter, and if found, calls an abstract method, RuleLogic, which does the particular rule's problem reporting. In Listing 8-4, I extracted the BaseLockAnalysisRule.Check method so you could see the common work. Also in the listing is the DoNotLockOnPublicFields.RuleLogic method as an example. The RuleLogic method for DoNotLockOnThisOrMe and DoNotLockOnTypes are self-explanatory.

Listing 8-4. BaseLockAnalysis.Check and DoNotLockOnPublicFields.RuleLogic Methods

[View full width]
  //////////////////////////////////////////////// //////////////////   // BaseLockAnalysis.Check method (and helpers)  shared by   // DoNotLockOnPublicFields, DoNotLockOnThisOrMe,  and DoNotLockOnTypes   //////////////////////////////////////////////// //////////////////   public override ProblemCollection Check ( Member  member )   {       // Safely convert the member to a method.       Method method = member as Method;       if ( null != method )       {           // Save off if this is a Visual Basic  module.           Boolean isVisualBasic = RuleUtilities .IsVisualBasicModule (                                    method .DeclaringType.DeclaringModule );           // It turns out that calling the  VisitMethodCall does not give           // sufficient information about the  parameter. I'll look through           // the IL so I can extract the actual  item being passed. This           // is a little messy and works for C#  and Visual Basic. It may           // not work for other languages.           for ( int i = 0 ; i < method .Instructions.Length ; i++ )           {               OpCode currOp = method.Instructions  [ i ].OpCode;               // Is this a call?               if ( ( currOp == OpCode.Call ) ||                    ( currOp == OpCode.Callvirt ) )               {                   Method callee = method .Instructions [ i ].                                                            Value as Method;                   if ( null != callee )                   {                       // Is this a call to Monitor .Enter?                       if ( callee.FullName ==                          "System.Threading.Monitor .Enter(System.Object)" )                       {                           // Get the location of  the call to Monitor.Enter                           // so it can be output  in any error message.                           SourceContext ctx =  method.Instructions [ i ].                                                              SourceContext;                           if ( true == isVisualBasic )                           {                                HandleVisualBasicCode ( method , i , ctx );                           }                           else                           {                               HandleCSharpCode (  method , i , ctx );                           }                       }                   }               }           }       }       return ( Problems );   }   private void HandleVisualBasicCode ( Method method ,                                        int  callToEnter ,                                         SourceContext sourceContext )   {       // If two instructions above the call to  Monitor.Enter is a call       // to Microsoft.VisualBasic.CompilerServices .ObjectFlowControl::       // CheckForSyncLockOnValueType(object), the  actual parameter is       // passed to that item.       // Here's an example of the IL from a Visual  Basic method:       // L_0002: ldarg.0       // L_0003: ldfld object VBErrors.VBErrors .Class1::obj       // L_0008: stloc.0       // L_0009: ldloc.0       // L_000a: call void [Microsoft .VisualBasic]Microsoft.VisualBasic.       //                      CompilerServices .ObjectFlowControl::       //                                CheckForSyncLockOnValueType(object)       // L_000f: nop       // L_0010: ldloc.0       // L_0011: call void [mscorlib]System .Threading.Monitor::       //                                                      Enter(object)       int callIndex = 2;       Instruction callCheck = method.Instructions  [ callToEnter -                                                                callIndex ];       // If it's an NOP, it's a debug build, so go  back one more       // instruction.       if ( callCheck.OpCode == OpCode.Nop )       {           callIndex++;           callCheck = method.Instructions [  callToEnter - callIndex ];       }       Method callCheckMethod = callCheck.Value as  Method;       if ( null != callCheckMethod )       {           if ( true == callCheckMethod.FullName.                               Contains (  "CheckForSyncLockOnValueType" ) )           {               // The actual parameter is what's  passed to               // CheckForSyncLockOnValueType,  which I can handle in the               // C# code logic.               HandleCSharpCode ( method ,                                  callToEnter -  callIndex ,                                  sourceContext );           }           else           {               // There's no  CheckForSyncLockOnValueType call, so the               // code is generated like C#.               HandleCSharpCode ( method ,  callToEnter , sourceContext );           }       }       else       {           // If it's not a call instruction, treat  the code as normal.           HandleCSharpCode ( method , callToEnter  , sourceContext );       }   }   private void HandleCSharpCode ( Method method ,                                   int callToEnter ,                                   SourceContext  sourceContext )   {       Instruction fourthBack = null;       if ( ( callToEnter - 4 ) >= 0 )       {           fourthBack = method.Instructions [  callToEnter - 4 ];       }       Instruction thirdBack = method.Instructions  [ callToEnter - 3 ];       // Let the derived class do its magic.       RuleLogic ( fourthBack , thirdBack , method  , sourceContext );   }   //////////////////////////////////////////////// //////////////////   // DoNotLockOnPublicFields.RuleLogic Method   //////////////////////////////////////////////// //////////////////   protected override void RuleLogic ( Instruction  lockParameterOne ,                                       Instruction  lockParameterTwo ,                                       Method method ,                                        SourceContext sourceContext )   {       // This rule relies only on lockParameterTwo.       Debug.Assert ( null != lockParameterTwo ,                      "null != lockParameterTwo" );       if ( null == lockParameterTwo )       {           throw new ArgumentNullException (  Resources.LockParamTwoNull );       }       // The C# IL looks like the following:       // 1. ldarg.0       // 2. ldfld object UsageRulesLockTests .Class1::obj       // 3. dup       // 4. stloc.0       // 5. call void [mscorlib]System.Threading .Monitor::Enter(object)       // The Visual Basic IL looks like the  following. The caller takes       // care of getting the ldfld value from  CheckForSyncLockOnValueType       // before calling this method.       // 1. ldarg.0       // 2. ldfld object VBErrors.VBErrors.Class1::obj       // 3. stloc.0       // 4. ldloc.0       // 5. call void [Microsoft .VisualBasic]Microsoft.VisualBasic.       //                  CompilerServices .ObjectFlowControl::       //                                CheckForSyncLockOnValueType(object)       // 6. nop <-- Only there in a debug build!       // 7. ldloc.0       // 8. call void [mscorlib]System.Threading .Monitor::Enter(object)       Field field = null;       // Is this a ldfld instruction?       if ( lockParameterTwo.OpCode == OpCode.Ldfld )       {           // Grab the value out of the instruction.           field = lockParameterTwo.Value as Field;           if ( null != field )           {               // If the field is visible outside  the assembly, it's a               // public, so flag the error.               if ( true == field .IsVisibleOutsideAssembly )               {                   // Yep. Report the error.                   String fmt = RuleUtilities .Format ( method );                   Resolution res = GetResolution (  fmt , field.FullName );                   Problem prob = new Problem ( res  , sourceContext );                   Problems.Add ( prob );               }           }       }   }   }

When I first developed the DoNotLockOnPublicFields.RuleLogic method, I missed the IsVisibleOutsideAssembly property, so I went through a huge amount of work to look up the assembly containing the type so I could get the TypeNode for the field to determine if it was public. After struggling through some very tough crashes from deep inside the Introspection engine, I was wondering if I would be able to get the rule working. Following my own advice in "The Debugging Process" section of Chapter 1, "Bugs: Where They Come From and How You Solve Them," I thought creatively and walked away for a while. While working on another rule, I stumbled on the IsVisibleOutsideAssembly property in Reflector and realized that solved all the problems.

Such is the fun of working completely without documentation. When the future version of Visual Studio that comes with the new analysis engine and full documentation is released, it won't be as challenging to develop rules. The good news is that today you have rules that work to find very nasty SyncBlock issues so you can make sure that your code avoids them.

AvoidBoxingAndUnboxingInLoops Rule

As has been beaten into the head of nearly every developer using .NET, converting from value types to objects (boxing) and objects to value types (unboxing) is a bad thing to do frequently because it can kill your performance. However, unless you are disassembling your binary to text and manually searching for the BOX and UNBOX instructions after every build, it's hard to know exactly when boxing and unboxing happen. An occasional BOX instruction is unavoidable, such as when you pass an integer as a parameter to String.Format, but having that BOX or UNBOX instruction inside a loop can be a major performance killer.

When I first developed AvoidBoxingAndUnboxingInLoops from the Wintellect.FxCop.PerformanceRules.dll assembly, I thought that all I would have to do was look through the instructions for the method, and if I found any BOX or UNBOX instructions, I'd report the error. I liked that approach, because it turned out that implementing the rule was as simple as the following code snippet:

private void WalkInstructionsManually ( Method memberMethod ) {     // Look at each instruction in the method.     for ( int branchIndex = 0 ;           branchIndex < memberMethod.Instructions.Length ;           branchIndex++ )     {         switch ( memberMethod.Instructions [ branchIndex ].OpCode )         {             // Is it a box?             case OpCode.Box:                 boxInstuctCount++;                 break;             // Is it any kind of unbox?             case OpCode.Unbox:             case OpCode.Unbox_Any:                 unboxInstructCount++;                 break;         }     } }

The only problem with simply looking for the raw instructions is that you end up with more false positives than I felt you should. When talking to people who used the rule written with the above loop, they found it rather annoying and had a tendency to ignore it. My next thought was to report an error only if there were three BOX instructions and one UNBOX instruction. That eliminated many of the false positive warnings, but it came back to bite me on a big performance-tuning job.

The client had slow code, and one of my standard tricks is to run Code Analysis or the stand-alone FxCop across the code with my rules enabled. The analysis reported nothing out of the ordinary, so we rolled up our sleeves and started digging like crazy through performance counters in PerfMon and other performance tools. After several days of hard digging, we found that the problem was a single BOX instruction inside a critical loop. Changing a structure, which is a value type, into a class gave us the performance boost the client needed. I felt horrible that the client had to wait several days for me to fix the problem when a smarter rule would have narrowed this problem down in minutes. I immediately started looking at how I could report only when I saw BOX or UNBOX instructions that occurred in any looping construct.

Handling all the looping constructs (do, for, foreach, and while) was going to be a big job. However, I saw that the Microsoft.Cci.StandardVisitor from the Code Analysis Microsoft.Cci.dll had very cool methods called VisitDoWhile, VisitFor, VisitForEach, and VisitWhile, so I immediately overrode each of them and ran through a quick test to see what I would have to do to find the BOX and UNBOX instructions. Sadly, it turns out that those sexy methods are never called. In fact, when I looked at FxCop 1.35, which was released after Visual Studio 2005, those methods no longer appear on the Microsoft.Cci.StandardVisitor class. That meant that I was going to have to do the instruction stream analysis myself.

After hours of writing small methods and disassembling everything in sight, I found the patterns in IL generation that indicate a loop versus a conditional. Conditionals will move to offsets forward in the IL stream, whereas branches move backwards in the stream. Now that I had the patterns, it was possible to look for the BOX and UNBOX instructions that occur inside just the loops. Listing 8-5 shows all the code for the AvoidBoxingAndUnboxingInLoops rule. In the code are the comments that describe every action I performed.

The main item to notice in the code is that I took a bit of the easy way out in the algorithm in which I analyze loops. I don't do any special processing for nested loops, so if you have a BOX in an inner loop, it will be reported as an error for both the inner and outer loops. When I sketched out what it would take to handle all the recursion to account for them, I felt that it was easiest to look through each loop even though I might be looking through various instruction sections multiple times.

Listing 8-5. AvoidBoxingAndUnboxingInLoops Rule Code

[View full width]
  /*------- ----------- ----------- -------------------------------------------------    * Debugging Microsoft .NET 2.0 Applications    * Copyright © 1997-2006 John Robbins -- All  rights reserved.    ----------------------------------------------- ------------------------------*/   using System;   using System.Collections;   using System.Globalization;   using System.Xml;   using Microsoft.Cci;   using Microsoft.FxCop.Sdk;   using Microsoft.FxCop.Sdk.Introspection;   using System.Diagnostics;   using System.Diagnostics.CodeAnalysis;   namespace Wintellect.FxCop.PerformanceRules   {       /// <summary>       /// The rule to check for box and unbox  instructions inside loops.       /// </summary>       [CLSCompliant ( false )]       public class AvoidBoxingAndUnboxingInLoops :  BasePerformanceRule       {           /// <summary>           /// Gets the base class hooked up.           /// </summary>           public AvoidBoxingAndUnboxingInLoops ( )               : base (  "AvoidBoxingAndUnboxingInLoops" )           {           }           /// <summary>           /// Sets the visibility to all methods.           /// </summary>           public override TargetVisibilities  TargetVisibility           {               get { return ( TargetVisibilities .All ); }           }           /// <summary>           /// Checks for excessive box and unbox  instructions inside loops.           /// </summary>           /// <param name="member">           /// The method to check.           /// </param>           /// <returns>           /// null  - The assembly has a box/unbox  problem in loops.           /// !null - The error message to report.           /// </returns>           public override ProblemCollection Check  ( Member member )           {               // Safely convert the member to a  method.               Method method = member as Method;               if ( null != method )               {                   // Look at each instruction in  the method.                   for ( int iCurr = 0 ;                         iCurr < method .Instructions.Length ;                         iCurr++ )                   {                       // Get the current instruction.                       Instruction ins = method .Instructions [ iCurr ];                       // Is this a looping branch?                       if ( true == IsLoopingBranch  ( ins ) )                       {                           // Calculate the index  into this array of instructions                           // where the branch goes.                           int branchToIndex =  FindBranchOffsetIndex ( iCurr ,                                                              method.Instructions );                           // Now I've got the  starting and ending indexes of the                           // loop range. Run  through and look for all BOX and                           // UNBOX instructions.  In case you're wondering, my                           // algorithm here does  not pay attention to nested loops.                           // In other words, if  there are two nested loops, I'll                           // be looking through  the inner loop code twice. I                           // looked at trying to  handle nested loops, but it                           // quickly degenerates  into a nasty situation in which                           // I'd have to create  numerous data structures to track                           // which indexes in the  instruction array I've already                           // looked at and  processed. In the big scheme of things,                           // I thought this  simpler route was more than sufficient                           // to do the work.                           if ( true ==  HasBoxOrUnboxInLoop ( branchToIndex ,                                                               iCurr ,                                                             method.Instructions ) )                           {                               // Got an error.                               // Put the name in  the same format as FxCop does.                               String typeName =  RuleUtilities.Format ( method );                               // Get the one  resolution.                               Resolution res =  GetResolution ( typeName );                               Problem prob = new  Problem ( res );                               Problems.Add ( prob );                           }                       }                   }               }               return ( Problems );           }           private static bool HasBoxOrUnboxInLoop  ( int branchToIndex ,                                                      int branchIndex ,                                                   InstructionList instructionList )           {               for ( int i = branchToIndex ; i <  branchIndex ; i++ )               {                   switch ( instructionList [ i ] .OpCode )                   {                       // Is it a box?                       case OpCode.Box:                       // Is it any kind of unbox?                       case OpCode.Unbox:                       case OpCode.Unbox_Any:                           // Just one of these is  bad enough to trigger the error.                           return ( true );                       //break;                       default:                           break;                   }               }               return ( false );           }           private static int FindBranchOffsetIndex  ( int branchIndex ,                                                   InstructionList instructionList )           {               // Get the offset we're looking for.               int branchToOffset =  (int)instructionList [ branchIndex ].Value;               int iCurr = branchIndex - 1;               // Loop backwards until we find the  top offset for the loop.               while ( iCurr > 0 )               {                   if ( branchToOffset ==  instructionList [ iCurr ].Offset )                   {                       return ( iCurr );                   }                   iCurr--;               }               Debug.Assert ( iCurr != 0 , "iCurr  != 0" );               return ( 0 );           }           // I can't really help the complexity  here. The switch statement in the           // method is the problem.           [SuppressMessage ( "Microsoft .Maintainability" ,                              "CA1502 :AvoidExcessiveComplexity" )]           private static Boolean IsLoopingBranch (  Instruction ins )           {               Boolean retValue = false;               switch ( ins.OpCode )               {                   // Is this a branching instruction?                   case OpCode.Bge:                   case OpCode.Bge_Un:                   case OpCode.Bgt:                   case OpCode.Bgt_Un:                   case OpCode.Ble:                   case OpCode.Ble_Un:                   case OpCode.Blt:                   case OpCode.Blt_Un:                   case OpCode.Bne_Un:                   case OpCode.Brfalse:                   case OpCode.Brfalse_S:                   case OpCode.Brtrue:                   case OpCode.Brtrue_S:                       {                           // From *lots* of  staring at all sorts of IL code                           // generation, the  pattern is that conditionals (if                           // statements jump  forward in the IL stream), whereas                           // loops (for, do, while , foreach) move backwards.                           // Therefore, if the  Value field, which is the branch                           // to offset, is less  than this instruction's offset,                           // we're looking at a loop.                           int branchToOffset =  (int)ins.Value;                           if ( branchToOffset <  ins.Offset )                           {                               retValue = true;                           }                       }                       break;               }               return ( retValue );           }       }   }

ExceptionDocumentationInvalidRule and ExceptionDocumentationMissingRule Rules

The final rules I want to discuss are the most important in my opinion. As you could tell from Chapter 2, I'm a huge believer in XML documentation, especially the <exception> tag. I wanted rules that would tell you when you were missing that tag in your documentation for your direct throws and if you documented an exception but didn't actually throw the exception.

Poking through the Visit* methods, I saw that the VisitThrow method looked as if it fit the bill, but I wasn't so sure it would work because every other time I tried to use the Visit* methods, they didn't work. Fortunately, in this case, they do work. As you've seen, the Node class, which is the base type for all parameters to the Visit* methods, contains lots of descriptive information about operands, values, and other important items you'll need to build great analysis rules.

The Throw class, which is passed to the VisitThrow method, has an Expression property that can access the TypeNode that's being thrown in the Type property. Consequently, the statement Throw.Expression.Type.FullName is all that it takes to get the type being thrown, regardless of whether it's a throw followed by a new exception type or a throw of a local variable.

Of course, there was no way developing this rule was going to be straightforward. When I started the work to process the RETHROW instructions, the Expression property was null, so I had a problem finding the type. I could have given up, deciding not to handle rethrows, but I really felt they were important to document. After a good bit of poking around, I realized that I needed to look at the catch blocks in the method, which contain information on the exception type to be rethrown, to see if I could match up the rethrow to its actual block.

Getting the exception handlers for a method is trivial because the Introspection engine hands them to you right in the Method.ExceptionHandlers property. One thing I should tell you at this point about the Introspection engine metadata reader is that it's rather lazyit won't read in the metadata until it is absolutely necessaryso I accessed only Method.ExceptionHandlers in the VisitThrow method. The act of visiting gets the metadata all read in.

Although having all the catch blocks was great, I wasn't sure how I was going to exactly match a RETHROW instruction to the catch block where it occurred. After some poking around, I found a very interesting property, UniqueKey, that all Node values contain. Node is the root base class of most of the types in Microsoft.Cci namespace. I wondered how unique it was because if everything provided by the Introspection engine (from individual instructions through higher constructs) had a unique value, I could look for the particular RETHROW instruction inside each catch block's instructions so I could match them up. Sure enough, that's exactly how everything works out.

Of course, there was one last trip-up. The ExceptionHandler contains the first basic block of the entire catch statement and the block after the catch block ends. It does not have any of the blocks between those two points. To make things even more bizarre, there's no way that I found to loop through the instructions starting at an arbitrary place in the stream. This means that unless the UniqueKey property for the rethrow was in the first basic block of the catch statement, I wasn't going to be able to figure out the type. Therefore, it was going to make the rule almost worthless in my opinion.

After nearly rubbing my head bald trying to figure out how to get the blocks between the start of the catch and the end, I finally realized that sometimes the only way is the brute-force way. What I needed to do was save the method being processed, and when looking for the rethrow UniqueKey, start at the first statement in the method and then skip to the start of the catch block. Once there, I could look for the UniqueKey value indicating the rethrow. It's not very pretty, but it certainly works. Considering that the entire rule-writing system is undocumented, I'm quite happy about how the solution turned out. Listing 8-6 shows the Check, VisitThrow, and IsRethrowInCatchBlock methods from the BaseExceptionDocumentationRule class, which is the base class for ExceptionDocumentationInvalidRule and ExceptionDocumentationMissingRule.

Listing 8-6. BaseExceptionDocumentationRule.Check, BaseExceptionDocumentationRule.VisitThrow, and BaseExceptionDocumentationRule.IsRethrowInCatchBlock methods

[View full width]
  //////////////////////////////////////////////// //////////////////   // BaseExceptionDocumentationRule.Check Method   //////////////////////////////////////////////// //////////////////   public override ProblemCollection Check ( Member  member )   {       if ( null == member )       {           throw new ArgumentNullException (  Constants.InvalidMember );       }       Method memberMethod = member as Method ;       if ( null == memberMethod )       {           return ( null ) ;       }       // Start by saving off the method under  evaluation.       currentMethod = memberMethod ;       // Allocate a new array list to fill with  exceptions for this       // method.       methodExceptions = new ArrayList ( ) ;       // Get the declaring module so I can look up  the XML Documentation       // file.       String mod = memberMethod.DeclaringType .DeclaringModule.Location ;       // Get the XML Doc Comment file for this module.       XmlDocCommentsFileDictionary currXmlDocs =                                              OpenModuleXmlDocFile ( mod ) ;       // Start the walking of this method's code.  I'm interested only in       // the body of the method, not all the  attributes and such, so I'll       // walk just the method body instead of the  whole thing.       VisitBlock ( memberMethod.Body ) ;       // Now that pounding through the IL is done,  get the documented       // exception tags for this method into an  array list.       ArrayList docdExceptions = null ;       if ( null == currXmlDocs )       {           docdExceptions = new ArrayList ( ) ;       }       else       {           // Get the documented exceptions for  this method. Notice           // that Method.DocumentationId.Name  contains the XML Doc           // Comment decorated name. However, the  name returned by           // Cci is correct for straight methods  but not for properties,           // indexers, or nested items, so I need  to correct it.           String xmlName = FixCciDocumentationName (                                       memberMethod .DocumentationId.Name);           docdExceptions = currXmlDocs .GetExceptionTypesThrown(xmlName);       }       // Because the RuleLogic methods could use  ArrayList.BinarySearch, I       // need to sort the elements in both arrays  before passing them on.       methodExceptions.Sort ( ) ;       docdExceptions.Sort ( ) ;       // Ask the derived type to do its logic. If  there are any items in       // the problemExceptTypes array, those are  the ones with problems.       ArrayList problemExceptTypes = RuleLogic (  methodExceptions ,                                                   docdExceptions    ) ;       // Were there any problems to report?       if ( problemExceptTypes.Count > 0 )       {           // Yep, create the Problem report.           foreach ( String exVal in  problemExceptTypes )           {               Resolution res =                   GetResolution ( RuleUtilities .Format ( memberMethod ),                                   exVal                                );               Problem prob = new Problem ( res ) ;               Problems.Add ( prob ) ;           }       }       return ( Problems ) ;   }   //////////////////////////////////////////////// //////////////////   // BaseExceptionDocumentationRule.VisitThrow Method   //////////////////////////////////////////////// //////////////////   public override Statement VisitThrow ( Throw Throw )   {       if ( null == Throw )       {           throw new ArgumentNullException (  Constants.InvalidThrowType );       }       // Nothing like a check that's a palindrome!  Here I'm checking if       // the Throw parameter is a THROW or a  RETHROW instruction.       if ( Throw.NodeType == NodeType.Throw )       {           // The easy case! Grab the Expression .Type field, and I've got           // the type being thrown.           methodExceptions.Add ( Throw.Expression .Type.FullName ) ;       }       else if ( Throw.NodeType == NodeType.Rethrow )       {           // A little harder case. There's no type  associated with the           // rethrow instruction, so I need to  grind through the           // exception handlers and match up this  instruction to the           // appropriate handle and extract the  type that way.           // Get the ExceptionHandlerList from the  method. The reason I           // don't do this in the Check method is  that the FxCOP metadata           // reader is lazy and won't read the  metadata unless necessary.           // If I save off the  ExceptionHandlerList *BEFORE* calling           // VisitBlock, *AND* this is the only  rule run, the metadata           // reader hasn't filled it out yet. By  grabbing the           // ExceptionHandler list here, I'm  assured that it's always           // filled in.           ExceptionHandlerList exceptionHandlerList =                                          currentMethod.ExceptionHandlers;           // Just to make sure....           Debug.Assert ( exceptionHandlerList .Length > 0 ,                          "exceptionHandlerList .Length > 0" );           // Signals I've found the appropriate block.           bool foundType = false;           // Loop through the exception handlers.           for ( int i = 0 ; i <  exceptionHandlerList.Length ; i++ )           {               ExceptionHandler exh =  exceptionHandlerList [ i ];               // I care only if it's a catch block.               if ( NodeType.Catch == exh.HandlerType )               {                   // Do the hard work of looking  through the whole catch                   // block.                   foundType =  IsRethrowInCatchBlock ( Throw.UniqueKey ,                                                        exh );                   if ( true == foundType )                   {                       methodExceptions.Add ( exh .FilterType.FullName );                       break;                   }               }           }           Debug.Assert ( true == foundType , "true  == foundType" );       }   #if DEBUG       // Just because I'm paranoid...       else       {           Debug.Assert ( false , "Invalid Throw  type!!" );       }   #endif       return ( base.VisitThrow ( Throw ) ) ;   }   //////////////////////////////////////////////// //////////////////   // BaseExceptionDocumentationRule .IsRethrowInCatchBlock Method   //////////////////////////////////////////////// //////////////////   private Boolean IsRethrowInCatchBlock ( int  rethrowId ,                                            ExceptionHandler exh )   {       // Here's the deal: The ExceptionHandler has  just the *start* block       // in it, not all the blocks. Fortunately,  it also has the block       // that occurs after the catch. This method  will get all the blocks       // between those ranges (excluding the  start) and look to see if       // rethrow's unique ID appears between those  ranges. If so, it       // returns true.       // This whole gyration is necessary because  there's no way to ask       // for the next block.       // The big assumption here is that the code  blocks are in logical       // order. I feel that's a pretty safe  assumption to make. ;)       // For a bit of an optimization, I'll go  ahead and look through the       // exh.HandlerStartBlock for the rethrow.  This will find those       // simple cases [catch(...){ throw; }] and  avoid grinding through       // the big list of Statements looking for  the particular catch.       StatementList startBlockStatements =                            exh.HandlerStartBlock .Statements;       for ( int quickLookStatement = 0 ;             quickLookStatement <  startBlockStatements.Length ;             quickLookStatement++ )       {           if ( rethrowId ==                    startBlockStatements [  quickLookStatement ].UniqueKey )           {               // Got it!               return ( true );           }       }       // The throw is not in the first basic block , so grind through all       // the blocks in the catch.       // Save off the unique ids for the starting  block and the after the       // catch block.       int startBlockKey = exh.HandlerStartBlock .UniqueKey;       int afterBlockKey = exh.BlockAfterHandlerEnd .UniqueKey;       // We have to start at the beginning  statement in the method and       // grind all the way through.       int currentMethodStatement = 0 ;       // Run through until we find the starting block.       for ( ;             currentMethodStatement < currentMethod .Body.Statements.Length ;             currentMethodStatement++ )       {           if ( startBlockKey ==currentMethod.Body.                          Statements [  currentMethodStatement ].UniqueKey )           {               break;           }       }       // I've already looked at the current block  above, so I'll just bump       // right on past it here.       currentMethodStatement++;       // Loop through all the blocks in the catch  statement until I find       // the block after the catch.       for ( ;             currentMethodStatement < currentMethod .Body.Statements.Length ;             currentMethodStatement++ )       {           Block currBlock = (Block)                  currentMethod.Body.Statements [  currentMethodStatement ];           // Have I reached the end?           if ( currBlock.UniqueKey == afterBlockKey )           {               // Yep, leave.               return ( false );           }           // Run through the instructions in this  block looking for my           // matching rethrow.           int currInstruction = 0;           for ( ;                 currInstruction < currBlock .Statements.Length ;                 currInstruction++ )           {               if ( currBlock.Statements [  currInstruction ].UniqueKey ==                                                                 rethrowId )               {                   return ( true );               }           }       }       // This is pretty serious!       return ( false );   }

Once I figured out the secret to getting all the information I needed, the actual work on the rules was easy. I wrote a class, .\Wintellect.FxCop.DesignRules\XmlDocCommentsFileDictionary.cs, which makes handling the XML documentation file for an assembly quite easy. For the ExceptionDocumentationMissingRule, after I've found all throw and rethrow types, I build another list of all the documented throws from the XML documentation file. I remove all the items documented from the throw list, and any left over are the missing documentation items. Since ExceptionDocumentationInvalidRule is the opposite action, I put all the core logic in the BaseExceptionDocumentationRule class so the work in the actual rule classes is quite simple.