Protecting the Code | Inside Microsoft .NET IL Assembler

Protecting the Code

Thus far, we could have been quite confident that nothing bad would happen when we called the unmanaged function sscanf from the managed code, so we simply called it. But who knows what terrible dangers lurk in the deep shadows of unmanaged code? I don’t. So we’d better take steps to make sure that our application behaves in an orderly manner. For this purpose, we can employ the mechanism of structured exception handling, well known to C++ and Visual C# .NET programmers.

Examine the following light modifications of the sample (source file Simple2.il). As before, the modifications are marked with the comment CHANGE!.

    //----------- Program header
    .assembly extern mscorlib { }
    .assembly OddOrEven  { }
    .module OddOrEven.exe
    //----------- Class declaration
    .namespace Odd.or {
        .class public auto ansi Even extends [mscorlib]System.Object {
    //----------- Field declaration
            .field public static int32 val
    //----------- Method declaration
            .method public static void check( ) cil managed {
                .entrypoint
                .locals init (int32 Retval)
            AskForNumber:
                ldstr "Enter a number"
                call void [mscorlib]System.Console::WriteLine(string)
                .try {  // CHANGE!
                    // Guarded block begins
                    call string [mscorlib]System.Console::ReadLine()
                    // pop  // CHANGE!
                    // ldnull // CHANGE!
                    ldstr "%d"
                    ldsflda int32 Odd.or.Even::val
                    call vararg int32 sscanf(string,string,...,int32*)
                    stloc.0
                    leave.s DidntBlowUp  // CHANGE!
                    // Guarded block ends
                } // CHANGE!
                // CHANGE! --->
                catch [mscorlib]System.Exception
                { // Exception handler begins
                    pop
                    ldstr "KABOOM!"
                    call void [mscorlib]System.Console::WriteLine(string)
                    leave.s Return
                } // Exception handler ends
            DidntBlowUp:
            // <--- CHANGE!
                ldloc.0
                brfalse.s Error
                ldsfld int32 Odd.or.Even::val
                ldc.i4.1
                and
                brfalse.s ItsEven
                ldstr "odd!"
                br.s PrintAndReturn
            ItsEven:
                ldstr "even!"
                br.s PrintAndReturn
            Error:
                ldstr "How rude!"
            PrintAndReturn:
                call void [mscorlib]System.Console::WriteLine(string)
                ldloc.0
                brtrue.s AskForNumber
            Return:  // CHANGE!
                ret
            } // End of method
        } // End of class
    } // End of namespace
    //----------- Calling unmanaged code
    .method public static pinvokeimpl("msvcrt.dll" cdecl)
        vararg int32 sscanf(string,string) cil managed { }

What are these changes? One involves enclosing the “dangerous” part of the code in the scope of the so-called try block (or guarded block), which prompts the runtime to watch for exceptions thrown while executing this code segment. The exceptions are thrown if anything out of order happens—for example, a memory access violation or a reference to an undefined class or method.

    .try {
        // Guarded block begins
        call string [mscorlib]System.Console::ReadLine()
        ldstr "%d"
        ldsflda int32 Odd.or.Even::val
        call vararg int32 sscanf(string,string,...,int32*)
        stloc.0
        leave.s DidntBlowUp
        // Guarded block ends
    }

Note that the try block ends with the instruction leave.s DidntBlowUp. This instruction—leave.s being a short form of leave—switches the computation flow to the location marked with the label DidntBlowUp. We cannot use a branching instruction here because, according to the rules of the common language runtime exception handling mechanism, strictly enforced by the JIT compiler, the only legal way out of a try block is via a leave instruction.

This limitation is caused by an important function performed by the leave instruction: before switching the computation flow, it unwinds the stack (strips off all the items currently on the stack) and, if these items are references to object instances, disposes of them. That is why we need to store the value returned by the sscanf function in the local variable Retval before using the leave instruction; if we tried to do it later, the value would be lost.
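The save-before-leave rule can be sketched with a small stand-alone program. It is not one of the book's samples: the assembly name LeaveDemo and the local variable Result are made up for illustration, and System.Int32::Parse stands in for sscanf so that no unmanaged call is needed.

```il
.assembly extern mscorlib { }
.assembly LeaveDemo { }          // hypothetical assembly name
.module LeaveDemo.exe

.method public static void main() cil managed
{
    .entrypoint
    .locals init (int32 Result)  // hypothetical local, plays the role of Retval
    .try {
        ldstr "42"
        call int32 [mscorlib]System.Int32::Parse(string)
        stloc.0                  // save the result BEFORE leaving the block
        leave.s Done             // leave empties the evaluation stack
    }
    catch [mscorlib]System.Exception {
        pop                      // discard the exception reference
        ldc.i4.m1                // record -1 as a failure marker
        stloc.0
        leave.s Done
    }
Done:
    ldloc.0                      // the saved value survives the leave
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
}
```

If stloc.0 were moved to a point after the label Done, the parsed value would no longer be on the stack by the time control arrived there, because leave.s discards whatever the stack holds.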

catch [mscorlib]System.Exception indicates that we plan to intercept any exception thrown within the protected segment and handle this exception:

    {
        leave.s DidntBlowUp
        // Guarded block ends
    }
    catch [mscorlib]System.Exception
    { // Exception handler begins
        pop
    }

Because we are intercepting any exception, we specified a generic managed exception type ([mscorlib]System.Exception), a type from which all managed exception types are derived. Technically, we could call [mscorlib]System.Exception the “mother of all exceptions,” but the proper term is somewhat less colloquial: the “inheritance root of all exceptions.”

Mentioning another, more specific, type of exception in the catch clause (for example, [mscorlib]System.NullReferenceException) would indicate that we are prepared to handle only this particular type of exception and that exceptions of other types should be handled elsewhere. This approach is convenient if you want to have different handlers for different types of exceptions, and it’s the reason this mechanism is referred to as structured exception handling.
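As an illustration of type-specific handlers, here is a minimal sketch with two catch clauses attached to one guarded block. The assembly name CatchDemo and the messages are invented for the example. The null reference is provoked deliberately by calling a String method on null, which is an assumption of this sketch rather than something taken from the sample.

```il
.assembly extern mscorlib { }
.assembly CatchDemo { }          // hypothetical assembly name
.module CatchDemo.exe

.method public static void main() cil managed
{
    .entrypoint
    .try {
        ldnull                   // a null reference in place of a string
        callvirt instance int32 [mscorlib]System.String::get_Length()
        pop                      // never reached: the call above throws
        leave.s Done
    }
    // The more specific type is listed first
    catch [mscorlib]System.NullReferenceException {
        pop
        ldstr "Null reference caught"
        call void [mscorlib]System.Console::WriteLine(string)
        leave.s Done
    }
    // Anything else falls through to the general handler
    catch [mscorlib]System.Exception {
        pop
        ldstr "Some other exception caught"
        call void [mscorlib]System.Console::WriteLine(string)
        leave.s Done
    }
Done:
    ret
}
```

The order of the clauses matters in this sketch: if the System.Exception handler came first, it would match every exception and the more specific handler would never be reached.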

Immediately following the catch clause is the exception handler scope (the handler block):

    catch [mscorlib]System.Exception
    { // Exception handler begins
        pop
        ldstr "KABOOM!"
        call void [mscorlib]System.Console::WriteLine(string)
        leave.s Return
    } // Exception handler ends

When an exception is intercepted and the handler block is entered, the only thing present on the stack is always the reference to the intercepted exception—an instance of the exception type. In implementing the handler, we don’t want to take pains analyzing the caught exception, so we can simply get rid of it using the instruction pop. In this simple application, it’s enough to know that an exception has occurred, without reviewing the details.
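For comparison, a handler that does review the details could use the exception reference instead of popping it. The following sketch is not from the book's samples. It assumes a hypothetical assembly name, MessageDemo, and triggers a managed exception by parsing a non-numeric string.

```il
.assembly extern mscorlib { }
.assembly MessageDemo { }        // hypothetical assembly name
.module MessageDemo.exe

.method public static void main() cil managed
{
    .entrypoint
    .try {
        ldstr "not a number"
        call int32 [mscorlib]System.Int32::Parse(string)  // throws on bad input
        pop
        leave.s Done
    }
    catch [mscorlib]System.Exception {
        // On entry the stack holds exactly one item: the exception reference.
        // Instead of pop, consume it by asking for its message text.
        callvirt instance string [mscorlib]System.Exception::get_Message()
        call void [mscorlib]System.Console::WriteLine(string)
        leave.s Done
    }
Done:
    ret
}
```

Either way, the exception reference has to be consumed or discarded before the handler does anything else with the stack.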

Then we load the string constant "KABOOM!" onto the stack, print this string by using the console output method [mscorlib]System.Console::WriteLine(string), and switch to the label Return by using the instruction leave.s. The rule “leave only by leave” applies to the handler blocks as well as to the try blocks. We could not simply load the string "KABOOM!" onto the stack and leave to PrintAndReturn; the leave.s instruction would remove this string from the stack, leaving nothing with which to call WriteLine.

You might be wondering why, if we are trying to protect the call to the unmanaged function sscanf, we included three preceding instructions in the try block? Why not include only the call to sscanf in the scope of .try?

    ldstr "Enter a number"
    call void [mscorlib]System.Console::WriteLine(string)
    .try {
        // Guarded block begins
        call string [mscorlib]System.Console::ReadLine()
        ldstr "%d"
        ldsflda int32 Odd.or.Even::val
        call vararg int32 sscanf(string,string,...,int32*)
        stloc.0
        leave.s DidntBlowUp
        // Guarded block ends
    }

According to the exception handling rules, a guarded segment (a try block) can begin only when the method stack is empty. The closest such moment before the call to sscanf was immediately after the call to [mscorlib]System.Console::WriteLine(string), which took the string "Enter a number" from the stack and put nothing back. Because the three instructions immediately preceding the call to sscanf are loading the call arguments onto the stack, we must open the guarded segment before any of these instructions are executed.

Perhaps you’re puzzled by what seems to be a rather strict limitation. Can’t we begin and end a try block anywhere we want, as we can in C++? Well, the truth is that you can do it the same way you do it in C++, but no better.

The high-level language compilers work in such a way that every completed statement in a high-level language is compiled into a sequence of instructions that begins and ends with the stack empty. In C++, our try block would look like this:

    try {
        Retval = sscanf(System::Console::ReadLine(),
                        "%d", &val);
    }

This feature of high-level language compilers is so universal that all high-level language decompilers use these empty-stack points within the instruction sequence to identify the beginnings and ends of completed statements.
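The empty-stack rule for opening a guarded block can be sketched as follows. This is an illustrative stand-alone program with a hypothetical assembly name, EmptyStackDemo. The disallowed placement is shown only as a comment, and System.Int32::Parse again stands in for the unmanaged call.

```il
.assembly extern mscorlib { }
.assembly EmptyStackDemo { }     // hypothetical assembly name
.module EmptyStackDemo.exe

.method public static void main() cil managed
{
    .entrypoint
    // Not allowed (kept as a comment): the argument would already be on
    // the stack when the guarded block opens.
    //     ldstr "123"
    //     .try {
    //         call int32 [mscorlib]System.Int32::Parse(string)
    //         ...
    //
    // Allowed: the stack is empty here, so the argument load moves inside.
    .try {
        ldstr "123"
        call int32 [mscorlib]System.Int32::Parse(string)
        call void [mscorlib]System.Console::WriteLine(int32)
        leave.s Done             // the stack is empty again at this point
    }
    catch [mscorlib]System.Exception {
        pop
        ldstr "Could not parse the input"
        call void [mscorlib]System.Console::WriteLine(string)
        leave.s Done
    }
Done:
    ret
}
```

The guarded block here corresponds to one complete high-level statement: it starts with nothing on the stack and has consumed everything it pushed by the time it leaves.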

The last task remaining is to test our protection. Copy the source file Simple2.il from the companion CD into your working directory, and compile it with the console command ilasm simple2 into the executable Simple2.exe. Test it to ensure that it runs exactly as the previous samples do. Now let’s simulate A Horrible Disaster Within Unmanaged Code. Load the source file Simple2.il into any text editor, and uncomment the instructions pop and ldnull within the try block:

    .try {
        // Guarded block begins
        call string [mscorlib]System.Console::ReadLine()
        pop
        ldnull
        ldstr "%d"
        ldsflda int32 Odd.or.Even::val
        call vararg int32 sscanf(string,string,...,int32*)
        stloc.0
        leave.s DidntBlowUp  // CHANGE!
        // Guarded block ends
    } // CHANGE!

The instruction pop removes from the stack the string returned by ReadLine, and ldnull loads a null reference instead. The null reference is marshaled to the unmanaged sscanf as a null pointer. The sscanf function is not prepared to take it and will try to dereference the null pointer. The platform operating system will throw the unmanaged exception Memory Access Violation, which is intercepted by the common language runtime and converted to a managed exception of type System.NullReferenceException, which in turn is intercepted by our protection. The application will then terminate gracefully.

Recompile Simple2.il and try to run the resulting executable. You will get nothing worse than KABOOM! displayed on the console.

You can then modify the source code in Simple.il or Simple1.il, adding the same two instructions pop and ldnull after the call to System.Console::ReadLine. Recompile the source file to see how it runs without structured exception handling protection.