Code Verifiability
The verification algorithm associates IL instructions with valid evaluation stack states and, first of all, with the number of stack slots occupied and available at each moment. Stack overflows and underflows render the code not only unverifiable but invalid as well. The verification algorithm also presumes all local variables are zero-initialized before the method execution begins. As a result, the .locals directive—at least one, if several of these are used throughout the method—must have the init clause in order for the method to be verifiable.
The verification algorithm simulates all possible control flow paths and branchings, checking to see whether legal stack states correspond to every reachable instruction. It is impossible, of course, to predict the actual values stored on the evaluation stack at every moment, but the number of stack slots occupied and the types of the slots can be assessed.
As mentioned, the evaluation stack type system is coarser than the metadata type system used for field, argument, and local variable types. Hence, the type validity of instructions transferring data between the stack and other typed memory categories depends on the type conversion performed during such transfers. Table 10-5 lists type conversions between different type systems.
Metadata Type | Stack Type | Managed Pointer to Type |
[unsigned] int8, bool | int32 | int8& |
[unsigned] int16, char | int32 | int16& |
[unsigned] int32 | int32 | int32& |
[unsigned] int64 | int64 | int64& |
native [unsigned] int, function pointer | native int | native int& |
float32 | Float | float32& |
float64 | Float | float64& |
Value type | Same type (see substitution rules in this section) | Same type& |
Object | Same type (see substitution rules in this section) | Same type& |
According to verification rules, type A can be replaced with type B in the following cases only:
If A is a class and B is the same class or any class derived from A
If A an interface and B is a class implementing this interface
If both A and B are interfaces and the implementation of B requires the implementation of A
If A is a class or an interface and B is a null reference
If both A and B are vectors and their element types can be respectively substituted
If both A and B are arrays of the same rank and their element types can be respectively substituted
If both A and B are function pointers and the signatures of their respective methods match
These substitution rules set the limits of “type leeway” allowed for the IL code to remain verifiable. As the verification algorithm proceeds from one instruction to another along every possible path, it checks the simulated stack types against the types expected by the next instruction. Failure to comply with the substitution rules results in verification failure and possibly indicates invalid IL code.
A few verification rules, rather heuristic than formal, are based on the question, “Is it possible in principle to do something unpredictable using this construct?”:
Any code containing embedded native code is unverifiable.
Any code using unmanaged pointers is unverifiable.
Any code containing calls to methods that return managed pointers is unverifiable. The reason: theoretically, the called method might return a managed pointer to one of its local variables or another “perishable” item.
An instance constructor of a class must call the instance constructor of the base class. The reason: until the base class .ctor is called, the instance pointer (this) is considered uninitialized; and until this is initialized, no instance methods should be called.
When a delegate is being instantiated—its constructor takes a function pointer as the last argument—the newobj instruction must be immediately preceded by the ldftn or ldvirtftn instruction, which loads the function pointer. If anything appears between these two instructions, the code becomes unverifiable.
A great many additional rules regulate structured exception handling, but the place to discuss them is the next chapter.