Signatures

Now that you know more about type encoding, let’s look at how the item types are set in the common language runtime. Program items such as fields, methods, and local variables are not characterized by encoded types; rather, they are characterized by signatures. A signature is a binary object containing one or more encoded types and residing in the #Blob stream of metadata.

The following metadata tables refer to the signatures:

  • Field table  Field declaration signature

  • Method table  Method declaration signature

  • Property table  Property declaration signature

  • MemberRef table  Field or method referencing signature

  • StandAloneSig table  Local variables or indirect call signature

  • TypeSpec table  Type specification signature

Calling Conventions

The first byte of a signature defines the calling convention of the signature, which in turn identifies the type of the signature. The CorHdr.h file defines the following calling convention constants in the enumeration CorCallingConvention:

  • IMAGE_CEE_CS_CALLCONV_DEFAULT (0x0)  Default (“normal”) method with a fixed-length argument list. ILAsm has no keyword for this calling convention.

  • IMAGE_CEE_CS_CALLCONV_VARARG (0x5)  Method with a variable-length argument list. The ILAsm keyword is vararg.

  • IMAGE_CEE_CS_CALLCONV_FIELD (0x6)  Field. ILAsm has no keyword for this calling convention.

  • IMAGE_CEE_CS_CALLCONV_LOCAL_SIG (0x7)  Local variables. ILAsm has no keyword for this calling convention.

  • IMAGE_CEE_CS_CALLCONV_PROPERTY (0x8)  Property. ILAsm has no keyword for this calling convention.

  • IMAGE_CEE_CS_CALLCONV_UNMGD (0x9)  Unmanaged calling convention, not currently used by the common language runtime and not recognized by ILAsm.

  • IMAGE_CEE_CS_CALLCONV_HASTHIS (0x20)  Instance method that has an instance pointer (this) as an implicit first argument. The ILAsm keyword is instance.

  • IMAGE_CEE_CS_CALLCONV_EXPLICITTHIS (0x40)  Method call signature. The first explicitly specified parameter is the instance pointer. The ILAsm keyword is explicit.

The calling conventions instance and explicit are the modifiers of the default and vararg method calling conventions. The calling convention explicit can be used only in conjunction with instance and only at the call site, never in the method declaration.

Calling conventions for field, property, and local variables signatures don’t need special ILAsm keywords because they are inferred from the context.
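As a rough illustration of how the first byte identifies the kind of signature, the following Python sketch restates the constants listed above and decodes a calling convention byte. The helper name describe_callconv and the 0x0F mask separating the kind from the instance/explicit modifiers are assumptions made for this example, not something stated in the text.

```python
# Values copied from the CorCallingConvention list above.
CALLCONV_DEFAULT = 0x00
CALLCONV_VARARG = 0x05
CALLCONV_FIELD = 0x06
CALLCONV_LOCAL_SIG = 0x07
CALLCONV_PROPERTY = 0x08
CALLCONV_UNMGD = 0x09
CALLCONV_HASTHIS = 0x20       # ILAsm keyword: instance
CALLCONV_EXPLICITTHIS = 0x40  # ILAsm keyword: explicit

# Assumption: the low four bits carry the signature kind,
# the higher bits carry the instance/explicit modifiers.
KIND_MASK = 0x0F

KIND_NAMES = {
    CALLCONV_DEFAULT: "method (default)",
    CALLCONV_VARARG: "method (vararg)",
    CALLCONV_FIELD: "field",
    CALLCONV_LOCAL_SIG: "local variables",
    CALLCONV_PROPERTY: "property",
    CALLCONV_UNMGD: "unmanaged",
}


def describe_callconv(first_byte: int) -> str:
    """Hypothetical helper: describe the first byte of a signature."""
    kind = KIND_NAMES.get(first_byte & KIND_MASK, "unknown")
    modifiers = []
    if first_byte & CALLCONV_HASTHIS:
        modifiers.append("instance")
    if first_byte & CALLCONV_EXPLICITTHIS:
        modifiers.append("explicit")
    return " ".join(modifiers + [kind])
```

With these definitions, describe_callconv(0x06) yields "field", and describe_callconv(CALLCONV_HASTHIS | CALLCONV_VARARG) yields "instance method (vararg)".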

Field Signatures

A field signature is the simplest kind of signature. It consists of a single encoded type (SET), which of course follows the calling convention byte:

   <field_sig> ::= <callconv_field> <SET> 

Although this type encoding (SET) can be quite long, especially in the case of a multidimensional array or a function pointer, it is nevertheless a single type encoding. In a field signature, SET cannot have & or pinned or sentinel modifiers, and it cannot be void.

The field calling convention is always equal to IMAGE_CEE_CS_CALLCONV_FIELD, regardless of whether the field is static or instance. Whether the field is static or instance is inferred from the context in which the field is referenced.
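A field signature can be sketched in a few lines of Python. The element type codes below are assumed to match the CorElementType enumeration in CorHdr.h, and the helper field_sig is hypothetical. Its check inspects only the leading byte of the encoded type, which is enough to illustrate the restrictions named above but is not a full validator.

```python
CALLCONV_FIELD = 0x06  # IMAGE_CEE_CS_CALLCONV_FIELD

# A few single-byte type codes (assumed from CorElementType in CorHdr.h).
ELEMENT_TYPE_VOID = 0x01
ELEMENT_TYPE_I4 = 0x08        # int32
ELEMENT_TYPE_STRING = 0x0E    # string
ELEMENT_TYPE_BYREF = 0x10     # the & modifier
ELEMENT_TYPE_SENTINEL = 0x41  # sentinel
ELEMENT_TYPE_PINNED = 0x45    # pinned

# Leading bytes that a field's SET must not start with.
FORBIDDEN_IN_FIELD = {
    ELEMENT_TYPE_VOID,
    ELEMENT_TYPE_BYREF,
    ELEMENT_TYPE_SENTINEL,
    ELEMENT_TYPE_PINNED,
}


def field_sig(encoded_type: bytes) -> bytes:
    """Hypothetical helper: <field_sig> ::= <callconv_field> <SET>."""
    if not encoded_type:
        raise ValueError("a field signature needs exactly one encoded type")
    if encoded_type[0] in FORBIDDEN_IN_FIELD:
        raise ValueError("void, &, pinned and sentinel are not allowed here")
    return bytes([CALLCONV_FIELD]) + encoded_type
```

For an int32 field, field_sig(bytes([ELEMENT_TYPE_I4])) produces the two bytes 06 08, whether the field is static or instance.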

Method and Property Signatures

The structures of method and property signatures (and I am talking about method and property declarations here) are similar:

   <method_sig> ::= <callconv_method> <num_of_args> <return_type>
                    [<arg_type>[,<arg_type>*] ]

   <prop_sig> ::= <callconv_prop> <num_of_args> <return_type>
                  [<arg_type>[,<arg_type>*] ]

The difference is in the calling convention. The calling convention for a method signature is the following:

   <callconv_method> ::= <default>        // Static method, default calling convention
                         vararg           // Static vararg method
                         instance         // Instance method, default calling convention
                         instance vararg  // Instance vararg method

The calling convention for a property signature is always equal to IMAGE_CEE_CS_CALLCONV_PROPERTY.

Having noted this difference, we might as well forget about property signatures and concentrate on method signatures. The truth is that a property signature—excluding the calling convention—is a composite of signatures of the property’s access methods, so it is no great wonder that method and property signatures have similar structures.

Remember that in the method calling convention, the combined calling conventions, such as instance vararg, are the products of bitwise OR operations performed on the respective calling convention constants.

The value <num_of_args>, a compressed unsigned integer, is the number of parameters, not counting the return type. The values <return_type> and <arg_type> are SETs. The difference between them and the field’s SET is that the modifier & is allowed in both <return_type> and <arg_type>. The difference between <return_type> and <arg_type> is that <return_type> can be void and <arg_type> cannot.

Instance methods have the implicit first argument this, which is not reflected in the signature. This implicit argument is a reference to the instance of the method’s parent type. It has a class reference type for classes and interfaces and a managed pointer for value types.
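The layout of a method declaration signature can be sketched as follows. This is a minimal illustration, not the runtime's own code: the helpers compress_uint and method_sig are hypothetical, the element type codes are assumed from CorElementType in CorHdr.h, and the compressed unsigned integer format (one, two, or four bytes, big-endian, with the length marked in the top bits) is assumed from the metadata specification.

```python
CALLCONV_DEFAULT = 0x00
CALLCONV_VARARG = 0x05
CALLCONV_HASTHIS = 0x20  # ILAsm keyword: instance

# Single-byte type codes (assumed from CorElementType in CorHdr.h).
VOID = bytes([0x01])
I4 = bytes([0x08])      # int32
STRING = bytes([0x0E])  # string


def compress_uint(value: int) -> bytes:
    """Encode a compressed unsigned integer (assumed metadata format)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value <= 0x7F:
        return bytes([value])                           # 0xxxxxxx
    if value <= 0x3FFF:
        return (value | 0x8000).to_bytes(2, "big")      # 10xxxxxx ...
    if value <= 0x1FFFFFFF:
        return (value | 0xC0000000).to_bytes(4, "big")  # 110xxxxx ...
    raise ValueError("value too large for a compressed integer")


def method_sig(callconv: int, return_type: bytes, arg_types: list) -> bytes:
    """<callconv_method> <num_of_args> <return_type> [<arg_type>...]."""
    for arg in arg_types:
        if arg == VOID:
            raise ValueError("<arg_type> cannot be void")
    # The return type and the implicit this are not counted.
    return (bytes([callconv]) + compress_uint(len(arg_types))
            + return_type + b"".join(arg_types))
```

Under these assumptions, a static method void Bar(int32, int32) encodes as method_sig(CALLCONV_DEFAULT, VOID, [I4, I4]), that is, the bytes 00 02 01 08 08. An instance vararg method starts with the byte CALLCONV_HASTHIS | CALLCONV_VARARG, which is 0x25.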

MemberRef Signatures

Member references, which are kept in the MemberRef metadata table, are the references to fields and methods, usually those defined outside the current module. There are no specific MethodRefs and FieldRefs, so you must look at the calling convention of a MemberRef signature to tell a field reference from a method reference.

MemberRef signatures for field references are the same as the field declaration signatures discussed earlier; see “Field Signatures.” MemberRef signatures for method references are structurally similar to method declaration signatures, although you should note two differences concerning the values of signature components:

  • The calling convention can contain the modifier explicit, which indicates that the instance pointer of the parent object (this) is explicitly specified in the method signature as the first parameter.

  • In the argument list of a vararg method reference, a sentinel can precede the optional arguments. The sentinel itself does not count as an additional argument, so if you call a vararg method with one mandatory argument and two optional arguments, the MemberRef signature will have an argument count of three and an argument list structure that looks like this:

    <mandatory_arg> <sentinel> <opt_arg1> <opt_arg2>
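The argument counting rule for vararg references can be made concrete with a short Python sketch. The helper vararg_ref_sig is hypothetical, the type codes are assumed from CorElementType in CorHdr.h, and the sketch assumes fewer than 128 arguments so that the count fits in a single byte.

```python
CALLCONV_VARARG = 0x05

# Single-byte type codes (assumed from CorElementType in CorHdr.h).
VOID = bytes([0x01])
I4 = bytes([0x08])        # int32
STRING = bytes([0x0E])    # string
SENTINEL = bytes([0x41])  # marks the start of the optional arguments


def vararg_ref_sig(return_type: bytes, mandatory: list, optional: list) -> bytes:
    """Hypothetical helper: MemberRef signature for a vararg call site."""
    count = len(mandatory) + len(optional)  # the sentinel is NOT counted
    if count > 0x7F:
        raise ValueError("this sketch handles single-byte counts only")
    sig = bytes([CALLCONV_VARARG, count]) + return_type + b"".join(mandatory)
    if optional:
        sig += SENTINEL + b"".join(optional)
    return sig
```

Calling vararg_ref_sig(VOID, [STRING], [I4, I4]) gives the bytes 05 03 01 0E 41 08 08: the count is three even though four items follow the return type.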

Indirect Call Signatures

To call methods indirectly, IL has the special instruction calli. This instruction takes argument values plus a function pointer from the stack and uses the StandAloneSig token as a parameter. The signature indexed by the token is the signature by which the call is made. Effectively, calli takes a function pointer and a signature and presumes that the signature is the correct one to use in calling this function:

   ldc.i4.0                              // Load first argument
   ldc.i4.1                              // Load second argument
   ldftn   void Foo::Bar(int32, int32)   // Load function pointer
   calli   void(int32, int32)            // Call Foo::Bar indirectly

Indirect call signatures are similar to the method signatures of MemberRefs, but their calling convention might be one of the unmanaged calling conventions, if the method called indirectly is in fact unmanaged.

Unmanaged calling conventions are defined in CorHdr.h in the CorUnmanagedCallingConvention enumeration as follows:

  • IMAGE_CEE_UNMANAGED_CALLCONV_C (0x1)  C/C++-style calling convention. The call stack is cleaned up by the caller. The ILAsm notation is unmanaged cdecl.

  • IMAGE_CEE_UNMANAGED_CALLCONV_STDCALL (0x2)  Win32 API calling convention. The call stack is cleaned up by the callee. The ILAsm notation is unmanaged stdcall.

  • IMAGE_CEE_UNMANAGED_CALLCONV_THISCALL (0x3)  C++ member method (non-vararg) calling convention. The callee cleans the stack, and the this pointer is pushed on the stack last. The ILAsm notation is unmanaged thiscall.

  • IMAGE_CEE_UNMANAGED_CALLCONV_FASTCALL (0x4)  Arguments are passed in registers when possible. The ILAsm notation is unmanaged fastcall. This calling convention is not supported in the first release of the runtime.
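To show how an indirect call signature differs from a method reference only in its first byte, here is a small Python sketch. The helper calli_sig and the lookup table keyed by ILAsm notation are made up for this example. The type codes are assumed from CorElementType in CorHdr.h, and argument counts are assumed to fit in one byte.

```python
# First-byte values, keyed by the ILAsm notation used in the text.
CALLI_CALLCONV = {
    "default": 0x00,             # managed, fixed-length argument list
    "unmanaged cdecl": 0x01,     # IMAGE_CEE_UNMANAGED_CALLCONV_C
    "unmanaged stdcall": 0x02,   # IMAGE_CEE_UNMANAGED_CALLCONV_STDCALL
    "unmanaged thiscall": 0x03,  # IMAGE_CEE_UNMANAGED_CALLCONV_THISCALL
    "unmanaged fastcall": 0x04,  # IMAGE_CEE_UNMANAGED_CALLCONV_FASTCALL
}

# Single-byte type codes (assumed from CorElementType in CorHdr.h).
VOID = bytes([0x01])
I4 = bytes([0x08])  # int32


def calli_sig(notation: str, return_type: bytes, arg_types: list) -> bytes:
    """Hypothetical helper: StandAloneSig blob for a calli instruction."""
    if len(arg_types) > 0x7F:
        raise ValueError("this sketch handles single-byte counts only")
    return (bytes([CALLI_CALLCONV[notation], len(arg_types)])
            + return_type + b"".join(arg_types))
```

The managed call from the snippet above, calli void(int32, int32), corresponds to calli_sig("default", VOID, [I4, I4]), or 00 02 01 08 08. The same call through an unmanaged stdcall function pointer would differ only in the first byte: 02 02 01 08 08.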

Local Variables Signatures

Local variables signatures are the second kind of signature referenced by the StandAloneSig metadata table. Each such signature contains type encodings for all local variables used in a method. The method header can contain the StandAloneSig token, which identifies the local variables signature. This signature is retrieved by the loader when it prepares the method for JIT compilation.

Local variables signatures are to some extent similar to method declaration signatures, with two differences:

  • The calling convention is IMAGE_CEE_CS_CALLCONV_LOCAL_SIG.

  • Local variables signatures have no return type. The local variable count is immediately followed by the sequence of encoded local variable types:

       <locals_sig> ::= <callconv_locals> <num_of_vars>
                        <var_type>[,<var_type>*]

    <var_type> is the same SET as <arg_type> in method declaration signatures—it can be anything except void.
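A local variables signature can be sketched the same way as the other signatures. The helper locals_sig is hypothetical, the type codes are assumed from CorElementType in CorHdr.h, and the sketch assumes fewer than 128 locals so that the count fits in a single byte.

```python
CALLCONV_LOCAL_SIG = 0x07  # IMAGE_CEE_CS_CALLCONV_LOCAL_SIG

# Single-byte type codes (assumed from CorElementType in CorHdr.h).
VOID = bytes([0x01])
I4 = bytes([0x08])      # int32
STRING = bytes([0x0E])  # string


def locals_sig(var_types: list) -> bytes:
    """Hypothetical helper: <callconv_locals> <num_of_vars> <var_type>..."""
    if not var_types:
        raise ValueError("a locals signature describes at least one variable")
    if len(var_types) > 0x7F:
        raise ValueError("this sketch handles single-byte counts only")
    for var in var_types:
        if var == VOID:
            raise ValueError("<var_type> cannot be void")
    # No return type: the count is followed directly by the variable types.
    return bytes([CALLCONV_LOCAL_SIG, len(var_types)]) + b"".join(var_types)
```

A method declaring two locals, an int32 and a string, would carry locals_sig([I4, STRING]), which is the byte sequence 07 02 08 0E.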

Type Specifications

Type specifications are special metadata items residing in the TypeSpec table and representing type constructs—as opposed to TypeDefs and TypeRefs, which represent types (classes, interfaces, and value types).

A common example of a type construct is a vector or an array of classes or value types. Consider the following code snippet:

   .locals init(int32[0...,0...] iArr)   // Declare 2-dim array reference
   ldc.i4 5                              // Load size of first dimension
   ldc.i4 10                             // Load size of second dimension
   // Create array by calling array constructor:
   newobj instance void int32[0...,0...]::.ctor(int32,int32)
   stloc iArr                            // Store reference to new array in iArr

In the newobj instruction, we specified a MemberRef of the constructor method, parented not by a type but by a type construct, int32[0...,0...]. The question is, “Whose .ctor is it, anyway?”

You might recall that arrays and vectors are generics and can be actualized only in conjunction with some nongeneric type, producing a new class—in our case, a two-dimensional array of 4-byte integers with zero lower bounds. So the constructor we called was the constructor of this class.

And, of course, a natural way to represent such a type construct is by a signature. That’s why TypeSpec records have only one entry, containing an offset in the #Blob stream, pointing at the signature. Personally, I think it’s a pity the TypeSpec record contains only one entry; a Name entry could be of some use. We could go pretty far with named TypeSpecs.

The TypeSpec signature has no calling convention and consists of one SET, which, however, can be fairly long. Consider, for example, a multidimensional array of function pointers that have function pointers among their arguments.
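As a hedged illustration, the TypeSpec signature for the two-dimensional int32 array with zero lower bounds used earlier might be built like this. The array shape layout (rank, number of sizes, sizes, number of lower bounds, lower bounds) and the type codes are assumptions taken from the metadata specification and CorHdr.h rather than from this section, and the helper array_typespec_zero_based is hypothetical. It handles only arrays with unspecified sizes and zero lower bounds.

```python
# Type codes (assumed from CorElementType in CorHdr.h).
ELEMENT_TYPE_I4 = 0x08       # int32
ELEMENT_TYPE_ARRAY = 0x14    # general (multidimensional) array
ELEMENT_TYPE_SZARRAY = 0x1D  # vector: one dimension, zero lower bound


def array_typespec_zero_based(element_type: bytes, rank: int) -> bytes:
    """Hypothetical helper: TypeSpec blob for element_type[0...,0...,...].

    Note that there is no calling convention byte: the blob is one SET.
    """
    if not 1 <= rank <= 0x7F:
        raise ValueError("this sketch handles ranks 1..127 only")
    shape = bytes([rank,  # number of dimensions
                   0,     # number of sizes given (none)
                   rank]) # number of lower bounds given
    shape += bytes(rank)  # each lower bound is 0
    return bytes([ELEMENT_TYPE_ARRAY]) + element_type + shape


def vector_typespec(element_type: bytes) -> bytes:
    """TypeSpec blob for a vector such as int32[]."""
    return bytes([ELEMENT_TYPE_SZARRAY]) + element_type
```

Under these assumptions, array_typespec_zero_based(bytes([ELEMENT_TYPE_I4]), 2) yields 14 08 02 00 02 00 00, while the simpler vector int32[] is just 1D 08.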

TypeSpec tokens can be used with all IL instructions that accept TypeDef or TypeRef tokens. In addition, as you’ve seen, MemberRefs can be scoped to TypeSpecs as well as TypeRefs. The only places where TypeSpecs cannot replace TypeDefs or TypeRefs are the extends and implements clauses of the class declaration.

Two additional kinds of TypeSpecs, other than vectors and arrays, are unmanaged pointers and function pointers, which are not true generics, in that no abstract class exists from which all pointers inherit. Of course, both types of pointers can be cast to the value type int ([mscorlib]System.IntPtr), but this hardly helps: the int value type is oblivious to the type being pointed at, so such casting results only in loss of information. Pointer kinds of TypeSpecs are rarely used, compared to array kinds, and have limited application.



Inside Microsoft .NET IL Assembler
ISBN: 0735615470
EAN: 2147483647
Year: 2005
Pages: 147
Authors: SERGE LIDIN
