14 Defining, Referencing, and Calling Methods


Methods may be defined at the global level (outside of any type):

<decl> ::= ...
  | .method <methodHead> { <methodBodyItem>* }

as well as inside a type:

<classMember> ::= ...
  | .method <methodHead> { <methodBodyItem>* }

14.1 Method Descriptors

There are four constructs in ilasm connected with methods. These correspond with different metadata constructs, as described in Partition II, section 21.

14.1.1 Method Declarations

A MethodDecl, or method declaration, supplies the method name and signature (parameter and return types), but not its body. That is, a method declaration provides a <methodHead> but no <methodBodyItem>s. These are used at call sites to specify the call target (call or callvirt instructions; see Partition III) or to declare an abstract method. A MethodDecl has no direct logical counterpart in the metadata; it can be either a Method or a MethodRef.

14.1.2 Method Definitions

A Method, or method definition, supplies the method name, attributes, signature, and body. That is, a method definition provides a <methodHead>, as well as one or more <methodBodyItem>'s. The body includes the method's CIL instructions, exception handlers, local variable information, and additional runtime or custom metadata about the method. See Partition II, section 14.

14.1.3 Method References

A MethodRef, or method reference, is a reference to a method. It is used when a method is called whose definition lies in another module or assembly. A MethodRef shall be resolved by the VES into a Method before the method is called at runtime. If a matching Method cannot be found, the VES shall throw a System.MissingMethodException. See Partition II, section 21.23.

14.1.4 Method Implementations

A MethodImpl, or method implementation, supplies the executable body for an existing virtual method. It associates a Method (representing the body) with a MethodDecl or Method (representing the virtual method). A MethodImpl is used to provide an implementation for an inherited virtual method or a virtual method from an interface when the default mechanism (matching by name and signature) would not provide the correct result. See Partition II, section 21.25.

14.2 Static, Instance, and Virtual Methods

Static methods are methods that are associated with a type, not with its instances.

Instance methods are associated with an instance of a type: within the body of an instance method it is possible to reference the particular instance on which the method is operating (via the this pointer). It follows that instance methods may only be defined in classes or value types, but not in interfaces or outside of a type (globally). However, note that:

  1. Instance methods on classes (including boxed value types) have a this pointer [which is implicit] that is by default an object reference to the class on which the method is defined.

  2. Instance methods on (unboxed) value types have a this pointer that is by default a managed pointer to an instance of the type on which the method is defined.

  3. There is a special encoding (denoted by the syntactic item explicit in the calling convention; see Partition II, section 14.3) to specify the type of the this pointer, overriding the default values specified here.

  4. The this pointer may be null.

Virtual methods are associated with an instance of a type in much the same way as for instance methods. However, unlike instance methods, it is possible to call a virtual method in such a way that the implementation of the method shall be chosen at runtime by the VES, depending upon the type of object used for the this pointer. The particular Method that implements a virtual method is determined dynamically at runtime (a virtual call) when invoked via the callvirt instruction, while the binding is decided at compile time when invoked via the call instruction (see Partition III).
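Example (informative): The following sketch illustrates the difference between a virtual call and an instance call on the same virtual method. The assembly name CallDemo and the types Base and Derived are hypothetical names introduced only for this illustration.

```cil
.assembly extern mscorlib {}
.assembly CallDemo {}            // hypothetical assembly name

.class public Base extends [mscorlib]System.Object
{
  .method public specialname rtspecialname instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
  }
  .method public virtual instance string Name() cil managed
  {
    .maxstack 1
    ldstr "Base"
    ret
  }
}

.class public Derived extends Base
{
  .method public specialname rtspecialname instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void Base::.ctor()
    ret
  }
  // no newslot, so this overrides Base::Name
  .method public virtual instance string Name() cil managed
  {
    .maxstack 1
    ldstr "Derived"
    ret
  }
}

.method public static void Main() cil managed
{
  .entrypoint
  .maxstack 1
  newobj instance void Derived::.ctor()
  callvirt instance string Base::Name()   // virtual call: implementation chosen from the object's type (Derived)
  call void [mscorlib]System.Console::WriteLine(string)
  newobj instance void Derived::.ctor()
  call instance string Base::Name()       // instance call: bound to Base::Name regardless of the object's type
  call void [mscorlib]System.Console::WriteLine(string)
  ret
}
```

Both call sites name the same method, Base::Name; only the instruction used decides whether the binding is dynamic or fixed.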

ANNOTATION

It is important to distinguish between virtual and instance methods, and virtual (callvirt) and instance (call) calls. Both kinds of methods can be called with either kind of call. For a more complete discussion of this topic, see Partition I, section 8.4.2.


With virtual calls (only) the notion of inheritance becomes important. A subclass may override a virtual method inherited from its base classes, providing a new implementation of the method. The method attribute newslot specifies that the CLI shall not override the virtual method definition of the base type, but shall treat the new definition as an independent virtual method definition.

Abstract virtual methods (which shall only be defined in abstract classes or interfaces) shall be called only with a callvirt instruction. Similarly, the address of an abstract virtual method shall be computed with the ldvirtftn instruction, and the ldftn instruction shall not be used.

RATIONALE

With a concrete virtual method there is always an implementation available from the class that contains the definition; thus there is no need at runtime to have an instance of a class available. Abstract virtual methods, however, receive their implementation only from a subtype or a class that implements the appropriate interface, hence an instance of a class that actually implements the method is required.


14.3 Calling Convention

<callConv> ::= [instance [explicit]] [<callKind>]

A calling convention specifies how a method expects its arguments to be passed from the caller to the called method. It consists of two parts; the first deals with the existence and type of the this pointer, while the second relates to the mechanism for transporting the arguments.

If the attribute instance is present, it indicates that a this pointer shall be passed to the method. It shall be used for both instance and virtual methods.

ANNOTATION

Implementation-Specific (Microsoft): For simplicity, the assembler automatically sets or clears the instance bit in the calling convention for a method definition based on the method attributes static and virtual. In a method reference, however, the instance bit shall be specified directly because the information about static or virtual is not captured in a reference.


Normally, a parameter list (which always follows the calling convention) does not provide information about the type of the this pointer, since this can be deduced from other information. When the combination instance explicit is specified, however, the first type in the subsequent parameter list specifies the type of the this pointer, and subsequent entries specify the types of the parameters themselves.

<callKind> ::=
    default
  | unmanaged cdecl
  | unmanaged fastcall
  | unmanaged stdcall
  | unmanaged thiscall
  | vararg

ANNOTATION

For more information, see sections 22.2.1, 22.2.2, and 22.2.3 in Partition II.


Managed code shall have only the default or vararg calling kind. default shall be used in all cases except when a method accepts an arbitrary number of arguments, in which case vararg shall be used.

When dealing with methods implemented outside the CLI, it is important to be able to specify the calling convention required. For this reason there are 16 possible encodings of the calling kind. Two are used for the managed calling kinds. Four are reserved with defined meaning across many platforms:

  • unmanaged cdecl is the calling convention used by standard C.

  • unmanaged stdcall specifies a standard C++ call.

  • unmanaged fastcall is a special optimized C++ calling convention.

  • unmanaged thiscall is a C++ call that passes a this pointer to the method.

Four more are reserved for existing calling conventions, but their use is not portable. Four more are reserved for future standardization, and two are available for non-standard experimental use.

(By "portable" is meant a feature that is available on all conforming implementations of the CLI.)

14.4 Defining Methods

<methodHead> ::=
    <methAttr>* [<callConv>] [<paramAttr>*] <type>
    [marshal ( [<nativeType>] )]
    <methodName> ( <parameters> ) <implAttr>*

The method head (see also Partition II, section 14) consists of

  • The calling convention (<callConv>; see Partition II, section 14.3)

  • Any number of predefined method attributes (<methAttr>; see Partition II, section 14.4.2)

  • A return type with optional attributes

  • Optional marshalling information (see Partition II, section 7.4)

  • A method name

  • A signature

  • And any number of implementation attributes (<implAttr>; see Partition II, section 14.4.3)

Methods that do not have a return value shall use void as the return type.
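Example (informative): The following sketch shows how the parts of a method head line up in a complete definition. The names HeadDemo and Greet are hypothetical, and the calling convention is left out so that the default applies.

```cil
.assembly extern mscorlib {}
.assembly HeadDemo {}            // hypothetical assembly name

// <methAttr>*    : public static
// <callConv>     : omitted (default, no this pointer)
// return <type>  : void, because the method returns no value
// <methodName>   : Greet
// <parameters>   : (string name)
// <implAttr>*    : cil managed
.method public static void Greet(string name) cil managed
{
  .maxstack 2
  ldstr "Hello, "
  ldarg.0
  call string [mscorlib]System.String::Concat(string, string)
  call void [mscorlib]System.Console::WriteLine(string)
  ret
}
```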

<methodName> ::=
    .cctor
  | .ctor
  | <dottedname>

Method names are either simple names or the special names used for instance constructors and type initializers.

ANNOTATION

Both .ctor, the object instance constructor name, and .cctor, the type initializer (also called a class constructor), are required names and cannot be changed.


<parameters> ::= [<param> [, <param>]*]

<param> ::=
    ...
  | [<paramAttr>*] <type> [marshal ( [<nativeType>] )] [<id>]

The <id>, if present, is the name of the parameter. A parameter may be referenced either by using its name or the zero-based index of the parameter. In CIL instructions it is always encoded using the zero-based index (the name is for ease of use in ilasm).
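Example (informative): The following sketch references one parameter by name and the other by index. The names ParamDemo, Add, a, and b are hypothetical.

```cil
.assembly extern mscorlib {}
.assembly ParamDemo {}           // hypothetical assembly name

.method public static int32 Add(int32 a, int32 b) cil managed
{
  .maxstack 2
  ldarg a        // by name; ilasm encodes this as zero-based index 0
  ldarg.1        // by zero-based index; refers to b
  add
  ret
}
```

The name exists only in the ilasm source; the instruction stream carries the index in both cases.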

Note that, in contrast to calling a vararg method, the definition of a vararg method does not include any ellipsis ("...").

<paramAttr> ::=
    [in]
  | [opt]
  | [out]

The parameter attributes shall be attached to the parameters (see Partition II, section 21.30) and hence are not part of a method signature.

NOTE

Unlike parameter attributes, custom modifiers (modopt and modreq) are part of the signature. Thus, modifiers form part of the method's contract, while parameter attributes are not.


in and out shall only be attached to parameters of pointer (managed or unmanaged) type. They specify whether the parameter is intended to supply input to the method, return a value from the method, or both. If neither is specified, in is assumed. The CLI itself does not enforce the semantics of these bits, although they may be used to optimize performance, especially in scenarios where the call site and the method are in different application domains, processes, or computers.

opt specifies that this parameter is intended to be optional from an end-user point of view. The value to be supplied is stored using the .param syntax (see Partition II, section 14.4.1.4).

ANNOTATION

"The value to be supplied," in the previous paragraph, is the <fieldinit> value.


14.4.1 Method Body

The method body shall contain the instructions of a program. However, it may also contain labels, additional syntactic forms, and many directives that provide additional information to ilasm and are helpful in the compilation of methods of some languages.

<methodBodyItem> ::=                    Description                                                 Section in Partition II
  .custom <customDecl>                  Definition of custom attributes.                            20
| .data <datadecl>                      Emits data to the data section.                             15.3
| .emitbyte <unsigned int8>             Emits a byte to the code section of the method.             14.4.1.1
| .entrypoint                           Specifies that this method is the entry point to the        14.4.1.2
                                        application (only one such method is allowed).
| .locals [init] ( <localsSignature> )  Defines a set of local variables for this method.           14.4.1.3
| .maxstack <int32>                     int32 specifies the maximum number of elements on the       14.4.1, [4.1]
                                        evaluation stack during the execution of the method.
| .override <typeSpec>::<methodName>    Use current method as the implementation for the method     9.3.2
                                        specified.
| .param [ <int32> ] [= <fieldInit>]    Store a constant <fieldInit> value for parameter <int32>.   14.4.1.4
| <externSourceDecl>                    .line or #line                                              5.7
| <instr>                               An instruction.                                             See Partition V [Annex C]
| <id> :                                A label.                                                    5.4
| <scopeBlock>                          Lexical scope of local variables.                           14.4.4
| <securityDecl>                        .permission or .permissionset                               19
| <sehBlock>                            An exception block.                                         18

14.4.1.1 .emitbyte

<methodBodyItem> ::= ...
  | .emitbyte <unsigned int8>

Emits an unsigned 8-bit value directly into the CIL stream of the method. The value is emitted at the position where the directive appears.
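Example (informative): The following sketch emits a single byte ahead of an ordinary instruction. It assumes the byte value 0x00, which is the encoding of the nop instruction; the names EmitDemo and DoNothing are hypothetical.

```cil
.assembly extern mscorlib {}
.assembly EmitDemo {}            // hypothetical assembly name

.method public static void DoNothing() cil managed
{
  .maxstack 1
  .emitbyte 0x00   // raw byte placed at this position in the CIL stream (0x00 encodes nop)
  ret
}
```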

NOTE

The .emitbyte directive is used for generating tests. It is not required in generating regular programs.


14.4.1.2 .entrypoint

<methodBodyItem> ::= ...
  | .entrypoint

The .entrypoint directive marks the current method, which shall be static, as the entry point to an application. The VES shall call this method to start the application. An executable shall have exactly one entry point method. This entry point method may be a global method or may appear inside a type. (The effect of the directive is to place the metadata token for this method into the CLI header of the PE file.)

The entry point method shall accept either no arguments or a vector of strings. If it accepts a vector of strings, the strings shall represent the arguments to the executable, with index 0 containing the first argument. The mechanism for specifying these arguments is platform-specific and is not specified here.

The return type of the entry point method shall be void, int32, or unsigned int32. If an int32 or unsigned int32 is returned, the executable may return an exit code to the host environment. A value of 0 shall indicate that the application terminated ordinarily.

The accessibility of the entry point method shall not prevent its use in starting execution. Once started, the VES shall treat the entry point as it would any other method.

 

Example (informative): The following example prints the first argument and returns successfully to the operating system:

.method public static int32 MyEntry(string[] s) cil managed
{ .entrypoint
  .maxstack 2
  ldarg.0                  // load and print the first argument
  ldc.i4.0
  ldelem.ref
  call void [mscorlib]System.Console::WriteLine(string)
  ldc.i4.0                 // return success
  ret
}

ANNOTATION

If you have a managed entry point, the entry point token is either the method that is the entry point or, if the entry point method is in another file, a file token for that file. That file must have the entry point. For more information on entry points, refer to Partition II, section 24.3.3, which describes the EntryPointToken field in the CLI header of the PE file.


14.4.1.3 .locals

The .locals statement declares local variables (see Partition I [section 12.3.2.2]) for the current method.

<methodBodyItem> ::= ...
  | .locals [init] ( <localsSignature> )

<localsSignature> ::= <local> [, <local>]*

<local> ::= <type> [<id>]

The <id>, if present, is the name of the local.

If init is specified, the variables are initialized to their default values according to their type. Reference types are initialized to null, and value types are zeroed out.
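Example (informative): The following sketch relies on init to start both locals at zero before a loop uses them. The names LocalsDemo, SumBelowFour, total, and i are hypothetical.

```cil
.assembly extern mscorlib {}
.assembly LocalsDemo {}          // hypothetical assembly name

.method public static int32 SumBelowFour() cil managed
{
  .maxstack 2
  .locals init (int32 total, int32 i)   // both start at 0 because of init
Loop:
  ldloc total
  ldloc i
  add
  stloc total        // total = total + i
  ldloc i
  ldc.i4.1
  add
  dup
  stloc i            // i = i + 1, with a copy left on the stack
  ldc.i4.4
  blt Loop           // repeat while i < 4
  ldloc total        // 0 + 1 + 2 + 3
  ret
}
```

Without init, the initial contents of total and i would not be defined and the method would not be verifiable.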

NOTE

Verifiable methods shall include the init keyword. See Partition III [section 1.8].


ANNOTATION

In the file format, the <localsSignature> is formatted as a LocalVarSig, described in Partition II, section 22.2.6. The flag to set for initializing locals to zero is CorILMethod_InitLocals, described in Partition II, section 24.4.4.

Implementation-Specific (Microsoft): ilasm allows nested local variable scopes to be provided and allows locals in nested scopes to share the same location as those in the outer scope. The information about local names, scoping, and overlapping of scoped locals is persisted to the PDB (debugger symbol) file rather than the PE file itself.

 
 <local> ::= [[<int32>]] <type> [<id>] 

The integer in brackets that precedes the <type>, if present, specifies the local number (starting with 0) being described. This allows nested locals to reuse the same location as a local in the outer scope. It is not legal to overlap two local variables unless they have the same type. When no explicit index is specified, the next unused index is chosen. That is, two locals never share an index unless the index is given explicitly.

If init is used, all local variables will be initialized to their default values, even variables in another .locals directive in the same method, which does not have the init directive.


14.4.1.4 .param

<methodBodyItem> ::= ...
  | .param [ <int32> ] [= <fieldInit>]

[The .param directive] stores in the metadata a constant value associated with method parameter number <int32> (see Partition II, section 21.9). While the CLI requires that a value be supplied for the parameter, some tools may use the presence of this attribute to indicate that the tool rather than the user is intended to supply the value of the parameter. Unlike CIL instructions, .param uses index 0 to specify the return value of the method, index 1 is the first parameter of the method, and so forth.
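Example (informative): The following sketch attaches a constant to the second parameter of a method. The names ParamInitDemo, Log, msg, and level are hypothetical, and the value 1 is an arbitrary choice.

```cil
.assembly extern mscorlib {}
.assembly ParamInitDemo {}       // hypothetical assembly name

.method public static void Log(string msg, [opt] int32 level) cil managed
{
  .param [2] = int32(1)   // .param index 2 is the second parameter (level); index 0 is the return value
  .maxstack 1
  ldarg.0                 // CIL argument index 0 is the first parameter (msg)
  call void [mscorlib]System.Console::WriteLine(string)
  ret
}
```

The constant is only recorded in the metadata. A caller still has to pass a value for level; a compiler may choose to read the stored constant and pass it on the user's behalf.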

NOTE

The CLI attaches no semantic whatsoever to these values; it is entirely up to compilers to implement any semantic they wish (e.g., so-called "default argument values").


ANNOTATION

Partition II, section 21.9 discusses the layout of the Constant table in metadata.

This section requires a little clarification. It means that the CLI requires that when a method is called, all parameters must have values. Some compilers supply default values. In this case, the compiler can extract a value from the <fieldinit> value at compile time, and generate code to pass it to the CLI at runtime.


14.4.2 Predefined Attributes on Methods

<methAttr> ::=          Description                                               Section in Partition II
  abstract              The method is abstract (shall also be virtual).           14.4.2.4
| assembly              Assembly accessibility.                                   14.4.2.1
| compilercontrolled    Compiler-controlled accessibility.                        14.4.2.1
| famandassem           Family-and-assembly accessibility.                        14.4.2.1
| family                Family accessibility.                                     14.4.2.1
| famorassem            Family-or-assembly accessibility.                         14.4.2.1
| final                 This virtual method cannot be overridden by subclasses.   14.4.2.2
| hidebysig             Hide by signature. Ignored by the runtime.                14.4.2.2
| newslot               Specifies that this method shall get a new slot in the    14.4.2.3
                        virtual method table.
| pinvokeimpl ( <QSTRING> [as <QSTRING>] <pinvAttr>* )
                        Method is actually implemented in native code on the      14.4.2.5
                        underlying platform.
| private               Private accessibility.                                    14.4.2.1
| public                Public accessibility.                                     14.4.2.1
| rtspecialname         The method name needs to be treated in a special way by   14.4.2.6
                        the runtime.
| specialname           The method name needs to be treated in a special way by   14.4.2.6
                        some tool.
| static                Method is static.                                         14.4.2.2
| virtual               Method is virtual.                                        14.4.2.2

ANNOTATION

For the metadata MethodAttributes that correspond to these ilasm attributes, refer to Partition II, section 22.1.9.

For more information on newslot, refer to Partition I, section 8.10.4, and Partition II, section 9.3.1.

Implementation-Specific (Microsoft): The following syntax is supported:

 
 <methAttr> ::= ... | unmanagedexp | reqsecobj 

unmanagedexp indicates that the method is exported to unmanaged code using COM interop; reqsecobj indicates that the method calls another method with security attributes.

Note that in the first release of Microsoft's CLR, ilasm does not recognize the compilercontrolled keyword. Instead, use privatescope.


The following combinations of predefined attributes are illegal:

  • static combined with any of final, virtual, or newslot

  • abstract combined with any of final or pinvokeimpl

  • compilercontrolled combined with any of virtual, final, specialname, or rtspecialname

14.4.2.1 Accessibility Information

<methAttr> ::= ...
  | assembly
  | compilercontrolled
  | famandassem
  | family
  | famorassem
  | private
  | public

Only one of these attributes shall be applied to a given method. See Partition I, section 8.5.3 and its subsections.

14.4.2.2 Method Contract Attributes

<methAttr> ::= ...
  | final
  | hidebysig
  | static
  | virtual

These attributes may be combined, except a method shall not be both static and virtual; only virtual methods may be final; and abstract methods shall not be final.

final methods shall not be overridden by subclasses of this type.

hidebysig is supplied for the use of tools and is ignored by the VES. It specifies that the declared method hides all methods of the parent types that have a matching method signature; when omitted, the method should hide all methods of the same name, regardless of the signature.

RATIONALE

Some languages use a hide-by-name semantic (C++) while others use a hide-by-name-and-signature semantic (C#, Java).


Static and virtual are described in Partition II, section 14.2.

14.4.2.3 Overriding Behavior

<methAttr> ::= ...
  | newslot

newslot shall only be used with virtual methods. See Partition II, section 9.3.

14.4.2.4 Method Attributes

<methAttr> ::= ...
  | abstract

abstract shall only be used with virtual methods that are not final. It specifies that an implementation of the method is not provided but shall be provided by a subclass. Abstract methods shall only appear in abstract types (see Partition II, section 9.1.4).
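Example (informative): The following sketch declares an abstract virtual method in an abstract type and supplies its implementation in a subclass as a final virtual method. The names AbstractDemo, Shape, Square, and Area, and the returned value, are hypothetical.

```cil
.assembly extern mscorlib {}
.assembly AbstractDemo {}        // hypothetical assembly name

.class public abstract Shape extends [mscorlib]System.Object
{
  .method family specialname rtspecialname instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
  }

  // abstract requires virtual and excludes final; no body items are supplied
  .method public newslot abstract virtual instance float64 Area() cil managed
  {
  }
}

.class public Square extends Shape
{
  .method public specialname rtspecialname instance void .ctor() cil managed
  {
    .maxstack 1
    ldarg.0
    call instance void Shape::.ctor()
    ret
  }

  // final virtual: implements Shape::Area and cannot be overridden further
  .method public final virtual instance float64 Area() cil managed
  {
    .maxstack 1
    ldc.r8 4.0             // placeholder result for the illustration
    ret
  }
}
```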

14.4.2.5 Interoperation Attributes

<methAttr> ::= ...
  | pinvokeimpl ( <QSTRING> [as <QSTRING>] <pinvAttr>* )

See Partition II, sections 14.5.2 and 21.20.

14.4.2.6 Special Handling Attributes

<methAttr> ::= ...
  | rtspecialname
  | specialname

The attribute rtspecialname specifies that the method name shall be treated in a special way by the runtime. Examples of special names are .ctor (object constructor) and .cctor (type initializer).

specialname indicates that the name of this method has special meaning to some tools.

14.4.3 Implementation Attributes of Methods

<implAttr> ::=    Description                                                         Section in Partition II
  cil             The method contains standard CIL code.                              14.4.3.1
| forwardref      The body of this method is not specified with this declaration.     14.4.3.3
| internalcall    Denotes [that] the method body is provided by the CLI itself.       14.4.3.3
| managed         The method is a managed method.                                     14.4.3.2
| native          The method contains native code.                                    14.4.3.1
| noinlining      The runtime shall not expand the method inline.                     14.4.3.3
| runtime         The body of the method is not defined but produced by the runtime.  14.4.3.1
| synchronized    The method shall be executed in a single threaded fashion.          14.4.3.3
| unmanaged       Specifies that the method is unmanaged.                             14.4.3.2

ANNOTATION

The forwardref attribute is not used by the VES. It is intended for use by compilers to communicate with their linkers. At runtime, there must be no forward reference.

Implementation-Specific (Microsoft): The following syntax is accepted:

 
 <implAttr> ::= ... | preservesig 

preservesig specifies the method signature is mangled to return HRESULT, with the return value as a parameter.


14.4.3.1 Code Implementation Attributes

<implAttr> ::= ...
  | cil
  | native
  | runtime

These attributes are exclusive; they specify the type of code the method contains.

cil specifies that the method body consists of CIL code. Unless the method is declared abstract, the body of the method shall be provided if cil is used.

native specifies that a method was implemented using native code, tied to a specific processor for which it was generated. Native methods shall not have a body but instead refer to a native method that declares the body. Typically, the PInvoke functionality (see Partition II, section 14.5.2) of the CLI is used to refer to a native method.

runtime specifies that the implementation of the method is automatically provided by the runtime and is primarily used for the method of delegates (see Partition II, section 13.6).

14.4.3.2 Managed or Unmanaged

<implAttr> ::= ...
  | managed
  | unmanaged

These shall not be combined. Methods implemented using CIL are managed. Unmanaged is used primarily with PInvoke (see Partition II, section 14.5.2).

14.4.3.3 Implementation Information

<implAttr> ::= ...
  | forwardref
  | internalcall
  | noinlining
  | synchronized

These attributes may be combined.

forwardref specifies that the body of the method is provided elsewhere. This attribute shall not be present when an assembly is loaded by the VES. It is used for tools (like a static linker) that will combine separately compiled modules and resolve the forward reference.

internalcall specifies that the method body is provided by this CLI (and is typically used by low-level methods in a system library). It shall not be applied to methods that are intended for use across implementations of the CLI.

ANNOTATION

Implementation-Specific (Microsoft): internalcall allows the lowest-level parts of the Base Class Library to wrap unmanaged code built into Microsoft's Common Language Runtime.


noinlining specifies that the runtime shall not inline this method: a CIL-to-native-code compiler shall not include the body of this method in the code of any caller, but shall keep it as a separate routine. Inlining refers to the process of replacing the call instruction with the body of the called method. This may be done by the runtime for optimization purposes.

RATIONALE

Specifying that a method not be inlined ensures that it remains "visible" for debugging (e.g., displaying stack traces) and profiling. It also provides a mechanism for the programmer to override the default heuristics a CIL-to-native-code compiler uses for inlining.


synchronized specifies that the whole body of the method shall be single-threaded. If this method is an instance or virtual method, a lock on the object shall be obtained before the method is entered. If this method is a static method, a lock on the type shall be obtained before the method is entered. If a lock cannot be obtained, the requesting thread shall not proceed until it is granted the lock. This may cause deadlocks. The lock is released when the method exits, through either a normal return or an exception. Exiting a synchronized method using a tail. call shall be implemented as though the tail. had not been specified.
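Example (informative): The following sketch marks a static method as synchronized, so that a lock on the type is obtained before the method is entered. The names SyncDemo, Counter, count, and Increment are hypothetical.

```cil
.assembly extern mscorlib {}
.assembly SyncDemo {}            // hypothetical assembly name

.class public Counter extends [mscorlib]System.Object
{
  .field private static int32 count

  // static + synchronized: the lock is taken on the type Counter
  .method public static void Increment() cil managed synchronized
  {
    .maxstack 2
    ldsfld int32 Counter::count
    ldc.i4.1
    add
    stsfld int32 Counter::count   // read-modify-write runs under the lock
    ret                           // the lock is released on exit
  }
}
```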

14.4.4 Scope Blocks

<scopeBlock> ::= { <methodBodyItem>* }

A scopeBlock is used to group elements of a method body together. For example, it is used to designate the code sequence that constitutes the body of an exception handler.

ANNOTATION

Implementation-Specific (Microsoft): Scope blocks are syntactic sugar and primarily serve readability and debugging purposes.

 
 <scopeBlock> ::= { <methodBodyItem>* } 

A scope block defines the scope in which a local variable is accessible by its name. Scope blocks may be nested, such that a reference of a local variable will first be resolved in the innermost scope block, then at the next level, and so on until the topmost level of the method is reached. A declaration in an inner scope block hides declarations in the outer layers.

If duplicate declarations are used, the reference will be resolved to the first occurrence. Even though valid CIL, duplicate declarations are not recommended.

Scoping does not affect the lifetime of a local variable. All local variables are created (and if specified, initialized) when the method is entered. They stay alive until the method has finished executing.

The scoping does not affect the accessibility of a local variable by its zero-based index. All local variables are accessible from anywhere within the method by their index.

The index is assigned to a local variable in the order of declaration. Scoping is ignored for indexing purposes. Thus, each local variable is assigned the next available index starting at the top of the method. This behavior can be altered by specifying an explicit index in the <localsSignature>, as described in Partition II, section 14.4.1.3.
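Example (informative): The following sketch declares one local at the top of a method and one inside a nested scope block, then reads the inner local by index after the block has ended. It relies on the scoping behavior described in this annotation; the names ScopeDemo, Scopes, outer, and inner are hypothetical.

```cil
.assembly extern mscorlib {}
.assembly ScopeDemo {}           // hypothetical assembly name

.method public static void Scopes() cil managed
{
  .maxstack 1
  .locals init (int32 outer)       // index 0
  {
    .locals init (int32 inner)     // index 1; the name is visible only inside this block
    ldc.i4.5
    stloc inner
  }
  ldloc.1                          // the local is still reachable by index outside the block
  pop
  ret
}
```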


14.4.5 vararg Methods

vararg methods accept a variable number of arguments. They shall use the vararg calling convention (see Partition II, section 14.3).

At each call site, a method reference shall be used to describe the types of the actual arguments that are passed. The fixed part of the argument list shall be separated from the additional arguments with an ellipsis (see Partition I, section 12.3.2.3).

The vararg arguments shall be accessed by obtaining a handle to the argument list using the CIL instruction arglist (see Partition III [section 3.4]). The handle may be used to create an instance of the value type System.ArgIterator, which provides a typesafe mechanism for accessing the arguments (see the .NET Framework Standard Library Annotated Reference).

 

Example (informative): The following example shows how a vararg method is declared and how the first vararg argument is accessed, assuming that at least one additional argument was passed to the method:

.method public static vararg void MyMethod(int32 required) {
  .maxstack 3
  .locals init (valuetype System.ArgIterator it, int32 x)
  ldloca it                // initialize the iterator
  initobj valuetype System.ArgIterator
  ldloca it
  arglist                  // obtain the argument handle
  call instance void System.ArgIterator::.ctor(valuetype System.RuntimeArgumentHandle) // call constructor of iterator
  /* argument value will be stored in x when retrieved, so load address of x */
  ldloca x
  ldloca it
  // retrieve the argument, the argument for required does not matter
  call instance typedref System.ArgIterator::GetNextArg()
  call object System.TypedReference::ToObject(typedref) // retrieve the object
  castclass System.Int32   // cast and unbox
  unbox int32
  cpobj int32              // copy the value into x
  // first vararg argument is stored in x
  ret
}
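Example (informative): The following sketch shows the other half, a call site for a vararg method, where the method reference separates the fixed argument from the additional ones with an ellipsis. The MyMethod defined here is a stripped-down stand-in that ignores its arguments; the assembly name VarargCall and the argument values are hypothetical.

```cil
.assembly extern mscorlib {}
.assembly VarargCall {}          // hypothetical assembly name

// definition: vararg calling convention, no ellipsis in the parameter list
.method public static vararg void MyMethod(int32 required) cil managed
{
  .maxstack 1
  ret              // ignores its arguments; only the call site matters here
}

.method public static void Main() cil managed
{
  .entrypoint
  .maxstack 3
  ldc.i4.1         // fixed argument: required
  ldc.i4.2         // first additional argument
  ldstr "two"      // second additional argument
  // call site: the types after "..." describe the additional arguments actually passed
  call vararg void MyMethod(int32, ..., int32, string)
  ret
}
```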

ANNOTATION

Support for variable-length argument lists is not part of the minimum requirement for implementing the CLI. The library support for it, System.ArgIterator, is not standardized.


14.5 Unmanaged Methods

In addition to supporting managed code and managed data, the CLI provides facilities for accessing pre-existing native code from the underlying platform, known as unmanaged code. These facilities are, by necessity, platform-dependent and hence are only partially specified here.

This standard specifies:

  • A mechanism in the file format for providing function pointers to managed code that can be called from unmanaged code (see Partition II, section 14.5.1).

  • A mechanism for marking certain method definitions as being implemented in unmanaged code (called platform invoke; see Partition II, section 14.5.2).

  • A mechanism for marking call sites used with method pointers to indicate that the call is to an unmanaged method (see Partition II, section 14.5.3).

  • A small set of predefined data types that can be passed (marshalled) using these mechanisms on all implementations of the CLI (see Partition II, section 14.5.5). The set of types is extensible through the use of custom attributes and modifiers, but these extensions are platform-specific.

14.5.1 Method Transition Thunks

NOTE

This mechanism is not part of the Kernel Profile, so it may not be present in all conforming implementations of the CLI. See Partition IV.


In order to call from unmanaged code into managed code, some platforms require a specific transition sequence to be performed. In addition, some platforms require that the representation of data types be converted (data marshalling). Both of these problems are solved by the .vtfixup directive. This directive may appear several times, but only at the top level of a CIL assembly file, as shown by the following grammar:

<decl> ::=                  Section in Partition II
  .vtfixup <vtfixupDecl>    5.10
| ...

The .vtfixup directive declares that at a certain memory location there is a table that contains metadata tokens referring to methods that shall be converted into method pointers. The CLI will do this conversion automatically when the file is loaded into memory for execution. The declaration specifies the number of entries in the table, what kind of method pointer is required, the width of an entry in the table, and the location of the table:

<vtfixupDecl> ::=
    [ <int32> ] <vtfixupAttr>* at <dataLabel>

<vtfixupAttr> ::=
    fromunmanaged
  | int32
  | int64

The attributes int32 and int64 are mutually exclusive, and int32 is the default. These attributes specify the width of each slot in the table. Each slot contains a 32-bit metadata token (zero-padded if the table has 64-bit slots), and the CLI converts it into a method pointer of the same width as the slot.

If fromunmanaged is specified, the CLI will generate a thunk that will convert the unmanaged method call to a managed call, call the method, and return the result to the unmanaged environment. The thunk will also perform data marshalling in the platform-specific manner described for platform invoke.

The ilasm syntax does not specify a mechanism for creating the table of tokens, but a compiler may simply emit the tokens as byte literals into a block specified using the .data directive.
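
As a rough illustration of the slot layout described above, the following Python sketch packs a list of metadata tokens into the raw bytes a compiler might place in such a table. The function name pack_token_table, the sample token values, and the little-endian byte order are assumptions made for this sketch; the text above fixes only the slot width and the zero padding.

```python
import struct


def pack_token_table(tokens, slot_bits=32):
    """Pack 32-bit metadata tokens into fixed-width table slots.

    Hypothetical helper: the name and the little-endian byte order
    are assumptions of this sketch, not part of the standard text.
    """
    if slot_bits not in (32, 64):
        raise ValueError("slot width must be 32 or 64 bits")
    # "<I" is an unsigned 32-bit value, "<Q" an unsigned 64-bit value,
    # both little-endian.
    fmt = "<I" if slot_bits == 32 else "<Q"
    table = bytearray()
    for token in tokens:
        if not 0 <= token <= 0xFFFFFFFF:
            raise ValueError(f"token {token:#x} does not fit in 32 bits")
        # Packing a 32-bit token into a 64-bit slot zero-fills the upper half.
        table += struct.pack(fmt, token)
    return bytes(table)


if __name__ == "__main__":
    demo_tokens = [0x06000001, 0x06000002]  # made-up method tokens
    print(pack_token_table(demo_tokens, 32).hex(" "))
    print(pack_token_table(demo_tokens, 64).hex(" "))
```

The resulting byte sequence is what would be written as byte literals into the .data block; after loading, the CLI would overwrite each slot with a method pointer of the same width.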

14.5.2 Platform Invoke

Methods defined in native code may be invoked using the platform invoke (also known as PInvoke or p/invoke) functionality of the CLI. Platform invoke will switch from managed to unmanaged state and back, and also handle necessary data marshalling. Methods that need to be called using PInvoke are marked as pinvokeimpl. In addition, the methods shall have the implementation attributes native and unmanaged (see Partition II, section 14.4.2.5).

<methAttr> ::=                                            Description                   Section in Partition II
  pinvokeimpl ( <QSTRING> [as <QSTRING>] <pinvAttr>* )    Implemented in native code    14.4.2
| ...


The first quoted string is a platform-specific description indicating where the implementation of the method is located (for example, on Microsoft Windows this would be the name of the DLL that implements the method). The second (optional) string is the name of the method as it exists on that platform, since the platform may use name-mangling rules that force the name as it appears to a managed program to differ from the name as seen in the native implementation (this is common, for example, when the native code is generated by a C++ compiler).

Only static methods, defined at global scope (i.e., outside of any type), may be marked pinvokeimpl. A method declared with pinvokeimpl shall not have a body specified as part of the definition.

<pinvAttr> ::=     Description (platform-specific, suggestion only)
  ansi             ANSI character set.
| autochar         Determine character set automatically.
| cdecl            Standard C style call.
| fastcall         C style fastcall.
| stdcall          Standard C++ style call.
| thiscall         The method accepts an implicit this pointer.
| unicode          Unicode character set.
| platformapi      Use call convention appropriate to target platform.

ANNOTATION

Implementation-Specific (Microsoft): In the first release, platformapi is not recognized by Microsoft ilasm. Instead use winapi.


The attributes ansi, autochar, and unicode are mutually exclusive. They govern how strings will be marshalled for calls to this method: ansi indicates that the native code will receive (and possibly return) a platform-specific representation that corresponds to a string encoded in the ANSI character set (typically this would match the representation of a C or C++ string constant); autochar indicates a platform-specific representation that is "natural" for the underlying platform; and unicode indicates a platform-specific representation that corresponds to a string encoded for use with Unicode methods on that platform.

The attributes cdecl, fastcall, stdcall, thiscall, and platformapi are mutually exclusive. They are platform-specific and specify the calling conventions for native code.
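
The two mutual-exclusion rules above can be checked mechanically. The following Python sketch is one possible way to do so; the helper name check_pinv_attrs and its message wording are invented for illustration, and it knows only the attributes listed in the grammar above (implementation-specific extensions are reported as unknown).

```python
# Attribute groups taken from the <pinvAttr> grammar above.
CHARSET_ATTRS = frozenset({"ansi", "autochar", "unicode"})
CALLCONV_ATTRS = frozenset(
    {"cdecl", "fastcall", "stdcall", "thiscall", "platformapi"}
)


def check_pinv_attrs(attrs):
    """Return a list of problems found in a sequence of pinvAttr keywords.

    Hypothetical helper for illustration; an empty list means the
    attributes satisfy both mutual-exclusion rules.
    """
    problems = []
    known = CHARSET_ATTRS | CALLCONV_ATTRS
    for attr in attrs:
        if attr not in known:
            problems.append(f"unknown attribute: {attr}")
    groups = (
        ("character set", CHARSET_ATTRS),
        ("calling convention", CALLCONV_ATTRS),
    )
    for label, group in groups:
        used = sorted(set(attrs) & group)
        if len(used) > 1:
            problems.append(
                f"conflicting {label} attributes: {', '.join(used)}"
            )
    return problems


if __name__ == "__main__":
    print(check_pinv_attrs(["ansi", "stdcall"]))
    print(check_pinv_attrs(["ansi", "unicode", "cdecl"]))
```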

ANNOTATION

Implementation-Specific (Microsoft): In addition, the Microsoft implementation of the CLI on Microsoft Windows supports the following attributes:

  • lasterr to indicate that the native method supports C-style last error querying.

  • nomangle to indicate that the name in the DLL should be used precisely as specified, rather than attempting to add A (for "ascii") or W ("widechar") to find platform-specific variants based on the type of string marshalling requested.


 

Example (informative): The following shows the declaration of the method MessageBeep located in the Microsoft Windows DLL user32.dll:

 .method public static pinvokeimpl("user32.dll" stdcall) int8 MessageBeep(unsigned int32)
     native unmanaged {}

14.5.3 Via Function Pointers

Unmanaged functions can also be called via function pointers. There is no difference between calling managed or unmanaged functions with pointers. However, the unmanaged function needs to be declared with pinvokeimpl as described in Partition II, section 14.5.2. Calling managed methods with function pointers is described in Partition II, section 13.5.

14.5.4 COM Interop

ANNOTATION

Implementation-Specific (Microsoft): Unmanaged COM operates primarily by publishing uniquely identified interfaces and then sharing them between implementers (traditionally called "servers") and users (traditionally called "clients") of a given interface. It supports a rich set of types for use across the interface, and the interface itself can supply named constants and static methods, but it does not supply instance fields, instance methods, or virtual methods.

The CLI provides mechanisms useful to both implementers and users of existing classical COM interfaces. The goal is to permit programmers to deal with managed data types (thus eliminating the need for explicit memory management) while at the same time allowing interoperability with existing unmanaged servers and clients. COM interop does not support the use of global functions (i.e., methods that are not part of a managed type), static functions, or parameterized constructors.

Given an existing classical COM interface definition as a type library, the tlbimp tool produces a file that contains the metadata describing that interface. The types it exposes in the metadata are managed counterparts of the unmanaged types in the original interface.

Implementers of an existing classical COM interface can import the metadata produced by tlbimp and then write managed types that provide the implementation of the methods required by that interface. The metadata specifies the use of managed data types in many places, and the CLI provides automatic marshalling (i.e., copying with reformatting) of data between the managed and unmanaged data types.

Implementers of a new service can simply write a managed program whose publicly visible types adhere to a simple set of rules. They can then run the tlbexp tool to produce a type library for classical COM users. This set of rules guarantees that the data types exposed to the classical COM user are unmanaged types that can be marshalled automatically by the CLI.

Implementers need to run the RegAsm tool to register their implementation with classical COM for location and activation purposes if they wish to expose managed services to unmanaged code.

Users of existing classical COM interfaces simply import the metadata produced by tlbimp. They can then reference the (managed) types defined there, and the CLI uses the assembly mechanism and activation information to locate and instantiate instances of objects implementing the interface. Their code is the same whether the implementation of the interfaces is provided using classical COM (unmanaged) code or the CLI (managed) code: the interfaces they see use managed data types and hence do not need explicit memory management.

For some existing classical COM interfaces, the CLI provides an implementation of the interface. In some cases the VES allows the user to specify all or parts of the implementation; for others it provides the entire implementation.


14.5.5 Data Type Marshalling

While data type marshalling is necessarily platform-dependent, this standard specifies a minimum set of data types that shall be supported by all conforming implementations of the CLI. Additional data types may be supported in an implementation-dependent manner, using custom attributes and/or custom modifiers to specify any special handling required on the particular implementation.

The following data types shall be marshalled by all conforming implementations of the CLI; the native data type to which they conform is implementation-specific:

  • All integer data types (int8, int16, unsigned int8, bool, char, etc.), including the native integer types.

  • Enumerations, as their underlying data type.

  • All floating point data types (float32 and float64), if they are supported by the CLI implementation for managed code.

  • The type string.

  • Unmanaged pointers to any of the above types.

In addition, the following types shall be supported for marshalling from managed code to unmanaged code, but need not be supported in the reverse direction (i.e., as return types when calling unmanaged methods or as parameters when calling from unmanaged methods into managed methods):

  • One-dimensional zero-based arrays of any of the above.

  • Delegates (the mechanism for calling from unmanaged code into a delegate is platform-specific; it should not be assumed that marshalling a delegate will produce a function pointer that can be used directly from unmanaged code).

Finally, the type GCHandle can be used to marshal an object to unmanaged code. The unmanaged code receives a platform-specific data type that can be used as an "opaque handle" to a specific object.

ANNOTATION

You can pass a delegate from managed code to unmanaged code. The delegate is then marshalled into a form that the unmanaged code can use to call the method specified by the delegate.

What is returned, however, is not guaranteed to be a function pointer. Basically, it is up to the platform to determine what that returned "something" is. It may be a function pointer, or it might be implemented as a call to something else, to which the function (the thing returned plus the arguments) is passed. It is up to the VES implementation to determine how to do it. Rotor, a publicly available VES implementation, does it one way; the Microsoft Common Language Runtime does it another. Other VES implementations may have yet another way to implement it.


14.5.6 Managed Native Calling Conventions (x86)

ANNOTATION

Implementation-Specific (Microsoft): This section is intended for an advanced audience. It describes the details of a native method call from managed code on the x86 architecture. The information provided in this section may be important for optimization purposes. This section is not important for further understanding of the CLI and may be skipped.

There are two managed native calling conventions used on the x86. They are described here for completeness and because knowledge of these conventions allows an unsafe mechanism for bypassing the overhead of a transition from managed to unmanaged code.


14.5.6.1 Standard 80x86 Calling Convention

ANNOTATION

Implementation-Specific (Microsoft): The standard native calling convention is a variation on the fastcall convention used by Visual C++. It differs primarily in the order in which arguments are pushed onto the stack.

The only values that can be passed in registers are managed and unmanaged pointers, object references, and the built-in integer types int8, unsigned int8, int16, unsigned int16, int32, unsigned int32, native int, and native unsigned int. Enums are passed as their underlying type. All floating point values and 8-byte integer values are passed on the stack. When the return type is a value type that cannot be passed in a register, the caller shall create a buffer to hold the result and pass the address of this buffer as a hidden parameter.

Arguments are passed in left-to-right order, starting with the this pointer (for instance and virtual methods), followed by the return buffer pointer if needed, followed by the user-specified argument values. The first of these that can be placed in a register is put into ECX, the next in EDX, and all subsequent arguments are passed on the stack.

The return value is handled as follows:

  • Floating point values are returned on the top of the hardware floating point (FP) stack.

  • Integers up to 32 bits long are returned in EAX.

  • 64-bit integers are returned with EAX holding the least significant 32 bits and EDX holding the most significant 32 bits.

  • All other cases require the use of a return buffer, through which the value is returned.

In addition, it is guaranteed that if a return buffer is used, a value is stored there only upon ordinary exit from the method. The buffer is not allowed to be used for temporary storage within the method, and its contents will be unaltered if an exception occurs while the method is executing.

Example (informative):

 
 static System.Int32 f(int32 x) 

The incoming argument (x) is placed in ECX; the return value is in EAX.

 
 static float64 f(int32 x, int32 y, int32 z) 

x is passed in ECX, y in EDX, z on the top of stack; the return value is on the top of the FP stack.

 
 static float64 f(int32 x, float64 y, float64 z) 

x is passed in ECX; y is pushed on the stack, and then z is pushed on the stack (the CPU stack, not the FP stack; hence z is on top of the stack); the return value is on the top of the FP stack.

 
 virtual float64 f(int32 x, int64 y, int64 z) 

this is passed in ECX, x in EDX; y is pushed on the stack, and then z is pushed on the stack (hence z is on top of the stack); the return value is on the top of the FP stack.

 
 virtual int64 f(int32 x, float64 y, float64 z) 

this is passed in ECX, x in EDX; y is pushed on the stack, and then z is pushed on the stack (hence z is on top of the stack); the return value is in EDX/EAX.

 
 virtual [mscorlib]System.Guid f(int32 x, float64 y, float64 z) 

Because System.Guid is a value type, the this pointer is passed in ECX, a pointer to the return buffer is passed in EDX, x is pushed, then y, and then z (hence z is on top of the stack); the return value is stored in the return buffer.
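
The placement rules of this section can be modelled in a few lines. The Python sketch below is an illustrative model only: the names place_arguments and return_location, the "pointer" and "object" type tags, and the "retbuf" label are all invented here. It also assumes that pointers and object references are returned in EAX like 32-bit integers, and it expects built-in type keywords (an enum must be given as its underlying type; any other type name is treated as a value type that needs a return buffer).

```python
# Types that may travel in a register, per the rules in this section.
REGISTER_TYPES = frozenset({
    "int8", "unsigned int8", "int16", "unsigned int16",
    "int32", "unsigned int32", "native int", "native unsigned int",
    "pointer", "object",  # illustrative tags for pointers/object refs
})
FLOAT_TYPES = frozenset({"float32", "float64"})
INT64_TYPES = frozenset({"int64", "unsigned int64"})


def return_location(return_type):
    """Say where a value of return_type comes back (None for void)."""
    if return_type == "void":
        return None
    if return_type in FLOAT_TYPES:
        return "FP stack"
    if return_type in REGISTER_TYPES:
        return "EAX"  # for pointers/object refs this is an assumption
    if return_type in INT64_TYPES:
        return "EDX:EAX"
    return "return buffer"


def place_arguments(params, return_type="void", has_this=False):
    """Assign each argument to ECX, EDX, or the stack.

    params is a list of (name, type) pairs in declaration order.
    Returns (placement, stack): a dict of name -> location, and the
    push order of stack arguments (the last element is on top).
    """
    ordered = []
    if has_this:
        ordered.append(("this", "object"))
    if return_location(return_type) == "return buffer":
        ordered.append(("retbuf", "pointer"))  # hidden parameter
    ordered.extend(params)

    free_registers = ["ECX", "EDX"]
    placement, stack = {}, []
    for name, type_name in ordered:
        if type_name in REGISTER_TYPES and free_registers:
            placement[name] = free_registers.pop(0)
        else:
            placement[name] = "stack"
            stack.append(name)
    return placement, stack


if __name__ == "__main__":
    sig = [("x", "int32"), ("y", "int64"), ("z", "int64")]
    print(place_arguments(sig, return_type="float64", has_this=True))
    print(return_location("float64"))
```

Running the model on the virtual method signatures from the examples above gives the same register and stack assignments as the prose.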


14.5.6.2 Varargs x86 Calling Convention

ANNOTATION

Implementation-Specific (Microsoft): All user-specified arguments are passed on the stack, pushed in left-to-right order. Following the last argument (hence on top of the stack upon entry to the method body), a special cookie is passed that provides information about the types of the arguments that have been pushed.

As with the standard calling convention, the this pointer and a return buffer (if either is needed) are passed in ECX and/or EDX.

Values are returned in the same way as for the standard calling convention.
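
A minimal Python sketch of the vararg layout follows. The function name place_vararg_call and the "&lt;cookie&gt;" placeholder are invented for illustration. The sketch assumes that registers are handed out in the same order as in the standard convention (this first, then the return buffer), so a return buffer without a this pointer lands in ECX; the text above does not state that case explicitly.

```python
def place_vararg_call(args, has_this=False, needs_return_buffer=False):
    """Model the vararg convention: all user arguments go on the stack.

    Returns (registers, stack): registers maps the hidden values to
    ECX/EDX, and stack lists the push order (last element is on top).
    Hypothetical helper; the register order is an assumption.
    """
    free_registers = ["ECX", "EDX"]
    registers = {}
    if has_this:
        registers["this"] = free_registers.pop(0)
    if needs_return_buffer:
        registers["retbuf"] = free_registers.pop(0)
    # User arguments are pushed left to right, then the type cookie,
    # which therefore ends up on top of the stack.
    stack = list(args) + ["<cookie>"]
    return registers, stack


if __name__ == "__main__":
    print(place_vararg_call(["a", "b", "c"], has_this=True))
```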


14.5.6.3 Fast Calls to Unmanaged Code

ANNOTATION

Implementation-Specific (Microsoft): Transitions from managed to unmanaged code require a small amount of overhead to allow exceptions and garbage collection to correctly determine the execution context. On an x86 processor, under the best circumstances, these transitions take approximately five instructions per call or return from managed to unmanaged code. In addition, any method that includes calls with transitions incurs an eight-instruction overhead spread across the calling method's prolog and epilog.

This overhead can become a factor in performance of certain applications. For use in unverifiable code only, there is a mechanism to call from managed code to unmanaged code without the overhead of a transition. Such so-called "fast native calls" are accomplished by the use of a calli instruction that indicates that the destination is managed, even though the code address to which it refers is unmanaged. This can be arranged, for example, by initializing a variable of type function pointer in unmanaged code.

Clearly, this mechanism shall be tightly constrained, because the transition is essential if there is any possibility of a garbage collection or exception occurring while in the unmanaged code. The following restrictions apply to the use of this mechanism:

  1. The unmanaged code shall follow one of the two managed calling conventions (regular or vararg) that are specified in Partition II, sections 14.5.6.1 and 14.5.6.2. In version 1 of the Microsoft CLR, only the regular calling convention is supported for fast native calls.

  2. The unmanaged code shall not execute for any extended time, because garbage collection cannot begin while this code is executing. It is wise to keep this code under 100 instructions long in all control flow paths.

  3. The unmanaged code shall not throw an exception (managed or unmanaged), including access violations, etc. Page faults are not considered an exception for this purpose.

  4. The unmanaged code shall not call back into managed code.

  5. The unmanaged code shall not trigger garbage collection (this usually follows from the restriction on calling back to managed code).

  6. The unmanaged code shall not block. That is, it shall not call any OS-provided routine that might block the thread (synchronous I/O, explicit acquisition of locks, etc.). Again, page faults are not a problem for this purpose.

  7. The managed code that calls the unmanaged method shall not have a long, tight loop in which it makes the call. The total time for the loop to execute should remain under 100 instructions, or the loop should include at least one call to a managed method. More technically, the method including the call shall produce "fully interruptible native code." In future versions, there may be a way to indicate this as a requirement on a method.


NOTE

Restrictions 2 through 6 apply not only to the unmanaged code called directly, but to anything it may call.




The Common Language Infrastructure Annotated Standard (Microsoft. NET Development Series)
Year: 2002