13 Semantics of Special Types | The Common Language Infrastructure Annotated Standard (Microsoft .NET Development Series)

Special Types are those that are referenced from CIL, but for which no definition is supplied: the VES supplies the definitions automatically based on information available from the reference.

ANNOTATION

There are two kinds of special types: (1) those for which no definition is necessary because the VES produces the implementation upon declaration, and (2) those for which the user provides a definition and the VES provides the implementation.

Pointers, vectors, and arrays fall into the first category. In fact, none of them is defined in the libraries. The VES simply creates them upon declaration.

Built-in numeric types are a rather special case of this category. The Base Class Library contains a set of types that define these numeric types and the methods for operating on them. However, the VES requires that signatures encode these types with special values, rather than using the name of the type. For example, in a signature, a compiler must emit ELEMENT_TYPE_I4, numeric value 8, rather than ELEMENT_TYPE_CLASS followed by a reference to System.Int32 (see Partition II, section 22.1.15).

This does not mean that a programmer cannot specify the class library type (e.g., System.Int32) in a type definition, and some recommend this as a programming practice because that designation is required to access the methods. It does, however, mean that it is the responsibility of the compiler to emit only the built-in type in the signatures.
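The encoding rule for built-in types can be sketched in a few lines of Python. The element-type values are those assigned by Partition II; the function name `encode_type`, the alias table, and the one-byte placeholder token are assumptions made for illustration and do not correspond to any real metadata API. For brevity, every type that is not built in is treated as a class.

```python
# Element-type codes used in signatures (subset; values from Partition II).
ELEMENT_TYPE = {
    "bool": 0x02,
    "char": 0x03,
    "int8": 0x04,
    "unsigned int8": 0x05,
    "int16": 0x06,
    "unsigned int16": 0x07,
    "int32": 0x08,            # ELEMENT_TYPE_I4
    "unsigned int32": 0x09,
    "int64": 0x0A,
    "unsigned int64": 0x0B,
    "float32": 0x0C,
    "float64": 0x0D,
}
ELEMENT_TYPE_CLASS = 0x12

# Library names that a compiler folds into the built-in encoding (subset).
LIBRARY_ALIASES = {
    "System.Boolean": "bool",
    "System.Int32": "int32",
    "System.Int64": "int64",
}


def encode_type(name, type_token=b"\x00"):
    """Return the signature bytes for a type name.

    type_token is a hypothetical stand-in for an encoded type reference.
    """
    builtin = LIBRARY_ALIASES.get(name, name)
    if builtin in ELEMENT_TYPE:
        # Built-in types are always a single element-type byte.
        return bytes([ELEMENT_TYPE[builtin]])
    # Simplification: anything else is emitted as a class reference.
    return bytes([ELEMENT_TYPE_CLASS]) + type_token
```

Under these assumptions, `encode_type("System.Int32")` and `encode_type("int32")` produce the same single byte, which is the point of the rule: the programmer may write either name, but the signature carries only the built-in code.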

In addition to the built-in numeric types, the following types do not require a definition in code, and they must be defined by the VES:

Vectors
Arrays
Pointer types
Unmanaged pointers
Managed pointers
Method pointers

The following special types, which fall into the second category, must be defined in code, but the implementation is provided by the VES:

Enums
Delegates

13.1 Vectors

`<type> ::= ...`
	`\| <type>` `[ ]`

Vectors are single-dimension arrays with a zero lower bound. They have direct support in CIL instructions (newarr, ldelem, stelem, and ldelema; see Partition III [sections 4.19, 4.7, 4.25, and 4.8, respectively]). The CLI Framework also provides methods that deal with multi-dimensional arrays, or single-dimension arrays with a non-zero lower bound (see Partition II, section 13.2). Two vectors are the same type if their element types are the same, regardless of their actual upper bounds.

Vectors have a fixed size and element type, determined when they are created. All CIL instructions shall respect these values. That is, they shall reliably detect attempts to index beyond the end of the vector, attempts to store the incorrect type of data into an element of a vector, and attempts to take addresses of elements of a vector with an incorrect data type. See Partition III.


 Example (informative):

 Declaring a vector of Strings:

      .field string[] errorStrings

 Declaring a vector of function pointers:

      .field method instance void*(int32) [] myVec

 Create a vector of 4 strings, and store it into the field errorStrings. The four strings lie at errorStrings[0] through errorStrings[3]:

      ldc.i4.4
      newarr string
      stfld string[] CountDownForm::errorStrings

 Store the string "First" into errorStrings[0]:

      ldfld string[] CountDownForm::errorStrings
      ldc.i4.0
      ldstr "First"
      stelem
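The run-time checks required of the vector instructions (fixed size, fixed element type, index within bounds) can be modeled in Python. This is only an illustrative sketch: the class `Vector` and its method names are hypothetical, Python's `IndexError` and `TypeError` stand in for the exceptions a VES would raise, and `None` stands in for a null element.

```python
class Vector:
    """Model of a fixed-size, zero-based, single-dimension array."""

    def __init__(self, element_type, length):
        if length < 0:
            raise ValueError("length must not be negative")
        self.element_type = element_type
        # Like newarr, start with every element set to a default (null here).
        self._items = [None] * length

    def __len__(self):
        return len(self._items)

    def _check_index(self, index):
        # Reject negative indexes too; Python lists would otherwise wrap.
        if not 0 <= index < len(self._items):
            raise IndexError(f"index {index} outside 0..{len(self._items) - 1}")

    def load(self, index):
        """Analogue of ldelem."""
        self._check_index(index)
        return self._items[index]

    def store(self, index, value):
        """Analogue of stelem: checks both the index and the element type."""
        self._check_index(index)
        if value is not None and not isinstance(value, self.element_type):
            raise TypeError(f"expected {self.element_type.__name__}")
        self._items[index] = value


error_strings = Vector(str, 4)
error_strings.store(0, "First")
```

Storing at index 4, or storing an integer into this vector of strings, is rejected rather than silently accepted, which mirrors the "shall reliably detect" wording above.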

Vectors are subtypes of System.Array, an abstract class predefined by the CLI. It provides several methods that can be applied to all vectors. See the .NET Framework Standard Library Annotated Reference.

13.2 Arrays

While vectors (see Partition II, section 13.1) have direct support through CIL instructions, all other arrays are supported by the VES by creating subtypes of the abstract class System.Array (see the .NET Framework Standard Library Annotated Reference).

`<type> ::= ...`
	`\| <type>` `[` `[<bound> [,<bound>]*]` `]`

The rank of an array is the number of dimensions. The CLI does not support arrays with rank 0. The type of an array (other than a vector) shall be determined by the type of its elements and the number of dimensions.

`<bound> ::=`	Description
	`...`	Lower and upper bounds unspecified. In the case of multi-dimensional arrays, the ellipsis may be omitted.
	`\| <int32>`	Zero lower bound, <int32> upper bound.
	`\| <int32> ...`	Lower bound only specified.
	`\| <int32> ... <int32>`	Both bounds specified.

The fundamental operations provided by the CIL instruction set for vectors are provided by methods on the class created by the VES.

The VES shall provide two constructors for arrays. One takes a sequence of numbers giving the number of elements in each dimension (a lower bound of zero is assumed). The second takes twice as many arguments: a sequence of lower bounds, one for each dimension, followed by a sequence of lengths, one for each dimension (where length is the number of elements required).

ANNOTATION

The last sentence of the previous paragraph is not accurate. It states that the declaration is:

<type> [lower_bound1, lower_bound2, element_number1, element_number2]

This is not correct. Instead, it is:

<type> [lower_bound1, element_number1, lower_bound2, element_number2]

In addition to array constructors, the VES shall provide the instance methods Get, Set, and Address to access specific elements and compute their addresses. These methods take a number for each dimension, to specify the target element. In addition, Set takes an additional final argument specifying the value to store into the target element.


 Example (informative):

 Creates an array, MyArray, of strings with two dimensions, with indexes 5..10 and 3..7. Stores the string "One" into MyArray[5, 3], retrieves it, and prints it out. Then computes the address of MyArray[5, 4], stores "Test" into it, retrieves it, and prints it out.

      .assembly Test { }
      .assembly extern mscorlib { }
      .method public static void Start()
      { .maxstack 5
        .entrypoint
        .locals (class [mscorlib]System.String[,] myArray)
        ldc.i4.5          // load lower bound for dim 1
        ldc.i4.6          // load (upper bound - lower bound + 1) for dim 1
        ldc.i4.3          // load lower bound for dim 2
        ldc.i4.5          // load (upper bound - lower bound + 1) for dim 2
        newobj instance void string[,]::.ctor(int32, int32, int32, int32)
        stloc  myArray
        ldloc myArray
        ldc.i4.5
        ldc.i4.3
        ldstr "One"
        call instance void string[,]::Set(int32, int32, string)
        ldloc myArray
        ldc.i4.5
        ldc.i4.3
        call instance string string[,]::Get(int32, int32)
        call void [mscorlib]System.Console::WriteLine(string)
        ldloc myArray
        ldc.i4.5
        ldc.i4.4
        call instance string & string[,]::Address(int32, int32)
        ldstr "Test"
        stind.ref
        ldloc myArray
        ldc.i4.5
        ldc.i4.4
        call instance string string[,]::Get(int32, int32)
        call void [mscorlib]System.Console::WriteLine(string)
        ret
      }
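The second constructor form and the Get and Set methods can be modeled with a short Python sketch. It follows the argument order used in the example above (lower bound and length interleaved per dimension). The class name `BoundedArray`, its method names, and the row-major layout of the backing list are assumptions made for illustration only.

```python
class BoundedArray:
    """Model of a multi-dimensional array with per-dimension lower bounds."""

    def __init__(self, *bounds):
        # bounds = lower1, length1, lower2, length2, ... (interleaved)
        if not bounds or len(bounds) % 2 != 0:
            raise ValueError("expected a lower bound and a length per dimension")
        self.lower = bounds[0::2]
        self.lengths = bounds[1::2]
        if any(n < 0 for n in self.lengths):
            raise ValueError("lengths must not be negative")
        total = 1
        for n in self.lengths:
            total *= n
        self._cells = [None] * total  # assumed row-major storage

    def _offset(self, indexes):
        if len(indexes) != len(self.lower):
            raise TypeError("one index per dimension is required")
        offset = 0
        for i, lo, n in zip(indexes, self.lower, self.lengths):
            if not lo <= i < lo + n:
                raise IndexError(f"index {i} outside {lo}..{lo + n - 1}")
            offset = offset * n + (i - lo)
        return offset

    def get(self, *indexes):
        return self._cells[self._offset(indexes)]

    def set(self, *args):
        # The value to store is the additional final argument, as for Set.
        *indexes, value = args
        self._cells[self._offset(indexes)] = value


my_array = BoundedArray(5, 6, 3, 5)   # indexes 5..10 and 3..7
my_array.set(5, 3, "One")
```

With these bounds the valid index ranges are 5..10 and 3..7, so `my_array.get(5, 3)` returns the stored string while an index such as (11, 3) is rejected.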

The following text is informative.

While the elements of multi-dimensional arrays can be thought of as laid out in contiguous memory, arrays of arrays are different: each dimension (except the last) holds an array reference. The following picture illustrates the difference:

[Figure: a rectangular array (left) and an array of arrays (right)]

On the left is a [6, 10] rectangular array. On the right is not one, but a total of five arrays. The vertical array is an array of arrays, and references the four horizontal arrays. Note how the first and second elements of the vertical array both reference the same horizontal array.

Note that all dimensions of a multi-dimensional array shall be of the same size. But in an array of arrays, it is possible to reference arrays of different sizes. For example, the figure on the right shows the vertical array referencing arrays of lengths 8, 8, 3, null, 6, and 1.

There is no special support for these so-called "jagged arrays" in either the CIL instruction set or the VES. They are simply vectors whose elements are themselves either the base elements or (recursively) jagged arrays.
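The difference between the two layouts can be illustrated with Python lists. This is a sketch under stated assumptions: the flat list and the helper `rect_get` model a rectangular array as one contiguous block in row-major order, and the jagged structure reuses the lengths quoted above (8, 8, 3, null, 6, and 1), with the first two elements referencing the same row.

```python
ROWS, COLS = 6, 10

# Rectangular: a single object; every row has the same length.
rectangular = [0] * (ROWS * COLS)


def rect_get(row, col):
    """Index into the single contiguous block (assumed row-major)."""
    return rectangular[row * COLS + col]


# Jagged: a vector whose elements are references to other vectors (or null).
shared_row = [0] * 8
jagged = [shared_row, shared_row, [0] * 3, None, [0] * 6, [0] * 1]

# Row lengths can differ, and an element may reference no array at all.
lengths = [None if row is None else len(row) for row in jagged]

# A write through the first element is visible through the second,
# because both reference the same row object.
jagged[0][2] = 7
```

Counting distinct row objects gives four horizontal arrays plus the vertical one, five arrays in total, whereas the rectangular array is a single object.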

End of informative text

13.3 Enums

An enum, short for "enumeration," defines a set of symbols that all have the same type. A type shall be an enum if and only if it has an immediate base type of System.Enum. Since System.Enum itself has an immediate base type of System.ValueType (see the .NET Framework Standard Library Annotated Reference), enums are value types (see Partition II, section 12). The symbols of an enum are represented by an underlying type: one of { bool, char, int8, unsigned int8, int16, unsigned int16, int32, unsigned int32, int64, unsigned int64, float32, float64, native int, unsigned native int }.

NOTE

The CLI does not provide a guarantee that values of the enum type are integers corresponding to one of the symbols (unlike Pascal). In fact, the CLS (see Partition I, section 11, Collected CLS Rules) defines a convention for using enums to represent bit flags, which can be combined to form integral values that are not named by the enum type itself.
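The bit-flag convention can be illustrated with Python's standard `enum.IntFlag`, used here purely as an analogy to a CLI enum. The enum `FileAccess` and its members are hypothetical names chosen for this sketch.

```python
from enum import IntFlag


class FileAccess(IntFlag):  # hypothetical enum, for illustration only
    READ = 1
    WRITE = 2
    EXECUTE = 4


# Combining two flags yields a value that no single symbol names.
combined = FileAccess.READ | FileAccess.WRITE
named_values = {member.value for member in FileAccess}
is_named = int(combined) in named_values
```

Here `int(combined)` is 3 while the named values are 1, 2, and 4, so `is_named` is false: the combined value is a legitimate value of the enum type even though no symbol corresponds to it.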

Enums obey additional restrictions beyond those on other value types. Enums shall contain only fields as members (they shall not even define type initializers or instance constructors); they shall not implement any interfaces; they shall have auto field layout (see Partition II, section 9.1.2); they shall have exactly one instance field, and it shall be of the underlying type of the enum; all other fields shall be static and literal (see Partition II, section 15.1); and they shall not be initialized with the initobj instruction.

RATIONALE

These restrictions allow a very efficient implementation of enums.

The single, required, instance field stores the value of an instance of the enum. The static literal fields of an enum declare the mapping of the symbols of the enum to the underlying values. All of these fields shall have the type of the enum and shall have field init metadata that assigns them a value (see Partition II, section 15.2).

For binding purposes (e.g., for locating a method definition from the method reference used to call it), enums shall be distinct from their underlying type. For all other purposes, including verification and execution of code, an unboxed enum freely interconverts with its underlying type. Enums can be boxed (see Partition II, section 12) to a corresponding boxed instance type, but this type is not the same as the boxed type of the underlying type, so boxing does not lose the original type of the enum.


 Example (informative):

 Declare an enum type, then create a local variable of that type. Store a constant of the underlying type into the enum (showing automatic coercion from the underlying type to the enum type). Load the enum back and print it as the underlying type (showing automatic coercion back). Finally, load the address of the enum and extract the contents of the instance field and print that out as well.

      .assembly Test { }
      .assembly extern mscorlib { }
      .class sealed public ErrorCodes extends [mscorlib]System.Enum
      { .field public unsigned int8 MyValue
        .field public static literal valuetype ErrorCodes no_error = int8(0)
        .field public static literal valuetype ErrorCodes format_error = int8(1)
        .field public static literal valuetype ErrorCodes overflow_error = int8(2)
        .field public static literal valuetype ErrorCodes nonpositive_error = int8(3)
      }
      .method public static void Start()
      { .maxstack 5
        .entrypoint
        .locals init (valuetype ErrorCodes errorCode)
        ldc.i4.1           // load 1 (= format_error)
        stloc errorCode    // store in local, note conversion to enum
        ldloc errorCode
        call void [mscorlib]System.Console::WriteLine(int32)
        ldloca errorCode   // address of enum
        ldfld unsigned int8 valuetype ErrorCodes::MyValue
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
      }

13.4 Pointer Types

`<type> ::= ...`		Section in Partition II
	`\| <type>` `&`	13.4.2
	`\| <type>` `*`	13.4.1

A pointer type shall be defined by specifying a signature that includes the type for the location it points at. A pointer may be managed (reported to the CLI garbage collector, denoted by &; see Partition II, section 13.4.2) or unmanaged (not reported, denoted by *; see Partition II, section 13.4.1).

Pointers may contain the address of a field (of an object or value type) or an element of an array. Pointers differ from object references in that they do not point to an entire type instance, but rather to the interior of an instance. The CLI provides two typesafe operations on pointers:

Loading the value from the location referenced by the pointer
Storing an assignment compatible value into the location referenced by the pointer

For pointers into the same array or object (see Partition I, section 8.9.2), the following arithmetic operations are supported:

Adding an integer value to a pointer, where that value is interpreted as a number of bytes, results in a pointer of the same kind.
Subtracting an integer value (number of bytes) from a pointer results in a pointer of the same kind. Note that subtracting a pointer from an integer value is not permitted.
Two pointers, regardless of kind, can be subtracted from one another, producing an integer value that specifies the number of bytes between the addresses they reference.
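The byte-based arithmetic described above can be demonstrated with raw addresses through Python's standard `ctypes` module. This sketch models unmanaged addresses into a single array only. It is an analogy rather than CLI behavior, and the variable names are chosen for illustration.

```python
import ctypes

# A small array of 32-bit integers; all pointers below stay inside it.
values = (ctypes.c_int32 * 4)(10, 20, 30, 40)
element_size = ctypes.sizeof(ctypes.c_int32)   # 4 bytes

base = ctypes.addressof(values)                # address of element 0
third = base + 2 * element_size                # pointer + byte count -> pointer

# Load the value from the location referenced by the pointer.
loaded = ctypes.c_int32.from_address(third).value

# Store a compatible value into the location referenced by the pointer.
ctypes.c_int32.from_address(third).value = 99

# Subtracting two pointers yields the number of bytes between them.
distance = third - base
```

Note that the integer added to the pointer is a number of bytes, not a number of elements: reaching the third element requires adding 2 times the element size.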

The following is informative text.

Pointers are compatible with unsigned int32 on 32-bit architectures, and with unsigned int64 on 64-bit architectures. They are best considered as unsigned native int, whose size varies depending upon the runtime machine architecture.

The CIL instruction set (see Partition III) contains instructions to compute addresses of fields, local variables, arguments, and elements of vectors [and arrays]:

Instruction	Description
`ldarga`	Load address of argument.
`ldelema`	Load address of vector element.
`ldflda`	Load address of field.
`ldloca`	Load address of local variable.
`ldsflda`	Load address of static field.

Once a pointer is loaded onto the stack, the ldind class of instructions may be used to load the data item to which it points. Similarly, the stind class of instructions can be used to store data into the location.

Note that the CLI will throw an InvalidOperationException for an ldflda instruction if the address is not within the current application domain. This situation arises typically only from the use of objects with a base type of System.MarshalByRefObject (see the .NET Framework Standard Library Annotated Reference).

13.4.1 Unmanaged Pointers

Unmanaged pointers (*) are the traditional pointers used in languages like C and C++. There are no restrictions on their use, although for the most part they result in code that cannot be verified. While it is perfectly legal to mark locations that contain unmanaged pointers as though they were unsigned integers (and this is, in fact, how they are treated by the VES), it is often better to mark them as unmanaged pointers to a specific type of data. This is done by using * in a signature for a return value, local variable, or an argument or by using a pointer type for a field or array element.

Unmanaged pointers are not reported to the garbage collector and can be used in any way that an integer can be used.
Verifiable code cannot dereference unmanaged pointers.
Unverified code can pass an unmanaged pointer to a method that expects a managed pointer. This is safe only if one of the following is true:
1. The unmanaged pointer refers to memory that is not in memory used by the CLI for storing instances of objects ("garbage-collected memory" or "managed memory").
2. The unmanaged pointer contains the address of a field within an object.
3. The unmanaged pointer contains the address of an element within an array.
4. The unmanaged pointer contains the address where the element following the last element in an array would be located.

13.4.2 Managed Pointers

Managed pointers (&) may point to an instance of a value type, a field of an object, a field of a value type, an element of an array, or the address where an element just past the end of an array would be stored (for pointer indexes into managed arrays). Managed pointers cannot be null, and they shall be reported to the garbage collector even if they do not point to managed memory.

Managed pointers are specified by using & in a signature for a return value, local variable, or an argument, or by using a by-ref type for a field or array element.

Managed pointers can be passed as arguments, stored in local variables, and returned as values.
If a parameter is passed by reference, the corresponding argument is a managed pointer.
Managed pointers cannot be stored in static variables, array elements, or fields of objects or value types.
Managed pointers are not interchangeable with object references.
A managed pointer cannot point to another managed pointer, but it can point to an object reference or a value type.
A managed pointer can point to a local variable or a method argument.
Managed pointers that do not point to managed memory can be converted (using conv.u or conv.ovf.u) into unmanaged pointers, but this is not verifiable.
- Unverified code that erroneously converts a managed pointer into an unmanaged pointer can seriously compromise the integrity of the CLI. See Partition III, section 1.1.4.2 (Managed Pointers (type &)) for more details [see also Partition I, section 8.9.2].

End informative text

13.5 Method Pointers

`<type> ::= ...`
	`\|` `method` `<callConv> <type>` `* (` `<parameters>` `)`

Variables of type method pointer shall store the address of the entry point to a method with compatible signature. A pointer to a static or instance method is obtained with the ldftn instruction, while a pointer to a virtual method is obtained with the ldvirtftn instruction. A method may be called by using a method pointer with the calli instruction. See Partition III for the specification of these instructions.

NOTE

Like other pointers, method pointers are compatible with unsigned int64 on 64-bit architectures [and] with unsigned int32 on 32-bit architectures. The preferred usage, however, is unsigned native int, which works on both 32- and 64-bit architectures.


 Example (informative):

 Call a method using a pointer. The method MakeDecision::Decide returns a method pointer to either AddOne or Negate, alternating on each call. The main program calls MakeDecision::Decide three times and after each call uses a CALLI instruction to call the method specified. The output printed is "-1 2 -1", indicating successful alternating calls.

      .assembly Test { }
      .assembly extern mscorlib { }
      .method public static int32 AddOne(int32 Input)
      { .maxstack 5
        ldarg Input
        ldc.i4.1
        add
        ret
      }
      .method public static int32 Negate(int32 Input)
      { .maxstack 5
        ldarg Input
        neg
        ret
      }
      .class value sealed public MakeDecision extends [mscorlib]System.ValueType
      { .field static bool Oscillate
        .method public static method int32 *(int32) Decide()
        { ldsfld bool valuetype MakeDecision::Oscillate
          dup
          not
          stsfld bool valuetype MakeDecision::Oscillate
          brfalse NegateIt
          ldftn int32 AddOne(int32)
          ret
        NegateIt:
          ldftn int32 Negate(int32)
          ret
        }
      }
      .method public static void Start()
      { .maxstack 2
        .entrypoint
        ldc.i4.1
        call method int32 *(int32) valuetype MakeDecision::Decide()
        calli int32(int32)
        call  void [mscorlib]System.Console::WriteLine(int32)
        ldc.i4.1
        call method int32 *(int32) valuetype MakeDecision::Decide()
        calli int32(int32)
        call  void [mscorlib]System.Console::WriteLine(int32)
        ldc.i4.1
        call method int32 *(int32) valuetype MakeDecision::Decide()
        calli int32(int32)
        call  void [mscorlib]System.Console::WriteLine(int32)
        ret
      }
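The control flow of this example can be traced with a Python analogue in which function objects stand in for method pointers and an ordinary call stands in for calli. The names mirror the CIL example, but the code is only a sketch: it uses a logical negation of the flag where the CIL uses the bitwise not instruction.

```python
def add_one(value):
    return value + 1


def negate(value):
    return -value


class MakeDecision:
    oscillate = False   # the static field; starts out false

    @classmethod
    def decide(cls):
        """Return a reference to add_one or negate, alternating per call."""
        previous = cls.oscillate
        cls.oscillate = not previous
        # Like brfalse NegateIt: a false flag selects negate.
        return add_one if previous else negate


# Obtain a function reference three times and call through it each time.
outputs = [MakeDecision.decide()(1) for _ in range(3)]   # -> [-1, 2, -1]
```

The first call sees the flag as false and selects negate, the second selects add_one, and the third selects negate again.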

13.6 Delegates

Delegates (see Partition I, section 8.9.3) are the object-oriented equivalent of function pointers. Unlike function pointers, delegates are object-oriented, typesafe, and secure. Delegates are reference types and are declared in the form of classes. Delegates shall have an immediate base type of System.MulticastDelegate, which in turn has an immediate base type of System.Delegate (see the .NET Framework Standard Library Annotated Reference).

Delegates shall be declared sealed, and the only members a delegate shall have are either two or four methods as specified here. These methods shall be declared runtime and managed (see Partition II, section 14.4.3). They shall not have a body, since it shall be automatically created by the VES. Other methods available on delegates are inherited from the classes System.Delegate and System.MulticastDelegate in the Base Class Library (see the .NET Framework Standard Library Annotated Reference).

RATIONALE

A better design would be to simply have delegate classes derive directly from System.Delegate. Unfortunately, backward compatibility with an existing CLI does not permit this design.

The instance constructor (named .ctor and marked specialname and rtspecialname; see Partition II, section 9.5.1) shall take exactly two parameters. The first parameter shall be of type System.Object, and the second parameter shall be of type System.IntPtr. When actually called (via a newobj instruction; see Partition III), the first argument shall be an instance of the class (or one of its subclasses) that defines the target method, and the second argument shall be a method pointer to the method to be called.

The Invoke method shall be virtual and have the same signature (return type, parameter types, calling convention, and modifiers; see Partition II, section 7.1) as the target method. When actually called, the arguments passed shall match the types specified in this signature.

The BeginInvoke method (see Partition II, section 13.6.2.1), if present, shall be virtual and have a signature related to, but not the same as, that of the Invoke method. There are two differences in the signature. First, the return type shall be System.IAsyncResult (see the .NET Framework Standard Library Annotated Reference). Second, there shall be two additional parameters that follow those of Invoke: the first of type System.AsyncCallback and the second of type System.Object.

The EndInvoke method (see Partition II, section 13.6.2.2) shall be virtual and have the same return type as the Invoke method. It shall take as parameters exactly those parameters of Invoke that are managed pointers, in the same order they occur in the signature for Invoke. In addition, there shall be an additional parameter of type System.IAsyncResult.


 Example (informative):

 The following example declares a delegate used to call functions that take a single integer and return void. It provides all four methods so it can be called either synchronously or asynchronously. Because there are no parameters that are passed by reference (i.e., as managed pointers), there are no additional arguments to EndInvoke.

      .assembly Test { }
      .assembly extern mscorlib { }
      .class private sealed StartStopEventHandler
             extends [mscorlib]System.MulticastDelegate
      { .method public specialname rtspecialname instance
                void .ctor(object Instance, native int Method)
                runtime managed {}
        .method public virtual void Invoke(int32 action) runtime managed {}
        .method public virtual
                class [mscorlib]System.IAsyncResult
                BeginInvoke(int32 action,
                            class [mscorlib]System.AsyncCallback callback,
                            object Instance) runtime managed {}
        .method public virtual
                void EndInvoke(class [mscorlib]System.IAsyncResult result)
                runtime managed {}
      }

As with any class, an instance is created using the newobj instruction in conjunction with the instance constructor. The first argument to the constructor shall be the object on which the method is to be called, or it shall be null if the method is a static method. The second argument shall be a method pointer to a method on the corresponding class and with a signature that matches that of the delegate class being instantiated.
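The two constructor arguments (a target object, or null for a static method, and a method pointer) can be modeled with a small Python class. Everything here is a hypothetical sketch: `Delegate`, `Listener`, and `static_handler` are illustrative names, plain function objects stand in for method pointers, and no signature checking is performed.

```python
class Delegate:
    """Model of a delegate: a target object paired with a method pointer."""

    def __init__(self, target, method):
        self._target = target   # None when the method is static
        self._method = method   # function object standing in for a method pointer

    def invoke(self, *args):
        """Synchronous call: runs on the caller's thread and blocks until done."""
        if self._target is None:
            return self._method(*args)
        # For an instance method the target is passed as the this pointer.
        return self._method(self._target, *args)


class Listener:
    def __init__(self, data):
        self.my_data = data
        self.log = []

    def on_start_stop(self, action):
        self.log.append(action)


def static_handler(action):
    return action * 2


instance_one = Listener(1)
delegate_one = Delegate(instance_one, Listener.on_start_stop)
delegate_one.invoke(100)                      # calls instance_one.on_start_stop(100)

static_delegate = Delegate(None, static_handler)
```

Passing `None` as the first argument corresponds to the null target required for static methods; `static_delegate.invoke(21)` then calls `static_handler(21)` directly.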

ANNOTATION

Implementation-Specific (Microsoft): The Microsoft implementation of the CLI allows programmers to add more methods to a delegate, on the condition that they provide an implementation for those methods (i.e., the methods cannot be marked runtime). Note that such use makes the resulting assembly non-portable.

13.6.1 Synchronous Calls to Delegates

The synchronous mode of calling delegates corresponds to regular method calls and is performed by calling the virtual method named Invoke on the delegate. The delegate itself is the first argument to this call (it serves as the this pointer), followed by the other arguments as specified in the signature. When this call is made, the caller shall block until the called method returns. The called method shall be executed on the same thread as the caller.


 Example (informative):

 Continuing the previous example, define a class Test that declares a method, onStartStop, appropriate for use as the target for the delegate.

      .class public Test
      { .field public int32 MyData
        .method public void onStartStop(int32 action)
        { ret        // put your code here
        }
        .method public specialname rtspecialname
                instance void .ctor(int32 Data)
        { ret        // call parent constructor, store state, etc.
        }
      }

 Then define a main program. This one constructs an instance of Test and then a delegate that targets the onStartStop method of that instance. Finally, call the delegate.

      .method public static void Start()
      { .maxstack 3
        .entrypoint
        .locals (class StartStopEventHandler DelegateOne,
                 class Test InstanceOne)
        // Create instance of Test class
        ldc.i4.1
        newobj instance void Test::.ctor(int32)
        stloc InstanceOne
        // Create delegate to onStartStop method of that class
        ldloc InstanceOne
        ldftn instance void Test::onStartStop(int32)
        newobj void StartStopEventHandler::.ctor(object, native int)
        stloc DelegateOne
        // Invoke the delegate, passing 100 as an argument
        ldloc DelegateOne
        ldc.i4 100
        callvirt instance void StartStopEventHandler::Invoke(int32)
        ret
      }

      // Note that the example above creates a delegate to a non-virtual
      // function. If onStartStop had instead been a virtual function, use
      // the following code sequence instead:
        ldloc InstanceOne
        dup
        ldvirtftn instance void Test::onStartStop(int32)
        newobj void StartStopEventHandler::.ctor(object, native int)
        stloc DelegateOne
        // Invoke the delegate, passing 100 as an argument
        ldloc DelegateOne

NOTE

The code sequence above shall use dup, not ldloc InstanceOne twice. The dup code sequence is easily recognized as typesafe, whereas alternatives would require more complex analysis. Verifiability of code is discussed in Partition III, section 1.8.

13.6.2 Asynchronous Calls to Delegates

In the asynchronous mode, the call is dispatched, and the caller shall continue execution without waiting for the method to return. The called method shall be executed on a separate thread.

To call delegates asynchronously, the BeginInvoke and EndInvoke methods are used.

NOTE

If the caller thread terminates before the callee completes, the callee thread is unaffected. The callee thread continues execution and terminates silently.

NOTE

The callee may throw exceptions. Any unhandled exception propagates to the caller via the EndInvoke method.

13.6.2.1 The BeginInvoke Method

An asynchronous call to a delegate shall begin by making a virtual call to the BeginInvoke method. BeginInvoke is similar to the Invoke method (see Partition II, section 13.6.1), but has two differences:

It has two additional parameters, appended to the list, of type System.AsyncCallback and System.Object
The return type of the method is System.IAsyncResult

Although the BeginInvoke method therefore includes parameters that represent return values, these values are not updated by this method. The results instead are obtained from the EndInvoke method (see below).

Unlike a synchronous call, an asynchronous call shall provide a way for the caller to determine when the call has been completed. The CLI provides two such mechanisms. The first is through the result returned from the call. This object, an instance of the interface System.IAsyncResult, can be used to wait for the result to be computed, can be queried for the current status of the method call, and contains the System.Object value that was passed to the call to BeginInvoke. See the .NET Framework Standard Library Annotated Reference.

The second mechanism is through the System.AsyncCallback delegate passed to BeginInvoke. The VES shall call this delegate when the value is computed or an exception has been raised indicating that the result will not be available. The value passed to this callback is the same value passed to the call to BeginInvoke. A value of null may be passed for System.AsyncCallback to indicate that the VES need not provide the callback.

RATIONALE

This model supports both a polling approach (by checking the status of the returned System.IAsyncResult) and an event-driven approach (by supplying a System.AsyncCallback) to asynchronous calls.

A synchronous call returns information both through its return value and through output parameters. Output parameters are represented in the CLI as parameters with managed pointer type. Neither the returned value nor the values of the output parameters are available until the VES signals that the asynchronous call has completed successfully. They are retrieved by calling the EndInvoke method on the delegate that began the asynchronous call.

13.6.2.2 The EndInvoke Method

The EndInvoke method can be called at any time after BeginInvoke. It shall suspend the thread that calls it until the asynchronous call completes. If the call completes successfully, EndInvoke will return the value that would have been returned, had the call been made synchronously, and its managed pointer arguments will point to values that would have been returned to the out parameters of the synchronous call.

EndInvoke requires as parameters the value returned by the originating call to BeginInvoke (so that different calls to the same delegate can be distinguished, since they may execute concurrently), as well as any managed pointers that were passed as arguments (so their return values can be provided).
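The BeginInvoke and EndInvoke protocol can be modeled with Python threads. This is a sketch under stated assumptions, not the VES implementation: `AsyncResult` stands in for System.IAsyncResult, `AsyncDelegate` and its method names are hypothetical, each asynchronous call simply gets its own thread, and output (managed pointer) parameters are not modeled.

```python
import threading


class AsyncResult:
    """Stand-in for System.IAsyncResult."""

    def __init__(self, async_state):
        self.async_state = async_state   # the object passed to begin_invoke
        self._done = threading.Event()
        self._value = None
        self._error = None

    @property
    def is_completed(self):
        # Polling approach: query the current status of the call.
        return self._done.is_set()


class AsyncDelegate:
    def __init__(self, method):
        self._method = method

    def begin_invoke(self, *args, callback=None, state=None):
        """Dispatch the call on a separate thread and return immediately."""
        result = AsyncResult(state)

        def run():
            try:
                result._value = self._method(*args)
            except Exception as exc:       # held until end_invoke is called
                result._error = exc
            result._done.set()
            if callback is not None:       # event-driven approach
                callback(result)

        threading.Thread(target=run).start()
        return result

    def end_invoke(self, result):
        """Block until the call completes, then return its value or re-raise."""
        result._done.wait()
        if result._error is not None:
            raise result._error
        return result._value


adder = AsyncDelegate(lambda a, b: a + b)
pending = adder.begin_invoke(2, 3, state="first call")
total = adder.end_invoke(pending)   # suspends the caller until the result is ready
```

Passing the result object back to `end_invoke` is what lets several concurrent calls on the same delegate be told apart, and an exception raised by the callee surfaces in the caller only when `end_invoke` is called.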