Chapter 19: Unsafe Code, Pointers, Nullable Types, and Miscellaneous Topics | C# 2.0: The Complete Reference (Complete Reference Series)

This chapter covers a feature of C# whose name usually takes programmers by surprise: unsafe code. Unsafe code often involves the use of pointers. Together, unsafe code and pointers enable C# to be used to create applications that one might normally associate with C++: high-performance, systems code. Moreover, the inclusion of unsafe code and pointers gives C# capabilities that are lacking in Java.

Also covered are several new features added by C# 2.0, including nullable types, partial class definitions, an additional use for extern, fixed-size buffers, and friend assemblies. The chapter concludes by discussing the few keywords that have not been covered by the preceding chapters.

Unsafe Code

C# allows you to write what is called “unsafe” code. While this statement might seem shocking, it really isn’t. Unsafe code is not code that is poorly written; it is code that does not execute under the full management of the Common Language Runtime (CLR). As explained in Chapter 1, C# is normally used to create managed code, whose type safety the CLR can verify. Unsafe code is still compiled to intermediate language and still runs under the CLR, but it can manipulate memory directly, so it is not subject to the same controls and constraints. It is called “unsafe” because it is not possible to verify that it won’t perform some type of harmful action. Thus, the term unsafe does not mean that the code is inherently flawed. It just means that the code can perform actions that are not subject to the supervision of the managed context.

Given that unsafe code might cause problems, you might ask why anyone would want to create such code. The answer is that managed code prevents the use of pointers. If you are familiar with C or C++, then you know that pointers are variables that hold the addresses of other objects. Thus, pointers are a bit like references in C#. The main difference is that a pointer can point anywhere in memory; a reference always points to an object of its type. Since a pointer can point anywhere in memory, it is possible to misuse a pointer. It is also easy to introduce a coding error when using pointers. This is why C# does not support pointers when creating managed code. Pointers are, however, both useful and necessary for some types of programming (such as system-level utilities), and C# does allow you to create and use pointers. All pointer operations must be marked as unsafe, since they execute outside the managed context.

The declaration and use of pointers in C# parallels that of C/C++. If you know how to use pointers in C/C++, then you can use them in C#. But remember, the point of C# is the creation of managed code. Its ability to support unsafe code allows it to be applied to a special class of problems. It is not for normal C# programming. In fact, to compile unsafe code, you must use the /unsafe compiler option.

Since pointers are at the core of unsafe code, we will begin there.

Pointer Basics

Pointers are variables that hold the addresses of other variables. For example, if x contains the address of y, then x is said to “point to” y. Once a pointer points to a variable, the value of that variable can be obtained or changed through the pointer. Operations through pointers are often referred to as indirection.

Declaring a Pointer

Pointer variables must be declared as such. The general form of a pointer variable declaration is

 type* var-name;

Here, type is the pointer’s base type, which must be a nonreference type. Thus, you cannot declare a pointer to a class object. A pointer’s base type is also referred to as its referent type. Notice the placement of the *. It follows the type name. Var-name is the name of the pointer variable.

Here is an example. To declare ip to be a pointer to an int, use this declaration:

 int* ip;

For a float pointer, use

 float* fp;

In general, in a declaration statement, following a type name with an * creates a pointer type.

The type of data that a pointer will point to is determined by its base type. Thus, in the preceding examples, ip can be used to point to an int, and fp can be used to point to a float. Understand, however, that there is nothing that actually prevents a pointer from pointing elsewhere. This is why pointers are potentially unsafe.

If you come from a C/C++ background, then you need to be aware of an important difference between the way C# and C/C++ declare pointers. When you declare a pointer type in C/C++, the * is not distributive over a list of variables in a declaration. Thus, in C/C++, this statement:

 int* p, q;

declares an int pointer called p and an int called q. It is equivalent to the following two declarations:

 int* p;
 int q;

However, in C#, the * is distributive and the declaration

 int* p, q;

creates two pointer variables. Thus, in C# it is the same as these two declarations:

 int* p;
 int* q;

This is an important difference to keep in mind when porting C/C++ code to C#.
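As a brief sketch of the distributive rule, the following program declares two pointers in a single statement and then reads a value through each. The class name PtrDeclDemo, the method SumViaPointers( ), and the variable names are illustrative and are not part of the original text. Like all pointer code, it assumes compilation with the /unsafe option.

```csharp
// Sketch: in C#, the * applies to every variable in the declaration.
using System;

class PtrDeclDemo {
  // Returns the sum of two ints read through pointers declared together.
  unsafe public static int SumViaPointers() {
    int a = 1, b = 2;
    int* p, q; // in C#, both p and q are int pointers

    p = &a;
    q = &b; // this would not compile if q were a plain int

    return *p + *q;
  }

  public static void Main() {
    Console.WriteLine("Sum is " + SumViaPointers());
  }
}
```

If the same declaration were compiled as C/C++, q would be an ordinary int and the assignment of &b to it would be rejected.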

The * and & Pointer Operators

Two operators are used with pointers: * and &. The & is a unary operator that returns the memory address of its operand. (Recall that a unary operator requires only one operand.) For example:

 int* ip;
 int num = 10;
 ip = &num;

puts into ip the memory address of the variable num. This address is the location of the variable in the computer’s internal memory. It has nothing to do with the value of num. Thus, ip does not contain the value 10 (num’s initial value). It contains the address at which num is stored. The operation of & can be remembered as returning “the address of” the variable it precedes. Therefore, the preceding assignment statement could be verbalized as “ip receives the address of num.”

The second operator is *, and it is the complement of &. It is a unary operator that yields the value of the variable located at the address specified by its operand. That is, it yields the value of the variable pointed to by a pointer. Continuing with the same example, if ip contains the memory address of the variable num, then

 int val = *ip;

will place into val the value 10, which is the value of num, which is pointed to by ip. The operation of * can be remembered as “at address.” In this case, then, the statement could be read as “val receives the value at address ip.”

The * can also be used on the left side of an assignment statement. In this usage, it sets the value pointed to by the pointer. For example:

 *ip = 100;

This statement assigns 100 to the variable pointed to by ip, which is num in this case. Thus, this statement can be read as “at address ip, put the value 100.”

Using unsafe

Any code that uses pointers must be marked as unsafe by using the unsafe keyword. You can mark an individual statement or an entire method unsafe. For example, here is a program that uses pointers inside Main( ), which is marked unsafe:

 // Demonstrate pointers and unsafe.
 using System;

 class UnsafeCode {
   // Mark Main as unsafe.
   unsafe public static void Main() {
     int count = 99;
     int* p; // create an int pointer

     p = &count; // put address of count into p

     Console.WriteLine("Initial value of count is " + *p);

     *p = 10; // assign 10 to count via p

     Console.WriteLine("New value of count is " + *p);
   }
 }

The output of this program is shown here:

 Initial value of count is 99
 New value of count is 10
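The unsafe keyword can also mark a block of statements inside an otherwise ordinary method. The following sketch shows one way this might look. The names UnsafeBlockDemo and DoubleViaPointer( ) are hypothetical, chosen only for illustration.

```csharp
// Sketch: mark only a block, rather than the whole method, as unsafe.
using System;

class UnsafeBlockDemo {
  // An ordinary method that contains an unsafe block.
  public static int DoubleViaPointer(int value) {
    unsafe {
      int* p = &value; // pointers are allowed only inside this block
      *p = *p * 2;     // double value through the pointer
    }
    return value;
  }

  public static void Main() {
    Console.WriteLine("Doubled value is " + DoubleViaPointer(21));
  }
}
```

Limiting the unsafe region to a block keeps the pointer code confined to the few statements that need it. The program must still be compiled with /unsafe.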

Using fixed

The fixed modifier is often used when working with pointers. It prevents a managed variable from being moved by the garbage collector. This is needed when a pointer refers to a field in a class object, for example. Since the pointer has no knowledge of the actions of the garbage collector, if the object is moved, the pointer will point to the wrong object. Here is the general form of fixed:

 fixed (type* p = &var) {
     // use fixed object
 }

Here, p is a pointer that is being assigned the address of a variable. The object will remain at its current memory location until the block of code has executed. You can also use a single statement for the target of a fixed statement. The fixed keyword can be used only in an unsafe context. You can declare more than one fixed pointer at a time using a comma-separated list.

Here is an example of fixed:

 // Demonstrate fixed.
 using System;

 class Test {
   public int num;
   public Test(int i) { num = i; }
 }

 class FixedCode {
   // Mark Main as unsafe.
   unsafe public static void Main() {
     Test o = new Test(19);

     fixed (int* p = &o.num) { // use fixed to put address of o.num into p

       Console.WriteLine("Initial value of o.num is " + *p);
       //...

       *p = 10; // assign to o.num via p

       Console.WriteLine("New value of o.num is " + *p);
       //...
     }
   }
 }

The output from this program is shown here:

 Initial value of o.num is 19
 New value of o.num is 10

Here, fixed prevents o from being moved. Because p points to o.num, if o were moved, then p would point to an invalid location.
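The comma-separated form of fixed mentioned earlier can be sketched as follows. Here, two pointers of the same type are pinned to two fields of one object and used to swap the field values. The Pair class and the Swap( ) method are assumptions made for this illustration only.

```csharp
// Sketch: declare two fixed pointers in one fixed statement.
using System;

class Pair {
  public int first;
  public int second;
  public Pair(int a, int b) { first = a; second = b; }
}

class FixedListDemo {
  // Swaps the two fields of a Pair through pointers.
  unsafe public static void Swap(Pair o) {
    // Both pointers have the type int*, so one fixed statement suffices.
    fixed (int* p = &o.first, q = &o.second) {
      int t = *p;
      *p = *q;
      *q = t;
    }
  }

  public static void Main() {
    Pair o = new Pair(1, 2);
    Swap(o);
    Console.WriteLine(o.first + " " + o.second);
  }
}
```

Both pointers in the list must share the same pointer type. Pointers of different types require separate (possibly nested) fixed statements.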

Accessing Structure Members Through a Pointer

A pointer can point to an object of a structure type as long as the structure does not contain reference types. When you access a member of a structure through a pointer, you must use the arrow operator, which is ->, rather than the dot (.) operator. For example, given this structure:

 struct MyStruct {
   public int a;
   public int b;
   public int sum() { return a + b; }
 }

you would access its members through a pointer like this:

 MyStruct o = new MyStruct();
 MyStruct* p; // declare a pointer

 p = &o;
 p->a = 10; // use the -> operator
 p->b = 20; // use the -> operator

 Console.WriteLine("Sum is " + p->sum());
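For readers who want a complete program to experiment with, here is a minimal sketch built along the same lines. It uses a separate, hypothetical structure called Dims (not from the original text) and also shows that p->member is shorthand for dereferencing the pointer and then applying the dot operator.

```csharp
// Sketch: access structure members through a pointer.
using System;

struct Dims {
  public int width;
  public int height;
  public int Area() { return width * height; }
}

class StructPtrDemo {
  // Sets both fields through a pointer and returns the computed area.
  unsafe public static int AreaViaPointer(int w, int h) {
    Dims d = new Dims();
    Dims* p = &d; // d is a local variable, so fixed is not required

    p->width = w;    // arrow form
    (*p).height = h; // equivalent form: dereference, then use the dot operator

    return p->Area();
  }

  public static void Main() {
    Console.WriteLine("Area is " + AreaViaPointer(3, 4));
  }
}
```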

Pointer Arithmetic

There are only four arithmetic operators that can be used on pointers: ++, --, +, and -. To understand what occurs in pointer arithmetic, we will begin with an example. Let p1 be an int pointer with a current value of 2,000 (that is, it contains the address 2,000). After this expression:

 p1++;

the contents of p1 will be 2,004, not 2,001! The reason is that each time p1 is incremented, it will point to the next int. Since int in C# is 4 bytes long, incrementing p1 increases its value by 4. The reverse is true of decrements. Each decrement decreases p1’s value by 4. For example:

 p1--;

will cause p1 to have the value 1,996, assuming that it previously was 2,000.

Generalizing from the preceding example, each time that a pointer is incremented, it will point to the memory location of the next element of its base type. Each time it is decremented, it will point to the location of the previous element of its base type.

Pointer arithmetic is not limited to only increment and decrement operations. You can also add or subtract integers to or from pointers. The expression

 p1 = p1 + 9;

makes p1 point to the ninth element of p1’s base type, beyond the one it is currently pointing to.

Although you cannot add pointers, you can subtract one pointer from another (provided they are both of the same base type). The difference will be the number of elements of the base type that separate the two pointers.

Other than addition and subtraction of a pointer and an integer, or the subtraction of two pointers, no other arithmetic operations can be performed on pointers. For example, you cannot add or subtract float or double values to or from pointers.
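The following sketch illustrates both adding an integer to a pointer and subtracting two pointers. It pins an array with fixed, offsets a pointer by a number of elements, and then measures the distance back to the start, first in elements and then in bytes. The class and method names are hypothetical, and the array length of 10 is an arbitrary choice for the illustration.

```csharp
// Sketch: pointer addition and pointer subtraction.
using System;

class PtrDiffDemo {
  // Distance, in int elements, between element 0 and element 'index'.
  unsafe public static long ElementDistance(int index) {
    int[] nums = new int[10];

    fixed (int* start = &nums[0]) {
      int* p1 = start + index; // advance by 'index' ints
      return p1 - start;       // pointer subtraction yields a long
    }
  }

  // The same distance measured in bytes, obtained by casting to byte*.
  unsafe public static long ByteDistance(int index) {
    int[] nums = new int[10];

    fixed (int* start = &nums[0]) {
      int* p1 = start + index;
      return (byte*)p1 - (byte*)start;
    }
  }

  public static void Main() {
    Console.WriteLine("Elements apart: " + ElementDistance(9));
    Console.WriteLine("Bytes apart: " + ByteDistance(9));
  }
}
```

Because an int occupies 4 bytes, a separation of 9 elements corresponds to 36 bytes. Subtracting the int pointers reports the count of elements, not the count of bytes.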

To see the effects of pointer arithmetic, execute the next short program. It prints the actual physical addresses to which an integer pointer (ip) and a floating-point pointer (fp) are pointing. Observe how each changes, relative to its base type, each time the loop is repeated.

 // Demonstrate the effects of pointer arithmetic.
 using System;

 class PtrArithDemo {
   unsafe public static void Main() {
     int x;
     int i;
     double d;

     int* ip = &i;
     double* fp = &d;

     Console.WriteLine("int     double\n");

     for(x=0; x < 10; x++) {
        Console.WriteLine((uint) (ip) + " " +
                          (uint) (fp));
        ip++;
        fp++;
     }
   }
 }

Sample output is shown here. Your output may differ, but the intervals will be the same.

 int     double

 1243464 1243468
 1243468 1243476
 1243472 1243484
 1243476 1243492
 1243480 1243500
 1243484 1243508
 1243488 1243516
 1243492 1243524
 1243496 1243532
 1243500 1243540

As the output shows, pointer arithmetic is performed relative to the base type of the pointer. Since an int is 4 bytes and a double is 8 bytes, the addresses change in multiples of these values.

Pointer Comparisons

Pointers can be compared using the relational operators, such as ==, <, and >. However, for the outcome of a pointer comparison to be meaningful, usually the two pointers must have some relationship to each other. For example, if p1 and p2 are pointers that point to two separate and unrelated variables, then any comparison between p1 and p2 is generally meaningless. However, if p1 and p2 point to variables that are related to each other, such as elements of the same array, then p1 and p2 can be meaningfully compared.

Here is an example that uses pointer comparison to find the middle element of an array:

 // Demonstrate pointer comparison.
 using System;

 class PtrCompDemo {
   unsafe public static void Main() {

     int[] nums = new int[11];
     int x;

     // find the middle
     fixed (int* start = &nums[0]) {
       fixed(int* end = &nums[nums.Length-1]) {
         for(x=0; start+x <= end-x; x++) ;
       }
     }
     Console.WriteLine("Middle element is " + x);
   }
 }

Here is the output:

 Middle element is 6

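This program finds the middle element by initially setting start to the first element of the array and end to the last element of the array. Then, using pointer arithmetic, the offset x is added to start and subtracted from end on each pass through the loop. The loop stops when start+x is no longer less than or equal to end-x. For this 11-element array, the loop ends with x equal to 6, which is the position of the middle element when counting from 1.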

One other point: The pointers start and end must be created within a fixed statement because they point to elements of an array, which is a reference type. Recall that in C#, arrays are implemented as objects and might be moved by the garbage collector.
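Another common use of pointer comparison is to bound a loop that walks an array. The sketch below, whose class and method names are hypothetical, copies the fixed pointer into an ordinary pointer variable (a pointer declared by fixed cannot itself be modified) and advances it until it passes the last element.

```csharp
// Sketch: use a pointer comparison to control a loop over an array.
using System;

class PtrWalkDemo {
  // Sums the elements of nums by walking a pointer from start to end.
  unsafe public static int Sum(int[] nums) {
    if (nums.Length == 0) return 0; // &nums[0] is invalid for an empty array

    int total = 0;

    fixed (int* start = &nums[0]) {
      int* end = start + nums.Length - 1; // address of the last element

      // p and end point into the same array, so comparing them is meaningful.
      for (int* p = start; p <= end; p++)
        total += *p;
    }

    return total;
  }

  public static void Main() {
    int[] nums = new int[] { 1, 2, 3, 4 };
    Console.WriteLine("Sum is " + Sum(nums));
  }
}
```

Only one fixed statement is needed here because end is computed from start, and the entire array stays pinned for the duration of the block.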