The C# Type System

The .NET Framework defines a common type system that allows all languages targeting the .NET environment to interoperate with each other, and C# builds directly on this underlying type system. Earlier it was stated that everything in C# is an object. This is almost true: every type ultimately derives from object, but the primitive data types are not reference types, and the reason is performance. Objects are allocated on the heap and managed by the garbage collector (GC), which would introduce a significant amount of overhead for basic types such as int and char. For this reason, C# implements primitive types as structs, which are value types. A value type is stored inline wherever it is declared: a local variable of a value type typically lives on the stack rather than on the GC-managed heap, and its lifetime is limited to the scope in which it was declared.

Table 2.1.1 presents a listing of the available types within C#.

With the exception of the object and string types, all types listed in Table 2.1.1 are implemented as structs. The string type is special in that its implementation is actually a sealed class. A sealed class is a class that cannot be inherited from and thus terminates the inheritance chain.
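The split between struct-based and class-based types can be checked at run time through reflection. The following is a minimal sketch; the class name TypeCategories and the helper Describe are invented for illustration and are not part of the .NET class library.

```csharp
using System;

public static class TypeCategories {
    // Returns "value" for struct-based types and "reference" for class-based types.
    public static string Describe(Type t) {
        return t.IsValueType ? "value" : "reference";
    }

    public static void Main() {
        Console.WriteLine("int     : {0}", Describe(typeof(int)));       // value
        Console.WriteLine("decimal : {0}", Describe(typeof(decimal)));   // value
        Console.WriteLine("string  : {0}", Describe(typeof(string)));    // reference
        Console.WriteLine("object  : {0}", Describe(typeof(object)));    // reference
        Console.WriteLine("string sealed? {0}", typeof(string).IsSealed); // True
    }
}
```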

Table 2.1.1. C# Types

Type	Description
`object`	Base class of all objects in C#
`string`	Unicode sequence of characters
`sbyte`	8-bit signed integer
`short`	16-bit signed integer
`int`	32-bit signed integer
`long`	64-bit signed integer
`byte`	8-bit unsigned integer
`ushort`	16-bit unsigned integer
`uint`	32-bit unsigned integer
`ulong`	64-bit unsigned integer
`float`	Single-precision floating point
`double`	Double-precision floating point
`bool`	Boolean: `true` or `false`
`char`	A 16-bit Unicode character
`decimal`	128-bit decimal type with 28 to 29 significant digits
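The widths in Table 2.1.1 can be confirmed with the sizeof operator, and the ranges with the MinValue and MaxValue constants that each numeric type exposes. This is only a sketch; the class name TypeSizes and its helper methods are made up for the example.

```csharp
using System;

public static class TypeSizes {
    // sizeof yields the size in bytes and is a compile-time constant
    // for the predefined value types, so no unsafe code is needed.
    public static int BitsInInt()     { return sizeof(int) * 8; }
    public static int BitsInChar()    { return sizeof(char) * 8; }
    public static int BitsInDecimal() { return sizeof(decimal) * 8; }

    public static void Main() {
        Console.WriteLine("sbyte   : {0} bits, {1} to {2}", sizeof(sbyte) * 8, sbyte.MinValue, sbyte.MaxValue);
        Console.WriteLine("int     : {0} bits, {1} to {2}", BitsInInt(), int.MinValue, int.MaxValue);
        Console.WriteLine("ulong   : {0} bits, {1} to {2}", sizeof(ulong) * 8, ulong.MinValue, ulong.MaxValue);
        Console.WriteLine("char    : {0} bits", BitsInChar());
        Console.WriteLine("decimal : {0} bits", BitsInDecimal());
    }
}
```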

Value Types in Action

When a variable of a primitive type is declared, C# requires that it be initialized before any attempt is made to use it. In C++, the value of an uninitialized variable is undefined, and the same is true in C#. The difference is that C++ lets the variable be used anyway, with unknown results, whereas the C# compiler rejects any read of a local variable that has not been definitely assigned.
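As a rough illustration of this rule, the sketch below declares a local variable without an initializer and assigns it on every path before reading it. The class and method names are invented for the example; the commented-out line shows the kind of read the compiler refuses (error CS0165, use of unassigned local variable).

```csharp
using System;

public static class DefiniteAssignment {
    public static int Compute(bool flag) {
        int result;                     // declared, but not yet assigned

        // Console.WriteLine(result);   // would not compile: CS0165

        if (flag) {
            result = 1;
        } else {
            result = 2;
        }

        return result;                  // fine: assigned on every path
    }

    public static void Main() {
        Console.WriteLine(Compute(true));   // 1
        Console.WriteLine(Compute(false));  // 2
    }
}
```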

Declaring a variable in C# has the following syntax:

  type variable-name [= initialization];

Here, type represents the variable type, and the variable name can consist of any combination of alphanumeric characters and the underscore. However, a variable name must begin with either an underscore or a letter; it cannot begin with a digit.

The following are examples of variable declarations:

 int _999;      //valid
 int a_dog;     //valid
 int 123_go;    //invalid

NOTE

C# allows for the // and /* */ comment markers. The // denotes that all text to the right on the current line is a comment. The /* */ marker is used for multi-line comments.

C# also enforces strict type checking and assignment. This means no fudging! In C++, it was possible to declare an unsigned integer and then assign the value -1 to the variable. The C# compiler will quickly catch the assignment and produce a compiler error pointing out the invalid assignment.

 unsigned int cpp_fudge = -1;    //valid C++
 uint csharp_fudge = -1;         //error in C#

The error produced by the C# compiler states that the constant value -1 cannot be converted to an unsigned integer. Talk about strict!
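When the bit-for-bit reinterpretation that C++ performs silently is really wanted, C# makes the intent explicit with a cast inside an unchecked expression; a checked expression instead throws at run time if the value does not fit. The sketch below shows both, with class and method names that are purely illustrative.

```csharp
using System;

public static class SignedToUnsigned {
    // Reinterprets the bits of a signed value as unsigned, wrapping on purpose.
    public static uint Wrap(int value) {
        return unchecked((uint)value);
    }

    // Converts only when the value fits; throws OverflowException otherwise.
    public static uint Strict(int value) {
        return checked((uint)value);
    }

    public static void Main() {
        // uint bad = -1;             // does not compile: constant -1 cannot be converted
        Console.WriteLine(Wrap(-1));  // 4294967295

        try {
            Console.WriteLine(Strict(-1));
        } catch (OverflowException) {
            Console.WriteLine("overflow");
        }
    }
}
```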

The `struct`

In languages such as C/C++, it was not possible to create a primitive type. The primitive types were like little magic entities that just existed: their meaning known, their implementation a mystery. C# implements primitive types as simple structs. This means that it is possible for developers to create types that are treated in the same manner as the C# primitive types.

When creating a struct, it is important to keep the implementation to a bare minimum. After all, if additional functionality is required, it would be better to implement the entity as a full-blown class. Structs are often used to represent small pieces of data that generally have a restricted lifetime and tend to be inexpensive to manage in terms of memory requirements.

Unlike in C++, structs and classes in C# are two totally different entities. In C++, the only difference between a struct and a class is the default member visibility. In C#, structs and classes seem to share many similarities on the surface, but they are not related and actually have many differences.

A struct in C# cannot inherit from another struct or class; however, a struct can implement one or more interfaces. Structs may contain data members and methods. A struct cannot define a parameterless constructor (a restriction that C# 10 and later relax). All structs contain an implicit constructor that is responsible for initializing all data members to their default values. However, a struct can define a constructor that accepts parameters. A C# constructor plays the same role as a constructor in C++ or Java. The big difference between a struct and a class is, of course, that structs are value types, which are copied on assignment and stored inline rather than as separately allocated heap objects tracked by the GC.

The following is the syntax for declaring a struct :

  struct name {
      [access-modifier] members;
  }

The syntax is similar to both C++ and Java with respect to the declaration and construction. Member visibility defaults to private in C#. Members of a struct can have the following access modifiers applied to them: public, private, or internal. Because a struct cannot serve as the base from which another struct can inherit, the protected modifier has no place; if used, the C# compiler will quickly point out the error.

Listing 2.1.1 shows the declaration of a struct named Fraction. Notice that member access is achieved with the dot operator, as can be seen on lines 13 and 14. In C#, all member access is through the dot operator, regardless of whether the type is a struct or a class.

Listing 2.1.1 A Simple `struct`

  1: //File        :part02_01.cs
  2: //Author      :Richard L. Weeks
  3: //Purpose     :Declare a simple struct
  4:
  5: struct Fraction {
  6:     public int numerator;
  7:     public int denominator;
  8: }
  9:
 10: public class StructTest {
 11:     public static void Main( ) {
 12:         Fraction f;
 13:         f.numerator   = 5;
 14:         f.denominator = 10;
 15:     }
 16: }
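The struct rules described earlier (an implemented interface, a constructor that takes parameters, and the implicit constructor that zeroes every field) can be sketched together as follows. The interface IPrintable and its Format method are hypothetical names introduced only for this example.

```csharp
using System;

// Hypothetical interface, introduced only for this sketch.
interface IPrintable {
    string Format();
}

// A struct cannot inherit from a class or struct, but it can implement interfaces.
struct Fraction : IPrintable {
    public int numerator;
    public int denominator;

    // A struct constructor that takes parameters must assign every field.
    public Fraction(int numerator, int denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    public string Format() {
        return numerator + "/" + denominator;
    }
}

public static class StructRules {
    public static void Main() {
        Fraction zero = new Fraction();        // implicit constructor: fields set to 0
        Fraction half = new Fraction(5, 10);   // user-defined constructor

        Console.WriteLine(zero.Format());      // 0/0
        Console.WriteLine(half.Format());      // 5/10
    }
}
```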

Value types also have a built-in assignment operator that performs a copy of all data members. In C++, this functionality was achieved by implementing the assignment operator and providing the code necessary to copy the data members. Listing 2.1.2 uses the built-in C# assignment operator to copy the value of one struct to another struct of the same type.

Listing 2.1.2 `struct` Assignment

  1: //File        :part02_02.cs
  2: //Author      :Richard L. Weeks
  3: //Purpose     :Declare a simple struct
  4:
  5: using System;
  6:
  7: struct Fraction {
  8:     public int numerator;
  9:     public int denominator;
 10:
 11:     public void Print( ) {
 12:         Console.WriteLine( "{0}/{1}", numerator, denominator );
 13:     }
 14: }
 15:
 16:
 17: public class StructTest {
 18:     public static void Main( ) {
 19:
 20:         Fraction f;
 21:         f.numerator   = 5;
 22:         f.denominator = 10;
 23:         f.Print( );
 24:
 25:         Fraction f2 = f;
 26:         f2.Print( );
 27:
 28:         //modify struct instance f2
 29:         f2.numerator = 1;
 30:
 31:         f.Print( );
 32:         f2.Print( );
 33:     }
 34: }

Listing 2.1.2 extends the implementation of the Fraction struct by implementing the Print method on line 11. The Print method makes use of the Console.WriteLine method to display the current value of the Fraction.

To demonstrate the assignment operator provided by C# for structs, two instances of Fraction are declared. Line 25 declares a variable f2 and initializes it with the previous Fraction variable f. When f2.Print() is invoked, it displays the same 5/10 output as the call to f.Print().

It is important to realize that a copy has occurred and not a reference assignment of f to f2. When f2 is modified on line 29, the following Print method invocations display two different values. The call to f.Print() will still produce the 5/10 output, whereas the call to f2.Print() will now output 1/10.
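The same copy takes place when a struct is passed to a method: the method works on its own copy unless the caller passes the variable by reference. The sketch below assumes a Fraction struct like the one in the listings; the class StructParameters and its two methods are invented for illustration.

```csharp
using System;

struct Fraction {
    public int numerator;
    public int denominator;
}

public static class StructParameters {
    // Receives a copy of the caller's struct; the caller's value is untouched.
    public static void ResetCopy(Fraction f) {
        f.numerator = 0;
    }

    // The ref modifier passes the caller's variable itself, so the change is visible.
    public static void ResetOriginal(ref Fraction f) {
        f.numerator = 0;
    }

    public static void Main() {
        Fraction f;
        f.numerator   = 5;
        f.denominator = 10;

        ResetCopy(f);
        Console.WriteLine(f.numerator);   // 5

        ResetOriginal(ref f);
        Console.WriteLine(f.numerator);   // 0
    }
}
```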

Reference Types

Describing C# involves many circular references: to discuss one aspect of the language, knowledge of a different aspect is often necessary. To this end, a brief discussion of reference types is needed for the remaining topics to flow together.

A class is an example of a reference type in C#. Classes can be considered the big brother to C# structs. An instance of a reference type exists as long as some active reference to it remains. In classic COM, this lifetime was tracked by explicit reference counting, visible in the AddRef and Release methods of a COM object. When the final reference was released on the instance of the object, the object took the necessary steps to clean up.

Fortunately, C# has abstracted away all the gory details of lifetime tracking. The GC does not count references; instead, it periodically determines which objects are still reachable from the running program and cleans up the memory used by the rest. When an object is no longer reachable, the GC will eventually invoke the Finalize method on the object (if it defines one), reclaim the memory, and return it to the managed heap. The Finalize method is similar to the destructor concept in C++, but there is no deterministic finalization in C#. Basically, there is no way to know when an object will expire. The topic of deterministic finalization is beyond the scope of this book.

A reference is acquired in one of two ways: when an instance of a reference type is created and when an assignment takes place. Remember that when the assignment operator was used in conjunction with value types, a copy of the value type was created. This is not the case when the assignment operator is used with reference types. Changing the Fraction struct to a class changes the behavior of the assignment operator, as shown in Listing 2.1.3.

Listing 2.1.3 Reference Types

  1: //File        :part02_03.cs
  2: //Author      :Richard L. Weeks
  3: //Purpose     :Reference Types
  4:
  5: using System;
  6:
  7: //A class represents a reference type in C#
  8: class Fraction {
  9:     public int numerator;
 10:     public int denominator;
 11:
 12:     public void Print( ) {
 13:         Console.WriteLine( "{0}/{1}", numerator, denominator );
 14:     }
 15: }
 16:
 17:
 18: public class ReferenceTest {
 19:     public static void Main( ) {
 20:
 21:         Fraction f = new Fraction( );
 22:         f.numerator   = 5;
 23:         f.denominator = 10;
 24:         f.Print( );
 25:
 26:         Fraction f2 = f;    //f2 is a reference to f and not a copy!!!
 27:         f2.Print( );
 28:
 29:         //modify instance f2. Note that f is also affected.
 30:         f2.numerator = 1;
 31:
 32:         f.Print( );
 33:         f2.Print( );
 34:     }
 35: }

There are only two changes made to Listing 2.1.2 to create Listing 2.1.3. The first change was declaring Fraction to be a class instead of a struct. This small change means that instead of being stored inline in the local variable, a Fraction will now be created on the heap and managed by the GC.

The next change to the code involves the way an instance of Fraction is created. Notice that line 21 has changed. To create an instance of a reference type, the new keyword must be used. An instance of the Fraction class must now be created to initialize the variable f. Without this initialization, f would be considered an uninitialized variable, and the compiler would reject any attempt to use it.

The changes to Listing 2.1.2 impact the overall semantics of the code in Listing 2.1.3. Notice that the variable f2 declared on line 26 now refers to the same object as the variable f. In other words, f2 is not a copy; both variables name a single Fraction instance. This is the fundamental difference between value types and reference types. When f2 is modified on line 30, that same modification is apparent through the variable f.

The invocation of f.Print() and f2.Print() will always produce the same output. When a change is made to f2.numerator, it is the same as changing f.numerator. In C++, it was possible to define an assignment operator and to control the behavior of that operator. This ability does not exist in C#. The assignment operator cannot be overloaded or re-implemented by the developer.
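Because the assignment operator cannot be customized, a class that needs value-style copies has to offer them through an ordinary method. The following sketch shows one way this might look; the Copy method is a hypothetical helper written for this example, not a feature of the language.

```csharp
using System;

class Fraction {
    public int numerator;
    public int denominator;

    // Hypothetical helper: returns a new, independent object with the same values.
    public Fraction Copy() {
        Fraction result = new Fraction();
        result.numerator   = numerator;
        result.denominator = denominator;
        return result;
    }
}

public static class ExplicitCopy {
    public static void Main() {
        Fraction f = new Fraction();
        f.numerator   = 5;
        f.denominator = 10;

        Fraction alias = f;          // second reference to the same object
        Fraction copy  = f.Copy();   // separate object

        alias.numerator = 1;

        Console.WriteLine(f.numerator);                       // 1
        Console.WriteLine(copy.numerator);                    // 5
        Console.WriteLine(Object.ReferenceEquals(f, alias));  // True
        Console.WriteLine(Object.ReferenceEquals(f, copy));   // False
    }
}
```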

Boxing and Unboxing

Boxing allows a value type to be treated as a reference type. There are times when a value type must be treated as an object, such as when storing values in an array of object or in some other collection that holds object references.

When a value type is boxed, an object instance is created on the heap and the value of the value type is copied into the object. When this happens, the boxed object and the value type are two different entities. The object is not a reference to the original value type. Any change to the value type is not reflected in the object and vice versa.

Boxing a value type can be done with an implicit assignment. An implicit assignment is an assignment that does not require a type cast, as shown in the following:

 int i = 10;
 object o = i;

The value type variable i is implicitly converted to type object. When the value of i is boxed into object o, an instance of type object is created on the heap, and the type information and value of the right side of the expression are copied into the object.
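That a box holds a copy, together with the original type information, can be observed directly. In this minimal sketch the class and method names are invented for illustration.

```csharp
using System;

public static class BoxingCopy {
    // Boxes a value, changes the original variable, and reports what the box holds.
    public static string BoxedTextAfterChange() {
        int i = 10;
        object o = i;      // boxing: the value 10 is copied into a heap object
        i = 20;            // changes only the local variable, not the box
        return o.ToString();
    }

    public static void Main() {
        Console.WriteLine(BoxedTextAfterChange());   // 10

        object boxed = 10;
        Console.WriteLine(boxed.GetType());          // System.Int32
    }
}
```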

To unbox an object, an explicit conversion is required, as shown in the following:

 int i = 10;
 object o = i;
 int j = (int)o;    //explicit conversion from object to int

The integer variable j now holds the value that was held by the object o. It is important to understand that the variables i, o, and j are all independent of each other and not merely references to the same memory space.
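One consequence of the type information stored in the box is that unboxing must name the exact value type that was boxed; a boxed int cannot be unboxed directly as a long, even though an int converts to a long implicitly. The sketch below illustrates this with a hypothetical helper, TryUnboxLong, written only for this example.

```csharp
using System;

public static class UnboxingRules {
    // Attempts to unbox the object as a long; reports failure instead of throwing.
    public static bool TryUnboxLong(object boxed, out long value) {
        try {
            value = (long)boxed;   // succeeds only if the box really holds a long
            return true;
        } catch (InvalidCastException) {
            value = 0;
            return false;
        }
    }

    public static void Main() {
        object o = 10;             // boxed int
        long result;

        Console.WriteLine(TryUnboxLong(o, out result));     // False

        long widened = (int)o;     // unbox as int first, then widen implicitly
        Console.WriteLine(widened);                          // 10

        object p = 10L;            // boxed long
        Console.WriteLine(TryUnboxLong(p, out result));     // True
    }
}
```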