Data Types | Pro Visual C++ 2005 for C# Developers

Types in Java and C# can be grouped into two main categories: value types and reference types. As you are probably aware, value type variables store their data on the stack, whereas reference types store data on the heap. Let's start by considering value types.

Value Types

There is only one category of value type in Java: the primitive data types of the language. C# offers a more robust assortment. In C#, value types fall into three main categories:

Simple types
Enumeration types
Structures

The following sections look at each of these in turn.

Simple types

The C# compiler recognizes a number of the usual predefined data types (defined in the System namespace of the base class library), including integer, character, Boolean, and floating-point types. Of course, the value ranges of these types may differ from one language to another. The C# types and their Java counterparts are discussed next.

Integer values

C# has eight predefined signed and unsigned integer types (as opposed to just four signed integer types in Java):

C# Type	Description	Equivalent in Java
sbyte	8-bit signed integer	byte
short	16-bit signed integer	short
int	32-bit signed integer	int
long	64-bit signed integer	long
byte	8-bit unsigned integer	n/a
ushort	16-bit unsigned integer	n/a
uint	32-bit unsigned integer	n/a
ulong	64-bit unsigned integer	n/a

When an integer literal has no suffix, its type is the first of int, uint, long, and ulong in which its value can be represented. Integer values may be written as decimal or hexadecimal literals. In the following example, both variables hold the value 52:

 int dec = 52;
 int hex = 0x34;
 Console.WriteLine("decimal {0}, hexadecimal {1}", dec, hex);
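To see how the compiler picks a type for an unsuffixed literal, the following minimal sketch boxes a few literals and prints the runtime type of each. The class and method names (LiteralTypeDemo, TypeNameOf) are made up for this illustration and are not part of the framework.

```csharp
using System;

static class LiteralTypeDemo
{
    // Boxes the literal and reports the type the compiler chose for it.
    public static string TypeNameOf(object boxedLiteral)
    {
        return boxedLiteral.GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(TypeNameOf(52));                   // fits in int
        Console.WriteLine(TypeNameOf(4000000000));           // too large for int, fits in uint
        Console.WriteLine(TypeNameOf(5000000000));           // too large for uint, fits in long
        Console.WriteLine(TypeNameOf(10000000000000000000)); // too large for long, fits in ulong
        Console.WriteLine(0x34 == 52);                       // hexadecimal and decimal forms denote the same value
    }
}
```

The four type names printed are Int32, UInt32, Int64, and UInt64, the framework names behind the int, uint, long, and ulong keywords.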

Character values

char represents a single two-byte Unicode character. C# extends the flexibility of character assignment by allowing assignment via the hexadecimal escape sequence prefixed by \x and Unicode representation via \u. You will also find that you cannot convert integers to characters implicitly; an explicit cast is required, although a char still converts implicitly to an int. All other common Java language escape sequences are fully supported.
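The escape sequences and conversion rules for char can be sketched as follows. In C#, a char widens to int implicitly, whereas going from int to char needs a cast. CharDemo, CodeOf, and FromCode are hypothetical names used only for this example.

```csharp
using System;

static class CharDemo
{
    public static int CodeOf(char c)
    {
        return c; // char -> int is an implicit widening conversion
    }

    public static char FromCode(int code)
    {
        return (char)code; // int -> char requires an explicit cast
    }

    static void Main()
    {
        char literal = 'A';
        char hex = '\x41';       // hexadecimal escape sequence
        char unicode = '\u0041'; // Unicode escape sequence

        Console.WriteLine(literal == hex && hex == unicode); // all three are the same character
        Console.WriteLine(CodeOf(literal));
        Console.WriteLine(FromCode(66));
        // char wrong = 66;      // would not compile: no implicit int -> char conversion
    }
}
```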

Boolean values

The bool type, as in Java, is used to represent the values true and false directly, or as the result of an expression as shown here:

 bool first_time = true;
 bool second_time = (counter < 0);

Decimal values

C# introduces the decimal type, which is a 128-bit data type that represents values ranging from approximately 1.0 × 10^-28 to 7.9 × 10^28. It is primarily intended for financial and monetary calculations where precision is important (for example, in foreign exchange calculations). When assigning the decimal type a value, m must be appended to the literal value. Otherwise, the compiler treats the value as a double. Because a double cannot be implicitly converted to a decimal, omitting the m requires an explicit cast. Either of the following declarations is valid:

 decimal precise = 1.234m;
 decimal converted = (decimal)1.234;
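Why precision matters for monetary values can be illustrated with a small comparison. The sketch below adds 0.1 and 0.2 first as doubles and then as decimals; DecimalDemo and its methods are names invented for this example.

```csharp
using System;

static class DecimalDemo
{
    // Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly,
    // so the sum differs slightly from the literal 0.3.
    public static bool DoubleSumIsExact()
    {
        double sum = 0.1 + 0.2;
        return sum == 0.3;
    }

    // decimal stores base-10 digits, so these values are represented exactly.
    public static bool DecimalSumIsExact()
    {
        decimal sum = 0.1m + 0.2m;
        return sum == 0.3m;
    }

    static void Main()
    {
        Console.WriteLine("double:  {0}", DoubleSumIsExact());
        Console.WriteLine("decimal: {0}", DecimalSumIsExact());
    }
}
```

The comparison fails for double and succeeds for decimal, which is why decimal is the safer choice when every cent has to add up.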

Floating-point values

The following table lists the C# floating type values and their Java equivalents.

C# Type	Description	Equivalent in Java
float	Signed 32-bit floating point	float
double	Signed 64-bit floating point	double

Floating-point values can be either doubles or floats. A real numeric literal on the right-hand side of an assignment operator is treated as a double by default. Because there is no implicit conversion from double to float, you may be taken aback when a compiler error occurs. The following example illustrates this problem:

 float f = 5.6;
 Console.WriteLine(f);

This example produces the following compiler error message.

Literal of type double cannot be implicitly converted to type 'float'; use an 'F' suffix to create a literal of this type

There are two ways to solve this problem. You could cast your literal to float, but the compiler itself offers a more reasonable alternative. Using the suffix F tells the compiler that this is a literal of type float, not double:

 float f = 5.6F;

Although it is not necessary, you can use a D suffix to signify a double type literal.

Enumeration types

An enumeration is a distinct type consisting of a set of named constants. In Java you can achieve this by using static final variables. In this sense, the enumerations may actually be part of the class that is using them. Another alternative is to define the enumeration as an interface. The following example illustrates this concept:

 interface Color {
     static int RED = 0;
     static int GREEN = 1;
     static int BLUE = 2;
 }

Of course, the problem with this approach is that it is not type-safe. Any integer read in or calculated can be used as a color. It is possible, however, to programmatically implement a type-safe enumeration in Java by utilizing a variation of the Singleton pattern, which limits the class to a predefined number of instances. The following Java code illustrates how this can be done:

 final class Day { // final so it cannot be sub-classed
     private String internal;
     private Day(String day) { internal = day; } // private constructor
     public static final Day MONDAY = new Day("MONDAY");
     public static final Day TUESDAY = new Day("TUESDAY");
     public static final Day WEDNESDAY = new Day("WEDNESDAY");
     public static final Day THURSDAY = new Day("THURSDAY");
     public static final Day FRIDAY = new Day("FRIDAY");
 }

As you can see from this example, the enumerated constants are not tied to primitive types, but to object references. Also, because the class is defined as final, it can't be subclassed, so no other classes can be created from it. The constructor is marked as private, so other methods can't use the class to create new objects. The only objects that will ever be created with this class are the static objects the class creates for itself the first time the class is referenced.

Although the concept is pretty simple, the workaround involves techniques that may not be immediately apparent to a novice; after all, you just want a readily available list of constants. C#, in contrast, provides built-in enumeration support, which also ensures type safety. To declare an enumeration in C#, the enum keyword is used. In its simple form, an enum might look like this:

 public enum Status { Working, Complete, BeforeBegin }

In this example, the first value is 0 and the enum counts upward from there, Complete being 1 and so on. If you are interested in having enum represent different values, you can assign them as follows:

 public enum Status { Working = 131, Complete = 129, BeforeBegin = 132 }
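The type safety of a C# enum shows up when you move between the enum and its numeric value: both directions require an explicit cast. The following sketch reuses the Status enum with the explicit values shown above; EnumDemo, ToNumber, and FromNumber are hypothetical helper names.

```csharp
using System;

public enum Status { Working = 131, Complete = 129, BeforeBegin = 132 }

static class EnumDemo
{
    public static int ToNumber(Status s)
    {
        return (int)s; // enum -> int needs an explicit cast
    }

    public static Status FromNumber(int n)
    {
        return (Status)n; // int -> enum needs an explicit cast too
    }

    static void Main()
    {
        Status current = Status.Complete;
        // int n = current;  // would not compile: no implicit enum -> int conversion

        Console.WriteLine(ToNumber(current));   // the numeric value behind Complete
        Console.WriteLine(FromNumber(132));     // prints the name of the matching constant
        // A cast does not validate the number, so check when the value comes from outside:
        Console.WriteLine(Enum.IsDefined(typeof(Status), 130));
    }
}
```

Note that the cast itself accepts any integer; Enum.IsDefined is one way to confirm that a number really corresponds to a declared constant.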

You also have the choice of using a different integral type by specifying an underlying type such as long, short, or byte after the enum name. int is always the default type, as demonstrated in this snippet:

 public enum Status : int { Working, Complete, BeforeBegin }
 public enum SmallStatus : byte { Working, Complete, BeforeBegin }
 public enum BigStatus : long { Working, Complete, BeforeBegin }

It might not be immediately apparent, but there is a big difference among these three enumerations, tied directly to the size of their underlying type. The C# byte, for example, occupies one byte of memory and holds values from 0 to 255. This means SmallStatus cannot have more than 256 distinct values, and none of its constants can be set to a value greater than 255. The following listing displays how you can use the sizeof() operator to identify the differences between the different versions of Status:

 int x = sizeof(Status);
 int y = sizeof(SmallStatus);
 int z = sizeof(BigStatus);
 Console.WriteLine("Regular size:\t{0}\nSmall size:\t{1}\nLarge size:\t{2}", x, y, z);

Compiling the listing produces the following results:

Regular size: 4
Small size: 1
Large size: 8

Structures

One of the major differences between a C# structure (identified with the keyword struct) and a class instance is that, by default, the struct is passed by value, whereas an object is passed by reference. There is no analogue in Java to structures. Structures have constructors and methods; they can have other members normally associated with a C# class too: indexers (for more on these members see Chapter 4, "Inheritance"), properties, operators, and even nested types. Structures can even implement interfaces. By using structs, you can create types that behave in the same way as, and share similar benefits to, the built-in types. The following snippet demonstrates how a structure can be used:

 public struct EmployeeInfo
 {
     public string firstName;
     public string lastName;
     public string jobTitle;
     public string dept;
     public long employeeID;
 }

Although you could have created a class to hold the same information, using a struct is a little more efficient here because it is easier to create and copy it. The following snippet shows how to copy values from one struct to another:

 EmployeeInfo employee1;
 EmployeeInfo employee2;
 employee1 = new EmployeeInfo();
 employee1.firstName = "Dawn";
 employee1.lastName = "Lane";
 employee1.jobTitle = "Secretary";
 employee1.dept = "Admin";
 employee1.employeeID = 203;
 employee2 = employee1;
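Because assigning one struct to another copies every field, the two variables are independent afterward. The sketch below contrasts that with a class, where assignment copies only the reference. The struct is a trimmed-down version of EmployeeInfo, and EmployeeRecord and CopyDemo are hypothetical types added purely for the comparison.

```csharp
using System;

public struct EmployeeInfo
{
    public string firstName;
}

public class EmployeeRecord // hypothetical class counterpart of the struct
{
    public string firstName;
}

static class CopyDemo
{
    public static string StructCopyLeavesOriginal()
    {
        EmployeeInfo employee1 = new EmployeeInfo();
        employee1.firstName = "Dawn";
        EmployeeInfo employee2 = employee1; // copies the fields
        employee2.firstName = "Erin";       // changes only the copy
        return employee1.firstName;
    }

    public static string ClassCopySharesObject()
    {
        EmployeeRecord record1 = new EmployeeRecord();
        record1.firstName = "Dawn";
        EmployeeRecord record2 = record1;   // copies only the reference
        record2.firstName = "Erin";         // changes the one shared object
        return record1.firstName;
    }

    static void Main()
    {
        Console.WriteLine(StructCopyLeavesOriginal()); // the struct original is unchanged
        Console.WriteLine(ClassCopySharesObject());    // the class original sees the change
    }
}
```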

Structures are often used to tidy up function calls too: you can bundle up related data together in a struct and then pass the struct as a parameter to the method. However, the following limitations apply to using structures:

A struct cannot inherit from another struct or from classes.
A struct cannot act as the base for a class.
Although a struct may declare constructors, those constructors must take at least one argument.
The struct members cannot have initializers.

Structs and attributes

Attributes (or compiler directives, discussed in Chapter 11, "Reflection," and Appendix D, "C# for C++ Developers") can be used with structures to add more power and flexibility to them. The StructLayout attribute in the System.Runtime.InteropServices namespace, for example, can be used to define the layout of fields in the struct. It is possible to use this feature to create a structure similar in functionality to a C/C++ union. A union is a data type whose members share the same memory block. It can be used to store values of different types in the same memory block. In the event that you do not know what type the values to be received will be, a union is a great way to go. Of course, there is no actual conversion happening; in fact there are no underlying checks on the validity of the data. The same bit pattern is simply interpreted in a different way. The following snippet demonstrates how a union could be created using a struct:

 using System.Runtime.InteropServices;

 [StructLayout(LayoutKind.Explicit)]
 public struct Variant
 {
     // Only value-type fields may overlap. The runtime refuses to load a type
     // in which an object reference (such as a string) shares memory with them.
     [FieldOffset(0)] public int intVal;
     [FieldOffset(0)] public decimal decVal;
     [FieldOffset(0)] public float floatVal;
     [FieldOffset(0)] public char charVal;
 }

The FieldOffset attribute applied to each field sets the physical location of that field within the struct. Setting the starting point of each field to 0 ensures that any data stored in one field overwrites, at least partially, whatever data may have been stored in the others. It follows that the total size of the struct is the size of its largest field, in this case the decimal.
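The "same bit pattern, different interpretation" behavior can be sketched with a smaller, two-field union. FloatBits and UnionDemo are hypothetical names for this illustration; only StructLayout, FieldOffset, and Marshal.SizeOf come from the framework.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct FloatBits // hypothetical two-field union
{
    [FieldOffset(0)] public int intVal;
    [FieldOffset(0)] public float floatVal;
}

static class UnionDemo
{
    // Writes a float and reads the same four bytes back as an int.
    public static int BitsOf(float value)
    {
        FloatBits u = new FloatBits();
        u.floatVal = value;
        return u.intVal; // no conversion happens; the bits are simply reinterpreted
    }

    static void Main()
    {
        Console.WriteLine("0x{0:X8}", BitsOf(1.0f));          // IEEE 754 bit pattern of 1.0f
        Console.WriteLine(Marshal.SizeOf(typeof(FloatBits))); // size of the largest field, not the sum
    }
}
```

Both fields are four bytes wide, so the struct as a whole occupies four bytes rather than eight.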

Reference Types

All a reference type variable stores is the reference to data that exists on the heap. Only the memory addresses of the stored objects are kept in the stack. The object type, arrays, and interfaces are all reference types. Objects, classes, and the relationship between the two do not differ much between Java and C#. You will also find that interfaces, and how they are used, are not very different in the two languages. You look at classes and class inheritance in C# in more depth later in this appendix. Strings can also be used the same way in either C# or Java. C# also introduces a new type of reference type called a delegate. Delegates represent a type-safe version of C++ function pointers (references to methods) and are discussed in Chapter 6, "Delegates and Events."

Arrays and Collections

Array syntax in C# is very similar to that used in Java. However, in addition to "jagged" arrays (the arrays of arrays that Java supports), C# adds true multidimensional arrays:

 int[] x = new int[20];     // same as in Java, except [] must be next to the type
 int[,] y = new int[12,3];  // multidimensional; the closest Java form is int y[][] = new int[12][3];
 int[][] z = new int[5][];  // jagged; same as int z[][] = new int[5][];
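The practical difference between the two shapes is easiest to see in code. In this sketch (ArrayShapeDemo and its methods are invented names), the multidimensional array is a single block with a fixed number of columns per row, while each row of the jagged array is a separate array that can have its own length.

```csharp
using System;

static class ArrayShapeDemo
{
    public static int RectangularCellCount()
    {
        int[,] y = new int[12, 3];
        return y.Length; // total number of elements across both dimensions
    }

    public static int JaggedCellCount()
    {
        int[][] z = new int[5][];          // five rows, none allocated yet
        for (int row = 0; row < z.Length; row++)
        {
            z[row] = new int[row + 1];     // rows of length 1, 2, 3, 4, 5
        }

        int total = 0;
        foreach (int[] r in z)
        {
            total += r.Length;
        }
        return total;
    }

    static void Main()
    {
        int[,] y = new int[12, 3];
        Console.WriteLine("rank {0}, {1} x {2}", y.Rank, y.GetLength(0), y.GetLength(1));
        Console.WriteLine(RectangularCellCount());
        Console.WriteLine(JaggedCellCount());
    }
}
```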

In C#, arrays are actual types, so they must be written syntactically as such. Unlike in Java, you cannot place the array rank specifier [] before or after the variable; it must come before the variable and after the data type. Because arrays are types, they have their own methods and properties. For example, you can get the length of array x using:

 int xLength = x.Length;

You can also sort the array using the static Sort() method:

 Array.Sort(x);

You should also note that although C# allows you to declare arrays without initializing them, and to compute an array's size at runtime when you create it, the size of an array is fixed once it has been created. If you need a dynamically sized array, you must use a System.Collections.ArrayList object (similar to Java's ArrayList collection). C# collection objects are covered in depth in Chapter 9, "Collections."
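The difference between a fixed-size array and a growable ArrayList can be sketched as follows. An array's length may come from a runtime value, but it cannot change afterward, whereas an ArrayList grows as items are added. GrowableDemo and its methods are hypothetical names.

```csharp
using System;
using System.Collections;

static class GrowableDemo
{
    public static int[] MakeArray(int size)
    {
        return new int[size]; // length chosen at run time, fixed from then on
    }

    public static int CountAfterAdding(int items)
    {
        ArrayList list = new ArrayList(); // starts empty and grows on demand
        for (int i = 0; i < items; i++)
        {
            list.Add(i); // each int is stored as an object
        }
        return list.Count;
    }

    static void Main()
    {
        Console.WriteLine(MakeArray(7).Length);
        Console.WriteLine(CountAfterAdding(100));
    }
}
```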

Type Conversion and Casting

Type conversion in Java consists of implicit or explicit narrowing and widening conversions, using the () cast operator as needed. It is generally possible to perform similar type conversions in C#. C# also introduces a number of powerful features built into the language. These include boxing and unboxing.

Because value types are nothing more than memory blocks of a certain size, they are great to use for speed reasons. Sometimes, however, the convenience of objects is good to have for a value type. Boxing and unboxing provide a mechanism that forms a binding link between value types and reference types by allowing them to be converted to and from the object type.

Boxing means implicitly converting a value type to type Object. An instance of Object is created and allocated, and the value in the value type is copied to the new object. Here is an example of how boxing works in C#:

 int x = 10;
 Object obj = x;

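This type of functionality was not available in Java before version 5. In those earlier versions, the previous code would not compile because primitives could not be converted to reference types; later versions box primitives automatically into wrapper classes such as Integer.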

Unboxing is simply the casting of the Object type containing the value back to the appropriate value type. Again, Java offered no equivalent before version 5. You can modify the previous code to illustrate this concept. You will immediately notice that although boxing is an implicit cast, unboxing requires an explicit one:

 int x = 10;
 Object obj = x;
 int y = (int) obj;
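Two consequences of boxing are worth seeing in code: the boxed object holds a copy of the value, and unboxing must name the exact value type that was boxed. The following sketch assumes nothing beyond the language itself; BoxingDemo and its methods are made-up names.

```csharp
using System;

static class BoxingDemo
{
    public static int BoxedCopyIsIndependent()
    {
        int x = 10;
        object obj = x;   // boxing copies the value into a new object
        x = 20;           // changing x does not affect the boxed copy
        return (int)obj;  // unboxing yields the value that was boxed
    }

    public static bool UnboxingToWrongTypeThrows()
    {
        object obj = 10;  // boxes an int
        try
        {
            Console.WriteLine((long)obj); // a boxed int can only be unboxed as int
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent());
        Console.WriteLine(UnboxingToWrongTypeThrows());
    }
}
```

To obtain a long from a boxed int, unbox to int first and then convert: (long)(int)obj.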

Another powerful feature of C# dealing with casting is the ability to define custom conversion operators for your classes and structs. Chapter 5, "Operators and Casts," deals with this issue in depth.
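As a brief taste of custom conversion operators, the sketch below defines a hypothetical Celsius struct with one implicit and one explicit conversion. The type and its members are invented for this illustration; only the implicit and explicit operator keywords are part of the language.

```csharp
using System;

public struct Celsius // hypothetical type used only for this example
{
    public readonly double Degrees;

    public Celsius(double degrees)
    {
        Degrees = degrees;
    }

    // Implicit: no information is lost going from double to Celsius.
    public static implicit operator Celsius(double degrees)
    {
        return new Celsius(degrees);
    }

    // Explicit: the fractional part is discarded, so the caller must ask for it.
    public static explicit operator int(Celsius c)
    {
        return (int)c.Degrees;
    }
}

static class ConversionDemo
{
    public static int WholeDegrees(double degrees)
    {
        Celsius c = degrees; // uses the implicit operator
        return (int)c;       // uses the explicit operator
    }

    static void Main()
    {
        Console.WriteLine(WholeDegrees(21.7));
    }
}
```

A common convention, followed here, is to make a conversion implicit only when it cannot lose data or throw.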