User-Defined Casts | Pro Visual C++ 2005 for C# Developers

Earlier, this chapter examined how you can convert values between predefined data types. You saw that this is done through a process of casting. You also saw that C# allows two different types of casts: implicit and explicit.

For an explicit cast, you explicitly mark the cast in your code by writing the destination data type inside parentheses:

 int I = 3;
 long l = I;             // implicit
 short s = (short)I;     // explicit

For the predefined data types, explicit casts are required where there is a risk that the cast might fail or some data might be lost. The following are some examples:

When converting from an int to a short, because the short might not be large enough to hold the value of the int.
When converting from signed to unsigned data types, incorrect results will be returned if the signed variable holds a negative value.
When converting from floating-point to integer data types, the fractional part of the number will be lost.
When converting from a nullable type to a non-nullable type, a value of null will cause an exception.
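
The four risks listed above can be seen in a short sketch. The class and method names here are illustrative only and are not part of the chapter's samples. The narrowing casts are wrapped in unchecked so that the behavior does not depend on compiler settings.

```csharp
using System;

class PredefinedCastRisks
{
    // int to short: only the low 16 bits survive.
    public static short Narrow(int value) { return unchecked((short)value); }

    // Signed to unsigned: a negative value wraps around.
    public static uint ToUnsigned(int value) { return unchecked((uint)value); }

    // Floating-point to integer: the fractional part is discarded.
    public static int Truncate(double value) { return (int)value; }

    // Nullable to non-nullable: null has no int equivalent, so the cast throws.
    public static bool ThrowsOnNull(int? value)
    {
        try { int ignored = (int)value; return false; }
        catch (InvalidOperationException) { return true; }
    }

    static void Main()
    {
        Console.WriteLine(Narrow(70000));       // 4464 (70000 - 65536)
        Console.WriteLine(ToUnsigned(-1));      // 4294967295
        Console.WriteLine(Truncate(3.99));      // 3
        Console.WriteLine(ThrowsOnNull(null));  // True
    }
}
```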

The idea is that by making the cast explicit in your code, C# forces you to affirm that you understand there is a risk of data loss, and therefore presumably you have written your code to take this into account.

Because C# allows you to define your own data types (structs and classes), it follows that you will need the facility to support casts to and from those data types. The mechanism is that you can define a cast as a member operator of one of the relevant classes. Your cast operator must be marked as either implicit or explicit to indicate how you are intending it to be used. The expectation is that you follow the same guidelines as for the predefined casts: if you know the cast is always safe whatever the value held by the source variable, then you define it as implicit. If, on the other hand, you know there is a risk of something going wrong for certain values — perhaps some loss of data or an exception being thrown — then you should define the cast as explicit.

Important

You should define any custom casts you write as explicit if there are any source data values for which the cast will fail or if there is any risk of an exception being thrown.

The syntax for defining a cast is similar to that for overloading operators discussed earlier in this chapter. This is not a coincidence, because a cast is regarded as an operator whose effect is to convert from the source type to the destination type. To illustrate the syntax, the following is taken from an example struct named Currency, which is introduced later in this section:

 public static implicit operator float (Currency value)
 {
    // processing
 }

The return type of the operator defines the target type of the cast operation, and the single parameter is the source object for the conversion. The cast defined here allows you to implicitly convert the value of a Currency into a float. Note that if a conversion has been declared as implicit, the compiler will permit its use either implicitly or explicitly. If it has been declared as explicit, the compiler will only permit it to be used explicitly. In common with other operator overloads, casts must be declared as both public and static.

Note

C++ developers will notice that this is different from C++, in which casts are instance members of classes.

Implementing User-Defined Casts

This section illustrates the use of implicit and explicit user-defined casts in an example called SimpleCurrency (which, as usual, is found in the code download). In this example, you define a struct, Currency, which holds a positive USD ($) monetary value. C# provides the decimal type for this purpose, but you might still want to write your own struct or class to represent monetary values if you need sophisticated financial processing and therefore want to implement specific methods on such a type.

Note

The syntax for casting is the same for structs and classes. This example happens to be for a struct, but would work just as well if you declared Currency as a class.

Initially, the definition of the Currency struct is as follows:

 struct Currency
 {
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
       this.Dollars = dollars;
       this.Cents = cents;
    }

    public override string ToString()
    {
       return string.Format("${0}.{1,-2:00}", Dollars, Cents);
    }
 }

The use of unsigned data types for the Dollars and Cents fields ensures that a Currency instance can only hold non-negative values. It is restricted this way in order to illustrate some points about explicit casts later on. You might want to use a type like this to hold, for example, salary information for employees of a company (people's salaries tend not to be negative!). To keep the struct simple, the fields are public, but usually you would make them private and define corresponding properties for the dollars and cents.

Start off by assuming that you want to be able to convert Currency instances to float values, where the integer part of the float represents the dollars. In other words, you would like to be able to write code like this:

 Currency balance = new Currency(10,50);
 float f = balance;   // We want f to be set to 10.5

To be able to do this, you need to define a cast. Hence, you add the following to your Currency definition:

 public static implicit operator float (Currency value)
 {
    return value.Dollars + (value.Cents/100.0f);
 }

This cast is implicit. This is a sensible choice in this case, because, as should be clear from the definition of Currency, any value that can be stored in the currency can also be stored in a float. There's no way that anything should ever go wrong in this cast.

Note

There is a slight cheat here — in fact, when converting a uint to a float, there can be a loss in precision, but Microsoft has deemed this error sufficiently marginal to count the uint-to-float cast as implicit anyway.
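
As a minimal sketch of the cast just described, the following trimmed-down Currency (ToString() omitted) shows the conversion being used both implicitly and explicitly. The ImplicitCastDemo class and the value 16777217 are illustrative choices, not part of the SimpleCurrency sample; the latter is simply the smallest uint that a float cannot hold exactly.

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    // Implicit: every Currency value has a (possibly approximate) float equivalent.
    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }
}

class ImplicitCastDemo
{
    static void Main()
    {
        Currency balance = new Currency(10, 50);
        float f = balance;                // implicit use of the cast
        float g = (float)balance;         // explicit use is also permitted
        Console.WriteLine(f == 10.5f);    // True
        Console.WriteLine(f == g);        // True

        // The precision loss mentioned in the note: 16777217 (2^24 + 1)
        // has no exact float representation, so the round trip changes it.
        uint big = 16777217;
        uint roundTripped = (uint)(float)big;
        Console.WriteLine(roundTripped);  // 16777216
    }
}
```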

However, if you have a float that you would like to convert to a Currency, the conversion is not guaranteed to work: a float can store negative values, which Currency instances can't, and a float can store numbers of a far higher magnitude than can be stored in the (uint) Dollars field of Currency. So if a float contains an inappropriate value, converting it to a Currency could give unpredictable results. As a result of this risk, the conversion from float to Currency should be defined as explicit. Here is the first attempt, which won't give quite the correct results, but it is instructive to examine why:

 public static explicit operator Currency (float value)
 {
    uint dollars = (uint)value;
    ushort cents = (ushort)((value-dollars)*100);
    return new Currency(dollars, cents);
 }

The following code will now successfully compile:

 float amount = 45.63f;
 Currency amount2 = (Currency)amount;

However, the following code, if you tried it, would generate a compilation error, because it attempts to use an explicit cast implicitly:

 float amount = 45.63f;
 Currency amount2 = amount;   // wrong

By making the cast explicit, you warn the developer to be careful because data loss might occur. However, as you will soon see, this isn't how you want your Currency struct to behave. Try writing a test harness and running the sample. Here is the Main() method, which instantiates a Currency struct and attempts a few conversions. At the start of this code, you write out the value of balance in several different ways (because this will be needed to illustrate something later on in the example):

 static void Main()
 {
    try
    {
       Currency balance = new Currency(50,35);
       Console.WriteLine(balance);
       Console.WriteLine("balance is " + balance);
       Console.WriteLine("balance is (using ToString()) " + balance.ToString());

       float balance2 = balance;
       Console.WriteLine("After converting to float, = " + balance2);

       balance = (Currency) balance2;
       Console.WriteLine("After converting back to Currency, = " + balance);

       Console.WriteLine("Now attempt to convert out of range value of " +
                         "-$100.00 to a Currency:");
       checked
       {
          balance = (Currency) (-50.5);
          Console.WriteLine("Result is " + balance.ToString());
       }
    }
    catch(Exception e)
    {
       Console.WriteLine("Exception occurred: " + e.Message);
    }
 }

Notice that the entire code is placed in a try block to catch any exceptions that occur during your casts. Also, the lines that test converting an out-of-range value to Currency are placed in a checked block in an attempt to trap negative values. Running this code gives this output:

SimpleCurrency

 50.35
 balance is $50.35
 balance is (using ToString()) $50.35
 After converting to float, = 50.35
 After converting back to Currency, = $50.34
 Now attempt to convert out of range value of -$100.00 to a Currency:
 Result is $4294967246.60486

This output shows that the code didn't quite work as expected. First, converting back from float to Currency gave a wrong result of $50.34 instead of $50.35. Second, no exception was generated when you tried to convert an obviously out-of-range value.

The first problem is caused by rounding errors. If a cast is used to convert from a float to a uint, the computer will truncate the number rather than rounding it. The computer stores numbers in binary rather than decimal, and the fraction 0.35 cannot be exactly represented as a binary fraction (just like 1/3 cannot be represented exactly as a decimal fraction; it comes out as 0.3333 recurring). So, the computer ends up storing a value very slightly lower than 0.35, and which can be represented exactly in binary format. Multiply by 100 and you get a number fractionally less than 35, which gets truncated to 34 cents. Clearly in this situation, such errors caused by truncation are serious, and the way to avoid them is to ensure that some intelligent rounding is performed in numerical conversions instead. Luckily, Microsoft has written a class that will do this: System.Convert. System.Convert contains a large number of static methods to perform various numerical conversions, and the one that we want is Convert.ToUInt16(). Note that the extra care taken by the System.Convert methods does come at a performance cost, so you should only use them when you need them.
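
The difference between truncating and rounding the cents can be checked in isolation. This is only a sketch: RoundingDemo and its two helper methods are hypothetical names that mirror the arithmetic of the cast operator rather than code from the sample.

```csharp
using System;

class RoundingDemo
{
    // Mirrors the first-attempt cast: the cents are truncated.
    public static ushort CentsByTruncation(float value)
    {
        uint dollars = (uint)value;
        return (ushort)((value - dollars) * 100);
    }

    // Same calculation, but rounded to the nearest whole cent.
    public static ushort CentsByRounding(float value)
    {
        uint dollars = (uint)value;
        return Convert.ToUInt16((value - dollars) * 100);
    }

    static void Main()
    {
        // 50.35f is stored as roughly 50.3499985, so the cents come out
        // as roughly 34.99985 before being converted to a ushort.
        Console.WriteLine(CentsByTruncation(50.35f));  // 34
        Console.WriteLine(CentsByRounding(50.35f));    // 35
    }
}
```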

Now let's examine why the expected overflow exception didn't get thrown. The problem here is this: the place where the overflow really occurs isn't actually in the Main() routine at all — it is inside the code for the cast operator, which is called from the Main() method. And that code wasn't marked as checked.

The solution here is to ensure that the cast itself is computed in a checked context too. With both of these changes, the revised code for the conversion looks like this:

 public static explicit operator Currency (float value)
 {
    checked
    {
       uint dollars = (uint)value;
       ushort cents = Convert.ToUInt16((value-dollars)*100);
       return new Currency(dollars, cents);
    }
 }

Note that you use Convert.ToUInt16() to calculate the cents, as described earlier, but you do not use it for calculating the dollar part of the amount. System.Convert is not needed when working out the dollar amount because truncating the float value is what you want there.

Note

It is worth noting that the System.Convert methods also carry out their own overflow checking. Hence, for the particular case we are considering, there is no need to place the call to Convert.ToUInt16() inside the checked context. The checked context is still required, however, for the explicit casting of value to dollars.
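
The reason the checked block in Main() had no effect is that checked applies only to the code written inside the block, not to the bodies of the methods that the block calls. A minimal sketch of that behavior follows; the class and method names are hypothetical, and unchecked is written out explicitly in the first method (it is normally the default) so that the result does not depend on compiler options.

```csharp
using System;

class CheckedScopeDemo
{
    // No overflow checking here: an out-of-range value does not raise
    // an exception, and the resulting uint is unspecified.
    public static uint UncheckedCast(float value)
    {
        return unchecked((uint)value);
    }

    // The cast itself sits inside a checked block, so overflow is detected.
    public static uint CheckedCast(float value)
    {
        checked
        {
            return (uint)value;
        }
    }

    public static bool Throws(bool useCheckedVersion)
    {
        try
        {
            // This checked block does not reach into the called methods.
            checked
            {
                if (useCheckedVersion) CheckedCast(-50.5f);
                else UncheckedCast(-50.5f);
            }
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(Throws(false));  // False
        Console.WriteLine(Throws(true));   // True
    }
}
```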

You won't see a new set of results with this new checked cast just yet, because you have some more modifications to make to the SimpleCurrency example later in this section.

Note

If you are defining a cast that will be used very often, and for which performance is at an absolute premium, you may prefer not to do any error checking. That's also a legitimate solution, provided the behavior of your cast and the lack of error checking are very clearly documented.

Casts between classes

The Currency example involves only classes that convert to or from float — one of the predefined data types. However, it is not necessary to involve any of the simple data types. It is perfectly legitimate to define casts to convert between instances of different structs or classes that you have defined. You need to be aware of a couple of restrictions, however:

You cannot define a cast if one of the classes is derived from the other (these types of cast already exist, as you will see).
The cast must be defined inside the definition of either the source or destination data type.

To illustrate these requirements, suppose you have the class hierarchy shown in Figure 5-1.

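Figure 5-1: Class hierarchy (image not reproduced). As the following text implies, B derives from A, and C and D each derive from B.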

In other words, classes C and D are indirectly derived from A. In this case, the only legitimate user-defined cast between A, B, C, or D would be to convert between classes C and D, because these classes are not derived from each other. The code to do so might look like this (assuming you want the casts to be explicit, which is usually the case when defining casts between user-defined types):

 public static explicit operator D(C value)
 {
    // and so on
 }

 public static explicit operator C(D value)
 {
    // and so on
 }

For each of these casts, you have a choice of where you place the definitions — inside the class definition of C or inside the class definition of D, but not anywhere else. C# requires you to put the definition of a cast inside either the source class (or struct) or the destination class (or struct). A side effect of this is that you can't define a cast between two classes unless you have access to edit the source code for at least one of them. This is sensible because it prevents third parties from introducing casts into your classes.

Once you have defined a cast inside one of the classes, you can't also define the same cast inside the other class. Obviously, there should be only one cast for each conversion — otherwise the compiler wouldn't know which one to pick.
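
A filled-in version of the C and D casts might look like the following sketch. The Amount and Text fields and the constructors are invented for illustration, and here each cast happens to be declared in its source class. The D-to-C cast is a natural candidate for explicit because the text held by a D might not parse as a number.

```csharp
using System;

class A { }
class B : A { }

class C : B
{
    public int Amount;   // hypothetical payload

    public C(int amount) { Amount = amount; }

    // C-to-D cast, declared in the source class.
    public static explicit operator D(C value)
    {
        return new D(value.Amount.ToString());
    }
}

class D : B
{
    public string Text;  // hypothetical payload

    public D(string text) { Text = text; }

    // D-to-C cast, declared in the source class; int.Parse may throw.
    public static explicit operator C(D value)
    {
        return new C(int.Parse(value.Text));
    }
}

class SiblingCastDemo
{
    static void Main()
    {
        C c = new C(42);
        D d = (D)c;
        C back = (C)d;
        Console.WriteLine(d.Text);       // 42
        Console.WriteLine(back.Amount);  // 42
    }
}
```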

Casts between base and derived classes

To see how these casts work, start by considering the case where the source and destination are both reference types, and consider two classes, MyBase and MyDerived, where MyDerived is derived directly or indirectly from MyBase.

First, from MyDerived to MyBase; it is always possible (assuming the constructors are available) to write:

 MyDerived derivedObject = new MyDerived();
 MyBase baseCopy = derivedObject;

In this case, you are casting implicitly from MyDerived to MyBase. This works because of the rule that any reference to a type MyBase is allowed to refer to objects of class MyBase or to objects of anything derived from MyBase. In OO programming, instances of a derived class are, in a real sense, instances of the base class, plus something extra. All the functions and fields defined on the base class are defined in the derived class too.

Alternatively, you can write:

 MyBase derivedObject = new MyDerived();
 MyBase baseObject = new MyBase();
 MyDerived derivedCopy1 = (MyDerived) derivedObject;   // OK
 MyDerived derivedCopy2 = (MyDerived) baseObject;      // Throws exception

This code is perfectly legal C# (in a syntactic sense, that is) and illustrates casting from a base class to a derived class. However, the final statement will throw an exception when executed. What happens when you perform the cast is that the object being referred to is examined. Because a base class reference can in principle refer to a derived class instance, it is possible that this object is actually an instance of the derived class that you are attempting to cast to. If that's the case, the cast succeeds, and the derived reference is set to refer to the object. If, however, the object in question is not an instance of the derived class (or of any class derived from it), the cast fails and an exception is thrown.
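
The run-time check described above can be observed directly. In this sketch the empty MyBase and MyDerived classes and the CanDowncast helper are placeholders; the exception raised by a failed reference cast is InvalidCastException.

```csharp
using System;

class MyBase { }
class MyDerived : MyBase { }

class DowncastDemo
{
    // Returns true if the cast succeeds and false if it throws.
    public static bool CanDowncast(MyBase candidate)
    {
        try
        {
            MyDerived derived = (MyDerived)candidate;
            // A successful cast yields a reference to the same object, not a copy.
            return object.ReferenceEquals(derived, candidate);
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    static void Main()
    {
        MyBase derivedObject = new MyDerived();
        MyBase baseObject = new MyBase();
        Console.WriteLine(CanDowncast(derivedObject));  // True
        Console.WriteLine(CanDowncast(baseObject));     // False
    }
}
```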

Notice the casts that the compiler has supplied, which convert between base and derived class, do not actually do any data conversion on the object in question. All they do is set the new reference to refer to the object if it is legal for that conversion to occur. To that extent, these casts are very different in nature from the ones that you will normally define yourself. For example, in the SimpleCurrency example earlier, you defined casts that convert between a Currency struct and a float. In the float-to-Currency cast, you actually instantiated a new Currency struct and initialized it with the required values. The predefined casts between base and derived classes do not do this. If you actually want to convert a MyBase instance into a real MyDerived object with values based on the contents of the MyBase instance, you would not be able to use the cast syntax to do this. The most sensible option is usually to define a derived class constructor that takes a base class instance as a parameter, and have this constructor perform the relevant initializations:

 class DerivedClass : BaseClass
 {
    public DerivedClass(BaseClass rhs)
    {
       // initialize object from the Base instance
    }
    // etc.
 }
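
A concrete, purely illustrative version of this constructor approach is sketched below. The Name and Level fields are assumptions made for the example; the point is that the result is a new object built from the base instance's data rather than a reinterpreted reference.

```csharp
using System;

class MyBase
{
    public string Name;   // hypothetical field

    public MyBase(string name) { Name = name; }
}

class MyDerived : MyBase
{
    public int Level;     // hypothetical field that exists only on the derived class

    // Builds a genuine MyDerived from the data held by a MyBase instance.
    public MyDerived(MyBase rhs) : base(rhs.Name)
    {
        Level = 0;
    }
}

class ConstructorConversionDemo
{
    static void Main()
    {
        MyBase plain = new MyBase("example");
        MyDerived converted = new MyDerived(plain);
        Console.WriteLine(converted.Name);                           // example
        Console.WriteLine(object.ReferenceEquals(plain, converted)); // False
    }
}
```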

Boxing and unboxing casts

The previous discussion focused on casting between base and derived classes where both participants were reference types. Similar principles apply when casting value types, although in this case it is not possible to simply copy references — some copying of data must take place.

It is not, of course, possible to derive from structs or primitive value types. So, casting between base and derived structs invariably means casting between a primitive type or a struct and System.Object (theoretically, it is possible to cast between a struct and System.ValueType, though it is hard to see why you would want to do this).

The cast from any struct (or primitive type) to object is always available as an implicit cast — because it is a cast from a derived to a base type — and is just the familiar process of boxing. For example, with the Currency struct:

 Currency balance = new Currency(40,0);
 object baseCopy = balance;

When this implicit cast is executed, the contents of balance are copied onto the heap into a boxed object, and the baseCopy object reference set to this object. What actually happens behind the scenes is this: When you originally defined the Currency struct, the .NET Framework implicitly supplied another (hidden) class, a boxed Currency class, which contains all the same fields as the Currency struct, but is a reference type, stored on the heap. This happens whenever you define a value type — whether it is a struct or enum, and similar boxed reference types exist corresponding to all the primitive value types of int, double, uint, and so on. It is not possible, nor necessary, to gain direct programmatic access to any of these boxed classes in source code, but they are the objects that are working behind the scenes whenever a value type is cast to object. When you implicitly cast Currency to object, a boxed Currency instance gets instantiated and initialized with all the data from the Currency struct. In the preceding code, it is this boxed Currency instance that baseCopy will refer to. By these means, it is possible for casting from derived to base type to work syntactically in the same way for value types as for reference types.

Casting the other way is known as unboxing. Just as for casting between a base reference type and a derived reference type, it is an explicit cast, because an exception will be thrown if the object being cast is not of the correct type:

 object derivedObject = new Currency(40,0);
 object baseObject = new object();
 Currency derivedCopy1 = (Currency)derivedObject;   // OK
 Currency derivedCopy2 = (Currency)baseObject;      // Exception thrown

This code works analogously to the similar code presented earlier for reference types. Casting derivedObject to Currency works fine because derivedObject actually refers to a boxed Currency instance — the cast will be performed by copying the fields out of the boxed Currency object into a new Currency struct. The second cast fails because baseObject does not refer to a boxed Currency object.

When using boxing and unboxing, it is important to understand that both processes actually copy the data into the new boxed or unboxed object. Hence, manipulations on the boxed object, for example, will not affect the contents of the original value type.
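
The copying behavior can be confirmed with a small sketch. The Currency struct here is a cut-down version with only its fields and constructor, and BoxingDemo is an illustrative name.

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }
}

class BoxingDemo
{
    static void Main()
    {
        Currency balance = new Currency(40, 0);
        object boxed = balance;              // boxing copies the data to the heap
        balance.Dollars = 99;                // changes only the original struct

        Currency unboxed = (Currency)boxed;  // unboxing copies the data back out
        Console.WriteLine(unboxed.Dollars);  // 40

        unboxed.Dollars = 1;                 // changes only the unboxed copy
        Console.WriteLine(((Currency)boxed).Dollars);  // 40

        try
        {
            Currency bad = (Currency)new object();
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("not a boxed Currency");
        }
    }
}
```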

Multiple Casting

One thing you will have to watch for when you are defining casts is that if the C# compiler is presented with a situation in which no direct cast is available to perform a requested conversion, it will attempt to find a way of combining casts to do the conversion. For example, with the Currency struct, suppose the compiler encounters a couple of lines of code like this:

 Currency balance = new Currency(10,50);
 long amount = (long)balance;
 double amountD = balance;

You first initialize a Currency instance, and then you attempt to convert it to a long. The trouble is that you haven't defined the cast to do that. However, this code will still compile successfully. What will happen is that the compiler will realize that you have defined an implicit cast to get from Currency to float, and the compiler already knows how to explicitly cast a float to a long. Hence, it will compile that line of code into IL code that converts balance first to a float, and then converts that result to a long. The same thing happens in the final line of the code, when you convert balance to a double. However, because the cast from Currency to float and the predefined cast from float to double are both implicit, you can write this conversion in your code as an implicit cast. If you'd preferred, you could have specified the casting route explicitly:

 Currency balance = new Currency(10,50);
 long amount = (long)(float)balance;
 double amountD = (double)(float)balance;

However, in most cases, this would be seen as needlessly complicating your code. The following code by contrast would produce a compilation error:

 Currency balance = new Currency(10,50);
 long amount = balance;

The reason is that the best match for the conversion that the compiler can find is still to convert first to float and then to long. The conversion from float to long needs to be specified explicitly, though.
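
A runnable sketch of these combined conversions, using a cut-down Currency that defines only the implicit cast to float (MultipleCastDemo is an illustrative name):

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }
}

class MultipleCastDemo
{
    static void Main()
    {
        Currency balance = new Currency(10, 50);

        // Currency -> float (user-defined), then float -> long (predefined, explicit).
        long amount = (long)balance;

        // Currency -> float -> double; both steps are implicit.
        double amountD = balance;

        Console.WriteLine(amount);           // 10
        Console.WriteLine(amountD == 10.5);  // True

        // long wrong = balance;   // would not compile: float -> long must be explicit
    }
}
```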

All this by itself shouldn't give you too much trouble. The rules are, after all, fairly intuitive and designed to prevent any data loss from occurring without the developer knowing about it. However, the problem is that if you are not careful when you define your casts, it is possible for the compiler to figure out a path that leads to unexpected results. For example, suppose it occurs to someone else in the group writing the Currency struct that it would be useful to be able to convert a uint containing the total number of cents in an amount into a Currency (cents not dollars because the idea is not to lose the fractions of a dollar). So, this cast might be written to try to achieve this:

 public static implicit operator Currency (uint value)
 {
    return new Currency(value/100u, (ushort)(value%100));
 }   // Don't do this!

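Note the u after the first 100 in this code, which makes the literal a uint so that value/100u is plainly a uint division. Strictly speaking, the compiler also converts an in-range integer literal such as 100 to uint, so value/100 would be a uint division as well; it is a non-constant int operand that would cause the division to be carried out as a long and require a cast.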

The "Don't do this" comment in this code is there for a good reason. Look at the following code snippet; all it does is convert a uint containing 350 into a Currency and back again. What do you think bal2 will contain after executing this?

 uint bal = 350;
 Currency balance = bal;
 uint bal2 = (uint)balance;

The answer is not 350 but 3! And it all follows logically. You convert 350 implicitly to a Currency, giving the result balance.Dollars=3, balance.Cents=50. Then the compiler does its usual figuring out of the best path for the conversion back. The value of balance ends up getting implicitly converted to a float (value 3.5), and this gets converted explicitly to a uint with value 3.
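
The round trip can be reproduced with a cut-down Currency that contains the implicit cast to float together with the ill-advised cents-based cast from uint. The RoundTripDemo class is an illustrative wrapper, not part of the sample.

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    // Treats the whole-number part of the float as dollars.
    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    // Treats the uint as a number of cents: inconsistent with the cast above.
    public static implicit operator Currency(uint value)
    {
        return new Currency(value / 100u, (ushort)(value % 100));
    }
}

class RoundTripDemo
{
    static void Main()
    {
        uint bal = 350;
        Currency balance = bal;       // Dollars = 3, Cents = 50
        uint bal2 = (uint)balance;    // Currency -> float (3.5) -> uint

        Console.WriteLine(balance.Dollars);  // 3
        Console.WriteLine(balance.Cents);    // 50
        Console.WriteLine(bal2);             // 3
    }
}
```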

Of course, other instances exist in which converting to another data type and back again causes data loss. For example, converting a float containing 5.8 to an int and back to a float again will lose the fractional part, giving a result of 5, but there is a slight difference in principle between losing the fractional part of a number and dividing an integer by more than 100! Currency has suddenly become a rather dangerous class that does strange things to integers!

The problem is that there is a conflict between how your casts interpret integers. The casts between Currency and float interpret an integer value of 1 as corresponding to one dollar, but the latest uint-to-Currency cast interprets this value as one cent. This is an example of very poor design. If you want your classes to be easy to use, you should make sure all your casts behave in a way that is mutually compatible, in the sense that they intuitively give the same results. In this case, the solution is obviously to rewrite the uint-to-Currency cast so that it interprets an integer value of 1 as one dollar:

 public static implicit operator Currency (uint value)
 {
    return new Currency(value, 0);
 }

Incidentally, you might wonder whether this new cast is necessary at all. The answer is that it could be useful. Without this cast, the only way for the compiler to carry out a uint-to-Currency conversion would be via a float. Converting directly is a lot more efficient in this case, so having this extra cast gives performance benefits, but you need to make sure it gives the same result as you would get going via a float, which you have now done. In other situations, you may also find that separately defining casts for different predefined data types allows more conversions to be implicit rather than explicit, though that's not the case here.

A good test of whether your casts are compatible is to ask whether a conversion will give the same results (other than perhaps a loss of accuracy as in float-to-int conversions), irrespective of which path it takes. The Currency class provides a good example of this. Look at this code:

 Currency balance = new Currency(50, 35);
 ulong bal = (ulong) balance;

At present, there is only one way that the compiler can achieve this conversion: by converting the Currency to a float implicitly, then to a ulong explicitly. The float-to-ulong conversion requires an explicit conversion, but that's fine because you have specified one here.

Suppose, however, that you then added another cast, to convert implicitly from a Currency to a uint. You will actually do this by modifying the Currency struct by adding the casts both to and from uint. This code is available as the SimpleCurrency2 example:

 public static implicit operator Currency (uint value)
 {
    return new Currency(value, 0);
 }

 public static implicit operator uint (Currency value)
 {
    return value.Dollars;
 }

Now the compiler has another possible route to convert from Currency to ulong: to convert from Currency to uint implicitly then to ulong implicitly. Which of these two routes will it take? C# does have some precise rules (which are not detailed in this book; if you are interested, details are in the MSDN documentation) to say how the compiler decides which is the best route if there are several possibilities. The best answer is that you should design your casts so that all routes give the same answer (other than possible loss of precision), in which case it doesn't really matter which one the compiler picks. (As it happens in this case, the compiler picks the Currency-to-uint-to-ulong route in preference to Currency-to-float-to-ulong.)

To test the SimpleCurrency2 sample, add this code to the test code for SimpleCurrency:

 try
 {
    Currency balance = new Currency(50,35);
    Console.WriteLine(balance);
    Console.WriteLine("balance is " + balance);
    Console.WriteLine("balance is (using ToString()) " + balance.ToString());

    uint balance3 = (uint) balance;
    Console.WriteLine("Converting to uint gives " + balance3);

Running the sample now gives these results:

SimpleCurrency2

 50
 balance is $50.35
 balance is (using ToString()) $50.35
 Converting to uint gives 50
 After converting to float, = 50.35
 After converting back to Currency, = $50.34
 Now attempt to convert out of range value of -$100.00 to a Currency:
 Exception occurred: Arithmetic operation resulted in an overflow.

The output shows that the conversion to uint has been successful, though as expected, you have lost the cents part of the Currency in making this conversion. Casting a negative float to Currency has also produced the expected overflow exception now that the float-to-Currency cast itself defines a checked context.

However, the output also demonstrates one last potential problem that you need to be aware of when working with casts. The very first line of output has not displayed the balance correctly, displaying 50 instead of $50.35. Consider these lines:

 Console.WriteLine(balance);
 Console.WriteLine("balance is " + balance);
 Console.WriteLine("balance is (using ToString()) " + balance.ToString());

Only the last two lines correctly display the Currency as a string. So what's going on? The problem here is that when you combine casts with method overloads, you get another source of unpredictability. We will look at these lines in reverse order.

The third Console.WriteLine() statement explicitly calls the Currency.ToString() method ensuring the Currency is displayed as a string. The second does not do so. However, the string literal "balance is" passed to Console.WriteLine() makes it clear to the compiler that the parameter is to be interpreted as a string. Hence, the Currency.ToString() method will be called implicitly.

The very first Console.WriteLine() method, however, simply passes a raw Currency struct to Console.WriteLine(). Now, Console.WriteLine() has many overloads, but none of them takes a Currency struct. So the compiler will start fishing around to see what it can cast the Currency to in order to make it match up with one of the overloads of Console.WriteLine(). As it happens, one of the Console.WriteLine() overloads is designed to display uints quickly and efficiently, and it takes a uint as a parameter, and you have now supplied a cast that converts Currency implicitly to uint.

In fact, Console.WriteLine() has another overload that takes a double as a parameter and displays the value of that double. If you look closely at the output from the first SimpleCurrency example, you will find the very first line of output displayed Currency as a double, using this overload. In that example, there wasn't a direct cast from Currency to uint, so the compiler picked Currency-to-float-to-double as its preferred way of matching up the available casts to the available Console.WriteLine() overloads. However, now that there is a direct cast to uint available in SimpleCurrency2, the compiler has opted for this route.

The upshot of this is that if you have a method call that takes several overloads, and you attempt to pass it a parameter whose data type doesn't match any of the overloads exactly, then you are forcing the compiler to decide not only what casts to use to perform the data conversion, but which overload, and hence which data conversion, to pick. The compiler always works logically and according to strict rules, but the results may not be what you expected. If there is any doubt, you are better off specifying which cast to use explicitly.
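
The interaction between casts and overloads can be isolated from Console.WriteLine() with a sketch like the following. The Describe overloads are hypothetical stand-ins for an overloaded method; they simply report which overload the compiler selected. Currency is a cut-down version that carries both implicit casts from SimpleCurrency2.

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    public static implicit operator uint(Currency value)
    {
        return value.Dollars;
    }
}

class OverloadDemo
{
    // Hypothetical overloads standing in for an overloaded library method.
    public static string Describe(uint value) { return "uint overload"; }
    public static string Describe(double value) { return "double overload"; }
    public static string Describe(object value) { return "object overload"; }

    static void Main()
    {
        Currency balance = new Currency(50, 35);

        // No overload takes a Currency, so the compiler chooses one via the casts.
        Console.WriteLine(Describe(balance));          // uint overload

        // Stating the cast removes the guesswork.
        Console.WriteLine(Describe((float)balance));   // double overload
        Console.WriteLine(Describe((object)balance));  // object overload
    }
}
```

Writing the cast at the call site, as in the last two calls, keeps the chosen overload stable even if further implicit casts are added to Currency later.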