.NET Framework Data Types

The C in FCL stands for "class," but the FCL isn't strictly a class library; it's a library of types. A type can be any of the following:

  • Classes

  • Structs

  • Interfaces

  • Enumerations

  • Delegates

Understanding what a type is and how one type differs from another is crucial to understanding the FCL. The information in the next several sections will not only enrich your understanding of the FCL, but also help you when the time comes to build data types of your own.

Classes

A class in the .NET Framework is similar to a class in C++: a bundle of code and data that is instantiated to form objects. Classes in traditional object-oriented programming languages such as C++ contain member variables and member functions. Framework classes are richer and can contain the following members:

  • Fields, which are analogous to member variables in C++

  • Methods, which are analogous to member functions in C++

  • Properties, which expose data in the same way fields do but are in fact implemented using accessor (get and set) methods

  • Events, which define the notifications a class is capable of firing

Here, in C#, is a class that implements a Rectangle data type:

class Rectangle
{
    // Fields
    protected int width = 1;
    protected int height = 1;

    // Properties
    public int Width
    {
        get { return width; }
        set
        {
            if (value > 0)
                width = value;
            else
                throw new ArgumentOutOfRangeException (
                    "value", "Width must be 1 or higher");
        }
    }

    public int Height
    {
        get { return height; }
        set
        {
            if (value > 0)
                height = value;
            else
                throw new ArgumentOutOfRangeException (
                    "value", "Height must be 1 or higher");
        }
    }

    public int Area
    {
        get { return width * height; }
    }

    // Methods (constructors)
    public Rectangle () {}

    public Rectangle (int cx, int cy)
    {
        Width = cx;
        Height = cy;
    }
}

Rectangle has seven class members: two fields, three properties, and two methods, which both happen to be constructors (special methods that are called each time an instance of the class is created). The fields are protected, which means that only Rectangle and Rectangle derivatives can access them. To read or write a Rectangle object's width and height, a client must use the Width and Height properties. Notice that these properties' set accessors throw an exception if an illegal value is entered, a protection that couldn't be afforded had Rectangle's width and height been exposed through publicly declared fields. Area is a read-only property because it lacks a set accessor. A compiler will flag attempts to write to the Area property with compilation errors.

Many languages that target the .NET Framework feature a new operator for instantiating objects. The following statements create instances of Rectangle in C#:

Rectangle rect = new Rectangle ();     // Use first constructor
Rectangle rect = new Rectangle (3, 4); // Use second constructor

Once the object is created, it might be used like this:

rect.Width *= 2;      // Double the rectangle's width
int area = rect.Area; // Get the rectangle's new area
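To see the set accessors' protection in action, the following sketch repeats the Rectangle class and adds a small test harness. The Demo class and its IsRejected helper are hypothetical names introduced only for this illustration.

```csharp
using System;

class Rectangle
{
    protected int width = 1;
    protected int height = 1;

    public int Width
    {
        get { return width; }
        set
        {
            if (value > 0)
                width = value;
            else
                throw new ArgumentOutOfRangeException ("value", "Width must be 1 or higher");
        }
    }

    public int Height
    {
        get { return height; }
        set
        {
            if (value > 0)
                height = value;
            else
                throw new ArgumentOutOfRangeException ("value", "Height must be 1 or higher");
        }
    }

    public int Area { get { return width * height; } }

    public Rectangle () {}
    public Rectangle (int cx, int cy) { Width = cx; Height = cy; }
}

static class Demo
{
    // Returns true if the set accessor rejects the proposed width
    public static bool IsRejected (Rectangle rect, int newWidth)
    {
        try
        {
            rect.Width = newWidth;
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main ()
    {
        Rectangle rect = new Rectangle (3, 4);
        rect.Width *= 2;                            // Width is now 6
        Console.WriteLine (rect.Area);              // 24
        Console.WriteLine (IsRejected (rect, 0));   // True
        Console.WriteLine (rect.Width);             // 6 (the illegal value was not stored)
    }
}
```

Because the exception is thrown before the field is assigned, a rejected write leaves the object in its previous state.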

Significantly, neither C# nor any other .NET programming language has a delete operator. You create objects, but the garbage collector deletes them.

In C#, classes define reference types, which are allocated on the garbage-collected heap (which is often called the managed heap because it's managed by the garbage collector) and accessed through references that abstract underlying pointers. The counterpart to the reference type is the value type, which you'll learn about in the next section. Most of the time you don't have to be concerned about the differences between the two, but occasionally the differences become very important and can actually be debilitating to your code if not accounted for. See the section "Boxing and Unboxing" later in this chapter for details.

All classes inherit a virtual method named Finalize from System.Object, which is the ultimate root class for all data types. Finalize is called just before an object is destroyed by the garbage collector. The garbage collector frees the object's memory, but classes that wrap file handles, window handles, and other unmanaged resources ("unmanaged" because they're not freed by the garbage collector) must override Finalize and use it to free those resources. This, too, has some important implications for developers. I'll say more later in this chapter in the section entitled "Nondeterministic Destruction."

Incidentally, classes can derive from at most one other class, but they can derive from one class and any number of interfaces. When you read the documentation for FCL classes, don't be surprised if you occasionally see long lists of "base classes," which really aren't classes at all, but interfaces. Also be aware that if you don't specify a base class when declaring a class, your class derives implicitly from System.Object. Consequently, you can call ToString and other System.Object methods on any object.
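The inheritance rules can be sketched in a few lines. Shape, Square, IShape, and INamed are hypothetical types invented for this example; only System.Object and its ToString method come from the framework.

```csharp
using System;

// Hypothetical interfaces used only for this illustration
interface IShape { int Area { get; } }
interface INamed { string Name { get; } }

// No base class is specified, so Shape derives implicitly from System.Object
class Shape
{
}

// One base class plus any number of interfaces
class Square : Shape, IShape, INamed
{
    private int side;

    public Square (int side) { this.side = side; }

    public int Area { get { return side * side; } }
    public string Name { get { return "Square"; } }

    // Override a method inherited (through Shape) from System.Object
    public override string ToString ()
    {
        return Name + " with area " + Area;
    }
}

static class Demo
{
    public static void Main ()
    {
        Square sq = new Square (3);
        Console.WriteLine (sq.ToString ());                               // Square with area 9
        Console.WriteLine (typeof (Shape).BaseType == typeof (object));   // True
        Console.WriteLine (typeof (Square).BaseType == typeof (Shape));   // True
    }
}
```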

Structs

Classes are intended to represent complex data types. Because class instances are allocated on the managed heap, some overhead is associated with creating and destroying them. Some types, however, are simple types that would benefit from being created on the stack, which lives outside the purview of the garbage collector and offers a high-performance alternative to the managed heap. Bytes and integers are examples of simple data types.

That's why the .NET Framework supports value types as well as reference types. In C#, value types are defined with the struct keyword. Value types impose less overhead than reference types because, when declared as local variables, they're allocated on the stack, not the heap. Bytes, integers, and most of the other primitive data types that the CLR supports are value types.

Here's an example of a simple value type:

struct Point
{
    public int x;
    public int y;

    public Point (int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

Point stores x and y coordinates in fields exposed directly to clients. It also defines a constructor that can be used to instantiate and initialize a Point in one operation. A Point can be instantiated in any of the following ways:

Point point = new Point (3, 4); // x==3, y==4
Point point = new Point ();     // x==0, y==0
Point point;                    // x==0, y==0

Note that even though the first two statements appear to create a Point object on the heap, in reality the object is created on the stack. If you come from a C++ heritage, get over the notion that new always allocates memory on the heap. Also, despite the fact that the third statement creates a Point object whose fields hold zeros, C# considers the Point to be uninitialized and won't let you use it until you explicitly assign values to x and y.

Value types are subject to some restrictions that reference types are not. Value types can't derive from other types, although they implicitly derive from System.ValueType and can (and often do) derive from interfaces. They also shouldn't wrap unmanaged resources such as file handles because value types have no way to release those resources when they're destroyed. Even though value types inherit a Finalize method from System.Object, Finalize is never called because the garbage collector ignores objects created on the stack.
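As a rough illustration of those rules, the sketch below extends Point so that it implements the framework's IComparable interface. The ordering (by x, then by y) is an arbitrary choice made for this example.

```csharp
using System;

struct Point : IComparable
{
    public int x;
    public int y;

    public Point (int x, int y) { this.x = x; this.y = y; }

    // Order points by x, then by y (an arbitrary ordering chosen for this sketch)
    public int CompareTo (object obj)
    {
        Point other = (Point) obj;
        if (x != other.x)
            return x.CompareTo (other.x);
        return y.CompareTo (other.y);
    }
}

static class Demo
{
    public static void Main ()
    {
        // Every struct derives implicitly from System.ValueType
        Console.WriteLine (typeof (Point).BaseType == typeof (ValueType)); // True

        Point[] points = { new Point (3, 4), new Point (1, 2), new Point () };
        Array.Sort (points); // Relies on Point's CompareTo implementation
        Console.WriteLine ("({0}, {1})", points[0].x, points[0].y);        // (0, 0)
        Console.WriteLine ("({0}, {1})", points[2].x, points[2].y);        // (3, 4)
    }
}
```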

Interfaces

An interface is a group of zero or more abstract methods, that is, methods that have no default implementation but that are to be implemented in a class or struct. Interfaces can also include properties and events, although methods are far more common.

An interface defines a contract between a type and users of that type. For example, many of the classes in the System.Collections namespace derive from an interface named IEnumerable. IEnumerable defines methods for iterating over the items in a collection. It's because the FCL's collection classes implement IEnumerable that C#'s foreach keyword can be used with them. At run time, the code generated from foreach uses IEnumerable's GetEnumerator method to iterate over the collection's contents.
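The relationship between foreach and IEnumerable can be sketched by writing the loop both ways. The second method below is a simplified approximation of what the compiler generates (the real expansion also disposes of the enumerator when appropriate); the Demo class and method names are hypothetical.

```csharp
using System;
using System.Collections;

static class Demo
{
    // Sums the items using foreach
    public static int SumWithForeach (IEnumerable items)
    {
        int sum = 0;
        foreach (int item in items)
            sum += item;
        return sum;
    }

    // Roughly what foreach does: call GetEnumerator, then loop on MoveNext and Current
    public static int SumWithEnumerator (IEnumerable items)
    {
        int sum = 0;
        IEnumerator e = items.GetEnumerator ();
        while (e.MoveNext ())
            sum += (int) e.Current;
        return sum;
    }

    public static void Main ()
    {
        ArrayList list = new ArrayList ();
        list.Add (1);
        list.Add (2);
        list.Add (3);

        Console.WriteLine (SumWithForeach (list));    // 6
        Console.WriteLine (SumWithEnumerator (list)); // 6
    }
}
```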

Interfaces are defined with C#'s interface keyword:

interface ISecret
{
    void Encrypt (byte[] inbuf, out byte[] outbuf, Key key);
    void Unencrypt (byte[] inbuf, out byte[] outbuf, Key key);
}

A class or struct that wants to implement an interface simply derives from it and provides concrete implementations of its methods:

class Message : ISecret
{
    public void Encrypt (byte[] inbuf, out byte[] outbuf, Key key)
    {
        ...
    }

    public void Unencrypt (byte[] inbuf, out byte[] outbuf, Key key)
    {
        ...
    }
}

In C#, the is keyword can be used to determine whether an object implements a given interface. If msg is an object that implements ISecret, then in this example, is returns true; otherwise, it returns false:

if (msg is ISecret)
{
    ISecret secret = (ISecret) msg;
    secret.Encrypt (...);
}

The related as operator can be used to test an object for an interface and cast it to the interface type with a single statement.
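Here is a minimal sketch of the as operator at work. To keep the example runnable, it uses a simplified, hypothetical version of ISecret whose Encrypt method takes and returns a string, and the "encryption" merely reverses the text.

```csharp
using System;

// Simplified, hypothetical variant of ISecret used only for this sketch
interface ISecret
{
    string Encrypt (string text);
}

class Message : ISecret
{
    // Placeholder "encryption" that reverses the text; not real cryptography
    public string Encrypt (string text)
    {
        char[] chars = text.ToCharArray ();
        Array.Reverse (chars);
        return new string (chars);
    }
}

class PlainNote
{
}

static class Demo
{
    // Returns the encrypted text if obj implements ISecret; otherwise returns null
    public static string TryEncrypt (object obj, string text)
    {
        ISecret secret = obj as ISecret; // null if obj doesn't implement ISecret
        if (secret == null)
            return null;
        return secret.Encrypt (text);
    }

    public static void Main ()
    {
        Console.WriteLine (TryEncrypt (new Message (), "abc"));           // cba
        Console.WriteLine (TryEncrypt (new PlainNote (), "abc") == null); // True
    }
}
```

Unlike a cast, as never throws when the conversion fails; it yields null, so the test and the conversion happen in one step.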

Enumerations

Enumerations in .NET Framework land are similar to enumerations in C++. They're types that consist of a set of named constants, and in C# they're defined with the enum keyword. Here's a simple enumerated type named Color:

enum Color { Red, Green, Blue }

With Color thusly defined, colors can be represented this way:

Color.Red   // Red
Color.Green // Green
Color.Blue  // Blue

Many FCL classes use enumerated types as method parameters. For example, if you use the Regex class to parse text and want the parsing to be case-insensitive, you don't pass a numeric value to Regex's constructor; you pass a member of an enumerated type named RegexOptions:

Regex regex = new Regex (exp, RegexOptions.IgnoreCase);

Using words rather than numbers makes your code more readable. Nevertheless, because an enumerated type's members are assigned numeric values (by default, 0 for the first member, 1 for the second, and so on), you can always use a number in place of a member name if you prefer. In C#, doing so requires an explicit cast, as in (Color) 1.
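The mapping between member names and numbers can be checked with a short sketch. It reuses the Color type defined above; the Demo class is a hypothetical harness.

```csharp
using System;

enum Color { Red, Green, Blue }

static class Demo
{
    public static void Main ()
    {
        Color c = (Color) 1;      // Explicit cast from a number to a member
        int n = (int) Color.Blue; // Explicit cast from a member to a number

        Console.WriteLine (c);                // Green
        Console.WriteLine (n);                // 2
        Console.WriteLine (c == Color.Green); // True
    }
}
```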

The enum keyword isn't simply a compiler keyword; it creates a bona fide type that implicitly derives from System.Enum. System.Enum defines methods that you can use to do some interesting things with enumerated types. For example, you can call GetNames on an enumerated type to enumerate the names of all its members. Try that in unmanaged C++!
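A brief sketch of GetNames follows, along with Enum.Parse, a related System.Enum method that converts a name back into a member. The Demo class is a hypothetical harness.

```csharp
using System;

enum Color { Red, Green, Blue }

static class Demo
{
    public static void Main ()
    {
        // GetNames returns the member names as an array of strings
        string[] names = Enum.GetNames (typeof (Color));
        Console.WriteLine (string.Join (", ", names)); // Red, Green, Blue

        // Parse converts a name back into a member of the enumerated type
        Color c = (Color) Enum.Parse (typeof (Color), "Blue");
        Console.WriteLine ((int) c);                   // 2
    }
}
```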

Delegates

Newcomers to the .NET Framework often find delegates confusing. A delegate is a type-safe wrapper around a callback function. It's rather simple to write an unmanaged C++ application that crashes when it performs a callback. It's impossible to write a managed application that does the same, thanks to delegates.

Delegates are most commonly used to define the signatures of callback methods that are used to respond to events. For example, the FCL's Timer class (a member of the System.Timers namespace) defines an event named Elapsed that fires whenever a preprogrammed timer interval elapses. Applications that want to respond to Elapsed events pass a Timer object a reference to the method they want called when an Elapsed event fires. The reference that they pass isn't a raw memory address but rather an instance of a delegate that wraps the method's memory address. The System.Timers namespace defines a delegate named ElapsedEventHandler for precisely that purpose.

If you could steal a look at the Timer class's source code, you'd see something like this:

public delegate void ElapsedEventHandler (Object sender, ElapsedEventArgs e);

public class Timer
{
    public event ElapsedEventHandler Elapsed;
      .
      .
      .
}

Here's how Timer fires an Elapsed event:

if (Elapsed != null) // Make sure somebody's listening
    Elapsed (this, new ElapsedEventArgs (...)); // Fire!

And here's how a client might use a Timer object to call a method named UpdateData every 60 seconds:

Timer timer = new Timer (60000);
timer.Elapsed += new ElapsedEventHandler (UpdateData);
  .
  .
  .
void UpdateData (Object sender, ElapsedEventArgs e)
{
    // Callback received!
}

As you can see, UpdateData conforms to the signature specified by the delegate. To register to receive Elapsed events, the client creates a new instance of ElapsedEventHandler that wraps UpdateData (note the reference to UpdateData passed to ElapsedEventHandler's constructor) and wires it to timer's Elapsed event using the += operator. This paradigm is used over and over in .NET Framework applications. Events and delegates are an important feature of the type system.
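The same publish-and-subscribe pattern can be tried without waiting on a real timer. In the sketch below, Ticker is a hypothetical class that fires its event on demand, and it uses the framework's general-purpose EventHandler delegate rather than ElapsedEventHandler.

```csharp
using System;

// Hypothetical publisher that mimics the Timer pattern without real timing
class Ticker
{
    public event EventHandler Ticked;

    public void Tick ()
    {
        if (Ticked != null) // Make sure somebody's listening
            Ticked (this, EventArgs.Empty);
    }
}

static class Demo
{
    static int calls = 0;

    // Conforms to the signature specified by the EventHandler delegate
    static void OnTicked (object sender, EventArgs e)
    {
        calls++;
    }

    // Returns the number of callbacks received
    public static int Run ()
    {
        calls = 0;
        Ticker ticker = new Ticker ();

        ticker.Tick ();                               // No subscribers, so nothing happens
        ticker.Ticked += new EventHandler (OnTicked); // Subscribe
        ticker.Tick ();
        ticker.Tick ();
        ticker.Ticked -= new EventHandler (OnTicked); // Unsubscribe
        ticker.Tick ();

        return calls;
    }

    public static void Main ()
    {
        Console.WriteLine (Run ()); // 2
    }
}
```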

In practice, it's instructive to know more about what happens under the hood when a compiler encounters a delegate definition. Suppose the C# compiler encounters code such as this:

public delegate void ElapsedEventHandler (Object sender, ElapsedEventArgs e);

It responds by generating a class that derives from System.MulticastDelegate. The delegate keyword is simply an alias for something that in this case looks like this:

public class ElapsedEventHandler : MulticastDelegate
{
    public ElapsedEventHandler (object target, int method)
    {
        ...
    }

    public virtual void Invoke (object sender, ElapsedEventArgs e)
    {
        ...
    }
    ...
}

The derived class inherits several important members from MulticastDelegate, including private fields that identify the method that the delegate wraps and the object instance that implements the method (assuming the method is an instance method rather than a static method). The compiler adds an Invoke method that calls the method that the delegate wraps. C# hides the Invoke method and lets you invoke a callback method simply by using a delegate's instance name as if it were a method name.
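A small sketch makes the compiler-generated class visible. TickHandler and Counter are hypothetical names invented for this example, and the explicit call to Invoke assumes a reasonably current C# compiler, which permits it.

```csharp
using System;

// Hypothetical delegate and class used only for this sketch
delegate void TickHandler (string source);

class Counter
{
    public int Count;

    public void OnTick (string source)
    {
        Count++;
    }
}

static class Demo
{
    // Returns the number of times OnTick was called
    public static int Run ()
    {
        Counter counter = new Counter ();
        TickHandler handler = new TickHandler (counter.OnTick);

        handler ("timer");        // C# shorthand
        handler.Invoke ("timer"); // The method the shorthand actually calls

        // A multicast delegate can wrap more than one method
        handler += new TickHandler (counter.OnTick);
        handler ("timer");        // Calls OnTick twice

        return counter.Count;
    }

    public static void Main ()
    {
        Console.WriteLine (Run ());                                                  // 4
        Console.WriteLine (typeof (TickHandler).BaseType == typeof (MulticastDelegate)); // True
    }
}
```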

Boxing and Unboxing

The architects of the .NET Framework could have made every type a reference type, but they chose to support value types as well to avoid imposing undue overhead on the use of integers and other primitive data types. But there's a downside to a type system with a split personality. To pass a value type to a method that expects a reference type, you must convert the value type to a reference type. You can't convert a value type to a reference type per se, but you can box the value type. Boxing creates a copy of a value type on the managed heap. The opposite of boxing is unboxing, which, in C#, copies the boxed value from the managed heap back into a value type variable (typically one on the stack). Common intermediate language (CIL) has instructions for performing boxing and unboxing.

Some compilers, the C# and Visual Basic .NET compilers among them, attempt to provide a unified view of the type system by hiding boxing and unboxing under the hood. The following code wouldn't work without boxing because it stores an int in a Hashtable object, and Hashtable objects store references exclusively:

Hashtable table = new Hashtable (); // Create a Hashtable
table.Add ("First", 1);             // Add 1 keyed by "First"

Here's the CIL emitted by the C# compiler:

newobj     instance void [mscorlib]System.Collections.Hashtable::.ctor()
stloc.0
ldloc.0
ldstr      "First"
ldc.i4.1
box        [mscorlib]System.Int32
callvirt   instance void [mscorlib]System.Collections.Hashtable::Add(object, object)

Notice the box instruction that converts the integer value 1 to a boxed value type. The compiler emitted this instruction so that you wouldn't have to think about reference types and value types. The string used to key the Hashtable entry ("First") doesn't have to be boxed because it's an instance of System.String, and System.String is a reference type.

Many compilers are happy to box values without being asked to. For example, the following C# code compiles just fine:

int val = 1;      // Declare an instance of a value type
object obj = val; // Box it

But in C#, unboxing a reference value requires an explicit cast:

int val = 1;
object obj = val;
int val2 = obj;       // This won't compile
int val3 = (int) obj; // This will

You lose a bit of performance when you box or unbox a value, but in the vast majority of applications, such losses are more than offset by the added efficiency of storing simple data types on the stack rather than in the garbage-collected heap.
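Two consequences of boxing are easy to overlook: the boxed object is an independent copy, and a boxed value can be unboxed only to its exact type. The sketch below demonstrates both; the Demo class and its UnboxAsLong helper are hypothetical.

```csharp
using System;

static class Demo
{
    // Tries to unbox obj as a long and reports what happened
    public static string UnboxAsLong (object obj)
    {
        try
        {
            long value = (long) obj; // Succeeds only if obj is a boxed long
            return "long " + value;
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    public static void Main ()
    {
        int val = 1;
        object obj = val;     // Box: copies val to the managed heap
        val = 2;              // Changing val doesn't affect the boxed copy
        int val2 = (int) obj; // Unbox: copies the boxed value back out
        Console.WriteLine (val2);              // 1

        Console.WriteLine (UnboxAsLong (obj)); // InvalidCastException
        Console.WriteLine (UnboxAsLong (5L));  // long 5
        Console.WriteLine ((long) (int) obj);  // 1 (unbox to int, then convert)
    }
}
```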

Reference Types vs. Value Types

Thanks to boxing and unboxing, the dichotomy between value types and reference types is mostly transparent to the programmer. Sometimes, however, you must know which type you're dealing with; otherwise, subtle differences between the two can impact your application's behavior in ways that you might not expect.

Here's an example. The following code defines a simple reference type (class) named Point. It also declares two Point references, p1 and p2. The reference p1 is initialized with a reference to a new Point object, and p2 is initialized by setting it equal to p1. Because p1 and p2 are little more than pointers in disguise, setting one equal to the other does not make a copy of the Point object; it merely copies an address. Therefore, modifying one Point affects both:

class Point
{
    public int x;
    public int y;
}
  .
  .
  .
Point p1 = new Point ();
p1.x = 1;
p1.y = 2;
Point p2 = p1; // Copies the underlying pointer
p2.x = 3;
p2.y = 4;

Console.WriteLine ("p1 = ({0}, {1})", p1.x, p1.y); // Writes "p1 = (3, 4)"
Console.WriteLine ("p2 = ({0}, {1})", p2.x, p2.y); // Writes "p2 = (3, 4)"

The next code fragment is identical to the first, save for the fact that Point is now a value type (struct). But because setting one value type equal to another creates a copy of the latter, the results are quite different. Changes made to one Point no longer affect the other:

struct Point
{
    public int x;
    public int y;
}
  .
  .
  .
Point p1 = new Point ();
p1.x = 1;
p1.y = 2;
Point p2 = p1; // Makes a new copy of the object on the stack
p2.x = 3;
p2.y = 4;

Console.WriteLine ("p1 = ({0}, {1})", p1.x, p1.y); // Writes "p1 = (1, 2)"
Console.WriteLine ("p2 = ({0}, {1})", p2.x, p2.y); // Writes "p2 = (3, 4)"

Sometimes differences between reference types and value types are even more insidious. For example, if Point is a value type, the following code is perfectly legal:

Point p;
p.x = 3;
p.y = 4;

But if Point is a reference type, the very same instruction sequence won't even compile. Why? Because the statement

Point p;

declares an instance of a value type but only a reference to a reference type. A reference is like a pointer: it's useless until it's initialized, as in the following:

Point p = new Point ();

Programmers with C++ experience are especially vulnerable to this error because they see a statement that declares a reference and automatically assume that an object is being created on the stack.

The FCL contains a mixture of value types and reference types. Clearly, it's sometimes important to know which type you're dealing with. How do you know whether a particular FCL type is a value type or a reference type? Simple. If the documentation says it's a class (as in "String Class"), it's a reference type. If the documentation says it's a structure (for example, "DateTime Structure"), it's a value type. Be aware of the difference, and you'll avoid frustrating hours spent in the debugger trying to figure out why code that looks perfectly good produces unpredictable results.
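When the documentation isn't handy, the same question can be answered at run time through reflection. This sketch relies on the IsValueType property of System.Type; the Demo class is a hypothetical harness.

```csharp
using System;

static class Demo
{
    public static void Main ()
    {
        Console.WriteLine (typeof (string).IsValueType);    // False (String is a class)
        Console.WriteLine (typeof (DateTime).IsValueType);  // True  (DateTime is a structure)
        Console.WriteLine (typeof (int).IsValueType);       // True  (primitive value type)
        Console.WriteLine (typeof (DayOfWeek).IsValueType); // True  (enumerations are value types)
    }
}
```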

Nondeterministic Destruction

In traditional environments, objects are created and destroyed at precise, deterministic points in time. As an example, consider the following class written in unmanaged C++:

class File
{
protected:
    int Handle; // File handle

public:
    File (const char* name)
    {
        // TODO: Open the file and copy the handle to Handle
    }

    ~File ()
    {
        // TODO: Close the file handle
    }
};

When you instantiate this class, the class constructor is called:

File* pFile = new File ("Readme.txt");

And when you delete the object, its destructor is called:

delete pFile;

If you create the object on the stack instead of the heap, destruction is still deterministic because the class destructor is called the moment the object goes out of scope.

Destruction works differently in the .NET Framework. Remember, you create objects, but you never delete them; the garbage collector deletes them for you. But therein lies a problem. Suppose you write a File class in C#:

class File
{
    protected IntPtr Handle = IntPtr.Zero;

    public File (string name)
    {
        // TODO: Open the file and copy the handle to Handle
    }

    ~File ()
    {
        // TODO: Close the file handle
    }
}

Then you create a class instance like this:

File file = new File ("Readme.txt");

Now ask yourself a question: when does the file handle get closed?

The short answer is that the handle gets closed when the object is destroyed. But when is the object destroyed? When the garbage collector destroys it. When does the garbage collector destroy it? Ah, there's the key question. You don't know. You can't know because the garbage collector decides on its own when to run, and until the garbage collector runs, the object isn't destroyed and its destructor isn't called. That's called nondeterministic destruction, or NDD. Technically, there's no such thing as a destructor in managed code. When you write something that looks like a destructor in C#, the compiler actually overrides the Finalize method that your class inherits from System.Object. C# simplifies the syntax by letting you write something that looks like a destructor, but that arguably makes matters worse because it implies that it is a destructor, and to unknowing developers, destructors imply deterministic destruction.

Deterministic destruction doesn't exist in framework applications unless your code does something really ugly, like this:

GC.Collect ();

GC is a class in the System namespace that provides a programmatic interface to the garbage collector. Collect is a static method that forces a collection. Garbage collecting impedes performance, so now that you know that this method exists, forget about it. The last thing you want to do is write code that simulates deterministic destruction by calling the garbage collector periodically.

NDD is a big deal because failure to account for it can lead to all sorts of run-time errors in your applications. Suppose someone uses your File class to open a file. Later on that person uses it to open the same file again. Depending on how the file was opened the first time, it might not open again because the handle is still open if the garbage collector hasn't run.

File handles aren't the only problem. Take bitmaps, for instance. The FCL features a handy little class named Bitmap (it's in the System.Drawing namespace) that encapsulates bitmapped images and understands a wide variety of image file formats. When you create a Bitmap object on a Windows machine, the Bitmap object calls down to the Windows GDI, creates a GDI bitmap, and stores the GDI bitmap handle in a field. But guess what? Until the garbage collector runs and the Bitmap object's Finalize method is called, the GDI bitmap remains open. Large GDI bitmaps consume lots of memory, so it's entirely conceivable that after the application has run for a while, it'll start throwing exceptions every time it tries to create a bitmap because of insufficient memory. End users won't appreciate an image viewer utility (like the one you'll build in Chapter 4) that has to be restarted every few minutes.

So what do you do about NDD? Here are two rules for avoiding the NDD blues. The first rule is for programmers who use (rather than write) classes that encapsulate file handles and other unmanaged resources. Most such classes implement a method named Close or Dispose that releases resources that require deterministic closure. If you use classes that wrap unmanaged resources, call Close or Dispose on them the moment you're finished using them. Assuming File implements a Close method that closes the encapsulated file handle, here's the right way to use the File class:

File file = new File ("Readme.txt");
  .
  .
  .
// Finished using the file, so close it
file.Close ();
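If an exception is thrown between the point where the object is created and the call to Close, the call is skipped. One way to guard against that, sketched below, is to place the call in a finally block; for classes that implement IDisposable, C#'s using statement does the equivalent with less typing. The File class here is a hypothetical stand-in that tracks an IsOpen flag instead of opening a real file, and the assumption that it implements IDisposable is mine, not something stated above.

```csharp
using System;

// Hypothetical stand-in for File that tracks state instead of opening a real file
class File : IDisposable
{
    public bool IsOpen = true;

    public void Close () { IsOpen = false; }
    public void Dispose () { Close (); }
}

static class Demo
{
    // Returns true if the file was closed even though an exception was thrown
    public static bool ClosedAfterException ()
    {
        File file = new File ();
        try
        {
            try
            {
                throw new InvalidOperationException ("Simulated failure");
            }
            finally
            {
                file.Close (); // Runs whether or not an exception was thrown
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed so that the demo can continue
        }
        return !file.IsOpen;
    }

    // Returns true if the using statement disposed of the file
    public static bool ClosedAfterUsing ()
    {
        File file = new File ();
        using (file)
        {
            // Work with the file here
        } // Dispose is called automatically at this point
        return !file.IsOpen;
    }

    public static void Main ()
    {
        Console.WriteLine (ClosedAfterException ()); // True
        Console.WriteLine (ClosedAfterUsing ());     // True
    }
}
```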

The second rule, which is actually a set of rules, applies to developers who write classes that wrap unmanaged resources. Here's a summary:

  • Implement a protected Dispose method (hereafter referred to as the "protected Dispose") that takes a Boolean as a parameter. In this method, free any unmanaged resources (such as file handles) that the class encapsulates. If the parameter passed to the protected Dispose is true, also call Close or Dispose (the public Dispose inherited from IDisposable) on any class members (fields) that wrap unmanaged resources.

  • Implement the .NET Framework's IDisposable interface, which contains a single method named Dispose that takes no parameters. Implement this version of Dispose (the "public Dispose") by calling GC.SuppressFinalize to prevent the garbage collector from calling Finalize, and then calling the protected Dispose and passing in true.

  • Override Finalize. Finalize is called by the garbage collector when an object is finalized, that is, when an object is destroyed. In Finalize, call the protected Dispose and pass in false. The false parameter is important because it prevents the protected Dispose from attempting to call Close or the public Dispose on any encapsulated class members, which may already have been finalized if a garbage collection is in progress.

  • If it makes sense semantically (for example, if the resource that the class encapsulates can be closed in the manner of a file handle), implement a Close method that calls the public Dispose.

Based on these principles, here's the right way to implement a File class:

class File : IDisposable
{
    protected IntPtr Handle = IntPtr.Zero;

    public File (string name)
    {
        // TODO: Open the file and copy the handle to Handle.
    }

    ~File ()
    {
        Dispose (false);
    }

    public void Dispose ()
    {
        GC.SuppressFinalize (this);
        Dispose (true);
    }

    protected virtual void Dispose (bool disposing)
    {
        // TODO: Close the file handle.
        if (disposing)
        {
            // TODO: If the class has members that wrap
            // unmanaged resources, call Close or Dispose on
            // them here.
        }
    }

    public void Close ()
    {
        Dispose ();
    }
}

Note that the "destructor" (actually, the Finalize method) now calls the protected Dispose with a false parameter, and that the public Dispose calls the protected Dispose and passes in true. The call to GC.SuppressFinalize is both a performance optimization and a measure to prevent the handle from being closed twice. Because the object has already closed the file handle, there's no need for the garbage collector to call its Finalize method. It's still important to override the Finalize method to ensure proper disposal if Close or Dispose isn't called.
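To watch the pattern work without a real file, the sketch below replaces the handle with a static counter named OpenHandles. Both the counter and the disposed flag, which guards against releasing the resource twice when a caller invokes both Close and Dispose, are assumptions added for this illustration rather than part of the pattern as described above.

```csharp
using System;

// Hypothetical resource wrapper in which a counter stands in for a real file handle
class File : IDisposable
{
    public static int OpenHandles = 0; // Tracks "open" handles for the demo
    private bool disposed = false;     // Added guard against double release

    public File (string name)
    {
        OpenHandles++;                 // Stands in for opening the file
    }

    ~File ()
    {
        Dispose (false);
    }

    public void Dispose ()
    {
        GC.SuppressFinalize (this);
        Dispose (true);
    }

    protected virtual void Dispose (bool disposing)
    {
        if (disposed)
            return;                    // The "handle" is already closed
        disposed = true;
        OpenHandles--;                 // Stands in for closing the file handle
    }

    public void Close ()
    {
        Dispose ();
    }
}

static class Demo
{
    public static void Main ()
    {
        File file = new File ("Readme.txt");
        Console.WriteLine (File.OpenHandles); // 1

        file.Close ();
        file.Dispose ();                      // The second call is a harmless no-op
        Console.WriteLine (File.OpenHandles); // 0
    }
}
```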



Programming Microsoft  .NET
Applied MicrosoftNET Framework Programming in Microsoft Visual BasicNET
ISBN: B000MUD834
EAN: N/A
Year: 2002
Pages: 101
