Managed and Unmanaged Data | Microsoft Visual C++ .NET 2003 Kick Start

When you are writing an unmanaged application, you have only unmanaged data. When you write a managed application in Visual Basic or C#, you have only managed data. But when you write a managed C++ application, you control whether your data is managed or unmanaged. There are benefits to managed data, but to gain those benefits you have to accept some restrictions.

Unmanaged Data

This term is just a way to describe the way data has always been handled in C++ programming. You can invent a class:

 class ArithmeticClass
 {
 public:
     double Add(double num1, double num2);
 };

You can create an instance of your class on the stack and use it, like this:

 ArithmeticClass arith;
 arith.Add(2,2);

Or you can create an instance on the heap, and use it through a pointer:

 ArithmeticClass* pArith = new ArithmeticClass();
 pArith->Add(2,2);

When the flow of control leaves the block in which the stack variable was declared, the class' destructor runs and the memory becomes available on the stack again. The heap variable must be explicitly cleared away:

 delete pArith;

And this, of course, is the rub. Memory management is your job in classic C++. What you create with new you must clean up with delete. If you forget, you can suffer a memory leak.

What's more, when you use an unmanaged pointer, you can make all kinds of mistakes that overwrite other memory or otherwise corrupt your application. When your data is unmanaged, it's up to you to manage it. That can be a lot of work, and if you mess up, you might create a subtle bug that's hard to find.
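The difference between the two lifetimes is easy to see if a class counts its own instances. The following is only a sketch in plain unmanaged C++; the Tracked class and the two helper functions are hypothetical names invented for this illustration.

```cpp
#include <cassert>

// Hypothetical class that counts how many instances are currently alive.
class Tracked
{
public:
    static int live;           // number of live instances
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};

int Tracked::live = 0;

// Creates one object on the stack and one on the heap, cleans up both,
// and returns the number of instances still alive afterward.
int LiveAfterCleanup()
{
    {
        Tracked onStack;                 // destroyed at the closing brace
        assert(Tracked::live == 1);
    }
    assert(Tracked::live == 0);

    Tracked* onHeap = new Tracked();     // lives until someone calls delete
    assert(Tracked::live == 1);
    delete onHeap;                       // your job in classic C++
    return Tracked::live;
}

// Hands a heap object to the caller. The object outlives this function,
// so the caller must remember to delete it or it leaks.
Tracked* MakeOnHeap()
{
    return new Tracked();
}
```

If the caller of MakeOnHeap() never calls delete on the returned pointer, Tracked::live never drops back to zero. That is the memory leak described above.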

Garbage-Collected Classes

If you create a class as a garbage-collected class, the runtime will manage the memory for you. In exchange, you give up the capability to create an instance on the stack, and you have to follow some rules when designing your class.

You set garbage collection class-by-class, not object-by-object. When you define the class, include the __gc keyword, like this:

 __gc class Sample
 {
 private:
     double number;
 public:
     Sample(double num) : number(num) {}
 };

Having defined your class in this way, you cannot create instances on the stack any more. If you do, this compiler error results:

 error C3149: 'Sample' : illegal use of managed type 'Sample'; did you forget a '*'?

Instead, you can only allocate instances of a garbage-collected class on the heap, with new. For example:

 Sample* s = new Sample(3.2);

When you're finished with this instance, just ignore it. When the pointer goes out of scope, the instance will be eligible for cleanup by the garbage collector. If you do try to delete this instance, you'll get a compiler error:

 error C3841: illegal delete expression: managed type 'Sample' does not have a destructor defined

Just leave the pointer alone and let the runtime manage your memory. It will be cleaned up eventually.

SHOP TALK: DETERMINISTIC DESTRUCTION

One of the hallmarks of a garbage-collected runtime is that you don't know when an object in memory will be cleaned up. If the application uses a lot of memory and is constantly allocating more, anything that's eligible to be cleaned up will be destructed quickly. In an application that doesn't use much memory and runs for a long time, it might be hours or days before an object is cleaned up.

C++ programmers are used to putting clean-up code in the destructor. They know when the destructor will run: when the object goes out of scope or when they use the delete operator on the pointer. This is called deterministic destruction, because programmers know when the destructor will run. But when you use managed data, you have no idea when the destructor will execute, which is called nondeterministic destruction. In this case, putting clean-up code in the destructor is a bad idea. For example, if the destructor closes a file, the file will sit open until the destructor runs, and that might be a really long time.

I spent quite a few nights up late with friends arguing about deterministic destruction. Finally, I came to an opinion. If a managed class holds only memory (things you would normally clean up with delete), don't write a destructor and don't worry about destruction, deterministic or otherwise. If the class holds a non-memory resource (an open file, a database connection, a resource lock of some kind, or anything else that isn't memory and therefore isn't garbage collected), don't use a destructor either; implement the Dispose pattern. The heart of this pattern is a method called Dispose that cleans up the non-memory resource (closes a file, for example). Code that creates instances of your class should call the Dispose method when it's through with the instance. You can write a destructor as a sort of backup plan in case client code forgets to dispose your object.
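The shape of the pattern can be sketched in plain unmanaged C++, so that it compiles without the Managed Extensions. Treat this as an illustration of the idea rather than the .NET implementation: the Connection class and its openCount counter are hypothetical, and the counter merely stands in for a scarce non-memory resource.

```cpp
#include <cassert>

// Hypothetical class holding a non-memory resource.
class Connection
{
public:
    static int openCount;    // stands in for open files, locks, and so on

    Connection() : open_(true) { ++openCount; }

    // Releases the resource as soon as the client is done with it.
    // Safe to call more than once.
    void Dispose()
    {
        if (open_)
        {
            open_ = false;
            --openCount;
        }
    }

    // Backup plan in case client code forgets to call Dispose().
    ~Connection() { Dispose(); }

private:
    bool open_;
    Connection(const Connection&);             // copying is not supported
    Connection& operator=(const Connection&);
};

int Connection::openCount = 0;

// Returns the number of open connections after an explicit Dispose(),
// while the object itself is still in scope.
int OpenAfterDispose()
{
    Connection c;
    assert(Connection::openCount == 1);
    c.Dispose();
    c.Dispose();                    // a second call is harmless
    return Connection::openCount;   // the destructor has not run yet
}
```

The point of the sketch is that the resource is released at the moment the client calls Dispose(), not whenever the destructor happens to run. The guard on open_ keeps the backup destructor from releasing the resource twice.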

Making your class a garbage-collected class is not always as simple as adding the __gc keyword to the definition. There are some restrictions about the way you can define your class if you want it to be garbage-collected, and some restrictions on the way you use it, too.

Inheritance Restrictions

When you write an unmanaged class, you can inherit from any kind of base class at all, or no base class if you prefer. Garbage-collected classes don't have quite the same freedom. Specifically, a garbage-collected class cannot inherit from an unmanaged class.

Consider this pair of classes:

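 class A
 {
 protected:
     int a;
 };

 __gc class Sample : public A
 {
 public:
     // a belongs to the base class, so it is assigned in the constructor
     // body rather than in the member initializer list
     Sample(int x) { a = x; }
 };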

This looks like perfectly good C++. And it would be, if it weren't for that __gc extension keyword in the definition of Sample. If you type this code and try to compile it, you'll be told

 error C3253: 'A' : a managed class cannot derive from an unmanaged class

To fix the problem, make the base class managed if possible, or leave the derived class unmanaged.

Single Inheritance

Not only must garbage-collected classes inherit only from other garbage-collected classes, but they also can't use multiple inheritance. Consider this variant on the Sample class shown earlier:

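 __gc class A
 {
 protected:
     int a;
 };

 __gc class B
 {
 protected:
     int b;
 };

 __gc class Sample : public A, public B
 {
 public:
     // a and b belong to the base classes, so they are assigned in the
     // constructor body rather than in the member initializer list
     Sample(int x, int y) { a = x; b = y; }
 };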

Compiling these classes produces this error:

 error C2890: 'Sample' : managed class can only have one non-interface superclass

You can, if you want, inherit from as many managed interfaces as you need to, but you can't use traditional multiple inheritance. Most C++ programmers don't use multiple inheritance. If you're such a programmer, it's no great loss to give it up. If you have working code that uses multiple inheritance, it's best to leave it as unmanaged and access it from new managed code. You'll see how to do that later in this book.
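The shape that managed code allows (one base class that carries data, plus any number of interfaces) can be sketched in plain unmanaged C++. This is one possible way to reshape the Sample class above, not a prescribed one: IReportable and IResettable are hypothetical interfaces, and holding B as a member instead of inheriting from it is an assumption made for the sake of the example.

```cpp
#include <cassert>

// Hypothetical "interfaces": pure virtual functions only, no data.
class IReportable
{
public:
    virtual int Total() const = 0;
    virtual ~IReportable() {}
};

class IResettable
{
public:
    virtual void Reset() = 0;
    virtual ~IResettable() {}
};

// The single base class that carries data.
class A
{
protected:
    int a;
public:
    explicit A(int x) : a(x) {}
};

// Formerly a second base class; here it is held as a member instead.
class B
{
public:
    int b;
    explicit B(int y) : b(y) {}
};

// One data-carrying base class, two interfaces, and B by composition.
class Sample : public A, public IReportable, public IResettable
{
public:
    Sample(int x, int y) : A(x), part(y) {}
    int Total() const { return a + part.b; }
    void Reset() { a = 0; part.b = 0; }
private:
    B part;
};
```

Client code can still treat a Sample as an IReportable or an IResettable, so most of what multiple inheritance offered is kept without a second data-carrying base class.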

Additional Restrictions on Garbage-Collected Classes

Garbage-collected classes have some other restrictions:

You cannot use the friend keyword to give another class access to the private members of a managed class.
No member variable of the class can be an instance of an unmanaged class (unless all the member functions of the unmanaged class are static).
You cannot override operator & or operator new.
You cannot implement a copy constructor.

At first glance, these might seem restrictive. But remember that the reason you usually write a copy constructor is that your destructor does something destructive, such as freeing memory. A garbage-collected class probably has no destructor at all, and therefore has no need for a specialized copy constructor.
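To see why the copy constructor and the destructor go together in unmanaged code, consider a class that owns heap memory. This is a sketch only; the Buffer class and the CopiesAreIndependent() helper are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical unmanaged class that owns a heap-allocated string.
class Buffer
{
public:
    explicit Buffer(const char* text)
        : data_(new char[std::strlen(text) + 1])
    {
        std::strcpy(data_, text);
    }

    // Deep copy. Without it, the compiler-generated copy would share one
    // array between two objects and both destructors would delete[] it.
    Buffer(const Buffer& other)
        : data_(new char[std::strlen(other.data_) + 1])
    {
        std::strcpy(data_, other.data_);
    }

    Buffer& operator=(const Buffer& other)
    {
        if (this != &other)
        {
            char* copy = new char[std::strlen(other.data_) + 1];
            std::strcpy(copy, other.data_);
            delete[] data_;
            data_ = copy;
        }
        return *this;
    }

    // The "destructive" destructor that makes the copy constructor necessary.
    ~Buffer() { delete[] data_; }

    const char* Text() const { return data_; }
    void SetFirst(char c) { data_[0] = c; }   // assumes non-empty text

private:
    char* data_;
};

// Returns true if changing a copy leaves the original untouched.
bool CopiesAreIndependent()
{
    Buffer original("abc");
    Buffer copy(original);
    copy.SetFirst('x');
    return std::strcmp(original.Text(), "abc") == 0
        && std::strcmp(copy.Text(), "xbc") == 0;
}
```

Take away the destructor's delete[] and the need for the hand-written copy constructor largely disappears, which is the situation a garbage-collected class is in.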

Value Classes

Many programmers feel uncomfortable when they are told they cannot create instances of a garbage-collected class on the stack. There are a number of advantages, in "classic" C++, to creating an object on the stack:

The object is destructed for you when it goes out of scope.
The overhead of allocating on the stack is slightly less than allocating on the heap.
The heap can get fragmented (and therefore you will have a performance hit) if you allocate and free a lot of short-lived objects.

In managed C++ (in other words, C++ with Managed Extensions), the garbage collector takes care of destructing the object for you. It can also defragment the heap. The garbage collector introduces overhead of its own, of course, and the allocation cost difference between the stack and the heap remains. So, for certain kinds of objects, it might be a better choice to use a value class rather than a garbage-collected class.

The fundamental types, such as int, are referred to as value types, because they are allocated on the stack. You can define a simple class of yours to be a value class. You can also do the same for a struct. If your class mainly exists to hold a few small member variables, and doesn't have a complicated lifetime, it's a good candidate for a value class.

Here is Sample once again as a value class:

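 __value class Sample
 {
 public:
     int a;
     int b;
     Sample(int x, int y) : a(x), b(y) {}

     // Report() is called by the fragments in this section but is not shown
     // in the original listing; this minimal version is an assumption.
     // The __box extension is explained later in this chapter.
     void Report()
     {
         System::Console::WriteLine("a is {0} and b is {1}", __box(a), __box(b));
     }
 };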

To create and use an instance of Sample , you must allocate it on the stack, not the heap:

 Sample s(2,4);
 s.Report();

These value classes are still managed, but they're not garbage-collected. The restrictions on value classes are:

You cannot use the friend keyword to give another class access to the private members of a managed class.
No member variable of the class can be an instance of an unmanaged class (unless all the member functions of the unmanaged class are static).
You cannot implement a copy constructor.
A value class cannot inherit from a garbage-collected class, or another value class, or an unmanaged class. It can inherit from any number of managed interfaces.
A value class cannot have virtual methods other than those it inherits (and possibly overrides) from System::ValueType.
No class can inherit from a value class.
A value class cannot be labeled abstract with the __abstract keyword.

The best candidates for value classes are small classes whose objects don't exist for long and aren't passed around from method to method (thus creating a lot of references to them in different pieces of code).

Another advantage of value classes is that instances are essentially never uninitialized. When you allocate an instance of Sample on the stack, you can pass parameters to the constructor. But if you do not, the member variables are initialized to zero for you. This is true even if you wrote a constructor that takes arguments, and didn't write a constructor that doesn't take arguments. Look at these two lines of code:

 Sample s2;
 s2.Report();

STACK AND HEAP

If you've been reading or hearing about Visual Studio .NET already, you might have heard that structs are value types and classes are reference types, or that structs are on the stack and objects are on the heap. Those statements apply to C#. In C++, you're in charge: you can allocate instances of value types (classes or structs) on the stack and instances of garbage-collected types (classes or structs) on the heap.

When the two lines that create s2 and call Report() run, they report

 a is 0 and b is 0

Because the Sample class is managed, the members are initialized automatically.
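For contrast, an unmanaged class gets no such guarantee unless you ask for it. The sketch below uses plain unmanaged C++; PlainSample is a hypothetical struct with no constructors, and the zeroing comes from value-initialization, which you have to request explicitly.

```cpp
#include <cassert>

// Hypothetical unmanaged counterpart of Sample, with no constructors.
struct PlainSample
{
    int a;
    int b;
};

// Returns a + b for an instance that was value-initialized.
int SumOfZeroedMembers()
{
    // The trailing PlainSample() requests value-initialization, which sets
    // a and b to zero. Writing just "PlainSample s;" would leave them
    // holding indeterminate values.
    PlainSample s = PlainSample();
    return s.a + s.b;
}
```

In the managed value class, the zeroing happens whether or not you remember to ask for it.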

Pinning and Boxing

The class libraries that come with the .NET Framework are terrific: they provide functionality every application needs. It's natural to use them from your managed C++ applications. One thing you need to know: The methods in these classes are expecting pointers to garbage-collected objects, not the unmanaged data your application might be using. What's more, when you use older libraries, the methods are expecting instances or pointers to instances of unmanaged variables, not instances of managed classes or member variables of those managed instances. When you mix managed and unmanaged data, you need to use two new .NET concepts: boxing and pinning.

Boxing a Fundamental Type

If you try to pass a piece of unmanaged data to a method that is expecting managed data, the compiler will reject your attempt. Consider this fragment of code:

 int i = 3;
 System::Console::WriteLine("i is {0}", i);

Simple as it looks, this code won't compile. The error message is

 error C2665: 'System::Console::WriteLine' : none of the 18 overloads can convert parameter 2 from type 'int'
         could be 'void System::Console::WriteLine(System::String __gc *,System::Object __gc *)'

You need to pass a pointer to a garbage-collected object to the WriteLine method. The way you get a pointer to a garbage-collected object is to use the __box extension, like this:

 System::Console::WriteLine("i is {0}", __box(i));

This is referred to as boxing the integer, and is a convenient way to use framework methods even from an application that has some unmanaged data.
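Conceptually, boxing copies the value into a small object on the heap and hands back a pointer to that object. The idea can be imitated in plain unmanaged C++. This is an analogy, not how the runtime implements __box: the Object, BoxedInt, and Format names are hypothetical stand-ins, and because the sketch has no garbage collector, the box is deleted by hand.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for System::Object.
class Object
{
public:
    virtual std::string ToString() const = 0;
    virtual ~Object() {}
};

// Hypothetical box: a heap object that holds a copy of an int.
class BoxedInt : public Object
{
public:
    explicit BoxedInt(int v) : value_(v) {}

    std::string ToString() const
    {
        std::ostringstream out;
        out << value_;
        return out.str();
    }

private:
    int value_;
};

// Stand-in for a framework method that accepts only pointers to Object.
std::string Format(const std::string& label, const Object* arg)
{
    return label + arg->ToString();
}

// Boxes an int, changes the original, and formats the boxed copy.
std::string BoxAndFormat()
{
    int i = 3;
    BoxedInt* boxed = new BoxedInt(i);   // copy the value into a heap object
    i = 99;                              // the box keeps its own copy
    std::string text = Format("i is ", boxed);
    delete boxed;                        // no garbage collector in this sketch
    return text;
}
```

Because the box holds a copy, changing i after boxing has no effect on what the boxed object reports.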

Pinning a Pointer

Sometimes your problem is the other way around. You have a function, already written, that expects a pointer of some sort. If this is legacy code, it doesn't expect a pointer that can move around; it expects a classic pointer to an unmanaged object. Even a pointer to an integer member variable of a managed object can be moved by the garbage collector when it moves the entire instance of the object.

Consider this simple managed class, another variation on Sample used throughout this chapter:

 __gc class Sample
 {
 public:
     int a;
     int b;
     Sample(int x, int y) : a(x), b(y) {}
 };

You might argue about whether it's a good idea for the member variables a and b to be public, but it makes this code simpler to do so. Consider this simple function, perhaps one that was written in an earlier version of Visual C++:

 void Equalize(int* a, int* b)
 {
     int avg = (*a + *b)/2;
     *a = avg;
     *b = avg;
 }
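With ordinary unmanaged data, calling this function needs no special handling. Here is a small sketch in plain C++; PlainPair and EqualizedValue are hypothetical names added only to exercise Equalize().

```cpp
#include <cassert>

// The legacy function from the text, unchanged in behavior.
void Equalize(int* a, int* b)
{
    int avg = (*a + *b) / 2;   // integer division, so 2 and 5 average to 3
    *a = avg;
    *b = avg;
}

// Hypothetical unmanaged struct with two int members.
struct PlainPair
{
    int a;
    int b;
};

// Equalizes the two members of an unmanaged struct and returns the result.
int EqualizedValue(int x, int y)
{
    PlainPair p;
    p.a = x;
    p.b = y;
    Equalize(&p.a, &p.b);      // plain addresses: nothing can move them
    assert(p.a == p.b);
    return p.a;
}
```

The addresses of p.a and p.b stay put for as long as p exists, which is exactly the assumption legacy code like Equalize() makes.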

Say that you want to use this Equalize() function on the two member variables of an instance of this Sample class:

 Sample* s = new Sample(2,4);
 Equalize(&(s->a),&(s->b));

This code won't compile. The error is

 error C2664: 'Equalize' : cannot convert parameter 1 from 'int __gc *' to 'int *'
         Cannot convert a managed type to an unmanaged type

Although they are both pointers, the pointer to a garbage-collected type cannot be converted to a pointer to an unmanaged type. What you can do is pin the pointer. This creates another pointer you can pass to the function, and ensures that the garbage collector will not move the instance (in this case, s) for the life of the pinned pointer. Here's how it's done:

 int __pin* pa = &(s->a);
 int __pin* pb = &(s->b);
 Equalize(pa,pb);

The two new pointers, pa and pb, are pointers to unmanaged types, so they can be passed to Equalize(). They point to the location in memory where the member variables of s are kept, and the garbage collector will not move the instance, s, while it is pinned. The instance is unpinned when pa and pb go out of scope. You can unpin the instance sooner by deliberately setting pa and pb to 0, a null pointer, like this:

 pa=0;
 pb=0;

Boxing unmanaged data into temporary managed instances, and pinning managed data to obtain a pointer to unmanaged data, are both ways to deal with differences between the managed and the unmanaged world. When you remember that you have these extensions available to you, the task of mixing old and new programming techniques and libraries becomes much simpler.