Chapter 5: Constructors and the Object Lifecycle | C# Class Design Handbook: Coding Effective Classes

Overview

Developers new to C# who have experience in other object-oriented languages such as Java should not be surprised by the level of support for creating objects on the managed heap. Java developers are used to the idea that memory management is a function of the JVM, and that construction and destruction are managed processes abstracted from the developer. C++ developers, on the other hand, have had to manage memory themselves, which more often than not caused problems.

In C# we can use both managed and unmanaged code. This lets us decide whether to create an object in managed code and not worry about its lifetime (because the CLR will take care of it for us), or to create an object in unmanaged code and destroy it and reclaim the memory ourselves. In this chapter we will analyze all the ways of creating and destroying objects in C#; some are considered safe and others are considered unsafe. The hype surrounding the CLR is warranted in that the CLR's memory allocation and deallocation processes are far more efficient than we could ever hope to be ourselves.

In this chapter, we will look at the lifecycle of an object; specifically we will discuss the following:

How objects are created on the managed heap and how to define and use constructors
How objects are destroyed, including how to change the destruction pattern
How to write efficient object construction code
How to create deep and shallow copies of objects
How to use serialization and deserialization to construct and destroy objects
Finally, we'll analyze the various Design Patterns used in object construction

The concept of automatic memory management is fundamental to the CLR, which allocates memory on the heap for managed objects and uses the Garbage Collector (GC) to deallocate that memory during Managed Execution of code. By Managed Execution we mean the CLR's lifecycle management of an object (its creation and destruction), as opposed to our own management of heap memory. When writing managed code, the memory leaks that were common in C++ largely disappear, since we are no longer required to free memory ourselves.

When managed code is executed, a managed heap is created so that memory can be allocated for class instances (the managed heap is simply an area of memory managed by the CLR and GC). When creating objects in C++, we generally have the choice of two areas. The first is the stack: each thread has its own limited quantity of fast memory. The second is the heap, which is the more practical place to create most objects. Unlike the stack, the heap has no fixed size limit imposed by the OS; the amount available depends on the machine's hardware and operating environment, including the memory consumed by other processes. However, creating and destroying objects on the heap is slower than using memory on the stack.

The CLR optimizes object creation by keeping a pointer to the next available address on the managed heap. To allocate an object, the CLR places it at that address and then advances the pointer past the new object, so that it knows where to allocate next. This is much faster than allocating from an unmanaged heap, and is reputed to be nearly as fast as creating an object on the stack, because the CLR does not need to search for a free block of memory large enough to hold the object. A further advantage is that objects allocated one after another are stored contiguously, so the application can reach related objects faster than it could if they were scattered across non-contiguous areas of memory.
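The pointer-advancing idea can be illustrated with a small model. This is only a sketch under stated assumptions: the BumpAllocator class, its byte array "segment", and the -1 return value are all hypothetical and invented for illustration; they are not how the CLR is actually implemented.

```csharp
using System;

// Hypothetical model of pointer-advancing allocation; NOT the CLR's real allocator.
class BumpAllocator
{
    private readonly byte[] heap;   // stands in for one contiguous heap segment
    private int next;               // offset of the next free byte

    public BumpAllocator(int size)
    {
        heap = new byte[size];
        next = 0;
    }

    // Returns the offset of the new block, or -1 if the segment is full.
    // (At that point a real runtime would trigger a garbage collection.)
    public int Allocate(int size)
    {
        if (size < 0 || size > heap.Length - next)
        {
            return -1;
        }
        int address = next;
        next += size;               // no search for a free block: just advance the pointer
        return address;
    }
}

class BumpAllocatorDemo
{
    static void Main()
    {
        BumpAllocator allocator = new BumpAllocator(64);
        int first = allocator.Allocate(16);
        int second = allocator.Allocate(24);
        int third = allocator.Allocate(32);   // does not fit in the remaining 24 bytes

        Console.WriteLine(first);   // 0
        Console.WriteLine(second);  // 16
        Console.WriteLine(third);   // -1
    }
}
```

Each allocation costs one comparison and one addition, and consecutive blocks sit next to each other, which is the property the paragraph above describes.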

The GC acts as an object manager: it tracks object usage and frees memory by reclaiming the space of unused objects and marking it available for reuse. The GC also compacts the managed heap during collections so that space is used efficiently and objects remain stored contiguously. The runtime updates any references to objects that it moves, making the location of an object on the managed heap completely transparent to the developer. In C# and C++ .NET, this compaction can be a problem for unsafe code, which accesses memory directly through pointers into the heap. The runtime cannot update these unmanaged pointers, so when compaction occurs we could end up with a pointer to an incorrect memory location. To prevent this, we can fix the object's location so that the GC doesn't move it in memory; this process is called pinning. In the following example, the fixed statement pins the string that the char* variable pre points to for the duration of the block; the same technique applies to other unmanaged pointers. This file is called string.cs and has to be compiled with the /unsafe switch:

     using System;

     class preString
     {
       static unsafe string AddPreString(string s)
       {
         fixed(char* pre = "A")
         {
           return (s + *pre);
         }
       }

       [STAThread]
       static void Main()
       {
         Console.WriteLine(preString.AddPreString("PostString "));
       }
     }

The following output is shown when we run the program:

     C:\Class Design\Ch 05>string
     PostString A

Compaction occurs when the runtime finds many objects that are no longer referenced. These objects are not in use by the application and are considered unreachable by the GC. The GC is built around the idea of generations, which lets it reclaim memory on the managed heap efficiently. The .NET Framework documentation defines the following three high-level rules, which make dividing the managed heap into generations worthwhile:

Memory compaction is quicker when it works on a subset of the managed heap rather than the entire heap. This is possible because a collection can act on a single generation of objects (the term generation, as used by the GC, is explained in the following section).
Newer objects (those created at some point during execution rather than right at the beginning) tend to have shorter lifetimes, and older objects tend to have longer lifetimes.
Newer objects tend to be related to each other, so they are stored contiguously and are likely to be released together while still in the lowest generation.

These rules are the fundamental reason for the GC's use of generations. When we create an object, it is created in generation 0. Eventually there will be no space left in generation 0 to allocate further objects. At this stage, the GC attempts to reclaim memory in generation 0, which usually frees and compacts enough memory to allow the application to continue creating objects. Objects that survive the collection are promoted to generation 1, and later to generation 2 (these are just different areas of the managed heap). They are considered longer-lasting objects, so collection occurs on these generations less frequently. Despite the efficiency of the GC, it is possible (though unlikely) that it will be unable to free memory on the managed heap as fast as objects are allocated. In this case, the runtime will be unable to allocate memory for a new object and will throw an OutOfMemoryException.
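Promotion between generations can be observed with the System.GC class. The following is a minimal sketch; the class name GenerationDemo is an assumption made for illustration, and forcing collections with GC.Collect is done here only for demonstration, not as recommended practice. The exact generation reported after each collection depends on the runtime, although a surviving object typically moves from generation 0 to 1 and then to 2.

```csharp
using System;

class GenerationDemo
{
    static void Main()
    {
        object item = new object();

        // A freshly allocated small object normally starts in generation 0.
        Console.WriteLine("After allocation: generation " + GC.GetGeneration(item));

        // Force one collection per possible promotion and report where the object lives.
        for (int i = 1; i <= GC.MaxGeneration; i++)
        {
            GC.Collect();
            Console.WriteLine("After collection " + i + ": generation " + GC.GetGeneration(item));
        }

        // Keep the reference alive so the object survives every collection above.
        GC.KeepAlive(item);
    }
}
```

Because item is still referenced while the collections run, it survives each one and is eligible for promotion, which mirrors the behavior described above for longer-lasting objects.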

Although the process is managed and we have little direct control over it, there are things we can do to aid it. The real weaknesses lie with resources not managed by the GC, such as file handles and database connections. Later in this chapter, we shall look at the tools provided for writing efficient code that releases these kinds of unmanaged resources.
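As a brief preview, the sketch below shows one way a file handle can be released deterministically rather than left for the GC. It assumes the standard using statement and StreamWriter from System.IO; the class name DeterministicCleanupDemo and the scratch file name are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.IO;

class DeterministicCleanupDemo
{
    static void Main()
    {
        // Hypothetical scratch file, used only for this illustration.
        string path = Path.Combine(Path.GetTempPath(), "lifecycle-demo.txt");

        // The using statement calls Dispose() when the block exits, even if an
        // exception is thrown, so the file handle is released immediately
        // rather than whenever the GC finalizes the writer.
        using (StreamWriter writer = new StreamWriter(path))
        {
            writer.WriteLine("written while the handle is open");
        }

        // The handle has been released, so the file can be reopened and deleted.
        Console.WriteLine(File.ReadAllText(path).Trim());
        File.Delete(path);
    }
}
```

The point of the sketch is the timing: the handle is closed at the end of the using block, not at some later collection.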