Chapter 1 - C# and .NET Architecture
by Simon Robinson et al.
Wrox Press 2002
From what we learned in the previous section, Intermediate Language obviously plays a fundamental role in the .NET Framework. As C# developers, we now understand that our C# code will be compiled into Intermediate Language before it is executed (indeed, the C# compiler only compiles to managed code). It makes sense, then, that we should now take a closer look at the main characteristics of IL, since any language that targets .NET would logically need to support the main characteristics of IL too.
Here are the important features of the Intermediate Language:
Object-orientation and use of interfaces
Strong distinction between value and reference types
Strong data typing
Error handling through the use of exceptions
Use of attributes
Let's now take a closer look at each of these characteristics.
The language independence of .NET does have some practical limits. In particular, IL, however it is designed, is inevitably going to implement some particular programming methodology, which means that languages targeting it are going to have to be compatible with that methodology. The particular route that Microsoft has
Those readers unfamiliar with the concepts of Object Orientation should refer to Appendix A for more information.
Besides classic object-oriented programming, Intermediate Language also
We have now seen that working with .NET means compiling to the Intermediate Language, and that in
To start with, we need to consider exactly what we mean by language interoperability. After all, COM allowed components written in different languages to work together in the sense of calling each other's methods. What was inadequate about that? COM, by virtue of being a binary standard, did allow components to instantiate other components and call methods or properties against them, without worrying about the language the respective
An associated problem was that, when debugging, you would still have to independently debug components written in different languages. It was not possible to step between languages in the debugger. So what we really mean by language interoperability is that classes written in one language should be able to talk directly to classes written in another language. In particular:
A class written in one language can inherit from a class written in another language
The class can contain an instance of another class, no matter what the languages of the two classes are
An object can directly call methods against another object written in another language
Objects (or references to objects) can be passed around between methods
When calling methods between languages, we can step between the method calls in the debugger, even where this means stepping between source code written in different languages
This is all quite an ambitious aim, but amazingly, .NET and the Intermediate Language have achieved it. For the case of stepping between methods in the debugger, this facility is really
As with any programming language, IL provides a number of predefined primitive data types. One characteristic of Intermediate Language however, is that it makes a strong distinction between value and reference types. Value types are those for which a variable directly stores its data, while reference types are those for which a variable simply stores the address at which the corresponding data can be found.
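The distinction can be seen directly from C#. The following is a minimal sketch, assuming two hypothetical types of our own invention (PointValue and PointReference are not part of .NET): copying a struct variable copies the data, while copying a class variable copies only the reference.

```csharp
using System;

// Hypothetical types for illustration only.
// A struct is a value type: a variable holds the data itself.
public struct PointValue
{
    public int X;
}

// A class is a reference type: a variable holds a reference to data on the heap.
public class PointReference
{
    public int X;
}

public static class ValueVersusReference
{
    // Copies a value-type variable, changes the copy, and returns the original's X.
    public static int OriginalAfterValueCopy()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;   // b receives its own copy of the data
        b.X = 99;
        return a.X;         // a is unaffected by the change made to b
    }

    // Copies a reference-type variable, changes the object through the copy,
    // and returns the original's X.
    public static int OriginalAfterReferenceCopy()
    {
        PointReference a = new PointReference();
        a.X = 1;
        PointReference b = a;   // b refers to the same object as a
        b.X = 99;
        return a.X;             // the change is visible through a
    }
}
```

Here OriginalAfterValueCopy() returns 1 and OriginalAfterReferenceCopy() returns 99, which is exactly the difference between storing data directly and storing an address.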
In C++ terms, reference types can be
One very important aspect of IL is that it is based on exceptionally strong data typing. What we mean by that is that all
For instance, VB developers will be used to being able to pass variables around without worrying too much about their types, because VB automatically
You should note that some languages compatible with .NET, such as VB.NET, still allow some laxity in typing, but that is only possible because the compilers behind the scenes ensure the type safety is enforced in the
Although enforcing type safety might initially appear to hurt performance, in many cases this is far outweighed by the benefits
Language Interoperability
Garbage Collection
Security
Application Domains
Let's take a closer look at why strong data typing is particularly important for these features of .NET.
One reason that strong data typing matters is that, if a class is to derive from or contain instances of other classes, it needs to know about all the data types used by the other classes. Indeed, it is the absence of any agreed system for specifying this information in the past that has always been the real
Suppose that one of the methods of a VB.NET class is defined to return an Integer - one of the standard data types available in VB.NET. C# simply does not have any data type of that
This data type problem is
For the example that we were considering before, VB.NET's Integer is actually a 32-bit signed integer, which maps exactly to the IL type known as Int32. This will therefore be the data type specified in the IL code. Because the C# compiler is aware of this type, there is no problem. At source code level, C# refers to Int32 with the keyword int, so the compiler will simply treat the VB.NET method as if it returned an int.
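We can check this mapping from the C# side. The sketch below is illustrative only (the class name TypeMapping is our own); it relies on nothing more than typeof and sizeof to confirm that the keyword int and the type System.Int32 are one and the same.

```csharp
using System;

public static class TypeMapping
{
    // True if the C# keyword int and System.Int32 denote the same type.
    public static bool IntIsInt32()
    {
        return typeof(int) == typeof(Int32);
    }

    // The name under which the runtime knows the type behind the keyword int.
    public static string RuntimeNameOfInt()
    {
        return typeof(int).FullName;
    }

    // Size of an int in bits; sizeof(int) is a compile-time constant in C#.
    public static int SizeOfIntInBits()
    {
        return sizeof(int) * 8;
    }
}
```

RuntimeNameOfInt() returns "System.Int32" and SizeOfIntInBits() returns 32, so a VB.NET method declared as returning Integer and a C# method declared as returning int expose the same type to any caller.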
The CTS doesn't merely specify primitive data types, but a rich hierarchy of types, which includes
The types in this tree represent:
| Type | Meaning |
|---|---|
| Type | Base class that represents any type. |
| Value Type | Base class that represents any value type. |
| Reference Types | Any data types that are accessed through a reference and stored on the heap. |
| Built-in Value Types | Includes most of the standard primitive types, which represent |
| Enumerations | Sets of enumerated values. |
| User-defined Value Types | Types that have been defined in source code and are stored as value types. In C# terms, this means any struct. |
| Interface Types | Interfaces. |
| Pointer Types | Pointers. |
| Self-describing Types | Data types that provide information about |
| Arrays | Any type that contains an array of objects. |
| Class Types | Types that are self-describing but are not arrays. |
| Delegates | Types that are designed to hold references to methods. |
| User-defined Reference Types | Types that have been defined in source code and are stored as reference types. In C# terms, this means any class. |
| Boxed Value Types | A value type that is temporarily wrapped in a reference so that it can be stored on the heap. |
We won't list all of the built-in value types here, because they are covered in detail in Chapter 2. In C#, each predefined type recognized by the compiler maps onto one of the IL built-in types. The same is true in VB.NET.
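Several of the categories in the table can be observed from C# using reflection. The following is a rough sketch rather than a complete tour of the CTS; the types Shade, Money, IShape, and BinaryOp are hypothetical and exist only for this illustration. The last method shows a value type being boxed and then unboxed.

```csharp
using System;

// Hypothetical user-defined types, one per category we want to inspect.
public enum Shade { Light, Dark }                 // enumeration
public struct Money { public decimal Amount; }    // user-defined value type
public interface IShape { }                       // interface type
public delegate int BinaryOp(int a, int b);       // delegate

public static class CtsSketch
{
    public static bool BuiltInIsValueType() { return typeof(int).IsValueType; }

    public static bool EnumIsValueType() { return typeof(Shade).IsEnum && typeof(Shade).IsValueType; }

    public static bool StructIsValueType() { return typeof(Money).IsValueType; }

    public static bool InterfaceIsInterface() { return typeof(IShape).IsInterface; }

    public static bool ArrayIsReferenceType() { return typeof(int[]).IsArray && !typeof(int[]).IsValueType; }

    public static bool DelegateDerivesFromDelegate() { return typeof(BinaryOp).IsSubclassOf(typeof(Delegate)); }

    // Boxing wraps the value in an object on the heap; unboxing copies it back out.
    public static int BoxAndUnbox(int value)
    {
        object boxed = value;          // boxing
        if (!(boxed is int))
        {
            throw new InvalidOperationException("A boxed int should still report its type as int.");
        }
        return (int)boxed;             // unboxing
    }
}
```

Each of the Boolean methods returns true, and BoxAndUnbox(5) returns 5: the boxed object still knows that it contains an Int32.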
The Common Language Specification works with the Common Type System to ensure language interoperability. The CLS is a set of minimum standards that all compilers targeting .NET must support. Since IL is a very rich language, writers of most compilers will prefer to restrict the capabilities of a given compiler to only support a subset of the facilities offered by IL and the CTS. That is fine, as long as the compiler supports everything that is defined in the CLS.
| Important |
It is
|
An example is provided by case sensitivity. IL is case-sensitive. Developers who work with case-sensitive languages regularly take advantage of the flexibility this case sensitivity gives them when selecting variable names. VB.NET, however, is not case sensitive. The CLS works around this by indicating that CLS-compliant code should not expose any two
This example shows that the CLS works in two ways. First, it means that individual compilers do not have to be powerful enough to support the full features of .NET - this should
The beauty of this idea is that the restriction to using CLS-compliant features only applies to public and protected
We won't go into the details of the CLS specifications here. In general, the CLS won't affect your C# code very much, because there are very few non-CLS-compliant features of C# anyway.
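For the curious, here is a small sketch of how CLS compliance surfaces in C#. It assumes a hypothetical Counter class of our own; the only framework feature used is System.CLSCompliantAttribute, which asks the compiler to check the publicly visible parts of the assembly. Unsigned integer types such as uint are among the few C# features that fall outside the CLS.

```csharp
using System;

// Ask the C# compiler to check this assembly's public surface against the CLS.
[assembly: CLSCompliant(true)]

public class Counter
{
    // Private members are outside the scope of the CLS, so a name that differs
    // from a public member only by case, and an unsigned type, are both fine here.
    private uint total;

    // CLS-compliant public member: long is part of the CLS.
    public long Total
    {
        get { return total; }
    }

    public void Add(int amount)
    {
        if (amount < 0)
        {
            throw new ArgumentOutOfRangeException("amount");
        }
        total += (uint)amount;
    }

    // A public member that exposes uint is not CLS-compliant, so we say so explicitly;
    // without this marker the compiler would issue a CLS-compliance warning.
    [CLSCompliant(false)]
    public uint RawTotal
    {
        get { return total; }
    }
}
```

A VB.NET client could use Total and Add() without difficulty, while RawTotal is flagged as something that other languages are not guaranteed to be able to consume.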
The garbage collector is .NET's answer to memory management, and in particular to the question of what to do about reclaiming memory that running applications ask for. Up until now, there have been two techniques used on the Windows platform for deallocating memory that processes have dynamically
Make the application code do it all manually
Make objects maintain reference counts
Having the application code responsible for de-allocating memory is the technique used by lower-level, high-performance languages such as C++. It is efficient, and it has the advantage that (in general) resources are never occupied for longer than necessary. The big
Although modern developer environments do provide tools to assist in detecting memory leaks, they
Maintaining reference counts is favored in COM. The idea is that each COM component maintains a count of how many clients are currently maintaining references to it. When this count
The .NET runtime relies on the garbage collector instead. This is a program whose purpose is to clean up memory. The idea is that all dynamically requested memory is allocated on the heap (that is true for all languages, although in the case of .NET, the CLR maintains its own managed heap for .NET applications to use). Every so often, when .NET detects that the managed heap for a given process is becoming full and therefore needs tidying up, it calls the garbage collector. The garbage collector runs through variables currently in scope in your code, examining references to objects stored on the heap to identify which ones are accessible from your code - that is to say which objects have references that refer to them. Any objects that are not referred to are deemed to be no longer accessible from your code and can therefore be removed. Java uses a similar system of garbage collection to this.
Garbage collection works in .NET because Intermediate Language has been designed to facilitate the process. The principle requires, firstly, that you cannot get references to existing objects other than by copying existing references, and secondly, that Intermediate Language is type safe. In this context, what we mean is that if any reference to an object exists, then there is sufficient information in the reference to exactly determine the type of the object.
It would not be possible to use the garbage collection mechanism with a language such as unmanaged C++, for example, because C++ allows pointers to be
One important aspect of garbage collection is that it is not deterministic. In other words, you cannot guarantee when the garbage collector will be called; it will be called when the CLR decides that it is needed (unless you explicitly call the collector).
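To make the idea of reachability concrete, the following sketch explicitly calls the collector with GC.Collect() and uses a WeakReference to observe an object without keeping it alive. The class and method names are our own. Only the first method has a guaranteed result; the second depends on decisions made by the runtime, which is precisely the non-determinism described above.

```csharp
using System;

public static class GcSketch
{
    // While a strong reference is still in use, even a forced collection
    // must leave the object alone.
    public static bool SurvivesWhileReferenced()
    {
        object strong = new object();
        WeakReference weak = new WeakReference(strong);

        GC.Collect();                    // explicitly request a collection
        GC.WaitForPendingFinalizers();

        bool alive = weak.IsAlive;
        GC.KeepAlive(strong);            // 'strong' counts as reachable up to this point
        return alive;
    }

    // Once no strong reference remains, the object becomes eligible for collection.
    // Whether it has actually been reclaimed when we look is up to the runtime,
    // so callers should not depend on the value returned here.
    public static bool StillAliveAfterRelease()
    {
        WeakReference weak = CreateUnreferenced();
        GC.Collect();
        return weak.IsAlive;
    }

    // Creates an object that only a weak reference refers to.
    private static WeakReference CreateUnreferenced()
    {
        return new WeakReference(new object());
    }
}
```

SurvivesWhileReferenced() returns true because the object is still accessible from the code. In ordinary application code there is rarely a good reason to call GC.Collect() yourself; it appears here only to make the behavior observable.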
.NET can really excel in terms of complementing the security mechanisms provided by Windows because it can offer
Role-based security is based on the identity of the account under which the process is running, in other words, who owns and is running the process. Code-based security on the other hand is based on what the code actually does and on how much the code is trusted. Thanks to the strong type safety of IL, the CLR is able to inspect code before running it in order to determine required security permissions. .NET also offers a mechanism by which code can
The importance of code-based security is that it
Security issues are covered in more depth later in the book, in Chapter 23.
Application domains are an important innovation in .NET and are designed to ease the overhead involved when running applications that need to be isolated from each other, but which also need to be able to communicate with each other. The classic example of this is a web server application, which may be
In pre-.NET days, the choice would be between allowing those instances to share a process, with the resultant risk of a problem in one running instance bringing the whole web site down, or isolating those instances in separate processes, with the associated performance overhead.
Up until now, the only means of isolating code has been through processes. When you start a new application running, it runs within the context of a process. Windows isolates processes from each other through address spaces. The idea is that each process has available 4 gigabytes of virtual memory in which to store its data and executable code (the figure of 4GB is for 32-bit systems; 64-bit systems will have more). Windows imposes an extra level of indirection by which this virtual memory maps into a particular area of actual physical memory or disk space. Each process will get a different mapping, with no overlap between the actual physical memories that the blocks of virtual address space map to. This situation is shown in the diagram:
In general, any process is only able to access memory by specifying an address in virtual memory - processes do not have direct access to physical memory. Hence it is simply
Processes don't just serve as a way to isolate instances of running code from each other. On Windows NT/2000 systems, they also form the unit to which security privileges and permissions are assigned. Each process has its own security token, which indicates to Windows precisely what operations that process is permitted to do.
While processes are great for security in both of these senses, their big disadvantage is performance. Often a number of processes will actually be working together, and therefore need to communicate with each other. The obvious example of this is where a process calls up a COM component, which is an executable, and therefore is required to run in its own process. The same thing happens in COM when surrogates are used. Since processes cannot share any memory, a complex marshaling process has to be used to copy data between the processes. This gives a very significant hit for performance. If you need components to work together and don't want that performance hit, then the only way up till now has been to use DLL-based components and have everything running in the same address space, with the associated risk that a badly behaved component will bring everything else down.
Application domains are designed as a way of separating components without resulting in the performance problems associated with passing data between processes. The idea is that any one process is divided into a number of application domains. Each application domain
If different executables are running in the same process space, then they are clearly able to easily share data, because theoretically they can directly see each other's data. However, although this is possible in principle, the CLR makes sure that this does not happen in practice by inspecting the code for each running application, to ensure that the code cannot stray outside its own data areas. This sounds at first sight like an almost impossible trick to pull off - after all how can you tell what the program is going to do without actually running it?
In fact, it is usually possible to do this because of the strong type safety of the IL. In most cases, unless code is explicitly using unsafe features such as pointers, the data types it is using will ensure that memory is not accessed inappropriately. For example, .NET array types perform bounds checking to ensure that no out of bounds array operations are permitted. If a running application
Code that has been
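The array bounds checking mentioned above is easy to observe. In this minimal sketch (the class name is our own), an attempt to write outside a three-element array is stopped by the runtime, which throws an IndexOutOfRangeException rather than letting the code touch memory that does not belong to the array.

```csharp
using System;

public static class BoundsCheckSketch
{
    // Returns true if the runtime refuses the write because the index is out of bounds.
    public static bool WriteIsRejected(int index)
    {
        int[] data = new int[3];   // valid indexes are 0, 1, and 2
        try
        {
            data[index] = 1;
            return false;          // the write was allowed
        }
        catch (IndexOutOfRangeException)
        {
            return true;           // the runtime blocked the out-of-bounds access
        }
    }
}
```

WriteIsRejected(2) returns false, while WriteIsRejected(3) and WriteIsRejected(-1) both return true.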
.NET is designed to facilitate handling of error conditions using the same mechanism, based on exceptions, that is employed by Java and C++. C++ developers should note, however, that because of IL's stronger typing system, there is no performance penalty associated with the use of exceptions with IL in the way that there is in C++. Also, the finally block, which has long been on many C++ developers' wish list, is supported by .NET and by C#.
We will cover exceptions in detail in Chapter 4. Briefly, the idea is that certain areas of code are designated as exception handler routines, with each one able to deal with a particular error condition (for example, a file not being found, or being
The architecture of exception handling also provides a
Most exception handling architecture, including the control of program flow when an exception occurs, is handled by the high-level languages (C#, VB.NET, C++), and is not supported by any special IL commands. C#, for example, handles exceptions using try{}, catch{}, and finally{} blocks of code, as we'll see later in Chapter 4.
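As a brief preview, the sketch below records the order in which the three kinds of block execute. The class name and the error message are hypothetical; FileNotFoundException is one of the exception classes that .NET supplies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ExceptionFlowSketch
{
    // Returns the names of the blocks that ran, in order.
    public static List<string> Run(bool fail)
    {
        List<string> log = new List<string>();
        try
        {
            log.Add("try");
            if (fail)
            {
                // Simulate an error condition; no real file is involved.
                throw new FileNotFoundException("hypothetical missing file");
            }
            log.Add("completed");
        }
        catch (FileNotFoundException)
        {
            // Runs only if the matching exception was thrown.
            log.Add("catch");
        }
        finally
        {
            // Runs whether or not an exception was thrown.
            log.Add("finally");
        }
        return log;
    }
}
```

Run(true) records try, catch, finally, while Run(false) records try, completed, finally. In both cases the finally block executes, which makes it the natural place for clean-up code.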
What .NET does do, however, is provide the infrastructure to allow compilers that target .NET to support exception handling. In particular, it provides a set of .NET classes that can represent the exceptions, and the language interoperability to allow the thrown exception objects to be interpreted by the exception handling code, irrespective of what language the exception handling code is written in. This language independence is absent from both the C++ and Java implementations of exception handling, although it is present to a limited extent in the COM mechanism for handling errors, which involves returning error codes from methods and passing error objects around. The fact that exceptions are handled consistently in different languages is a crucial aspect of facilitating
Attributes are a feature that will be familiar to developers who use C++ to write COM components (through their use in Microsoft's COM Interface Definition Language (IDL)) although they will not be familiar to Visual Basic or Java developers. The initial idea of an attribute was that it provided extra information concerning some item in the program that could be used by the compiler.
Attributes are supported in .NET - and hence now by C++, C#, and VB.NET. What is, however, particularly innovative about attributes in .NET is that a mechanism exists whereby you can define your own attributes in your source code. These user-defined attributes will be placed with the metadata for the corresponding data types or methods. This can be useful for documentation purposes, where they can be used in conjunction with reflection technology (described) in order to perform programming
Attributes are covered in Chapter 5 of this book.
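To give a flavor of what a user-defined attribute looks like, here is a minimal sketch. LastModifiedAttribute, Invoice, and the date shown are all invented for this illustration; the framework pieces used are the Attribute base class, AttributeUsageAttribute, and the reflection method GetCustomAttributes().

```csharp
using System;

// A hypothetical user-defined attribute that records when an item was last changed.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class LastModifiedAttribute : Attribute
{
    private readonly string date;

    public LastModifiedAttribute(string date)
    {
        this.date = date;
    }

    public string Date
    {
        get { return date; }
    }
}

// The attribute is stored in the metadata for the Invoice type.
[LastModified("2002-01-15")]
public class Invoice
{
}

public static class AttributeSketch
{
    // Uses reflection to read the attribute back; returns null if it is absent.
    public static string LastModifiedOf(Type type)
    {
        object[] found = type.GetCustomAttributes(typeof(LastModifiedAttribute), false);
        if (found.Length == 0)
        {
            return null;
        }
        return ((LastModifiedAttribute)found[0]).Date;
    }
}
```

AttributeSketch.LastModifiedOf(typeof(Invoice)) returns "2002-01-15", and the same call on a type that does not carry the attribute, such as typeof(string), returns null.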
