ADO.NET in .NET


In this section, you'll drill down to look at how ADO.NET is architected to realize the design goals discussed in the previous section. Let's begin with a review of the execution environment in which ADO.NET runs and then move on to the namespaces, classes, and interfaces that make up ADO.NET in the .NET Framework.

Managed Code Review

Although the ins and outs of managed code and the Common Language Runtime are beyond the scope of this book, it's important to have a foundational understanding of how the code you write for ADO.NET is actually executed.

Note

For an in-depth look at the common language runtime, see Essential .NET Volume I by Don Box, published by Addison-Wesley. Chapter 1 of my Sams book Building Distributed Applications with Visual Basic .NET also provides a more in-depth look at the common language runtime from a Visual Basic perspective.

To begin, all code you write to work with ADO.NET, whether in VB .NET, VC# .NET, or any of the .NET languages, will be executed by the common language runtime and is thus referred to as managed code. The common language runtime includes a host of runtime features including a class loader, thread support, exception manager, security engine, garbage collector, code manager, and type checker. In turn, all managed code is first compiled to a machine-independent intermediate language called Microsoft Intermediate Language (MSIL), and subsequently compiled to native instructions for execution in a just-in-time (JIT) manner as the common language runtime's class loader loads code at runtime. This process is outlined in Figure 1.2.

Figure 1.2. The managed code execution environment. This diagram depicts how managed code is compiled and executed by the common language runtime.


Note

Yes, it is also possible to pre-JIT code at install time. This can be done using a command-line utility that stores the resulting native code (executable or DLL) in a code cache on the machine. When the assembly is to be executed by the common language runtime, it locates the native code and runs it directly instead of using JIT compilation.

When your code is compiled to MSIL, it is stored in a PE (portable executable) file called a module. The module contains the MSIL instructions in addition to metadata that describes the types (classes) in the code you've written, along with the dependencies on other types. The metadata is roughly equivalent to a COM type library. This metadata is heavily relied upon by the common language runtime and other tools in VS .NET to make sure that the appropriate code is loaded and to assist in enabling features such as IntelliSense and debugging. A module can then be incorporated in, or exist independently as, an assembly. An assembly is the fundamental unit of packaging, deployment, security, and versioning in .NET and contains a manifest (embedded in one of the modules or in its own PE file) that describes the version, an optional public key (part of what is called a strong name) used for uniquely identifying this assembly from all others, and a list of dependent assemblies and files.

As you'll note from Figure 1.2, ADO.NET is itself an assembly (written in VC# .NET) called System.Data.dll that is a part of, and is installed with, the .NET Framework in the windowsdir\Microsoft.NET\Framework\framework_version directory. As a result, the manifest of your assemblies will reference the ADO.NET assembly and so, at runtime, the common language runtime will be able to make sure that ADO.NET is loaded and JIT compiled.
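The identity information stored in a manifest can be inspected from code through reflection. The following is a minimal C# sketch rather than anything prescribed by the framework: the class and method names (AssemblyInfoDemo, Describe) are made up for illustration, and no output is shown because the assembly name, version, and key token differ between framework versions.

```csharp
using System;
using System.Data;
using System.Reflection;

public static class AssemblyInfoDemo
{
    // Returns a one-line description of the assembly that defines the given type.
    public static string Describe(Type type)
    {
        AssemblyName name = type.Assembly.GetName();
        byte[] token = name.GetPublicKeyToken();

        // Assemblies without a strong name have no public key token.
        string tokenText = (token == null || token.Length == 0)
            ? "(none)"
            : BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();

        return name.Name + ", Version=" + name.Version + ", PublicKeyToken=" + tokenText;
    }

    public static void Main()
    {
        // The assembly that supplies the DataSet type.
        Console.WriteLine(Describe(typeof(DataSet)));

        // The dependent assemblies recorded in this program's own manifest.
        foreach (AssemblyName reference in typeof(AssemblyInfoDemo).Assembly.GetReferencedAssemblies())
            Console.WriteLine("references: " + reference.Name);
    }
}
```

Running the sketch lists the assembly that supplies DataSet followed by the assemblies your own program depends on, which is the same information the common language runtime consults when it loads code.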

Note

By running the .NET Framework Configuration Manager from the Administrative Tools group, you can view what is called the Global Assembly Cache (GAC). Simply put, the GAC is a machinewide store for assemblies that have been given a strong name. Putting an assembly in the GAC makes it easy for the common language runtime's class loader to find it at runtime. Not surprisingly, the ADO.NET assembly is placed in the GAC when you install VS .NET. You'll also notice that the Configuration Manager depicts assemblies with two different icons. The lion's share of the assemblies in the GAC is placed there after simply being compiled to MSIL. The core assemblies that are used in almost every .NET application, such as mscorlib, System, and System.Xml, however, have been pre-JITted to native code for better performance.

The last, and perhaps most important, point to note about the managed code environment of the common language runtime is the existence of the Common Type System (CTS). In the CTS, all data types (simply referred to as types) are ultimately derived from a base object called Object (System.Object) and found in the assembly mscorlib.dll. The CTS is key to understanding .NET because it governs how types are represented and dealt with by the common language runtime.
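One way to convince yourself that every type leads back to System.Object is to walk a type's chain of base types. The sketch below is only an illustration; TypeChainDemo and BaseChain are hypothetical names, and it uses the generic List class, which assumes .NET 2.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TypeChainDemo
{
    // Walks the BaseType chain from the given type up to the root of the hierarchy.
    public static List<Type> BaseChain(Type type)
    {
        List<Type> chain = new List<Type>();
        for (Type current = type; current != null; current = current.BaseType)
            chain.Add(current);
        return chain;
    }

    public static void Main()
    {
        Type[] starts = new Type[] { typeof(int), typeof(string), typeof(DataSet) };
        foreach (Type start in starts)
        {
            List<string> names = new List<string>();
            foreach (Type t in BaseChain(start))
                names.Add(t.FullName);

            // Each line ends with System.Object.
            Console.WriteLine(string.Join(" -> ", names.ToArray()));
        }
    }
}
```

For int the chain passes through System.ValueType before reaching System.Object, whereas string derives from System.Object directly.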

For example, Figure 1.3 shows that all types are classified as value types or reference types. As the name implies, value types are typically passed by value in applications and are used to represent simple data types such as integers, Booleans, and characters. Value types are typically allocated on the stack (or inline within the object that contains them) and therefore are very lightweight. Reference types are allocated on the heap, are addressed by their memory location, and are used to represent classes, interfaces, pointers, strings, and delegates (which you can think of as object-oriented function pointers). Therefore, the high-level objects found in ADO.NET such as the DataSet are reference types. This distinction is important because reference types are automatically garbage collected by the common language runtime, although they can also expose a Dispose or Close method so that the resources they hold can be released at a time of your choosing.
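The practical difference shows up as soon as a value is passed to a method. Here is a short C# sketch, with a made-up Point structure and hypothetical method names, that contrasts a value type with the DataSet reference type.

```csharp
using System;
using System.Data;

public static class ValueVsReferenceDemo
{
    // A value type: assignment and parameter passing copy the whole value.
    public struct Point
    {
        public int X;
        public int Y;
    }

    // Receives a copy of the struct, so the caller's value is unaffected.
    public static void MoveRight(Point p)
    {
        p.X += 10;
    }

    // Receives a copy of the reference, so the caller's DataSet is the one modified.
    public static void Rename(DataSet ds)
    {
        ds.DataSetName = "Renamed";
    }

    public static void Main()
    {
        Point p = new Point();
        p.X = 1;
        MoveRight(p);
        Console.WriteLine(p.X);            // 1

        DataSet ds = new DataSet("Original");
        Rename(ds);
        Console.WriteLine(ds.DataSetName); // Renamed

        // A reference type that exposes Dispose can be cleaned up at a known point.
        using (DataSet scratch = new DataSet("Scratch"))
        {
            Console.WriteLine(scratch.DataSetName); // Scratch
        }
    }
}
```

The structure comes back unchanged because the method worked on a copy, while the DataSet is renamed because both variables refer to the same object on the heap.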

Figure 1.3. The Common Type System. This diagram shows how the CTS is organized. All types are derived from `System.Object`.


The CTS is what makes cross-language development in .NET a reality. By using the same underlying representation of types as managed by the common language runtime, languages can freely use types created in other .NET languages without having to perform any translation or coercion. This also means that an assembly written in one language can even inherit from a type written in another language. This is fundamental to ADO.NET because the ADO.NET classes were written in VC# .NET, but can be used, for example, as base classes for code written in VB .NET.

The `System.Data` Namespace

Assemblies in .NET contain classes, interfaces, and enumerated types arranged hierarchically in namespaces. Namespaces can cross assembly boundaries and can themselves contain other namespaces. They are simply a convenient way to arrange code and can be navigated using the familiar dot notation. Within the ADO.NET assembly, the primary namespace is, not surprisingly, System.Data. Within System.Data are four namespaces that implement specific ADO.NET features as shown in Table 1.1.

Table 1.1. The `System.Data` namespaces. These namespaces comprise the functionality of ADO.NET.

Namespace	Description
`System.Data`	Contains the heart of the ADO.NET architecture, including more than 45 classes and more than 20 enumerations that comprise the `DataSet` and a dozen or more interfaces that are implemented by .NET Data Providers
`System.Data.Common`	Contains about a dozen classes shared by .NET Data Providers such as the OleDb and SqlClient providers
`System.Data.OleDb`	Contains approximately 20 classes and a few enumerations that make up the OLE DB .NET Data Provider
`System.Data.SqlClient`	Contains approximately 20 classes and a few enumerations that make up the SQL Server .NET Data Provider
`System.Data.SqlTypes`	Contains more than a dozen structures that map to data types exposed by SQL Server in addition to a couple of enumerations and classes used to perform comparisons and handle exceptions

In addition, the ADO.NET assembly contains one class from the System.Xml namespace, most of which is defined in the System.Xml.dll assembly. This class, XmlDataDocument, is used to bridge the gap between relational and XML data. We'll discuss this on Day 7.

As you can see from Table 1.1, the first two namespaces contain types that are for general use, whereas the last three implement features particular to accessing data through an OLE DB provider or to accessing SQL Server. This highlights the fundamental division of ADO.NET into the DataSet and .NET Data Providers, as shown in Figure 1.4.

Figure 1.4. ADO.NET architecture. This diagram depicts the architecture of ADO.NET and its fundamental division between the `DataSet` and the .NET Data Providers.


As mentioned previously, the DataSet implements the in-memory cache for disconnected data and as a result is not dependent on any particular data source. The classes exposed by the .NET Data Providers, particularly the DataAdapter, are used to populate the DataSet. The DataSet can then be serialized and passed between tiers of a distributed application using the facilities of the common language runtime. In addition, it can load data from multiple data adapters and represent the data hierarchically through a set of relationships defined by its XSD schema. Finally, the DataSet tracks the changes made to its data, so it can be passed back to the .NET Data Provider in order to update the underlying data store.
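Because the DataSet does not depend on a data source, its behavior can be tried out without a database at all. The following C# sketch builds one by hand instead of filling it from a data adapter; the table, column, and relation names (Customers, Orders, CustomerOrders) and the sample rows are invented for illustration.

```csharp
using System;
using System.Data;

public static class DataSetDemo
{
    // Builds a small DataSet by hand; in a real application a data adapter
    // from a .NET Data Provider would normally fill these tables.
    public static DataSet BuildSample()
    {
        DataSet ds = new DataSet("Sales");

        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("CustomerId", typeof(int));
        customers.Columns.Add("Name", typeof(string));

        DataTable orders = ds.Tables.Add("Orders");
        orders.Columns.Add("OrderId", typeof(int));
        orders.Columns.Add("CustomerId", typeof(int));

        // The relation lets the data be navigated hierarchically.
        ds.Relations.Add("CustomerOrders",
            customers.Columns["CustomerId"], orders.Columns["CustomerId"]);

        customers.Rows.Add(new object[] { 1, "Contoso" });
        orders.Rows.Add(new object[] { 100, 1 });
        orders.Rows.Add(new object[] { 101, 1 });

        // Mark the current contents as the unchanged baseline.
        ds.AcceptChanges();
        return ds;
    }

    public static void Main()
    {
        DataSet ds = BuildSample();
        DataRow customer = ds.Tables["Customers"].Rows[0];

        // Navigate from the parent row to its children through the relation.
        Console.WriteLine(customer.GetChildRows("CustomerOrders").Length);   // 2

        // Edits are tracked per row, along with the original values.
        customer["Name"] = "Contoso Ltd.";
        Console.WriteLine(customer.RowState);                                // Modified
        Console.WriteLine(customer["Name", DataRowVersion.Original]);        // Contoso

        // GetChanges returns a DataSet that holds only the changed rows.
        DataSet changes = ds.GetChanges();
        Console.WriteLine(changes.Tables["Customers"].Rows.Count);           // 1

        // The contents can also be written out as XML.
        Console.WriteLine(ds.GetXml());
    }
}
```

The smaller DataSet returned by GetChanges is the kind of object you would typically hand back to a provider for updating, since it carries only the rows that differ from the baseline.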

Table 1.1 also shows that ADO.NET ships with two .NET Data Providers: System.Data.OleDb and System.Data.SqlClient. The role of the providers is to implement classes that use the interfaces and classes in System.Data and System.Data.Common to expose the functionality of a particular data store. In other words, the .NET Data Providers are analogous to OLE DB providers and ODBC drivers, with the exception that they expose functionality at the programmatic layer rather than simply as an abstraction. This means that vendors writing .NET Data Providers can expose custom functionality to developers directly as additional classes or methods.

At a functional level, as shown in Figure 1.4, providers will expose functionality based on the interfaces and classes in System.Data and System.Data.Common to connect to the data store, initiate transactions, communicate with a DataSet, handle exceptions, execute commands, handle parameters, and stream through data in a forward-only, read-only fashion. All the interfaces and classes shown in Figure 1.4, with the exception of CommandBuilder, Exception, and Error, are implemented or inherited by an actual provider; together they form the programming model for implementing a provider. By convention, the provider also exposes the CommandBuilder class to automatically populate a data adapter with commands used to select, insert, update, and delete data from the data source, and Exception and Error classes to handle errors returned from the data source. Finally, access to providers can be controlled through the use of code-access security implemented as permission objects. You'll explore each of these functions in depth during Week 2.

Note

In addition to the two providers that ship with ADO.NET, there is also an ODBC .NET Provider available for download from msdn.microsoft.com.

Interface-Based Programming

As you can see from Figure 1.4, ADO.NET makes use of interfaces (those identifiers prefixed with an I, such as IDbDataAdapter, IDbConnection, and IDataReader) to provide the template or contract between a class that implements the interface and the client. By implementing an interface in a class, you ensure that the methods, properties, and events that your class exposes follow a particular semantically related pattern. You can also implement several interfaces in the same class to support different sets of functionality. This is called interface inheritance. In turn, following a predefined pattern allows client code to be written that can work with any class that implements a particular interface. This is referred to as polymorphism, and in ADO.NET can be very useful in writing code that works with multiple .NET Data Providers.
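To make the idea of polymorphism concrete, here is a small C# sketch of a method written only against the IDataReader interface. So that it can run without a database, it assumes the DataTable.CreateDataReader method, which was added in .NET 2.0 and is used here purely as a stand-in for a reader returned by a provider; the class, method, and table names are hypothetical.

```csharp
using System;
using System.Data;

public static class ReaderDemo
{
    // Written only against IDataReader, so it works with any reader that
    // implements the interface, whichever provider created it.
    public static int CountRows(IDataReader reader)
    {
        int count = 0;
        while (reader.Read())   // advance one row at a time, forward only
            count++;
        return count;
    }

    // Builds an in-memory table so the sketch can run without a database.
    public static DataTable BuildTable()
    {
        DataTable table = new DataTable("Titles");
        table.Columns.Add("Title", typeof(string));
        table.Rows.Add(new object[] { "First" });
        table.Rows.Add(new object[] { "Second" });
        return table;
    }

    public static void Main()
    {
        // Assumption: an in-memory reader stands in for a provider's reader.
        using (IDataReader reader = BuildTable().CreateDataReader())
        {
            Console.WriteLine(CountRows(reader)); // 2
        }
    }
}
```

Nothing in CountRows names a provider, so the same method could be handed a reader from the SqlClient or OleDb provider without being changed.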

.NET languages based on the common language runtime also support single implementation inheritance wherein a class can be derived from another class, and in addition to inheriting its member definitions, it also inherits the implementation or code behind those members. You can see from Figure 1.4 that the classes DbDataAdapter and DataAdapter can be used in this way.
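Implementation inheritance can be sketched with any ADO.NET class that is open to derivation. The example below derives from DataSet rather than from DbDataAdapter, simply because a DataSet can be exercised without a provider; CustomersDataSet and its AddCustomer method are hypothetical names used only for illustration.

```csharp
using System;
using System.Data;

// Hypothetical class: it inherits the implementation of DataSet
// and adds one member of its own.
public class CustomersDataSet : DataSet
{
    public CustomersDataSet() : base("Customers")
    {
        DataTable table = Tables.Add("Customers");
        table.Columns.Add("Name", typeof(string));
    }

    // New behavior layered on top of the inherited members.
    public void AddCustomer(string name)
    {
        Tables["Customers"].Rows.Add(new object[] { name });
    }
}

public static class InheritanceDemo
{
    public static void Main()
    {
        CustomersDataSet ds = new CustomersDataSet();
        ds.AddCustomer("Contoso");

        // HasChanges and AcceptChanges are inherited from DataSet as is.
        Console.WriteLine(ds.HasChanges()); // True
        ds.AcceptChanges();
        Console.WriteLine(ds.HasChanges()); // False

        // The derived class can be used wherever a DataSet is expected.
        DataSet plain = ds;
        Console.WriteLine(plain.Tables.Count); // 1
    }
}
```

The derived class writes no change-tracking code of its own; that behavior comes along with the base class, which is the point of inheriting implementation rather than just member definitions.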

If you're a VB or ASP developer, you've probably not worked with interfaces very much. This is primarily the case because although VB 6.0 supported interfaces through the use of the Implements keyword, it was not natural to create interfaces in VB 6.0 and ASP did not support them at all. Secondarily, although the COM programming model relied on interfaces in a very fundamental way, VB did a good job of hiding that fact from developers.

Generic Versus Specific Providers

As is implied by the previous discussion, .NET Data Providers can come in several flavors. The OLE DB and ODBC providers are generic in that they are used to access a variety of data stores and more or less act as a pass-through to other software that communicates with the actual data store. On the other hand, the SQL Server provider is an example of a specific provider because it bypasses OLE DB and ODBC and talks to SQL Server directly using SQL Server's native Tabular Data Stream (TDS) protocol, which provides better performance. This architecture points to the fact that vendors will likely implement specific providers to expose custom functionality and improve performance, while OLE DB and ODBC can still be used through the generic providers. As you'll learn on Day 14, "Working with Other Providers," you can also take advantage of this architecture to build your own generic and specific providers for enterprise applications.

Because .NET Data Providers implement the functionality shown in Figure 1.4, Figure 1.5 shows the same diagram, this time with the specific classes implemented by the SQL Server provider in the System.Data.SqlClient namespace.

Figure 1.5. The SqlClient .NET Data Provider. This diagram shows the implementation of the SqlClient provider.


It's also important to keep in mind that although ADO.NET is a fundamental part of the .NET Framework because of its importance to most corporate developers, it is, in terms of the number of classes it includes, a very small part of the framework as a whole. In its entirety, the .NET Framework encompasses more than 6,500 classes, and includes functionality for everything from building XML Web Services and Windows Forms to building components that can be hosted by Component Services.

A Note About Language Choice

If it wasn't clear from the discussion of managed code, it bears repeating that one of the fundamental goals of .NET is to provide a language-independent framework for developing modern distributed applications. To that end, it doesn't matter whether you program ADO.NET from VB .NET, VC# .NET, or any of the other languages targeted for the common language runtime. However, because the two primary languages that most developers will use initially are VB and C#, all the examples in this book will use one of those two languages. I'll alternate the language used in the listings and the exercise solutions throughout the book, but of course you're free to implement the exercises in whatever language you choose. I think you'll find it fairly easy to read code written to use ADO.NET in either VB or C#, although I'm certainly aware that there might be concepts in each language that will need further explanation. At those times, look for tips and notes for clarification.
