When used together, providers and DataSets give an end-to-end interface to individual data values, from the fields in your database tables, to the in-memory items of a DataRow record. Figure 10-1 shows this object interaction.
Figure 10-1. Providers and DataSets in action
When you interact with data from an external database, you always use the provider classes, but it's up to you whether you want to also use DataSets. There are pros and cons in both methods, some of which appear in Table 10-1.
Table 10-1. The Pros and Cons of Using DataSets
You must supply all SQL statements, in the format expected by the provider. This is true for all SELECT, INSERT, UPDATE, and DELETE requests.
The DataSet and DataAdapter work together to craft many of the SQL statements on your behalf.
Data retrieved through the DataReader is read-only. You must issue separate commands to update data.
Data read from the database can be modified in-memory, and updated as a batch with a single method call.
Data transfers are very efficient, because there is no overhead needed to move data into a complex DataSet structure.
There may be a performance hit as the DataSet builds the necessary objects required for each transferred record.
Memory allocation is limited to a single record's worth of data fields, plus some minimal overhead.
Memory allocation is required for the entire result set, plus overhead for every table, field, column, and row in the result set.
Only a single DataReader can be open at a time (unless the provider supports MARS, which I'll discuss in just a bit).
Any number of DataSets can be in use at once.
A live connection to the database exists as long as a DataReader is in use. If it takes you five minutes to scan a result set because you are doing a lot of per-record analysis, the connection will be active for the full five minutes.
Data connections are maintained only long enough to transfer data from or to the database.
DataReaders present one record at a time. The records must be processed in the order they arrive.
You can jump around the records in a DataSet, and reorganize them to meet your needs.
You spend a lot of time working with Strings (for SQL statements) and raw data fields.
All data fields are organized logically, just like they are in the actual database. You can interact with them directly.
Each Command and Connection works with a single provider-supported data source.
Different DataTables within your DataSet can connect to distinct data sources. Also, you can hand-craft data so that each DataRow contains data from different sources.
Because you manage all SQL statements, you have a (relatively) high level of control over the entire data interaction process.
Because the view of the data is abstracted, you have a (relatively) limited level of control over the entire data interaction process (although advanced use of DataSets does give you some additional control).
For me, the last entry in Table 10-1 is the clincher. The job of a programmer is to control the software experience of the user, and the more control, the better. That's why I generally hate "wizards" and "code generators" that take control away from me, the developer. There are limits to my paranoia, though; I put up with the basic template code supplied by Visual Studio when creating new projects. Still, you'll see my code-controlling personality in the Library Project, with my heavy dependence on DataReaders over DataSets. When I do store data long-term, I usually just stick the data in a DataTable without a containing DataSet.
I mentioned something called MARS in Table 10-1. MARS stands for Multiple Active Result Sets. Normally, a single Connection object only allows a single DataReader to be in use at any given moment. This limitation is bi-directional. If you are scanning a DataReader via a SELECT statement, you cannot issue INSERT, UPDATE, or DELETE statements on that same connection until you close the DataReader.
With the introduction of MARS, a single connection can now handle multiple data transmission activities in either direction. SQL Server adds MARS support with its 2005 release; Oracle has supported MARS-like features since the initial .NET release.
MARS connections seem like a feature you would always want to enable. But they do add additional overhead to your application that can slow it down. Also, MARS does not always mix well with multi-threaded applications.