The System.Data namespace contains several collection-like objects that, combined, provide an in-memory representation of the DBMS relational programming model. The DataSet class looks like a catalog, whereas the DataTable maps to an individual table. The DataRelation class represents a relationship between tables, and the DataView creates a filtered view of a table's data. In addition, the System.Data namespace also supports constraints and a relatively simple model of indexing.
The facilities of the memory-resident database model feature a programming model in which disconnection is a key feature rather than a precise requirement. Using the DataSet model, for example, you can filter and sort the data on the client before it gets to the middle tier. Having such facilities available within the DataSet means that once the data is there, you don't need to go back to the database to get a different view on the data. The data stored in the DataSet is self-sufficient, which makes the whole model inherently disconnected.
Note | An interesting use of the DataSet that makes sense both for Web and desktop scenarios is in moving data around between components and tiers. The DataSet is great at encapsulating tables of data and relationships. It can also be passed around between tiers as a monolithic object. Finally, it can be serialized, meaning that data and related schema can be moved between tiers in a loosely coupled manner. |
The DataSet class is the principal component in the ADO.NET object model, but several others are satellite classes that play a fundamental role. ADO.NET container classes are listed Table 8-7.
Class | Description |
---|---|
DataSet | An in-memory cache of data made of tables, relations, and constraints. Serializable and remotable, it can be filled from a variety of data sources and works regardless of which one is used. |
DataTable | Represents a relational table of data with a collection of columns and rows. |
DataColumn | Represents a column in a DataTable object. |
DataRow | Represents a row in a DataTable object. |
DataView | Defined on top of a particular table, it creates a filtered view of data. Can be configured to support editing and sorting. The data view is not a copy of the data just a mask. |
DataRelation | Represents a relationship between two tables in the same DataSet. The relationship is set on a common column. |
A key point to remember about ADO.NET container classes is that they work regardless of the data source used. You can populate the tables in a DataSet using the results of a SQL Server query as well as file system information or data read out of a real-time device. Even more important, none of the ADO.NET container classes retains information about the source. Like array or collection objects, they just contain data. Unlike array or collection objects, though, they provide facilities to relate and manage data in a database-like fashion.
The DataSet class implements three important interfaces IListSource makes it possible to return a data-bound list of elements, ISerializable makes the class capable of controlling how its data is serialized to a .NET formatter, and IXmlSerializable guarantees the class can serialize itself to XML. Table 8-8 lists the properties of the DataSet class.
Property | Description |
---|---|
CaseSensitive | Gets or sets a value that indicates whether string comparisons within DataTable objects are case-sensitive. |
DataSetName | Gets or sets the name of the DataSet. |
DefaultViewManager | Gets the default view manager object an instance of the DefaultViewManager class that contains settings for each table in the DataSet. |
EnforceConstraints | Gets or sets a value that indicates whether constraint rules are enforced when attempting any update operation. |
ExtendedProperties | Gets the collection of customized user information associated with the DataSet. |
HasErrors | Gets whether there are errors in any of the child DataTable objects. |
Locale | Gets or sets the locale information used to compare strings within the tables. |
Namespace | Gets or sets the namespace of the DataSet. |
Prefix | Gets or sets the prefix that aliases the namespace of the DataSet. |
Relations | Gets the collection of the relations set between pairs of child tables. |
RemotingFormat | Indicates the desired serialization format binary or XML. Not supported in ADO.NET 1.x. |
SchemaSerializationMode | Indicates whether schema should be included in the serialized data. Not supported in ADO.NET 1.x. |
Tables | Gets the collection of contained tables. |
The Namespace and Prefix properties affect the way in which the DataSet serializes itself to XML. The name of the DataSet is also used to set the root node of the XML representation. If the DataSetName is empty, the NewDataSet string is used. The methods of the class are listed in Table 8-9.
Method | Description |
---|---|
AcceptChanges | Commits all the changes made to all the tables in the DataSet since it was loaded or since the last time the method was called. |
Clear | Removes all rows in all tables. |
Clone | Copies the structure of the DataSet, including all table schemas, relations, and constraints. No data is copied. |
Copy | Makes a deep copy of the object, including schema and data. |
CreateDataReader | Returns a DataTable-specific data reader object with one result set per table, in the same sequence as they appear in the Tables collection. Not supported in ADO.NET 1.x. |
GetChanges | Returns a copy of the DataSet containing only the changes made to it since it was last loaded or since AcceptChanges was called. |
GetXml | Returns the XML representation of the data stored. |
GetXmlSchema | Returns the XSD schema for the XML string representing the data stored in the DataSet. |
HasChanges | Indicates whether there are new, deleted, or modified rows in any of the contained tables. |
InferXmlSchema | Replicates into the DataSet the table structure inferred from the specified XML document. |
Merge | Merges the specified ADO.NET object (DataSet, DataTable, or an array of DataRow objects) into the current DataSet. |
ReadXml | Populates the DataSet by reading schema and data from the specified XML document. |
ReadXmlSchema | Replicates into the DataSet the table structure read from the specified XML schema. |
RejectChanges | Rolls back all the changes made to all the tables since it was created or since the last time AcceptChanges was called. |
Reset | Empties tables, relations, and constraints, resetting the DataSet to its default state. |
WriteXml | Serializes the DataSet contents to XML. |
WriteXmlSchema | Writes the DataSet structure as an XML schema. |
To make a full, deep copy of the DataSet, you must resort to the Copy method except that in this case you duplicate the object. The following code does not duplicate the object:
DataSet tmp = ds;
If you simply assign the current DataSet reference to another variable, you duplicate the reference but not the object. Use the following code to duplicate the object:
DataSet tmp = ds.Copy();
The Copy method creates and returns a new instance of the DataSet object and ensures that all the tables, relations, and constraints are duplicated. The Clone method is limited to returning a new DataSet object in which all the properties have been replicated but no data in the tables is copied.
DataSets and data readers are often presented as mutually exclusive and alternative ways to read data in ADO.NET applications. At its core, there's just one physical way of reading data in ADO.NET using data readers. DataSets are disconnected containers automatically filled using a reader, and they are ideal for caching data. Data readers are ideal tools for consuming data as you walk your way through the result set.
Imagine now that you have access to some previously cached data say, a DataSet stored in the session state. How do you find and read a particular record? You typically indicate the coordinates of the record (row and column) and perform a random access to it. If you need to read two or more records, you just repeat the operation. In ADO.NET 2.0, there's a better way using in-memory, disconnected readers that you create through the CreateDataReader method. A reader obtained in this way is different from the connected, cursor-like data reader you get out of the ExecuteReader method on the command class. What you get here is a DataTableReader object, good at scrolling the contents of an in-memory data table using the same cursor-like programming interface of data readers. Here's an example:
DataSet data = new DataSet(); SqlDataAdapter adapter = new SqlDataAdapter( "SELECT * FROM employees;SELECT * FROM customers", ConfigurationManager.ConnectionStrings["LocalNWind"].ConnectionString); adapter.Fill(data); // Access the whole data set record by record DataTableReader reader = data.CreateDataReader(); do { while (reader.Read()) { // reader[1] indicates the second column Response.Write(String.Format("{0} <br>", reader[1])); } Response.Write("<hr>"); } while (reader.NextResult()); reader.Close();
The do statement loops through all the result sets and lists the content of the second field for the record. This code is not really different from the code we examined in Chapter 7 for multiple result sets except that all this code runs in-memory without any connection to the database.
What's the purpose of table readers? Your code runs faster when repeated reads of many consecutive records should be performed.
A merge operation is typically accomplished by a client application to update an existing DataSet object with the latest changes read from the data source. The Merge method should be used to fuse together two DataSet objects that have nearly identical schemas. The two schemas, though, are not strictly required to be identical.
The first step in the merge operation compares the schemas of the involved DataSet objects to see whether they match. If the DataSet to be imported contains new columns or a new table source, what happens depends on the missing schema action specified. By default, any missing schema element is added to the target DataSet, but you can change the behavior by choosing the Merge overload that allows for a MissingSchemaAction parameter.
At the second step, the Merge method attempts to merge the data by looking at the changed rows in the DataSet to be imported. Any modified or deleted row is matched to the corresponding row in the existing DataSet by using the primary key value. Added rows are simply added to the existing DataSet and retain their primary key value.
The merge operation is an atomic operation that must guarantee integrity and consistency only at its end. For this reason, constraints are disabled during a merge operation. However, if at the end of the merge the original constraints can't be restored for example, a unique constraint is violated an exception is thrown, but no uploaded data gets lost. In this case, the Merge method completely disables constraints in the DataSet. It sets the EnforceConstraints property to false and marks all invalid rows in error. To restore constraints, you must first resolve errors.
When the DataSet is first loaded, all the rows in all tables are marked as unchanged. (All rows are marked Added if the AcceptChangesDuringFill property is false on the adapter used to fill the DataSet.) The state of a table row is stored in a property named RowState. Allowable values for the row state are in the DataRowState enumeration listed in Table 8-10.
Value | Description |
---|---|
Added | The row has been added to the table. |
Deleted | The row is marked for deletion from the parent table. |
Detached | Either the row has been created but not yet added to the table or the row has been removed from the rows collection. |
Modified | Some columns within the row have been changed. |
Unchanged | No changes have been made since the last call to AcceptChanges. This is also the state of all rows when the table is first created. |
Each programmatic operation accomplished on a DataSet member changes the state of the involved rows. All changes remain pending in an uncommitted state until a specific call is made to make the changes persistent. The AcceptChanges method has the power to commit all the changes and accept the current values as the new original values of the table. After AcceptChanges is called, all changes are cleared and rows incorporate the changed values and appear as unchanged. The RejectChanges method, on the other hand, rolls back all the pending changes and restores the original values. Note that the DataSet retains original values until changes are committed or rejected.
The commit model is applicable at various levels. In particular, by calling AcceptChanges or RejectChanges on the DataSet object, you commit or roll back changes for all the rows in all the contained tables. If you call the same methods on a DataTable object, the effect applies to all the rows in the specified table. Finally, you can also accept or reject changes for an individual row in a particular table.
The contents of a DataSet object can be serialized as XML in two ways, which I'll call stateless and stateful. Although these expressions are not common throughout the ADO.NET documentation, I feel that they perfectly describe the gist of the two possible approaches. A stateless representation takes a snapshot of the current instance of the data and renders it according to a particular XML schema the ADO.NET normal form which is shown in the following code:
<MyDataSet> <Employees> <ID>...</ID> <Name>...</Name> </Employees> ... <Orders> <OrderID>...</OrderID> <OrderDate>...</OrderDate> <Amount>...</Amount> </Orders> </MyDataSet>
The root node appears after the DataSetName property. Nodes one level deeper represent rows of all tables and are named as the table. Each row node contains as many children as there are columns in the row. This code snippet refers to a DataSet with two tables Employees and Orders with two and three columns, respectively. That kind of string is what the GetXml method returns and what the WriteXml method writes out when the default write mode is chosen:
dataSet.WriteXml(fileName); dataSet.WriteXml(fileName, mode);
A stateful representation, on the other hand, contains the history of the data in the object and includes information about changes as well as pending errors. Table 8-11 summarizes the writing options available for use with WriteXml through the WriteXmlMode enumeration.
Write Mode | Description |
---|---|
IgnoreSchema | Writes the contents of the DataSet as XML data without schema. |
WriteSchema | Writes the contents of the DataSet, including an inline XSD schema. The schema cannot be inserted as XDR, nor can it be added as a reference. |
DiffGram | Writes the contents of the DataSet as a DiffGram, including original and current values |
IgnoreSchema is the default option. The following code demonstrates the typical way to serialize a DataSet to an XML file:
StreamWriter sw = new StreamWriter(fileName); dataset.WriteXml(sw); // defaults to XmlWriteMode.IgnoreSchema sw.Close();
A DiffGram is an XML serialization format that includes both the original values and current values of each row in each table. In particular, a DiffGram contains the current instance of rows with the up-to-date values plus a section where all the original values for changed rows are grouped. Each row is given a unique identifier that is used to track changes between the two sections of the DiffGram. This relationship looks a lot like a foreign-key relationship. The following listing outlines the structure of a DiffGram:
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"> <DataSet> ... </DataSet> <diffgr:before> ... </diffgr:before> <diffgr:errors> ... </diffgr:errors> </diffgr:diffgram>
The <diffgr:diffgram> root node can have up to three children. The first is the DataSet object with its current contents, including newly added rows and modified rows (but not deleted rows). The actual name of this subtree depends on the DataSetName property of the source DataSet object. If the DataSet has no name, the subtree's root is NewDataSet. The subtree rooted in the <diffgr:before> node contains enough information to restore the original state of all modified rows. For example, it still contains any row that has been deleted as well as the original content of any modified row. All columns affected by any change are tracked in the <diffgr:before> subtree. The last subtree is <diffgr:errors>, and it contains information about any errors that might have occurred on a particular row.
In addition to XML serialization, the DataSet class fully supports .NET binary serialization. Marked with the [Serializable] attribute, the DataSet object implements the ISerializable interface and gains full control over the serialization process. Put another way, the DataSet itself embeds the code that generates the stream of bytes saved as the serialized version of the object.
In ADO.NET 1.x, the DataSet serializes as XML even when binary serialization is requested through a .NET formatter. Worse yet, the DataSet uses the fairly verbose DiffGram format, topped with any related schema information. All .NET distributed systems that make intensive use of disconnected data (as Microsoft's architecture patterns and practices suggests) are sensitive to the size of serialized data. The larger the DataSet, the more these systems suffer from consumption of CPU, memory, and bandwidth. Nicely enough, ADO.NET 2.0 provides a great fix for this limitation through the RemotingFormat property.
The property accepts values from the SerializationFormat enum type: Xml (the default) or Binary. When a DataSet instance is being serialized through a .NET formatter (say, in a .NET Remoting scenario), it looks at the value of the RemotingFormat property and decides about the persistence format. Needless to say, if you set RemotingFormat to Binary you get a much more compact output:
DataSet ds = GetData(); ds.RemotingFormat = SerializationFormat.Binary; StreamWriter writer = new StreamWriter(BinFile); BinaryFormatter bin = new BinaryFormatter(); bin.Serialize(writer.BaseStream, ds); writer.Close();
The preceding code shows how to serialize to disk a DataSet in a truly binary format. If you omit the statement that sets the remoting format, you obtain the same behavior as in ADO.NET 1.x. If you're passing a DataSet through a .NET Remoting channel, the only thing you have to do is set the RemotingFormat property.
The DataTable object represents one table of in-memory data. Mostly used as a container of data within a DataSet, the DataTable object is also valid as a standalone object that contains tabular data. The DataTable and DataSet are the only ADO.NET objects that can be remoted and serialized. Just as with a DataSet, a DataTable can be created programmatically. In this case, you first define its schema and then add new rows. The following code snippet shows how to create a new table within a DataSet:
DataSet ds = new DataSet(); DataTable tableEmp = new DataTable("Employees"); tableEmp.Columns.Add("ID", typeof(int)); tableEmp.Columns.Add("Name", typeof(string)); ds.Tables.Add(tableEmp);
The table is named Employees and features two columns ID and Name. The table is empty because no rows have been added yet. To add rows, you first create a new row object by using the NewRow method:
DataRow row = tableEmp.NewRow(); row["ID"] = 1; row["Name"] = "Joe Users"; tableEmp.Rows.Add(row);
The DataTable contains a collection of constraint objects that can be used to ensure the integrity of the data and signals changes to its data-firing events. Let's have a closer look at the programming interface of the DataTable, beginning with properties. Table 8-12 lists the properties of the DataTable class.
Property | Description |
---|---|
CaseSensitive | Gets or sets whether string comparisons are case-sensitive. |
ChildRelations | Gets the collection of child relations for this table. |
Columns | Gets the collection of columns that belong to this table. |
Constraints | Gets the collection of constraints maintained by this table. |
DataSet | Gets the DataSet this table belongs to. |
DefaultView | Gets the default DataView object for this table. |
DisplayExpression | Gets or sets a display string for the table. Used in the ToString method together with TableName. |
ExtendedProperties | Gets the collection of customized user information. |
HasErrors | Gets a value that indicates whether there are errors in any of the rows. |
Locale | Gets or sets locale information used to compare strings. |
MinimumCapacity | Gets or sets the initial starting size for the table. |
Namespace | Gets or sets the namespace for the XML representation of the table. |
ParentRelations | Gets the collection of parent relations for this table. |
Prefix | Gets or sets the prefix that aliases the table namespace. |
PrimaryKey | Gets or sets an array of columns that function as the primary key for the table. |
RemotingFormat | Indicates the desired serialization format binary or XML. Not supported in ADO.NET 1.x. |
Rows | Gets the collection of rows that belong to this table. |
TableName | Gets or sets the name of the DataTable object. |
Shared by several ADO.NET objects, the ExtendedProperties collection manages name/value pairs and accepts values of type object. You can use this collection as a generic cargo variable in which to store any user information. The methods of the DataTable class are listed in Table 8-13.
Method | Description |
---|---|
AcceptChanges | Commits all the pending changes made to the table. |
BeginInit | Begins the initialization of the table. Used when the table is used on a form or by another component. |
BeginLoadData | Turns off notifications, index maintenance, and constraints while loading data. |
Clear | Removes all the data from the table. |
Clone | Clones the structure of the table. Copies constraints and schema, but doesn't copy data. |
Compute | Computes the given expression on the rows that meet the specified filter criteria. Returns the result of the computation as an object. |
Copy | Copies both the structure and data for the table. |
CreateDataReader | Returns a DataTableReader object for the current table. Not supported in ADO.NET 1.x. |
EndInit | Ends the initialization of the table. Closes the operation started with BeginInit. |
EndLoadData | Turns on notifications, index maintenance, and constraints after loading data. |
GetChanges | Gets a copy of the table containing all changes made to it since it was last loaded or since AcceptChanges was called. |
GetErrors | Gets an array of all the DataRow objects that contain errors. |
ImportRow | Performs a deep copy of a DataRow, and loads it into the table. Settings, including original and current values, are preserved. |
LoadDataRow | Finds and updates a specific row. If no matching row is found, a new row is created using the given values. Uses the primary keys to locate the row. |
NewRow | Creates a new DataRow object with the schema as the table. |
ReadXml | Populates the DataTable reading schema and data from the specified XML document. Not supported in ADO.NET 1.x. |
ReadXmlSchema | Replicates into the DataTable the table structure read from the specified XML schema. Not supported in ADO.NET 1.x. |
RejectChanges | Rolls back all changes that have been made to the table since it was loaded or since the last time AcceptChanges was called. |
Reset | Resets the DataTable object to its default state. |
Select | Gets the array of DataRow objects that match the criteria. |
WriteXml | Serializes the DataTable contents to XML. Not supported in ADO.NET 1.x. |
WriteXmlSchema | Writes the DataTable structure as an XML schema. Not supported in ADO.NET 1.x. |
In ADO.NET 2.0, the DataTable implements the IXmlSerializable interface and provides public methods to load and save its contents from and to XML streams. The implementation of the interface is also the key that now allows DataTable to be used as parameters and return values from .NET Web service methods.
Any row in the DataTable is represented by a DataRow object, whereas the DataColumn object represents a column. The Select method implements a simple but effective query engine for the rows of the table. The result set is an array of DataRow objects. The filter string is expressed in an internal language that looks like that used to build WHERE clauses in a SQL SELECT statement. The following line of code is a valid expression that selects all records in which the ID is greater than 5 and the name begins with A:
tableEmp.Select("ID >5 AND Name LIKE 'A%'");
Refer to the .NET Framework documentation for the full syntax supported by the Select method. Note that it is the same language you can use to define expression-based DataTable columns.
The Compute method of the DataTable class calculates a value by applying a given expression to the table rows that match a specified filter. Expressions can include any sort of Boolean and arithmetic operators, but they can also include more interesting aggregate functions such as Min, Max, Count, and Sum , plus a few more statistical operators such as average, standard deviation, and variance. The following code counts the rows in which the Name column begins with A:
int numRecs = (int) tableEmp.Compute("Count(ID)", " Name LIKE 'A%'");
The Compute method has two possible overloads one that takes only the expression to compute and one that also adds a filter string, as shown in the preceding code. Note that all aggregate functions can operate on a single column. This means you can directly compute the sum on two columns, as in the following pseudocode:
Sum(quantity * price)
To compute functions on multiple columns, you can leverage the capabilities of the DataColumn object and, in particular, its support for dynamic expressions. For example, you can define an in-memory column named order_item_price as follows:
tableEmp.Columns.Add("order_item_price", typeof(double), "quantity*price");
At this point, you can compute the sum of that column using the following expression:
Sum(order_item_price)
A DataColumn object represents the schema of a column in a DataTable object. It provides properties that describe the characteristics and capabilities of the column. The DataColumn properties include AllowDBNull, Unique, ReadOnly, DefaultValue, and Expression. As discussed earlier, some of these properties are automatically set with the corresponding information read from the data source at least when the data source is a database.
A DataColumn object has a name and type; sometimes it can also have an associated expression. The content of an expression-based column is a function of one or more columns combined with operators and aggregates to form a full expression. When an expression-based column is created, ADO.NET precalculates and caches all the values for the column as if they were native data. At the same time, ADO.NET tracks the columns involved and monitors them for changes. It does so by registering an internal handler for the DataTable's RowChanged event. When a row changes in one of the columns involved in the expression, the computed column is automatically refreshed.
Expression-based columns are extremely powerful for setting up more effective and practical forms of data binding, as we'll see in the next chapter. In addition, expression-based columns work side by side with table relations and, using both, you can implement really powerful features. We'll demonstrate this later in the "Data Relations" section.
The data in a table is represented with a collection of DataRow objects. A row has a state, an array of values, and possibly error information. The DataTable maintains various versions of the row. You can query for a particular version at any time using the Item accessor property. The following code snippet shows how to read the original value of a column in a particular DataRow object. By default, you are returned the current value.
Response.Write(row["Name", DataRowVersion.Original].ToString());
All the values in a row can be accessed either individually or as a whole. When accessing all the values in a row, you use the ItemArray property, which passes you an array of objects, one per each column. The ItemArray property is a quick way to read values from a row and to set all the columns on a row in a single shot.
The DataRow class doesn't have a public constructor. As a result, a data row can be created only implicitly using the NewRow method on a base table. The NewRow method populates the DataRow object with as many entries as there are columns in the DataTable. In this case, the table provides the schema for the row, but the row is in no way a child of the table. To add a row to a DataTable, you must explicitly add it to the Rows collection:
tableEmp.Rows.Add(row);
Note that a DataRow object cannot be associated with more than one table at a time. To load a row into another table, you can use the ImportRow method, which basically duplicates the DataRow object and loads it into the specified table. A row can be detached from its parent table by using the Remove method. If you use the Delete method, on the other hand, the row will be marked for deletion but still remain part of the table.
Note | Objects removed from a parent collection are not automatically destroyed or, at least, not until they go out of scope and become fodder for the garbage collector. This consideration holds true for several ADO.NET objects including, but not limited to, the DataRow. A DataTable, for example, can be detached from the DataSet by simply removing it from the Tables collection. However, this doesn't mean that the DataTable is automatically deleted as an object. |
A constraint is a logical rule set on a table to preserve the integrity of the data. For example, a constraint determines what happens when you delete a record in a table that is related to another one. The .NET Framework supports two types of constraints ForeignKeyConstraint and UniqueConstraint.
In particular, the ForeignKeyConstraint class sets the rules that govern how the table propagates, updates, and deletes child tables. For example, suppose you have two related tables, one with employees and one with orders. What happens when an employee is deleted? Should you delete all the related records, too? The ForeignKeyConstraint object associated with the Employees table will determine what is related to it in the Orders table. You create a ForeignKeyConstraint object as shown here:
DataColumn c1 = tableEmp.Columns("empID"); DataColumn c2 = tableOrd.Columns("empID"); ForeignKeyConstraint fk = new ForeignKeyConstraint("EmpOrders", c1, c2); // Run some code to configure the constraint object ... tableOrd.Constraints.Add(fk);
The ForeignKeyConstraint constructor takes the name of the object plus two DataColumn objects. The first DataColumn object represents the column (or columns) on the parent table; the second DataColumn object represents the column (or columns) in the child table. The constraint is added to the child table and is configured using the UpdateRule, DeleteRule, and AcceptRejectRule properties. While setting the UpdateRule and DeleteRule properties, you use values taken from the Rule enumeration. The AcceptRejectRule is the enumeration used to look for the property with the same name. For updates and deletions, the child table can cascade the change or set the involved rows to null or default values. Alternately, the child table can simply ignore the changes. The AcceptRejectRule property is processed when the AcceptChanges method is called on the parent row to commit changes. The choices for the constraint are limited to either cascading or ignoring changes.
The UniqueConstraint class ensures that a single column (or an array of columns) have unique, nonduplicated values. There are several ways to set a unique constraint. You can create one explicitly using the class constructor and adding the resulting object to the Constraints collection of the DataTable:
UniqueConstraint uc; uc = new UniqueConstraint(tableEmp.Columns("empID")); tableEmp.Constraints.Add(uc);
A unique constraint can also be created implicitly by setting the Unique property of the column to true. In contrast, setting the Unique property to false resets the constraint. In addition, adding a column to the in-memory primary key for a table would automatically create a unique constraint for the column. Note that a primary key on a DataTable object is an array of DataColumn objects that is used to index and sort the rows. The Select method on the DataTable benefits from the index as much as other methods on the DataView class do.
Note | When you define a DataColumn as the primary key for a DataTable object, the table automatically sets the AllowDBNull property of the column to false and the Unique property to true. If the primary key is made of multiple columns, only the AllowDBNull property is automatically set to false. |
A data relation represents a parent/child relationship between two DataTable objects in the same DataSet. In the .NET Framework, a data relation is represented by a DataRelation object. You set a relation between two tables based on matching columns in the parent and child tables. The matching columns in the two related tables can have different names, but they must have the same type. All the relations for the tables in a DataSet are stored in the Relations collection. Table 8-14 lists the properties of the DataRelation class.
Property | Description |
---|---|
ChildColumns | Gets the child DataColumn objects for the relation. |
ChildKeyConstraint | Gets the ForeignKeyConstraint object for the relation. |
ChildTable | Gets the child DataTable object for the relation. |
DataSet | Gets the DataSet to which the relation belongs. |
ExtendedProperties | Gets the collection that stores user information. |
Nested | Gets or sets a value that indicates whether the relation should render its data as nested subtrees when the DataSet is rendered to XML (more on this later in the "Serializing a Data Relation" section). |
ParentColumns | Gets the parent DataColumn objects for the relation. |
ParentKeyConstraint | Gets the UniqueConstraint object that ensures unique values on the parent column of the relation. |
ParentTable | Gets the parent DataTable object for the relation. |
RelationName | Gets or sets the name of the DataRelation object. The name is used to identify the relation in the Relations collection of the parent DataSet object. |
When a DataRelation is created, two constraints are silently created. A foreign-key constraint is set on the child table using the two columns that form the relation as arguments. In addition, the parent table is given a unique constraint that prevents it from containing duplicates. The constraints are created by default, but by using a different constructor you can instruct the DataRelation to skip that step. The DataRelation class has no significant methods.
The DataRelation class can be seen as the memory counterpart of a database table relationship. However, when a DataSet is loaded from a database, DBMS-specific relationships are not processed and loaded. As a result, data relations are exclusively in-memory objects that must be created explicitly with code. The following snippet shows how:
DataColumn c1 = tableEmp.Columns("empID"); DataColumn c2 = tableOrd.Columns("empID"); DataRelation rel = new DataRelation("Emp2Orders", c1, c2); DataSet.Relations.Add(rel);
Given two tables, Employees and Orders, the preceding code sets up a relationship between the two based on the values of the common column empID. What are the practical advantages of such a relation? After the relation is set, the parent DataTable knows that each row might be bound to a bunch of child related rows. In particular, each employee in the Employees table has an array of related rows in the Orders table. The child rows are exactly those where the value of the Orders.empID column matches the empID column on the current Employees row.
ADO.NET provides an automatic mechanism to facilitate the retrieval of these related rows. The method is GetChildRows and is exposed by the DataRow class. GetChildRows takes a relation and returns an array filled with all the DataRow objects that match:
foreach(DataRow childRow in parentRow.GetChildRows("Emp2Orders")) { // Process the child row[s] }
Another important facility ADO.NET provides for data relations has to do with table calculations and expression-based columns.
A common task in many real-world applications entails that you manage two related tables and, given a parent row, process the subset of child records. In many situations, processing the child rows just means performing some aggregate computations on them. This is just one of the facilities that ADO.NET and relations provide for free. Let's suppose that, given the previous employees-to-orders relation, you need to compute the total of orders issued by a given employee. You could simply add a dynamically computed column to the parent table and bind it to the data in the relation:
tableEmp.Columns.Add("Total", typeof(int), "Sum(child(Emp2Orders).Amount)");
The new column Total contains, per each employee, a value that represents the sum of all the values in the Amount column for the child rows of the relation. In other words, now you have a column that automatically computes the total of orders issued by each employee. The keyword child is a special syntax element of the language that ADO.NET expressions support. Basically, the child keyword takes a relation name and returns an array of DataRow objects that is the child of that relation.
The Nested property on the DataRelation object affects the way in which the parent DataSet is rendered to XML. By default, the presence of a relation doesn't change the XML schema used to serialize a DataSet. All the tables are therefore rendered sequentially under the root node. A nested relation changes this default schema. In particular, a nested relation is rendered hierarchically with child rows nested under the parent row.
A DataSet with Employees and Orders tables is rendered according to the following pattern:
<MyDataSet> <Employees emp name="Joe Users" /> ... <Orders emp amount="6897" /> <Orders emp amount="19713" /> ... </MyDataSet>
If a relation exists between the tables and is set as nested, the XML schema changes as follows:
<MyDataSet> <Employees emp name="Joe Users"> <Orders emp amount="6897" /> <Orders emp amount="19713" /> </Employees> ... </MyDataSet>
The child rows are taken out of their natural place and placed within the subtree that represents the parent row.
The DataView class represents a customized view of a DataTable. The relationship between DataTable and DataView objects is governed by the rules of a well-known design pattern: the document/view model. The DataTable object acts as the document, whereas the DataView behaves as the view. At any moment, you can have multiple, different views of the same underlying data. More important, you can manage each view as an independent object with its own set of properties, methods, and events.
The view is implemented by maintaining a separate array with the indexes of the original rows that match the criteria set on the view. By default, the table view is unfiltered and contains all the records included in the table. By using the RowFilter and RowStateFilter properties, you can narrow the set of rows that fit into a particular view. Using the Sort property, you can apply a sort expression to the rows in the view. Table 8-15 lists the properties of the DataView class.
Property | Description |
---|---|
AllowDelete | Gets or sets a value that indicates whether deletes are allowed in the view. |
AllowEdit | Gets or sets a value that indicates whether edits are allowed in the view. |
AllowNew | Gets or sets a value that indicates whether new rows can be added through the view. |
ApplyDefaultSort | Gets or sets a value that indicates whether to use the default sort. |
Count | Gets the number of rows in the view after the filter has been applied. |
DataViewManager | Gets the DataViewManager object associated with this view. |
Item | An indexer property. Gets a row of data from the underlying table. |
RowFilter | Gets or sets the expression used to filter out rows in the view. |
RowStateFilter | Gets or sets the row state filter used in the view. |
Sort | Gets or sets the sorting of the view in terms of columns and order. |
Table | Gets or sets the source DataTable for the view. |
The filter can be an expression, the state of the rows, or both. The RowStateFilter property, in particular, takes its acceptable values from the DataViewRowState enumeration and allows you to filter based on the original or current values of the row, or on modified, added, or deleted rows. The RowFilter property supports the same syntax as the DataTable's Select method.
A DataView does not contain copies of the table's rows. It is limited to storing an array of indexes that is updated whenever any of the filter properties is set. The DataView object is already connected to the underlying DataTable, of which it represents a possibly filtered and/ or sorted view. The AllowXXX properties only let you control whether the view is editable. By default, the view is fully editable. Table 8-16 lists the methods of the DataView class.
Method | Description |
---|---|
AddNew | Adds a new row to the view and the underlying table. |
BeginInit | Begins the initialization of the view. |
CopyTo | Copies items from the view into an array. |
Delete | Deletes the row at the specified index in the view. The row is deleted from the table, too. |
EndInit | Ends the initialization of the view. |
Find | Finds a row in the view by the specified key value. |
FindRows | Returns an array of row objects that match the specified key value. |
GetEnumerator | Returns an enumerator object for the DataView. |
Both the AddNew and Delete methods affect the underlying DataTable object. Multiple changes can be grouped using the pair BeginInit/EndInit.
The contents of a DataView object can be scrolled through a variety of programming interfaces, including collections, lists, and enumerators. The GetEnumerator method, in particular, ensures that you can walk your way through the records in the view by using the familiar for each statement. The following code shows how to access all the rows that fit into the view:
DataView myView = new DataView(table); foreach(DataRowView rowview in myView) { // dereferences the DataRow object DataRow row = rowview.Row; ... }
When client applications access a particular row in the view, the DataView class expects to find it in an internal cache of rows. If the cache is not empty, the specified row is returned to the caller via an intermediate DataRowView object. The DataRowView object is a kind of wrapper for the DataRow object that contains the actual data. You access row data through the Row property. If the row cache is empty, the DataView class fills it up with an array of DataRowView objects, each of which references an original DataRow object. The row cache is refreshed whenever the sort expression or the filter string is updated. The row cache can be empty either because it's never been used or because the sort expression or the filter string has been changed in the meantime. Figure 8-5 illustrates the internal architecture of a DataView object.
Figure 8-5: The internal structure of a DataView object.
The link between the DataTable and the DataView is typically established at creation time through the constructor:
public DataView(DataTable table);
However, you could also create a new view and associate it with a table at a later time using the DataView's Table property, as shown here:
DataView dv = new DataView(); dv.Table = dataSet.Tables["Employees"];
You can also obtain a DataView object from any table. In fact, the DefaultView property of a DataTable object just returns a DataView object initialized to work on that table:
DataView dv = dt.DefaultView;
Originally, the view is unfiltered and the index array contains as many elements as the rows in the table. To quickly find a row, you can use either the Find or FindRows method. The Find method takes one or more values and returns an array with the indexes of the rows that match. The FindRows method works in much the same way, but it returns an array of DataRowView objects rather than indexes. Both methods accept a sort key value to identify the rows to return.
Note | The contents of a DataView can be sorted by multiple fields by using the Sort property. You can assign to Sort a comma-separated list of column names, and even append them with DESC or ASC to indicate the direction. |