The DataSet is the heart of ADO .NET. The DataSet is essentially a collection of DataTable objects; in turn each object contains a collection of DataColumn and DataRow objects. The DataSet also contains a Relations collection, which can be used to define relations between DataTable objects. There is a Constraints collection, which can be used to define unique keys and data validation rules. There are also DataView objects that work hand-in-hand with the DataSet. These objects can be used to define filters, sort orders, and display attributes in a given DataTable. You can change from one DataView to another and instantly change the way data is presented to the user . The DataView is part of the data binding mechanism in ADO .NET.
The DataTable Class
The DataTable is just what it sounds like. It is a collection of rows and columns. When we discuss the DataSet, we are really discussing the DataTable, because that is where most of the action takes place. The approach Microsoft took this time around is a bit different than in past versions of ADO and other data access technologies. In the past, the object used for holding table data was the RecordSet. The RecordSet had a collection of columns or fields. These contained the metadata that described each column. The rows were not represented as a collection, and only one row at a time could be accessed. There was the concept of the current row. You used methods like MoveNext and MovePrevious to change the current record pointer. The only way to access multiple rows at a time was to clone the RecordSet. Additionally, there was an edit buffer that was used when the user wanted to make changes to the data. There was also an insert buffer used to add rows to the RecordSet. In most cases, when a row was changed, as soon as the user moved to another row, the data was updated to the database. The database update was automatic and was the default behavior. You could also programmatically force an update using the Update method of the RecordSet object. The RecordSet mostly relied on the error handling in the underlying database system to validate data. It did very little error checking without a round-trip to the server.
By contrast, the ADO .NET DataTable is organized differently. It also has a Columns collection that contains metadata describing the data in the table. Instead of the current row and row pointer method used in the past, the DataTable rows are represented as a true collection. This means there is no longer any concept of the current row. The rows are accessed via a numerical index. The nice thing about this is that all rows are always available. The bad thing is that it's up to the programmer to make sure you are updating not only the correct column, but also the correct row. Luckily, when you bind the DataTable to controls on a form, the form keeps track of the current row though its BindingContext class. If you are not using a form, then you must track the row index yourself. One bad thing is that there is no way to obtain the index of a row from the row itself.
This is annoying when you are using a For Each x In Data Table1.Rows construct. You must maintain your own loop counter. In this case, a For I = 0 to DataTable1.Rows.Count 1 construct would be a better choice.
The Rows collection is also only loosely connected to the Columns collection. In prior versions of ADO, you accessed the value of a column by positioning the row pointer to the desired row, then accessing the Value property of the field object. To get the value at a given row and column of the DataTable, you must reference the Rows collection by row index and column index, much like an array. To read the data, this creates a rather arcane-looking line of code:
Dim x as String x = DataSet1.Tables(0).Rows(3)("LastName")
This would return the value in the fourth row (the Rows collection is zero-based , like all .NET collections) in the LastName column. Thankfully, the designers of ADO .NET allowed the row items to be indexed by the column name or it would make no sense at all. The previous snippet uses a short-hand form. The full form follows :
x = DataSet1.Tables(0).Rows(3).Items("LastName").Item
The Items collection is the default property for the Rows collection, so it can be omitted. To get other information about the column, you have to reference the Columns collection. To get the data type of the column, the syntax is as follows:
Dim x as String x = DataSet1.Tables(0).Columns.Items("LastName").DataType().ToString
I admit to having a less-than -easy time adapting to the new way of doing things. This is one of the problems I've had with highly object-oriented languages. Sometimes the syntax gets really arcane and confusing because everything is wrapped in collections and objects. To me the old ADO syntax was much easier to understand:
X = RecordSet1("LastName").Value Y = RecordSet1("LastName").Type
Perhaps a combination of the two, where you could use the current record concept or access directly by row and column index would be a good compromise. I can see the advantages of directly accessing rows. You don't have the overhead of looping through the data until you find the correct row. However, usually you don't know which row you want by its ordinal position. The only way to find it would be to loop through the collection testing conditions, so I don't see what's gained except an arcane syntax. Well, maybe they'll clean it up with the next revision.
The DataRelation Class
Objects can be used to create a parent/child relationship between DataTables in a DataSet. While DataRelations appear similar to join-type relationships in databases, they do not create a join. A join creates a third entity that is the flat representation of the two tables. The DataRelation simply defines the relationship between two tables as a parent/child structure. Are you wondering what the difference is? The difference is in the way the child rows are accessed.
Say we have an order header table and a line item table. The order number relates the two tables. In order to get the rows in the line item table that match the header record, you would call the GetChildRows method of the row in the header record, passing as a parameter the DataRelation object that defines the relationship to the line item table. The method returns an array of child (line item) records that match the condition in the relation. The array, in case you were wondering, is an array of DataRow objects.
The DataView Class
The DataView class is the enabler of data binding on the DataSet's part. Each DataTable has a default DataView that is created automatically. Controls on forms are actually bound to the DataTable's default DataView. You can create additional DataView objects as needed. You can use DataViews to predefine things like sort orders and filters. They are especially useful when combined with the DataGrid control. Another great use for the DataView is to control security access levels. The DataView's AllowEdit, AllowDelete, and AllowAdd properties can be used to control access to data based on which user is logged on.