XML and the DataSet
XML in the .NET Framework is very closely related to ADO.NET. The DataSet uses XML as its native format for reading, writing, and transporting both data and schema, even though in memory it holds that data as tables, rows, and columns. So you're working with the same data in either situation: directly with the XML classes, or indirectly through ADO.NET. The DataSet simply provides a different, relational view of it.
Let's examine the current situation. On one hand, you have ADO.NET and its objects. There are simple objects that provide quick and easy access to data, such as the OleDbDataReader. Then there are the more complex objects, like the DataSet, which contains relational information and provides more functionality than the OleDbDataReader.
Next you have the XML classes in the .NET Framework, which also come in simple and complex varieties. The XmlTextReader provides simple, lightweight access for reading XML data, and the XmlDocument provides more functionality. The latter doesn't do well at representing relational data, however, so the XmlDataDocument was introduced.
The XmlDataDocument is to XML as the DataSet is to ADO.NET. These two objects are very similar to each other, and you can easily convert from one to another. In a way, these two objects are the bridge between ADO.NET and XML.
The XmlDataDocument derives from the XmlDocument, but it adds a relational data representation that's analogous to the DataSet. In fact, the XmlDataDocument can be used anywhere the XmlDocument can, with the same methods and properties.
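As a rough illustration of that point, the following console sketch queries an XmlDataDocument with an ordinary XPath expression, just as you would an XmlDocument, and then maps each matching element back to its DataRow. The inline XML, its element names, and the module name are made up for this example and are not the chapter's books.xml.

```vb
Imports System
Imports System.Data
Imports System.IO
Imports System.Xml

Module XPathSketch
    Sub Main()
        ' Hypothetical sample data; not the chapter's books.xml.
        Dim xml As String = _
            "<library>" & _
            "<book><title>Sample A</title><price>8</price></book>" & _
            "<book><title>Sample B</title><price>15</price></book>" & _
            "</library>"

        ' Load the data relationally, then wrap it in an XML view.
        Dim ds As New DataSet()
        ds.ReadXml(New StringReader(xml))
        Dim xmldoc As New XmlDataDocument(ds)

        ' Query it with XPath, exactly as you would an XmlDocument.
        Dim nodes As XmlNodeList = xmldoc.SelectNodes("//book[price > 10]")

        Dim node As XmlNode
        For Each node In nodes
            ' Map the XML element back to its row in the DataSet.
            Dim row As DataRow = _
                xmldoc.GetRowFromElement(CType(node, XmlElement))
            Console.WriteLine(row("title").ToString())
        Next
    End Sub
End Module
```

GetRowFromElement is the piece that ties the two views together here; its counterpart, GetElementFromRow, goes in the other direction.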
Every XmlDataDocument comes with a DataSet, created for you automatically and accessed via the XmlDataDocument.DataSet property. When an XML schema is available, it's used to build the columns and their data types in the DataSet. If a schema isn't provided, the DataSet's ReadXml method will infer the structure from the data itself.
This allows you to work with the data however you want. You can open an XML file using the XML objects and then move the data into a DataSet for binding to a server control, for instance. Or you can retrieve data from a database with the DataSet and save it to an XML file. Any changes made to the DataSet will be reflected in the XmlDataDocument. Changes to the XmlDataDocument may or may not result in changes to the DataSet, however: new XML data shows up as a row only if it corresponds to the tables and fields in the DataSet.
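Here is a minimal sketch of that round trip, assuming a hand-built DataSet in place of a real database query. The table name, column names, and values are hypothetical. It changes data through the DataSet, reads the result back through the XML view, and then writes the data out as XML.

```vb
Imports System
Imports System.Data
Imports System.Xml

Module SyncSketch
    Sub Main()
        ' Hypothetical data standing in for a database query result.
        Dim ds As New DataSet("library")
        Dim books As DataTable = ds.Tables.Add("book")
        books.Columns.Add("title", GetType(String))
        books.Columns.Add("price", GetType(Decimal))
        books.Rows.Add(New Object() {"Sample A", 10D})

        ' Attach an XML view to the same data.
        Dim xmldoc As New XmlDataDocument(ds)

        ' Change the data through the DataSet...
        books.Rows(0)("title") = "Sample A, Revised"
        books.Rows.Add(New Object() {"Sample B", 12D})

        ' ...and read the result through the XML view.
        Dim node As XmlNode
        For Each node In xmldoc.GetElementsByTagName("title")
            Console.WriteLine(node.InnerText)
        Next

        ' Save the DataSet's contents as XML (here, to the console).
        ds.WriteXml(Console.Out)
    End Sub
End Module
```

In a page you would more likely pass WriteXml a file path, such as one returned by Server.MapPath, instead of Console.Out.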
Figure 11.11 shows the relationship between the two objects.
Figure 11.11. The relationship between the DataSet and the XmlDataDocument.
To see how the two objects work together, you'll use your books.xml file, loading its data into a DataSet through an XmlDataDocument and outputting the data in two different ways. Listing 11.12 shows the code declaration block using a DataSet and an XmlDataDocument, followed by the HTML portion of the page, which simply contains a Label and two DataGrids.
Listing 11.12 Using DataSets and XmlDataDocuments to Display Data
1:  <%@ Page Language="VB" %>
2:  <%@ Import Namespace="System.Xml" %>
3:  <%@ Import Namespace="System.Data" %>
4:  <%@ Import Namespace="System.Data.OleDb" %>
5:
6:  <script runat="server">
7:     private i, j as integer
8:     private strOutput as string = ""
9:
10:    sub Page_Load(Sender as Object, e as EventArgs)
11:       dim xmldoc as new XmlDataDocument()
12:
13:       try
14:          xmldoc.DataSet.ReadXml(Server.MapPath("books.xml"))
15:
16:          'select data view and bind to server control
17:          DataGrid1.DataSource = xmldoc.DataSet
18:          DataGrid1.DataMember = xmldoc.DataSet.Tables(0). _
19:             TableName
20:          DataGrid2.DataSource = xmldoc.DataSet
21:          DataGrid2.DataMember = xmldoc.DataSet.Tables(1). _
22:             TableName
23:
24:          DataGrid1.DataBind()
25:          DataGrid2.DataBind()
26:
27:          For i = 0 To xmldoc.DataSet.Tables.Count - 1
28:             strOutput += "TableName = """ & _
29:                xmldoc.DataSet.Tables(i).TableName & """<br>"
30:             strOutput += " " & "Columns " & _
31:                "count = " & xmldoc.DataSet.Tables(i). _
32:                Columns.Count.ToString() & "<br>"
33:
34:             For j = 0 To xmldoc.DataSet.Tables(i). _
35:                Columns.Count - 1
36:                strOutput += " " & _
37:                   "ColumnName = """ & xmldoc.DataSet. _
38:                   Tables(i).Columns(j).ColumnName & """, " & _
39:                   "type = " & xmldoc.DataSet.Tables(i). _
40:                   Columns(j).DataType.ToString() & "<br>"
41:             Next
42:          Next
43:
44:          strOutput += "<p>"
45:
46:       catch ex as Exception
47:          strOutput = "Error accessing XML file"
48:       end try
49:
50:       output.Text = strOutput
51:    end sub
52: </script>
53: <html><body>
54:    <asp:Label id="output" runat="server" />
55:
56:    <asp:DataGrid id="DataGrid1" runat="server"
57:       BorderColor="black"
58:       GridLines="Vertical"
59:       cellpadding="4"
60:       cellspacing="0"
61:       width="450"
62:       Font-Name="Arial"
63:       Font-Size="8pt"
64:       HeaderStyle-BackColor="#cccc99"
65:       FooterStyle-BackColor="#cccc99"
66:       ItemStyle-BackColor="#ffffff"
67:       AlternatingItemStyle-Backcolor="#cccccc" />
68:    <p>
69:    <asp:DataGrid id="DataGrid2" runat="server"
70:       BorderColor="black"
71:       GridLines="Vertical"
72:       cellpadding="4"
73:       cellspacing="0"
74:       width="450"
75:       Font-Name="Arial"
76:       Font-Size="8pt"
77:       HeaderStyle-BackColor="#cccc99"
78:       FooterStyle-BackColor="#cccc99"
79:       ItemStyle-BackColor="#ffffff"
80:       AlternatingItemStyle-Backcolor="#cccccc" />
81: </body></html>
This listing starts off by creating an XmlDataDocument. Rather than loading this object directly with data, you use the ReadXml method of the DataSet property to read in the data, as shown on line 14. This method creates a relational view of the data automatically, as you'll see in a moment.
The code in Listing 11.12 will work only if the elements in your XML file all follow the same format. For example, if you ran Listing 11.11 as is, you'll find a new <book> element near the bottom of books.xml:
<book style="hardcover" xmlns="" />
Even though this element is well-formed by our previous definition (it will even pass validation), it doesn't follow the same format as the other <book> elements in books.xml. The DataSet in Listing 11.12 will get confused, treating it as a different kind of element that happens to share the same name, and building the DataSet will fail with an error.
To remedy this problem, simply remove the offending <book> element.
Let's skip to line 27 for a moment. You loop through the tables in the DataSet, outputting the name of each table and its column count on lines 27–32. The second for loop, on lines 34–41, outputs the names of the columns and their data types, as represented in the DataSet. The DataSet knows which kind of data each field represents because it infers the schema from the data structure. Together, these two loops print a representation of the relational XML data.
However, since you're using a DataSet, there's an easier way to do this. On lines 17–22, you simply bind the data to two different DataGrid controls defined in your page (you'll find out why you're using two DataGrids in a moment). Figure 11.12 shows the output of this listing.
Figure 11.12. Viewing relational XML data with an XmlDataDocument and DataSet.
Wait a minute… this figure shows two tables, but you only have a single XML file. What happened?
The .NET Framework read the XML data and saw that it could be represented relationally. Specifically, it separated the author information and placed it into another table. It also automatically generated a key column to link the two tables! In the two DataGrids, you can see the data represented in a more traditional format.
How exactly did the DataSet determine the data structure without a schema? It's very simple, actually:
Any elements with attributes become tables.
Any elements that contain other elements become tables.
If there are two or more elements with the same name, they become a table.
All direct children of the root node become tables.
Everything else becomes a column. Then, if any data in the XML file matches the columns, it's added as a row to the DataSet.
These rules are why Listing 11.12 would generate an error, as mentioned in the previous note. The DataSet will try to make two different tables (one for the regular <book> elements and one for the new, offending element), and because they both have the same name, you'll receive an error.
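To watch the inference rules at work, you can feed a small document to a plain DataSet and print what it builds. The following console sketch does that. The XML, its element names, and the module name are invented for illustration and only loosely resemble books.xml.

```vb
Imports System
Imports System.Data
Imports System.IO

Module InferenceSketch
    Sub Main()
        ' Hypothetical document: <book> repeats and carries an attribute,
        ' and <author> contains child elements.
        Dim xml As String = _
            "<bookstore>" & _
            "<book genre=""novel"">" & _
            "<title>Sample A</title>" & _
            "<author><first>Ann</first><last>Lee</last></author>" & _
            "</book>" & _
            "<book genre=""essay"">" & _
            "<title>Sample B</title>" & _
            "<author><first>Raj</first><last>Patel</last></author>" & _
            "</book>" & _
            "</bookstore>"

        ' No schema is supplied, so ReadXml infers one from the data.
        Dim ds As New DataSet()
        ds.ReadXml(New StringReader(xml))

        ' List every inferred table and its columns.
        Dim table As DataTable
        Dim column As DataColumn
        For Each table In ds.Tables
            Console.WriteLine("Table: " & table.TableName)
            For Each column In table.Columns
                Console.WriteLine("   Column: " & column.ColumnName & _
                    " (" & column.DataType.ToString() & ")")
            Next
        Next

        ' List the relations that link the inferred tables.
        Dim relation As DataRelation
        For Each relation In ds.Relations
            Console.WriteLine("Relation: " & _
                relation.ParentTable.TableName & " -> " & _
                relation.ChildTable.TableName)
        Next
    End Sub
End Module
```

With data shaped like this, the output should show separate book and author tables joined by a generated key column, which is the same split you saw in the two DataGrids.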
Is this truly relational? What happens when you add another book node in the XML file with the same author as an existing node? Figure 11.13 shows this situation using the DataGrids.
Figure 11.13. Adding more relational data.
Uh-oh, that didn't turn out right. The DataSet failed to detect that you already have an author named Barbara Kingsolver, and it added a new entry to the author table. Trying to create a primary key over the first name and last name columns will now result in an error, because the two rows hold the same values. Unfortunately, there's no easy way to resolve this. You'd have to manually update the foreign key and delete the extra row in the author table.
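Before cleaning up by hand, it can help to find the repeated authors first. The sketch below is one possible approach, not a built-in feature: it scans an author table and reports any first name and last name pair it has already seen. The table is built in code, and its column names are assumptions, so substitute the names from your own DataSet.

```vb
Imports System
Imports System.Collections
Imports System.Data

Module DuplicateAuthorSketch
    Sub Main()
        ' Hypothetical author table; the column names are assumptions.
        Dim authors As New DataTable("author")
        authors.Columns.Add("first-name", GetType(String))
        authors.Columns.Add("last-name", GetType(String))
        authors.Columns.Add("book_Id", GetType(Integer))
        authors.Rows.Add(New Object() {"Barbara", "Kingsolver", 0})
        authors.Rows.Add(New Object() {"Ann", "Lee", 1})
        authors.Rows.Add(New Object() {"Barbara", "Kingsolver", 2})

        ' Remember each name pair the first time it appears.
        Dim seen As New Hashtable()
        Dim row As DataRow
        For Each row In authors.Rows
            Dim key As String = CStr(row("first-name")) & "|" & _
                CStr(row("last-name"))
            If seen.ContainsKey(key) Then
                ' This pair already exists, so this row is a duplicate.
                Console.WriteLine("Duplicate author: " & _
                    key.Replace("|", " "))
            Else
                seen.Add(key, row)
            End If
        Next
    End Sub
End Module
```

This only reports the duplicates. Repointing the key values and deleting the extra rows is still up to you.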
Regardless of this limitation, being able to represent data in a DataSet or with XML is a powerful feature. Relational information can be created in XML and transferred to a DataSet for storage in a database, or the other way around for transportation as an XML file.
As you've been learning over the past few days, the relational model is a very common and efficient way to represent data, and XML is certainly no slouch in this area. The XmlDataDocument, coupled with the DataSet, can access and manipulate relational data from any source.