XML And ADO.NET | Pro Visual C++ 2005 for C# Developers

XML is the glue that binds ADO.NET to the rest of the world. ADO.NET was designed from the ground up to work within the XML environment. XML is used to transfer the data to and from the data store and the application or Web page. Because ADO.NET uses XML as the transport in remoting scenarios, data can be exchanged with applications and systems that are not even aware of ADO.NET. Because of the importance of XML in ADO.NET, there are some powerful features in ADO.NET that allow the reading and writing of XML documents. The System.Xml namespace also contains classes that can consume or utilize ADO.NET relational data.

Converting ADO.NET Data to XML

The first example uses ADO.NET, streams, and XML to pull some data from the Northwind database into a DataSet, load an XmlDocument object with the XML from the DataSet, and load the XML into a list box. To run the next few examples, you need to add the following using statements:

 using System.Data;  using System.Xml; using System.Data.SqlClient; using System.IO;

Because you will be using XmlDocument, you also need to add the following at the module level:

 private XmlDocument doc = new XmlDocument();

Also, the ADO.NET samples have a DataGrid object added to the forms. This will allow you to see the data in the ADO.NET DataSet because it is bound to the grid, as well as the data from the generated XML documents that you load in the list box. Here is the code for the first example:

 private void button1_Click(object sender, EventArgs e) { listBox1.SelectedIndexChanged += new EventHandler(example1_SelectedIndexChanged); //create a dataset DataSet ds = new DataSet("XMLAuthors"); //connect to the northwind database and  SqlConnection conn = new SqlConnection(_connectString); SqlDataAdapter da = new SqlDataAdapter("SELECT * FROM authors", conn);

After you create the SqlDataAdapter, da, and the DataSet, ds, you instantiate a MemoryStream object, a StreamReader object, and a StreamWriter object. The StreamReader and StreamWriter objects will use the MemoryStream to move the XML around:

 MemoryStream memStrm=new MemoryStream();  StreamReader strmRead=new StreamReader(memStrm);  StreamWriter strmWrite=new StreamWriter(memStrm);

You use a MemoryStream so that you don't have to write anything to disk; however, you could have used any object that was based on the Stream class, such as FileStream. Next, you fill the DataSet and bind it to the DataGrid. The data in the DataSet will now be displayed in the DataGrid:

 da.Fill(ds, "Authors");  //load data into grid  dataGridView1.DataSource = ds.Tables[0];

This next step is where the XML is generated. You call the WriteXml() method from the DataSet class. This method generates an XML document. WriteXml() has two overloads: one takes a string with the file path and name, and the other adds a mode parameter. This mode is an XmlWriteMode enumeration, with the following possible values:

IgnoreSchema
WriteSchema
DiffGram

IgnoreSchema is used if you don't want WriteXml() to write an inline schema at the start of your XML file; use the WriteSchema parameter if you do want one. You look at DiffGrams later in this section.

 ds.WriteXml(strmWrite,XmlWriteMode.IgnoreSchema); memStrm.Seek(0,SeekOrigin.Begin); //read from the memory stream to an XmlDocument object doc.Load(strmRead); XmlNodeList nodeLst = doc.SelectNodes("//au_lname");   //load them into the list box foreach(XmlNode nd in nodeLst) listBox1.Items.Add(nd.InnerText); } void example1_SelectedIndexChanged(object sender, EventArgs e) { //when you click on the listbox. //a message box with the XML of the selected node string srch = "XMLAuthors/Authors[au_lname=" + '"' +  listBox1.SelectedItem.ToString() + '"' + "]"; XmlNode foundNode = doc.SelectSingleNode(srch); if (foundNode != null) MessageBox.Show(foundNode.OuterXml); else MessageBox.Show("Not found"); }

Figure 21-5 shows the data in the list as well as the bound data grid.

image from book
Figure 21-5

If you had only wanted the schema, you could have called WriteXmlSchema() instead of WriteXml(). This method has four overloads. One takes a string, which is the path and file name of where to write the XML document. The second overload uses an object that is based on the XmlWriter class. The third overload uses an object based on the TextWriter class. The fourth overload is derived from the Stream class.

Also, if you wanted to persist the XML document to disk, you would have used something like this:

 string file = "c:\\test\\product.xml";  ds.WriteXml(file);

This would give you a well-formed XML document on disk that could be read in by another stream, or by DataSet, or used by another application or Web site. Because no XmlMode parameter is specified, this XmlDocument would have the schema included. In this example, you use the stream as a parameter to the XmlDocument.Load() method.

Once the XmlDocument is prepared, you load the list box using the same XPath statement that you used before. If you look closely, you'll see that you changed the listBox1_SelectedIndexChanged() event slightly. Instead of showing the InnerText of the element, you do another XPath search using SelectSingleNode() to get the authors last name element. So now every time you select an author in the list box, a MessageBox pops up with the xml of the element. You now have two views of the data, but more importantly, you can manipulate the data using two different models. You can use the System.Data namespace to use the data or you can use the System.Xml namespace on the data. This can lead to some very flexible designs in your applications, because now you are not tied to just one object model to program with. This is the real power to the ADO.NET and System.Xml combination. You have multiple views of the same data and multiple ways to access the data.

The following example simplifies the process by eliminating the three streams and by using some of the ADO capabilities built into the System.Xml namespace. You will need to change the module-level line of code

 private XmlDocument doc = new XmlDocument();

 private XmlDataDocument doc;

You need this because you are now using the XmlDataDocument. Here is the code (which you can find in the ADOSample2 folder):

 private void button3_Click(object sender, EventArgs e) { listBox1.SelectedIndexChanged += new EventHandler(example1_SelectedIndexChanged); //create a dataset DataSet ds = new DataSet("XMLAuthors"); //connect to the northwind database and  //select all of the rows from products table SqlConnection conn = new SqlConnection(_connectString); SqlDataAdapter da = new SqlDataAdapter("SELECT * FROM authors", conn); //fill the dataset da.Fill(ds, "Authors"); ds.WriteXml("sample.xml", XmlWriteMode.WriteSchema); //load data into grid dataGridView1.DataSource = ds.Tables[0]; doc = new XmlDataDocument(ds); //get all of the products elements XmlNodeList nodeLst = doc.GetElementsByTagName("au_lname"); //load them into the list box for (int ctr = 0; ctr < nodeLst.Count; ctr++) //foreach(XmlNode node in nodeLst) // listBox1.Items.Add(node.InnerText); listBox1.Items.Add(nodeLst[ctr].InnerText);

As you can see, the code to load the DataSet object into the XML document has been simplified. Instead of using the XmlDocument class, you are using the XmlDataDocument class. This class was built specifically for using data with a DataSet object.

The XmlDataDocument is based on the XmlDocument class, so it has all of the functionality that the XmlDocument class has. One of the main differences is the overloaded constructor that the XmlDataDocument has. Note the line of code that instantiates XmlDataDocument (doc):

doc = new XmlDataDocument(ds);

It passes in the DataSet object that you created, ds, as a parameter. This creates the XML document from the DataSet, and you don't have to use the Load() method. In fact, if you instantiate a new XmlDataDocument object without passing in a DataSet as the parameter, it will contain a DataSet with the name NewDataSet that has no DataTables in the tables collection. There is also a DataSet property that you can set after an XmlDataDocument-based object is created.

Suppose you add the following line of code after the DataSet.Fill() call:

 ds.WriteXml("c:\\test\\sample.xml", XmlWriteMode.WriteSchema);

In this case, the following XML file, sample.xml, is produced in the folder c:\test:

 ?xml version="1.0" standalone="yes"?> <XMLAuthors> <xs:schema  xmlns="" xmlns:xs=http://www.w3.org/2001/XMLSchema xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"> <xs:element name="XMLAuthors" msdata:IsDataSet="true" msdata:UseCurrentLocale="true"> <xs:complexType> <xs:choice minOccurs="0" maxOccurs="unbounded"> <xs:element name="Authors"> <xs:complexType> <xs:sequence> <xs:element name="au_id" type="xs:string" minOccurs="0" /> <xs:element name="au_lname" type="xs:string" minOccurs="0" /> <xs:element name="au_fname" type="xs:string" minOccurs="0" /> <xs:element name="phone" type="xs:string" minOccurs="0" /> <xs:element name="address" type="xs:string" minOccurs="0" /> <xs:element name="city" type="xs:string" minOccurs="0" /> <xs:element name="state" type="xs:string" minOccurs="0" /> <xs:element name="zip" type="xs:string" minOccurs="0" /> <xs:element name="contract" type="xs:boolean" minOccurs="0" /> </xs:sequence> </xs:complexType> </xs:element> </xs:choice> </xs:complexType> </xs:element> </xs:schema> <Authors> <au_id>172-32-1176</au_id> <au_lname>White</au_lname> <au_fname>Johnson</au_fname> <phone>408 496-7223</phone> <address>10932 Bigge Rd.</address> <city>Menlo Park</city> <state>CA</state> <zip>94025</zip> <contract>true</contract> </Authors>

Only the first products element is shown. The actual XML file would contain all of the products in the Products table of Northwind database.

Converting relational data

This looks simple enough for a single table, but what about relational data, such as multiple DataTables and Relations in the DataSet? It all still works the same way. Make the following changes to the code that you've been using (this version is in ADOSample3):

 private void button5_Click(object sender, EventArgs e) { listBox1.SelectedIndexChanged += new EventHandler(example1_SelectedIndexChanged); //create a dataset DataSet ds = new DataSet("XMLAuthors"); //connect to the pubs database and  //select all of the rows from authors table and from titleauthors table //make sure you connect string matches you server configuration SqlConnection conn = new SqlConnection(_connectString); SqlDataAdapter daProd = new SqlDataAdapter("SELECT * FROM authors", conn); SqlDataAdapter daSup = new SqlDataAdapter("SELECT * FROM titleauthor", conn); //Fill DataSet from both SqlAdapters daProd.Fill(ds, "Authors"); daSup.Fill(ds, "Titles"); //Add the relation ds.Relations.Add(ds.Tables["Authors"].Columns["au_id"], ds.Tables["Titles"].Columns["au_id"]); //Write the Xml to a file so we can look at it later ds.WriteXml("AuthorTitle.xml", XmlWriteMode.WriteSchema); //load data into grid dataGridView1.DataSource = ds.Tables[0]; //create the XmlDataDocument doc = new XmlDataDocument(ds); //Select the productname elements and load them in the grid XmlNodeList nodeLst = doc.SelectNodes("//au_lname"); foreach (XmlNode nd in nodeLst) listBox1.Items.Add(nd.InnerXml); }

In this sample you are creating two DataTables in the XMLAuthors DataSet: Authors and Titles. You create a new relation on the au_id column in both tables.

By making the same WriteXml() method call that you did in the previous example, you will get the following XML file (SuppProd.xml):

 <?xml version="1.0" standalone="yes"?> <XMLAuthors> <xs:schema  xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"> <xs:element name="XMLAuthors" msdata:IsDataSet="true" msdata:UseCurrentLocale="true"> <xs:complexType> <xs:choice minOccurs="0" maxOccurs="unbounded"> <xs:element name="Authors"> <xs:complexType> <xs:sequence> <xs:element name="au_id" type="xs:string" minOccurs="0" /> <xs:element name="au_lname" type="xs:string" minOccurs="0" /> <xs:element name="au_fname" type="xs:string" minOccurs="0" /> <xs:element name="phone" type="xs:string" minOccurs="0" /> <xs:element name="address" type="xs:string" minOccurs="0" /> <xs:element name="city" type="xs:string" minOccurs="0" /> <xs:element name="state" type="xs:string" minOccurs="0" /> <xs:element name="zip" type="xs:string" minOccurs="0" /> <xs:element name="contract" type="xs:boolean" minOccurs="0" /> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="Titles"> <xs:complexType> <xs:sequence> <xs:element name="au_id" type="xs:string" minOccurs="0" /> <xs:element name="title_id" type="xs:string" minOccurs="0" /> <xs:element name="au_ord" type="xs:unsignedByte" minOccurs="0" /> <xs:element name="royaltyper" type="xs:int" minOccurs="0" /> </xs:sequence> </xs:complexType> </xs:element> </xs:choice> </xs:complexType> <xs:unique name="Constraint1"> <xs:selector xpath=".//Authors" /> <xs:field xpath="au_id" /> </xs:unique> <xs:keyref name="Relation1" refer="Constraint1"> <xs:selector xpath=".//Titles" /> <xs:field xpath="au_id" /> </xs:keyref> </xs:element> </xs:schema> <Authors> <au_id>213-46-8915</au_id> <au_lname>Green</au_lname> <au_fname>Marjorie</au_fname> <phone>415 986-7020</phone> <address>309 63rd St. #411</address> <city>Oakland</city> <state>CA</state> <zip>94618</zip> <contract>true</contract> </Authors> <Titles> <au_id>213-46-8915</au_id> <title_id>BU1032</title_id> <au_ord>2</au_ord> <royaltyper>40</royaltyper> </Titles> <Titles> <au_id>213-46-8915</au_id> <title_id>BU2075</title_id> <au_ord>1</au_ord> <royaltyper>100</royaltyper> </Titles> </XMLAuthors>

The schema includes both DataTables that were in the DataSet. In addition, the data includes all of the data from both tables. For the sake of brevity, only the first author and title records are shown here. As before, you could have saved just the schema or just the data by passing in the correct XmlWriteMode parameter.

Converting XML to ADO.NET Data

Suppose that you have an XML document that you would like to get into an ADO.NET DataSet. You would want to do this so you could load the XML into a database, or perhaps bind the data to a .NET data control such as DataGrid. This way, you could actually use the XML document as your data store and eliminate the overhead of the database altogether. If your data is reasonably small in size, this is an attractive possibility. Here is some code to get you started (ADOSample5):

private void button1_Click(object sender, System.EventArgs e) {    //create a new DataSet    DataSet ds=new DataSet("XMLAuthors"); //read in the XML document to the Dataset ds.ReadXml("sample.xml");    //load data into grid    dataGridView1.DataSource = ds;       //create the new XmlDataDocument    Doc = new XmlDataDocument(ds);    //load the product names into the listbox    XmlNodeList nodeLst=doc.SelectNodes("//au_lname");    foreach(XmlNode nd in nodeLst)       listBox1.Items.Add(nd.InnerXml); }

It is that easy. You instantiate a new DataSet object. Then you call the ReadXml() method, and you have XML in a DataTable in your DataSet. As with the WriteXml() methods, ReadXml() has an XmlReadMode parameter. ReadXml() has a few more options in the XmlReadMode, as shown in the following table.

Value	Description
Auto	Sets the XmlReadMode to the most appropriate setting. If data is in DiffGram format, DiffGram is selected. If a schema has already been read, or an inline schema is detected, then ReadSchema is selected. If no schema has been assigned to the DataSet, and none is detected inline, then IgnoreSchema is selected.
DiffGram	Reads in the DiffGram and applies the changes to the DataSet. DiffGrams are described later in the chapter.
Fragment	Reads documents that contain XDR schema fragments, such as the type cre- ated by SQL Server.
IgnoreSchema	Ignores any inline schema that may be found. Reads data into the current DataSet schema. If data does not match DataSet schema it is discarded.
InferSchema	Ignores any inline schema. Creates the schema based on data in the XML doc- ument. If a schema exists in the DataSet, that schema is used, and extended with additional columns and tables if needed. An exception is thrown if a col- umn exists but is of a different data type.
ReadSchema	Reads the inline schema and loads the data. Will not overwrite a schema in the DataSet, but will throw an exception if a table in the inline schema already exists in the DataSet.

There is also the ReadXmlSchema() method. This reads in a standalone schema and creates the tables, columns, and relations. You would use this if your schema is not inline with your data. ReadXmlSchema() has the same four overloads: a string with file and path name, a Stream-based object, a TextReader-based object, and an XmlReader-based object.

To show that the data tables are getting created properly, load the XML document that contains the Products and Suppliers tables that you used in an earlier example. This time, however, load the list box with the DataTable names and the DataColumn names and data types. You can look at this and compare it to the original Northwind database to see that all is well. Here is the code for this example (ADOSample5):

 private void button7_Click(object sender, EventArgs e) { //create the DataSet DataSet ds = new DataSet("XMLAuthors"); //read in the xml document ds.ReadXml("AuthorTitle.xml"); //load data into grid dataGridView1.DataSource = ds.Tables[0]; //load the listbox with table, column and datatype info foreach (DataTable dt in ds.Tables) { listBox1.Items.Add(dt.TableName); foreach (DataColumn col in dt.Columns) { listBox1.Items.Add ('\t' + col.ColumnName + " - " + } } }

Note the addition of the two foreach loops. The first loop gets the table name from each table in the Tables collection of the DataSet. Inside the inner foreach loop you get the name and data type of each column in the DataTable. You load this data into the list box, allowing you to display it. Figure 21-6 shows the output.

image from book
Figure 21-6

Looking at the list box you can check that the data tables were created with the columns all having the correct names and data types.

Something else you might want to note is that because the previous two examples didn't transfer any data to or from a database, no SqlDataAdapter or SqlConnection was defined. This shows the real flexibility of both the System.Xml namespace and ADO.NET: you can look at the same data in multiple formats. If you need to do a transform and show the data in HTML format, or if you need to bind the data to a grid, you can take the same data and, with just a method call, have it in the required format.

Reading and Writing a DiffGram

A DiffGram is an XML document that contains the before and after data of an edit session. This can include any combination of data changes, additions, and deletions. A DiffGram can be used as an audit trail or for a commit/rollback process. Most DBMS systems today have this built in, but if you happen to be working with a DBMS that does not have these features or if XML is your data store and you do not have a DBMS, you can implement commit or rollback features yourself.

The beginning part of this code should look familiar. You define and set up a new DataSet, ds, a new SqlConnection, conn, and a new SqlDataAdapter, da. You connect to the database, select all of the rows from the authors table, create a new DataTable named authors, and load the data from the database into the DataSet:

 private void button1_Click(object sender, EventArgs e) { string connectString = "data source=(local)\\sqlexpress;initial catalog=pubs;integrated security=SSPI;"; //create a dataset DataSet ds = new DataSet("XMLAuthors"); //connect to the pubs database and  //select all of the rows from products table SqlConnection conn = new SqlConnection(connectString); SqlDataAdapter da = new SqlDataAdapter("SELECT * FROM authors", conn); da.Fill(ds, "Authors");

In this next code block, you do two things. First, you modify the au_lame column in the first row to NewName. Second, you create a new row in the DataTable, set the column values, and finally add the new data row to the DataTable:

 ds.Tables[0].Rows[0]["au_lname"] = "NewName"; DataRow dr = ds.Tables[0].NewRow(); dr["au_id"] = "123-123-1234"; dr["au_lname"] = "Bunny"; dr["au_fname"] = "Bugs"; dr["contract"]= true;

The next block is the interesting part of the code. First, you write out the schema with WriteXmlSchema(). This is important because you cannot read back in a DiffGram without the schema. WriteXml() with the XmlWriteMode.DiffGram parameter passed to it actually creates the DiffGram. The next line accepts the changes that you made. It is important that the DiffGram is created before calling AcceptChanges(); otherwise there would not appear to be any modifications to the data:

 ds.Tables[0].Rows.Add(dr); ds.WriteXmlSchema("author.xdr"); ds.WriteXml("authdiff.xml", XmlWriteMode.DiffGram); ds.AcceptChanges(); }

To get the data back into a DataSet, you can do the following:

 DataSet dsNew=new DataSet(); dsNew.ReadXmlSchema("author.xsd"); dsNew.XmlRead("authdiff.xml",XmlReadMode.DiffGram);

Here you are creating a new DataSet, dsNew. The call to the ReadXmlSchema() method creates a new DataTable based on the schema information. In this case it would be a clone of the products DataTable. Now you can read in the DiffGram. The DiffGram does not contain schema information, so it is important that the DataTable be created and ready before you call the ReadXml() method.

Here is a sample of what the DiffGram (diffgram.xml) looks like:

 ?xml version="1.0" standalone="yes"?> <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"  xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"> <XMLAuthors> <Authors diffgr: msdata:rowOrder="0" diffgr:hasChanges="modified"> <au_id>172-32-1176</au_id> <au_lname>NewName</au_lname> <au_fname>Johnson</au_fname> <phone>408 496-7223</phone> <address>10932 Bigge Rd.</address> <city>Menlo Park</city> <state>CA</state> <zip>94025</zip> <contract>true</contract> </Authors> <Authors diffgr: msdata:rowOrder="1"> <au_id>213-46-8915</au_id> <au_lname>Green</au_lname> <au_fname>Marjorie</au_fname> <phone>415 986-7020</phone> <address>309 63rd St. #411</address> <city>Oakland</city> <state>CA</state> <zip>94618</zip> <contract>true</contract> </Authors> <diffgr:before> <Authors diffgr: msdata:rowOrder="0"> <au_id>172-32-1176</au_id> <au_lname>White</au_lname> <au_fname>Johnson</au_fname> <phone>408 496-7223</phone> <address>10932 Bigge Rd.</address> <city>Menlo Park</city> <state>CA</state> <zip>94025</zip> <contract>true</contract> </Authors> </diffgr:before> </diffgr:diffgram>

Note that each DataTable row is repeated and that there is a diffgr:id attribute for each <Authors> element (only the first and last of the <Authors> elements is shown in order to save space). diffgr is the namespace prefix for urn:schemas-microsoft-com:xml-diffgram-v1. For rows that were modified or inserted, ADO.NET adds a diffgr:hasChanges attribute. There'salso a <diffgr:before> element after the <XMLAuthors> element, which contains an <Authors> element indicating the previous contents of any modified rows. Obviously the inserted row didn't have any previous contents, so this doesn't have an element in <diffgr:before>.

After the DiffGram has been read into the DataTable, it is in the state that it would be in after changes were made to the data but before AcceptChanges() is called. At this point, you can actually roll back changes by calling the RejectChanges() method. By looking at the DataRow.Item property and passing in either DataRowVersion.Original or DataRowVersion.Current, you can see the before and after values in the DataTable.

If you keep a series of DiffGrams, it is important that you are able to reapply them in the proper order. You probably would not want to try to roll back changes for more than a couple of iterations. You could, however, use the DiffGrams as a form of logging or for auditing purposes if the DBMS that is being used does not offer these facilities.