Using The DOM In .NET | Pro Visual C++ 2005 for C# Developers

The DOM implementation in .NET supports the W3C DOM Level 1 and Core DOM Level 2 specifications. The DOM is implemented through the XmlNode class, which is an abstract class that represents a node of an XML document.

There is also an XmlNodeList class, which is an ordered list of nodes. This is a live list of nodes, and any changes to any node are immediately reflected in the list. XmlNodeList supports indexed access or iterative access. Another abstract class, XmlCharacterData, extends XmlLinkedNode and provides text manipulation methods for other classes.
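The two access styles, and the "live" behavior of the list, can be sketched as follows. This is only an illustration: the NodeListDemo class, its method names, and the inline sample XML are hypothetical and are not part of the chapter's download code. The sketch uses the ChildNodes property as its XmlNodeList.

```csharp
using System;
using System.Xml;

public static class NodeListDemo
{
    // Hypothetical sample document, used only for this illustration.
    public const string SampleXml =
        "<bookstore><book><title>A</title></book><book><title>B</title></book></bookstore>";

    // Returns the child count before and after appending a node,
    // read both times through the same XmlNodeList instance.
    public static int[] CountBeforeAndAfterAppend()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);

        XmlNodeList books = doc.DocumentElement.ChildNodes;
        int before = books.Count;

        doc.DocumentElement.AppendChild(doc.CreateElement("book"));
        int after = books.Count; // same list object, re-read after the change

        return new int[] { before, after };
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        XmlNodeList books = doc.DocumentElement.ChildNodes;

        // Indexed access
        for (int i = 0; i < books.Count; i++)
            Console.WriteLine(books[i].InnerText);

        // Iterative access
        foreach (XmlNode book in books)
            Console.WriteLine(book.Name);

        int[] counts = CountBeforeAndAfterAppend();
        Console.WriteLine(counts[0] + " -> " + counts[1]);
    }
}
```

Because the list is not a snapshot, there is no need to fetch it again after the tree changes.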

The XmlNode and XmlNodeList classes make up the core of the DOM implementation in the .NET Framework. The following table lists some of the classes that are based on XmlNode.

Class Name	Description
XmlLinkedNode	Gets the node immediately before or after the current node. Overrides the NextSibling and PreviousSibling properties of XmlNode.
XmlDocument	Represents the entire document. Implements the DOM Level 1 and Level 2 specifications.
XmlDocumentFragment	Represents a fragment of the document tree.
XmlAttribute	Represents an attribute object of an XmlElement object.
XmlEntity	Represents a parsed or unparsed entity node.
XmlNotation	Contains a notation declared in a DTD or schema.

The following table lists classes that extend XmlCharacterData.

Class Name	Description
XmlCDataSection	Represents a CData section of a document.
XmlComment	Represents an XML comment object.
XmlSignificantWhitespace	Represents a node with whitespace. Nodes are created only if the PreserveWhitespace flag is true.
XmlWhitespace	Represents whitespace in element content. Nodes are created only if the PreserveWhitespace flag is true.
XmlText	Represents the textual content of an element or attribute.
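The effect of the whitespace flag mentioned in the table can be checked with a small sketch like the one below. The WhitespaceDemo class, its CountWhitespaceChildren() method, and the inline XML are hypothetical names made up for this illustration. The flag is a property of XmlDocument and has to be set before the document is loaded.

```csharp
using System;
using System.Xml;

public static class WhitespaceDemo
{
    // Counts the direct children of the root element that are whitespace nodes.
    public static int CountWhitespaceChildren(bool preserve)
    {
        XmlDocument doc = new XmlDocument();
        doc.PreserveWhitespace = preserve; // set before loading
        doc.LoadXml("<root>\n  <item>a</item>\n  <item>b</item>\n</root>");

        int count = 0;
        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
        {
            if (child is XmlWhitespace || child is XmlSignificantWhitespace)
                count++;
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine("preserve=false: " + CountWhitespaceChildren(false));
        Console.WriteLine("preserve=true:  " + CountWhitespaceChildren(true));
    }
}
```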

The following table lists classes that extend XmlLinkedNode.

Class Name	Description
XmlDeclaration	Represents the declaration node (<?xml version='1.0'...?>).
XmlDocumentType	Represents data relating to the document type declaration.
XmlElement	Represents an XML element object.
XmlEntityReference	Represents an entity reference node.
XmlProcessingInstruction	Contains an XML processing instruction.

As you can see, .NET makes available a class to fit just about any XML type that you might encounter. Because of this, you end up with a very flexible and powerful tool set. This section won't look at every class in detail, but you will see several examples to give you an idea of what you can accomplish. Figure 21-1 illustrates what the inheritance diagram looks like.

Figure 21-1

Using the XmlDocument Class

XmlDocument and its derived class XmlDataDocument (discussed later in this chapter) are the classes that you will be using to represent the DOM in .NET. Unlike XmlReader and XmlWriter, XmlDocument gives you read and write capabilities as well as random access to the DOM tree. XmlDocument resembles the DOM implementation in MSXML. If you have experience programming with MSXML, you will feel comfortable using XmlDocument.

This section introduces an example that creates an XmlDocument object, loads a document from disk, and loads a list box with data from the title elements. This is similar to one of the examples that you constructed in the XmlReader section. The difference here is that you will be selecting the nodes you want to work with, instead of going through the entire document as in the XmlReader-based example.

Here is the code. Notice how simple it looks in comparison to the XmlReader example (you can find the file in the DOMSample1 folder of the download):

private void button1_Click(object sender, System.EventArgs e)
{
   //doc is declared at the module level
   //change path to match your path structure
   doc.Load("books.xml");
   //get only the nodes that we want
   XmlNodeList nodeLst = doc.GetElementsByTagName("title");
   //iterate through the XmlNodeList
   foreach (XmlNode node in nodeLst)
      listBox1.Items.Add(node.InnerText);
}

Note that you also add the following declaration at the module level for the examples in this section:

 private XmlDocument doc=new XmlDocument();

If this is all that you wanted to do, using the XmlReader would have been a much more efficient way to load the list box, because you just go through the document once and then you are finished with it. This is exactly the type of work that XmlReader was designed for. However, if you wanted to revisit a node, using XmlDocument is a better way. Extend the previous example by adding another event handler:
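For comparison, a single forward pass with XmlReader might look like the following sketch. The ReaderTitles class, its ReadTitles() method, and the inline sample XML are hypothetical names made up for this illustration; a string is used instead of a file so that the code is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderTitles
{
    // Collects the text of every title element in one pass over the input.
    public static List<string> ReadTitles(string xml)
    {
        List<string> titles = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
                    titles.Add(reader.ReadString()); // reads the element's text content
            }
        }
        return titles;
    }

    public static void Main()
    {
        // Hypothetical sample data
        string xml = "<bookstore><book><title>A</title></book>" +
                     "<book><title>B</title></book></bookstore>";
        foreach (string title in ReadTitles(xml))
            Console.WriteLine(title);
    }
}
```

Once the loop finishes, nothing of the document remains in memory, which is why this approach cannot revisit a node.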

private void listBox1_SelectedIndexChanged(object sender, EventArgs e)
{
   //create XPath search string from the selected title
   string srch = "bookstore/book[title='" + listBox1.SelectedItem.ToString() + "']";
   //look for the extra data
   XmlNode foundNode = doc.SelectSingleNode(srch);
   if (foundNode != null)
      MessageBox.Show(foundNode.OuterXml);
   else
      MessageBox.Show("Not found");
}

In this example, you load the list box with the titles from the books.xml document, as in the previous example. When you click on the list box, it triggers the SelectedIndexChanged() event handler. In this case, you take the text of the selected item in the list box (the book title), create an XPath statement, and pass it to the SelectSingleNode() method of the doc object. This returns the book element that the title is part of (foundNode). Then you display the OuterXml of the node in a message box. You can keep clicking items in the list box as many times as you want, because the document is loaded and stays loaded until you release it.

A quick comment regarding the SelectSingleNode() method: This is an XPath implementation in the XmlDocument class. Both the SelectSingleNode() and SelectNodes() methods are defined in XmlNode, which XmlDocument is based on. SelectSingleNode() returns an XmlNode and SelectNodes() returns an XmlNodeList. However, the System.Xml.XPath namespace contains a richer XPath implementation, and you look at that in a later section.
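The difference between the two methods can be sketched outside a Windows Forms application as follows. The XPathSelectDemo class, its method names, and the inline sample data are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Xml;

public static class XPathSelectDemo
{
    // Hypothetical sample data with the same shape as the XPath used in this section.
    public const string SampleXml =
        "<bookstore>" +
        "<book genre='novel'><title>First</title><price>8.99</price></book>" +
        "<book genre='novel'><title>Second</title><price>11.99</price></book>" +
        "</bookstore>";

    public static XmlDocument LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc;
    }

    // SelectSingleNode() returns the first match, or null if there is none.
    public static string FindPrice(XmlDocument doc, string title)
    {
        XmlNode book = doc.SelectSingleNode("bookstore/book[title='" + title + "']");
        if (book == null)
            return null;
        return book.SelectSingleNode("price").InnerText;
    }

    // SelectNodes() returns every match as an XmlNodeList.
    public static int CountBooks(XmlDocument doc)
    {
        return doc.SelectNodes("bookstore/book").Count;
    }

    public static void Main()
    {
        XmlDocument doc = LoadSample();
        Console.WriteLine(FindPrice(doc, "Second"));
        Console.WriteLine(CountBooks(doc));
    }
}
```

Note that building the XPath string by concatenation, as this sketch and the event handler both do, produces an invalid expression if the title itself contains an apostrophe.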

Inserting nodes

Earlier, you looked at an example using XmlTextWriter that created a new document. The limitation was that it could not insert a node into an existing document. With the XmlDocument class you can do just that. Change the button1_Click() event handler from the last example to the following (DOMSample3 in the download code):

private void button1_Click(object sender, System.EventArgs e)
{
   //change path to match your structure
   doc.Load("..\\..\\..\\books.xml");

   //create a new 'book' element
   XmlElement newBook=doc.CreateElement("book");
   //set some attributes
   newBook.SetAttribute("genre","Mystery");
   newBook.SetAttribute("publicationdate","2001");
   newBook.SetAttribute("ISBN","123456789");

   //create a new 'title' element
   XmlElement newTitle=doc.CreateElement("title");
   newTitle.InnerText="The Case of the Missing Cookie";
   newBook.AppendChild(newTitle);

   //create new author element
   XmlElement newAuthor=doc.CreateElement("author");
   newBook.AppendChild(newAuthor);

   //create new name element
   XmlElement newName=doc.CreateElement("name");
   newName.InnerText="C. Monster";
   newAuthor.AppendChild(newName);

   //create new price element
   XmlElement newPrice=doc.CreateElement("price");
   newPrice.InnerText="9.95";
   newBook.AppendChild(newPrice);

   //add to the current document
   doc.DocumentElement.AppendChild(newBook);

   //write out the doc to disk
   XmlTextWriter tr=new XmlTextWriter("..\\..\\..\\booksEdit.xml",null);
   tr.Formatting=Formatting.Indented;
   doc.WriteContentTo(tr);
   tr.Close();

   //load listBox1 with all of the titles, including new one
   XmlNodeList nodeLst=doc.GetElementsByTagName("title");
   foreach(XmlNode node in nodeLst)
      listBox1.Items.Add(node.InnerText);
}

After executing this code, you end up with the same functionality as in the previous example, but there is one additional book in the list box, The Case of the Missing Cookie (a soon-to-be classic). Clicking on the cookie caper title will show all of the same info as the other titles. Breaking down the code, you can see that this is actually a fairly simple process. The first thing that you do is create a new book element:

XmlElement newBook = doc.CreateElement("book");

CreateElement() has three overloads that allow you to specify the following:

The element name
The name and namespace URI
The prefix, localname, and namespace
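As a rough sketch, the three overloads can be called like this. The CreateElementDemo class and the bk prefix and urn:example-books namespace URI are hypothetical values chosen for illustration; the books.xml sample in this section does not use namespaces.

```csharp
using System;
using System.Xml;

public static class CreateElementDemo
{
    // Hypothetical namespace URI, used only for this illustration.
    public const string Ns = "urn:example-books";

    public static XmlElement[] CreateAll(XmlDocument doc)
    {
        // 1. The element name only
        XmlElement plain = doc.CreateElement("book");

        // 2. The (qualified) name and the namespace URI
        XmlElement qualified = doc.CreateElement("bk:book", Ns);

        // 3. The prefix, local name, and namespace URI as separate arguments
        XmlElement explicitParts = doc.CreateElement("bk", "book", Ns);

        return new XmlElement[] { plain, qualified, explicitParts };
    }

    public static void Main()
    {
        foreach (XmlElement el in CreateAll(new XmlDocument()))
            Console.WriteLine(el.Name + " | " + el.LocalName + " | " + el.NamespaceURI);
    }
}
```

None of these calls attaches the element to the tree; that still requires AppendChild() or one of the other insertion methods.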

Once the element is created you need to add attributes:

newBook.SetAttribute("genre","Mystery");
newBook.SetAttribute("publicationdate","2001");
newBook.SetAttribute("ISBN","123456789");

Now that you have the attributes created, you need to add the other elements of a book:

XmlElement newTitle = doc.CreateElement("title");
newTitle.InnerText = "The Case of the Missing Cookie";
newBook.AppendChild(newTitle);

Once again, you create a new XmlElement-based object (newTitle). Then you set the InnerText property to the title of our new classic, and append the element as a child to the book element. You repeat this for the rest of the elements in this book element. Note that you add the name element as a child to the author element. This will give you the proper nesting relationship as in the other book elements.

Finally, you append the newBook element to the doc.DocumentElement node. This is the same level as all of the other book elements. You have now updated an existing document with a new element.

The last thing to do is to write the new XML document to disk. In this example, you create a new XmlTextWriter and pass it to the WriteContentTo() method. WriteContentTo() and WriteTo() both take an XmlWriter as a parameter (XmlTextWriter derives from XmlWriter). WriteContentTo() saves all of the children of the current node to the writer, whereas WriteTo() saves the current node itself along with its children. Because doc is an XmlDocument-based object, it represents the entire document, so its children make up the whole document and that is what is saved. You could also use the Save() method. It will always save the entire document. Save() has four overloads. You can specify a string with the file name and path, a Stream-based object, a TextWriter-based object, or an XmlWriter-based object.
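The distinction between the two write methods is easier to see on an element than on the document itself. The following is a minimal sketch; the WriteDemo class, its WriteNode() method, and the inline XML are hypothetical, and a StringWriter stands in for the file so that the result can be inspected.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WriteDemo
{
    // Serializes the sample 'book' element with WriteContentTo() or WriteTo().
    public static string WriteNode(bool contentOnly)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book genre='Mystery'><title>T</title></book>");
        XmlElement book = doc.DocumentElement;

        StringWriter sw = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(sw);
        if (contentOnly)
            book.WriteContentTo(writer); // children of 'book' only
        else
            book.WriteTo(writer);        // 'book' itself plus its children
        writer.Close();                  // flush the writer's buffers
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine("WriteContentTo: " + WriteNode(true));
        Console.WriteLine("WriteTo:        " + WriteNode(false));
    }
}
```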

You also call the Close() method on XmlTextWriter to flush the internal buffers and close the file.

Figure 21-2 shows what you get when you run this example. Notice the new entry at the bottom of the list.

Figure 21-2

If you wanted to create a document from scratch, you could use the XmlTextWriter, which you saw in action earlier in the chapter. You can also use XmlDocument. Why would you use one in preference to the other? If the data that you want streamed to XML is available and ready to write, then the XmlTextWriter class would be the best choice. However, if you need to build the XML document a little at a time, inserting nodes into various places, then creating the document with XmlDocument might be the better choice. You can accomplish this by changing the following line:

doc.Load("books.xml");

to this code (example DOMSample4):

//create the declaration section
XmlDeclaration newDec = doc.CreateXmlDeclaration("1.0",null,null);
doc.AppendChild(newDec);
//create the new root element
XmlElement newRoot = doc.CreateElement("newBookstore");
doc.AppendChild(newRoot);

First, you create a new XmlDeclaration. The parameters are the version (always 1.0 for now), the encoding, and the standalone flag. If the encoding parameter is not null, it should be set to the name of an encoding supported by the System.Text.Encoding class (null defaults to UTF-8). The standalone flag can be yes, no, or null. If it is null, the attribute is not used and will not be included in the document.
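To see how the encoding and standalone arguments affect the declaration, you could try something along these lines. The DeclarationDemo class and its DeclarationText() method are hypothetical names used only for this sketch.

```csharp
using System;
using System.Xml;

public static class DeclarationDemo
{
    // Builds a minimal document and returns the text of its XML declaration.
    public static string DeclarationText(string encoding, string standalone)
    {
        XmlDocument doc = new XmlDocument();
        XmlDeclaration dec = doc.CreateXmlDeclaration("1.0", encoding, standalone);
        doc.AppendChild(dec);
        doc.AppendChild(doc.CreateElement("newBookstore"));
        return dec.OuterXml;
    }

    public static void Main()
    {
        // Both optional attributes omitted
        Console.WriteLine(DeclarationText(null, null));
        // Both optional attributes supplied
        Console.WriteLine(DeclarationText("utf-8", "yes"));
    }
}
```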

The next element that is created will become the DocumentElement. In this case, it is called newBookstore so that you can see the difference. The rest of the code is the same as in the previous example and works in the same way. This is booksEdit.xml, which is generated from the code:

<?xml version="1.0"?>
<newBookstore>
   <book genre="Mystery" publicationdate="2001" ISBN="123456789">
      <title>The Case of the Missing Cookie</title>
      <author>
         <name>C. Monster</name>
      </author>
      <price>9.95</price>
   </book>
</newBookstore>

You will want to use the XmlDocument class when you need random access to the document, or the XmlReader-based classes when you want a streaming model instead. Remember that there is a cost for the flexibility of the XmlNode-based XmlDocument class: memory requirements are higher, and reading the document is slower than with XmlReader. There is another way to traverse an XML document: the XPathNavigator.