XML is used internally in the .NET Framework to pass all types of data around. From ADO.NET to Web Services, XML is the data format of choice for shuffling information to and from Framework services. Fortunately, we aren't reprogramming the entire Framework, so most of that isn't really important to our understanding of .NET development. But some Framework classes expose XML services right out there for the programmer to use.
Because XML is no fun to manage as a big chunk of text, .NET includes several classes that manage XML data. All of these tools appear in the System.Xml namespace and its subordinate namespaces.
The features included in each class tie pretty closely to the structure of the XML, XSD, and XSLT documents themselves. They include a whole lot of features that weren't covered previously, because there are gobs of ways to manipulate XML data.
The Basic XML Classes, Basically
The System.Xml namespace includes the most basic classes you will use to manage XML data. An XmlDocument object is the in-memory view of your actual XML document:
Dim myData As New System.Xml.XmlDocument
Your document is made up of declarations (that <?xml...?> thing at the top), data elements (all the specific tags in your document), attributes (inside of each starting element tag), and comments. These are represented by the XmlDeclaration, XmlElement, XmlAttribute, and XmlComment classes, respectively. Together, these four main units of your document are called nodes, represented generically by the XmlNode class. (The four specific classes all inherit from the more basic XmlNode class.) Usually, when you build an XML document by hand in memory, you use the individual classes like XmlElement. Later on, when you need to scan through an existing document, it is easier to use the generic XmlNode class.
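To make the "scan with XmlNode" idea concrete, here is a minimal sketch that walks a document using nothing but the generic XmlNode class. The DescribeNodes routine, the NodeWalker module, and the tiny inline document are made up for illustration; they are not part of the Framework or of the sample project. Attributes won't show up in the listing, because they live in each element's Attributes collection rather than in ChildNodes.

```vb
Imports System
Imports System.Text
Imports System.Xml

Module NodeWalker
    ' ----- Hypothetical helper: lists each node's type and name,
    '       indented two spaces for each level of depth.
    Public Function DescribeNodes(ByVal startNode As XmlNode, _
            ByVal depth As Integer) As String
        Dim result As New StringBuilder
        Dim child As XmlNode

        For Each child In startNode.ChildNodes
            result.Append(New String(" "c, depth * 2))
            result.Append(child.NodeType.ToString())
            result.Append(": ")
            result.AppendLine(child.Name)

            ' ----- Recurse into this node's own children.
            result.Append(DescribeNodes(child, depth + 1))
        Next child
        Return result.ToString()
    End Function

    Sub Main()
        ' ----- Illustrative content only.
        Dim sample As New XmlDocument
        sample.LoadXml("<?xml version=""1.0""?>" & _
            "<productList><!-- Items -->" & _
            "<supplier fullName=""Beverages R Us"">" & _
            "<product available=""Yes""/>" & _
            "</supplier></productList>")
        Console.Write(DescribeNodes(sample, 0))
    End Sub
End Module
```

Because an XmlDocument is itself an XmlNode, the same routine works whether you hand it the whole document or just one element within it.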
Let's build a subset of our sample XML product data.
<?xml version="1.0"?>
<productList>
   <!-- We currently sell these items. -->
   <supplier fullName="Beverages R Us">
      <product available="Yes">
         <productName>Chai</productName>
         <category>Beverages</category>
         <unitPrice>18.00</unitPrice>
      </product>
   </supplier>
</productList>
Declare all the variables you will use, and then use them.
' ----- Above: Imports System.Xml
Dim products As XmlDocument
Dim prodDeclare As XmlDeclaration
Dim rootSet As XmlElement
Dim supplier As XmlElement
Dim product As XmlElement
Dim productValue As XmlElement
Dim comment As XmlComment

' ----- Create the document with a valid declaration.
products = New XmlDocument
prodDeclare = products.CreateXmlDeclaration("1.0", _
   Nothing, String.Empty)
products.InsertBefore(prodDeclare, products.DocumentElement)

' ----- Create the root element, <productList>.
rootSet = products.CreateElement("productList")
products.InsertAfter(rootSet, prodDeclare)

' ----- Add a nice comment.
comment = products.CreateComment( _
   " We currently sell these items. ")
rootSet.AppendChild(comment)

' ----- Create the supplier element, <supplier>.
'       Include the attributes.
supplier = products.CreateElement("supplier")
supplier.SetAttribute("ID", "652")
supplier.SetAttribute("fullName", "Beverages R Us")
rootSet.AppendChild(supplier)

' ----- Create the product element, <product>, with the
'       subordinate data values.
product = products.CreateElement("product")
product.SetAttribute("ID", "1")
product.SetAttribute("available", "yes")
supplier.AppendChild(product)

productValue = products.CreateElement("productName")
productValue.InnerText = "Chai"
product.AppendChild(productValue)

productValue = products.CreateElement("category")
productValue.InnerText = "Beverages"
product.AppendChild(productValue)

productValue = products.CreateElement("unitPrice")
productValue.InnerText = "18.00"
product.AppendChild(productValue)
It really works, too. To prove it, put this code in the Click event of a button, and end it with the following line:
products.Save("c:\products.xml")
Run the program and view the c:\products.xml file to see the XML product data. There are many different ways to use the XML classes to create an XML document in memory. For instance, although I used the SetAttribute method to add attributes to the supplier and product nodes, I could have created separate attribute objects, and appended them on to these nodes, just like I did for the main elements.
Dim attrData As XmlAttribute

attrData = products.CreateAttribute("ID")
attrData.Value = "652"
supplier.SetAttributeNode(attrData)
So, this is nice and all, but what if you already have some XML in a file, and you just want to load it into an XmlDocument object? Simply use the XmlDocument object's Load method.
Dim products As XmlDocument

products = New XmlDocument
products.Load("c:\products.xml")
For those instances where you just want to read or write some XML from or to a file, and you don't care much about manipulating it in memory, the XmlTextReader and XmlTextWriter classes let you quickly read and write XML data via a text stream. But if you are going to do things with the XML data in your program, the Load and Save methods of the XmlDocument object are a better choice.
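Here is a rough sketch of the reader and writer classes at work. To keep it self-contained, it writes to and reads from an in-memory string instead of a file; both classes also have constructors that accept a file path or stream. The StreamedXml module and its WriteProduct and ReadProductName routines are hypothetical names invented for this example.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module StreamedXml
    ' ----- Hypothetical helper: writes a single <product>
    '       element as XML text.
    Public Function WriteProduct(ByVal productName As String, _
            ByVal unitPrice As String) As String
        Dim buffer As New StringWriter
        Dim xmlWrite As New XmlTextWriter(buffer)

        xmlWrite.WriteStartElement("product")
        xmlWrite.WriteAttributeString("available", "Yes")
        xmlWrite.WriteElementString("productName", productName)
        xmlWrite.WriteElementString("unitPrice", unitPrice)
        xmlWrite.WriteEndElement()
        xmlWrite.Close()
        Return buffer.ToString()
    End Function

    ' ----- Hypothetical helper: reads forward until it finds
    '       the first <productName> element.
    Public Function ReadProductName(ByVal xmlText As String) As String
        Dim xmlRead As New XmlTextReader(New StringReader(xmlText))
        Dim result As String = Nothing

        Do While xmlRead.Read()
            If (xmlRead.NodeType = XmlNodeType.Element) AndAlso _
                    (xmlRead.Name = "productName") Then
                result = xmlRead.ReadString()
                Exit Do
            End If
        Loop
        xmlRead.Close()
        Return result
    End Function

    Sub Main()
        Dim content As String = WriteProduct("Chai", "18.00")
        Console.WriteLine(content)
        Console.WriteLine(ReadProductName(content))
    End Sub
End Module
```

Notice that the reader only moves forward. Once it has passed a node, the only way back is to start over, which is why XmlDocument remains the better fit for anything beyond a single pass.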
Finding Needles and Haystacks
In our sample data, all of the products appear in supplier groups. If we just want a list of products, regardless of supplier, we can ask the XmlDocument to supply that data via an XmlNodeList object.
Dim justProducts As XmlNodeList
Dim oneProduct As XmlNode

' ----- First, get the list.
justProducts = products.GetElementsByTagName("product")

' ----- Then do something with them.
For Each oneProduct In justProducts
   ' ----- Put interesting code here.
Next oneProduct
MsgBox("Processed " & justProducts.Count.ToString() & _
   " product(s).")
For a more complex selection of nodes within the document, the System.Xml.XPath namespace implements the XPath searching language, which gives you increased flexibility in locating items. The Visual Studio documentation describes the methods and searching syntax used with these classes.
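As a small taste of XPath, the sketch below uses the SelectNodes method, which every XmlNode provides and which accepts an XPath expression. The XPathSearch module, the AvailableNames routine, and the two-product document are assumptions made up for this example; the expression itself simply asks for the productName of every product whose available attribute is "Yes".

```vb
Imports System
Imports System.Xml

Module XPathSearch
    ' ----- Hypothetical helper: returns a comma-delimited list
    '       of the names of all available products.
    Public Function AvailableNames(ByVal products As XmlDocument) As String
        Dim matches As XmlNodeList
        Dim oneName As XmlNode
        Dim result As String = ""

        matches = products.SelectNodes( _
            "/productList/supplier/product[@available='Yes']/productName")
        For Each oneName In matches
            If (result.Length > 0) Then result &= ", "
            result &= oneName.InnerText
        Next oneName
        Return result
    End Function

    Sub Main()
        ' ----- Illustrative content only.
        Dim products As New XmlDocument
        products.LoadXml("<productList><supplier>" & _
            "<product available=""Yes"">" & _
            "<productName>Chai</productName></product>" & _
            "<product available=""No"">" & _
            "<productName>Chang</productName></product>" & _
            "</supplier></productList>")
        Console.WriteLine(AvailableNames(products))
    End Sub
End Module
```

If you only need the first match, SelectSingleNode accepts the same kind of expression and returns a single XmlNode, or Nothing when there is no match.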
An XmlDocument object can hold any type of random yet valid XML content, but you can also verify the document against an XSD schema. If your XML document refers to an XSD schema, includes a document type definition (DTD), or uses XDR (XML Data Reduced Schemas, similar to XSD), an XmlReader, when configured with the appropriate XmlReaderSettings, will properly compare your XML data against the defined rules, and throw an exception if there's a problem.
Dim products As New XmlDocument
Dim xmlRead As XmlTextReader
Dim withVerify As New XmlReaderSettings
Dim xmlReadGood As XmlReader

' ----- Open the XML file and process schemas
'       referenced within the content.
withVerify.ValidationType = ValidationType.Schema
xmlRead = New XmlTextReader("c:\temp\products.xml")
xmlReadGood = XmlReader.Create(xmlRead, withVerify)

' ----- Load content, or throw exception on
'       validation failure.
products.Load(xmlReadGood)

' ----- Clean up.
xmlReadGood.Close()
xmlRead.Close()
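If you would rather hand the schema to the reader yourself than depend on a reference inside the document, the Schemas collection of XmlReaderSettings accepts it directly. The following is only a sketch under that assumption: the SchemaCheck module, the IsValid routine, and the deliberately tiny schema are all invented for illustration, and the XML arrives as a string instead of a file.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module SchemaCheck
    ' ----- Illustrative schema: a <productList> that contains
    '       one or more <productName> elements, and nothing else.
    Private Const SchemaText As String = _
        "<xs:schema xmlns:xs=""http://www.w3.org/2001/XMLSchema"">" & _
        "<xs:element name=""productList"">" & _
        "<xs:complexType><xs:sequence>" & _
        "<xs:element name=""productName"" type=""xs:string""" & _
        " maxOccurs=""unbounded""/>" & _
        "</xs:sequence></xs:complexType>" & _
        "</xs:element></xs:schema>"

    ' ----- Hypothetical helper: True if the content loads
    '       without a schema validation error.
    Public Function IsValid(ByVal xmlText As String) As Boolean
        Dim withVerify As New XmlReaderSettings
        Dim products As New XmlDocument

        withVerify.ValidationType = ValidationType.Schema
        withVerify.Schemas.Add(Nothing, _
            XmlReader.Create(New StringReader(SchemaText)))

        Try
            Using xmlReadGood As XmlReader = XmlReader.Create( _
                    New StringReader(xmlText), withVerify)
                products.Load(xmlReadGood)
            End Using
            Return True
        Catch ex As XmlSchemaValidationException
            Return False
        End Try
    End Function

    Sub Main()
        Console.WriteLine(IsValid( _
            "<productList><productName>Chai</productName></productList>"))
        Console.WriteLine(IsValid( _
            "<productList><bogus/></productList>"))
    End Sub
End Module
```

The exception approach stops at the first problem. To collect every problem in one pass, you could instead attach a handler to the settings object's ValidationEventHandler event.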
Before we move on to the project code, let's look at XSL Transformations in the .NET classes. It's no more difficult than any of the other manipulations of XML. Just as there are many ways to get XML source data (from a file, building it by hand with XmlDocument, and so on), there are many ways to transform the data. XSL Transformation is generally a performance-poor activity, but you can speed it up by putting your source XML document into an XPathDocument object instead of a plain XmlDocument object. If you just want to go from input file to output file, the following code provides a quick and efficient method.
' ----- Above: Imports System.IO
'              Imports System.Xml
'              Imports System.Xml.Xsl
'              Imports System.Xml.XPath
Dim xslTrans As XslTransform
Dim inFile As XPathDocument
Dim outFile As StreamWriter
Dim outWriter As XmlTextWriter

' ----- Open the source file using XPath.
inFile = New XPathDocument("c:\input.xml")

' ----- Open the XSL file as a transformation.
xslTrans = New XslTransform()
xslTrans.Load("c:\convert.xsl")

' ----- Open the output file as a stream.
outFile = New System.IO.StreamWriter("c:\output.txt")
outWriter = New XmlTextWriter(outFile)

' ----- Convert and save the output.
outWriter.Formatting = Formatting.Indented
xslTrans.Transform(inFile, Nothing, outWriter, Nothing)
outFile.Close()
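One variation worth knowing about: beginning with .NET 2.0, the Framework marks XslTransform as obsolete in favor of XslCompiledTransform. The sketch below swaps in that newer class and, to stay self-contained, transforms strings in memory instead of files. The QuickTransform module, the TransformText routine, and the sample stylesheet are assumptions made up for this example.

```vb
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.XPath
Imports System.Xml.Xsl

Module QuickTransform
    ' ----- Hypothetical helper: applies an XSLT stylesheet to
    '       XML content, returning the result as text.
    Public Function TransformText(ByVal xmlText As String, _
            ByVal xslText As String) As String
        Dim xslTrans As New XslCompiledTransform
        Dim inData As New XPathDocument(New StringReader(xmlText))
        Dim outText As New StringWriter

        xslTrans.Load(XmlReader.Create(New StringReader(xslText)))
        xslTrans.Transform(inData, Nothing, outText)
        Return outText.ToString()
    End Function

    Sub Main()
        ' ----- Illustrative stylesheet: emit each product name
        '       followed by a semicolon.
        Dim xslText As String = _
            "<xsl:stylesheet version=""1.0""" & _
            " xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" & _
            "<xsl:output method=""text""/>" & _
            "<xsl:template match=""/"">" & _
            "<xsl:for-each select=""//productName"">" & _
            "<xsl:value-of select="".""/>;</xsl:for-each>" & _
            "</xsl:template></xsl:stylesheet>"
        Dim xmlText As String = _
            "<productList><product>" & _
            "<productName>Chai</productName>" & _
            "</product></productList>"

        Console.WriteLine(TransformText(xmlText, xslText))
    End Sub
End Module
```

The XPathDocument source still applies here, so the performance advice carries over unchanged.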