XML

Overview

XML is the lingua franca of application development—a common syntax that underlies Web services, Microsoft ADO.NET, and a slew of cross-platform programming initiatives. At times, the sheer number of XML extensions and grammars can be overwhelming. Common XML tasks don't just include parsing an XML file, but also validating it against a schema, applying an XSL transform to create a new document or HTML page, and searching intelligently with XPath. All of these topics are covered in this chapter.

The Microsoft .NET Framework includes a rich complement of classes for manipulating XML documents in the System.Xml group of namespaces. These namespaces, outlined in the following list, contain the classes we concentrate on in this chapter.

  • System.Xml contains the core classes for manipulating XML documents, including XmlDocument, which provides an in-memory representation of XML (see recipes 6.1 to 6.3), and the XmlTextReader and XmlTextWriter classes, which transfer XML information to and from a file (see recipe 6.6).
  • System.Xml.Schema contains the classes for manipulating XSD schema files and applying schema validation (see recipe 6.9).
  • System.Xml.Serialization contains classes that allow you to convert objects into XML, without using the formatters described in Chapter 4. See recipe 6.7 for more information.
  • System.Xml.XPath contains a .NET implementation of an XPath parser for searching XML documents. You'll use this functionality indirectly with XSL transformations (see recipe 6.8) and the XmlNode.SelectNodes method (see recipe 6.5).
  • System.Xml.Xsl contains classes that allow you to transform an XML document into another document using an XSLT stylesheet (see recipe 6.8).

Many of the examples in this chapter require a sample XML document. The sample we will use is called orders.xml. It contains a simple list of ordered items along with information about the ordering client, and it's shown here:



 
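<?xml version="1.0"?>
<Order id="2003-04-12-4996">
  <Client id="CMPSO33UL">
    <Name>CompuStation</Name>
  </Client>
  <Items>
    <Item id="2003">
      <Name>Calculator</Name>
      <Price>24.99</Price>
    </Item>
    <Item id="4311">
      <Name>Laser Printer</Name>
      <Price>400.75</Price>
    </Item>
  </Items>
</Order>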
 
  Note

Before using the examples in this chapter, you should import the System.Xml namespace.


Load an XML Document into Memory

Problem

You need to load an XML document into memory, perhaps so you can browse its nodes, change its structure, or perform other operations.

Solution

Use the XmlDocument class, which provides a Load method for retrieving XML information and a Save method for storing it.

Discussion

.NET provides a slew of XML objects. The ones you use depend in part upon your programming task. The XmlDocument class provides an in-memory representation of XML. It allows you to deal with XML data in your application as XML. The XmlDocument class also allows you to browse through the nodes in any direction, insert and remove nodes, and change the structure on the fly. These tasks are not as easy with the simpler XmlTextWriter and XmlTextReader classes, which are explained in recipe 6.6.

To use the XmlDocument class, simply create a new instance of the class, and call the Load method with a filename, Stream, TextReader, or XmlReader object. You can even supply a URL that points to an XML document. The XmlDocument instance will be populated with the tree of elements, or nodes. The jumping-off point for accessing these nodes is the root element, which is provided through the XmlDocument.DocumentElement property. DocumentElement is an XmlElement object that can contain one or more nested XmlNode objects, which in turn can contain more XmlNode objects, and so on. An XmlNode is the basic ingredient of an XML file and can be an element, an attribute, a comment, or contained text. Figure 6-1 shows part of the hierarchy created by XmlDocument for the orders.xml file.

Figure 6-1: A partial tree of the orders.xml document loaded into an XmlDocument.

When dealing with an XmlNode or a class that derives from it (such as XmlElement or XmlAttribute), you can use the following basic properties:

  • ChildNodes is an XmlNodeList collection that contains the first level of nested nodes.
  • Name is the name of the node.
  • NodeType returns an enumerated value that indicates the type of the node (element, attribute, text, and so on).
  • Value is the content of the node, if it's a text or CDATA node.
  • Attributes provides a collection of node objects representing the attributes applied to the element.
  • InnerText retrieves a string with the concatenated value of the node and all nested nodes.
  • InnerXml retrieves a string with the concatenated XML markup for the current node and all nested nodes.

The following code loads the orders.xml document into memory and displays some information from the node tree.

Public Module XmlDocumentTest
 
 Public Sub Main()
 ' Load the document.
 Dim Doc As New XmlDocument
 Doc.Load("orders.xml")
 
 ' Display some information from the document.
 Dim Node As XmlNode
 Node = Doc.DocumentElement
 
 Console.WriteLine("This is order " & Node.Attributes(0).Value)
 
 For Each Node In Doc.DocumentElement.ChildNodes
 Select Case Node.Name
 Case "Client"
 Console.WriteLine("Prepared for " & _
 Node.ChildNodes(0).ChildNodes(0).Value)
 Case "Items"
 Console.WriteLine("Contains " & _
 Node.ChildNodes.Count.ToString() & " items")
 End Select
 Next
 
 Console.ReadLine()
 End Sub
 
End Module

The output is shown here:

This is order 2003-04-12-4996
Prepared for CompuStation
Contains 2 items
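
As noted in the discussion, Load also accepts a TextReader. The following is a minimal sketch of that overload, assuming the markup is already held in a string; the module name, the SampleXml constant, and the LoadRoot helper are hypothetical names introduced only for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Module XmlLoadFromTextSketch

    ' Sample markup mirroring the structure of orders.xml (illustrative only).
    Public Const SampleXml As String = _
        "<Order id=""2003-04-12-4996"">" & _
        "<Client id=""CMPSO33UL""><Name>CompuStation</Name></Client>" & _
        "<Items>" & _
        "<Item id=""2003""><Name>Calculator</Name><Price>24.99</Price></Item>" & _
        "<Item id=""4311""><Name>Laser Printer</Name><Price>400.75</Price></Item>" & _
        "</Items></Order>"

    ' Loads the markup through a TextReader and returns the root element.
    Public Function LoadRoot(ByVal markup As String) As XmlElement
        Dim Doc As New XmlDocument
        Using Reader As New StringReader(markup)
            Doc.Load(Reader)
        End Using
        Return Doc.DocumentElement
    End Function

    Public Sub Main()
        Dim Root As XmlElement = LoadRoot(SampleXml)
        Console.WriteLine(Root.Name & " " & Root.GetAttribute("id"))
        Console.WriteLine(Root.ChildNodes.Count.ToString() & " child nodes")
    End Sub

End Module
```

When the XML is already in a string, XmlDocument.LoadXml achieves the same result in one call; the TextReader form is shown here because it also works with other reader sources.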


Process All Nodes in a Document

Problem

You want to iterate through all nodes in an XML tree and display or modify the related information.

Solution

Create a generic procedure for processing the node, and call it recursively.

Discussion

The XmlDocument stores a tree of XmlNode objects. You can walk through this tree structure recursively to process every node.

For example, consider the following code, which displays information about every node in a document. A depth parameter tracks how many layers deep the nesting is and uses it to format the output with a variable-sized indent.

Public Module XmlOutputTest
 
 Public Sub Main()
 ' Load the document.
 Dim Doc As New XmlDocument
 Doc.Load("orders.xml")
 
 ' Start the node walk at the root node (depth = 0).
 DisplayNode(Doc.DocumentElement, 0)
 
 Console.ReadLine()
 End Sub
 
 Private Sub DisplayNode(ByVal node As XmlNode, ByVal depth As Integer)
 ' Define the indent level.
 Dim Indent As New String(" "c, depth * 4)
 
 ' Display the node type.
 Console.WriteLine(Indent & node.NodeType.ToString() & _
 ": <" & node.Name & ">")
 
 ' Display the node content, if applicable.
 If node.Value <> String.Empty Then
 Console.WriteLine(Indent & "Value: " & node.Value)
 End If
 
 ' Display all nested nodes.
 Dim Child As XmlNode
 For Each Child In node.ChildNodes
 DisplayNode(Child, depth + 1)
 Next
 End Sub
 
End Module

When using the orders.xml document, the output is as follows:

Element: <Order>
    Element: <Client>
        Element: <Name>
            Text: <#text>
            Value: CompuStation
    Element: <Items>
        Element: <Item>
            Element: <Name>
                Text: <#text>
                Value: Calculator
            Element: <Price>
                Text: <#text>
                Value: 24.99
        Element: <Item>
            Element: <Name>
                Text: <#text>
                Value: Laser Printer
            Element: <Price>
                Text: <#text>
                Value: 400.75

An alternative solution to this problem is to use the XmlTextReader, which always steps through nodes one at a time, in order.
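
A rough sketch of the XmlTextReader alternative follows. It assumes the XML is supplied as a string, and the module and function names are hypothetical. Unlike the recursive walk, the reader moves forward only and exposes its own Depth property, so no recursion is needed.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Module XmlReaderWalkSketch

    ' Returns one line per element or text node, indented by nesting depth.
    Public Function DescribeNodes(ByVal markup As String) As String
        Dim Output As New StringWriter
        Dim Reader As New XmlTextReader(New StringReader(markup))
        ' Skip whitespace-only nodes so that only real content is reported.
        Reader.WhitespaceHandling = WhitespaceHandling.None

        While Reader.Read()
            Dim Indent As New String(" "c, Reader.Depth * 4)
            Select Case Reader.NodeType
                Case XmlNodeType.Element
                    Output.WriteLine(Indent & "Element: <" & Reader.Name & ">")
                Case XmlNodeType.Text
                    Output.WriteLine(Indent & "Text: " & Reader.Value)
            End Select
        End While

        Reader.Close()
        Return Output.ToString()
    End Function

    Public Sub Main()
        Console.Write(DescribeNodes( _
            "<Item id=""2003""><Name>Calculator</Name><Price>24.99</Price></Item>"))
    End Sub

End Module
```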


Insert Nodes in an XML Document

Problem

You need to modify an XML document by inserting new data.

Solution

Create the node using the appropriate XmlDocument method (such as CreateElement, CreateAttribute, CreateNode, and so on). Then insert it using the appropriate XmlNode method (such as InsertAfter, InsertBefore, or AppendChild).

Discussion

Inserting a node is a two-step process. You must first create the node, and then you insert it in the appropriate location. Optionally, you can then call XmlDocument.Save to persist changes to a file.

To create a node, you use one of the XmlDocument methods that starts with the word Create, depending on the type of node. This ensures that the node is owned by the document you are modifying, because a node can be inserted only into the document that created it (a node from another document must first be copied with ImportNode). Next you must find a suitable related node and use one of its insertion methods to add the new node to the tree. The following example demonstrates this technique to add a new item:

Public Module XmlInsertTest
 
 Public Sub Main()
 ' Load the document.
 Dim Doc As New XmlDocument
 Doc.Load("orders.xml")
 
 ' Create a new element.
 Dim ItemNode As XmlNode
 ItemNode = Doc.CreateElement("Item")
 
 ' Add the attribute.
 Dim Attribute As XmlAttribute
 Attribute = Doc.CreateAttribute("id")
 Attribute.Value = "4312"
 ItemNode.Attributes.Append(Attribute)
 
 ' Create and add the sub-elements for this node.
 Dim NameNode, PriceNode As XmlNode
 NameNode = Doc.CreateElement("Name")
 PriceNode = Doc.CreateElement("Price")
 ItemNode.AppendChild(NameNode)
 ItemNode.AppendChild(PriceNode)
 
 ' Add the text data.
 NameNode.AppendChild(Doc.CreateTextNode("Stapler"))
 PriceNode.AppendChild(Doc.CreateTextNode("12.20"))
 
 ' Add the new element.
 ' In this case, we add it as a child at the end of the item list.
 Doc.DocumentElement.ChildNodes(1).AppendChild(ItemNode)
 
 ' Save the document.
 Doc.Save("orders.xml")
 
 Console.WriteLine("Changes saved.")
 Console.ReadLine()
 End Sub
 
End Module

The new document looks like this:



 
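<?xml version="1.0"?>
<Order id="2003-04-12-4996">
  <Client id="CMPSO33UL">
    <Name>CompuStation</Name>
  </Client>
  <Items>
    <Item id="2003">
      <Name>Calculator</Name>
      <Price>24.99</Price>
    </Item>
    <Item id="4311">
      <Name>Laser Printer</Name>
      <Price>400.75</Price>
    </Item>
    <Item id="4312">
      <Name>Stapler</Name>
      <Price>12.20</Price>
    </Item>
  </Items>
</Order>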


Alternatively, you might be able to use CloneNode, which creates a copy of a node, to simplify the task of adding similar data. CloneNode accepts a Boolean parameter named deep. If you supply True, CloneNode will duplicate the entire branch, with all nested nodes; if you supply False, only the node itself is copied. Here's the equivalent code using CloneNode:

' Load the document.
Dim Doc As New XmlDocument
Doc.Load("orders.xml")
 
' Create a new element based on an existing product.
Dim ItemNode As XmlNode
ItemNode = Doc.DocumentElement.ChildNodes(1).LastChild.CloneNode(True)
 
' Modify the node data.
ItemNode.Attributes(0).Value = "4312"
ItemNode.ChildNodes(0).ChildNodes(0).Value = "Stapler"
ItemNode.ChildNodes(1).ChildNodes(0).Value = "12.20"
 
' Add the new element.
Doc.DocumentElement.ChildNodes(1).AppendChild(ItemNode)
 
' Save the document.
Doc.Save("orders.xml")

Notice that in this case, certain assumptions are being made about the existing nodes (for example, that the first child in the item node is always the name, and the second child is always the price). If this assumption isn't guaranteed to be true, you might need to examine the node name programmatically.
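
One way to avoid the positional assumption is to locate each child by name. The sketch below uses SelectSingleNode for that purpose; the AddItem helper and the module name are hypothetical, and the sample document deliberately lists Price before Name to show that order no longer matters.

```vbnet
Imports System
Imports System.Xml

Public Module CloneByNameSketch

    ' Clones the last Item and fills it in by element name, not by position.
    Public Function AddItem(ByVal doc As XmlDocument, ByVal id As String, _
      ByVal name As String, ByVal price As String) As XmlNode
        Dim ItemsNode As XmlNode = doc.SelectSingleNode("/Order/Items")
        Dim ItemNode As XmlNode = ItemsNode.LastChild.CloneNode(True)

        ' Look up the attribute and child elements by name.
        ItemNode.Attributes("id").Value = id
        ItemNode.SelectSingleNode("Name").InnerText = name
        ItemNode.SelectSingleNode("Price").InnerText = price

        Return ItemsNode.AppendChild(ItemNode)
    End Function

    Public Sub Main()
        Dim Doc As New XmlDocument
        Doc.LoadXml("<Order><Items><Item id=""4311"">" & _
            "<Price>400.75</Price><Name>Laser Printer</Name></Item></Items></Order>")
        AddItem(Doc, "4312", "Stapler", "12.20")
        Console.WriteLine(Doc.OuterXml)
    End Sub

End Module
```

This sketch still assumes that the last Item contains both a Name and a Price element; if either is missing, SelectSingleNode returns Nothing and the assignment fails.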


Find Specific Elements by Name

Problem

You need to retrieve a specific node from an XmlDocument, and you know its name but not its position.

Solution

Use the XmlDocument.GetElementsByTagName method.

Discussion

The XmlDocument class provides a convenient GetElementsByTagName method that searches an entire document for nodes that have the indicated element name. It returns the results as a collection of XmlNode objects.

This code demonstrates how you could use GetElementsByTagName to calculate the total price of an order:

Public Module XmlSearchTest
 
 Public Sub Main()
 ' Load the document.
 Dim Doc As New XmlDocument
 Doc.Load("orders.xml")
 
 ' Retrieve all prices.
 Dim PriceNodes As XmlNodeList
 PriceNodes = Doc.GetElementsByTagName("Price")
 
 Dim PriceNode As XmlNode
 Dim Price As Decimal
 For Each PriceNode In PriceNodes
 Price += Decimal.Parse(PriceNode.ChildNodes(0).Value)
 Next
 
 Console.WriteLine("Total order costs: " & Price.ToString())
 Console.ReadLine()
 End Sub
 
End Module

If your elements include an attribute of type ID, you can also use a method called GetElementById to retrieve an element that has a matching ID value. However, neither method allows you the flexibility to search portions of an XML document—for that flexibility, you need XPath, as described in recipe 6.5.
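
One caveat about the price calculation: Decimal.Parse uses the current culture by default, so a value such as 24.99 may be misread on a system whose decimal separator is a comma. The following sketch parses with the invariant culture instead; the module and function names are hypothetical.

```vbnet
Imports System
Imports System.Globalization
Imports System.Xml

Public Module PriceTotalSketch

    ' Sums every Price element, parsing with the invariant culture.
    Public Function TotalPrice(ByVal doc As XmlDocument) As Decimal
        Dim Total As Decimal = 0D
        For Each PriceNode As XmlNode In doc.GetElementsByTagName("Price")
            Total += Decimal.Parse(PriceNode.InnerText, CultureInfo.InvariantCulture)
        Next
        Return Total
    End Function

    Public Sub Main()
        Dim Doc As New XmlDocument
        Doc.LoadXml("<Items><Item><Price>24.99</Price></Item>" & _
            "<Item><Price>400.75</Price></Item></Items>")
        Console.WriteLine(TotalPrice(Doc).ToString(CultureInfo.InvariantCulture))
    End Sub

End Module
```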


Find Elements with an XPath Search

Problem

You need to search an XML document or a portion of an XML document for nodes that match certain criteria.

Solution

Use an XPath expression with the SelectNodes or SelectSingleNode method.

Discussion

The XmlNode class defines two methods that perform XPath searches: SelectNodes and SelectSingleNode. These methods operate on all contained child nodes. Because the XmlDocument inherits from XmlNode, you can call XmlDocument.SelectNodes to search an entire document.

Basic XPath syntax uses a pathlike notation. For example, the path /Order/Items/Item indicates an Item element that is nested inside an Items element, which, in turn, is nested in a root Order element. This is an absolute path. The following example uses an XPath absolute path to find the name of every item in an order.

Public Module XPathSearchTest
 
 Public Sub Main()
 ' Load the document.
 Dim Doc As New XmlDocument
 Doc.Load("orders.xml")
 
 ' Retrieve the name of every item.
 ' This could not be accomplished as easily with the
 ' GetElementsByTagName() method, because Name elements are
 ' used in Item elements and Client elements.
 Dim Nodes As XmlNodeList
 Nodes = Doc.SelectNodes("/Order/Items/Item/Name")
 
 Dim Node As XmlNode
 For Each Node In Nodes
 Console.WriteLine(Node.InnerText)
 Next
 
 Console.ReadLine()
 End Sub
 
End Module

XPath provides a rich and powerful search syntax, and it's impossible to explain all of the variations you can use in a short recipe. However, Table 6-1 outlines some of the key ingredients in more advanced XPath expressions and includes examples that show how they would work with the orders.xml document.

Table 6-1: XPath Expression Syntax

Expression     Meaning
/              Starts an absolute path that selects from the root node. /Order/Items/Item selects all Item elements that are children of an Items element, which is itself a child of the root Order element.
//             Starts a relative path that selects nodes anywhere. //Item/Name selects all of the Name elements that are children of an Item element, regardless of where they appear in the document.
@              Selects an attribute of a node. /Order/@id selects the attribute named id from the root Order element.
*              Selects any element in the path. /Order/* selects both Items and Client nodes because both are contained by a root Order element.
|              Combines multiple paths. /Order/Items/Item/Name|/Order/Client/Name selects the Name nodes used to describe a Client and the Name nodes used to describe an Item.
.              Indicates the current (default) node.
..             Indicates the parent node. //Name/.. selects any element that is parent to a Name, which includes the Client and Item elements.
[ ]            Defines selection criteria that can test a contained node or attribute value. /Order[@id="2003-04-12-4996"] selects the Order elements with the indicated attribute value. /Order/Items/Item[Price > 50] selects products above $50 in price. /Order/Items/Item[Price > 50 and Name="Laser Printer"] selects products that match two criteria.
starts-with    Retrieves elements based on what text a contained element starts with. /Order/Items/Item[starts-with(Name, "C")] finds all Item elements that have a Name element that starts with the letter C.
position       Retrieves elements based on position. /Order/Items/Item[position()=2] selects the second Item element.
count          Counts elements. You specify the name of the child element to count, or an asterisk (*) for all children. /Order/Items/Item[count(Price) = 1] retrieves Item elements that have exactly one nested Price element.

  Note

XPath expressions and all element and attribute names that you use inside them are always case sensitive.
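
To experiment with expressions such as those in Table 6-1, a small harness like the following sketch can help. It assumes an in-memory copy of the sample order; the module name, the SampleXml constant, and the CountMatches helper are hypothetical.

```vbnet
Imports System
Imports System.Xml

Public Module XPathTableSketch

    ' In-memory copy of the sample order (illustrative only).
    Public Const SampleXml As String = _
        "<Order id=""2003-04-12-4996"">" & _
        "<Client id=""CMPSO33UL""><Name>CompuStation</Name></Client>" & _
        "<Items>" & _
        "<Item id=""2003""><Name>Calculator</Name><Price>24.99</Price></Item>" & _
        "<Item id=""4311""><Name>Laser Printer</Name><Price>400.75</Price></Item>" & _
        "</Items></Order>"

    ' Counts the nodes that an XPath expression selects in the sample document.
    Public Function CountMatches(ByVal expression As String) As Integer
        Dim Doc As New XmlDocument
        Doc.LoadXml(SampleXml)
        Return Doc.SelectNodes(expression).Count
    End Function

    Public Sub Main()
        Dim Expressions() As String = { _
            "/Order/Items/Item", _
            "//Item/Name", _
            "/Order/@id", _
            "/Order/*", _
            "//Name/..", _
            "/Order/Items/Item[Price > 50]", _
            "/Order/Items/Item[starts-with(Name, 'C')]", _
            "/Order/Items/Item[position()=2]"}

        For Each Expression As String In Expressions
            Console.WriteLine(Expression & " -> " & _
                CountMatches(Expression).ToString() & " node(s)")
        Next
    End Sub

End Module
```

String literals inside an XPath expression can use single quotes, which avoids doubling the quotation marks in Visual Basic source.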


Load an XML Document into a Class

Problem

You want to use an XML document to persist information, but interact with the data using a custom object in your code.

Solution

Use the XmlDocument or XmlTextReader class to read XML data, and transfer it into an object. Use the XmlDocument or XmlTextWriter class to persist the XML data.

Discussion

It's common to want to work with full-fledged objects in your code and use XML only as a file format for persisting data. To support this design, you can create a class with Save and Load methods. The Save method commits the current data in the object to an XML format, whereas the Load method reads the XML document and uses its data to populate the object.

For example, the data in orders.xml would require three classes to represent the Order, Item, and Client entities. You might create the Item and Client classes as follows:

Public Class Item
 Private _ID As String
 Private _Name As String
 Private _Price As Decimal
 
 Public Property ID() As String
 Get
 Return _ID
 End Get
 Set(ByVal Value As String)
 _ID = Value
 End Set
 End Property
 
 Public Property Name As String
 Get
 Return _Name
 End Get
 Set(ByVal Value As String)
 _Name = Value
 End Set
 End Property
 
 Public Property Price As Decimal
 Get
 Return _Price
 End Get
 Set(ByVal Value As Decimal)
 _Price = Value
 End Set
 End Property
 
 Public Sub New(ByVal id As String, ByVal name As String, _
 ByVal price As Decimal)
 Me.ID = id
 Me.Name = name
 Me.Price = price
 End Sub
 
End Class
 
Public Class Client
 Private _ID As String
 Private _Name As String
 
 Public Property ID() As String
 Get
 Return _ID
 End Get
 Set(ByVal Value As String)
 _ID = Value
 End Set
 End Property
 
 Public Property Name As String
 Get
 Return _Name
 End Get
 Set(ByVal Value As String)
 _Name = Value
 End Set
 End Property
 
 Public Sub New(ByVal id As String, ByVal name As String)
 Me.ID = id
 Me.Name = name
 End Sub
 
End Class

The Order class would then contain a single Client, and a collection of Item objects. It would also add the Save and Load methods that transfer the data to and from the XML file. Here's an example that supports loading only:

Public Class Order
 Private _ID As String
 Private _Client As Client
 Private _Items() As Item
 
 Public Property ID() As String
 Get
 Return _ID
 End Get
 Set(ByVal Value As String)
 _ID = Value
 End Set
 End Property
 
 Public Property Client As Client
 Get
 Return _Client
 End Get
 Set(ByVal Value As Client)
 _Client = Value
 End Set
 End Property
 
 Public Property Items() As Item()
 Get
 Return _Items
 End Get
 Set(ByVal Value As Item())
 _Items = Value
 End Set
 End Property
 
 Public Sub New(ByVal id As String, ByVal client As Client, _
 ByVal items As Item())
 Me.ID = id
 Me.Client = client
 Me.Items = items
 End Sub
 
 Public Sub New(ByVal xmlFilePath As String)
 Me.Load(xmlFilePath)
 End Sub
 
 Public Sub Load(ByVal xmlFilePath As String)
 Dim Doc As New XmlDocument
 Doc.Load(xmlFilePath)
 
 ' Find the Order node.
 Dim Node As XmlNode
 Node = Doc.GetElementsByTagName("Order")(0)
 Me.ID = Node.Attributes(0).Value
 
 ' Find the Client node.
 Node = Doc.GetElementsByTagName("Client")(0)
 Me.Client = New Client(Node.Attributes(0).Value, Node.InnerText)
 
 ' Find the Item nodes.
 Dim Nodes As XmlNodeList
 Nodes = Doc.GetElementsByTagName("Item")
 Dim Items As New ArrayList
 For Each Node In Nodes
 Items.Add(New Item(Node.Attributes(0).Value, _
 Node.ChildNodes(0).InnerText, _
 Decimal.Parse(Node.ChildNodes(1).InnerText)))
 Next
 
 ' Convert the collection of items into a strongly typed array.
 Me.Items = CType(Items.ToArray(GetType(Item)), Item())
 End Sub
 
 Public Sub Save(ByVal xmlFilePath As String)
 ' (Save code omitted.)
 End Sub
 
End Class
  Note

To improve this design, you might want to substitute the array of Item objects with a strongly typed collection, as described in recipe 3.16.

The client can then use the following code to inspect products, without having to interact with the underlying XML format at all:

Dim XmlOrder As New Order("orders.xml")
 
' Display the prices of all items.
Dim Item As Item
For Each Item In XmlOrder.Items
 Console.WriteLine(Item.Name & ": " & Item.Price.ToString())
Next

There are countless variations of this design. For example, you might create a class that writes a file directly to disk. Or, you might add another layer of abstraction using streams, so that the client could save the serialization data to disk, transmit it to another component, or even add encryption with a CryptoStream wrapper. Alternatively, you could use the XmlSerializer class to automate the work for you, as described in recipe 6.7.
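
The Save code is omitted from the Order class above. As one possible direction, the following sketch writes the same element layout with XmlTextWriter. It is a standalone, hypothetical WriteOrder routine that takes plain values rather than the Order class, so that it can run on its own, and it writes to any TextWriter to allow the stream-based variations just described.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Module OrderWriterSketch

    ' Writes one order in the orders.xml layout to any TextWriter.
    Public Sub WriteOrder(ByVal target As TextWriter, ByVal orderId As String, _
      ByVal clientId As String, ByVal clientName As String, _
      ByVal itemIds() As String, ByVal itemNames() As String, _
      ByVal itemPrices() As Decimal)
        Dim Writer As New XmlTextWriter(target)
        Writer.Formatting = Formatting.Indented

        Writer.WriteStartElement("Order")
        Writer.WriteAttributeString("id", orderId)

        Writer.WriteStartElement("Client")
        Writer.WriteAttributeString("id", clientId)
        Writer.WriteElementString("Name", clientName)
        Writer.WriteEndElement()

        Writer.WriteStartElement("Items")
        For i As Integer = 0 To itemIds.Length - 1
            Writer.WriteStartElement("Item")
            Writer.WriteAttributeString("id", itemIds(i))
            Writer.WriteElementString("Name", itemNames(i))
            ' XmlConvert formats the number independently of the current culture.
            Writer.WriteElementString("Price", XmlConvert.ToString(itemPrices(i)))
            Writer.WriteEndElement()
        Next
        Writer.WriteEndElement()

        Writer.WriteEndElement()
        Writer.Flush()
    End Sub

    Public Sub Main()
        Dim Buffer As New StringWriter
        WriteOrder(Buffer, "2003-04-12-4996", "CMPSO33UL", "CompuStation", _
            New String() {"2003", "4311"}, _
            New String() {"Calculator", "Laser Printer"}, _
            New Decimal() {24.99D, 400.75D})
        Console.WriteLine(Buffer.ToString())
    End Sub

End Module
```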


Use XML Serialization with Custom Objects

Problem

You want to use an XML document as a serialization format and load the data into an object for manipulation in your code, preferably with as little code as possible.

Solution

Use XmlSerializer to transfer data from your object to XML, and vice versa.

Discussion

The XmlSerializer class allows you to convert objects to XML data, and vice versa. This process is used natively by Web services and provides a customizable serialization mechanism that won't require a single line of custom code. The XmlSerializer class is even intelligent enough to correctly create arrays when it finds nested elements.

The only requirements for using XmlSerializer are as follows:

  • The XmlSerializer serializes only public properties and public variables.
  • The classes you want to serialize must include a default zero-argument constructor. The XmlSerializer uses this constructor when creating the new object during deserialization.
  • All class properties must be readable and writable. This is because XmlSerializer uses property get procedures to retrieve information, and property set procedures to restore the data after deserialization.

To use serialization, you must first mark up your data objects with attributes that indicate the desired XML mapping. These attributes are found in the System.Xml.Serialization namespace and include the following:

  • XmlRoot specifies the name of the root element of the XML file. By default, XmlSerializer will use the name of the class. This attribute can be applied to the class declaration.
  • XmlElement indicates the element name to use for a property or public variable. By default, XmlSerializer will use the name of the property or public variable.
  • XmlAttribute indicates that a property or public variable should be serialized as an attribute, not an element, and specifies the attribute name.
  • XmlEnum configures the text that should be used when serializing enumerated values. By default, the name of the enumerated constant is used.
  • XmlIgnore indicates that a property or public variable should not be serialized.

For example, the following code shows the classes needed to represent the orders.xml items. In this case, the only attribute that was needed was XmlAttribute, which maps the ID property to an attribute named id. To use the code as written, you must import the System.Xml.Serialization namespace.

Public Class Order
 Private _ID As String
 Private _Client As Client
 Private _Items() As Item
 
 <XmlAttribute("id")> _
 Public Property ID() As String
 Get
 Return _ID
 End Get
 Set(ByVal Value As String)
 _ID = Value
 End Set
 End Property
 
 Public Property Client() As Client
 Get
 Return _Client
 End Get
 Set(ByVal Value As Client)
 _Client = Value
 End Set
 End Property
 
 Public Property Items() As Item()
 Get
 Return _Items
 End Get
 Set(ByVal Value As Item())
 _Items = Value
 End Set
 End Property
 
 Public Sub New(ByVal id As String, ByVal client As Client, _
 ByVal items As Item())
 Me.ID = id
 Me.Client = client
 Me.Items = items
 End Sub
 
Public Sub New()
' (XML serialization requires the default constructor.)
End Sub

End Class
 
Public Class Item
 Private _ID As String
 Private _Name As String
 Private _Price As Decimal
 
 <XmlAttribute("id")> _
 Public Property ID() As String
 Get
 Return _ID
 End Get
 Set(ByVal Value As String)
 _ID = Value
 End Set
 End Property
 
 Public Property Name() As String
 Get
 Return _Name
 End Get
 Set(ByVal Value As String)
 _Name = Value
 End Set
 End Property
 
 Public Property Price() As Decimal
 Get
 Return _Price
 End Get
 Set(ByVal Value As Decimal)
 _Price = Value
 End Set
 End Property
 
 Public Sub New(ByVal id As String, ByVal name As String, _
 ByVal price As Decimal)
 Me.ID = id
 Me.Name = name
 Me.Price = price
 End Sub
 
Public Sub New()
' (XML serialization requires the default constructor.)
End Sub

End Class
 
Public Class Client
 Private _ID As String
 Private _Name As String
 
 <XmlAttribute("id")> _
 Public Property ID() As String
 Get
 Return _ID
 End Get
 Set(ByVal Value As String)
 _ID = Value
 End Set
 End Property
 
 Public Property Name() As String
 Get
 Return _Name
 End Get
 Set(ByVal Value As String)
 _Name = Value
 End Set
 End Property
 
 Public Sub New(ByVal id As String, ByVal name As String)
 Me.ID = id
 Me.Name = name
 End Sub
 
Public Sub New()
' (XML serialization requires the default constructor.)
End Sub

End Class

Here's the code needed to create a new Order object, serialize the results to an XML document, deserialize the document back to an object, and display some basic order information.

' Create the order.
Dim Client As New Client("CMPSO33UL", "CompuStation")
 
Dim Item1 As New Item("2003", "Calculator", Convert.ToDecimal(24.99))
Dim Item2 As New Item("4311", "Laser Printer", Convert.ToDecimal(400.75))
Dim Items() As Item = {Item1, Item2}
 
Dim Order As New Order("2003-04-12-4996", Client, Items)
 
' Serialize the order to a file.
Dim Serializer As New System.Xml.Serialization.XmlSerializer(GetType(Order))
Dim fs As New FileStream("orders.xml", FileMode.Create)
Serializer.Serialize(fs, Order)
fs.Close()
 
' Deserialize the order from the file.
fs = New FileStream("orders.xml", FileMode.Open)
Order = CType(Serializer.Deserialize(fs), Order)
fs.Close()
 
' Display the prices of all items.
Dim Item As Item
For Each Item In Order.Items
 Console.WriteLine(Item.Name & ": " & Item.Price.ToString())
Next
  Note

This approach isn't necessarily better than that presented in recipe 6.6. It does require less code and can prevent some types of error. However, it also forces you to give up a layer of abstraction (the custom reading and writing code) that can be used to perform validation, manage multiple versions of the same XML document, or map XML documents to .NET objects that don't match exactly. The approach you use depends on the needs of your application.
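
The order classes in this recipe need only XmlAttribute. To illustrate some of the other mapping attributes listed earlier, here is a minimal sketch built around a hypothetical CatalogEntry class with public variables; the class, its members, and the element names are invented for the example.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical class used only to illustrate the mapping attributes.
<XmlRoot("Product")> _
Public Class CatalogEntry
    ' Serialized as an attribute named id.
    <XmlAttribute("id")> _
    Public ID As String

    ' Serialized as an element named Title instead of Name.
    <XmlElement("Title")> _
    Public Name As String

    ' Left out of the XML entirely.
    <XmlIgnore()> _
    Public InternalNote As String
End Class

Public Module SerializerAttributeSketch

    ' Serializes an entry to an XML string.
    Public Function ToXml(ByVal entry As CatalogEntry) As String
        Dim Serializer As New XmlSerializer(GetType(CatalogEntry))
        Dim Buffer As New StringWriter
        Serializer.Serialize(Buffer, entry)
        Return Buffer.ToString()
    End Function

    Public Sub Main()
        Dim Entry As New CatalogEntry
        Entry.ID = "2003"
        Entry.Name = "Calculator"
        Entry.InternalNote = "not written to the XML"
        Console.WriteLine(ToXml(Entry))
    End Sub

End Module
```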


Perform an XSL Transform

Problem

You want to transform an XML document into another document using an XSLT stylesheet.

Solution

Use the Transform method of the System.Xml.Xsl.XslTransform class.

Discussion

XSLT (or XSL transforms) is an XML-based language designed to transform one XML document into another document. XSLT can be used to create a new XML document with the same data but arranged in a different structure, or to select a subset of the data in a document. It can also be used to create a different type of structured document. XSLT is commonly used in this manner to format an XML document into an HTML page.

XSLT is a rich language, and creating XSL transforms is beyond the scope of this book. However, you can learn how to create simple XSLT documents by looking at a basic example. Here's a stylesheet that could be used to transform orders.xml into an HTML summary page:



 
 

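<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="Order">
    <html><body>
    <p>Order <xsl:value-of select="Client/@id"/>
    for <xsl:value-of select="Client/Name"/></p>
    <table border="1">
    <tr><th>ID</th><th>Name</th><th>Price</th></tr>
    <xsl:apply-templates select="Items/Item"/>
    </table>
    </body></html>
  </xsl:template>

  <xsl:template match="Items/Item">
    <tr>
    <td><xsl:value-of select="@id"/></td>
    <td><xsl:value-of select="Name"/></td>
    <td><xsl:value-of select="Price"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>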

Essentially, every XSL stylesheet consists of a set of templates. Each template matches some set of elements in the source document and then describes the contribution that the matched element will make to the resulting document. In order to match the template, the XSLT document uses XPath expressions, as described in recipe 6.5.

The orders.xslt stylesheet contains two template elements (as children of the root stylesheet element). The first template matches the root Order element. When it finds it, it outputs the tags necessary to start an HTML table with appropriate column headings and inserts some data about the client using the value-of command, which outputs the text result of an XPath expression. In this case, the XPath expressions (Client/@id and Client/Name) match the id attribute and the Name element.

Next, the apply-templates command is used to branch off and perform processing of any contained Item elements. This is required because there might be multiple Item elements. Each Item element is matched using the XPath expression Items/Item. The root Order node isn't specified because Order is the current node. Finally, the initial template writes the tags necessary to end the HTML document.

To apply this XSLT stylesheet in .NET, use the XslTransform class, as shown in the following code. In this case, the code uses the overloaded version of the Transform method that saves the result document directly to disk, although you could receive it as a stream and process it inside your application instead.

Public Module TransformTest
 
 Public Sub Main()
 Dim Transform As New System.Xml.Xsl.XslTransform
 
 ' Load the XSL stylesheet.
 Transform.Load("orders.xslt")
 
 ' Transform orders.xml into orders.html using orders.xslt.
 Transform.Transform("orders.xml", "orders.html")
 
 Console.WriteLine("File 'orders.html' written successfully.")
 Console.ReadLine()
 End Sub
 
End Module

The final result of this process is the HTML file shown in the following listing. Figure 6-2 shows how this HTML is displayed in a browser.

Figure 6-2: The stylesheet output for orders.xml


 

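<html>
  <body>
    <p>Order CMPSO33UL for CompuStation</p>
    <table border="1">
      <tr><th>ID</th><th>Name</th><th>Price</th></tr>
      <tr><td>2003</td><td>Calculator</td><td>24.99</td></tr>
      <tr><td>4311</td><td>Laser Printer</td><td>400.75</td></tr>
    </table>
  </body>
</html>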
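
Note that XslTransform was marked obsolete beginning with version 2.0 of the .NET Framework, where XslCompiledTransform took its place. The following sketch shows roughly equivalent usage with the newer class, assuming the stylesheet and document are supplied as strings. The module, the Apply helper, and the tiny text-only stylesheet are hypothetical and far simpler than orders.xslt.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Public Module CompiledTransformSketch

    ' Applies a stylesheet to a document, both supplied as strings.
    Public Function Apply(ByVal stylesheetText As String, _
      ByVal documentText As String) As String
        Dim Transform As New XslCompiledTransform
        Using StyleReader As XmlReader = _
          XmlReader.Create(New StringReader(stylesheetText))
            Transform.Load(StyleReader)
        End Using

        Dim Buffer As New StringWriter
        Using Source As XmlReader = _
          XmlReader.Create(New StringReader(documentText))
            ' No stylesheet parameters are needed, so the argument list is Nothing.
            Transform.Transform(Source, Nothing, Buffer)
        End Using
        Return Buffer.ToString()
    End Function

    Public Sub Main()
        Dim StyleText As String = _
            "<xsl:stylesheet version=""1.0"" " & _
            "xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">" & _
            "<xsl:output method=""text""/>" & _
            "<xsl:template match=""/Order"">" & _
            "Order for <xsl:value-of select=""Client/Name""/>" & _
            "</xsl:template></xsl:stylesheet>"
        Dim OrderText As String = _
            "<Order><Client><Name>CompuStation</Name></Client></Order>"
        Console.WriteLine(Apply(StyleText, OrderText))
    End Sub

End Module
```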


Validate an XML Document Against a Schema

Problem

You want to ensure that an XML document conforms to an XML schema.

Solution

Use XmlValidatingReader and handle the ValidationEventHandler event.

Discussion

An XML schema defines the rules that a given type of XML document must follow. The schema includes rules that define

  • The elements and attributes that can appear in a document.
  • The data types for elements and attributes.
  • The structure of a document, including what elements are children of other elements.
  • The order and number of child elements that appear in a document.
  • Whether elements are empty, can include text, or require fixed values.

XML schema documents are beyond the scope of this chapter, but much can be learned from a simple example. Essentially, an XSD document lists the elements that can occur using element tags. The type attribute indicates the data type. Here's an example for the product name:

<xsd:element name="Name" type="xsd:string" />


The basic schema data types are defined at http://www.w3.org/TR/xmlschema-2 . They map closely to .NET data types and include string, int, long, decimal, float, dateTime, boolean, and base64Binary, to name a few of the most frequently used types.

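Elements that contain subelements or attributes are described by complex types. You can nest the subelements together using a sequence tag if order is important, or an all tag if it's not. (A choice tag, by contrast, allows only one of the listed elements to appear.) Here's how you might model the Client element:

<xsd:complexType name="Client">
  <xsd:sequence>
    <xsd:element name="Name" type="xsd:string" />
  </xsd:sequence>
  <xsd:attribute name="id" type="xsd:string" />
</xsd:complexType>
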

By default, a listed element can occur exactly one time in a document. You can configure this behavior by specifying the maxOccurs and minOccurs attributes:

<xsd:element name="Item" type="Item" minOccurs="0" maxOccurs="unbounded" />


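Here's the complete schema for the orders.xml file:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Define the complex type Client. -->
  <xsd:complexType name="Client">
    <xsd:sequence>
      <xsd:element name="Name" type="xsd:string" />
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:string" />
  </xsd:complexType>

  <!-- Define the complex type Item. -->
  <xsd:complexType name="Item">
    <xsd:sequence>
      <xsd:element name="Name" type="xsd:string" />
      <xsd:element name="Price" type="xsd:decimal" />
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:string" />
  </xsd:complexType>

  <!-- Define the complex type Items, which holds any number of Item elements. -->
  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="Item" type="Item" minOccurs="0" maxOccurs="unbounded" />
    </xsd:sequence>
  </xsd:complexType>

  <!-- Define the root Order element. -->
  <xsd:element name="Order">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Client" type="Client" />
        <xsd:element name="Items" type="Items" />
      </xsd:sequence>
      <xsd:attribute name="id" type="xsd:string" />
    </xsd:complexType>
  </xsd:element>

</xsd:schema>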

The XmlValidatingReader class enforces all of these schema rules, and it also checks that the XML document is well formed (which means there are no illegal characters, all opening tags have a corresponding closing tag, and so on). To check a document, you read through it one node at a time by calling the XmlValidatingReader.Read method. If an error is found, XmlValidatingReader raises a ValidationEventHandler event with information about the error. If you wish, you can handle this event and continue processing the document to find more errors. If you don't handle this event, an XmlException will be raised when the first error is encountered, and processing will be aborted. To test only if a document is well-formed, you can use the XmlValidatingReader without a schema.

The next example shows a utility class that displays all errors in an XML document when the ValidateXml method is called. Errors are displayed in a Console window, and a final Boolean variable is returned to indicate the success or failure of the entire validation operation. Remember that you'll need to import the System.Xml.Schema namespace in order to use this class.

Public Class ConsoleValidator

    ' Set to True if at least one error exists.
    Private Failed As Boolean

    Public Function ValidateXml(ByVal XmlFilename As String, _
      ByVal schemaFilename As String) As Boolean

        ' Create the validator.
        Dim r As New XmlTextReader(XmlFilename)
        Dim Validator As New XmlValidatingReader(r)
        Validator.ValidationType = ValidationType.Schema

        ' Load the schema file into the validator.
        Dim Schemas As New XmlSchemaCollection
        Schemas.Add(Nothing, schemaFilename)
        Validator.Schemas.Add(Schemas)

        ' Set the validation event handler.
        AddHandler Validator.ValidationEventHandler, _
          AddressOf Me.ValidationEventHandler

        Failed = False

        Try
            ' Read all XML data.
            While Validator.Read()
            End While
        Catch Err As XmlException
            ' This happens if the XML document includes illegal characters
            ' or tags that aren't properly nested or closed.
            Console.WriteLine("A critical XML error has occurred: " & Err.Message)
            Failed = True
        Finally
            ' Release the file even if reading stops early.
            Validator.Close()
        End Try

        Return Not Failed

    End Function

    Private Sub ValidationEventHandler(ByVal sender As Object, _
      ByVal args As System.Xml.Schema.ValidationEventArgs)
        Failed = True

        ' Display the validation error.
        Console.WriteLine("Validation error: " & args.Message)
    End Sub

End Class

Here's how you would use the class:

Dim ConsoleValidator As New ConsoleValidator
Console.WriteLine("Validating XML file orders.xml with orders.xsd.")
 
Dim Success As Boolean
Success = ConsoleValidator.ValidateXml("orders.xml", "orders.xsd")

If the document is valid, no messages will appear, and the Success variable will be set to True. But consider what happens if you use a document that breaks schema rules, like the orders_wrong.xml file shown here:



 
 CompuStation
 
 
 
 Calculator
 twenty-four
 
 
400.75
 Laser Printer
 
 

If you attempt to validate this document, the output will indicate each error, and the Success variable will be set to False:

Validation error: Element 'Client' has invalid child element 'Namely'.
Expected 'Name'.
Validation error: The 'Namely' element is not declared.
Validation error: The 'Price' element has an invalid value according to its 
data type.
Validation error: Element 'Item' has invalid child element 'Price'.
Expected 'Name'.
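These messages don't say where in the file each problem occurs. One possible refinement, sketched here with hypothetical module and handler names, is to read the position from the XmlSchemaException that ValidationEventArgs exposes through its Exception property. The line numbers come from the underlying XmlTextReader, so treat this as a sketch and not a guaranteed output format.

```vb
Imports System
Imports System.Xml
Imports System.Xml.Schema

Public Module PositionSketch

    Public Sub Main()
        Dim Validator As New XmlValidatingReader( _
          New XmlTextReader("orders_wrong.xml"))
        Validator.ValidationType = ValidationType.Schema
        Validator.Schemas.Add(Nothing, "orders.xsd")
        AddHandler Validator.ValidationEventHandler, AddressOf ReportError

        Try
            ' Read all XML data so that every error is reported.
            While Validator.Read()
            End While
        Finally
            Validator.Close()
        End Try
    End Sub

    Private Sub ReportError(ByVal sender As Object, _
      ByVal args As ValidationEventArgs)
        ' args.Exception is the XmlSchemaException behind this event.
        Console.WriteLine(args.Severity.ToString() & " at line " & _
          args.Exception.LineNumber.ToString() & ", position " & _
          args.Exception.LinePosition.ToString() & ": " & args.Message)
    End Sub

End Module
```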

If you want to validate an XML document and then process it, you can use XmlValidatingReader to scan a document as it's read into an in-memory XmlDocument. Here's how it works:

Dim Doc As New XmlDocument()
Dim r As New XmlTextReader("orders.xml")
Dim Validator As New XmlValidatingReader(r)

' Load the schema into the validator.
' (Adjust the path to wherever your copy of orders.xsd is stored.)
Validator.ValidationType = ValidationType.Schema
Dim Schemas As New XmlSchemaCollection()
Schemas.Add(Nothing, "orders.xsd")
Validator.Schemas.Add(Schemas)

' Load the document and validate it at the same time.
' Don't handle the ValidationEventHandler event. Instead, allow any errors
' to be thrown as an XmlSchemaException.
Try
    Doc.Load(Validator)
    ' (Validation succeeded if you reach here.)
Catch Err As XmlSchemaException
    ' (Validation failed if you reach here.)
Finally
    ' Release the file whether or not validation succeeded.
    Validator.Close()
End Try

Note

Microsoft Visual Studio .NET includes a visual schema designer that allows you to create schema files at design-time using graphical elements. You can also use the command-line utility xsd.exe to quickly create a schema from an XML document, which you can use as a starting point.


Store Binary Data with a Base64 Transform

Problem

You need to store binary data in an XML file.

Solution

Use Convert.ToBase64String to create a string representation of the data that will not contain any illegal characters.

Discussion

Raw binary data can't be placed directly in an XML document. Many byte values don't correspond to legal XML characters, and special characters such as the greater than (>) or less than (<) symbols are used to denote elements. However, you can convert binary data into a string representation that is XML-legal by using a Base64 transform.

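In Base64 encoding, each sequence of three bytes is converted to a sequence of four characters. Each Base64-encoded character has one of 64 possible values, drawn from the set {A-Z, a-z, 0-9, +, /}. The equal sign (=) is used only as padding at the end of the output when the input length isn't a multiple of three.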
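The three-to-four mapping is easy to see with a few bytes. The following sketch (the module name is hypothetical) encodes the ASCII codes for "Man" and two shorter inputs; the comments show the text each call produces.

```vb
Imports System

Public Module Base64Sketch

    Public Sub Main()
        ' Three bytes (the ASCII codes for "Man") become four characters.
        Console.WriteLine(Convert.ToBase64String(New Byte() {77, 97, 110}))  ' TWFu

        ' Shorter input is padded out to four characters with "=".
        Console.WriteLine(Convert.ToBase64String(New Byte() {77, 97}))       ' TWE=
        Console.WriteLine(Convert.ToBase64String(New Byte() {77}))           ' TQ==

        ' Decoding returns the original three bytes.
        Dim Decoded() As Byte = Convert.FromBase64String("TWFu")
        Console.WriteLine(Decoded.Length)                                    ' 3
    End Sub

End Module
```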

Here's an example that creates a new node in the orders.xml document for Base64-encoded image data. In order to use this code as written, you must import the System.IO and System.Xml namespaces.

Public Module StoreBase64Data

    Public Sub Main()
        ' Load the document.
        Dim Doc As New XmlDocument
        Doc.Load("orders.xml")

        ' Create a new element.
        Dim LogoNode As XmlNode
        LogoNode = Doc.CreateElement("Logo")

        ' Retrieve the picture data.
        ' Visual Basic sizes an array by its upper bound, so subtract 1
        ' to get exactly fs.Length bytes with no extra trailing byte.
        Dim fs As New FileStream("logo.bmp", FileMode.Open)
        Dim LogoBytes(Convert.ToInt32(fs.Length) - 1) As Byte
        fs.Read(LogoBytes, 0, LogoBytes.Length)
        fs.Close()

        ' Encode the picture data and add it as text.
        Dim EncodedText As String = Convert.ToBase64String(LogoBytes)
        LogoNode.AppendChild(Doc.CreateTextNode(EncodedText))

        ' Add the new element to the first child of the root element.
        Doc.DocumentElement.ChildNodes(0).AppendChild(LogoNode)

        ' Save the document.
        Doc.Save("orders_pic.xml")

        Console.WriteLine("File 'orders_pic.xml' successfully written.")
        Console.ReadLine()
    End Sub

End Module

Here's the resulting (slightly abbreviated) XML document:



 
 CompuStation
 R0lGODlh0wAfALMPAAAAAIAAAACAAICAAAAAgIAAgACAgICAgMDAwP8AAD...
 
 
 
 

You can use Convert.FromBase64String to retrieve the image data from the XML document.
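A minimal sketch of the reverse trip follows. It assumes the orders_pic.xml file produced by the preceding example; the module name and the output file name logo_copy.bmp are hypothetical. The Logo element is located by tag name so that the code doesn't depend on where in the document it was added.

```vb
Imports System
Imports System.IO
Imports System.Xml

Public Module ReadBase64Data

    Public Sub Main()
        ' Load the document that contains the encoded picture.
        Dim Doc As New XmlDocument
        Doc.Load("orders_pic.xml")

        ' Find the Logo element.
        Dim LogoNodes As XmlNodeList = Doc.GetElementsByTagName("Logo")
        If LogoNodes.Count = 0 Then
            Console.WriteLine("No Logo element found.")
            Return
        End If

        ' Decode the element text back into the original bytes.
        Dim LogoBytes() As Byte = _
          Convert.FromBase64String(LogoNodes(0).InnerText)

        ' Write the bytes to a new file (hypothetical name).
        Dim fs As New FileStream("logo_copy.bmp", FileMode.Create)
        fs.Write(LogoBytes, 0, LogoBytes.Length)
        fs.Close()

        Console.WriteLine(LogoBytes.Length.ToString() & " bytes written.")
    End Sub

End Module
```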

  Note

Visual Studio .NET uses a Base64 transform to store binary information that's added to a form at design time in the corresponding XML resources file.




Microsoft Visual Basic .NET Programmers Cookbook (Pro-Developer)
ISBN: 073561931X
EAN: 2147483647
Year: 2003
Pages: 376
