We've discussed how to read and write XML documents, access them in a range of ways, and validate the content against a schema or DTD. Let's now look at the Creating and Editing the Content of XML Documents example (edit-xml.aspx).
The example page loads an XML document named bookdetails.xml and demonstrates four different techniques you can use for editing and creating documents:
Selecting a node, extracting the content, and deleting that node from the document.
Creating a new empty document and adding a declaration and comment to it.
Importing (that is, copying) a node from the original document into the new document.
Selecting, editing, and inserting new nodes and content into the original document.
Figure 11-20 shows the page when you run it. You can see the four stages of the process, though the second and third are combined into one section of the output in the page:
Note | You must run the page in a browser on the web server itself to be able to open the XML documents using the physical paths in the hyperlinks in the page. |
The page contains the customary <div> elements to display the results and messages, and details of any errors encountered. It also creates the paths to the existing and new documents, and displays a hyperlink to the existing document. This is identical to the previous example, and so we aren't repeating the code here. Instead, here's the part that loads the existing document into a new XmlDocument object:
Dim objXMLDoc As New XmlDocument()
Try
  objXMLDoc.Load(strXMLPath)
Catch objError As Exception
  outError.InnerHtml = "Error while accessing document.<br />" _
                     & objError.Message & "<br />" & objError.Source
  Exit Sub  ' and stop execution
End Try
To select a specific node in the document, you can use an XPath expression. In our example, the expression is descendant::Book[ISBN="0764544020"], which (when the current node is the root element of the document) selects the <Book> node with the specified value for its <ISBN> child node. This expression is passed to the SelectSingleNode method, which returns a reference to the node you want. To display this node and its content, you just have to reference its OuterXml property.
Note | If you only want the content of the node, use the InnerXml property, and if you only want the text values of all the nodes concatenated together, use the InnerText property. |
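The difference between these three properties is easiest to see side by side. The following is a minimal sketch rather than part of the example page: the module name, the function name, and the sample XML are all assumptions made for illustration.

```vbnet
Imports System
Imports System.Xml

Module NodeTextDemo
    ' Returns OuterXml, InnerXml, and InnerText (in that order) for the
    ' first <Book> element of a small, hypothetical sample document.
    Function DescribeBook() As String()
        Dim objDoc As New XmlDocument()
        ' hypothetical sample data, not the real bookdetails.xml
        objDoc.LoadXml("<BookList><Book><ISBN>0764544020</ISBN>" _
                     & "<Title>Sample Title</Title></Book></BookList>")
        Dim objBook As XmlNode = objDoc.SelectSingleNode("descendant::Book")
        Return New String() {objBook.OuterXml, objBook.InnerXml, objBook.InnerText}
    End Function

    Sub Main()
        ' OuterXml includes the <Book> tags, InnerXml only the child markup,
        ' and InnerText only the concatenated text values
        For Each strValue As String In DescribeBook()
            Console.WriteLine(strValue)
        Next
    End Sub
End Module
```

Notice that InnerText runs the text of adjacent child elements together with no separator, so it is most useful on nodes that contain a single text value.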
To delete the node from the document, you can call the RemoveChild method of the parent node (the root of the document, which is returned by the DocumentElement property of the document object), and pass it a reference to the node to be deleted.
'specify XPath expression to select a book element
Dim strXPath As String = "descendant::Book[ISBN=" & Chr(34) _
                       & "0764544020" & Chr(34) & "]"
'get a reference to the matching <Book> node
Dim objNode As XmlNode
objNode = objXMLDoc.SelectSingleNode(strXPath)
'display node and content using the OuterXml property
outResult1.InnerHtml = "XPath expression '<b>" & strXPath _
                     & "</b>' returned:<br />" _
                     & Server.HtmlEncode(objNode.OuterXml) & "<br />"
'delete this node using RemoveChild method from document element
objXMLDoc.DocumentElement.RemoveChild(objNode)
outResult1.InnerHtml &= "Removed node from document.<br />"
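The listing above assumes that the XPath expression finds a match and that the matched node is a direct child of the root element. As a hedged sketch of a more defensive variant (the RemoveFirstMatch helper and the sample XML are hypothetical, not part of the example page), you could check for Nothing and remove the node through its own ParentNode:

```vbnet
Imports System
Imports System.Xml

Module RemoveNodeDemo
    ' Removes the first node matching strXPath from objDoc.
    ' Returns True if a node was found and removed, False otherwise.
    Function RemoveFirstMatch(objDoc As XmlDocument, strXPath As String) As Boolean
        Dim objNode As XmlNode = objDoc.SelectSingleNode(strXPath)
        ' SelectSingleNode returns Nothing when there is no match
        If objNode Is Nothing OrElse objNode.ParentNode Is Nothing Then
            Return False
        End If
        ' remove via the node's actual parent, whatever its depth in the tree
        objNode.ParentNode.RemoveChild(objNode)
        Return True
    End Function

    Sub Main()
        Dim objDoc As New XmlDocument()
        ' hypothetical sample data standing in for bookdetails.xml
        objDoc.LoadXml("<BookList><Book><ISBN>0764544020</ISBN></Book>" _
                     & "<Book><ISBN>0764543709</ISBN></Book></BookList>")
        ' single quotes inside the XPath string avoid the Chr(34) concatenation
        Console.WriteLine(RemoveFirstMatch(objDoc, "descendant::Book[ISBN='0764544020']"))
        Console.WriteLine(objDoc.SelectNodes("descendant::Book").Count)
    End Sub
End Module
```

XPath accepts either single or double quotes around a literal, so using single quotes inside a Visual Basic string is often the more readable choice.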
We create a new empty XML document simply by instantiating an XmlDocument (or XmlDataDocument) object. Nodes can then be created and inserted into this document. In the code that follows, we're creating a new XML declaration (the <?xml version="1.0"?> node) and inserting it into the new document with the InsertBefore method:
'create new empty XmlDocument object
Dim objNewDoc As New XmlDocument()
'create a new XmlDeclaration object
Dim objDeclare As XmlDeclaration
objDeclare = objNewDoc.CreateXmlDeclaration("1.0", Nothing, Nothing)
'and add it as the first node in the new document
objDeclare = objNewDoc.InsertBefore(objDeclare, objNewDoc.DocumentElement)
Note | The second and third parameters of the CreateXmlDeclaration method specify the encoding used in the document and the standalone value (in other words, whether the document is self-contained or depends on external markup declarations such as an external DTD). We set both to Nothing, so we'll get neither of these optional attributes in the XML declaration. When the encoding is omitted, an XML parser assumes UTF-8 unless a byte-order mark indicates UTF-16. When standalone is omitted, the parser assumes "no" if the document has external markup declarations; otherwise the value has no effect. |
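To see how these two parameters affect the declaration, you can inspect the OuterXml of the XmlDeclaration node before it is attached to anything. This is a small sketch under our own assumptions: the BuildDeclaration helper is a hypothetical name, and the encoding and standalone values are chosen only for illustration.

```vbnet
Imports System
Imports System.Xml

Module DeclarationDemo
    ' Creates an XML declaration with the given optional values and
    ' returns its markup. Pass Nothing to omit either attribute.
    Function BuildDeclaration(strEncoding As String, strStandalone As String) As String
        Dim objDoc As New XmlDocument()
        Dim objDeclare As XmlDeclaration
        objDeclare = objDoc.CreateXmlDeclaration("1.0", strEncoding, strStandalone)
        Return objDeclare.OuterXml
    End Function

    Sub Main()
        ' neither optional attribute is written
        Console.WriteLine(BuildDeclaration(Nothing, Nothing))
        ' both optional attributes are written
        Console.WriteLine(BuildDeclaration("utf-8", "yes"))
    End Sub
End Module
```

The standalone argument should be "yes", "no", or Nothing; other values are rejected when the declaration is created.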
When the new node is created, a reference to this new node is returned from the CreateXmlDeclaration method. This reference is used as the first parameter to the InsertBefore method. The second parameter is a reference to the node that we want to insert before, and in this case we specify the root of the document.
Notice that DocumentElement does not yet refer to a root element, because the new document doesn't have one. At this point the property returns Nothing, and when the second parameter of InsertBefore is Nothing, the new node is simply added at the end of the (currently empty) list of child nodes. You can think of DocumentElement here as a reference to the placeholder where the root element will reside.
Next we create a new comment node, and insert this into the new document after the XML declaration:
'create a new XmlComment object
Dim objComment As XmlComment
objComment = objNewDoc.CreateComment("New document created " & Now())
'and add it as the second node in the new document
objComment = objNewDoc.InsertAfter(objComment, objDeclare)
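Putting the last two steps together, the following sketch builds the declaration and comment in a fresh document and shows that the document still has no root element afterwards. The BuildProlog name and the comment text are hypothetical; the calls themselves are the same ones used in the example page.

```vbnet
Imports System
Imports System.Xml

Module PrologDemo
    ' Builds a new document containing only an XML declaration and a comment.
    Function BuildProlog() As XmlDocument
        Dim objDoc As New XmlDocument()
        Dim objDeclare As XmlDeclaration
        objDeclare = objDoc.CreateXmlDeclaration("1.0", Nothing, Nothing)
        ' DocumentElement is Nothing here, so InsertBefore appends the node
        objDoc.InsertBefore(objDeclare, objDoc.DocumentElement)
        Dim objComment As XmlComment
        objComment = objDoc.CreateComment("New document created for illustration")
        ' place the comment directly after the declaration
        objDoc.InsertAfter(objComment, objDeclare)
        Return objDoc
    End Function

    Sub Main()
        Dim objDoc As XmlDocument = BuildProlog()
        Console.WriteLine(objDoc.OuterXml)
        Console.WriteLine(objDoc.ChildNodes.Count)           ' declaration and comment
        Console.WriteLine(objDoc.DocumentElement Is Nothing) ' still no root element
    End Sub
End Module
```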
To get some content into the newly created document, our example page imports a node from the existing document loaded from disk at the start of the page. We again use an XPath expression with the SelectSingleNode method to get a reference to the <Book> element that is to be imported.
We then declare a new XmlNode variable to hold the imported node, and call the ImportNode method of the target document to copy the node from the original document. The second parameter to the ImportNode method specifies whether we want a deep copy; in other words, whether we want to import all the content of the node (its child nodes) as well as the node itself.
Once the imported node belongs to the new document, you still have to insert it into the tree, because at this point it is only an unattached fragment. As before, you can use the InsertAfter method, passing the reference you've already got to the new node and the reference created earlier to the comment node, so that the imported node becomes the root element of the new document.
We finish this section by displaying the contents of the new document. We've got a reference to the XmlDocument object that contains it, so we just query the OuterXml property to get the complete content. You can see the new document displayed in the example page.
strXPath = "descendant::Book[ISBN=" & Chr(34) & "0764543709" & Chr(34) & "]"
objNode = objXMLDoc.SelectSingleNode(strXPath)
'create a variable to hold the imported node object
Dim objImportedNode As XmlNode
'import node and all children into new document as unattached fragment
objImportedNode = objNewDoc.ImportNode(objNode, True)
'insert new unattached node into document after the comment node
objNewDoc.InsertAfter(objImportedNode, objComment)
'display the contents of the new document
outResult2.InnerHtml = "Created new XML document and inserted " _
                     & "into it the node selected by<br />" _
                     & "the XPath expression '" & strXPath & "'<br />" _
                     & "Content of new document is:<br />" _
                     & Server.HtmlEncode(objNewDoc.OuterXml)
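To make the effect of the second ImportNode parameter concrete, here is a minimal sketch that imports the same node twice, once as a deep copy and once as a shallow copy. The ImportBook helper, the category attribute, and the sample XML are assumptions made for this illustration only.

```vbnet
Imports System
Imports System.Xml

Module ImportDemo
    ' Imports the first <Book> of a sample document into a new, empty
    ' document and returns the markup of that new document.
    Function ImportBook(blnDeep As Boolean) As String
        Dim objSource As New XmlDocument()
        ' hypothetical sample data; the category attribute is invented
        objSource.LoadXml("<BookList><Book category=""web"">" _
                        & "<ISBN>0764543709</ISBN></Book></BookList>")
        Dim objTarget As New XmlDocument()
        Dim objNode As XmlNode = objSource.SelectSingleNode("descendant::Book")
        ' the copy belongs to objTarget but is not yet part of its tree
        Dim objImported As XmlNode = objTarget.ImportNode(objNode, blnDeep)
        ' attach it; with no other nodes present it becomes the root element
        objTarget.AppendChild(objImported)
        Return objTarget.OuterXml
    End Function

    Sub Main()
        Console.WriteLine(ImportBook(True))   ' element, attribute, and child nodes
        Console.WriteLine(ImportBook(False))  ' element and attribute only
    End Sub
End Module
```

In both cases the source document is left unchanged, because ImportNode copies the node rather than moving it.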
The final part of the example page edits some values in the original document. This time, an XPath expression that will match more than one node is necessary, and so we use the SelectNodes method of the document to return an XmlNodeList object containing references to all the matching nodes (in our example, all the <ISBN> nodes). We can then display the number of matches found.
The plan is to add an attribute to all of the <ISBN> elements, and replace the text content (value) of these elements with two new elements that contain the information in a different form. After declaring some variables that are needed, we can iterate through the collection of <ISBN> nodes using a For Each construct.
strXPath = "descendant::ISBN"
'get a reference to the matching nodes as a collection
Dim colNodeList As XmlNodeList
colNodeList = objXMLDoc.SelectNodes(strXPath)
'display the number of matches found
outResult3.InnerHtml = "Found " & colNodeList.Count _
                     & " nodes matching the " _
                     & "XPath expression '" & strXPath & "'<br />" _
                     & "Editing and inserting new content<br />"
'create variables to hold an XmlAttribute and other values
Dim objAttr As XmlAttribute
Dim strNodeValue, strNewValue, strShortCode As String
'iterate through all the nodes found
For Each objNode In colNodeList
  ...
Within the loop, we first create a new attribute named formatting and set the value to hyphens (all the <ISBN> nodes will have the same value for this attribute). You can add this attribute to the <ISBN> element node by calling the SetAttributeNode method. However, there is a minor hitch: the members of an XmlNodeList are XmlNode objects, which don't have a SetAttributeNode method. In Visual Basic, you can get around this by casting the object to an XmlElement object using the CType (convert type) function.
To change the content of the <ISBN> elements, you only have to set the InnerXml property. This is much easier than using the InsertBefore and InsertAfter methods demonstrated earlier, and provides a valid alternative when the content you want to insert is available as a string (recall that you had references to the element node and its new content node when you used InsertBefore).
Our code extracts the existing ISBN value, creates the new short code from it, formats the existing ISBN with hyphens, and then creates a string containing the new content for the element. The final step is to insert these values into the <ISBN> node by setting its InnerXml property, before going round to do the next one. You can then end the page by writing the complete edited XML document to a disk file and displaying a hyperlink to it so that it can be viewed.
  ...
  'create an XmlAttribute named 'formatting'
  objAttr = objXMLDoc.CreateAttribute("formatting")
  'set the value of the XmlAttribute to 'hyphens'
  objAttr.Value = "hyphens"
  'and add it to this ISBN element - have to cast the object
  'to an XmlElement as XmlNode doesn't have this method
  CType(objNode, XmlElement).SetAttributeNode(objAttr)
  'get text value of this ISBN element
  strNodeValue = objNode.InnerText
  'create short and long strings to replace content
  strShortCode = Right(strNodeValue, 4)
  strNewValue = Left(strNodeValue, 1) & "-" _
              & Mid(strNodeValue, 2, 6) & "-" _
              & Mid(strNodeValue, 8, 2) & "-" _
              & Right(strNodeValue, 1)
  'insert into element by setting the InnerXml property
  objNode.InnerXml = "<LongCode>" & strNewValue _
                   & "</LongCode><ShortCode>" _
                   & strShortCode & "</ShortCode>"
Next
'write the updated document to a disk file
objXMLDoc.Save(strNewPath)
'display a link to view the updated document
outResult3.InnerHtml &= "Saved updated document: <a href=""" _
                      & strNewPath & """>" & strNewPath & "</a>"
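For comparison, the same edit can be made without building a markup string at all. The sketch below is an alternative to the page's approach, not a replacement for it: it uses the SetAttribute(name, value) shortcut of XmlElement and adds the child elements with CreateElement and AppendChild, so any special characters in the values are escaped for you. The FormatIsbn and RewriteIsbn helpers and the sample XML are hypothetical, and the code assumes every ISBN is exactly ten characters long.

```vbnet
Imports System
Imports System.Xml

Module EditIsbnDemo
    ' Splits a 10-character ISBN into 1-6-2-1 groups, as the page's code does.
    Function FormatIsbn(strIsbn As String) As String
        Return strIsbn.Substring(0, 1) & "-" & strIsbn.Substring(1, 6) & "-" _
             & strIsbn.Substring(7, 2) & "-" & strIsbn.Substring(9, 1)
    End Function

    ' Replaces the text of an <ISBN> element with <LongCode> and <ShortCode>
    ' children and marks it with a formatting attribute.
    Sub RewriteIsbn(objIsbn As XmlElement)
        Dim strValue As String = objIsbn.InnerText
        Dim objDoc As XmlDocument = objIsbn.OwnerDocument
        objIsbn.RemoveAll()  ' clears existing child nodes and attributes
        objIsbn.SetAttribute("formatting", "hyphens")
        Dim objLong As XmlElement = objDoc.CreateElement("LongCode")
        objLong.InnerText = FormatIsbn(strValue)
        objIsbn.AppendChild(objLong)
        Dim objShort As XmlElement = objDoc.CreateElement("ShortCode")
        objShort.InnerText = strValue.Substring(strValue.Length - 4)
        objIsbn.AppendChild(objShort)
    End Sub

    Sub Main()
        Dim objDoc As New XmlDocument()
        ' hypothetical sample data standing in for bookdetails.xml
        objDoc.LoadXml("<BookList><Book><ISBN>0764544020</ISBN></Book></BookList>")
        For Each objNode As XmlNode In objDoc.SelectNodes("descendant::ISBN")
            ' the node list holds XmlNode references, so cast to XmlElement
            RewriteIsbn(DirectCast(objNode, XmlElement))
        Next
        Console.WriteLine(objDoc.OuterXml)
    End Sub
End Module
```

Setting InnerXml remains the shorter option when you already have well-formed markup as a string; the node-based approach is worth considering when the values come from user input or another untrusted source.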
If you open both documents, the original and the edited version (Figures 11-21 and 11-22), you can see the effects of our editing process. The first contains the <Book> node with the <ISBN> value 0764544020, which is not present in the second. You can also see the updated <ISBN> elements in the second document:
In this example, we've demonstrated several techniques for working with an XML document using the System.Xml classes provided in .NET. Some of the techniques use the XML DOM methods as defined by W3C, and some are specific extensions available with the XmlDocument (and other) objects. In general, these extensions make common tasks a lot easier. For example, the ability to access the InnerText, InnerXml, and OuterXml of a node makes it remarkably easy to edit or insert content and markup.
We haven't covered all the possibilities for accessing XML documents, as you'll see if you examine the list of properties, methods, and events for each of the relevant objects in the SDK. However, by now, you should have a feel for what is possible, and how easy it is to achieve.