Using XmlDocument


Our handling of XML so far has been forward-only, which is very light on resource usage but isn’t so useful if you need to move around within the XML document. The XmlDocument class is based on the W3C DOM, and it’s the class that you want to use if you need to browse, modify, or create an XML document.

start sidebar
What Is the W3C DOM?

The DOM is a specification for an API that lets programmers manipulate XML held in memory. The DOM specification is language-independent, and bindings are available for many programming languages, including C++. XmlDocument is based on the DOM, with Microsoft extensions.

Because XmlDocument works with XML in memory, it has several advantages and disadvantages when compared with the XmlTextReader forward-only approach.

One advantage is that, in reading the entire document and building a tree in memory, you have access to all the elements and can wander through the document at will. You can also edit the document by changing, adding, or deleting nodes, and you can write the changed document back to disk again. It’s even possible to create an entire XML document from scratch in memory and write it out—serialize it—which is a useful alternative to using XmlTextWriter.

The main disadvantage is that all of an XML document is held in memory at once, so the amount of memory needed by your program is going to be proportional to the size of the XML document you’re working with. Therefore, if you’re working with a very large XML document—or have limited memory—you might not be able to use XmlDocument.

end sidebar

The XmlDocument class has a number of properties, methods, and events, the most important of which are summarized in the following three tables.

Property                Description
Attributes              Gets an XmlAttributeCollection representing the attributes of a node.
ChildNodes              Gets all the child nodes of a node.
DocumentElement         Returns the root element for the document.
DocumentType            Returns the DOCTYPE node, if one is present.
FirstChild, LastChild   Gets the first or last child nodes of a node.
HasChildNodes           Value is true if a node has child nodes.
InnerText               Returns the concatenated values of a node and all its child nodes.
InnerXml                Gets or sets the markup representing the children of the current node.
IsReadOnly              Gets a value indicating whether the current node is read-only.
LocalName               Gets the name of the current node without a namespace prefix.
Name                    Gets the fully qualified name of the current node.
NodeType                Gets the type of the current node. The node type will be one of the XmlNodeType values listed in the table on page 409.
OwnerDocument           Gets the XmlDocument to which the current node belongs.
ParentNode              Gets the parent of a node.
PreserveWhitespace      Determines whether white space should be regarded as significant. The default is false.
Value                   Gets or sets the value of a node.

Method                         Description
AppendChild                    Appends a child node to a node
CloneNode                      Creates a duplicate of the current node
CreateAttribute                Creates an XmlAttribute object
CreateCDataSection             Creates an XmlCDataSection object
CreateComment                  Creates an XmlComment object
CreateDefaultAttribute         Creates a default XmlAttribute object
CreateDocumentType             Creates an XmlDocumentType object
CreateElement                  Creates an XmlElement object
CreateEntityReference          Creates an XmlEntityReference object
CreateNavigator                Creates an XPathNavigator for navigating the object and its contents
CreateNode                     Creates an XmlNode of a specified node type
CreateProcessingInstruction    Creates an XmlProcessingInstruction object
CreateTextNode                 Creates an XmlText object
CreateXmlDeclaration           Creates an XmlDeclaration object
GetElementById                 Returns an XML element with the specified ID attribute
GetElementsByTagName           Gets a list of descendant nodes matching a name
ImportNode                     Imports a node from another document
InsertBefore, InsertAfter      Inserts a node before or after a reference node
Load                           Loads XML from a file, a URL, a stream, or an XmlReader object
LoadXml                        Loads XML from a string
ReadNode                       Creates an XmlNode based on the current position of an XmlReader
RemoveAll                      Removes all child nodes and attributes from a node
RemoveChild, ReplaceChild      Removes or replaces a child node
Save                           Saves the XML document to a file, a stream, or an XmlWriter
SelectNodes, SelectSingleNode  Select one or more nodes matching an XPath expression
WriteContentTo                 Saves all the children of the XmlDocument node to an XmlWriter
WriteTo                        Saves the XmlDocument to an XmlWriter

Event           Description
NodeChanged     Fired when the value of a node has been changed
NodeChanging    Fired when the value of a node is about to be changed
NodeInserted    Fired when a node has been inserted
NodeInserting   Fired when a node is about to be inserted
NodeRemoved     Fired when a node has been removed
NodeRemoving    Fired when a node is about to be removed

The XmlNode Class

You’ll notice a lot of references to nodes in the preceding tables. The DOM tree that an XmlDocument object builds in memory is composed of nodes, each of which is an object of a class that inherits from the abstract XmlNode base class. Just about everything in an XML document is represented by a node. For example:

  • Elements are represented by the XmlElement class.

  • Attributes are represented by the XmlAttribute class.

  • The text content of elements is represented by the XmlText class.

  • Comments are represented by the XmlComment class.

The XmlNode class provides common functionality for all these node types. Because this functionality is so important when working with XmlDocument, I’ve listed the properties and methods of XmlNode in the following two tables.

Property                      Description
Attributes                    Gets the collection of attributes for the node.
ChildNodes                    Gets all the children of the node as an XmlNodeList.
FirstChild, LastChild         Gets a pointer to the first and last children of the node.
HasChildNodes                 Value is true if a node has child nodes.
InnerText                     Represents the concatenated values of the node and all its children.
InnerXml, OuterXml            InnerXml gets or sets the markup representing the children of the node. OuterXml includes the node and its children.
IsReadOnly                    Returns the read-only status of the node.
Item                          Gets a child element by name.
Name, LocalName               The name of the node, with or without namespace information.
NextSibling, PreviousSibling  Gets a pointer to the node immediately following or preceding a node.
NodeType                      Returns an XmlNodeType value representing the type of the node.
OwnerDocument                 Gets a pointer to the XmlDocument that owns this node.
ParentNode                    Gets the node’s parent node.
Prefix                        Gets or sets the namespace prefix for the node.
Value                         Gets or sets the value of the node. What the value represents will depend on the node type.

Method                     Description
AppendChild, PrependChild  Adds a child to the end or beginning of a node’s list of child nodes
Clone, CloneNode           Clones a node
CreateNavigator            Creates an XPathNavigator for navigating the object and its contents
GetEnumerator              Returns an enumerator for the collection of child nodes
InsertAfter, InsertBefore  Inserts a node after or before a specified node
Normalize                  Normalizes the tree so that there are no adjacent XmlText nodes
RemoveAll                  Removes all children and attributes of a node
RemoveChild                Removes a specified child node
ReplaceChild               Replaces a specified child node
SelectNodes                Selects a list of nodes matching an XPath expression
SelectSingleNode           Selects the first node that matches an XPath expression
Supports                   Tests whether the underlying DOM implementation supports a particular feature
WriteContentTo             Saves all children of the current node
WriteTo                    Saves the current node

Perhaps the most important descendant of XmlNode is XmlElement, which represents an element within a document. This class adds a number of methods to XmlNode, most of which are concerned with getting, setting, and removing attributes.
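
To make the relationship between the abstract base class and its descendants more concrete, here is a minimal sketch written in portable standard C++ rather than Managed C++. It is only a model of the idea and not the .NET API: the names Node, Element, Text, NodeKind, and the attribute helper functions are all hypothetical, chosen to echo XmlNode, XmlElement, XmlText, and XmlNodeType.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the DOM node hierarchy; these are NOT the .NET classes.
enum class NodeKind { Element, Text };

// Abstract base class: functionality shared by every kind of node.
class Node {
public:
    virtual ~Node() = default;
    virtual NodeKind kind() const = 0;
    virtual std::string name() const = 0;

    // Takes ownership of the child and adds it to the end of the child list.
    void appendChild(std::unique_ptr<Node> child) {
        assert(child && "child must not be null");
        children_.push_back(std::move(child));
    }
    bool hasChildNodes() const { return !children_.empty(); }
    const std::vector<std::unique_ptr<Node>>& childNodes() const {
        return children_;
    }

private:
    std::vector<std::unique_ptr<Node>> children_;
};

// Text content of an element.
class Text : public Node {
public:
    explicit Text(std::string value) : value_(std::move(value)) {}
    NodeKind kind() const override { return NodeKind::Text; }
    std::string name() const override { return "#text"; }
    const std::string& value() const { return value_; }

private:
    std::string value_;
};

// An element adds attribute handling on top of the shared node behavior.
class Element : public Node {
public:
    explicit Element(std::string tag) : tag_(std::move(tag)) {}
    NodeKind kind() const override { return NodeKind::Element; }
    std::string name() const override { return tag_; }

    void setAttribute(const std::string& key, const std::string& value) {
        attributes_[key] = value;
    }
    bool hasAttribute(const std::string& key) const {
        return attributes_.count(key) != 0;
    }
    // Returns an empty string when the attribute is absent.
    std::string getAttribute(const std::string& key) const {
        auto it = attributes_.find(key);
        return it == attributes_.end() ? std::string() : it->second;
    }
    void removeAttribute(const std::string& key) { attributes_.erase(key); }

private:
    std::string tag_;
    std::map<std::string, std::string> attributes_;
};
```

In this model, code that walks a tree only needs the Node interface, while attribute work requires an Element, which mirrors the split between XmlNode and XmlElement described above.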

The following exercise shows you how to use XmlDocument. You’ll write a program that reads the volcano XML file into memory and then inserts a new element into the structure.

  1. Start a new Visual C++ Console Application (.NET) project named CppDom.

  2. Add the following two lines to the top of CppDom.cpp. The first references the XML DLL, and the second lets you use the members of the System::Xml namespace without qualifying them.

    #using <System.xml.dll>
    using namespace System::Xml;
  3. You’re going to supply the name of the XML document to read when you run the program from the command line, so change the declaration of the _tmain function to include the command-line argument parameters, as shown here:

    int _tmain(int argc, char* argv[])
  4. Add this code to the start of the _tmain function to check the number of arguments and save the path:

    // Check for required arguments
    if (argc < 2)
    {
        Console::WriteLine(S"Usage: CppDom path");
        return -1;
    }

    String* path = new String(argv[1]);
  5. Create a new managed class named XmlBuilder, and give it an XmlDocument* as a data member:

    __gc class XmlBuilder
    {
        XmlDocument* doc;
    };

    You need a managed class because it will be necessary to pass the XmlDocument pointer around between functions. You could pass the pointer explicitly in the argument list of each function, but it’s better to make it a member of a class so that it can be accessed by all the member functions.

  6. Add a constructor that creates an XmlDocument object, and tell it to load the file that was specified on the command line.

    public:
        XmlBuilder(String* path)
        {
            // Create the XmlDocument
            doc = new XmlDocument();

            // Load the data
            doc->Load(path);
            Console::WriteLine(S"Document loaded");
        }

    Unlike XmlTextReader, which reads a file incrementally as you call Read, XmlDocument reads and parses the entire file in a single call to Load. Note that you’re not catching exceptions here. Something might go wrong when opening or parsing the file, but exceptions are left for the caller to handle.

  7. Add some code to the _tmain function to create an XmlBuilder object. Make sure you are prepared to handle any exceptions that occur.

    // Create a Builder and get it to read the file
    try
    {
        XmlBuilder* pf = new XmlBuilder(path);
    }
    catch(Exception* pe)
    {
        Console::WriteLine(pe->Message);
    }

    You can try building and running the code at this point. First copy the volcano.xml and geology.dtd files you created earlier into the project folder. If you see the “Document loaded” message displayed when you run the program, you know that the document has been loaded and parsed.

The next step is to access the nodes in the tree. The current XML document contains three volcano elements; what you’ll do is find the second element and insert a new element before it. There are a number of ways in which you could do this, and I’ll just illustrate one method. It isn’t the most efficient way to do the job, but it does show how to use several XmlDocument and XmlNode methods and properties.

  1. Continue working on the CppDom project. Start working with the tree by getting a pointer to the root element of the document. Because you’ll use this element several times, add an XmlNode* member to the XmlBuilder class to hold it, like this:

    private:
        XmlNode* root;
  2. Add the following code to the constructor to get the root element:

    // Get the root element of the document
    root = doc->DocumentElement;

    DocumentElement returns the root element of the XML document, which in this case is the geology element. Note that this is not the top of the DOM tree: the top is the XmlDocument node itself, one level up, and the root element is just one of its children.

  3. You also need the list of child nodes of the document node itself, which sits at the top of the tree. Because you’ll be using this list again, add an XmlNodeList* member to the class to hold the list.

    private: XmlNodeList* xnl; 
  4. The following code shows how you can get a list of child nodes and iterate over it. Add this code to the constructor:

    // Get the child node list
    xnl = doc->ChildNodes;
    IEnumerator* ie = xnl->GetEnumerator();
    while (ie->MoveNext() == true)
        Console::WriteLine(S"Child: {0}",
            (dynamic_cast<XmlNode*>(ie->Current))->Name);

    The ChildNodes property returns a list of child nodes as an XmlNodeList. The XmlNodeList is a typical .NET collection class, which means that you can get an enumerator to iterate over the nodes. The code iterates over the child nodes, printing the name of each. Note that because Current returns an Object*, it has to be cast to an XmlNode* before you can use the Name property.

  5. The IEnumerator interface is part of the System::Collections namespace, so you need to add the following code near the top of the CppDom.cpp file, after the other using directives:

    using namespace System::Collections;

    If you run this code on the volcanoes.xml file, you should see output similar to the following:

    Document loaded
    Child: xml
    Child: geology
    Child: #comment
    Child: geology

    The document node has four child nodes: the XML declaration, the DOCTYPE declaration, a comment, and the root element.

    Note

    Once you’ve verified the existence of the child nodes, you can remove the lines that declare and use the enumerator because you won’t need them again. Make sure you don’t remove the line that assigns the value to xnl!

  6. Now that you have the list of the document’s child nodes, you need to find the root element of the XML among them. Add a public member function named ProcessChildNodes, as shown here:

    void ProcessChildNodes()
    {
        // Declare an enumerator
        IEnumerator* ie = xnl->GetEnumerator();

        while (ie->MoveNext() == true)
        {
            // Get a pointer to the node
            XmlNode* pNode = dynamic_cast<XmlNode*>(ie->Current);

            // See if it is the root
            if (pNode->NodeType == XmlNodeType::Element &&
                pNode->Name->Equals(S"geology"))
            {
                Console::WriteLine(S" Found the root");
                ProcessRoot(pNode);
            }
        }
    }

    The function creates an enumerator and iterates over the children of the document node. The root XML element is of type XmlNodeType::Element and has the name geology. Once that element has been identified, it is passed to ProcessRoot, which processes the children of the root XML element.

    Here’s the public ProcessRoot member function:

    void ProcessRoot(XmlNode* rootNode)
    {
        XmlNode* pVolc =
            dynamic_cast<XmlNode*>(rootNode->ChildNodes->Item(1));

        // Create a new volcano element
        XmlElement* newVolcano = CreateNewVolcano();

        // Link it in
        root->InsertBefore(newVolcano, pVolc);
    }

    The function is passed the root element. I know that the file I’m working with has more than two volcano elements, and I know that I want to insert a new one before the second element. So, I can get a direct reference to the second element by using the Item method on ChildNodes to access a child node by index. In real code, you’d need a lot more checking to make sure you were retrieving the desired node.

    Once the node has been retrieved, you call CreateNewVolcano to create a new volcano element. Then you use InsertBefore to insert the new one immediately before the node you just retrieved by index.

  7. Now add the public CreateNewVolcano function, which creates a new volcano element. To save space, I haven’t given the code for creating the whole element, but just enough that you can see it working.

    XmlElement* CreateNewVolcano()
    {
        // Create a new element
        XmlElement* newElement = doc->CreateElement(S"volcano");

        // Set the name attribute
        XmlAttribute* pAtt = doc->CreateAttribute(S"name");
        pAtt->Value = S"Mount St. Helens";
        newElement->Attributes->Append(pAtt);

        // Create the location element
        XmlElement* locElement = doc->CreateElement(S"location");
        XmlText* xt = doc->CreateTextNode(S"Washington State, USA");
        locElement->AppendChild(xt);
        newElement->AppendChild(locElement);

        return newElement;
    }

    The function creates a new XmlElement for the volcano. Note that the node classes—XmlElement, XmlComment, and so on—don’t have public constructors, so you need to create them by calling the appropriate factory method. The name attribute gets appended to the element’s collection of attributes, and then the location element is created with its content. Building DOM trees like this is a process of creating new nodes and appending them to one another.

  8. It would be useful to be able to print out the modified tree, so add a public function named PrintTree to the class, as shown here:

    void PrintTree()
    {
        XmlTextWriter* xtw = new XmlTextWriter(Console::Out);
        xtw->Formatting = Formatting::Indented;

        doc->WriteTo(xtw);
        xtw->Flush();
        Console::WriteLine();
    }

    You’ve already seen the use of XmlTextWriter to create XML manually. You can also use it to output XML from a DOM tree, by linking it up to an XmlDocument, as shown in the preceding code.

  9. Add calls to ProcessChildNodes and PrintTree to the _tmain function, and you can build and test the program.

    try
    {
        XmlBuilder* pf = new XmlBuilder(path);
        pf->ProcessChildNodes();
        pf->PrintTree();
    }
    catch(Exception* pe)
    {
        Console::WriteLine(pe->Message);
    }

    When you run the program, you’ll be able to see that the new node has been added to the tree. Remember that this operation has modified only the DOM tree in memory; the original XML file has not been changed.
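
The exercise skips the checking that real code would need before inserting a node by index. The following is a rough sketch of that checking, written in portable standard C++ against a tiny hand-rolled tree rather than the .NET classes. Everything in it is an assumption made for illustration: the Node class, its childAt, insertBefore, and toXml members, and the insertBeforeSecond helper are hypothetical names that only mimic the general behavior of ChildNodes->Item, InsertBefore, and WriteTo.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified element node; NOT the .NET XmlNode class.
class Node {
public:
    explicit Node(std::string name, std::string text = "")
        : name_(std::move(name)), text_(std::move(text)) {}

    const std::string& name() const { return name_; }
    std::size_t childCount() const { return children_.size(); }

    // Checked access by index: returns nullptr instead of reading out of range.
    Node* childAt(std::size_t index) const {
        return index < children_.size() ? children_[index].get() : nullptr;
    }

    Node* appendChild(std::unique_ptr<Node> child) {
        assert(child && "child must not be null");
        children_.push_back(std::move(child));
        return children_.back().get();
    }

    // Inserts newChild immediately before refChild.
    // Throws if refChild is not one of this node's children.
    Node* insertBefore(std::unique_ptr<Node> newChild, const Node* refChild) {
        assert(newChild && "new child must not be null");
        for (auto it = children_.begin(); it != children_.end(); ++it) {
            if (it->get() == refChild) {
                return children_.insert(it, std::move(newChild))->get();
            }
        }
        throw std::invalid_argument("reference node is not a child of this node");
    }

    // Serializes the subtree as compact markup (no escaping, for brevity).
    std::string toXml() const {
        std::string out = "<" + name_ + ">" + text_;
        for (const auto& child : children_) {
            out += child->toXml();
        }
        return out + "</" + name_ + ">";
    }

private:
    std::string name_;
    std::string text_;
    std::vector<std::unique_ptr<Node>> children_;
};

// Inserts newChild before the second child of root, but only if root really
// has a second child with the expected element name.
bool insertBeforeSecond(Node& root, std::unique_ptr<Node> newChild,
                        const std::string& expectedName) {
    Node* second = root.childAt(1);
    if (second == nullptr || second->name() != expectedName) {
        return false;  // Nothing suitable to insert before; tree is unchanged.
    }
    root.insertBefore(std::move(newChild), second);
    return true;
}
```

The point of the design is that the index lookup and the name test happen before any modification, so a document with fewer than two children, or with something other than a volcano element in second position, is left untouched instead of causing a failure partway through.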




Microsoft Visual C++  .NET(c) Step by Step
ISBN: 735615675
EAN: N/A
Year: 2003
Pages: 208
