Parsing XML with Validation


XML documents can be checked for validity in a number of ways, and the XmlValidatingReader lets you validate XML using the three most common standards:

  • DTDs

  • W3C schemas

  • XDR schemas

XmlValidatingReader has the same set of methods and properties as XmlTextReader, with a few additional properties to support validation, which are listed in the following table.

Property          Description
CanResolveEntity  Returns a value indicating whether this reader can resolve entities. XmlValidatingReader always returns true.
Depth             Gets the depth of the current node in the XML document.
EntityHandling    Specifies the type of entity handling: whether to expand all entities (the default) or expand character entities and return general entities as nodes.
Reader            A pointer to the underlying XmlReader.
Schemas           Returns the collection of schemas used for validation.
SchemaType        Gets a schema type object for the element currently being read. This property returns a null reference if it’s called when validation is performed using a DTD.
ValidationType    Specifies the type of validation to perform: None, DTD, Schema, XDR, or Auto. The default is Auto, which determines the type of validation required from data in the file.

XmlValidatingReader adds one method to those supported by XmlTextReader: ReadTypedValue, which returns the value of the current node as the .NET common language runtime (CLR) type that corresponds to its type in the validated XML.

You can create an XmlValidatingReader to parse XML document fragments from a string or a stream, but it’s most common to base the validating reader on an underlying XmlTextReader object.

The following exercise modifies the XmlTextReader program to validate the XML as it’s parsed. To perform validation, you need to have a DTD or a schema to validate against. Here’s a DTD for the volcano XML data, which I’ve stored in a file named geology.dtd:

<!ELEMENT geology (volcano)+>
<!ELEMENT volcano (location,height,type,eruption+,magma,comment?)>
<!ATTLIST volcano name CDATA #IMPLIED>
<!ELEMENT location (#PCDATA)>
<!ELEMENT height EMPTY>
<!ATTLIST height value CDATA #IMPLIED
                 unit CDATA #IMPLIED>
<!ELEMENT type (#PCDATA)>
<!ELEMENT eruption (#PCDATA)>
<!ELEMENT magma (#PCDATA)>
<!ELEMENT comment (#PCDATA)>

Note

I’ve used a DTD for simplicity, but a schema can be used in exactly the same way.

  1. Edit the volcanoes.xml file to add a DOCTYPE reference at the top of the file.

    <?xml version="1.0" ?>
    <!DOCTYPE geology SYSTEM "geology.dtd">
    <!-- Volcano data -->

    If you check the sample XML document against the DTD, you’ll notice that there’s a problem. The element ordering for the second volcano, Hekla, is location-type-height rather than the location-height-type order demanded by the DTD. So, when you parse this XML with validation, you should expect a validation error from the parser.

  2. Add a using directive to the top of CppXmlTextReader.cpp, as shown here:

    using namespace System::Xml::Schema;

    Some of the classes and enumerations used for validation are part of the System::Xml::Schema namespace, and the using directive makes it easier to refer to them in code.

  3. Create an XmlValidatingReader based on the existing XmlTextReader, like this:

    // Create the reader...
    XmlTextReader* rdr = new XmlTextReader(path);
    // Create the validating reader and set the validation type
    XmlValidatingReader* xvr = new XmlValidatingReader(rdr);
    xvr->ValidationType = ValidationType::Auto;

    The constructor for the XmlValidatingReader takes a pointer to the XmlTextReader, which it uses to perform the basic parsing tasks. The last line sets the validation type to Auto, which means that the XmlValidatingReader will decide for itself what type of validation to use, based on the references to DTDs or schemas it finds in the XML document.

    Note

    It’s not really necessary to set the ValidationType in this case because Auto is the default, but I included it to show you how to control the validation.

  4. Edit all the code that parses the XML to use the XmlValidatingReader xvr rather than the XmlTextReader rdr, as follows:

    // Read nodes from the XmlValidatingReader
    while (xvr->Read())
    {
        switch (xvr->NodeType)
        {
        case XmlNodeType::XmlDeclaration:
            Console::WriteLine(S"-> XML declaration");
            break;
        case XmlNodeType::Document:
            Console::WriteLine(S"-> Document node");
            break;
        case XmlNodeType::Element:
            Console::WriteLine(S"-> Element node, name={0}", xvr->Name);
            break;
        case XmlNodeType::EndElement:
            Console::WriteLine(S"-> End element node, name={0}", xvr->Name);
            break;
        case XmlNodeType::Text:
            Console::WriteLine(S"-> Text node, value={0}", xvr->Value);
            break;
        case XmlNodeType::Comment:
            Console::WriteLine(S"-> Comment node, name={0}, value={1}",
                xvr->Name, xvr->Value);
            break;
        case XmlNodeType::Whitespace:
            break;
        default:
            Console::WriteLine(S"** Unknown node type");
            break;
        }
    }

    Because XmlValidatingReader provides a superset of the XmlTextReader functionality, it’s a simple matter to swap between the two.

  5. If you now build and run the program, it should throw an exception when it finds the invalid element ordering in the document. The output looks like this, followed by several more lines of stack trace information:

    System.Xml.Schema.XmlSchemaException: Element 'volcano' has invalid child element 'type'. Expected 'height'. An error occurred at file:///C:/XMLFiles/volcanoes.xml(14, 6).
       at System.Xml.XmlValidatingReader.InternalValidationCallback(Object sender, ValidationEventArgs e)

    The error message gives the line and character position where the parser found the problem, which in this case is line 14, character 6. If you typed in the XML file using different formatting, you might get different line and character numbers. By default, the parser throws an exception when it finds a validation error, and if you don’t handle it, the program terminates.

    You can improve on this error handling by installing an event handler. The parser raises a validation event (exposed on the reader as the ValidationEventHandler event) whenever it finds something to report to you, and if you install a handler for this event, you’ll be able to handle the validation errors yourself and take appropriate action.

  6. Event handler functions must be members of a managed class, so create a new class to host a static handler function. Add this code before the _tmain function:

    // Validation handler class
    __gc class ValHandler
    {
    public:
        static void ValidationHandler(Object* pSender, ValidationEventArgs* pe)
        {
            Console::WriteLine(S"Validation Event: {0}", pe->Message);
        }
    };

    The ValHandler class contains one static member, which is the handler for a ValidationEvent. As usual, the handler has two arguments: a pointer to the object that fired the event, and an argument object. In this case, the handler is passed a ValidationEventArgs object that contains details about the parser validation error. This sample code isn’t doing anything except printing the error message, but in practice, you’d decide what action to take based on the Severity property of the ValidationEventArgs object.

  7. Link up the handler to the XmlValidatingReader in the usual way:

    XmlValidatingReader* xvr = new XmlValidatingReader(rdr);
    xvr->ValidationType = ValidationType::Auto;
    // Set the handler
    xvr->ValidationEventHandler +=
        new ValidationEventHandler(0, &ValHandler::ValidationHandler);

    Make sure that you set up the handler before you call Read to start parsing the XML.

  8. Build and run the program. This time, you won’t get the exception message and stack trace, but you will see the messages printed out from the event handler as it finds validation problems.

  9. Correct the ordering of the elements in the XML file, and run the program again. You shouldn’t see any validation messages this time through.
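The ordering rule that trips up the Hekla entry can be illustrated outside the .NET Framework. The following is a minimal sketch in standard C++, not the code that XmlValidatingReader itself runs. It assumes the volcano content model from geology.dtd can be encoded as a regular expression over the child element names. A mismatch is reported through a caller-supplied handler or, when no handler is installed, by throwing, which loosely mirrors the default behavior described in step 5. The names JoinNames, MatchesVolcanoModel, ReportIfInvalid, and ValidationHandler are hypothetical.

```cpp
#include <functional>
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical callback type standing in for a validation event handler.
using ValidationHandler = std::function<void(const std::string&)>;

// Joins child element names into one string, each name followed by a
// space, so that the regular expression below matches whole names only.
std::string JoinNames(const std::vector<std::string>& children)
{
    std::string joined;
    for (const std::string& name : children)
    {
        joined += name;
        joined += ' ';
    }
    return joined;
}

// Returns true if the children follow the DTD content model
// (location,height,type,eruption+,magma,comment?).
bool MatchesVolcanoModel(const std::vector<std::string>& children)
{
    static const std::regex model(
        "location height type (eruption )+magma (comment )?");
    return std::regex_match(JoinNames(children), model);
}

// Reports an invalid child sequence. With a handler installed, the
// handler receives the message and processing continues; without one,
// the function throws, so an unhandled error ends the program.
void ReportIfInvalid(const std::string& volcanoName,
                     const std::vector<std::string>& children,
                     const ValidationHandler& handler)
{
    if (MatchesVolcanoModel(children))
        return;

    const std::string message = "Element 'volcano' (" + volcanoName +
        ") has child elements in an order the DTD does not allow.";
    if (handler)
        handler(message);
    else
        throw std::runtime_error(message);
}
```

Passing the names location, height, type, eruption, magma to MatchesVolcanoModel satisfies the model, whereas the location, type, height ordering from step 1 does not. Apart from that ordering, the child lists used here are assumed rather than taken from volcanoes.xml. A regular expression is enough for this one flat content model; a real validating parser builds its checks from the DTD instead of hard-coding them.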




Microsoft Visual C++  .NET(c) Step by Step
ISBN: 735615675
EAN: N/A
Year: 2003
Pages: 208