Chapter 3: Parsing XML Documents

Using XML to solve integration and implementation problems takes more than defining a data model and creating instance documents. What will you do with these documents after you create them? How will you read them into your application or database for processing? Part of using XML as a solution involves an XML processor, which is the application responsible for processing these documents. One function of this processor, and the focus of this chapter, is parsing documents. The parser reads the XML document and verifies it, either by checking it for well-formedness or by validating it against a schema. If these tasks succeed, the data contained within the document is exposed in a form that makes it available for other manipulations.
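As a rough illustration of the well-formedness check, here is a minimal sketch. It uses Python's standard `xml.etree.ElementTree` module rather than the .NET classes this chapter goes on to introduce, and the helper name `is_well_formed` and the sample documents are made up for the example. This parser is non-validating: it reports markup errors such as an unclosed tag but does not check a document against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message).

    Hypothetical helper for illustration; it checks well-formedness only,
    not validity against a DTD or schema.
    """
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


# Illustrative documents (not from the book).
good = "<order><item>Widget</item></order>"
bad = "<order><item>Widget</order>"  # <item> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

The first call reports success; the second returns `False` together with the parser's description of where the markup broke.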

Parsing is not always as simple as reading through an XML document and checking that it is readable text. If yours is a validating parser, it can also check each instance document against the structure and rules of its governing DTD. You need the parser to evaluate the instance document, determine whether it is valid, and then make the data it contains available to other applications.

In this chapter we'll review what an XML parser does, examine the different models used to process XML documents, and introduce how XML can be manipulated using objects within the .NET Framework. By the end of the chapter you should have a good understanding of when to use which processing model, as well as which parser will best fit your needs.

So, what is parsing? According to http://www.dictionary.com, it's defined as follows:

  1. To break (a sentence) down into its component parts of speech with an explanation of the form, function, and syntactical relationship of each part
  2. To describe (a word) by stating its part of speech, form, and syntactical relationships in a sentence
  3. To examine closely or subject to detailed analysis, especially by breaking up into components: "What are we missing by parsing the behavior of chimpanzees into the conventional categories recognized largely from our own behavior?" (Stephen Jay Gould)
  4. To make sense of; comprehend: "I simply couldn't parse what you just said."
  5. Computer Science. To analyze or separate (input, for example) into more easily processed components

Parsing is an essential task for any application that uses language-based data or code as input. XML processors, which rely heavily on parsers, provide a standard mechanism for navigating and manipulating XML documents. If you have an XML document and need to get data out of it, change the data, or modify the XML document structure, you don't need to write code to load the XML file, validate it for specific characters and elements, and process this information accordingly. You can use an XML parser instead, which will load the document and give you access to its contents in the form of objects.
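To make the idea of "access to its contents in the form of objects" concrete, the following is a minimal sketch using Python's standard `xml.dom.minidom` module. It is not the .NET API this chapter introduces, and the `catalog` document, its element names, and the variable names are assumptions invented for the example. It shows the three tasks mentioned above: getting data out, changing data, and modifying the document structure.

```python
from xml.dom import minidom

# Illustrative document (not from the book).
xml_text = """<catalog>
  <book id="b1"><title>First Title</title></book>
  <book id="b2"><title>Second Title</title></book>
</catalog>"""

# The parser loads the text and returns a tree of node objects.
doc = minidom.parseString(xml_text)

# Get data out: map each book's id attribute to its title text.
titles = {}
for book in doc.getElementsByTagName("book"):
    title_node = book.getElementsByTagName("title")[0]
    titles[book.getAttribute("id")] = title_node.firstChild.data

# Change data: overwrite the text node inside the first <title>.
first_title = doc.getElementsByTagName("title")[0]
first_title.firstChild.data = "Revised Title"

# Modify structure: append a new, empty <book> element to the root.
new_book = doc.createElement("book")
new_book.setAttribute("id", "b3")
doc.documentElement.appendChild(new_book)

book_count = len(doc.getElementsByTagName("book"))
print(titles)
print(book_count)
```

Nothing in the sketch scans characters or matches tags by hand; the parser does that work and the application deals only with element, attribute, and text objects.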



XML Programming Bible
ISBN: 0764538292
Year: 2002
Pages: 134
