XPath and XML Infosets
An XML infoset is an abstract description of the information an XML document holds. Reducing an XML document to its infoset is meant to make it easier to compare all kinds of XML documents, because the data in each one is presented in a standard way. You can find the official XML Information Set specification at www.w3.org/TR/xml-infoset.
To understand what infosets are and what they're used for, imagine searching for data on the World Wide Web. You may want to search for a particular topic, such as XML, and you'd turn up millions of matches. How could you possibly write software to compare those documents? The data in those documents isn't stored in any way that's directly comparable.
That's where infosets come in. The idea is to set up an abstract, regular way of looking at the data in an XML document, so that the document can be compared with others and, ultimately, so that you can work with thousands of such documents.
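To make the comparison idea concrete, here is a minimal sketch in Python. It does not build an infoset; it uses Canonical XML (a separate W3C specification, available in the standard library as xml.etree.ElementTree.canonicalize since Python 3.8) as a stand-in, because it shows the same point: two documents that differ only in surface syntax can be reduced to a common form and then compared. The sample documents and the same_information helper are made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Two serializations that differ only in surface syntax:
# attribute order, quote style, and the spelling of the empty element.
doc_a = '<book id="1" lang="en"><title>XPath</title><note/></book>'
doc_b = "<book lang='en'   id='1'><title>XPath</title><note></note></book>"

# A document whose actual content differs.
doc_c = '<book id="1" lang="en"><title>XSLT</title><note/></book>'


def same_information(left, right):
    """Hypothetical helper: compare two XML strings after canonicalization."""
    return canonicalize(left) == canonicalize(right)


print(same_information(doc_a, doc_b))
print(same_information(doc_a, doc_c))
```

A plain string comparison of doc_a and doc_b would report a difference even though a parser sees the same elements, attributes, and text in both.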
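XML infosets have their own data model, which is not the same as the XPath data model. As published in the W3C Recommendation, an XML infoset can contain 11 different types of information items: document, element, attribute, processing instruction, unexpanded entity reference, character, comment, document type declaration, unparsed entity, notation, and namespace.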
Each of these information items has a set of properties, which contain more information. For example, the document information item has properties that let you access the children of the root node.
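The Infoset specification names two of the document information item's properties [children] and [document element]. Python's standard library doesn't expose an infoset directly, so the sketch below uses the DOM (xml.dom.minidom) as a rough analogy only: childNodes plays the role of [children] and documentElement the role of [document element]. The sample document and the describe_document helper are assumptions made for illustration.

```python
from xml.dom import minidom

# A small document with a comment before the document element.
SAMPLE = "<!-- catalog --><library><book/><book/></library>"


def describe_document(xml_text):
    """Hypothetical helper: report DOM analogs of two document-item properties."""
    doc = minidom.parseString(xml_text)
    return {
        # Analog of [children]: everything directly under the document.
        "children": [node.nodeName for node in doc.childNodes],
        # Analog of [document element]: the single outermost element.
        "document_element": doc.documentElement.tagName,
    }


summary = describe_document(SAMPLE)
print(summary)
```

Note that the top-level children include the comment as well as the outermost element, which is why the two properties are kept separate.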
Over time, several XML standards have developed their own data models, and the W3C is trying to get them all reconciled. You won't need to know about infosets to use this book, but if you're already familiar with them, it's useful to know how the nodes in the XPath data model can be derived from the information items provided by an XML infoset. Here's how that works:
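As a rough sketch of that derivation, the Python below paraphrases the non-normative infoset mapping appendix of the XPath 1.0 recommendation: most XPath node types come from the information item of the same name, while a text node comes from a run of consecutive character information items. The table, the (kind, value) tuple representation, and the to_xpath_children function are all hypothetical names chosen for illustration, not part of either specification.

```python
from itertools import groupby

# XPath 1.0 node type -> the kind of information item it is derived from.
XPATH_NODE_SOURCE = {
    "root": "document",
    "element": "element",
    "attribute": "attribute",
    "text": "character",  # one text node per run of character items
    "processing-instruction": "processing instruction",
    "comment": "comment",
    "namespace": "namespace",
}

# Item kinds that can appear among an element's children -> XPath node type.
CHILD_ITEM_TO_NODE = {
    "element": "element",
    "character": "text",
    "comment": "comment",
    "processing instruction": "processing-instruction",
}


def to_xpath_children(items):
    """Turn a list of (kind, value) information items into XPath-style nodes.

    Consecutive character items are merged into a single text node;
    every other item becomes one node of the corresponding type.
    """
    nodes = []
    for kind, run in groupby(items, key=lambda item: item[0]):
        run = list(run)
        if kind == "character":
            nodes.append(("text", "".join(value for _, value in run)))
        else:
            nodes.extend((CHILD_ITEM_TO_NODE[kind], value) for _, value in run)
    return nodes


children = to_xpath_children([
    ("character", "H"), ("character", "i"),
    ("element", "b"),
    ("character", "!"),
    ("comment", "done"),
])
print(children)
```

Merging character items matters because the XPath data model never has two text nodes next to each other, whereas an infoset records each character separately.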
In fact, one of the tasks of XPath 2.0 was to reconcile the data models used in XPath and the XML Infoset specifications, and we'll discuss that later in Chapter 7.