Why XPath?

The main purpose of XPath is to make it easy to work with the data in an XML document. XPath lets you address specific parts of XML documents.

Let's begin with an example that shows why you'd use XPath. Say that you have an XML document, ch01_01.xml, that stores information about various planets in three <planet> elements, as you see in Listing 1.1.

Listing 1.1 A Sample XML Document ( `ch01_01.xml` )

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="ch01_02.xsl"?>
    <planets>
        <planet>
            <name>Mercury</name>
            <mass units="(Earth = 1)">.0553</mass>
            <day units="days">58.65</day>
            <radius units="miles">1516</radius>
            <density units="(Earth = 1)">.983</density>
            <distance units="million miles">43.4</distance><!--At perihelion-->
        </planet>
        <planet>
            <name>Venus</name>
            <mass units="(Earth = 1)">.815</mass>
            <day units="days">116.75</day>
            <radius units="miles">3716</radius>
            <density units="(Earth = 1)">.943</density>
            <distance units="million miles">66.8</distance><!--At perihelion-->
        </planet>
        <planet>
            <name>Earth</name>
            <mass units="(Earth = 1)">1</mass>
            <day units="days">1</day>
            <radius units="miles">2107</radius>
            <density units="(Earth = 1)">1</density>
            <distance units="million miles">128.4</distance><!--At perihelion-->
        </planet>
    </planets>

Now say that you want to extract the names of the three planets here (Mercury, Venus, and Earth) from this XML document. Each of these names is buried deep in the document, stored as text in the <name> elements. How can you access them?

This is where XPath comes in. To do what it does, XPath uses a non-XML syntax that the creators of XPath, the World Wide Web Consortium (W3C), call "compact." As we'll see, if you don't know XPath, that syntax can be more than compact: it can be impenetrable. But when you know XPath, you'll be able to access any part of any XML document.

XPath doesn't work by itself; it was meant to be embedded in other languages and applications. XPath was originally developed for use with Extensible Stylesheet Language Transformations (XSLT), and in fact, XSLT can do the work we want (recovering the data we're after) using XPath. XPath points out the data to use, and XSLT actually grabs and uses that data, so they're natural to use together.
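To make the idea of embedding concrete, here is a minimal sketch that uses Python as the host language instead of XSLT. It relies on the standard library's `xml.etree.ElementTree`, which supports only a limited subset of XPath, and on an abbreviated copy of Listing 1.1; the names `PLANETS_XML` and `planet_names` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of ch01_01.xml: only the <name> elements are kept.
PLANETS_XML = """<planets>
    <planet>
        <name>Mercury</name>
    </planet>
    <planet>
        <name>Venus</name>
    </planet>
    <planet>
        <name>Earth</name>
    </planet>
</planets>"""


def planet_names(xml_text):
    """Return the text of every <name> element found under a <planet>."""
    root = ET.fromstring(xml_text)  # root is the <planets> element
    # The path is evaluated relative to <planets>:
    # each <planet> child, then each <name> child of that planet.
    return [name.text for name in root.findall("./planet/name")]


print(planet_names(PLANETS_XML))  # prints ['Mercury', 'Venus', 'Earth']
```

The path expression does the pointing, and the host language (Python here, XSLT in the rest of this chapter) does something with what was found.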

Because XPath is used so often with XSLT, we'll get an introduction to XSLT later in this chapter, and we'll see a more in-depth treatment of XSLT with XPath in Chapter 5. For now, what's important to know is that the way you select the data you want in an XML document is by using an XPath expression, and that you can put such expressions to work in XSLT stylesheets. Using an XSLT processor, you can apply XSLT stylesheets to XML documents and so access the data you want.

To get to the data we want in ch01_01.xml, we'll start in the XSLT stylesheet by matching the <planets> element with the XPath expression planets. Then we'll match each <planet> element inside the <planets> element with the XPath expression planet, and finally, we'll extract the name of each planet from that <planet> element's <name> element, using the XPath expression name. You can see what this looks like in the XSLT stylesheet in Listing 1.2 (don't worry about the XSLT details at this point; we'll see how to construct XSLT stylesheets like this one later in the chapter and in depth in Chapter 5).
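The three steps just described (planets, then planet, then name) can also be traced outside XSLT. The sketch below is not how an XSLT processor works internally; it only mimics the same sequence of selections with Python's `xml.etree.ElementTree`, on an abbreviated copy of Listing 1.1. The function name `render_names` is hypothetical, and the output omits the whitespace a real XSLT processor would typically carry over.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of ch01_01.xml: only the <name> elements are kept.
PLANETS_XML = """<planets>
    <planet>
        <name>Mercury</name>
    </planet>
    <planet>
        <name>Venus</name>
    </planet>
    <planet>
        <name>Earth</name>
    </planet>
</planets>"""


def render_names(xml_text):
    """Mimic the stylesheet's steps and return a small HTML string."""
    root = ET.fromstring(xml_text)

    # Step 1: "planets" must match the document element.
    if root.tag != "planets":
        raise ValueError("expected a <planets> document element")

    paragraphs = []
    # Step 2: "planet" selects each <planet> child of <planets>.
    for planet in root.findall("planet"):
        # Step 3: "name" selects the <name> child; take its text.
        name = planet.findtext("name", default="")
        paragraphs.append("<P>" + name + "</P>")

    return "<HTML>" + "".join(paragraphs) + "</HTML>"


print(render_names(PLANETS_XML))
# prints <HTML><P>Mercury</P><P>Venus</P><P>Earth</P></HTML>
```

Each expression is evaluated relative to the element selected by the step before it, which is why the short relative paths planet and name are enough.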

Listing 1.2 A Sample XSLT Document ( `ch01_02.xsl` )

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:template match="planets">
            <HTML>
                <xsl:apply-templates/>
            </HTML>
        </xsl:template>
        <xsl:template match="planet">
            <P>
                <xsl:value-of select="name"/>
            </P>
        </xsl:template>
    </xsl:stylesheet>

Open this example (that is, navigate to ch01_01.xml in Microsoft Internet Explorer) to see it at work. You can see the results in Figure 1.1.

Figure 1.1. Performing an XSL transformation in Internet Explorer.


As you see in the figure, the XSLT processor in Internet Explorer has used our XSLT stylesheet, which uses XPath, to retrieve the data we wanted from our XML document: the names of the three planets. Even though the details may not be clear yet, this example shows the part that XPath plays in letting you access specific parts of XML documents. XPath is central to XSLT because it gives you the ability to select the data you want in an XML document.
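As a small preview of what "specific parts" can mean, the sketch below picks out one planet's <mass> element and its units attribute. It uses a predicate in square brackets, a piece of XPath syntax not yet introduced at this point, and again leans on the limited XPath subset in Python's `xml.etree.ElementTree`. The helper name `mass_of` is hypothetical, and the data is an abbreviated copy of Listing 1.1.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of ch01_01.xml: only <name> and <mass> are kept.
PLANETS_XML = """<planets>
    <planet>
        <name>Mercury</name>
        <mass units="(Earth = 1)">.0553</mass>
    </planet>
    <planet>
        <name>Venus</name>
        <mass units="(Earth = 1)">.815</mass>
    </planet>
    <planet>
        <name>Earth</name>
        <mass units="(Earth = 1)">1</mass>
    </planet>
</planets>"""


def mass_of(xml_text, planet_name):
    """Return (mass text, units) for the named planet, or None if absent.

    planet_name is pasted into the path as-is, so this sketch assumes
    it contains no quote characters.
    """
    root = ET.fromstring(xml_text)
    # Select the <planet> whose <name> child has the given text,
    # then step down to that planet's <mass> child.
    mass = root.find("planet[name='" + planet_name + "']/mass")
    if mass is None:
        return None
    return mass.text, mass.get("units")


print(mass_of(PLANETS_XML, "Venus"))  # prints ('.815', '(Earth = 1)')
```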

This example has given us a quick look at the kind of thing that XPath can do for you. Now it's time to get more systematic and take a look at the whole XPath picture in overview.
