Using XPath with PHP


Extensible Stylesheet Language (XSL) converts an XML document to another data format. Different XSL transformations can use the same XML document to generate different output, such as an HTML Web page.

Accessing XPath Using DOM

PHP can parse an XML document with the Document Object Model (DOM), a tree-based approach that lets you create and manipulate the hierarchical structure of the document. DOM is language neutral: implementations exist for PHP, Java, Visual Basic (VB), and other languages. DOM represents a document as a tree hierarchy of objects, and with the PHP DOM implementation you can query that tree using XPath.

Objects represent the different structures that occur within an XML document. For example, Element objects represent elements and Attr objects represent attributes. Each object has standard properties and methods that navigate the object tree and access specific elements, attributes, or character data. The XPath processor can navigate a document tree using the parent-child relationship that exists between tree nodes. Node properties expose the information you need from the document tree.
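The parent-child navigation described above can be sketched as follows. This is a minimal illustration that assumes the DOM extension bundled with PHP 5 and later (DOMDocument, DOMNode) and PHP 7 type declarations, not the PHP 4 functions used in the listings of this section. The sample XML and the describeChildren() helper are made up for the example.

```php
<?php
// Sketch: walking a DOM tree through parent-child links.
// describeChildren() and the sample XML are illustrative assumptions.

function describeChildren(DOMNode $parent): array
{
    $lines = [];
    foreach ($parent->childNodes as $child) {
        // Only element nodes are of interest here; skip text and comments.
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        // nodeName and parentNode are standard node properties.
        $line = $child->nodeName . ' (parent: ' . $child->parentNode->nodeName . ')';
        // Attributes are Attr nodes attached to the element.
        foreach ($child->attributes as $attr) {
            $line .= ' @' . $attr->nodeName . '=' . $attr->nodeValue;
        }
        // textContent concatenates the character data below the element.
        $line .= ': ' . trim($child->textContent);
        $lines[] = $line;
    }
    return $lines;
}

$doc = new DOMDocument();
$doc->loadXML('<book id="b1"><title lang="en">C Programming</title><price>900</price></book>');

foreach (describeChildren($doc->documentElement) as $line) {
    echo $line, "\n";
}
```

For the sample document, the loop prints one line for the title element, including its lang attribute, and one line for the price element.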

PHP provides XPath classes that make the DOM parser flexible. The XPath classes build a collection of nodes that match the criterion specified in an XPath expression. The XPath classes available in PHP are XPathContext and XPathObject.

The XPathContext class sets up a context node for all XPath evaluations. You create an XPathContext object by calling the xpath_new_context() function, which must be passed a reference to a DOM document object. XPath evaluations return instances of the XPathObject class.

The xpath_eval() method of the XPathContext class accepts an XPath expression as its argument and returns an XPathObject instance that contains the nodeset matching that expression. Consequently, to query an XML document with XPath, PHP must first parse the whole document into a DOM tree.

For example, the XML document, books.xml, lists information about books published by a publisher.

Listing 5-6 shows the content of the books.xml document:

Listing 5-6: The books.xml Document
start example
<?xml version="1.0"?>
<books>
   <book>
      <title>Introduction to computers</title>
      <publisher>Global Education Ltd.</publisher>
      <author>John Mitchell</author>
      <price>500</price>
      <publishedYear>1993</publishedYear>
   </book>
   <book>
      <title>C Programming</title>
      <publisher>Global Education Ltd.</publisher>
      <author>Taub Schiling</author>
      <price>900</price>
      <publishedYear>1996</publishedYear>
   </book>
   <book>
      <title>Operating Systems</title>
      <publisher>Global Education Ltd.</publisher>
      <author>David Kennedy</author>
      <price>1800</price>
      <publishedYear>1993</publishedYear>
   </book>
</books>
end example
 

In the above listing, the books.xml document contains information about books published by Global Education Ltd., such as the title, author, price, and year of publication.

For example, suppose you need to find the titles and authors of the books published by Global Education Ltd. in 1993. You can execute this XPath query using the PHP document, books.php, as shown in Listing 5-7:

Listing 5-7: Executing the XPath Query Using DOM Implementation in PHP
start example
<?php
// Parse books.xml into a DOM tree and create an XPath context for it.
$doc = xmldocfile("books.xml");
$xpath = $doc->xpath_new_context();

// Select every book published by Global Education Ltd. in 1993.
$output = $xpath->xpath_eval("/books/book[normalize-space(publisher/text())='Global Education Ltd.' and normalize-space(publishedYear/text())='1993']");
$nodeset = $output->nodeset;

foreach ($nodeset as $node)
{
   // Walk the child elements of each matching <book>.
   foreach ($node->child_nodes() as $children)
   {
      if ($children->node_type() == XML_ELEMENT_NODE)
      {
         if ($children->tagname() == "title")
         {
            print("Title:");
            // Print the text nodes inside <title>.
            foreach ($children->child_nodes() as $subcontent)
            {
               if ($subcontent->node_type() == XML_TEXT_NODE)
               {
                  print($subcontent->content);
               }
            }
            print("::");
         }
         if ($children->tagname() == "author")
         {
            print("Author:");
            // Print the text nodes inside <author>.
            foreach ($children->child_nodes() as $subcontent)
            {
               if ($subcontent->node_type() == XML_TEXT_NODE)
               {
                  print($subcontent->content);
               }
            }
            print("<br />");
         }
      }
   }
}
?>
end example
 

In the above listing, the PHP code, books.php, selects the title and author name of all books published by Global Education Ltd. in 1993 from the books.xml document. The xpath_eval() method evaluates the XPath expression provided as its argument.
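Listing 5-7 relies on the DOM XML functions of PHP 4. For comparison, the same query can be sketched with the DOM extension that ships with PHP 5 and later, where DOMDocument and DOMXPath take the roles of the document object and the XPath context. The findBooks() helper is a hypothetical name, and the XML is embedded as a string only to keep the sketch self-contained.

```php
<?php
// Sketch: the query from Listing 5-7 using DOMDocument and DOMXPath.
// findBooks() is a hypothetical helper, not part of any library.

function findBooks(string $xml, string $publisher, string $year): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    // Same predicate as the listing. The values are inserted directly,
    // so this sketch assumes they contain no single quotes.
    $query = sprintf(
        "/books/book[normalize-space(publisher)='%s' and normalize-space(publishedYear)='%s']",
        $publisher,
        $year
    );

    $results = [];
    foreach ($xpath->query($query) as $book) {
        // Relative expressions are evaluated with the matched <book> as context node.
        $results[] = [
            'title'  => $xpath->evaluate('string(title)', $book),
            'author' => $xpath->evaluate('string(author)', $book),
        ];
    }
    return $results;
}

$xml = <<<'XML'
<?xml version="1.0"?>
<books>
  <book><title>Introduction to computers</title><publisher>Global Education Ltd.</publisher><author>John Mitchell</author><price>500</price><publishedYear>1993</publishedYear></book>
  <book><title>C Programming</title><publisher>Global Education Ltd.</publisher><author>Taub Schiling</author><price>900</price><publishedYear>1996</publishedYear></book>
  <book><title>Operating Systems</title><publisher>Global Education Ltd.</publisher><author>David Kennedy</author><price>1800</price><publishedYear>1993</publishedYear></book>
</books>
XML;

$books = findBooks($xml, 'Global Education Ltd.', '1993');
foreach ($books as $book) {
    echo 'Title:', $book['title'], '::Author:', $book['author'], "\n";
}
```

Because the relative expressions string(title) and string(author) are evaluated against each matched book, the nested loops over child nodes in the listing are not needed in this variant.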

Figure 5-3 shows the output of the PHP code in the books.php document:

This figure shows the titles and author names of the two books published by Global Education Ltd. in 1993.
Figure 5-3: Output of the Code in the books.php Document
Note  

The PHP XPath implementation is slow and resource intensive for large documents.

Accessing XPath Using SAX

SAX is an event-driven model for parsing an XML document. When the SAX parser encounters an XML construct, such as a tag, it generates an event. These events are passed to event handlers, which, in turn, provide access to the content of a document. You can parse an XML document using XPath expressions in PHP's SAX implementation.

You can set handlers for XML location paths using the XML parsing class, path parser (class_path_parser.php). For each location path, you can register a PHP function that the parser calls whenever it finds an element matching that path. When the parser detects a matching element, it passes the element name, attributes, and content to the function. A single function can handle multiple location paths.
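To make the idea of path-based handlers concrete, the following sketch builds a very small version of the mechanism on top of PHP's built-in event-driven XML parser functions (xml_parser_create() and related calls). It is not the Path_parser class, and its behavior may differ from that class. SimplePathParser, setHandler(), and parse() are hypothetical names, and only exact absolute paths are matched.

```php
<?php
// Sketch: calling a handler when a SAX-style parser closes an element
// whose absolute path matches a registered location path.
// SimplePathParser is a hypothetical class, not the Path_parser class.

class SimplePathParser
{
    private $handlers = [];  // location path => callable
    private $stack = [];     // names of the currently open elements
    private $pending = [];   // attributes and text collected per open element

    public function setHandler(string $path, callable $handler): void
    {
        $this->handlers[$path] = $handler;
    }

    public function parse(string $xml): bool
    {
        $parser = xml_parser_create();
        // Case folding is on by default and would uppercase names; turn it off.
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
        xml_set_element_handler($parser, [$this, 'startElement'], [$this, 'endElement']);
        xml_set_character_data_handler($parser, [$this, 'characterData']);
        // xml_parse() returns 1 on success and 0 on failure.
        return xml_parse($parser, $xml, true) === 1;
    }

    public function startElement($parser, string $name, array $attribs): void
    {
        $this->stack[] = $name;
        $this->pending[] = ['attribs' => $attribs, 'content' => ''];
    }

    public function characterData($parser, string $data): void
    {
        // Text belongs to the innermost open element.
        if ($this->pending) {
            $this->pending[count($this->pending) - 1]['content'] .= $data;
        }
    }

    public function endElement($parser, string $name): void
    {
        // Build the absolute path of the element that is closing.
        $path = '/' . implode('/', $this->stack);
        $element = array_pop($this->pending);
        array_pop($this->stack);
        if (isset($this->handlers[$path])) {
            call_user_func($this->handlers[$path], $name, $element['attribs'], $element['content']);
        }
    }
}

$names = [];
$parser = new SimplePathParser();
$parser->setHandler('/employees/employee', function ($name, $attribs, $content) use (&$names) {
    $names[] = $attribs['name'];
});

$ok = $parser->parse('<employees><employee name="Peter"/><employee name="Julie"/></employees>');
echo $ok ? implode(', ', $names) : 'parse error', "\n";
```

The handler receives the same three kinds of values described above: the element name, its attributes, and its text content. Unlike the DOM approach, no tree of the whole document is kept in memory; only the stack of open elements is tracked.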

For example, the XML document, employees.xml, contains information pertaining to employees in a company.

Listing 5-8 shows the content of the employees.xml document:

Listing 5-8: The employees.xml Document
start example
<?xml version="1.0"?>
<employees>
   <employee name="Peter"></employee>
   <employeeId>e01</employeeId>
   <department>administration</department>
   <employee name="Julie"></employee>
   <employeeId>e02</employeeId>
   <department>sales</department>
   <employee name="Taub"></employee>
   <employeeId>e03</employeeId>
   <department>administration</department>
   <employee name="David"></employee>
   <employeeId>e04</employeeId>
   <department>Planning</department>
</employees>
end example
 

You can implement SAX in PHP to retrieve XML elements that match the pattern described by an XPath expression. For example, you can retrieve the names of all employees from the employees.xml document.

Listing 5-9 shows the content of the employees.php document:

Listing 5-9: Retrieving Employee Names
start example
<?php
// Adjust this path to wherever class_path_parser.php is installed.
include_once("/class_path_parser.php");

// Handler called for each element that matches the registered location path.
function result($name, $attribs, $content)
{
   print("$name &nbsp;");
   print($attribs["name"]);
   print("<br/>");
}

$parser = new Path_parser();
$parser->set_handler("/employees/employee", "result");
if (!$parser->parse_file("employees.xml"))
{
   print("Error:" . $parser->get_error() . "\n");
}
?>
end example
 

The parse_file() method parses an XML document from a file or a URL. The syntax of the parse_file() method is:

 boolean parse_file (string $xml) 

In the above syntax, the parameter, $xml, is the name or URL of the file that the parser needs to parse. The parse_file() method returns TRUE if the file is successfully parsed; otherwise, it returns FALSE.

Tip  

If the parser does not successfully parse a file, the parse_file() method returns FALSE. You can use the get_error() method to retrieve the error message.

The set_handler() method registers a handler function for XML element nodes that match the pattern specified by an XPath expression. The syntax of the set_handler() method is:

 set_handler (string $path, string $handlername) 

In the above syntax, $path is the absolute location path of an element in the XML document, and $handlername is the name of the handler function. The handler function must accept three arguments: $name, $attribs, and $content. The $name argument denotes the element name, $attribs denotes the element attributes, and $content denotes the text within the element node.

Figure 5-4 shows the output of the PHP code, employees.php:

This figure shows the names of all employees who are listed in the employees.xml document.
Figure 5-4: Output of the employees.php Code

You can parse an XML file using the Path_parser class and set handlers for specific XML elements defined by an XPath expression. The handler function receives the element name, attributes, and content.

Accessing XPath Using XSLT

XSLT is a language that transforms XML documents into a text-based format, such as HTML, or into another XML-based structure, such as the Web Distributed Data eXchange (WDDX) format. To perform an XSLT transformation, you require a source XML document, an XSLT stylesheet, and an XSLT engine.

The XSLT stylesheet contains the instructions that you write to achieve the required transformation. You use XSL, an XML-based language, to create stylesheets. You need to define the output layout, or result tree, and the input document, or source tree, from which XSL retrieves data. An XSLT engine transforms an XML document into other document types using a stylesheet. You can use XPath expressions in the XSLT stylesheet to retrieve elements from an XML document and transform these elements into the required format. For example, the XML document, chapter.xml, contains information about the headings in a chapter.

Listing 5-10 shows the content of the chapter.xml document:

Listing 5-10: The chapter.xml Document
start example
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="chapter.xsl"?>
<chapter>
   <h>Heading 1</h>
   <h>Heading 2</h>
   <h>Heading 3</h>
</chapter>
end example
 

You can create a stylesheet file to convert the chapter.xml document to an HTML document.

Listing 5-11 shows the content of the chapter.xsl stylesheet file:

Listing 5-11: The chapter.xsl Stylesheet File
start example
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <xsl:template match="/">
      <HTML>
         <BODY>
            <xsl:for-each select="/chapter/h">
               <S><xsl:value-of select="."/></S>
            </xsl:for-each>
         </BODY>
      </HTML>
   </xsl:template>
</xsl:stylesheet>
end example
 

In the above listing, <xsl:for-each> iterates over every <h> element that is a child of the <chapter> root element. For each <h> element, the output contains an <S> element whose content is the value selected by the <xsl:value-of> element.

You need to apply the stylesheet to the chapter.xml document.

Listing 5-12 shows the output after you apply the stylesheet to the chapter.xml document:

Listing 5-12: Output After Applying the Stylesheet
start example
<HTML>
   <BODY>
      <S>Heading 1</S>
      <S>Heading 2</S>
      <S>Heading 3</S>
   </BODY>
</HTML>
end example
 

The above listing shows the HTML format for the chapter.xml document.
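The text does not show how a stylesheet is applied from PHP. One possible way, sketched here, uses the XSLTProcessor class, which is available in PHP 5 and later when the xsl extension is enabled; this choice of engine is an assumption, as is the transform() helper name. The chapter document and stylesheet are embedded as strings so that the sketch is self-contained, and the xml-stylesheet processing instruction is omitted because the stylesheet is supplied explicitly.

```php
<?php
// Sketch: applying chapter.xsl to chapter.xml with XSLTProcessor.
// Assumes the xsl extension is enabled; transform() is a hypothetical helper.

function transform(string $xml, string $xsl): string
{
    $source = new DOMDocument();
    $source->loadXML($xml);

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML($xsl);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($stylesheet);

    // transformToXml() returns the result as a string when it succeeds.
    $result = $processor->transformToXml($source);
    if (!is_string($result)) {
        throw new RuntimeException('XSLT transformation failed');
    }
    return $result;
}

$chapterXml = <<<'XML'
<?xml version="1.0"?>
<chapter>
  <h>Heading 1</h>
  <h>Heading 2</h>
  <h>Heading 3</h>
</chapter>
XML;

$chapterXsl = <<<'XSL'
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <HTML>
      <BODY>
        <xsl:for-each select="/chapter/h">
          <S><xsl:value-of select="."/></S>
        </xsl:for-each>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
XSL;

$html = transform($chapterXml, $chapterXsl);
echo $html;
```

The exact whitespace of the result depends on the XSLT engine, so the string may not match Listing 5-12 character for character, but it should contain one <S> element for each heading.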

XPath expressions enable you to specify a location in an XML document and process the information at that location using XSLT. You can process specific XML elements using XPath expressions with DOM, SAX, or XSLT. Parts of XSLT operate using a subset of XPath. You can use this subset to test whether a node matches the pattern defined by a location path. XPath operates only on the logical structure of an XML document.




Integrating PHP and XML 2004
Year: 2004
Pages: 51