20.2 Parsing a Simple XML Document


You want to know the steps for parsing a simple XML document.

Technique

Here is an example of an XML document and the steps needed to parse that document:

test.xml -- The XML file we parse.

<?xml version="1.0"?>
<book>
    <chapter id="1">
        <title>Chapter 1</title>
        <contents>
            The Contents of chapter one go here.
        </contents>
    </chapter>
    <chapter id="2">
        <title>Chapter 2</title>
        <contents>
            The Contents of chapter two go here.
        </contents>
    </chapter>
</book>

xml-test.php -- The PHP program to parse the XML file.

<?php
function start_element($parser, $element_name, $element_attrs) {
    switch ($element_name) {
        case "CHAPTER":
            print "<a name=\"#$element_name{$element_attrs['ID']}\"></a>";
            break;
        case "TITLE":
            print '<h2>';
            break;
        case "CONTENTS":
            print '<font face="arial" size="2">';
            break;
    }
}

function end_element($parser, $element_name) {
    switch ($element_name) {
        case "TITLE":
            echo "</h2>\n";
            break;
        case "CONTENTS":
            echo "</font>\n<br><br>";
            break;
    }
}

function character_data($parser, $data) {
    echo $data;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'start_element', 'end_element');
xml_set_character_data_handler($parser, 'character_data');

$fp = fopen('test.xml', 'r') or die('Cannot open test.xml');

while ($data = fread($fp, 4096)) {
    xml_parse($parser, $data, feof($fp))
        or die(sprintf('XML Error: %s at line %d',
                       xml_error_string(xml_get_error_code($parser)),
                       xml_get_current_line_number($parser)));
}

xml_parser_free($parser);
?>

Comments

PHP's XML functions wrap the Expat C API, so if you have used Expat before, the example in the "Technique" section should look somewhat familiar. If you haven't used Expat before, let me explain a little bit about the theory involved in parsing XML documents.

Expat is a SAX-type XML parser, meaning that it calls handlers for the different types of events that occur as it reads the document. When an event occurs, the data is passed to the matching handler, along with information related to the data. Let's consider the XML document in the preceding "Technique" section: At the start of an element (such as <title>), PHP sends the element name (TITLE, because the parser uppercases element and attribute names by default) and any attributes (in array form) from the start tag to the start_element() function. The start_element() function is registered as the handler for start tags by the xml_set_element_handler() function, which registers end_element() as the handler for end tags in the same call. The text between tags is passed to character_data(), which is registered by xml_set_character_data_handler().
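To make the event flow concrete, here is a minimal sketch that records each event in an array instead of printing HTML. The $events array, the inline sample string, and the use of anonymous functions as handlers are assumptions made for illustration; they are not part of the recipe above.

```php
<?php
// Illustrative only: log every event the parser reports for a tiny document.
$events = [];

$parser = xml_parser_create();

// One call registers both the start-tag and the end-tag handler.
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = ['start', $name, $attrs];
    },
    function ($parser, $name) use (&$events) {
        $events[] = ['end', $name];
    }
);

// Text between tags arrives through the character data handler.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) {
        $events[] = ['text', $data];
    }
);

// The third argument (true) tells the parser this is the final chunk.
xml_parse($parser, '<chapter id="1"><title>Chapter 1</title></chapter>', true);

print_r($events);
```

The first entry printed should be the start event for CHAPTER with the attribute array holding ID => 1. Both names appear in uppercase, which is why the handlers in the recipe switch on "CHAPTER" and read $element_attrs['ID'].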

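After you have defined your handlers, you can use the xml_parse() function to parse the XML data either in separate chunks (as done here) or all at once. When parsing in chunks, pass true as the third argument along with the last chunk; the example passes feof($fp) for this purpose. The xml_parse() function returns 1 on success and 0 on failure, so you can test its result to check whether any errors occurred, as the or die() clause in the example does. When you are done parsing the document, call xml_parser_free() to free the parser allocated by the xml_parser_create() function.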
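The "all at once" variant can be sketched as follows. The helper name check_xml() and the sample strings are hypothetical. The version guard reflects the fact that from PHP 8.0 the parser is an object released automatically, so xml_parser_free() is only needed on older versions.

```php
<?php
// Hypothetical helper: parse a whole document in one call.
// Returns null on success, or a formatted error message on failure.
function check_xml(string $xml): ?string
{
    $parser = xml_parser_create();

    // A single call with the final-chunk flag set parses the entire string.
    $ok = xml_parse($parser, $xml, true);

    $error = null;
    if (!$ok) {
        $error = sprintf(
            'XML Error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        );
    }

    // Only needed before PHP 8.0, where the parser is a resource.
    if (PHP_VERSION_ID < 80000) {
        xml_parser_free($parser);
    }

    return $error;
}

// Well-formed input: no error is reported.
var_dump(check_xml('<book><chapter id="1"/></book>'));

// The <chapter> element is never closed, so an error message is returned.
echo check_xml("<book>\n<chapter>\n</book>"), "\n";
```

The exact wording of the error message depends on the underlying XML library, so treat it as diagnostic text rather than something to match against.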

This is essentially what is done to parse XML documents in PHP with the Expat library. There are a few tricks and a couple more functions, but the idea is basically the same: create the parser, create the handlers, set the handlers, and then parse the document.
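One of those additional functions is xml_parser_set_option(), which can turn off the default uppercasing of element names. The sketch below follows the same create, set handlers, parse sequence; the helper name element_names() and the sample document are assumptions made for illustration.

```php
<?php
// Hypothetical helper: return the element names the start handler receives,
// with case folding switched on or off.
function element_names(string $xml, bool $fold): array
{
    $names = [];

    // Step 1: create the parser.
    $parser = xml_parser_create();

    // Optional: case folding is on by default; 0 keeps names as written.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $fold ? 1 : 0);

    // Steps 2 and 3: create and set the handlers.
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$names) {
            $names[] = $name;
        },
        function ($parser, $name) {
            // End tags are ignored in this sketch.
        }
    );

    // Step 4: parse the document.
    xml_parse($parser, $xml, true);

    return $names;
}

print_r(element_names('<book><chapter id="1"/></book>', true));   // BOOK, CHAPTER
print_r(element_names('<book><chapter id="1"/></book>', false));  // book, chapter
```

With folding disabled, the case labels in the recipe's handlers would need to be written as "chapter", "title", and "contents", and the attribute key as 'id'.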



PHP Developer's Cookbook
PHP Developers Cookbook (2nd Edition)
ISBN: 0672323257
Year: 2000
Pages: 351
