19.1. SimpleXML

PHP offers several different ways of parsing XML, but as of PHP 5, the most popular way is to use the SimpleXML extension. SimpleXML works by reading in the entire XML file at once and converting it into a PHP object whose structure mirrors the nesting of the elements in that file. Once the file has been loaded, you can simply pull data out by traversing the object tree.

The advantage of SimpleXML is that you no longer need to write any complicated code to access your XML: you simply load it, then read elements and attributes as you would expect to be able to. Consider the following XML file, employees.xml:

     <employees>
         <employee>
             <name>Anthony Clarke</name>
             <title>Chief Information Officer</title>
             <age>48</age>
         </employee>
         <employee>
             <name>Laura Pollard</name>
             <title>Chief Executive Officer</title>
             <age>54</age>
         </employee>
     </employees>

The base element is a list of employees, and it contains several employee elements. Each employee has a name, a title, and an age. Now take a look at this basic SimpleXML script:

     $employees = simplexml_load_file('employees.xml');
     var_dump($employees);

Here is the output:

     object(SimpleXMLElement)#1 (1) {
         ["employee"]=>
         array(2) {
             [0]=>
             object(SimpleXMLElement)#2 (3) {
                 ["name"]=>
                 string(14) "Anthony Clarke"
                 ["title"]=>
                 string(25) "Chief Information Officer"
                 ["age"]=>
                 string(2) "48"
             }
             [1]=>
             object(SimpleXMLElement)#3 (3) {
                 ["name"]=>
                 string(13) "Laura Pollard"
                 ["title"]=>
                 string(23) "Chief Executive Officer"
                 ["age"]=>
                 string(2) "54"
             }
         }
     }

From that, you should be able to see that the base element has an array employee, containing two elements, one for each of the employees in the XML file. Each element in that array is another object, containing the name, the title, and the age of each employee. Put simply, each collection of data is made into an array, and each distinct XML element is made into an object.
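As a quick check of that structure, here is a minimal sketch that counts the employee elements and reads one by its index. It loads an inline copy of the XML with simplexml_load_string( ) so that it runs without the file, and the variable names are only for illustration:

```php
<?php
// Inline copy of employees.xml so the sketch runs on its own.
$source = <<<XML
<employees>
    <employee>
        <name>Anthony Clarke</name>
        <title>Chief Information Officer</title>
        <age>48</age>
    </employee>
    <employee>
        <name>Laura Pollard</name>
        <title>Chief Executive Officer</title>
        <age>54</age>
    </employee>
</employees>
XML;

$employees = simplexml_load_string($source);

// count() reports how many <employee> elements sit under the root.
$total = count($employees->employee);

// Index into the collection, then follow -> to a child element.
$secondName = (string) $employees->employee[1]->name;

print "$total employees; the second is $secondName\n";
```

Indexes start at zero, so employee[1] is the second employee in the file.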

Now, consider the following script, using the same XML file:

     $employees = simplexml_load_file('employees.xml');
     foreach ($employees->employee as $employee) {
         print "{$employee->name} is {$employee->title} at age {$employee->age}\n";
     }

This time the script actually does something useful with the XML content, and iterates through the $employees->employee array. As each employee element is read from the array, its information is printed out. Note how easy it is to read information from elements, simply because the XML is all converted to standard PHP variables.
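One detail that string interpolation hides is that each value you read this way is still a SimpleXML object until you convert it. The sketch below, which uses a cut-down inline copy of the XML, shows explicit casts before doing arithmetic; treat it as an illustration rather than a required pattern:

```php
<?php
$employees = simplexml_load_string(
    '<employees><employee><name>Anthony Clarke</name><age>48</age></employee></employees>'
);
$employee = $employees->employee[0];

// Each child element is itself an object, not a plain string or integer.
$isObject = $employee->age instanceof SimpleXMLElement;

// Cast explicitly before doing arithmetic or strict comparisons.
$age  = (int) $employee->age;
$name = (string) $employee->name;

print "$name will be " . ($age + 1) . " next year\n";
```

Casting up front also means the values can be stored in arrays or sessions without dragging the whole XML tree along.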

19.1.1. XML Attributes

SimpleXML allows you to access attributes of XML elements as if the element were an array. Here's some very simple XML with attributes:

     <cakes>
         <cake type="sponge">
             <name language="english">Victoria Cake</name>
         </cake>
     </cakes>

In that example, the cake element has a type attribute, and the name element has a language attribute. This next script accesses them both:

     $xml = simplexml_load_file("cakes.xml");
     print "{$xml->cake[0]["type"]}\n";
     print "{$xml->cake[0]->name["language"]}\n";

The $xml->cake[0] part accesses the first cake element, as we have already discussed. However, note that it treats the cake as an array in order to get the type attribute. If we had used $xml->cake[0]->type, it would have looked for a <type> child element of the cake, which doesn't exist.

The next line, $xml->cake[0]->name["language"], gets the first cake, pulls out its <name> child element, then reads the "language" attribute. As long as you remember that elements use -> and attributes use [ ], you'll be OK.
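If you don't know the attribute names in advance, the attributes( ) method lets you loop over them. This is a small sketch using an inline copy of the cakes XML; the "flavour" attribute is a made-up name used only to show a check for something that isn't there:

```php
<?php
$xml = simplexml_load_string(
    '<cakes><cake type="sponge"><name language="english">Victoria Cake</name></cake></cakes>'
);

// attributes() lets you loop over every attribute without knowing the names.
$found = [];
foreach ($xml->cake[0]->attributes() as $attrName => $attrValue) {
    $found[$attrName] = (string) $attrValue;
    print "$attrName = $attrValue\n";
}

// isset() reports whether an attribute is present ("flavour" is hypothetical).
$hasFlavour = isset($xml->cake[0]['flavour']);
print $hasFlavour ? "Has a flavour\n" : "No flavour attribute\n";
```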

19.1.2. Reading from a String

While simplexml_load_file( ) loads XML data from a file, simplexml_load_string( ) loads XML data from a string. This is generally not as useful, but it does allow you to combine the contents of several XML files into one string (under a single root element), then use that inside one SimpleXML structure.

For example:

     $employees = <<<EOT
     <employees>
     <employee  FOO="BAR">
     <name>Anthony Clarke</name>
     <title>Chief Information Officer</title>
     <age>48</age>
     </employee>
     <employee  BAZ="WOM">
     <name>Laura Pollard</name>
     <title>Chief Executive Officer</title>
     <age>54</age>
     </employee>
     </employees>
     EOT;
     $employees = simplexml_load_string($employees);
     foreach ($employees->employee as $employee) {
         print "{$employee->name} is {$employee->title} at age {$employee->age}\n";
     }

The majority of that script is just the heredoc-style string assignment that sets up the XML. Then, with a call to simplexml_load_string( ), the XML is parsed into the $employees object, just as with the simplexml_load_file( ) function. The resulting object is no different.
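Strings often come from sources you don't control, so it can be worth checking that the parse actually succeeded. The following is a sketch of one way to do that with the libxml error functions; the broken XML is deliberately invalid, and the exact error text depends on your libxml version:

```php
<?php
// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Deliberately broken: the <name> element is never closed.
$broken = '<employees><employee><name>Anthony Clarke</employee></employees>';

$result   = simplexml_load_string($broken);
$messages = [];

if ($result === false) {
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError object carrying a message and a line number.
        $messages[] = trim($error->message) . " (line {$error->line})";
    }
    libxml_clear_errors();
    print "Could not parse XML: " . implode('; ', $messages) . "\n";
}
```

The same check works for simplexml_load_file( ), which also returns false when the XML cannot be parsed.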

19.1.3. Searching and Filtering with XPath

The standard way to search through XML documents for particular nodes is called XPath. Sterling Hughes (the creator of the SimpleXML extension) described it by saying it's "as important to XML as regular expressions are to plain text," which should give you an idea of just how important it is!

Fortunately for us, XPath is much easier than regular expressions for basic usage. Using the same employees.xml file, here is an XPath script:

     $xml = simplexml_load_file('employees.xml');
     echo "<strong>Using direct method...</strong><br />";
     $names = $xml->xpath('/employees/employee/name');
     foreach($names as $name) {
         echo "Found $name<br />";
     }
     echo "<br />";
     echo "<strong>Using indirect method...</strong><br />";
     $employees = $xml->xpath('/employees/employee');
     foreach($employees as $employee) {
         echo "Found {$employee->name}<br />";
     }
     echo "<br />";
     echo "<strong>Using wildcard method...</strong><br />";
     $names = $xml->xpath('//name');
     foreach($names as $name) {
         echo "Found $name<br />";
     }

That pulls out names of employees in three different ways, and the work is all done in the call to the xpath( ) function. This takes a query as its only parameter, and returns an array containing the elements that matched the query. The query itself has specialized syntax, but it's very easy. The first example says, "Look in all the employees elements, find any employee elements in there, and retrieve all the names of them." It's very specific because only /employees/employee/name is matched.

The second query matches all employee elements inside employees, but doesn't go specifically for the name of the employees. As a result, we get the full employee back, and need to print $employee->name to get the name.

The last one just looks for name elements, but note that it starts with "//". This is the signal to do a global search for all name elements, regardless of where (or how deeply nested) they are in the document.
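To make the shape of the result concrete, here is a short sketch that runs the "//" query against an inline copy of the XML and also runs a query that matches nothing. The salary element is hypothetical and does not exist in the file; the sketch assumes a current PHP version, where a query with no matches gives back an empty result:

```php
<?php
$xml = simplexml_load_string(
    '<employees>'
    . '<employee><name>Anthony Clarke</name><age>48</age></employee>'
    . '<employee><name>Laura Pollard</name><age>54</age></employee>'
    . '</employees>'
);

// Every match comes back as an object, so cast to get plain strings.
$nameList = [];
foreach ($xml->xpath('//name') as $name) {
    $nameList[] = (string) $name;
}
print count($nameList) . " names: " . implode(', ', $nameList) . "\n";

// A query that matches nothing ("salary" is not in the XML).
$missing = $xml->xpath('//salary');
if (empty($missing)) {
    print "No salary elements found\n";
}
```

Checking with empty( ) before looping keeps the script quiet when a query finds nothing.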

XPath can also be used to filter your results according to any values you want. For example:

     $xml = simplexml_load_file('employees.xml');
     echo "<strong>Matching employees with name 'Laura Pollard'</strong><br />";
     $employees = $xml->xpath('/employees/employee[name="Laura Pollard"]');
     foreach($employees as $employee) {
         echo "Found {$employee->name}<br />";
     }
     echo "<br />";
     echo "<strong>Matching employees younger than 54</strong><br />";
     $employees = $xml->xpath('/employees/employee[age<54]');
     foreach($employees as $employee) {
         echo "Found {$employee->name}<br />";
     }
     echo "<br />";
     echo "<strong>Matching employees as old or older than 48</strong><br />";
     $employees = $xml->xpath('//employee[age>=48]');
     foreach($employees as $employee) {
         echo "Found {$employee->name}<br />";
     }
     echo "<br />";

The filter is done between the square brackets, [ and ]. The first query grabs all employees elements, then all employee elements inside it, and then filters them so that only those that have a name that matches Laura Pollard are retrieved. Once you get that, the other two are quite obvious: <, >, <=, etc., all work as you'd expect in PHP.

If you want to filter by the value of an attribute rather than the value of an element, you need to use the @ symbol. For example, our cakes.xml file has cakes that have a "type" attribute. To search for specific types using XPath, you would need to use code like this:

     $sponge_cakes = $xml->xpath('//cake[@type="sponge"]');
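Here is that attribute filter as a runnable sketch. The second cake (a fruit cake) is not in the original cakes.xml; it is added here only so the filter has something to exclude. The last query also shows that @ can appear at the end of a path to select the attributes themselves:

```php
<?php
$xml = simplexml_load_string(
    '<cakes>'
    . '<cake type="sponge"><name language="english">Victoria Cake</name></cake>'
    . '<cake type="fruit"><name language="english">Dundee Cake</name></cake>' // hypothetical entry
    . '</cakes>'
);

// Keep only the cakes whose type attribute is "sponge".
$spongeNames = [];
foreach ($xml->xpath('//cake[@type="sponge"]') as $cake) {
    $spongeNames[] = (string) $cake->name;
    print "Found sponge cake: {$cake->name}\n";
}

// Select the type attributes themselves rather than the cake elements.
$types = [];
foreach ($xml->xpath('//cake/@type') as $type) {
    $types[] = (string) $type;
}
print "Types: " . implode(', ', $types) . "\n";
```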

You can grab only part of a query result by continuing on as normal afterward, like this:

     $ages = $xml->xpath('//employee[age>=48]/age');
     foreach($ages as $age) {
         echo "Found $age<BR/>";
     }

You can even run queries on queries, with an XPath search like this:

     $employees = $xml->xpath('//employee[age>=49][name="Laura Pollard"]'); 
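Stacking two filters like that narrows the result step by step. For simple value tests such as these, XPath's and operator inside a single pair of brackets should give the same matches, as this sketch with an inline copy of the XML suggests:

```php
<?php
$xml = simplexml_load_string(
    '<employees>'
    . '<employee><name>Anthony Clarke</name><age>48</age></employee>'
    . '<employee><name>Laura Pollard</name><age>54</age></employee>'
    . '</employees>'
);

// Two filters in a row: the second is applied to what the first let through.
$chained = $xml->xpath('//employee[age>=49][name="Laura Pollard"]');

// One filter with both conditions joined by "and".
$combined = $xml->xpath('//employee[age>=49 and name="Laura Pollard"]');

$chainedName  = (string) $chained[0]->name;
$combinedName = (string) $combined[0]->name;

print "Chained found $chainedName; combined found $combinedName\n";
```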

Going back to selecting various types of elements, you can use the | symbol (OR) to select more than one type of element, like this:

     echo "<B>Retrieving all titles and ages</B><BR/>";
     $results = $xml->xpath('//employee/title|//employee/age');
     foreach($results as $result) {
         echo "Found $result<BR/>";
     }

That will output the following:

     Found Chief Information Officer
     Found 48
     Found Chief Executive Officer
     Found 54

You can combine all of this together to search on more than one value, like this:

     $names = $xml->xpath('//employee[age<40]/name|//employee[age>50]/name');
     foreach($names as $name) {
         echo "Found $name<BR/>";
     }

For more complex work, you can run calculations using XPath in order to get tighter control over your queries. For example, if you only wanted the names of employees who have an odd age (that is, one that cannot be divided by two without leaving a remainder), you would use an XPath query like this:

     $names = $xml->xpath('//employee[age mod 2 = 1]/name'); 

Along with mod (equivalent to % in PHP) there's also div for division, + and -, and ceiling( ) and floor( ) (equivalent to their namesakes in PHP). These are quite advanced and don't get much use in practice. When using "-", you have to keep it from looking like part of an element name, so foo-bar needs to be written as foo - bar; otherwise, XPath reads it as a single element named foo-bar.
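The sketch below tries a few of these operators against an inline copy of the employee data. The names_matching( ) helper is hypothetical and exists only to keep the example short. Note that both ages in the sample data are even, so the odd-age query shown earlier would find nobody here:

```php
<?php
$xml = simplexml_load_string(
    '<employees>'
    . '<employee><name>Anthony Clarke</name><age>48</age></employee>'
    . '<employee><name>Laura Pollard</name><age>54</age></employee>'
    . '</employees>'
);

// Hypothetical helper for this sketch: run a query and return the
// matched elements as plain strings.
function names_matching(SimpleXMLElement $xml, $query)
{
    $names = [];
    foreach ($xml->xpath($query) as $name) {
        $names[] = (string) $name;
    }
    return $names;
}

// mod: employees whose age is even.
$even = names_matching($xml, '//employee[age mod 2 = 0]/name');

// div and floor(): employees in their fifties.
$fifties = names_matching($xml, '//employee[floor(age div 10) = 5]/name');

// Subtraction, with spaces around the minus sign: employees under fifty.
$underFifty = names_matching($xml, '//employee[age - 40 < 10]/name');

print "Even ages: " . implode(', ', $even) . "\n";
print "In their fifties: " . implode(', ', $fifties) . "\n";
print "Under fifty: " . implode(', ', $underFifty) . "\n";
```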

19.1.4. Outputting XML

One of the most interesting features about SimpleXML is that it can, at any time, give you a string containing the well-formed XML representation of its data. This essentially does the opposite of simplexml_load_file( ), but incorporates any changes you've made to the data while it was in SimpleXML form.

For example:

     $xml = simplexml_load_file('employees.xml');
     $xml->employee[1]->age = 55;
     echo $xml->asXML();

That loads our XML file, and changes the second employee to have an age of 55. The call to asXML( ) then outputs the changed data tree, printing this:

     <?xml version="1.0"?>
     <employees>
         <employee>
             <name>Anthony Clarke</name>
             <title>Chief Information Officer</title>
             <age>48</age>
         </employee>
         <employee>
             <name>Laura Pollard</name>
             <title>Chief Executive Officer</title>
             <age>55</age>
         </employee>
     </employees>

Note the changed value for Laura's age. However, blindly changing values isn't a smart move: the XML could change quite easily so that Pollard is no longer the second person in there. Instead, you should really combine it with an XPath search, like this:

     $xml = simplexml_load_file('employees.xml');
     echo "\nBefore transformation:\n\n";
     echo $xml->asXML();
     $xml->employee[1]->age = 55;
     $employees = $xml->xpath('/employees/employee[name="Anthony Clarke"]');
     $employees[0]->title = "Chairman of the Board, Chief Information Officer";
     echo "\n\nAfter transformation:\n\n";
     echo $xml->asXML();

This time Laura's age is still changed by referencing her position in the file directly, but I've also changed the job title of Anthony Clarke using a smart XPath search for his exact name. Of course, even names can be duplicated by chance, so an employee ID would be even better!
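As a sketch of that last idea, suppose each employee element carried an id attribute. The attribute and its values ("e1", "e2") are assumptions for this example and are not part of employees.xml. The sketch also passes a filename to asXML( ), which writes the XML to that file rather than returning it as a string:

```php
<?php
// Hypothetical variant of employees.xml in which each employee has an id attribute.
$source = '<employees>'
    . '<employee id="e1"><name>Anthony Clarke</name><age>48</age></employee>'
    . '<employee id="e2"><name>Laura Pollard</name><age>54</age></employee>'
    . '</employees>';

$xml = simplexml_load_string($source);

// Find the employee by ID rather than by position or name.
$matches = $xml->xpath('/employees/employee[@id="e2"]');
if (!empty($matches)) {
    $matches[0]->age = 55;
}

// Passing a filename to asXML() writes the document to disk.
$path  = tempnam(sys_get_temp_dir(), 'emp');
$saved = $xml->asXML($path);

// Reload the saved file to confirm that the change survived the round trip.
$reloaded   = simplexml_load_file($path);
$updatedAge = (int) $reloaded->employee[1]->age;
unlink($path);

print "Laura's age is now $updatedAge\n";
```

Because the lookup uses the ID, the update still hits the right person if the employees are reordered or if two of them share a name.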


