Building the PHP Parser

XML parsing with PHP has five steps.

1. Write functions to handle elements and text.
2. Create a parser.
3. Install the handlers.
4. Obtain the XML string.
5. Parse.

Steps 4 and 5 are repeated as needed for large XML files or continuous XML streams.

Step 1 is where all the thought and creativity are required. So let us demand little of our first parser: the handlers will be little more than stubs while we work through the rest of the process.

PHP

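 <?php
 // Step 1: a catch-all handler that reports everything the parser sees.
 function unknownXML($parser, $text) {
     echo "what is this? ($text)<br/>";
 }

 // Steps 2 and 3: create the parser and install the handler by name.
 $xml_parser = xml_parser_create();
 xml_set_default_handler($xml_parser, "unknownXML");

 // Step 4: obtain the XML string posted by the client.
 // (PHP 7 and later no longer set this global; read
 // file_get_contents("php://input") there instead.)
 $data = $GLOBALS["HTTP_RAW_POST_DATA"];

 // Step 5: parse. The true flag marks this as the final chunk.
 // The error details are concatenated, because function calls are not
 // expanded inside a double-quoted string.
 if (!xml_parse($xml_parser, $data, true))
     die("XML error: " . xml_error_string(xml_get_error_code($xml_parser))
         . " at line " . xml_get_current_line_number($xml_parser));

 xml_parser_free($xml_parser);
 ?>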

In this code we see a pretty straightforward implementation of the five steps. Note these details:

XML parser handlers are installed by referring to the name as a text string, not with the sort of function pointer we might expect.
The true parameter for xml_parse() marks this parse as the final one for this XML object. In a simple one-shot parse like this, it is not an interesting parameter. But remember that large XML files can be loaded in chunks, and chunked parsing relies on this flag to know when the document is complete.
The parser is explicitly destroyed when complete.
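The role of the final flag is easier to see when a document is fed to the parser in pieces. The following is a minimal sketch, not code from the login example: the helper name parseInChunks() and the 16-byte chunk size are assumptions made for illustration, and closures are used as handlers, which PHP accepts alongside function-name strings.

```php
<?php
// Hypothetical helper: parses $xml in fixed-size chunks and returns the
// element names in the order their start tags were seen.
function parseInChunks(string $xml, int $chunkSize = 16): array
{
    $chunkSize = max(1, $chunkSize);   // guard against a zero-length step
    $names = [];

    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$names) { $names[] = $name; },
        function ($parser, $name) { /* nothing to do on close */ }
    );

    $length = strlen($xml);
    for ($offset = 0; $offset < $length; $offset += $chunkSize) {
        $chunk   = substr($xml, $offset, $chunkSize);
        // Only the last chunk is flagged as final.
        $isFinal = ($offset + $chunkSize >= $length);

        if (!xml_parse($parser, $chunk, $isFinal)) {
            throw new RuntimeException(sprintf(
                "XML error: %s at line %d",
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)
            ));
        }
    }
    return $names;
}

$login = "<LOGIN><USERNAME>anonymous</USERNAME>"
       . "<PASSWORD>bluefish</PASSWORD></LOGIN>";
print_r(parseInChunks($login));
```

Tags that straddle a chunk boundary are not a problem; the parser holds the incomplete piece until the next call supplies the rest.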

Punching in a username and password in a Mardi Gras mood yields the screen in Figure 15.2.

Figure 15.2. Echoing the Calls to the Default Event Handler

graphics/15fig02.jpg

Now let's look a bit closer. Let's install specialized handlers for opening and closing elements and for handling character data. We add these lines to the PHP script.

PHP

 <?php
 $indent = "";

 function startElement($parser, $name, $attrs) {
     global $indent;
     echo "$indent $name<br/>";
     $indent .= "- - ";
 }

 function endElement($parser, $name) {
     global $indent;
     $indent = substr($indent, 0, -4);
     echo "$indent $name ends<br/>";
 }

 function characterData($parser, $text) {
     global $indent;
     echo "$indent $text<br/>";
 }
 . . .
 xml_set_element_handler($xml_parser, "startElement", "endElement");
 xml_set_character_data_handler($xml_parser, "characterData");
 . . .
 ?>

We get the nicely formatted output in Figure 15.3. This is fine for echoing to the screen the parse events as they occur, but it does not do anything interesting or useful with the data.

Figure 15.3. Echoing the Calls to an Extended Event Handler Suite

graphics/15fig03.jpg

Our next step is to build a simple set of handlers. First we build a little parser that collects very simple messages. It builds a table of elements and their contents: a flat table with a single entry for each element name. It therefore has these limits:

No attributes. The following becomes just "iron: 9".

XML

 <iron unit="tons" quality="pig" source="duluth">9</iron>

No context. It cannot distinguish between a putter and a trainload of ingots.

XML

 <golf-clubs><iron>9</iron></golf-clubs>
 <freight><iron>9</iron></freight>

Only one instance, the most recent, is saved. The following resolves to just a 6 iron.

XML

 <golf-clubs>
     <iron>9</iron>
     <iron>7</iron>
     <iron>6</iron>
 </golf-clubs>

No nodality within the text. The formatting in the following will disappear.

XML

 <iron>Sunbeam EasyLife <bold>cordless</bold> model</iron>

But even with such odious limits, this simple parser is perfectly adequate in many situations where the Flash client is sending a few simple pieces of data to the server. Our example here, where only a simple username and password are transmitted, is typical of this sort of communication.

First we establish three variables. One, theElements, is an associative array in which the key strings are the names of the elements found and the values are the contents of the elements. We track the currently active element with a simple string variable, thisElement. And to support nesting of elements, we build a stack, called oldElements, of currently open elements.

PHP

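 <?php
 $theElements = array();   // element name => accumulated text
 $oldElements = array();   // stack of enclosing element names
 $thisElement = "";        // name of the element currently open

 function startElement($parser, $name, $attribs) {
    global $thisElement, $oldElements, $theElements;
    array_push($oldElements, $thisElement);
    $thisElement = $name;
    // Create the entry now so that empty elements are recorded too.
    $theElements[ $thisElement ] = "";
 }

 function endElement($parser, $name) {
    global $thisElement, $oldElements;
    // Return to the enclosing element.
    $thisElement = array_pop($oldElements);
 }

 function characterData($parser, $text) {
    global $thisElement, $theElements;
    // Append: the text of one element may arrive in several calls.
    $theElements[ $thisElement ] .= $text;
 }

 function unknownXML($parser, $stuff) {
     echo "What is this stuff? ($stuff)<br/>";
 }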

Note that the expat parser does not guarantee that all the character data for an element is sent in a single call to characterData(). It does not even promise that the calls will be broken on nodal boundaries. So our function features a concatenation rather than an assignment.
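A small experiment makes the point concrete. This is an illustrative sketch only: collectText() is a hypothetical helper, and the number of calls the parser makes for a given input depends on the underlying library, so the script reports the count rather than assuming it. The one thing that can be relied on is that the pieces, joined in order, give the complete text.

```php
<?php
// Hypothetical helper: records every piece of character data exactly as
// the parser delivers it, one array entry per handler call.
function collectText(string $xml): array
{
    $pieces = [];

    $parser = xml_parser_create();
    xml_set_character_data_handler(
        $parser,
        function ($parser, $text) use (&$pieces) { $pieces[] = $text; }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            "XML error: " . xml_error_string(xml_get_error_code($parser))
        );
    }
    return $pieces;
}

// An entity reference in the middle of the text is a typical place for
// the parser to split the data into more than one call.
$pieces = collectText("<iron>pig &amp; scrap</iron>");
echo count($pieces), " call(s), joined text: ", implode("", $pieces), "\n";
```

An assignment in the handler would keep only the last piece delivered; the concatenation keeps them all.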

Also, empty elements are often very important. In some vocabularies they form significant tokens, application-specific keywords. The message might consist mostly of such tokens, as in the popular mobile-device protocol WAP. Or you might send a single such token: <STOP/> or <logoff/>. So this version of the parser creates the entry for such empty elements in the startElement handler.

Building and launching this parser is no different than creating the previous one. But when it comes time to display results, we need to scan through the array of elements. Let's display the results as follows (Figure 15.4).

Figure 15.4. Associating the Events into Full Elements

graphics/15fig04.jpg

PHP

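 // foreach walks the table in insertion order. (The older
 // reset()/each() idiom no longer works: each() was removed in PHP 8.)
 foreach ($theElements as $elementName => $elementContent)
    echo "$elementName  => $elementContent <br/>";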

Note that our structure has been flattened. Our wrapper element <LOGIN> is demoted from parent to sibling, and since it contains no text of its own, it is an empty element.
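The flattening can be reproduced outside the login page with a self-contained version of the same three handlers. This is a sketch under assumptions: flattenXml() is a hypothetical wrapper that keeps the state in local variables instead of globals, and the sample messages are written without whitespace between tags so that no stray text is collected for the wrapper element.

```php
<?php
// Hypothetical wrapper around the flat-table handlers: returns an array
// mapping each element name to the text of its most recent occurrence.
function flattenXml(string $xml): array
{
    $elements = [];   // element name => text
    $open     = [];   // stack of enclosing element names
    $current  = "";   // element currently receiving text

    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$elements, &$open, &$current) {
            array_push($open, $current);
            $current = $name;
            $elements[$current] = "";   // also records empty elements
        },
        function ($p, $name) use (&$open, &$current) {
            $current = array_pop($open);
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $text) use (&$elements, &$current) {
            $elements[$current] .= $text;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            "XML error: " . xml_error_string(xml_get_error_code($parser))
        );
    }
    return $elements;
}

// The wrapper element appears as an empty sibling of its own children.
print_r(flattenXml(
    "<LOGIN><USERNAME>anonymous</USERNAME><PASSWORD>bluefish</PASSWORD></LOGIN>"
));

// Only the most recent <iron> survives. Names come back in upper case
// because the parser folds case by default.
print_r(flattenXml(
    "<golf-clubs><iron>9</iron><iron>7</iron><iron>6</iron></golf-clubs>"
));
```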

But why try to format a pretty response? Let's respond with functional XML.

PHP

 header("Content-type: text/xml");
 echo "<RESPONSE>";
 // foreach replaces the reset()/each() idiom, which was removed in PHP 8.
 foreach ($theElements as $elementName => $elementContent)
    echo "<$elementName>$elementContent</$elementName>";
 echo "</RESPONSE>";

Having flattened <LOGIN> out of its position, we need to create a new wrapper element. (Remember that XML requires the entire document body to be one single element.) So we have created a <RESPONSE> element, which contains the entire XML message (Figure 15.5).

We can send XML to PHP, and PHP can parse it carefully or quickly. And the PHP scripts can output well-formed XML. But we do not yet have XML in-and-out from Flash. Although we performed XML input earlier and output just now, we have yet to send an XML message and capture the XML result.
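One caution about well-formedness: text that has passed through the parser has its entities decoded, so a password such as "blue&fish" would put a bare ampersand into the response. The following is a hedged sketch of one way to guard against that; buildResponse() is a hypothetical helper, not part of the login script, and it assumes the element names came from the parser and are therefore already valid XML names.

```php
<?php
// Hypothetical helper: wraps the flat element table in a <RESPONSE>
// element, escaping the text so the result stays well formed.
function buildResponse(array $elements): string
{
    $xml = "<RESPONSE>";
    foreach ($elements as $name => $content) {
        // ENT_XML1 restricts the escaping to the entities XML defines.
        $safe = htmlspecialchars($content, ENT_QUOTES | ENT_XML1, "UTF-8");
        $xml .= "<$name>$safe</$name>";
    }
    return $xml . "</RESPONSE>";
}

echo buildResponse(["USERNAME" => "anonymous", "PASSWORD" => "blue&fish"]), "\n";
```

In the page itself, the returned string would be echoed after the Content-type header is sent.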

Figure 15.5. XML Response

graphics/15fig05.jpg

There are two XML upload methods. We have used the .send() method; now we will use the .sendAndLoad() method. The two parameters of send() are the URL to which the XML is sent and the window in which the results are displayed. In sendAndLoad() the second parameter is instead the XML object that will be the load target of the response XML.

Remember time. The script we are writing should run for 100 milliseconds or less, and the time between one line and the next is measured in microseconds. By contrast, we can expect that it will take several seconds to transmit a data package to the server, have the server script parse the XML, act on it, and formulate a response. Even then the response needs to be transmitted back to Flash and parsed on the client side before the data becomes visible. Rather than spin in a while() loop (unthinkable) or loop around the timeline (unnecessary), we can assemble an event-driven architecture for this functional component.

Later our password will unlock doors. For the moment, it will just display more XML, but this time entirely within Flash. We set a dynamic text field called statusline on our screen (Figure 15.6). We make a little method for our XML object called showme(), which uses the .toString() method as a way to quickly visualize a small XML object like this.

Figure 15.6. The Creation of statusline

graphics/15fig06.jpg

ActionScript

 function showme(ok) {
   statusline = "Received: \n"
              + (ok ? this.toString() : "terrible load failure")
              + "\n(" + this.contentType + ")";
 }


Open up the submit() function (which, you may recall, is fired when the submit button is pressed or Enter is hit). We will replace the .send() function. We need to create a fresh XML object, which we call xmlResponse. We register the showme method to xmlResponse as its onload action, which occurs when the load has completed or failed. (These two cases are distinguished by the ok flag that onload passes and which showme() observes.)

 // xmlLogin.send("authorize.php","_blank");
 xmlResponse = new XML("");
 xmlResponse.onload = showme;
 xmlLogin.sendAndLoad("authorize.php", xmlResponse);
 statusline = "WAITING for XML";

The statusline assignment ("WAITING") is displayed from the time the submit action starts (button pressed) until onload occurs (communication completed).

We have made our connection! We composed a message in XML and sent it to a PHP script, which read it and sent back a legitimate XML message that Flash captured neatly (Figure 15.7). But there are still shortcomings.

Figure 15.7. The statusline Shows the Server's Response

graphics/15fig07.jpg

The response we receive is just a stepped-on version of the original XML message. Let's make authorize.php do some work, not simply repeat what it hears. In particular, we should get it to authorize users. Ultimately, we will validate username/password pairs against a database back end. But before we explore the delights of the database, let's complete the round-trip communication between Flash and the server.

The special username "anonymous" allows open entry to our site. Its only restriction is that the visitor needs to know a public password (and be able to spell "anonymous" correctly). The public password can change frequently or never. Today the password is "bluefish." We allow ourselves a shortcut and hardcode the password into the PHP file. (Easy for us, hard for the maintenance webmaster. But to ameliorate this a bit we put the password in the first line of our PHP file.)

PHP

 $anonymousPassword= "bluefish";

And we add the following to our XML output section.

PHP

 if ($theElements[ "USERNAME" ] == "anonymous") {
    // Anonymous visitors need only the public password.
    if ($theElements[ "PASSWORD" ] == $anonymousPassword)
        echo('<ACCEPT privilege ="guest"/>');
    else
        echo('<DENY error="404">Incorrect anonymous password</DENY>');
 } else {
    do_something_else();
 }
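The same decision can be isolated in a function that returns the XML fragment instead of echoing it, which makes it easy to try out without a browser or a Flash client. This is a sketch under assumptions: authorizeResponse() is a hypothetical name, and the empty string returned for other usernames merely stands in for whatever do_something_else() eventually does.

```php
<?php
// Hypothetical helper: decides on the reply for one parsed login message.
// $elements is the flat table built by the parser handlers.
function authorizeResponse(array $elements, string $anonymousPassword): string
{
    // Missing elements are treated as empty strings.
    $username = $elements["USERNAME"] ?? "";
    $password = $elements["PASSWORD"] ?? "";

    if ($username === "anonymous") {
        if ($password === $anonymousPassword) {
            return '<ACCEPT privilege="guest"/>';
        }
        return '<DENY error="404">Incorrect anonymous password</DENY>';
    }

    // Placeholder for every other username (an assumption of this sketch).
    return "";
}

$anonymousPassword = "bluefish";
echo authorizeResponse(
    ["USERNAME" => "anonymous", "PASSWORD" => "whitefish"],
    $anonymousPassword
), "\n";
echo authorizeResponse(
    ["USERNAME" => "anonymous", "PASSWORD" => "bluefish"],
    $anonymousPassword
), "\n";
```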

Now when we type in "anonymous" and "whitefish," we get the screen in Figure 15.8.

Figure 15.8. Bad Password

graphics/15fig08.jpg

On the other hand, if we type in "anonymous" and "bluefish" we get the screen in Figure 15.9.

Figure 15.9. Good Password

graphics/15fig09.jpg

For now do_something_else() drops back to echoing XML. Even more fun happens later when it tests the names against the database, but first let's get the client-server link solid.

The best way to develop and (especially) debug a networking project like this is to use powerful tools. Examining the results of transactions after the host or the client has manipulated the data is often confusing. In the last chapter, for example, we found that the tags had disappeared from an XML string between the time we composed it and the time we examined it. We suspected both the Flash code that generated, packaged, and transmitted the string and the PHP code that received and possibly overprocessed it. In fact, it was neither. The tags were eliminated by the HTML interpreter, which was displaying the debug dump from PHP.

In that case we were able to discover the answer by inspecting the source. Here, we have an anomaly buried deeper in the protocol. Flash tells us that it has received a message in the application/x-www-form-urlencoded format, but we had told the PHP script to send it as text/xml.

Since it seems to work, the disagreement is not very consequential, but its solution allows us to glimpse an important tool. Packet sniffers allow the programmer to learn exactly what traffic is occurring along a connection.

In Figure 15.10 we see the request segment of the HTTP transaction. The Flash client sent a header that, besides identifying the format and protocol of the request itself, the requesting browser, and the intended server page, announces 63 bytes of content of type text/xml, and that is exactly what follows. This makes sense, since it is exactly what we asked Flash to transmit.

Figure 15.10. Packet from Client Request

graphics/15fig10.jpg

If we look at the response with the packet sniffer, we see a wealth of interesting data (Figure 15.11). This is the packet we told PHP to send as Content-type: text/xml but which Flash (at least this version) identifies as application/x-www-form-urlencoded. The packet sniffer shows clearly who is telling the truth. Not only is the packet labeled text/xml in the header, but its actual contents are in that format.

Figure 15.11. Packet from Server Response

graphics/15fig11.jpg

Great programmers do not guess what is wrong (or guess what is right either). They test.

PHP