3.1 History of XML-RPC | Programming Web Services with Perl

XML-RPC was designed primarily by Dave Winer of UserLand Software, Inc. He was one of the designers working on the SOAP specification and became frustrated with the mounting complexity. He wanted something to use immediately, but SOAP was taking a long time to coalesce. So he forked off what was then an early working draft of the SOAP protocol, and this became what is now known as XML-RPC.

The first implementation of the specification was in UserLand's Frontier product, a content management system with scripting, object database, and server capabilities. This was introduced in April 1998, and eventually the specification was published to encourage the development of other compliant toolkits. Currently, there are 65 implementations in languages ranging from AppleScript to Zope. There are toolkits for Lisp, Ruby, Eiffel, Scheme, Dylan, and an impressive seven different implementations for PHP. Perl features three different implementations, which will be covered in depth in Chapter 4.

The web site for XML-RPC, http://www.xmlrpc.com, is a good source for more history of the specification. It also features links to various toolkits and the current specification.

3.1.1 The XML in XML-RPC

XML-RPC uses a simple XML application to express function calls (requests) and returned values (responses) between clients and servers. The heart of an XML-RPC message is the way data is encoded into XML.

3.1.1.1 Data encoding

Data is at the core of any interface, since the first and foremost goal is to send information between two points. XML-RPC supports six basic datatypes in messages (seven, technically, since i4 and int may be considered distinct), and also supports serialization of arrays and structures (name/value pairs just like Perl's hashes). The datatypes are explained in Table 3-1.

Table 3-1. The XML-RPC datatypes

Name/Tag	Sample value	Description
`<int>` or `<i4>`	12, -1, 65536, etc.	The `int` and `i4` types express 32-bit signed integer values. They are functionally the same, but if a server is expecting one of the two, it may not always accept the other encoding.
`<double>`	-2.7182818284, 3.14159265358979	The `double` tag describes double-precision floating-point data. The expression may only consist of the sign, digits, and the decimal point. The specification doesn't provide for exponential notation.
`<string>`	"XYZ", etc.	Data marked with the `string` tag is meant to be unedited character data. The only characters not directly allowed are `&` and `<`, which are entity-encoded as `&amp;` and `&lt;`.
`<boolean>`	`0` or `1`	The `boolean` type expresses the typical boolean true/false range using the values 1 and 0, respectively. While Perl lets you test general scalars for truth/falseness, this isn't true of many other languages.
`<dateTime.iso8601>`	20020726T02:50:54	Date and time values are expressed according to the ISO 8601 standard, and the type used for this is called `dateTime.iso8601`. This is covered in depth later.
`<base64>`	Any Base64-encoded blob of data	The `base64` type was added to the spec in a later revision, to support binary data that doesn't fit easily into the others.

Aside from the base64 and dateTime.iso8601 types, the tags should be self-explanatory. Some confusion may come from the int versus i4 tags, but these are only distinct in applications that check the actual encoding type information. Some servers are stricter about this than others, but the strict ones generally publish detailed descriptions of their interface, so the information is available to make the correct choice.

The base64 type was added in an update to the specification in early 1999. It allows for arbitrary data (images, digital audio, etc.) to be encoded using the widely accepted Base64 algorithm. The content between the opening and closing tags is considered to be the complete encoded entity (no allowance is made for breaking up large blobs into smaller chunks), not including any whitespace immediately before or after the data itself. Arbitrary whitespace can't appear within the Base64 data. If you ever need to process the Base64 data yourself, the MIME::Base64 module for Perl provides functions to encode and decode strings. Since the toolkits all handle this transparently, it won't be covered here.
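For readers who do want to see the encoding step, the following is a minimal sketch using the MIME::Base64 module mentioned above. The helper names base64_element and base64_content are hypothetical and aren't part of any toolkit.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper: wrap raw bytes in a <base64> element. The empty
# second argument tells encode_base64 not to insert line breaks, since
# whitespace shouldn't appear inside the encoded data.
sub base64_element {
    my ($bytes) = @_;
    return '<base64>' . encode_base64($bytes, '') . '</base64>';
}

# Hypothetical helper: recover the raw bytes from the text content of a
# <base64> element, trimming the whitespace around the data first.
sub base64_content {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return decode_base64($text);
}

print base64_element('Just Another Perl Book'), "\n";
print base64_content('   SnVzdCBBbm90aGVyIFBlcmwgQm9vaw== '), "\n";
```

The second call decodes the same sample data used in Table 3-1's companion example, leading and trailing whitespace included.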

The choice of the dateTime.iso8601 tag may seem curious at first, but the ISO 8601 standard allows for specification of dates and times in all time zones (by expressing them as offsets from UTC, Coordinated Universal Time). The syntax of that standard allows for partial specification of time only, date only, etc. It's a fairly flexible format with wide acceptance in the Web and Internet communities.
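A value in the compact form shown in Table 3-1 can be produced with the core POSIX module. This is only a sketch: the helper name iso8601_element is hypothetical, and formatting the time in UTC is an assumption of the example rather than something the format itself requires.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format an epoch time (seconds since 1970-01-01 UTC)
# as YYYYMMDDTHH:MM:SS. gmtime keeps the result in UTC, which is an
# assumption of this sketch.
sub iso8601_element {
    my ($epoch) = @_;
    my $stamp = strftime('%Y%m%dT%H:%M:%S', gmtime($epoch));
    return "<dateTime.iso8601>$stamp</dateTime.iso8601>";
}

print iso8601_element(1027651854), "\n";
```

The epoch value 1027651854 was chosen because it corresponds to the sample value 20020726T02:50:54 from Table 3-1.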

All the datatypes are expressed in XML using their tag name. None use any attributes, and none are valid as empty elements. The fragment in Example 3-1 shows each of the types in XML format:

Example 3-1. The XML-RPC datatypes in XML

 <int>255</int>
 <i4>-2147483648</i4>
 <double>3.14159265358979</double>
 <string>XML-RPC &amp; Perl</string>
 <boolean>0</boolean>
 <dateTime.iso8601>20020726T02:50:54</dateTime.iso8601>
 <base64>
   SnVzdCBBbm90aGVyIFBlcmwgQm9vaw==
 </base64>
 <!-- What did that say? "Just Another Perl Book" -->

Note that the string example decodes as XML-RPC & Perl. Data elements of type string may contain any characters, including nonprintable and null characters. The XML-RPC layer defines the string as the contents between the opening and closing tags (not including leading and trailing whitespace). This doesn't change the fact that if one end of the conversation is written in a language that stumbles over unusual characters (such as C might with embedded null characters), a problem can then arise. These things can't be mandated away by the specification, however, and must instead be handled by the toolkit authors when (and where) necessary.
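The entity encoding of string data takes only two substitutions, as long as they run in the right order. The sketch below is illustrative; string_element and string_content are hypothetical names, and a real toolkit would handle this for you.

```perl
use strict;
use warnings;

# Hypothetical helper: entity-encode the two characters that can't
# appear directly, then wrap the result in a <string> element.
sub string_element {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first, or it would re-encode &lt;
    $text =~ s/</&lt;/g;
    return "<string>$text</string>";
}

# Hypothetical helper: reverse the encoding on the text content of a
# <string> element.
sub string_content {
    my ($text) = @_;
    $text =~ s/&lt;/</g;
    $text =~ s/&amp;/&/g;    # must run last, for the same reason
    return $text;
}

print string_element('XML-RPC & Perl'), "\n";
print string_content('XML-RPC &amp; Perl'), "\n";
```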

3.1.1.2 Arrays and structures

The array and struct datatypes are how XML-RPC expresses complex data. These constructions can serialize almost any array or hash table Perl can produce (except for objects). Both structures allow for recursive embedding of structures, so an array's element may be a struct with a member element that contains yet another array, and so on.

The array element serializes data when the only distinguishing factor between two elements is their place within the order. Elements in an array don't have to all be of the same type. An array element contains one child, a data element. Even when an array has zero actual elements, the data container must be present.

Within the data container are zero or more value elements, each of which contains one item of data. Example 3-2 shows a basic array structure.

Example 3-2. A simple array structure

 <array>
   <data>
     <value><int>255</int></value>
     <value><double>3.14159265358979</double></value>
     <value><i4>-2147483648</i4></value>
     <value><string>XML-RPC &amp; Perl</string></value>
     <value>
       <array>
         <data>  <!-- An array with zero elements -->
         </data>
       </array>
     </value>
     <value>A string with no wrapping tag</value>
   </data>
 </array>

In this example, an empty array was embedded at what would be element 4 (assuming the count starts at 0). The example also showed a shortcut that the specification permits: within a value element, if the type of data being serialized is string, the type-specific tags are optional. In Perl, this array looks like this:

 @array = (255, 3.14159265358979, -2147483648, 'XML-RPC & Perl', [ ],
           'A string with no wrapping tag');

Expressing the struct type of data isn't much more complex than the array. The main difference is that the elements of a struct are named key/value pairs, just like Perl's hashes. A struct contains zero or more instances of a container called member. A member container holds two elements: the first is called name, and the second is called value. The value element is treated just as it is within an array, as you saw earlier. The name element is functionally the same as a string, but it isn't explicitly typed as such. The specification defines no limitations on the characters that can appear within the name element.

As with Perl's hashes, the order of the key/value pairs isn't guaranteed, so nothing about the order of the serialization should be assumed to mean anything to the actual data itself. Example 3-3 shows a simple struct expression.

Example 3-3. A simple sample struct expression

 <struct>
   <member>
     <name>pi</name>
     <value><double>3.14159265358979</double></value>
   </member>
   <member>
     <name>min.signed.int</name>
     <value><i4>-2147483648</i4></value>
   </member>
   <member>
     <name>publisher</name>
     <value>O'Reilly &amp; Associates</value>
   </member>
   <member>
     <name>nested array</name>
     <value>
       <array>
         <data>
         </data>
       </array>
     </value>
   </member>
   <member>
     <name>nested struct</name>
     <value>
       <struct></struct>
     </value>
   </member>
 </struct>

Unlike the array expression, an empty struct has no elements within it. Example 3-3 matches the following Perl:

 %hash = (pi => 3.14159265358979, 'min.signed.int' => -2147483648,
          publisher => "O'Reilly & Associates", 'nested array' => [ ],
          'nested struct' => {});
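The mapping between Perl data and the array and struct elements can be sketched as a short recursive routine. This is an illustration under stated assumptions, not how any particular toolkit works: the names to_value and escape_text are hypothetical, scalar types are guessed with regular expressions (Perl itself doesn't distinguish 42 from "42"), integers aren't range-checked against 32 bits, and hash keys are sorted only so that the output is repeatable.

```perl
use strict;
use warnings;

# Hypothetical helper: entity-encode & and < in character data.
sub escape_text {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    return $text;
}

# Hypothetical helper: serialize a Perl value as an XML-RPC <value>.
sub to_value {
    my ($item) = @_;
    my $kind = ref $item;
    my $body;
    if ($kind eq 'ARRAY') {
        # The <data> container is emitted even for an empty array.
        my $values = '';
        $values .= to_value($_) for @$item;
        $body = "<array><data>$values</data></array>";
    }
    elsif ($kind eq 'HASH') {
        # An empty hash yields an empty <struct>. Sorting is for
        # repeatable output only; member order carries no meaning.
        my $members = '';
        for my $key (sort keys %$item) {
            $members .= '<member><name>' . escape_text($key) . '</name>'
                      . to_value($item->{$key}) . '</member>';
        }
        $body = "<struct>$members</struct>";
    }
    elsif ($kind) {
        die "cannot serialize a $kind reference\n";   # objects, code refs, etc.
    }
    elsif ($item =~ /^-?\d+\z/) {
        $body = "<int>$item</int>";                   # heuristic guess
    }
    elsif ($item =~ /^-?\d+\.\d+\z/) {
        $body = "<double>$item</double>";             # heuristic guess
    }
    else {
        $body = '<string>' . escape_text($item) . '</string>';
    }
    return "<value>$body</value>";
}

print to_value({ pi => 3.14159265358979, 'nested array' => [ ],
                 'nested struct' => {} }), "\n";
```

Unlike the hand-written examples, this sketch always emits the string tags and produces no indentation, which mirrors the earlier point that such details don't affect the meaning of the message.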

3.1.1.3 Making a request

Data is useful, but data alone doesn't constitute a request. When a client makes a request to a service, it must inform the server at the other end what remote procedure (or method, in the language of the XML-RPC specification) it wishes to call. Within the call, any parameters that need to be passed as arguments to the procedure are encoded.

Requests are also expressed in a simple XML structure. The top-level element of the XML document is methodCall when encoding a request. It has one required child element and one optional element. The required element is methodName, and it contains the name of the method being called on the remote server. The name may contain only alphanumeric characters, underscore (_), period (.), colon (:), and slash (/) characters. As with other identifiers, the leading character of the name must be either an alphabetic character or an underscore. Many XML-RPC services use the period to denote namespaces, so seeing method names such as system.listMethods is common.

The optional element is called params; it's used when the procedure call has one or more parameters. It may be present even when there are no parameters, because the specification allows params to be empty. Toolkits for XML-RPC must allow for the case of an empty parameter list in their deserialization.

Within params are zero or more containers called param. Within each param container is exactly one value element, governed by the same rules as before. Example 3-4 illustrates this.

Example 3-4. A simple request message

 <?xml version="1.0"?>
 <methodCall>
   <methodName>user.create</methodName>
   <params>
     <param>
       <value>
         <struct>
           <member>
             <name>user_id</name>
             <value>rjray</value>
           </member>
           <member>
             <name>password</name>
             <value>bad_password!</value>
           </member>
           <member>
             <name>age</name>
             <value><int>34</int></value>
           </member>
         </struct>
       </value>
     </param>
     <param>
       <value>www.blackperl.com</value>
     </param>
   </params>
 </methodCall>

In the example, a routine called user.create is called with two parameters, a struct and a string. The struct has three members in it, the first two of which are string types, while the third is an int. In this example, the data represents new user information in the structure, followed by the host or domain in the second argument. The indentation is purely for readability; most toolkits don't make any effort to maintain indentation levels in the XML they generate. Also, the initial line required by an XML document (the XML declaration) is present in this example. Since an XML-RPC message must be valid XML, this line must always be present.

Even as the number of parameters and their complexity increase, the request will still look basically like Example 3-4. This simplicity in XML-RPC is what has given it a strong following, even after the SOAP specifications were unveiled and updated over time.
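The request rules described in this section can be sketched in a few lines of Perl. The names valid_method_name and method_call are hypothetical, the name check simply follows the character rules given in the text, and the parameters are assumed to be already-serialized value elements.

```perl
use strict;
use warnings;

# Hypothetical helper: check a method name against the rules in the
# text (letters, digits, underscore, period, colon, and slash, with a
# leading letter or underscore).
sub valid_method_name {
    my ($name) = @_;
    return $name =~ m{^[A-Za-z_][A-Za-z0-9_.:/]*\z} ? 1 : 0;
}

# Hypothetical helper: build a methodCall document. Each entry in
# @values is assumed to be an already-serialized <value> element.
sub method_call {
    my ($name, @values) = @_;
    die "invalid method name: $name\n" unless valid_method_name($name);
    my $params = join '', map "<param>$_</param>", @values;
    return qq{<?xml version="1.0"?>\n}
         . "<methodCall><methodName>$name</methodName>"
         . "<params>$params</params></methodCall>";
}

print method_call('user.create',
                  '<value>rjray</value>',
                  '<value>www.blackperl.com</value>'), "\n";
```

Calling method_call with no values still emits an empty params element, which the text notes is permitted.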

3.1.1.4 Creating a response

In any client/server model, the request is only half of the story. Fortunately, as simple as the request XML structure itself is, the response structure is even simpler.

The response is much more straightforward because the response format is stricter than the request. Requests have to specify the remote procedure or method name, and they must contend with specifying lists of arguments. A response always returns exactly one value. There are no responses with no return parameter (the equivalent of a C function returning void, for example).

The single return value is passed back to the client within a methodResponse top-level element, which contains a params element with a single param container. Example 3-5 shows a simple response message.

Example 3-5. A typical XML-RPC response

 <?xml version="1.0"?>
 <methodResponse>
   <params>
     <param>
       <value><int>1</int></value>
     </param>
   </params>
 </methodResponse>

While it is true that the structure of the response message contains several elements that may seem redundant, it allows for a simpler definition of content bodies by keeping the placement and role of params consistent across both requests and responses. Though the XML-RPC specification provides neither a formal DTD nor a schema, several other parties have crafted their own for use with other XML-related tools, and the structure of the messages lends itself to clear and simple expression.

Of course, while a return value may only be a single item, it can be a structure or array value. And as with requests, the contents of a structure or array may be arbitrarily deep and complex. Later in the chapter, the value of this rule will be demonstrated when discussing overloading of methods (procedures) and server-side call management.
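The one-value rule is easy to enforce when a response is built. In the sketch below, method_response is a hypothetical helper that takes already-serialized value elements; a procedure with several results to return would have to wrap them in a single array or struct value first.

```perl
use strict;
use warnings;

# Hypothetical helper: build a methodResponse document around exactly
# one already-serialized <value> element.
sub method_response {
    my @values = @_;
    die "a response carries exactly one value\n" unless @values == 1;
    return qq{<?xml version="1.0"?>\n}
         . '<methodResponse><params><param>' . $values[0]
         . '</param></params></methodResponse>';
}

print method_response('<value><int>1</int></value>'), "\n";
```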

3.1.1.5 Sending an error response

Needless to say, not every procedure will run without error. Such errors have to be easily distinguished from successful calls. In the next section, which discusses the HTTP communication, you'll see how the HTTP response code can't be used to signal a procedure-level error. Instead, XML-RPC has a syntax for marking a response as an error. This is called a fault. Since a client can't request a fault (obviously), this only applies to response messages.

Example 3-6 illustrates a typical fault message.

Example 3-6. A fault message for XML-RPC

 <?xml version="1.0"?>
 <methodResponse>
   <fault>
     <value>
       <struct>
         <member>
           <name>faultCode</name>
           <value><int>404</int></value>
         </member>
         <member>
           <name>faultString</name>
           <value><string>Resource not found</string></value>
         </member>
       </struct>
     </value>
   </fault>
 </methodResponse>

When a server returns a fault response, the fault structure replaces the params structure from a successful response. If an application uses XPath notation to process the response XML, for example, it can use the same query path regardless of fault or success, and then examine the child element's name to determine the nature of the response (fortunately, as will be seen in Chapter 4, toolkits do this so the application doesn't have to deal with it directly).

The fault structure is simply a container element with a single child, a value element. Unlike other value specifications, in this instance the only allowable content is a struct with exactly two member children. One of these must be named faultCode and have an int value. The other must be called faultString and have a value that is a string. The order of these two isn't important, because the struct doesn't preserve member order. But the naming is important, and must be followed. The fault structure can't contain extra members. The XML-RPC specification doesn't detail any sort of basic fault messages or codes. The strings and integers used are entirely up to the servers providing the service.
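A fault message with exactly these two members can be sketched as follows. Both fault_response and is_fault are hypothetical names, and is_fault is a deliberately crude pattern match rather than real XML parsing; it's here only to show that the child of methodResponse is what distinguishes a fault from a success.

```perl
use strict;
use warnings;

# Hypothetical helper: build a fault response with the two required
# members, faultCode (an int) and faultString (a string).
sub fault_response {
    my ($code, $message) = @_;
    die "faultCode must be an integer\n" unless $code =~ /^-?\d+\z/;
    $message =~ s/&/&amp;/g;
    $message =~ s/</&lt;/g;
    return qq{<?xml version="1.0"?>\n}
         . '<methodResponse><fault><value><struct>'
         . "<member><name>faultCode</name><value><int>$code</int></value></member>"
         . '<member><name>faultString</name><value><string>' . $message
         . '</string></value></member>'
         . '</struct></value></fault></methodResponse>';
}

# Hypothetical helper: a crude test for a fault, using a regular
# expression instead of an XML parser (an assumption of this sketch).
sub is_fault {
    my ($xml) = @_;
    return $xml =~ m{<methodResponse>\s*<fault>} ? 1 : 0;
}

my $xml = fault_response(404, 'Resource not found');
print $xml, "\n";
print is_fault($xml) ? "fault\n" : "success\n";
```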

3.1.2 Client and Server Communication

All communication in XML-RPC is done over HTTP. For strict adherence to the specification, all communication must further be limited to HTTP 1.0, because the specification explicitly calls for the presence of a Content-Length header in both the request and response. Many of the toolkits don't adhere to this level of strictness and will accept "chunked" transfer encoding on the server end of things. All toolkits provide the Content-Length header on responses, as far as is known. Their client implementations may also allow for HTTP 1.1-style chunked transfer encoding, but it isn't safe to assume this.

Clients make their requests using the POST verb of HTTP, hence the requirement for the Content-Length headers. In theory, the simpler data elements can be provided in the URL as query arguments, but applying this to the complex types (structures and arrays) is unnecessarily complex. Instead, the specification states that requests are sent as the body of a POST request, regardless of length.

The communication of important information between client and server is done through the HTTP headers. The specification requires that the headers shown in Table 3-2 be present in both requests and responses. This is another detail that is generally handled at the toolkit level, and as such is probably not something an application developer needs to worry about. A toolkit developer, on the other hand, should ensure that the toolkit is capable of dealing with any other XML-RPC-compliant tools.

Table 3-2. HTTP headers required for XML-RPC

Header	Role
`Content-Length`	An integer value giving the length in bytes of the content of the message (not including the headers).
`Content-Type`	This value must always be `text/xml` . Note that the HTTP specification for headers and their values allows a server to provide additional information, as long as this value is present and clear.
`Host`	(Required for requests only.) The name of the host the client is trying to connect with, in case multiple virtual hosts are being managed by the same server application.
`Server`	(Required for responses only.) A text string that identifies the server in some way, usually identifying the server software rather than just the hostname.
`User-Agent`	(Required for requests only.) A string that identifies the client application making the request. The use of the `User-Agent` header is inherited from the header's traditional role on the Web.

Both requests and responses provide the content-related headers. The requesting client has the additional requirement of providing the Host and User-Agent headers, while the responding server only needs to provide a Server header.
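To make the header requirements concrete, here is a sketch that assembles the raw text of a request by hand. The helper name http_request, the host rpc.example.com, the path /RPC2, and the User-Agent string are all made up for illustration. The body is converted to UTF-8 bytes before it is measured, because Content-Length counts bytes rather than characters.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: assemble the text of a POST request carrying an
# XML-RPC message, with the request headers listed in Table 3-2.
sub http_request {
    my ($host, $path, $xml) = @_;
    my $body = encode('UTF-8', $xml);      # measure bytes, not characters
    return join "\r\n",
        "POST $path HTTP/1.0",
        "Host: $host",
        'User-Agent: example-client/0.1',  # made-up client name
        'Content-Type: text/xml',
        'Content-Length: ' . length($body),
        '',                                # blank line ends the headers
        $body;
}

print http_request('rpc.example.com', '/RPC2',
                   qq{<?xml version="1.0"?>\n<methodCall><methodName>system.listMethods</methodName></methodCall>}), "\n";
```

In practice a toolkit or an HTTP client library does this work; the sketch only shows where each required header comes from.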

When a client sends a request to a server and receives an XML-RPC response, the HTTP involvement in the conversation has been successful regardless of whether the response itself is a fault or not. The reasoning for this is simple: the XML-RPC functionality could very well be coexisting with an ordinary web server, and that server has no native understanding of success versus failure in XML-RPC terms. An HTTP server only knows if it was able to successfully handle a request and send a valid response. HTTP error codes in the 400 (client error) or 500 (server error) ranges can indicate only that the HTTP layer had problems, not the XML-RPC code.

Any time a server receives a request and can give a response, the HTTP response code must be 200, the basic HTTP success indicator. This holds true for faults as well as successful returns, which is why faults are used to communicate problems. The range of fault types for XML-RPC is theoretically unbounded, but the range of error codes for HTTP is finite. Additionally, not all servers can attach arbitrary documents to error messages. In other words, HTTP error codes just wouldn't work. So, unless the URL is invalid, or the server is down, the HTTP layer reports success.

Example 3-7 shows the same response message as Example 3-5, but with server-side headers and the HTTP response line added. The content is abbreviated, and depending on the server in use, the complete set of headers may be much longer.

Example 3-7. XML-RPC response with HTTP headers

 HTTP/1.1 200 OK
 Connection: close
 Content-Length: 138
 Content-Type: text/xml
 Date: Sun, 28 Jul 2002 11:20:13 GMT
 Server: Apache/1.3.23 (Unix) mod_perl/1.26

 <?xml version="1.0"?>
 <methodResponse>
   <params>
     <param>
       <value><int>1</int></value>
     </param>
   </params>
 </methodResponse>

In Chapter 4, when toolkits are discussed, we will cover the merits of using an Apache server with mod_perl enabled. Suffice it to say, mod_perl can make an XML-RPC environment even more efficient, when coded and configured properly.

3.1.3 Method Signatures and Overloading

By design, XML-RPC treats the remotely executable procedures as strongly typed. While languages such as Perl or PHP may operate fine with arbitrary parameters passed in, languages such as Java and C are much more demanding about the integrity of their arguments list. Many server implementations address this by tracking the number, order, and type of parameters a procedure or method may accept. This is often referred to as the signature of the remote call.

Method signatures aren't a part of the XML-RPC specification. They are implemented at the server level in many of the toolkits offered for application development. Signatures are most often used by the server to determine if it can actually send the input data to the appropriate procedure without generating a fatal exception or error. Most languages have some sort of exception-handling facility (such as Perl's eval and die functions), but some don't. These languages can encounter problems if passed a string when expecting an integer, for example. Depending on the language and how the software was written, the result of these problems can range from just returning errors to stack-buffer overflows.

A method's signature is defined as the sequence of the return parameter's type, followed by the types of all input parameters. For example, using the messages from Example 3-4 and Example 3-5 as the input and response of the same method, that method's signature would be:

 (int, struct, string)

Note that the first input parameter is the struct, and that nothing in the signature indicates (or mandates) any of the underlying structure in that parameter.

This is one reason why the return values from methods must always be single values. There would be no way to discern the earlier signature from one that indicates a single input of string, with an output of int followed by struct. In order to fully support languages such as Java or C++ that allow multiple interfaces to methods (sometimes called overloaded methods), it is necessary to distinguish between the different calling signatures. If an XML-RPC server is exposing routines from a library written in C, this isn't an issue. But Java, C++, and many other languages (including Perl) can have more than one way to call a given routine.
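One way a server might use signatures to choose among overloaded methods is sketched below. Everything here is hypothetical: the %signatures table, the second signature for user.create, and the match_signature routine are invented for illustration, and real toolkits store and check signatures in their own ways.

```perl
use strict;
use warnings;

# Hypothetical signature table. Each signature lists the return type
# first, followed by the types of the input parameters.
my %signatures = (
    'user.create' => [
        [qw(int struct string)],
        [qw(int struct)],          # invented overload for illustration
    ],
);

# Hypothetical helper: return the signature (an array reference) whose
# input types match the types of the incoming arguments, or undef if
# no signature fits.
sub match_signature {
    my ($method, @arg_types) = @_;
    for my $sig (@{ $signatures{$method} || [] }) {
        my ($return_type, @inputs) = @$sig;
        return $sig if join(',', @inputs) eq join(',', @arg_types);
    }
    return undef;
}

my $sig = match_signature('user.create', 'struct', 'string');
print $sig ? '(' . join(', ', @$sig) . ")\n" : "no matching signature\n";
```

A server that finds no matching signature can return a fault instead of passing mistyped arguments to the underlying procedure.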

How a server manages method signatures and how (or even if) it imparts that information to client applications is entirely up to the developers of the server toolkits and developers of the servers themselves.