SOAP, based on its XML pedigree, is meant to be platform and operating system independent. It is also supposed to be totally programming language agnostic, despite its prowess in serving as an RPC mechanism. It is thus ideally suited to be freely and gainfully used as a message-based communications scheme between disparate systems. This intersystem communications capability is the fundamental relationship between SOAP and Web services. SOAP serves as the means for realizing Web services I/O operations and consequently also for Web service invocation. Despite the overt I/O-related role that it plays in Web services, SOAP nonetheless is not a transport mechanism.
SOAP is meant to be used on top of standard transport protocols. The layering of SOAP on top of the transport layer is clearly shown in the Web services stack diagram in Figure 4.14. SOAP can be used across HTTP, TCP, SMTP, FTP, message queuing (e.g., IBM's WebSphere MQ), BEEP, or other RPC mechanisms. However, HTTP is the preferred and most widely used transport scheme for SOAP, in the context of Web services as well as other scenarios.
HTML over HTTP is what the Web is all about, but with SOAP, you can have XML over HTTP. The SOAP specifications acknowledge this made-in-heaven relationship between SOAP and HTTP. Though it is stated that SOAP is not limited to use with HTTP, the only transport bindings shown in the original SOAP specification (known as the 1.1 specification) were those for HTTP.
Given that SOAP is a messaging scheme that has structures known as envelopes and payloads, it has become de rigueur to use a postal analogy to describe SOAP, though the analogy is somewhat limited. Nonetheless it would be remiss not to mention it in passing, given its widespread usage within the industry. SOAP is all about the item that is to be mailed. It describes how the item to be mailed (i.e., the payload) is to be packaged, in a modular manner. That is where the SOAP envelope comes in.
However, SOAP does not dictate how this envelope containing the payload (i.e., the message) is delivered to the intended recipient. When dealing with paper-oriented mail, you have multiple delivery options (e.g., normal mail, express mail service, courier [e.g., FedEx], and so forth). Those are transport options: same message, but very different delivery characteristics. It is the same with SOAP and the various transports over which it can be used.
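The separation between envelope and delivery can be made concrete with a small sketch. It is only an illustration under stated assumptions: the application namespace `urn:example:quotes`, the `GetQuote` payload, the endpoint URL, and the e-mail address are all hypothetical, and nothing is actually sent. The same SOAP 1.1 envelope is first packaged once and then handed, unchanged, to two different "carriers": an HTTP POST request and an e-mail message suitable for SMTP.

```python
import xml.etree.ElementTree as ET
from email.message import EmailMessage
from urllib.request import Request

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"  # hypothetical application namespace


def build_envelope(symbol: str) -> bytes:
    """Package a payload inside a SOAP 1.1 envelope and return it as XML bytes."""
    ET.register_namespace("s", SOAP11_NS)
    envelope = ET.Element(f"{{{SOAP11_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP11_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")  # the payload
    ET.SubElement(request, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")


def over_http(message: bytes) -> Request:
    """Wrap the message in an HTTP POST request (built here, never sent)."""
    return Request(
        "http://example.com/quotes",  # hypothetical endpoint
        data=message,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"urn:example:quotes#GetQuote"',  # hypothetical action
        },
    )


def over_smtp(message: bytes) -> EmailMessage:
    """Wrap the same message in an e-mail for SMTP delivery (built, never sent)."""
    mail = EmailMessage()
    mail["To"] = "quotes@example.com"  # hypothetical recipient
    mail["Subject"] = "SOAP request"
    mail.set_content(message.decode("utf-8"), subtype="xml")
    return mail


message = build_envelope("IBM")
http_request = over_http(message)
mail = over_smtp(message)
print(http_request.get_method(), http_request.full_url)
print(mail.get_content_type())
```

Note that `build_envelope` knows nothing about either carrier; only the two wrapper functions differ, which is the point of the postal analogy.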
When used on top of HTTP, SOAP messages can typically traverse corporate firewalls and evade standard packet-filtering policies, since HTTP, as the basis for interacting with Web servers, is given carte blanche access. Ironically, one of the motivations for developing SOAP was the fact that when used with HTTP it could indeed be a powerful remote procedure call mechanism that would not be blocked by a corporate firewall. CORBA and the DCOM-based distributed computing approaches needed specific ports to be opened in the firewall for them to get through. Since network administrators are invariably hesitant to open up new ports on firewalls, this was yet another problem that beset the CORBA and DCOM approaches. Suffice it to say that SOAP's ability to get through firewalls by using HTTP is now a source of some concern and one of the security-related issues that has to be cogently dealt with in the context of Web services.
However, this firewall traversal capability is not an overt security exposure, the reason being that getting one or more SOAP messages through a firewall by itself is unlikely to pose any kind of threat or cause any damage. SOAP, as shown in Figure 5.1, always works in application-to-application, sender-to-receiver mode. Thus a SOAP message that traverses a firewall is but a harmless lost transmission unless it is received by and acted upon by an application (or Web service) running behind the firewall. The secret for enforcing SOAP-level security is to ensure that any SOAP-capable application (or Web service) deployed behind a firewall is authorized, trusted, and regularly monitored.
SOAP predates the advent of Web services. The origins of SOAP can be traced back to XML-RPC. XML-RPC was developed in early 1998 by a few inspired visionaries working for UserLand Software, DevelopMentor, and Microsoft, with the names of Dave Winer of UserLand Software and Don Box of DevelopMentor inextricably associated with the development of XML-RPC and its influence on the creation of SOAP. XML-RPC, true to its name, is a simple RPC mechanism realized by using XML over HTTP. XML-RPC demonstrated that XML's scope did not have to be limited to that of representing the exact meaning of a structured document. XML, with XML-RPC, could also serve as the basis for a powerful, standards-based distributed transaction processing scheme. XML-RPC is still in use today, though SOAP has in many ways supplanted it as its more strategic and widely known descendant.
Following XML-RPC, Microsoft and DevelopMentor went on to investigate transport-independent, XML-based messaging schemes that could be used for distributed computing. They were striving for a mechanism that was simpler to deploy and use than either DCOM or CORBA, which were considered at that time the long-term solutions for component-based distributed computing, with the former being Microsoft specific while the latter was open and vendor independent. The resulting specification, SOAP 1.0, was ready in September 1999. To garner sufficient market backing for this specification, Microsoft sought other partners. The result was the SOAP 1.1 specification, which was authored by representatives from Microsoft, DevelopMentor, IBM/Lotus, and UserLand Software.
The 1.1 specification was submitted to W3C on May 8, 2000. Collaborating on SOAP provided IBM and Microsoft with the inspiration and impetus to flesh out the concept of XML Web services that would use this new XML-based messaging scheme as their I/O mechanism. IBM, Microsoft, and Ariba thus went on to create the specifications for WSDL and UDDI, which were made public 6 months after the unveiling of SOAP.
In addition to the four companies that authored it (keeping in mind that Lotus is a division of IBM), the submission of the SOAP 1.1 specification to W3C was further endorsed by other then-big names in the industry, including Ariba, Compaq, H-P, SAP, IONA Technologies, and CommerceOne. Given this wide and influential industry backing, W3C did not subject this specification to the rigorous ratification process that is typically the norm. Bowing to the momentum that SOAP had already picked up by that juncture, and the fact that SOAP implementations were already in progress, particularly from Microsoft (i.e., Microsoft's SOAP Toolkit 1.0, which was available in summer 2000) and the Java camp, led by IBM and Apache, W3C accepted SOAP 1.1 as a de facto industry standard. By 2002, there were more than 70 separate SOAP implementations.
SOAP 1.1 was not as precise as some would have wished, and there were some interoperability issues between different implementations of the 1.1 specification. This led to the W3C's XML Protocol Working Group initiating work on a SOAP 1.2 specification that attempted to address the nearly 400 technical and editorial issues cited against the original specification. This working group was formally chartered with creating an XML-based standard (namely, SOAP 1.2) that would satisfy, at a minimum, the following criteria:
Develop an envelope-oriented encapsulation scheme for XML data so that it could be transferred in an interoperable manner between disparate systems, allowing for future extensibility and evolution in terms of distributed systems, particularly in terms of possible intermediary nodes that may occur between the transmitter and the receiver where these intermediaries could be in the form of gateways, caches, or application-level proxies.
Ensure, with the cooperation of the IETF, an operating system-independent (and programming language-independent) means for representing the contents of the SOAP envelope when SOAP is used for RPC-related operations.
Define a mechanism based on XML schema data types (e.g., xsd:string, xsd:integer, xsd:decimal, and xsd:boolean) to represent necessary data (where such a process for representing data so that it can be correctly interpreted at the remote end is referred to as data serialization).
Define, yet again with the cooperation of the IETF, a nonexclusive transport mechanism that could be layered on top of HTTP.
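The data serialization criterion in the list above is easier to picture with a small sketch. What follows is an assumption-laden illustration of the general idea, not the encoding rules of any SOAP version: each value is written as an XML element tagged with an XML schema type name so that the remote end knows how to interpret the text. The element names (`price`, `quantity`, and so on) and the Python-to-schema type mapping are hypothetical, and a real message would also declare the `xsd` prefix.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Assumed mapping from Python types to XML schema type names. bool must be
# listed before int because bool is a subclass of int in Python.
XSD_TYPES = [
    (bool, "xsd:boolean"),
    (int, "xsd:integer"),
    (Decimal, "xsd:decimal"),
    (str, "xsd:string"),
]


def serialize(name, value):
    """Turn a Python value into an element labeled with its schema type."""
    for py_type, xsd_name in XSD_TYPES:
        if isinstance(value, py_type):
            element = ET.Element(name, {f"{{{XSI_NS}}}type": xsd_name})
            # XML schema booleans are written in lowercase (true/false).
            element.text = str(value).lower() if py_type is bool else str(value)
            return element
    raise TypeError(f"no schema type known for {type(value).__name__}")


def deserialize(element):
    """Rebuild the Python value using the type label carried by the element."""
    xsd_name = element.get(f"{{{XSI_NS}}}type")
    text = element.text or ""
    if xsd_name == "xsd:boolean":
        return text in ("true", "1")
    if xsd_name == "xsd:integer":
        return int(text)
    if xsd_name == "xsd:decimal":
        return Decimal(text)
    if xsd_name == "xsd:string":
        return text
    raise ValueError(f"unsupported schema type: {xsd_name}")


def round_trip(name, value):
    """Serialize to XML bytes, parse them again, and decode the value."""
    wire = ET.tostring(serialize(name, value))
    return deserialize(ET.fromstring(wire))


for name, value in [("symbol", "IBM"), ("quantity", 100),
                    ("price", Decimal("19.99")), ("urgent", True)]:
    print(name, repr(round_trip(name, value)))
```

The receiving side never sees the sender's in-memory representation, only the text and its type label, which is why this approach can work across operating systems and programming languages.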
The SOAP 1.2 specification, sanctioned as an official W3C recommendation, was made available on June 24, 2003. This W3C recommendation status makes SOAP 1.2 a bona fide standard. The 1.2 specification consists of two parts:
SOAP Version 1.2 Part 1: Messaging Framework
SOAP Version 1.2 Part 2: Adjuncts
These two parts, which constitute the technical body of the specification, were edited by representatives from Canon, IBM, Microsoft, and Sun. This technical portion of the specification is augmented by an introductory primer, SOAP Version 1.2 Part 0: Primer. This document was edited by a representative of Ericsson. The complete 1.2 document set, however, is considered to consist of one other document: the SOAP Version 1.2 Specification Assertions and Test Collection. This document, which is essentially a test plan for 1.2, was created by representatives of Active Data Exchange, AT&T, IONA Technologies, Oracle, Unisys, and W3C.
The goal of the Assertions and Test Collection is to foster interoperability between diverse 1.2 implementations. Given that interoperability was an issue with 1.1, it is easy to appreciate the motivation for this specification-validating suite of tests. This document captures assertions found in the SOAP Version 1.2 Part 1 and Part 2 specifications and provides a set of tests that indicate whether the assertions are properly implemented in a given SOAP implementation.
These tests are meant to help SOAP implementors ensure that their creations comply with the actual specification. A SOAP 1.2 implementation that passes all of the tests specified in this document may claim to conform to the June 24, 2003, SOAP 1.2 Test Suite, this being the date that the document was accepted as a W3C recommendation. In theory, all implementations that successfully pass the entire suite of tests contained in this document should be able to cleanly interoperate with each other without encountering unexpected exceptions.
However, the successful completion of this test suite does not necessarily guarantee total SOAP 1.2 compliance, since this test suite admits up front that it does not test all aspects, particularly those facets of an implementation that are considered to reflect the core mandatory SOAP 1.2 requirements spelled out in the specification. The bottom line here, however, is that having a test suite that sets out to validate a relatively large part of the overall specification is definitely a worthwhile and sound basis for trying to enforce interoperability.
In a somewhat related vein, the 1.2 specification also strives to be as unambiguous as possible, especially when describing the structure of the XML-based documents, and to minimize the potential implementation variations that could result in interoperability issues. To this end, Part 1 of the specification, which deals with the composition of SOAP messages, resorts to the use of XML Information Set (XML InfoSet) conventions. XML InfoSet is a relatively new W3C standard that was formally ratified in October 2001.
The goal of XML InfoSet is to provide a consistent set of definitions for use in other specifications that need to refer to the information in a well-formed XML document. The XML InfoSet is an abstract data set for XML documents. It is, in effect, a guide for writing more-rigorous XML. It emphasizes the use of XML namespaces to eliminate ambiguity. Given this mission to promote better-structured XML documents, one can understand its appeal to those crafting the SOAP 1.2 specification. With the use of InfoSet, SOAP 1.2 also shifts the data serialization (i.e., remote data representation) issue to correspond with the transport that is to be used. Thus it is left to the specification for a transport binding to dictate the serialization scheme that will be used with that transport.
Other than the use of InfoSet, with its emphasis on namespaces, most of the other changes between 1.1 and 1.2 tend to be in the realm of technical refinement and are somewhat esoteric. For example, with 1.1 it was possible to have other elements, known as trailers, after the payload-carrying body element of a SOAP message. In other words, there could be elements between </s:Body> and </s:Envelope>. SOAP 1.2 does not permit such trailers.
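The trailer rule lends itself to a short sketch. The code below is a simplified illustration rather than a conformance check: it looks only at which envelope namespace a message uses (the standard SOAP 1.1 and SOAP 1.2 namespace URIs) and whether any elements follow the body. The sample message, with its `GetQuote` payload and `Audit` trailer, is hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"


def namespace_of(element):
    """Extract the namespace URI from an ElementTree tag of the form {uri}local."""
    if not element.tag.startswith("{"):
        raise ValueError("envelope is not namespace-qualified")
    return element.tag[1:].split("}", 1)[0]


def find_trailers(envelope):
    """Return the tags of any elements that follow Body inside the envelope."""
    tags = [child.tag for child in envelope]
    body_index = tags.index(f"{{{namespace_of(envelope)}}}Body")
    return tags[body_index + 1:]


def trailers_acceptable(xml_text):
    """Trailers are tolerated in a 1.1 envelope but rejected in a 1.2 envelope."""
    envelope = ET.fromstring(xml_text)
    has_trailers = bool(find_trailers(envelope))
    if namespace_of(envelope) == SOAP12_NS:
        return not has_trailers
    return True


def sample_message(envelope_ns):
    """Build a hypothetical message with one trailer element after the body."""
    return (
        f'<s:Envelope xmlns:s="{envelope_ns}">'
        '<s:Body><q:GetQuote xmlns:q="urn:example:quotes"/></s:Body>'
        '<t:Audit xmlns:t="urn:example:audit"/>'
        '</s:Envelope>'
    )


print(trailers_acceptable(sample_message(SOAP11_NS)))
print(trailers_acceptable(sample_message(SOAP12_NS)))
```

The same message body is thus acceptable or not purely as a function of the SOAP version its envelope claims.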
SOAP, to be as flexible and extensible as possible, advocates a message transfer model in which there can be a chain of SOAP-cognizant nodes between the originator and receiver of a message. Each node in such a chain may process a part of the overall SOAP message. Nodes that perform some level of processing on a SOAP message are known as SOAP endpoints. The processing performed by intermediary endpoints typically relates to header items within the SOAP message. Figure 5.2 illustrates some of the endpoint configurations possible with SOAP, highlighting that SOAP supports multicast/broadcast as well as chained workflow-type configurations. In SOAP 1.1, header elements could contain an actor= attribute, which specifies which types of endpoints (which are defined via a URI) should process that particular header element.
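How an endpoint decides which header elements are its own can be sketched as follows. This is a minimal illustration assuming SOAP 1.1 conventions: the envelope namespace and the special "next" actor URI come from the 1.1 specification, while the header names (`Audit`, `Billing`, `Hop`, `Priority`) and the `urn:example:...` actor URIs are hypothetical. A header with no actor attribute is treated as intended for the final receiver.

```python
import xml.etree.ElementTree as ET

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"
NEXT_ACTOR = "http://schemas.xmlsoap.org/soap/actor/next"  # "whoever sees this next"

MESSAGE = """<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Header>
    <a:Audit xmlns:a="urn:example:audit" s:actor="urn:example:roles:logger">trace-1</a:Audit>
    <b:Billing xmlns:b="urn:example:billing" s:actor="urn:example:roles:billing">acct-7</b:Billing>
    <h:Hop xmlns:h="urn:example:hop" s:actor="http://schemas.xmlsoap.org/soap/actor/next">1</h:Hop>
    <p:Priority xmlns:p="urn:example:priority">high</p:Priority>
  </s:Header>
  <s:Body><q:GetQuote xmlns:q="urn:example:quotes"/></s:Body>
</s:Envelope>"""


def headers_for(xml_text, node_actors, is_final_receiver=False):
    """List the header elements that a given endpoint should process."""
    envelope = ET.fromstring(xml_text)
    header = envelope.find(f"{{{SOAP11_NS}}}Header")
    if header is None:
        return []
    selected = []
    for block in header:
        local_name = block.tag.split("}", 1)[1]
        actor = block.get(f"{{{SOAP11_NS}}}actor")
        if actor is None:
            # No actor attribute: the header is meant for the final receiver.
            if is_final_receiver:
                selected.append(local_name)
        elif actor == NEXT_ACTOR or actor in node_actors:
            selected.append(local_name)
    return selected


# An intermediary that plays the (hypothetical) logger role.
print(headers_for(MESSAGE, {"urn:example:roles:logger"}))
# The final receiver, which plays no special intermediary role.
print(headers_for(MESSAGE, set(), is_final_receiver=True))
```

Each endpoint in a chain applies the same selection rule with its own set of actor URIs, which is how different nodes end up processing different parts of the same message.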
SOAP 1.2 renames the actor= attribute to a more meaningful role= attribute while keeping its purpose and meaning the same as that for actor= (discussed further in Section 5.2.1). The other changes in 1.2 are similarly arcane and are targeted at technocrats heavily involved in implementing rather than using SOAP. Consequently, these are beyond the scope and mission of this book, which is meant to be a nontechnical executive's guide for managers and decision makers.
However, there is one subtle but significant change regarding SOAP 1.2 that deserves exposition. As of 1.2, SOAP will no longer be considered an acronym for Simple Object Access Protocol. Instead, SOAP will just stand for SOAP! The phrase Simple Object Access Protocol is misleading, and one has to suspect that it was contrived just to obtain the catchy acronym.
SOAP is a message transfer protocol rather than an object access mechanism. There is nothing that relates to object orientation when it comes to SOAP. SOAP does not even assume that any objects exist at the sending or receiving ends, so the object part is inappropriate. Furthermore, exactly how simple SOAP is, particularly as it evolves with 1.2, is also open to debate. In the early days of SOAP, people would claim that a SOAP implementation could be realized in a weekend, given that it was such a bare-bones protocol. However, the consensus today is that it really would take a rather long weekend to successfully realize a SOAP implementation that conformed to the 1.2 test suite. So it is best to forget the simple and object references and just think of SOAP as what greases the skids when it comes to Web services I/O.