5.2 SOAP: The overall model

As is the case with XML, the power of SOAP comes from its being based on a simple, modular, and highly flexible model that eschews restrictions. The core SOAP philosophy revolves around three fundamental precepts, as follows:

SOAP messages consist of XML documents.
SOAP messages, as illustrated in Figure 5.1, flow on an application-to-application basis, from a sender to a recipient, in a stateless, one-way message exchange scheme.
There can also be a chain of recipients (i.e., endpoints or nodes) that process a given message, with each recipient in the chain being responsible for processing some part of the message.

Everything to do with SOAP, and its use in the context of Web services, is derived from these three basic precepts. The new SOAP documentation captures this with an opening that categorically but simply states: SOAP Version 1.2 provides the definition of the XML-based information, which can be used for exchanging structured and typed information between peers in a decentralized, distributed environment. Subsequent sections in the introduction stress that SOAP is a lightweight protocol and that SOAP messages can be exchanged over a variety of underlying protocols. They also note that the SOAP framework has been designed to be independent of any particular programming model and other implementation-specific semantics. Figure 5.3 presents several representations of SOAP stacks to illustrate how SOAP can be layered on top of diverse transports.

Figure 5.3: SOAP's transport-independent stack, realized via the protocol binding layer, shown here in its generic form (on top) as well as with some popular implementational options.

Simplicity and extensibility are two major design goals for SOAP, which strives to realize these goals by trying to stay aloof from many of the networking-related issues normally addressed by other distributed computing paradigms. Some of these networking-related features that are not addressed by SOAP include reliability, security (including firewall traversal), message correlation, and routing. SOAP assumes that these important networking issues will be handled by the transport layer below it or by the application layer above it.

Part 1 of the SOAP 1.2 specification (i.e., the messaging framework specification) deals with:

The SOAP processing model, which defines the rules for processing a SOAP message.
The SOAP extensibility model, which deals with the concepts surrounding SOAP features and SOAP modules, where features in this context are extensions to the base processing model. The specification describes how extensions may be added to the basic model rather than specifying any explicit features.
SOAP's underlying protocol binding framework, which describes the rules for defining a SOAP binding to an underlying transport protocol; this can then be used for exchanging SOAP messages between SOAP nodes.
SOAP message construct, which defines the specific structure of a SOAP message, which in essence consists of one (and only one) SOAP envelope; this in turn may contain an optional header element and a mandatory body element.
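
The message construct described in the last item above (one envelope, an optional header, a mandatory body) can be sketched in a few lines of Python. This is only a minimal illustration built on the standard library's xml.etree.ElementTree; the helper name build_envelope and the example.org payload are assumptions made for this sketch and do not come from the SOAP specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
ET.register_namespace("env", SOAP_ENV)


def build_envelope(body_children, header_blocks=()):
    """Return an Envelope element with an optional Header and a mandatory Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    if header_blocks:  # the Header is optional, so emit it only when needed
        header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
        header.extend(header_blocks)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")  # always present
    body.extend(body_children)
    return envelope


# Hypothetical application payload, for illustration only.
payload = ET.Element("{http://example.org/app}ping")
payload.text = "hello"

envelope = build_envelope([payload])
print(ET.tostring(envelope, encoding="unicode"))
```

Because the header is emitted only when header blocks are supplied, the simplest message produced by this sketch contains nothing but the envelope and its body.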

Part 2 of the SOAP 1.2 specification (i.e., the so-called adjuncts) deals with optional functions that may be used with the messaging framework specified in Part 1. The optional adjuncts dealt with in Part 2 are as follows:

A SOAP data model for representing application-defined, non-XML-based data structures and values by using a concept that relies on XML-qualified names and namespaces, which the specification describes, in a rather cumbersome and misleading turn of phrase, as a directed, edge-labeled graph of nodes.
The SOAP encoding scheme, which can be used with the data model for representing application-defined data that are to be included within SOAP messages.
The ever-so-important RPC encapsulation mechanism, which shows how SOAP can be used as a means to transport and remotely invoke the RPC calls of contemporary programming languages.
A description of how features (i.e., extensions) and bindings can be shared between specific SOAP nodes by using message metadata, which, in this instance, are known as properties and property values.
A description of SOAP-Supplied Message Exchange Patterns (MEPs) for two specific request-response scenarios, where the first involves a SOAP request that results in a SOAP response, whereas the second deals with a non-SOAP request (i.e., one that does not include a SOAP envelope) that nonetheless results in a SOAP response being sent back.
How a small set of Web server-related methods (referred to in the specification as SOAP Web methods), such as GET, PUT, POST, and DELETE, which could be used with HTTP and possibly other protocols, support SOAP messaging over the Web (e.g., SOAP over HTTP).
The oft-mentioned SOAP HTTP binding, which defines how SOAP can be gainfully deployed over HTTP, particularly for RPC applications.
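
To make the idea of an HTTP binding a little more concrete, the following Python sketch wraps a serialized SOAP envelope in an HTTP POST request carrying the SOAP 1.2 media type, application/soap+xml. It is an illustration under stated assumptions rather than a complete binding: the endpoint URL and the helper name make_soap_post are hypothetical, and the request is only constructed, never sent.

```python
import urllib.request

SOAP_MEDIA_TYPE = "application/soap+xml; charset=utf-8"  # SOAP 1.2 media type


def make_soap_post(url, envelope_xml):
    """Wrap a serialized SOAP envelope in an HTTP POST request (not sent)."""
    return urllib.request.Request(
        url,
        data=envelope_xml.encode("utf-8"),
        headers={"Content-Type": SOAP_MEDIA_TYPE},
        method="POST",
    )


envelope_xml = (
    '<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">'
    "<env:Body/></env:Envelope>"
)

# Hypothetical endpoint; nothing is transmitted by this sketch.
request = make_soap_post("http://example.org/soap-endpoint", envelope_xml)
print(request.get_method(), request.full_url)
print(request.get_header("Content-type"))
```

Passing the resulting object to urllib.request.urlopen would perform the actual exchange; the SOAP envelope itself is unchanged by the choice of transport.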

The adjunct part of the specification appears to be somewhat esoteric and abstruse. This, unfortunately, is a growing problem with today's egalitarian, Web-oriented standards. Everybody, within reason, gets a say, and too many agendas and views, however well intentioned, come into play. The SOAP 1.2 adjuncts are a compromise of sorts. These adjuncts try to provide optional extensions, within the overall SOAP messaging framework specified in Part 1, that satisfy as many groups as possible.

At this juncture, it is worth recalling once again that SOAP's overarching goal is to provide a simple, stateless, one-way message exchange model. A SOAP message is the basic unit of communication within this context. A SOAP message is fundamentally a one-way transmission between SOAP nodes, from a SOAP sender to a SOAP receiver, where a SOAP node is an implementation of the necessary processing logic to transmit, receive, process, or relay a SOAP message. The SOAP nodes enforce the protocols that govern SOAP message exchanges. SOAP permits and expects SOAP messages to be combined by applications to realize complex interaction patterns that involve multiple, back-and-forth conversational exchanges. This is where the adjuncts come in. They show how applications can take the basic one-way messaging paradigm and create more complex interaction patterns by exploiting features provided by an underlying protocol or through application-specific functionality.

SOAP, as the specification likes to point out, is also silent on the semantics (i.e., meaning or intent) of any application-specific data it conveys, as it is on issues such as routing, reliable data transfer, and firewall traversal, as mentioned earlier. SOAP, however, particularly via the adjuncts, provides the framework by which application-specific information may be conveyed in an extensible manner (e.g., the SOAP data model). Thus the adjuncts, though optional, ensure that SOAP can satisfy the requirements of contemporary Web-related applications, in particular Web services.

5.2.1 SOAP processing model

The SOAP processing model, which sets out to define a distributed processing environment, revolves around the concept of SOAP nodes. A SOAP node, specific to a given SOAP message, can be:

The initial SOAP sender
The ultimate SOAP receiver
A SOAP intermediary

A SOAP node, on receiving a SOAP message, must process that message according to the processing model rules set out in the SOAP specification. When processing a SOAP message, a SOAP node is said to act in one or more SOAP roles, where each such role is clearly identified by a specific URI, which is known as the SOAP role name. Each node can determine the set of roles in which it proposes to act when handling a given message. The role or roles assumed by a node, however, have to remain constant during the processing of a given message.

A node identifies the role or roles that it can play for a given message via the URIs that specify the role names. Such role names are stated via a role= attribute, which can be included in a SOAP header block, where a header block constitutes a single computational unit within a SOAP header. (This role= attribute used to be the actor= attribute in SOAP 1.1.) Thus, for example, you could have a role name that indicates audit or journal, as shown in this SOAP extract:

    <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
                   xmlns:rl="http://examples.org/app-roles">
      <soap:Header>
        <rl:track soap:role="http://examples.org/journal">
          :
          :
      </soap:Header>
      <soap:Body>
        :
        :
        :
        :
      </soap:Body>
    </soap:Envelope>

Nodes that recognize the indicated role name and are capable of playing that role would perform the necessary actions, which in this case might be to journal the forwarding of the message so that there is an audit trail of its passage. A header block is deemed to be targeted at a SOAP node if a role name in a header block corresponds to a name of a role in which that node is capable of operating. Though the specification does not prescribe how implementations may exploit the role function, it is easy to see how this attribute could be used to realize node-specific routing capabilities or to indicate whether certain messages could be cached for subsequent replay from an intermediary node.

The SOAP 1.2 specification defines three role names that have special significance. These, with their corresponding URIs, are as follows:

next: http://www.w3.org/2003/05/soap-envelope/role/next
ultimateReceiver: http://www.w3.org/2003/05/soap-envelope/role/ultimateReceiver
none: http://www.w3.org/2003/05/soap-envelope/role/none

Each node that is expected to act as an ultimate receiver (i.e., intended recipient) or as an intermediary must be able to perform the next role. Those that will be an ultimate receiver have to be able to perform the ultimateReceiver role. The none role is somewhat incongruous but is there to ensure total architectural completeness and integrity. It indicates a role that is not to be performed by a SOAP node. SOAP header blocks that are targeted at http://www.w3.org/2003/05/soap-envelope/role/none are not supposed to be processed by SOAP nodes. It thus provides a kind of out-of-specification mechanism to convey data in a header block that may be of use when processing other valid header blocks. While an intermediary node may have the option of removing a header block that is tagged with a none role, it is typically assumed that such header blocks are relayed, unchanged, to the ultimate receiver.
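
The targeting rules described above can be sketched as follows. This is a minimal illustration rather than a conforming SOAP processor: the function names and the rl:hint and rl:priority header blocks are hypothetical, and the sketch treats a header block with no role attribute as targeted at the ultimate receiver, which is the default defined by SOAP 1.2.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
ROLE_NEXT = SOAP_ENV + "/role/next"
ROLE_ULTIMATE = SOAP_ENV + "/role/ultimateReceiver"
ROLE_NONE = SOAP_ENV + "/role/none"


def roles_for_node(is_ultimate_receiver, extra_roles=()):
    """Roles a node acts in: always next, plus ultimateReceiver when applicable."""
    roles = {ROLE_NEXT, *extra_roles}
    if is_ultimate_receiver:
        roles.add(ROLE_ULTIMATE)
    roles.discard(ROLE_NONE)  # no node may act in the none role
    return roles


def targeted_blocks(envelope, node_roles):
    """Return the header blocks whose role is one the node acts in."""
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    # An absent role attribute is treated as targeting the ultimate receiver.
    return [
        block for block in header
        if block.get(f"{{{SOAP_ENV}}}role", ROLE_ULTIMATE) in node_roles
    ]


# Hypothetical message: rl:hint and rl:priority are made up for this sketch.
MESSAGE = f"""
<env:Envelope xmlns:env="{SOAP_ENV}" xmlns:rl="http://examples.org/app-roles">
  <env:Header>
    <rl:track env:role="http://examples.org/journal"/>
    <rl:hint env:role="{ROLE_NONE}"/>
    <rl:priority/>
  </env:Header>
  <env:Body/>
</env:Envelope>
"""

envelope = ET.fromstring(MESSAGE)
journal_node = roles_for_node(False, ["http://examples.org/journal"])
print([block.tag for block in targeted_blocks(envelope, journal_node)])
```

In this sketch the journaling intermediary is handed only the rl:track block, an ultimate receiver would be handed only rl:priority, and the block tagged with the none role is never selected by any node.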

Much of the free-form extensibility associated with SOAP is derived from the implementation-specific flexibility that nodes have in processing SOAP header blocks. In addition to the role= attribute, the other attribute that controls how a header block is processed by a node is the mustUnderstand= attribute. The mustUnderstand= attribute is Boolean and as such takes the values true and false (the XML Schema boolean type on which it is based also accepts 1 and 0). The mustUnderstand= attribute can be used, elegantly and effectively, to segregate header blocks into those that are mandatory (relative to a given node) and those that are optional.

The following example, directly from the SOAP 1.2 Primer, shows the use of the mustUnderstand= attribute, as well as that of role=, albeit in this instance with the rather innocuous next role:

    <env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
      <env:Header>
        <m:reservation xmlns:m="http://travelcompany.example.org/reservation"
            env:role="http://www.w3.org/2003/05/soap-envelope/role/next"
            env:mustUnderstand="true">
          <m:reference>uuid:093a2da1-q345-739r-ba5d-pqff98fe8j7d</m:reference>
          <m:dateAndTime>2001-11-29T13:20:00.000-05:00</m:dateAndTime>
        </m:reservation>
        <n:passenger xmlns:n="http://mycompany.example.com/employees"
            env:role="http://www.w3.org/2003/05/soap-envelope/role/next"
            env:mustUnderstand="true">
          <n:name>Åke Jógvan Øyvind</n:name>
        </n:passenger>
      </env:Header>
      <env:Body>
        :
        :
        :
        :
      </env:Body>
    </env:Envelope>

A SOAP node is said to understand a given SOAP header block if the software implementation of that node is capable of understanding the namespace-qualified outermost XML element name associated with that header block (i.e., m:reservation and n:passenger in the previous example). Mandatory header blocks, by definition, are deemed to be significant vis-à-vis the rest of the SOAP message in that they are considered to impart some level of additional meaning that could affect the processing of other header blocks or even the payload contained in the body of the message.

Thus, a mandatory header block targeted at a node either has to be completely processed by that node or has to be rejected by that node, without any processing, with the appropriate SOAP fault code. In this context, the targeting of a header block to a particular SOAP node is achieved by using the role= attribute. Thus role= and mustUnderstand= work in tandem, as indicated by the previous example. SOAP faults are represented by SOAP fault messages. A fault message contains a standard SOAP body with a subelement called Fault, which in turn contains two mandatory subelements known as Code and Reason. There is the option of inserting a further application-specific subelement known as Detail. Figure 5.4 shows an example of a SOAP fault message, taken from the SOAP 1.2 Primer, that relates to the unsuccessful processing of a credit card transaction, invoked in the form of an RPC, via a SOAP-oriented travel reservation application.

    <?xml version='1.0' ?>
    <env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"
                  xmlns:rpc='http://www.w3.org/2003/05/soap-rpc'>
      <env:Body>
        <env:Fault>
          <env:Code>
            <env:Value>env:Sender</env:Value>
            <env:Subcode>
              <env:Value>rpc:BadArguments</env:Value>
            </env:Subcode>
          </env:Code>
          <env:Reason>
            <env:Text xml:lang="en-US">Processing error</env:Text>
            <env:Text xml:lang="cs">Chyba zpracování</env:Text>
          </env:Reason>
          <env:Detail>
            <e:myFaultDetails
                xmlns:e="http://travelcompany.example.org/faults">
              <e:message>Name does not match card number</e:message>
              <e:errorcode>999</e:errorcode>
            </e:myFaultDetails>
          </env:Detail>
        </env:Fault>
      </env:Body>
    </env:Envelope>

Figure 5.4: An example of an actual SOAP fault message, in this case indicating the failure to process an RPC request, as shown in the SOAP 1.2 Primer, illustrating the possible use of fault codes, a fault reason, and a detailed description of the fault.
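
A receiving application might pull the fault code, subcode, and reason texts out of such a message along the following lines. This is a minimal sketch using Python's standard xml.etree.ElementTree; the function name read_fault is an assumption made here, the sample is a shortened version of the message in Figure 5.4, and the code values are returned as the qualified-name strings found in the message without being resolved further.

```python
import xml.etree.ElementTree as ET

NS = {"env": "http://www.w3.org/2003/05/soap-envelope"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"  # expanded xml:lang name

# Shortened version of the fault message shown in Figure 5.4.
FAULT_XML = """<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"
              xmlns:rpc="http://www.w3.org/2003/05/soap-rpc">
  <env:Body>
    <env:Fault>
      <env:Code>
        <env:Value>env:Sender</env:Value>
        <env:Subcode><env:Value>rpc:BadArguments</env:Value></env:Subcode>
      </env:Code>
      <env:Reason>
        <env:Text xml:lang="en-US">Processing error</env:Text>
      </env:Reason>
    </env:Fault>
  </env:Body>
</env:Envelope>"""


def read_fault(envelope_xml):
    """Return (code, subcode, reasons) for a fault message, or None if no Fault."""
    root = ET.fromstring(envelope_xml)
    fault = root.find("env:Body/env:Fault", NS)
    if fault is None:
        return None
    code = fault.findtext("env:Code/env:Value", namespaces=NS)
    subcode = fault.findtext("env:Code/env:Subcode/env:Value", namespaces=NS)
    # Map each language tag to its human-readable reason text.
    reasons = {
        text.get(XML_LANG): text.text
        for text in fault.findall("env:Reason/env:Text", NS)
    }
    return code, subcode, reasons


print(read_fault(FAULT_XML))
```

Keeping the reasons keyed by language tag lets the caller choose which of the human-readable texts to display, while the code and subcode remain the machine-readable part of the fault.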

The appropriate processing of a header block by a SOAP node could involve the removal of the header block from that message, modification of the header block, or the insertion of a new header block. Per SOAP conventions, a targeted intermediary that does not process a header block is expected to remove it from the message prior to the message being forwarded. There is, however, a new and optional attribute known as relay= that may also be included in a header block to override this somewhat counterintuitive behavior. A header block in which the relay= attribute is set to true will be forwarded by a targeted intermediary even though that intermediary does not process the block.

If a targeted node cannot process a mandatory header block, all further processing of that SOAP message ceases, and the message will not be forwarded any further. Though a mustUnderstand= attribute is not associated with the body (i.e., payload) of a SOAP message, it is a given that it has to be processed by the ultimate recipient (i.e., the node that plays the ultimateReceiver role). The ultimate recipient may also process any header blocks targeted at it.
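
The interplay of role=, mustUnderstand=, and relay= at an intermediary can be sketched as shown below. The sketch rests on several assumptions: the function name process_at_intermediary and the x: header blocks are hypothetical, a Python ValueError stands in for generating a proper SOAP fault message, and, following the SOAP 1.2 forwarding rules, a block that the node has processed is removed rather than reinserted, so relay= matters only for targeted blocks that the node ignores.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
ROLE_NEXT = SOAP_ENV + "/role/next"
ROLE_ULTIMATE = SOAP_ENV + "/role/ultimateReceiver"


def _flag(block, name):
    """Read a SOAP boolean attribute; xs:boolean allows 'true' and '1'."""
    return block.get(f"{{{SOAP_ENV}}}{name}", "false").strip() in ("true", "1")


def process_at_intermediary(envelope, node_roles, understood_tags):
    """Apply the header rules at an intermediary and prepare for forwarding.

    Returns the targeted blocks the node understands and should process.
    Raises ValueError (standing in for a SOAP fault) if a mandatory targeted
    block is not understood. Mutates the envelope's Header in place.
    """
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    targeted = [
        block for block in header
        if block.get(f"{{{SOAP_ENV}}}role", ROLE_ULTIMATE) in node_roles
    ]
    # Check every mandatory block first, before any processing takes place.
    for block in targeted:
        if _flag(block, "mustUnderstand") and block.tag not in understood_tags:
            raise ValueError(f"mandatory header block not understood: {block.tag}")
    processed = [block for block in targeted if block.tag in understood_tags]
    for block in targeted:
        # Processed blocks are removed; ignored ones survive only with relay.
        if block.tag in understood_tags or not _flag(block, "relay"):
            header.remove(block)
    return processed


# Hypothetical header blocks, for illustration only.
MSG = f"""<env:Envelope xmlns:env="{SOAP_ENV}" xmlns:x="http://example.org/x">
  <env:Header>
    <x:audit env:role="{ROLE_NEXT}" env:mustUnderstand="true"/>
    <x:cache env:role="{ROLE_NEXT}" env:relay="true"/>
    <x:trace env:role="{ROLE_NEXT}"/>
    <x:final/>
  </env:Header>
  <env:Body/>
</env:Envelope>"""

message = ET.fromstring(MSG)
handled = process_at_intermediary(message, {ROLE_NEXT}, {"{http://example.org/x}audit"})
print([block.tag for block in handled])
print([block.tag for block in message.find(f"{{{SOAP_ENV}}}Header")])
```

Checking all mandatory blocks before touching the message mirrors the rule that a node which cannot process a mandatory header block must stop without doing any processing at all.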

5.2.2 Security considerations with SOAP

The SOAP specifications, 1.1 or 1.2, do not explicitly address SOAP-related security issues. Thus, there are no SOAP-defined mechanisms for dealing with access control, confidentiality (e.g., encryption), integrity (i.e., validity), and nonrepudiation (i.e., preventing a sender from disowning a message). The specification, while acknowledging the need for such security measures, nonetheless expects these to be handled with the SOAP extensibility model, in particular SOAP protocol bindings. The SOAP bindings are the conventions for encapsulating a SOAP message within or on top of another protocol (i.e., the underlying protocol) for the purpose of transmitting these messages between SOAP nodes. Typical SOAP bindings include carrying a SOAP message within an HTTP message or an e-mail (e.g., MIME) or simply on top of TCP.

SOAP security is thus very implementation specific. This is not entirely a bad thing. As previously mentioned, in the context of firewall traversal, a SOAP message by itself cannot typically harm a system without the active involvement of SOAP message processing software at a node. The one noteworthy exception here is that a flood of malicious SOAP messages, though not harming the system or network, could grind things to a halt so as to constitute a denial-of-service attack. In this context a SOAP message is very much like an e-mail attachment (and it is worth noting that it is possible to send SOAP messages by using SMTP in conjunction with MIME).

Receiving a malicious e-mail attachment (e.g., a virus or worm) does not, in itself, generally harm a system. The damage occurs when one opens (or activates) such a rogue attachment. The same inert-until-processed principle applies to SOAP message processing. Barring the denial-of-service scenario, a SOAP message can damage a SOAP node only with the cooperation of the SOAP message processing software, so what is crucial is to ensure the integrity, veracity, and reliability of the SOAP software before worrying too much about the potential rogue contents of SOAP messages.

Given that a SOAP message, by definition, may be processed by multiple intermediaries, it is vital to have trust in all SOAP nodes that are likely to come in contact with a SOAP message, as opposed to just thinking about the credentials of the ultimate receiver. Since it may not be possible, in many cases, to know about all the SOAP intermediaries that may be involved, it is imperative that all sensitive information is safeguarded within a message with suitable end-to-end application-level encryption. Depending on the transport being used, it may be possible to realize some level of point-to-point encryption at the transport layer (e.g., using SSL with HTTP). However, transport-level security alone may not be sufficient if one does not want the information in a message disclosed until it reaches the ultimate destination application.

The first priority when it comes to SOAP-related security is to ensure the credentials, capabilities, and validity of the software that will be processing the SOAP message. In this context, particularly given the Microsoft Windows-related security attacks that exploit unintended holes in the software, it is important to try, as much as possible, to make sure that the software does not have any weaknesses that could be pried open by a message created by an author aware of this soft spot in the software. The main thing, as repeatedly stressed here, is to ensure that the SOAP software is only capable of doing things that it is supposed to do, and that one is aware of what it can do, should do, and furthermore what it really cannot do (e.g., secretly transcribe the contents of a message or delete random files).

Once the trustworthiness and reliability of the SOAP nodes involved have been established, one can then concentrate on ensuring the actual bona fides of SOAP messages and authenticating the purported originators of these messages. This again becomes an application-level function. Basically, the SOAP message processing software could be written so that it processes messages only from known and authenticated sources, and moreover processes only individually authenticated messages that contain data per presubscribed encoding schemes. This could be realized by including an application-specific user ID/password mechanism on each message. The privacy of such an authentication scheme can be achieved by using end-to-end encryption. The bottom line here is that the only true SOAP-related security safeguard you can have is to use only trusted (and well-behaved) software that processes only valid messages from authenticated sources.

SOAP permits application-specific data, which could include authentication data and other security measures, to be carried within either SOAP header blocks or the SOAP body. If security-related items are being placed in a header block, one needs to realize that header block processing may involve the removal or modification of that header by an intermediary node, so where necessary the role= and mustUnderstand= attributes need to be rigorously used to ensure that any security measures placed in a header block are processed appropriately.
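
As one possible illustration of carrying authentication data in a header block, the sketch below attaches a keyed digest of the message body in a header block that is explicitly targeted at the ultimate receiver and marked as mandatory. Everything application-specific here is an assumption made for this sketch, not a SOAP-defined or standardized mechanism: the auth element, its namespace, the sender attribute, the shared-key scheme, and the function names are all hypothetical. It uses HMAC-SHA256 from Python's standard library and ET.canonicalize (Python 3.8 or later), and it authenticates the body without encrypting it.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
ROLE_ULTIMATE = SOAP_ENV + "/role/ultimateReceiver"
AUTH_NS = "http://example.org/auth"  # hypothetical application namespace


def _body_digest(envelope, key):
    """HMAC-SHA256 over the canonical form of the Body element."""
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    canonical = ET.canonicalize(ET.tostring(body, encoding="unicode"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def attach_auth(envelope, sender_id, key):
    """Insert a mandatory auth header block aimed at the ultimate receiver."""
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        header = ET.Element(f"{{{SOAP_ENV}}}Header")
        envelope.insert(0, header)  # the Header must precede the Body
    block = ET.SubElement(header, f"{{{AUTH_NS}}}auth", {
        f"{{{SOAP_ENV}}}role": ROLE_ULTIMATE,
        f"{{{SOAP_ENV}}}mustUnderstand": "true",
    })
    block.set("sender", sender_id)
    block.text = _body_digest(envelope, key)


def verify_auth(envelope, keys_by_sender):
    """Return True only if the auth block matches a known sender's key."""
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    block = None if header is None else header.find(f"{{{AUTH_NS}}}auth")
    if block is None or block.get("sender") not in keys_by_sender:
        return False
    expected = _body_digest(envelope, keys_by_sender[block.get("sender")])
    return hmac.compare_digest(expected, block.text or "")


# Hypothetical payload and shared key, for illustration only.
outgoing = ET.fromstring(
    f'<env:Envelope xmlns:env="{SOAP_ENV}"><env:Body>'
    '<p xmlns="http://example.org/app">hi</p></env:Body></env:Envelope>'
)
attach_auth(outgoing, "alice", b"secret")
received = ET.fromstring(ET.tostring(outgoing, encoding="unicode"))
print(verify_auth(received, {"alice": b"secret"}))
```

Targeting the block at the ultimate receiver keeps intermediaries from being asked to process it, and marking it mandatory means a receiver that does not understand the block must reject the message rather than silently ignore the check.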