The Principal Web Services Technologies

Web services stacks make use of several technologies. In this section, we cover what are considered the main components of today's Web services architectures, namely XML, SOAP, WSDL, and UDDI. Because of space considerations, the discussion concentrates on the basic concepts, representations, data structures, and mechanisms, without going too deeply into arcane details that are better left to a specialized book.

XML

A section in a chapter in a book about P2P is probably not the best in-depth reference on XML. Although this book assumes that you are somewhat familiar with XML, this section does cover the main concepts of XML to give you a good basis for understanding the other topics covered. If you are comfortable with XML, you can skip this section.

Data and Document-Centric Languages

The concept of markup languages has been around for a long time (SGML and HTML are two major examples), but no simple yet flexible standard (that is, recognized by major players) was available to define and exchange data. Document-centric markup languages such as HTML are good at specifying the components of a document (such as titles, paragraphs, lists, and tables) without describing the data that is contained in that document. XML, on the other hand, is a data-centric markup language whose main purpose is to define the data that is contained in a document. Consider, for example, the two fragments shown next. First, look at an HTML fragment:

 <table BORDER COLS=3 WIDTH="100%">
    <tr BGCOLOR="#CCFFFF">
       <td>Book Title</td>
       <td>Author</td>
       <td>SKU</td>
    </tr>
    <tr>
       <td>Brain Surgery Explained</td>
       <td>John Smith</td>
       <td>154-455</td>
    </tr>
    <tr>
       <td>DIY Surgery</td>
       <td>Betty Mitchell</td>
       <td>154-323</td>
    </tr>
 </table>

Next, look at an XML fragment:

 <books>
    <book>
       <title>Brain Surgery Explained</title>
       <author>John Smith</author>
       <sku>154-544</sku>
    </book>
    <book>
       <title>DIY Surgery</title>
       <author>Betty Mitchell</author>
       <sku>154-323</sku>
    </book>
 </books>

At this point, the distinction should be clear. Although the HTML code is good at telling an application such as a browser how to display the book data, it does nothing to tell the application what the data consists of. For example, a distinction isn't made between metadata (in this case, the first row, which describes the column headings) and data (the remaining two rows). By contrast, the XML code tells the application about the structure of the data and metadata, leaving decisions on what to do with the data to the application. (This includes how and what to display, and whether to display at all. As a matter of fact, the application could be an order-processing application that has nothing to do with displaying data.) That, in a nutshell, is one of the greatest advantages of XML.
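To make the point concrete, here is a minimal sketch of an application that reads the XML fragment and pulls out only the book titles, using the DOM parser that ships with the JDK. The class name BookTitles and its method are our own illustrative choices, not part of any toolkit discussed in this chapter.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class BookTitles {
    // The XML fragment from the text, held in a string for the example.
    static final String BOOKS_XML =
        "<books>"
      + "<book><title>Brain Surgery Explained</title>"
      + "<author>John Smith</author><sku>154-544</sku></book>"
      + "<book><title>DIY Surgery</title>"
      + "<author>Betty Mitchell</author><sku>154-323</sku></book>"
      + "</books>";

    // Returns the text of every <title> element, in document order.
    static List<String> titles(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("title");
            List<String> result = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles(BOOKS_XML));
    }
}
```

The application asks for the data it cares about by name; nothing in the code depends on how, or whether, the books are displayed.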

Still considering the two code snippets, we can make another important observation: In the HTML code, the tags (such as <td> and <tr>) are predetermined and defined in the HTML language specification. Developers are not allowed to define their own tags. In the XML code shown, the developer defined the tags. (We will explore the definition mechanism later in this section.) Remember that the X in XML stands for extensible. This means that developers can define and extend their own tags and make up their own languages. That is the other great advantage of XML.

XML Instance Document Components

XML documents are of two types: instance documents and specification documents. Instance documents are often matched to a specification, which specifies the tags and their allowed occurrences within the document. We first cover the details of instance documents. (These are the documents that you will usually get to see.) Specification documents will be covered later.

Let us now look at a more complete XML listing (Listing 11.1) so that you can understand the components that form an XML document.

Listing 11.1 Book Sellers XML Document

 <?xml version="1.0" encoding="UTF-8"?>
 <books>
    <book isbn="123456789">
       <title>Brain Surgery Explained</title>
       <author>
          <firstname>John</firstname>
          <lastname>Smith</lastname>
       </author>
       <retailer>
          <name>booksgalore</name>
          <sku>154-455</sku>
          <price>24.99</price>
       </retailer>
       <retailer>
          <name>seriousbooks</name>
          <sku>465-40-143</sku>
          <price>27.99</price>
       </retailer>
    </book>
    <book isbn="987654321">
       <title>DIY Surgery</title>
       <author>
          <firstname>Terry</firstname>
          <lastname>Mitchell</lastname>
       </author>
       <author>
          <firstname>Joe</firstname>
          <lastname>Mitchell</lastname>
       </author>
       <retailer>
          <name>seriousbooks</name>
          <sku>465-40-365</sku>
          <price>32.49</price>
       </retailer>
    </book>
 </books>

This listing is an XML instance document. It consists mainly of two parts: the prologue and the body. The prologue is the following line:

 <?xml version="1.0" encoding="UTF-8"?>

This line is the XML declaration. It starts with the characters <?, the same delimiter that introduces a processing instruction, although strictly speaking the declaration is not one. Processing instructions are beyond the scope of this discussion; suffice it to say that most of the time you will use the same line shown in Listing 11.1.

The rest of the document forms the body, which consists of a hierarchy, or tree, of paired tags starting with a root node. In this case, the root is the node bracketed and defined by the paired tags <books> and </books>. A tree view of the body of the document is shown in Figure 11.5.

Figure 11.5. Tree view of the `books.xml` XML document.


An XML document is, in essence, a collection of paired tags such as <author> and </author>, with data between the tag pairs. This data is either string data, such as an author's name, or other sets of nested paired tags. Each set of these paired tags defines what's called an element. Unlike HTML tags, XML element tags must be properly nested: an element that opens inside another element must also close inside it. For example, the following HTML syntax, with the bold and italic tags out of order, is tolerated by most browsers:

 <B><I>This is OK in HTML</B></I>

In XML, a similar syntax such as

 <book><title>Web Services</book></title>

is not valid, but has to be properly ordered:

 <book><title>Web Services</title></book>

A document in which all the tags are properly matched and nested and in which no syntax errors exist is called a well-formed document.
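A quick way to check well-formedness is to hand the text to any XML parser, because a conforming parser must reject a document that is not well formed. The following is a minimal sketch using the JDK's built-in DOM parser; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration only.
class WellFormedCheck {
    // Returns true if the parser accepts the text, false if it reports
    // a syntax error such as improperly nested tags.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<book><title>Web Services</title></book>"));
        System.out.println(isWellFormed("<book><title>Web Services</book></title>"));
    }
}
```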

XML has different types of elements. The main three are described here; the first two are illustrated in Listing 11.1:

Elements that contain other nested elements: These elements are nodes within the document tree. An example is the <book> element, which contains the <title>, <author>, and <retailer> elements.
Elements that contain string data: These are leaf nodes in the document tree. An example is the <title> element, which describes the title of a book in string form.
Elements that are empty: These elements do not contain data outside of the element tag. Instead, they convey information just by their presence, or through the attributes they contain. As an example, instead of using <price>27.99</price>, we can use <price value="27.99"/>. Notice that the start and end tags are combined into one tag.

Now let's take a closer look at the <book> tag from the previous listing:

 <book isbn="123456789">

The name-value pair (isbn="123456789") before the closing angle bracket (>) is called an attribute. Start tags of XML elements can have attributes added within their angle brackets, and an attribute's value (the value part of the name-value pair) can only be a string.

If you're asking yourself at this point why attributes are needed or what attributes provide that can't be done with elements, give yourself some bonus points. Conceivably, we could have rewritten the majority of the elements in the <books> XML document as a set of empty elements, with data such as book titles or author names provided as strings within the attribute values. For various reasons, however, this is not a good idea. The debate over the usage of attributes has been raging in the XML community since the introduction of the XML specification, with no clear winners. The general rule of thumb is to use elements to convey content, and attributes to convey characteristics of that content, especially if those characteristics can be enumerated (such as a small set of colors). For example, when describing a book, the book title is considered content, and described in a <title> element. The book type, on the other hand (for example, "technical," "fiction," and so on) can be described in an attribute. Also keep in mind that elements are usually processed more efficiently by parsers and XML tools.
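From an application's point of view, the two mechanisms are read through different calls. The sketch below, which uses the JDK's DOM API, reads the isbn attribute and the <title> element of a single <book>; the class name and the output format are our own assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class BookAttributes {
    // Combines the isbn attribute (a characteristic of the book) with
    // the text of the <title> child element (the content).
    static String describe(String bookXml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(bookXml)));
            Element book = doc.getDocumentElement();
            String isbn = book.getAttribute("isbn");
            String title =
                book.getElementsByTagName("title").item(0).getTextContent();
            return isbn + ": " + title;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(
            "<book isbn=\"123456789\"><title>Brain Surgery Explained</title></book>"));
    }
}
```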

DTD, Schema, and Namespaces

Let us now consider the following example:

 <?xml version="1.0" encoding="UTF-8"?>
 <results>
    <book>
       <name>Our House</name>
       <autor>John Doe</autor>
       <author>Mary Public</author>
    </book>
 </results>

This is a well-formed document (correct syntax, all the tags are properly paired, and so on), but did we intend to have two different elements, <author> and <autor>, in the document, or is that an error? Also, even though one would think of a book as having a title and an author as having a name, no <title> element is present. Instead, the element <name> seems to refer to the title of the book, not an author's name.

These two irregularities illustrate the point that writing well-formed XML documents is often not sufficient. These documents do not exist in a vacuum: They usually serve some purpose for exchanging data, and they are usually shared with other people or between applications. Therefore, they must have a shared meaning, a shared purpose, and a shared structure that can be validated. This validation is done through a specification document.

In the early days of XML, XML document structure and validation were specified through document type definition (DTD) documents. An example DTD for the books.xml document is shown in the following code and is relatively self-explanatory:

 <!ELEMENT books (book+)>
 <!ELEMENT book (title , author+ , retailer+)>
 <!ATTLIST book  isbn CDATA  #IMPLIED >
 <!ELEMENT title (#PCDATA)>
 <!ELEMENT author (firstname , lastname)>
 <!ELEMENT retailer (name , sku , price)>
 <!ELEMENT name (#PCDATA)>
 <!ELEMENT sku (#PCDATA)>
 <!ELEMENT firstname (#PCDATA)>
 <!ELEMENT lastname (#PCDATA)>
 <!ELEMENT price (#PCDATA)>

DTDs have some useful properties in that they allow the specification of the vocabulary and the grammar of an XML document (that is, they specify what tags are allowed and what their containment model is within the document). However, DTDs quickly proved to be too simple for interesting use of XML. Aside from their non-XML syntax, DTD documents had two major deficiencies: They did not allow the specification of the type of data allowed within the tags of an element; they did not allow the use of namespaces, an important and useful mechanism for reuse of XML elements.
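To see what DTD validation buys us, consider the following sketch, which uses the JDK's validating DOM parser. It assumes a deliberately simplified DTD of our own for the <results> example (a book has one <title> and one or more <author> elements), embedded in the document as an internal DOCTYPE subset; the class name is hypothetical as well.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration only.
class DtdValidation {
    // Simplified DTD for the <results> example (an assumption, not the
    // DTD shown in the text), supplied as an internal subset.
    static final String DOCTYPE =
        "<!DOCTYPE results ["
      + "<!ELEMENT results (book+)>"
      + "<!ELEMENT book (title, author+)>"
      + "<!ELEMENT title (#PCDATA)>"
      + "<!ELEMENT author (#PCDATA)>"
      + "]>";

    // Returns true only if the document is well formed AND valid.
    static boolean isValid(String xmlWithDoctype) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validity errors are recoverable by default; rethrow them
            // so that the first violation stops the parse.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xmlWithDoctype)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = DOCTYPE + "<results><book><title>Our House</title>"
            + "<author>John Doe</author><author>Mary Public</author></book></results>";
        String bad = DOCTYPE + "<results><book><name>Our House</name>"
            + "<autor>John Doe</autor><author>Mary Public</author></book></results>";
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```

The second document is well formed, yet the validating parser rejects it because <name> and <autor> are not part of the declared vocabulary.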

To illustrate the need for namespaces, imagine a messaging system that handles messages from different organizations and wraps them in a <message> element, as shown in the following XML fragment:

 <message>
    <sender>BooksGalore</sender>
    <receiver>dianadoe@hotmail.com</receiver>
    <body>
       <purchaseOrder>
          <billTo>Diana Doe</billTo>
          <book isbn="123456789">
             <title>Brain Surgery Explained</title>
             <author>John Smith</author>
             <sku>154-544</sku>
          </book>
          <message>Book is on back order.</message>
       </purchaseOrder>
    </body>
 </message>

This example illustrates two of the issues that are raised when different XML documents are composed into one, or when elements are reused. First, the element <message> exists in two different places, with two different meanings or contexts. The messaging system defines its own <message> element as a wrapper for the whole message, and the message sender uses its own <message> element to indicate messages to the receiver (such as when a book is on back order). This ambiguous use of an element can be confusing to the XML parser and to the application that is handling the document, and is called a namespace collision.

The other issue is that the <book> element is the same one used in examples throughout this section, and has been reused by the message sender. Does the sender in this case need to redefine the specification for the <book> element, or should there be a way that the sender can indicate that this is an element defined elsewhere? An associated issue is that the <body> element should be allowed to contain any type of XML tags that the sender requires. How can the designers of that specification handle this requirement without prior knowledge of the contents? How will the receiving application know whether a given <message> element is the root element for the message being delivered, or the containing element for the message to the customer?

The issue of namespaces has been familiar to software engineers for a long time and has been recently brought to the surface because of the important object-oriented concept of reuse.

It manifests itself, for example, in the use of package names to fully qualify class names in Java, thus enabling the use of classes from a variety of sources in the same application. Similar issues have been encountered in XML usage: What if it makes sense to use elements from different sources in the same XML document? As with Java's fully qualified class names, namespaces solve this problem by providing the ability to disambiguate elements that are spelled the same but have different contexts.

Using namespaces to qualify elements in the XML document shown in the previous example would result in a document similar to the one shown here:

 <message>
    <sender>BooksGalore</sender>
    <receiver>dianadoe@hotmail.com</receiver>
    <body>
       <po:purchaseOrder
          xmlns:po="http://localhost:8080/xml/schemas/purchaseorder">
          <po:billTo>Diana Doe</po:billTo>
          <bk:book isbn="123456789"
             xmlns:bk="http://localhost:8080/xml/schemas/book">
             <bk:title>Brain Surgery Explained</bk:title>
             <bk:author>John Smith</bk:author>
             <bk:sku>154-544</bk:sku>
          </bk:book>
          <po:message>Book is on back order.</po:message>
       </po:purchaseOrder>
    </body>
 </message>

You can include namespace declarations in different ways. As the example shows, a namespace declaration can be included in the root element of each new namespace: in this case, one in the <purchaseOrder> element, which is the highest level in that hierarchy, and one in the <book> element, which is the root element of the next namespace. The syntax of the declaration is relatively simple and takes the form xmlns:namespaceidentifier="URI", as shown here:

 <bk:book isbn="123456789"
    xmlns:bk="http://localhost:8080/xml/schemas/book">

In this declaration, the namespaceidentifier is bk; it is used as a prefix for all elements that belong to that namespace, such as <bk:book> and <bk:author>. The Uniform Resource Identifier (URI) uniquely identifies the namespace and usually tends to be a Uniform Resource Locator (URL), although this is not a requirement. A namespace URI is used purely as a unique name: it can be any uniquely identifying sequence of characters that follows URI syntax, and even when it looks like a URL, nothing needs to exist at that location. URLs are a convenient choice because an organization can base them on a domain name that it controls, which keeps them unique. A namespace declaration is scoped by the element in which it appears, so the prefix it declares is available only to that element and its descendants. In our example, the <po:message> element is within the scope of the xmlns:po="http://localhost:8080/xml/schemas/purchaseorder" declaration; in contrast, the outer <message> element is outside that scope. Therefore, the outer <message> is not part of the purchase order namespace: it carries no prefix and no default namespace is declared for it, so it belongs to no namespace at all.
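A namespace-aware parser exposes this distinction directly. The following sketch parses a trimmed-down version of the message document (our own abbreviation of the example) and reports the namespace URI of every element whose local name is message; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class MessageNamespaces {
    static final String PO_NS = "http://localhost:8080/xml/schemas/purchaseorder";

    // Abbreviated version of the message document from the text.
    static final String MESSAGE_XML =
        "<message>"
      + "<body>"
      + "<po:purchaseOrder xmlns:po=\"" + PO_NS + "\">"
      + "<po:message>Book is on back order.</po:message>"
      + "</po:purchaseOrder>"
      + "</body>"
      + "</message>";

    // Returns the namespace URI (null for "no namespace") of each element
    // named "message", in document order.
    static List<String> namespacesOfMessageElements(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // "*" matches elements in any namespace, including none.
            NodeList nodes = doc.getElementsByTagNameNS("*", "message");
            List<String> result = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNamespaceURI());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namespacesOfMessageElements(MESSAGE_XML));
    }
}
```

The two elements share a local name, but the application can tell them apart by comparing namespace URIs rather than prefixes, which are only local shorthand.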

Alternatively, all namespace declarations can be included in the root element of the document, as follows:

 <message
    xmlns:po="http://localhost:8080/xml/schemas/purchaseorder"
    xmlns:bk="http://localhost:8080/xml/schemas/book">
    <sender>BooksGalore</sender>
    <receiver>dianadoe@hotmail.com</receiver>
    <body>
       <po:purchaseOrder>
          <po:billTo>Diana Doe</po:billTo>
          <bk:book isbn="123456789">
             <bk:title>Brain Surgery Explained</bk:title>
             <bk:author>John Smith</bk:author>
             <bk:sku>154-544</bk:sku>
          </bk:book>
          <po:message>Book is on back order.</po:message>
       </po:purchaseOrder>
    </body>
 </message>

What is the alternative to DTDs? The answer came with the publication of the XML Schema specification. The schema definition for the <books> example is shown in Listing 11.2.

Listing 11.2 Book Sellers XML Schema

 <?xml version="1.0" encoding="UTF-8"?>
 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified">
    <xsd:element name="books">
       <xsd:complexType>
          <xsd:sequence>
             <xsd:element ref="book" maxOccurs="unbounded"/>
          </xsd:sequence>
       </xsd:complexType>
    </xsd:element>
    <xsd:element name="book">
       <xsd:complexType>
          <xsd:sequence>
             <xsd:element ref="title"/>
             <xsd:element ref="author" maxOccurs="unbounded"/>
             <xsd:element ref="retailer" maxOccurs="unbounded"/>
          </xsd:sequence>
          <xsd:attribute name="isbn" use="optional" type="xsd:string"/>
       </xsd:complexType>
    </xsd:element>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author">
       <xsd:complexType>
          <xsd:sequence>
             <xsd:element ref="firstname"/>
             <xsd:element ref="lastname"/>
          </xsd:sequence>
       </xsd:complexType>
    </xsd:element>
    <xsd:element name="retailer">
       <xsd:complexType>
          <xsd:sequence>
             <xsd:element ref="name"/>
             <xsd:element ref="sku"/>
             <xsd:element ref="price"/>
          </xsd:sequence>
       </xsd:complexType>
    </xsd:element>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="sku" type="xsd:string"/>
    <xsd:element name="price" type="xsd:float"/>
    <xsd:element name="firstname" type="xsd:string"/>
    <xsd:element name="lastname" type="xsd:string"/>
 </xsd:schema>

Notice that the content of the different elements is now typed. For example, we can make sure that the <price> element contains a float. Typing and namespaces are only two of the many new features of XML schemas. Schemas give XML developers enormous flexibility in designing their documents, allowing things such as type inheritance and range restrictions, among others.
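The effect of typing can be demonstrated with the javax.xml.validation API that ships with the JDK. The sketch below assumes a much smaller schema than Listing 11.2, covering only a <retailer> element, and a helper class of our own; it is meant to show the mechanism, not to reproduce the listing.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration only.
class SchemaValidation {
    // Reduced schema (an assumption): a retailer with a typed price.
    static final String RETAILER_XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xsd:element name=\"retailer\"><xsd:complexType><xsd:sequence>"
      + "<xsd:element name=\"name\" type=\"xsd:string\"/>"
      + "<xsd:element name=\"sku\" type=\"xsd:string\"/>"
      + "<xsd:element name=\"price\" type=\"xsd:float\"/>"
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "</xsd:schema>";

    // Returns true if the instance document conforms to the schema.
    static boolean isValid(String xsd, String xml) {
        Schema schema;
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is broken", e);
        }
        try {
            // With no error handler set, validate() throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ok = "<retailer><name>booksgalore</name>"
            + "<sku>154-455</sku><price>24.99</price></retailer>";
        String notAFloat = "<retailer><name>booksgalore</name>"
            + "<sku>154-455</sku><price>cheap</price></retailer>";
        System.out.println(isValid(RETAILER_XSD, ok));
        System.out.println(isValid(RETAILER_XSD, notAFloat));
    }
}
```

A DTD could only say that <price> holds character data; the schema lets the validator reject a price that is not a number.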

Having defined the schema for the <books> example, how would an instance document make use of that fact? Let us look at an instance of XML that uses the books.xsd schema (see Listing 11.3).

Listing 11.3 Book Sellers XML Document Using XML Schema

 <?xml version="1.0" encoding="UTF-8"?>
 <books xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="http://localhost:8080/xml/schemas/books.xsd">
    <book isbn="123456789">
       <title>Brain Surgery Explained</title>
       <author>
          <firstname>John</firstname>
          <lastname>Smith</lastname>
       </author>
       <retailer>
          <name>booksgalore</name>
          <sku>154-455</sku>
          <price>24.99</price>
       </retailer>
       <retailer>
          <name>seriousbooks</name>
          <sku>465-40-143</sku>
          <price>27.99</price>
       </retailer>
    </book>
    <book isbn="987654321">
       <title>DIY Surgery</title>
       <author>
          <firstname>Terry</firstname>
          <lastname>Mitchell</lastname>
       </author>
       <author>
          <firstname>Joe</firstname>
          <lastname>Mitchell</lastname>
       </author>
       <retailer>
          <name>seriousbooks</name>
          <sku>465-40-365</sku>
          <price>32.49</price>
       </retailer>
    </book>
 </books>

SOAP

By now, I hope I've convinced you that XML is a great idea for data exchange and interoperability. The next logical step in the evolution of interoperability standards is whether XML can be used not only to define data, but also to define messages between applications. That leads us to take a closer look at SOAP.

The first paragraph in the W3C note of the SOAP 1.1 specification (http://www.w3.org/TR/SOAP/) gives a great summary of what SOAP is: "SOAP is a lightweight protocol for exchange of information in a decentralized, distributed environment. It is an XML-based protocol that consists of three parts: an envelope that defines a framework for describing what is in a message and how to process it, a set of encoding rules for expressing instances of application-defined datatypes, and a convention for representing remote procedure calls and responses. SOAP can potentially be used in combination with a variety of other protocols; however, the only bindings defined in this document describe how to use SOAP in combination with HTTP and HTTP Extension Framework."

Although the readability of the specification goes downhill from there, this section should provide you with enough of a background to be able to use SOAP effectively. In essence, SOAP is a simple, lightweight protocol that allows you to use XML to exchange data across networks. SOAP can be invoked over (but not restricted to) HTTP, thereby getting around traditional firewall issues. This section goes into more detail about SOAP and its components.

The Structure of a SOAP Message

If you think of sending a message over a network as being analogous to sending a letter in the mail, the SOAP model starts making sense. The top-level SOAP structure is the SOAP Envelope, which contains an optional SOAP Header and a SOAP Body. This envelope is carried by some transport mechanism, most commonly HTTP (although other protocols such as SMTP can be used). Listings 11.4 and 11.5 show a SOAP message embedded in an HTTP request and the SOAP response embedded in the HTTP response. (All examples in this section are generated using the Apache Axis implementation and tools found at http://xml.apache.org/axis/.)

Listing 11.4 SOAP Message Within an HTTP Request

 POST /axis/servlet/AxisServlet HTTP/1.0
 Content-Length: 459
 Host: localhost
 Content-Type: text/xml; charset=utf-8
 SOAPAction: "http://soapinterop.org/echoString"

 <?xml version="1.0" encoding="UTF-8"?>
 <SOAP-ENV:Envelope
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <SOAP-ENV:Body>
   <ns1:echoString xmlns:ns1="http://soapinterop.org/">
    <arg0 xsi:type="xsd:string">Hello!</arg0>
   </ns1:echoString>
  </SOAP-ENV:Body>
 </SOAP-ENV:Envelope>

Listing 11.5 SOAP Message Within an HTTP Response

 HTTP/1.1 200 OK
 Content-Type: text/xml; charset=utf-8
 Content-Length: 499
 Date: Thu, 24 Jan 2002 19:26:30 GMT
 Server: Apache Tomcat/4.0.1 (HTTP/1.1 Connector)

 <?xml version="1.0" encoding="UTF-8"?>
 <SOAP-ENV:Envelope
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <SOAP-ENV:Body>
   <ns1:echoStringResponse xmlns:ns1="http://soapinterop.org/">
    <echoStringResult xsi:type="xsd:string">Hello!</echoStringResult>
   </ns1:echoStringResponse>
  </SOAP-ENV:Body>
 </SOAP-ENV:Envelope>

Although simple, the previous two listings are representative of SOAP messages. A SOAP message is an XML document that consists of a root <Envelope> element, which contains an optional <Header> element followed by a mandatory <Body> element. (Neither listing uses a header.) The names are self-explanatory.
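Because a SOAP message is ordinary XML, a receiver can locate the payload with the same namespace-aware DOM calls used for any other document. The following sketch finds the <Body> element of the request from Listing 11.4 and returns the local name of its first child element, which in an RPC-style message names the operation being invoked. The class and method names are hypothetical; a real toolkit such as Axis does this work for you.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class SoapBodyReader {
    static final String SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // The envelope from Listing 11.4, without the HTTP headers.
    static final String REQUEST =
        "<SOAP-ENV:Envelope"
      + " SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\""
      + " xmlns:SOAP-ENV=\"" + SOAP_ENV_NS + "\""
      + " xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\""
      + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
      + "<SOAP-ENV:Body>"
      + "<ns1:echoString xmlns:ns1=\"http://soapinterop.org/\">"
      + "<arg0 xsi:type=\"xsd:string\">Hello!</arg0>"
      + "</ns1:echoString>"
      + "</SOAP-ENV:Body>"
      + "</SOAP-ENV:Envelope>";

    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse SOAP message", e);
        }
    }

    // Returns the local name of the first element inside the SOAP Body,
    // or null if the Body is empty.
    static String operationName(String soapXml) {
        Document doc = parse(soapXml);
        NodeList bodies = doc.getElementsByTagNameNS(SOAP_ENV_NS, "Body");
        if (bodies.getLength() == 0) {
            throw new IllegalArgumentException("message has no SOAP Body");
        }
        // Skip whitespace text nodes and stop at the first element.
        for (Node child = bodies.item(0).getFirstChild();
             child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                return child.getLocalName();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(operationName(REQUEST));
    }
}
```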

Typically, you won't have to build SOAP messages by hand because development tools from various sources such as Apache, IBM, and Microsoft will automate the construction of the messages. You need to keep two important issues in mind, however. The first one is the conversion of Java types into the XML payload of the SOAP messages. This is important because few interesting or realistic services take only simple types such as strings and numbers as parameters, or return only simple types as results. For example, a purchasing service might take a complex PurchaseOrder object as a parameter and return a complex Receipt object. The easiest method to accomplish this conversion is to implement your complex data types as JavaBeans, providing them with the usual setter and getter methods. Most deployment tools will then automatically perform the conversion from JavaBeans to XML. In the following example, we are providing a relatively simple PurchaseOrder class, with two instance variables, isbn and quantity:

 package myExamples.purchase;

 public class PurchaseOrder {
    private String isbn;
    private Integer quantity;

    public String getIsbn() { return isbn; }
    public Integer getQuantity() { return quantity; }
    public void setIsbn(String newIsbn) { isbn = newIsbn; }
    public void setQuantity(Integer newQ) { quantity = newQ; }
 }

The details of the next step will depend on the tool you happen to be using, but you need to provide a mapping from the parameters to the Bean class for the serialization to occur properly. In the Apache Axis implementation, this can be done in the Web Service Deployment Descriptor (WSDD) file that you create for the service. In it, you include a <beanMapping> element that holds that information. In the following example, the Bean class (myExamples.purchase.PurchaseOrder) is provided as the mapping for the purchase order input parameter of the service:

 <deployment xmlns="http://xml.apache.org/axis/wsdd/"
    xmlns:java="http://xml.apache.org/axis/wsdd/providers/java"
    xmlns:po="http://localhost/PurchaseService">
  <service name="PurchaseService" provider="java:RPC">
   <parameter name="className" value="myExamples.purchase.PurchaseService"/>
   <parameter name="methodName" value="*"/>
   <beanMapping qname="po:PurchaseOrder"
      languageSpecificType="java:myExamples.purchase.PurchaseOrder"/>
  </service>
 </deployment>

Let us now look at the SOAP message that invokes the service:

 POST /axis/servlet/AxisServlet HTTP/1.0
 Content-Length: 704
 Host: localhost
 Content-Type: text/xml; charset=utf-8
 SOAPAction: "PurchaseService/processOrder"

 <?xml version="1.0" encoding="UTF-8"?>
 <SOAP-ENV:Envelope
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <SOAP-ENV:Body>
   <ns1:processOrder xmlns:ns1="PurchaseService">
    <arg1 href="#id0"/>
   </ns1:processOrder>
   <multiRef id="id0" SOAP-ENC:root="0" xsi:type="ns2:PurchaseOrder"
       xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
       xmlns:ns2="urn:PurchaseService">
    <quantity xsi:type="xsd:int">5</quantity>
    <isbn xsi:type="xsd:string">0-672-32181-5</isbn>
   </multiRef>
  </SOAP-ENV:Body>
 </SOAP-ENV:Envelope>

You will notice that the conversion from the PurchaseOrder object to XML was performed automatically. The two instance variables appear as the <quantity> and <isbn> subelements of the <multiRef> element, which the <arg1> parameter points to through its href attribute.
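To give a feel for how a toolkit can perform this conversion, the following sketch uses the standard java.beans.Introspector to discover a bean's properties through its getter methods and writes each one as an XML element. This is only an illustration of the general idea, not the Axis serializer: it ignores namespaces, xsi:type attributes, multiRef encoding, and character escaping, and the BeanToXml class is our own invention. The PurchaseOrder bean is repeated as a nested class so that the example is self-contained.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper class for illustration only.
class BeanToXml {
    // Same shape as the PurchaseOrder bean shown in the text.
    public static class PurchaseOrder {
        private String isbn;
        private Integer quantity;
        public String getIsbn() { return isbn; }
        public Integer getQuantity() { return quantity; }
        public void setIsbn(String newIsbn) { isbn = newIsbn; }
        public void setQuantity(Integer newQ) { quantity = newQ; }
    }

    // Writes one element per readable bean property, wrapped in an
    // element named after the bean class. Values are not escaped.
    static String toXml(Object bean) {
        try {
            // Stop at Object.class so the "class" property is excluded.
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            PropertyDescriptor[] props = info.getPropertyDescriptors();
            // Sort by name so that the output order is predictable.
            Arrays.sort(props, new Comparator<PropertyDescriptor>() {
                public int compare(PropertyDescriptor a, PropertyDescriptor b) {
                    return a.getName().compareTo(b.getName());
                }
            });
            String root = bean.getClass().getSimpleName();
            StringBuilder xml = new StringBuilder("<" + root + ">");
            for (PropertyDescriptor p : props) {
                if (p.getReadMethod() == null) {
                    continue; // write-only property, nothing to serialize
                }
                Object value = p.getReadMethod().invoke(bean);
                xml.append("<").append(p.getName()).append(">")
                   .append(value)
                   .append("</").append(p.getName()).append(">");
            }
            return xml.append("</").append(root).append(">").toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize bean", e);
        }
    }

    public static void main(String[] args) {
        PurchaseOrder po = new PurchaseOrder();
        po.setIsbn("0-672-32181-5");
        po.setQuantity(Integer.valueOf(5));
        System.out.println(toXml(po));
    }
}
```

This is why the getter and setter naming convention matters: the property names isbn and quantity are derived from the method names, not from the private fields.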

WSDL

In any reusable component-programming model, properly using a component requires finding out what it does, how to invoke it, what parameters it will take, and what kinds of results it will return. This also applies to Web services, which can be thought of as reusable components that can be invoked over the network through a variety of transports.

You have seen that XML is an essential ingredient for interoperable data definition and exchange, and that SOAP, built on top of XML and most commonly carried over HTTP, is a good way to exchange messages. The next requirement for a robust, loosely coupled, distributed services architecture is a service description mechanism, so that service providers can describe their services and how to invoke them. In this area, the WSDL specification has emerged as the de facto standard. This section covers the basic concepts of WSDL and gives some examples, but keep in mind that current Web services tools, such as those from IBM and Microsoft, can automate the generation of WSDL descriptions of the Web services you develop.

WSDL is closely related to the concept of an interface definition language (IDL), as the following example shows. The WSDL design pattern is to define a set of abstract reusable types and components, and then physical implementations that make use of these abstract components. During the following discussion, we will refer to the example in Listing 11.6 to go over the details of WSDL. In this example, a book retailer is defining a Web service that allows customers to request the price of a book either by ISBN or by author name.

Listing 11.6 WSDL Document for the `GetBookPrice` Service

 <?xml version="1.0" encoding="UTF-8"?>
 <definitions name="GetBookPrice"
    targetNamespace="http://localhost:8080/axis/BookPriceService.jws"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:serviceNS="http://localhost:8080/axis/BookPriceService.jws"
    xmlns:bp="http://localhost:8080/axis/BookPriceService.jws"
    xmlns:bpxsd="http://localhost:8080/xml/schemas/bookprice.xsd"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
    <types>
       <xsd:schema
          targetNamespace="http://localhost:8080/xml/schemas/bookprice.xsd"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema">
          <xsd:element name="BookPriceByISBNRequest">
             <xsd:complexType>
                <xsd:all>
                   <xsd:element name="isbn" type="xsd:string"/>
                </xsd:all>
             </xsd:complexType>
          </xsd:element>
          <xsd:element name="BookPriceByAuthorRequest">
             <xsd:complexType>
                <xsd:all>
                   <xsd:element name="author" type="xsd:string"/>
                </xsd:all>
             </xsd:complexType>
          </xsd:element>
          <xsd:element name="BookPriceResponse">
             <xsd:complexType>
                <xsd:all>
                   <xsd:element name="price" type="xsd:double"/>
                </xsd:all>
             </xsd:complexType>
          </xsd:element>
       </xsd:schema>
    </types>
    <message name="GetBookPriceByISBNInput">
       <part name="body" element="bpxsd:BookPriceByISBNRequest"/>
    </message>
    <message name="GetBookPriceByAuthorInput">
       <part name="body" element="bpxsd:BookPriceByAuthorRequest"/>
    </message>
    <message name="GetBookPriceOutput">
       <part name="body" element="bpxsd:BookPriceResponse"/>
    </message>
    <portType name="BookPricePortType">
       <operation name="GetBookPriceByISBN">
          <input message="bp:GetBookPriceByISBNInput"/>
          <output message="bp:GetBookPriceOutput"/>
       </operation>
       <operation name="GetBookPriceByAuthor">
          <input message="bp:GetBookPriceByAuthorInput"/>
          <output message="bp:GetBookPriceOutput"/>
       </operation>
    </portType>
    <binding name="BookPriceSoapBinding" type="bp:BookPricePortType">
       <soap:binding
          style="document"
          transport="http://schemas.xmlsoap.org/soap/http"/>
       <operation name="GetBookPriceByISBN">
          <soap:operation soapAction="http://example.com/GetBookPriceByISBN"/>
          <input>
             <soap:body use="literal"/>
          </input>
          <output>
             <soap:body use="literal"/>
          </output>
       </operation>
       <operation name="GetBookPriceByAuthor">
          <soap:operation soapAction="http://example.com/GetBookPriceByAuthor"/>
          <input>
             <soap:body use="literal"/>
          </input>
          <output>
             <soap:body use="literal"/>
          </output>
       </operation>
    </binding>
    <service name="BookPriceService">
       <documentation>Get Book Prices</documentation>
       <port name="BookPricePort" binding="bp:BookPriceSoapBinding">
          <soap:address location="http://example.com/bookprice"/>
       </port>
    </service>
 </definitions>
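Since a WSDL document is itself XML, a client can inspect it with the same tools used earlier in this chapter. As a minimal sketch, the following code lists the operations that a portType declares, which is roughly what a tool does when it generates client stubs. It works on an abbreviated WSDL fragment of our own (only the portType, without messages or bindings), and the class name is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class WsdlOperations {
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    // Abbreviated WSDL: just the abstract interface (portType).
    static final String PORT_TYPE_WSDL =
        "<definitions name=\"GetBookPrice\" xmlns=\"" + WSDL_NS + "\">"
      + "<portType name=\"BookPricePortType\">"
      + "<operation name=\"GetBookPriceByISBN\"/>"
      + "<operation name=\"GetBookPriceByAuthor\"/>"
      + "</portType>"
      + "</definitions>";

    // Returns the names of the operations declared inside <portType>
    // elements, in document order. Operations repeated in <binding>
    // elements are deliberately not counted.
    static List<String> operationNames(String wsdl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(wsdl)));
            List<String> names = new ArrayList<String>();
            NodeList portTypes = doc.getElementsByTagNameNS(WSDL_NS, "portType");
            for (int i = 0; i < portTypes.getLength(); i++) {
                Element portType = (Element) portTypes.item(i);
                NodeList ops = portType.getElementsByTagNameNS(WSDL_NS, "operation");
                for (int j = 0; j < ops.getLength(); j++) {
                    names.add(((Element) ops.item(j)).getAttribute("name"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse WSDL", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(operationNames(PORT_TYPE_WSDL));
    }
}
```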

The Java Implementation of the Web Service

First let us look at the Java implementation of the service, showing the two methods and their signatures. This is a simplistic example, where the bulk of the actual work in terms of calling various databases and finding a real answer is not shown. (Again, all Web services code shown has been deployed on the Apache Axis implementation.)

    package samples.bookprice;

    public class BookPriceService {

        // Returns the price of the book with the given ISBN.
        public double getPriceByISBN(String isbn) {
            // perform some hard work and return an answer
            return 24.55;
        }

        // Returns the price of a book by the given author.
        public double getPriceByAuthor(String author) {
            // perform some hard work and return an answer
            return 24.55;
        }
    }

Constructing the WSDL Document

Now let us start constructing the WSDL document that describes this service. A WSDL document is an XML instance document. Therefore, it starts with the XML declaration you saw in the section "XML." The root element in a WSDL document is the <definitions> element. Aside from being a container for all the WSDL definitions, it also allows us to declare the various namespaces to which we will need to refer:

    <?xml version="1.0" encoding="UTF-8"?>
    <definitions name="GetBookPrice"
       targetNamespace="http://localhost:8080/axis/BookPriceService.jws"
       xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
       xmlns:bp="http://localhost:8080/axis/BookPriceService.jws"
       xmlns:bpxsd="http://localhost:8080/xml/schemas/bookprice.xsd"
       xmlns:xsd="http://www.w3.org/2001/XMLSchema"
       xmlns="http://schemas.xmlsoap.org/wsdl/">

Usually, the first section in a WSDL document is the <types> section, where you define the data types (such as parameters and return types) that will be used in the various messages. The rest of the document refers to these types by name. The type definitions follow the format of the XML schema definitions that we discussed in the XML section of this chapter, as shown on the second line of the following listing. Because XML Schema already provides built-in simple types such as strings and integers, it is not necessary to define those here.

In this example, however, for the sake of illustration, we have defined three named data types: two string types for the input parameters (ISBN and Author name) and a double for the returned price. Another argument for defining even simple types in the <types> section is that these predefined names can be used throughout the document, and any future changes to the type are localized to this section. For example, having defined BookPriceResponse to be a double, we can then use it in several places. If the actual definition changes to float (due to a change in the implementation, for example), we only have to change the definition in the <types> section. (Good software engineering practices always apply, even to XML!)

    <types>
       <xsd:schema
          targetNamespace="http://localhost:8080/xml/schemas/bookprice.xsd"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema">
          <xsd:element name="BookPriceByISBNRequest">
             <xsd:complexType>
                <xsd:all>
                   <xsd:element name="isbn" type="xsd:string"/>
                </xsd:all>
             </xsd:complexType>
          </xsd:element>
          <xsd:element name="BookPriceByAuthorRequest">
             <xsd:complexType>
                <xsd:all>
                   <xsd:element name="author" type="xsd:string"/>
                </xsd:all>
             </xsd:complexType>
          </xsd:element>
          <xsd:element name="BookPriceResponse">
             <xsd:complexType>
                <xsd:all>
                   <xsd:element name="price" type="xsd:double"/>
                </xsd:all>
             </xsd:complexType>
          </xsd:element>
       </xsd:schema>
    </types>
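One way to see what the <types> section provides is to validate a message body against it. The following is a minimal sketch, not part of the service itself: it copies only the BookPriceResponse declaration into a Java string, writes the built-in type as xsd:double, and checks instances with the standard javax.xml.validation API. The class name BookPriceSchemaCheck and its helper methods are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: validates a response element against the schema
// fragment that declares BookPriceResponse.
class BookPriceSchemaCheck {
    static final String NS = "http://localhost:8080/xml/schemas/bookprice.xsd";

    // Only the BookPriceResponse declaration, copied from the <types> section.
    static final String SCHEMA =
          "<xsd:schema targetNamespace=\"" + NS + "\""
        + " xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:element name=\"BookPriceResponse\">"
        + "<xsd:complexType><xsd:all>"
        + "<xsd:element name=\"price\" type=\"xsd:double\"/>"
        + "</xsd:all></xsd:complexType>"
        + "</xsd:element>"
        + "</xsd:schema>";

    // Builds an instance document; the local <price> element is unqualified
    // because the schema does not set elementFormDefault.
    static String response(String price) {
        return "<bp:BookPriceResponse xmlns:bp=\"" + NS + "\">"
             + "<price>" + price + "</price>"
             + "</bp:BookPriceResponse>";
    }

    // Returns true when the instance conforms to the schema fragment.
    static boolean isValid(String instanceXml) {
        Schema schema;
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema fragment itself is invalid", e);
        }
        try {
            schema.newValidator()
                  .validate(new StreamSource(new StringReader(instanceXml)));
            return true;
        } catch (SAXException e) {
            return false; // the instance violates the schema
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(response("24.55")));
        System.out.println(isValid(response("cheap")));
    }
}
```

If the price were later redefined as a float, only the SCHEMA string in this sketch would change, which mirrors the localization argument made above.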

Next we usually define a set of <message> elements. This is how the abstract definitions of the messages that will be used in various operations (such as input or output messages) are declared. Message definitions are typed and include the passed parameters, if any. Here, we have defined three abstract message types: two request (or input) messages and one response (or output) message:

    <message name="GetBookPriceByISBNInput">
       <part name="body" element="bpxsd:BookPriceByISBNRequest"/>
    </message>
    <message name="GetBookPriceByAuthorInput">
       <part name="body" element="bpxsd:BookPriceByAuthorRequest"/>
    </message>
    <message name="GetBookPriceOutput">
       <part name="body" element="bpxsd:BookPriceResponse"/>
    </message>

Next we can compose abstract operations from defined messages. An operation is an abstract definition that is analogous to a method signature declaration in Java. An operation will usually declare an input message and an output message. These operations are collected within a <portType> element. A portType is where the various abstract definitions are put to use. A portType assembles a set of operations that are to be supported by one port, or access point. Here, we have defined two different abstract operations: one to get the price of a book given the ISBN as an input parameter, and the other to get the price given the author name. Both operations use the same abstract message definition GetBookPriceOutput to return the book price:

    <portType name="BookPricePortType">
       <operation name="GetBookPriceByISBN">
          <input message="bp:GetBookPriceByISBNInput"/>
          <output message="bp:GetBookPriceOutput"/>
       </operation>
       <operation name="GetBookPriceByAuthor">
          <input message="bp:GetBookPriceByAuthorInput"/>
          <output message="bp:GetBookPriceOutput"/>
       </operation>
    </portType>
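The analogy between an operation and a Java method signature can be made concrete. The following is a hand-written sketch, not the output of any WSDL tool: the interface and method names are chosen here to mirror the WSDL names, and the stub class with its fixed price is purely hypothetical.

```java
// Hypothetical Java counterpart of the BookPricePortType portType.
// Each WSDL <operation> maps to one method: the <input> message supplies
// the parameter and the <output> message supplies the return value.
interface BookPricePortType {
    double getBookPriceByISBN(String isbn);      // GetBookPriceByISBN
    double getBookPriceByAuthor(String author);  // GetBookPriceByAuthor
}

// A stub implementation. Like the portType, the interface above says
// nothing about the protocol or address used to reach the service.
class FixedPriceBookService implements BookPricePortType {
    @Override
    public double getBookPriceByISBN(String isbn) {
        return 24.55; // placeholder price
    }

    @Override
    public double getBookPriceByAuthor(String author) {
        return 24.55; // placeholder price
    }

    public static void main(String[] args) {
        BookPricePortType port = new FixedPriceBookService();
        System.out.println(port.getBookPriceByISBN("4578498"));
    }
}
```

Both methods return the same type, just as both operations share the GetBookPriceOutput message.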

Finally, a <binding> element is where the abstract definitions within a portType are given a concrete correspondence to actual protocols. In this case, the retailer has decided to implement the operations that are defined within the BookPrice portType as SOAP over HTTP:

    <binding name="BookPriceSoapBinding" type="bp:BookPricePortType">
       <soap:binding
          style="document"
          transport="http://schemas.xmlsoap.org/soap/http"/>
       <operation name="GetBookPriceByISBN">
          <soap:operation soapAction="http://example.com/GetBookPriceByISBN"/>
          <input>
             <soap:body use="literal"/>
          </input>
          <output>
             <soap:body use="literal"/>
          </output>
       </operation>
       <operation name="GetBookPriceByAuthor">
          <soap:operation soapAction="http://example.com/GetBookPriceByAuthor"/>
          <input>
             <soap:body use="literal"/>
          </input>
          <output>
             <soap:body use="literal"/>
          </output>
       </operation>
    </binding>

Notice that the binding definition has defined the implementation of the operations as SOAP over HTTP, but no address was given. This further decouples the definitions from the actual implementation, allowing the possibility of reusing these definitions in other WSDL documents.

The final required definition, that of an actual invocation address, is given in the <port> element within the <service> element. In WSDL terminology, a service is the highest-level definition; it consists of a bundling of one or more ports, which describe physical access points to the different operations provided by the service. In short, a port brings together a binding and an address for that binding. (Unfortunately, the WSDL terminology can sometimes be unnecessarily confusing, but it will start to make sense after you start using it.) In the following example, the address where the SOAP message should be sent is associated with the binding that was defined previously:

    <service name="BookPriceService">
       <documentation>Get Book Prices</documentation>
       <port name="BookPricePort" binding="bp:BookPriceSoapBinding">
          <soap:address location="http://example.com/bookprice"/>
       </port>
    </service>

Now, all the elements that are necessary to invoke the BookPrice service are in place. After the WSDL document is complete, it can be communicated to potential users, either directly (such as through email or FTP) or through discovery mechanisms, such as UDDI, which we will explore in the next section.
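A client that receives the WSDL document can read the operation names and the invocation address straight out of it. The sketch below is one possible way to do so with the standard Java DOM API; the class WsdlPortReader and the abridged SAMPLE document are assumptions made for illustration, and a real client would more likely rely on a WSDL toolkit.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper that extracts a few facts from a WSDL document.
class WsdlPortReader {
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";
    static final String SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/";

    // Abridged copy of the BookPrice WSDL: only the parts this sketch reads.
    static final String SAMPLE =
          "<definitions xmlns=\"" + WSDL_NS + "\" xmlns:soap=\"" + SOAP_NS + "\">"
        + "<portType name=\"BookPricePortType\">"
        + "<operation name=\"GetBookPriceByISBN\"/>"
        + "<operation name=\"GetBookPriceByAuthor\"/>"
        + "</portType>"
        + "<service name=\"BookPriceService\">"
        + "<port name=\"BookPricePort\">"
        + "<soap:address location=\"http://example.com/bookprice\"/>"
        + "</port></service></definitions>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // WSDL elements are namespace-qualified
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a parsable WSDL document", e);
        }
    }

    // Names of all operations declared inside <portType> elements.
    static List<String> operationNames(String wsdl) {
        List<String> names = new ArrayList<>();
        NodeList portTypes = parse(wsdl).getElementsByTagNameNS(WSDL_NS, "portType");
        for (int i = 0; i < portTypes.getLength(); i++) {
            NodeList operations = ((Element) portTypes.item(i))
                    .getElementsByTagNameNS(WSDL_NS, "operation");
            for (int j = 0; j < operations.getLength(); j++) {
                names.add(((Element) operations.item(j)).getAttribute("name"));
            }
        }
        return names;
    }

    // Location of the first <soap:address>, or null if there is none.
    static String soapAddress(String wsdl) {
        NodeList addresses = parse(wsdl).getElementsByTagNameNS(SOAP_NS, "address");
        if (addresses.getLength() == 0) {
            return null;
        }
        return ((Element) addresses.item(0)).getAttribute("location");
    }

    public static void main(String[] args) {
        System.out.println(operationNames(SAMPLE));
        System.out.println(soapAddress(SAMPLE));
    }
}
```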

Invoking the Web Service

As a final example, let us examine the actual invocation of one of the operations that is defined in our WSDL document. The SOAP invocation of the GetBookPriceByISBN operation is shown here:

    POST /axis/BookPriceService.jws HTTP/1.0
    Content-Length: 422
    Host: localhost
    Content-Type: text/xml; charset=utf-8
    SOAPAction: "/getPriceByISBN"

    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope
       SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
       xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
       xmlns:xsd="http://www.w3.org/2001/XMLSchema"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <SOAP-ENV:Body>
      <getPriceByISBN>
       <op1 xsi:type="xsd:string">4578498</op1>
      </getPriceByISBN>
     </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
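The body of such a request is ordinary text, so it can be assembled by hand. The following sketch builds an envelope of the same shape as the one above and computes the value for the Content-Length header; the class SoapRequestBuilder and its methods are hypothetical, and a real client would normally let a SOAP toolkit such as Axis generate the message.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that assembles the getPriceByISBN request body.
class SoapRequestBuilder {

    // Escapes the characters that would otherwise break element content.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Builds an envelope of the same shape as the request shown above.
    static String priceByIsbnEnvelope(String isbn) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<SOAP-ENV:Envelope\n"
             + "   SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\"\n"
             + "   xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\"\n"
             + "   xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"\n"
             + "   xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\n"
             + " <SOAP-ENV:Body>\n"
             + "  <getPriceByISBN>\n"
             + "   <op1 xsi:type=\"xsd:string\">" + escapeXml(isbn) + "</op1>\n"
             + "  </getPriceByISBN>\n"
             + " </SOAP-ENV:Body>\n"
             + "</SOAP-ENV:Envelope>\n";
    }

    // Content-Length counts bytes, not characters, so encode first.
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String body = priceByIsbnEnvelope("4578498");
        System.out.println("Content-Length: " + contentLength(body));
        System.out.println(body);
    }
}
```

The byte count produced here depends on the exact whitespace of the generated body, so it will not necessarily equal the Content-Length in the captured request.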

The SOAP response from the Web service is as follows:

    HTTP/1.1 200 OK
    Content-Type: text/xml; charset=utf-8
    Content-Length: 470
    Date: Thu, 28 Feb 2002 03:16:20 GMT
    Server: Apache Tomcat/4.0.1 (HTTP/1.1 Connector)

    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope
       SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
       xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
       xmlns:xsd="http://www.w3.org/2001/XMLSchema"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <SOAP-ENV:Body>
      <getPriceByISBNResponse>
       <getPriceByISBNResult xsi:type="xsd:double">24.55</getPriceByISBNResult>
      </getPriceByISBNResponse>
     </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
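On the client side, the price has to be pulled back out of the response envelope. The sketch below shows one way to do that with the standard Java DOM API, assuming the HTTP headers have already been stripped; the class SoapResponseReader is hypothetical, and the SAMPLE string simply reproduces the response body shown above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper that reads the price out of a getPriceByISBN response.
class SoapResponseReader {

    // The response body shown above, without the HTTP headers.
    static final String SAMPLE =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<SOAP-ENV:Envelope"
        + " SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\""
        + " xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\""
        + " xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\""
        + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
        + "<SOAP-ENV:Body><getPriceByISBNResponse>"
        + "<getPriceByISBNResult xsi:type=\"xsd:double\">24.55</getPriceByISBNResult>"
        + "</getPriceByISBNResponse></SOAP-ENV:Body>"
        + "</SOAP-ENV:Envelope>";

    // Parses the envelope and returns the content of getPriceByISBNResult.
    static double extractPrice(String soapBody) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                         .parse(new InputSource(new StringReader(soapBody)));
        } catch (Exception e) {
            throw new IllegalArgumentException("response is not well-formed XML", e);
        }
        // The result element carries no prefix, so it is matched by tag name.
        NodeList results = doc.getElementsByTagName("getPriceByISBNResult");
        if (results.getLength() == 0) {
            throw new IllegalArgumentException("no getPriceByISBNResult element found");
        }
        return Double.parseDouble(results.item(0).getTextContent().trim());
    }

    public static void main(String[] args) {
        System.out.println(extractPrice(SAMPLE));
    }
}
```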

UDDI

Having created and deployed a service and written its WSDL description, the last major step is allowing potential users to find it. We have used the component programming analogy before to explain the need for a description language such as WSDL. We can extend the analogy to the concept of a services registry.

It is clear that an object or component registry is an essential element in any component-programming framework, facilitating the discovery and use of components. Imagine, for example, trying to write Java code without having access to a browsable set of JavaDocs, or working in Smalltalk without having access to a class browser. If you consider the Web services concept as an evolution of component programming, it follows that for the purpose of service discovery, it is necessary to have access to registries of business and service descriptions that can be browsed and queried. This is the role that UDDI plays in the Web services world.

When the idea of Web services started gaining momentum within the IT community in early 2000, two things became clear: Web services registries were going to be essential for the concept to become practical, and registry standards would have to be endorsed by several, if not all, of the large software providers for any hope of adoption by the various industries. Thus, the UDDI initiative, the result of several months of collaboration between representatives from Ariba, IBM, and Microsoft starting in the spring of 2000, was born and formally announced on September 6, 2000, with support from several other companies. Currently, the UDDI project (http://www.uddi.org) involves a community of more than 300 companies.

The purpose of UDDI is to facilitate service discovery both at design time and dynamically at runtime. Consequently, the UDDI project runs a public online business (and services) registry, which first went live on May 2, 2001. This registry is referred to as the UDDI Business Registry. The UDDI Business Registry actually consists of replicated registries that are currently hosted by several companies, called the UDDI Operators. Registries that conform to the first UDDI specification (UDDI V1) are hosted by IBM and Microsoft, whereas registries that conform to the second specification, still in beta releases (UDDI V2 Beta), are hosted by Hewlett-Packard, IBM, Microsoft, and SAP. More registry Operators are expected to join the Operators group.

UDDI is more than a business and services registry, however. It also defines a set of data structures and an API specification for programmatically registering and finding businesses, services, bindings, and service types. The UDDI API specification provides a set of Publication APIs to register services, and Inquiry APIs to find services. In addition to providing a programmatic API, UDDI registry Operators provide a Web-based user interface for registering, managing, and finding businesses and services in the registry. These Web sites provide a subset of the programmatic API and are relatively self-explanatory. As a matter of fact, the Web interface to UDDI is probably the easiest and fastest way to register your services and to locate services. As of this writing, only the UDDI V1 service is running in production mode. Consequently, although we will mention the additions brought by UDDI V2, this section will be primarily concerned with the usage model of UDDI V1. IBM has its UDDI Web site at http://www.ibm.com/services/uddi and Microsoft's is at http://uddi.microsoft.com. Of course, all of the Operator nodes are accessible from http://www.uddi.org.

The UDDI Data Model

UDDI V1 has defined four core data structures: businessEntity, businessService, bindingTemplate, and tModel. UDDI V2 has added a fifth structure: the publisherAssertion. One of the important characteristics of these entities is that each has a unique identifier that can be used to locate it and refer to it. UDDI has adopted the use of Universally Unique Identifiers (UUIDs) for uniquely identifying the different entities. As we go into more detail about each structure, you will notice that each element has a key attribute. This key holds the UUID that is assigned by the operator at creation time. In the following discussion of the UDDI data model, please refer to Figure 11.6 as an example of how these structures are related. These data types are outlined next.
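Before looking at each structure in turn, it may help to see the containment and key relationships as plain Java classes. This is a heavily simplified, hypothetical model of the four V1 structures, not the UDDI schema: most fields are omitted, and the keys are generated locally with java.util.UUID only to stand in for the keys that a registry operator would assign.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Simplified, hypothetical model of the four UDDI V1 structures.
class TModel {
    final String tModelKey = UUID.randomUUID().toString();
    final String name;

    TModel(String name) { this.name = name; }
}

class BindingTemplate {
    final String bindingKey = UUID.randomUUID().toString();
    final String serviceKey;   // key of the owning businessService
    final String accessPoint;  // where the service can be invoked
    final List<String> tModelKeys = new ArrayList<>(); // referenced tModels

    BindingTemplate(String serviceKey, String accessPoint) {
        this.serviceKey = serviceKey;
        this.accessPoint = accessPoint;
    }
}

class BusinessService {
    final String serviceKey = UUID.randomUUID().toString();
    final String businessKey;  // key of the owning businessEntity
    final String name;
    final List<BindingTemplate> bindingTemplates = new ArrayList<>();

    BusinessService(String businessKey, String name) {
        this.businessKey = businessKey;
        this.name = name;
    }

    BindingTemplate addBinding(String accessPoint, TModel specification) {
        BindingTemplate binding = new BindingTemplate(serviceKey, accessPoint);
        binding.tModelKeys.add(specification.tModelKey);
        bindingTemplates.add(binding);
        return binding;
    }
}

class BusinessEntity {
    final String businessKey = UUID.randomUUID().toString();
    final String name;
    final List<BusinessService> services = new ArrayList<>();

    BusinessEntity(String name) { this.name = name; }

    BusinessService addService(String serviceName) {
        BusinessService service = new BusinessService(businessKey, serviceName);
        services.add(service);
        return service;
    }
}

class UddiModelDemo {
    // Builds one business with one service, one binding, and one tModel.
    static BusinessEntity sample() {
        TModel specification = new TModel("BookPricePortType");
        BusinessEntity business = new BusinessEntity("BooksGalore");
        business.addService("BookPriceService")
                .addBinding("http://example.com/bookprice", specification);
        return business;
    }

    public static void main(String[] args) {
        BusinessEntity business = sample();
        System.out.println(business.name + " has key " + business.businessKey);
    }
}
```

Each child holds the key of its parent, while a bindingTemplate only references tModels by key, which is what lets several providers share one tModel.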

Figure 11.6. The five UDDI core data types.


`businessEntity`

The businessEntity is where businesses (and organizations) register detailed information about themselves, such as name, contact information, and business type. The businessEntity structure is shown in Figure 11.7. Aside from the usual self-explanatory elements such as <name> and <contacts>, you should pay special attention to two elements: <categoryBag> and <identifierBag>. As their names suggest, these elements are used to categorize and identify the business.

Figure 11.7. The `businessEntity` structure.


Although a full discussion of taxonomies is beyond the scope of this chapter, you can think of the <categoryBag> element as a way to fit a business or a service into one or more different categories in a taxonomy. As an example, one of the most common taxonomies is the one provided by Yahoo! to categorize Web sites. UDDI has established three canonical taxonomies: the North American Industry Classification System (NAICS), the Universal Standard Products and Services Classification (UNSPSC), and a geographic location classification, the ISO 3166 Geographic Taxonomy. Although UDDI operators are free to offer more taxonomies at their sites, be aware that this additional information is not replicated to the other UDDI registries, and it cannot be used in searches from other registries.

In addition to categorization, identification information such as the Dun & Bradstreet Data Universal Numbering System (DUNS) can be stored in the identifierBag elements. This greatly facilitates searching for businesses or organizations if their name is ambiguous.
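Both bags hold entries of the same shape: a reference to the taxonomy or identifier scheme, a human-readable name, and a value. The sketch below models that idea in Java. The field names follow the keyedReference structure of the UDDI data model, but the class names, the two tModel keys, and the DUNS value are placeholders invented for illustration; real registries publish the actual keys.

```java
import java.util.ArrayList;
import java.util.List;

// One entry in a categoryBag or identifierBag.
class KeyedReference {
    final String tModelKey; // which taxonomy or identifier scheme is meant
    final String keyName;   // human-readable label
    final String keyValue;  // the code within that scheme

    KeyedReference(String tModelKey, String keyName, String keyValue) {
        this.tModelKey = tModelKey;
        this.keyName = keyName;
        this.keyValue = keyValue;
    }
}

// Hypothetical holder for the two bags of a businessEntity.
class BusinessClassification {
    // Placeholder keys, not the real NAICS or D-U-N-S tModel keys.
    static final String NAICS_TMODEL = "uuid:naics-taxonomy-placeholder";
    static final String DUNS_TMODEL = "uuid:duns-identifier-placeholder";

    final List<KeyedReference> categoryBag = new ArrayList<>();
    final List<KeyedReference> identifierBag = new ArrayList<>();

    // True if the business is filed under the given code of the given taxonomy.
    boolean inCategory(String tModelKey, String keyValue) {
        for (KeyedReference reference : categoryBag) {
            if (reference.tModelKey.equals(tModelKey)
                    && reference.keyValue.equals(keyValue)) {
                return true;
            }
        }
        return false;
    }

    // A bookseller filed under NAICS code 451211 (Book Stores).
    static BusinessClassification booksGalore() {
        BusinessClassification classification = new BusinessClassification();
        classification.categoryBag.add(
                new KeyedReference(NAICS_TMODEL, "Book Stores", "451211"));
        classification.identifierBag.add(
                new KeyedReference(DUNS_TMODEL, "D-U-N-S", "000000000")); // made-up number
        return classification;
    }

    public static void main(String[] args) {
        System.out.println(booksGalore().inCategory(NAICS_TMODEL, "451211"));
    }
}
```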

`businessService`

A business or organization can register several services. This is reflected by the fact that the <businessEntity> element is the parent for the <businessServices> element, which is a container for the <businessService> elements. <businessService> elements allow users to register information about their services. The businessService structure is shown in Figure 11.8. Business services, like businessEntity entries, can also be categorized through <categoryBag> entries.

Figure 11.8. The `businessService` structure.


`bindingTemplate`

The bindingTemplate structure is where the rubber meets the road in completely defining a service and its invocation mechanism. In a sense, all of the other UDDI structures are there to allow users to finally get to a binding so that a service can be invoked. It closely corresponds to the concept of a binding as we saw it in WSDL. The bindingTemplate structure is shown in Figure 11.9.

Figure 11.9. The `bindingTemplate` structure.


`tModel`

Several components are required to fully define a service. These include input parameters, an output format, a transport mechanism, a security mechanism, and so on. As we saw in the WSDL section, many of these details can be supplied in the service description document. WSDL documents contain reusable abstract definitions (such as the message declarations) and concrete implementation definitions (such as the port declaration). This type of abstraction of reusable definitions is a step in the right direction, but it needs to be extended beyond the boundaries of one particular service or even one particular organization.

The technology model (tModel) definitions were meant to hold such cross-entity abstractions. For example, if a book dealership trade organization were to define some standard service interfaces such as some of the ones we have already seen, they would register them as tModels in UDDI. This would relieve the various book dealers from having to redefine these interfaces in their service descriptions. Their UDDI service entries would just refer to the corresponding tModel for the abstractions and provide the remaining concrete part of the definitions.

As a side effect of their use as abstractions, tModels generally are used as references. By convention, the UDDI specification suggests two main uses for them. The primary use is in defining what is called a technical fingerprint. This refers to any technical specifications or prearranged agreements on how to conduct business. The other main use for tModels is in defining namespaces to be used in the identifierBag and categoryBag structures.

Another important use for tModels, however, is from a user's perspective when searching for services. A common scenario would be that a service requester wants to do business with service providers that implement certain interfaces. The requester would then query the UDDI registry for providers who use the corresponding tModels.
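The search scenario can be pictured with a small in-memory stand-in for a registry. This sketch does not call any UDDI API: the registry is just a map from provider name to the set of tModel keys its bindings reference, and the class name, the keys, and the second provider are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory lookup of providers by the tModels they reference.
class TModelLookup {
    // Placeholder key for an interface a trade organization might register.
    static final String BOOK_PRICE_TMODEL = "uuid:book-price-interface-placeholder";

    // Returns the providers whose entries reference the given tModel key.
    static List<String> providersImplementing(Map<String, Set<String>> registry,
                                              String tModelKey) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : registry.entrySet()) {
            if (entry.getValue().contains(tModelKey)) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    // Two made-up providers; only the first references the book price tModel.
    static Map<String, Set<String>> sampleRegistry() {
        Map<String, Set<String>> registry = new LinkedHashMap<>();
        registry.put("BooksGalore",
                new HashSet<>(Arrays.asList(BOOK_PRICE_TMODEL)));
        registry.put("PaperbackDepot",
                new HashSet<>(Arrays.asList("uuid:some-other-interface-placeholder")));
        return registry;
    }

    public static void main(String[] args) {
        System.out.println(providersImplementing(sampleRegistry(), BOOK_PRICE_TMODEL));
    }
}
```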

Based on the UDDI specification, a tModel could define just about anything. It consists of a key, a name, a description, and a URL (see Figure 11.10).

Figure 11.10. The `tModel` structure.


`publisherAssertion`

To the four core data types we've just described, UDDI V2 has added a fifth: the publisherAssertion structure (see Figure 11.11). This structure is used to declare business relationships between different business entities.

Figure 11.11. The `publisherAssertion` structure.


Using the UDDI Web Interface

As we mentioned earlier, the UDDI Web interface that the individual UDDI operators provide is probably the fastest and easiest way to register your business and service information into UDDI and to locate other businesses and services while you're designing applications that consume services. For this section, we will be using the IBM UDDI Business Test Registry (accessible from http://www.ibm.com/services/uddi), which is typical of the interfaces that the other registry operators provide.

UDDI offers two different types of operations: authenticated and non-authenticated. Non-authenticated operations are generally search operations to locate businesses, services, or implementation details. Authenticated operations are the ones that allow you to register, edit, and delete businesses and services. To use the authenticated operations, you need to register with an operator. After you are registered, you can edit your information only from that operator's site, unless you request a transfer of custody to another operator. Keep in mind, however, that although you can edit your information through one operator only, the information will be replicated to all other operators so that anyone who is searching any Operator node can find your information.

After you are registered and logged in, the first option is to create a new business entry. We've created a new business named BooksGalore. After that step is out of the way, we can enter detailed information about the business. Figure 11.12 shows the options that are available to us. We can enter a business description, contact information, and a business locator. Business locators are how we can categorize the business, as discussed in the previous section.

Figure 11.12. Creating a new business entry at the UDDI Registry Web site.


Figure 11.13 shows our options: NAICS, UNSPSC, and GCS-ISO 3166-1999. If we select the NAICS option, we can drill down through the consecutive levels until we reach the desired one: Book Stores 451211 (see Figure 11.14).

Figure 11.13. Selecting a type of locator for categorization.


Figure 11.14. Selecting a NAICS category.


The rest of the interface is relatively straightforward. We can enter contact information and a business description.

The UDDI API

The next step is to find information in the registry. Searching the registry usually takes the form of a drill-down pattern, where a particular service is first located through a variety of means, and then the particular binding documents for that service are retrieved to invoke it. Although the Web interface can be used to perform these functions, UDDI provides a SOAP-based API that can be used programmatically at runtime to dynamically discover and invoke services. The API is exhaustive and includes calls for creating, editing, deleting, locating, and retrieving any of the entities that are in the registry.

The creation/editing operations take the form of save_entity, where entity is business, service, binding, or tModel. In V2, publisherAssertions are set through the set_publisherAssertions call. Delete operations are performed through delete_entity calls. These operations, because they involve modifying data in the registries, are authenticated. They require the acquisition of a token through the get_authToken operation. After the token is used, it can be optionally discarded through the discard_authToken operation. As expected, find operations are performed through find_entity calls. Details about business or service entries are retrieved through the various get_entityDetail operations. The UDDI API specification is well documented at the UDDI site.
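The naming pattern of the API calls, and the shape of a simple inquiry message, can be sketched in a few lines of Java. The class UddiCalls is hypothetical, and the find_business message below reflects our reading of the V1 inquiry API (the urn:uddi-org:api namespace and a generic attribute of 1.0); check it against the specification before relying on it.

```java
// Hypothetical helper illustrating the UDDI call-naming pattern.
class UddiCalls {
    // entity is one of: business, service, binding, tModel
    static String save(String entity)      { return "save_" + entity; }
    static String delete(String entity)    { return "delete_" + entity; }
    static String find(String entity)      { return "find_" + entity; }
    static String getDetail(String entity) { return "get_" + entity + "Detail"; }

    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // A non-authenticated inquiry: look up businesses by name.
    // Assumes the V1 API namespace and generic="1.0".
    static String findBusinessEnvelope(String businessName) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<Envelope xmlns=\"http://schemas.xmlsoap.org/soap/envelope/\">\n"
             + " <Body>\n"
             + "  <find_business generic=\"1.0\" xmlns=\"urn:uddi-org:api\">\n"
             + "   <name>" + escapeXml(businessName) + "</name>\n"
             + "  </find_business>\n"
             + " </Body>\n"
             + "</Envelope>\n";
    }

    public static void main(String[] args) {
        for (String entity : new String[] {"business", "service", "binding", "tModel"}) {
            System.out.println(save(entity) + ", " + delete(entity) + ", "
                    + find(entity) + ", " + getDetail(entity));
        }
        System.out.println(findBusinessEnvelope("BooksGalore"));
    }
}
```

An authenticated call such as save_business would additionally carry the token obtained from get_authToken; that part is left out of this sketch.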