XML Web Services | The .NET and COM Interoperability Handbook (Integrated .Net)



	.NET and COM Interoperability Handbook, The By Alan Gordon

	Chapter One. What's in a Name ?

XML Web Services

An XML Web service is a programmable application component that is accessible through standard Internet protocols. If that definition doesn't make sense to you, it may help to compare how XML Web services work with the way the Web works today. Today, a Web browser makes a request on a Web server using HTTP. The Web server reads this request and responds with an HTML page that contains a visual representation of the requested data. In some cases, the Web server will use a technology like the Common Gateway Interface (CGI) or ASP to generate the HTML dynamically, perhaps using data from a database. In other cases, the Web server simply returns a static HTML page.

With XML Web services, a client application (which may or may not run in a browser) makes a request on a Web server using a network protocol called SOAP, which is essentially just XML carried over HTTP. A listener application on the Web server parses this XML document, invokes a routine that performs some processing, and generates an XML document that contains the return value from the XML Web service. The listener then encodes this XML document into a SOAP response and sends it back to the client.

The main advantage of this approach is that it uses XML instead of HTML. An XML document contains both data and metadata (that is, it contains information that describes the structure and meaning of the data in the response document), whereas an HTML document only contains a visual representation of the data. Therefore, the document that you receive from a Web service can be easily sorted, searched, transformed, and combined with information from other Web services.

Note

XML Web services aren't required to use SOAP. Notice that the definition of a Web service says that it can be accessed using "standard Internet protocols," which does not preclude the use of other protocols. In fact, the .NET Web services infrastructure supports other network protocols in addition to SOAP. However, in practice, on the .NET platform, SOAP is the preferred approach.

Another huge advantage of this approach is that it is truly cross-platform. Any platform that can send and receive HTTP messages and parse XML can host an XML Web service or be a Web service client.

With XML Web services, it will be possible to deliver services that couldn't be dreamed of using HTML. For instance, imagine that you were injured while on the road on an important business trip. You may want to locate a doctor who will accept your insurance in the city that you are currently in. You could go to the Web right now and search for the location of clinics in any city. However, you would have to follow each link to find out exactly where the clinic was located and what insurance it accepted. With Web services, it will be possible to create a directory Web service that returns the search results as XML. These search results will contain tags that identify the name, location, and type of each business and, in the case of a clinic, the insurance plans that it accepts. You can filter the search results to show only those clinics that accept your insurance. You can then combine this information with information from a mapping and navigation Web service to quickly locate the closest clinic and to get directions from your current location. Because this functionality is exposed as a Web service, you can access it from a PC, a PDA, or a smart phone. This is the essence of the "information anytime anywhere on any device" mantra of .NET. XML Web services are the key enabling technology behind the .NET vision.
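To make the filtering step concrete, here is a minimal sketch in Python of how a client might process such a directory response. The tag names (results, business, insurance, and so on) and the sample data are invented for illustration; a real directory service would define its own vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical directory response; the tag names and data are invented
# for illustration and do not come from any real service.
DIRECTORY_RESPONSE = """
<results>
  <business>
    <name>Lakeside Clinic</name>
    <type>clinic</type>
    <city>Denver</city>
    <insurance>Acme Health</insurance>
    <insurance>Blue Plan</insurance>
  </business>
  <business>
    <name>Main Street Clinic</name>
    <type>clinic</type>
    <city>Denver</city>
    <insurance>Blue Plan</insurance>
  </business>
  <business>
    <name>Corner Pharmacy</name>
    <type>pharmacy</type>
    <city>Denver</city>
  </business>
</results>
"""


def clinics_accepting(xml_text, plan):
    """Return the names of clinics whose <insurance> tags include `plan`."""
    root = ET.fromstring(xml_text)
    names = []
    for business in root.findall("business"):
        # The <type> tag tells us what kind of business this is.
        if business.findtext("type") != "clinic":
            continue
        plans = [p.text for p in business.findall("insurance")]
        if plan in plans:
            names.append(business.findtext("name"))
    return names


print(clinics_accepting(DIRECTORY_RESPONSE, "Acme Health"))
```

The filter works only because the tags say what each value means; doing the same against an HTML page would require guessing at the layout.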

Note

Microsoft actually presented the use case for .NET mentioned previously in a hilarious video that it showed at several of its developer events.

Understanding XML Web services is also the key to understanding .NET. A substantial amount of the technology in .NET was developed to make it easy to create, use, and integrate XML Web services.

XML

XML is a computer language for describing information. XML is a simplified subset of the Standard Generalized Markup Language (SGML), and HTML is an application of SGML. XML is similar to HTML, which is also a computer language for describing information, but HTML only allows you to specify how information will be displayed. For instance, the following HTML displays a table of information about books:

    <HTML>
    <HEAD>
    <TITLE>Browse for a book by keyword</TITLE>
    </HEAD>
    <BODY>
    <TABLE BORDER="1">
        <TR>
          <TD><H4>Title</H4></TD>
          <TD><H4>Author Last</H4></TD>
          <TD><H4>Author First</H4></TD>
        </TR>
        <TR>
          <TD>The COM and COM+ Programming Primer</TD>
          <TD>Gordon</TD>
          <TD>Alan</TD>
        </TR>
        <TR>
          <TD>XML by Example</TD>
          <TD>McGrath</TD>
          <TD>Sean</TD>
        </TR>
    </TABLE>
    </BODY>
    </HTML>

This file contains a lot of information that defines how the data will look when it is displayed, but it carries no metainformation. Nothing in it tells us what the data means.

Now let's look at the same information displayed as XML:

    <BOOKS>
        <BOOK>
          <TITLE>The COM and COM+ Programming Primer</TITLE>
          <AUTHORLAST>Gordon</AUTHORLAST>
          <AUTHORFIRST>Alan</AUTHORFIRST>
        </BOOK>
        <BOOK>
          <TITLE>XML by Example</TITLE>
          <AUTHORLAST>McGrath</AUTHORLAST>
          <AUTHORFIRST>Sean</AUTHORFIRST>
        </BOOK>
    </BOOKS>

The two files contain the same information, and both can be displayed in Internet Explorer (IE) 5 or greater (try it), but that's where the similarities end. The XML file contains a great deal of metainformation that describes the meaning of the information in the file. This makes this file ideal for use as a medium for information exchange.
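As a small illustration of why the tags matter, the following Python sketch loads the XML book list with the standard library's ElementTree parser and sorts it by author. The function name load_books and the dictionary keys are my own choices for this example, not part of any standard.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """
<BOOKS>
  <BOOK>
    <TITLE>The COM and COM+ Programming Primer</TITLE>
    <AUTHORLAST>Gordon</AUTHORLAST>
    <AUTHORFIRST>Alan</AUTHORFIRST>
  </BOOK>
  <BOOK>
    <TITLE>XML by Example</TITLE>
    <AUTHORLAST>McGrath</AUTHORLAST>
    <AUTHORFIRST>Sean</AUTHORFIRST>
  </BOOK>
</BOOKS>
"""


def load_books(xml_text):
    """Turn each <BOOK> element into a dictionary keyed by meaning."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("BOOK"):
        books.append({
            "title": book.findtext("TITLE").strip(),
            "author_last": book.findtext("AUTHORLAST").strip(),
            "author_first": book.findtext("AUTHORFIRST").strip(),
        })
    return books


books = load_books(BOOKS_XML)
# Because each value is labeled, sorting by author needs no guesswork.
for book in sorted(books, key=lambda b: b["author_last"], reverse=True):
    print(book["author_last"], "-", book["title"])
```

Doing the same with the HTML version would mean relying on the position of each table cell, which breaks as soon as the page layout changes.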

You can generate this XML file on any platform, perhaps from data stored in a database. This file can be sent over the Internet and then parsed by another piece of software (that runs on a different platform) and imported into some other database. All the metainformation that describes the meaning of the data in the file travels with the file. So how is it displayed? However you want! A companion language called Extensible Stylesheet Language (XSL) can be used to format the XML file for display in any format you choose. I could go on like this for pages and pages. XML is a very exciting technology, but this is not a book about XML. If you would like to learn more about XML, I can recommend the following books: Beginning XML, by David Hunter et al., and Professional XML, by Didier Martin et al. Wrox Press publishes both books.

SOAP

SOAP is a network protocol that allows you to make method calls across a network. Based on that description alone, nothing is particularly unique about SOAP. There are many other Remote Procedure Call (RPC) protocols out there already that can do the same thing, including Distributed Computing Environment (DCE) RPC; Distributed COM (DCOM), which is based on DCE RPC; and Internet Inter-ORB Protocol (IIOP), which is the protocol used by the Common Object Request Broker Architecture (CORBA). All of these protocols work essentially as shown in Figure 1-2.

Figure 1-2. Executing an RPC.

graphics/01fig02.gif

To make a method call on a remote machine using RPC, you first call the desired method on some sort of proxy; this is Step 1 in Figure 1-2. The proxy usually presents an interface that looks exactly like the remote method. The proxy takes the method name and the parameters that the client is passing to the method and serializes them into a stream. If you are using an object-aware RPC implementation like DCOM or IIOP, the proxy also serializes an object identifier, and in the case of DCOM, an interface identifier, into the stream. In Step 2 of Figure 1-2, the proxy sends the stream to the server machine using a network protocol. Each of these RPC technologies uses a different protocol. DCOM, for instance, uses Transmission Control Protocol/Internet Protocol (TCP/IP) with a dynamically selected port in the range 1,024 to 65,535 (after an initial request is made on port 135).

Note

You can configure the ports that DCOM uses to be within a user-specified range. This is useful if you are trying to make DCOM requests across a firewall. You can simply open the ports that you have constrained DCOM to use. See Chapter 19 of the book Inside COM+ Base Services by Guy and Henry Eddon for more information on this.

In Step 3 of Figure 1-2, a stub that is listening on the prearranged port receives the message. The stub unpacks the message and calls the desired method.

Note

With most RPC technologies, a separate piece of software is used to establish the connection between the proxy and the stub. Each RPC has a different name for this piece of software. In CORBA, it's called an Object Request Broker (ORB), and, in DCOM, it's called the Service Control Manager (SCM). With SOAP, a Web server typically performs this task.

After the method completes, the stub passes the return value of the method (and any output parameters) back to the client using the same communication mechanism used by the inbound request, as shown in Steps 4 and 5. On the client, the proxy receives this response message, extracts the return value, and returns the value to the client (Step 6). This process of serializing and deserializing a call stack is sometimes referred to as marshaling and unmarshaling, or sometimes the entire process (serializing and deserializing) is just referred to as marshaling. From the client's perspective, you make a method call, and a return value comes back. All of the underlying details of the communication are completely hidden.
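The six steps can be sketched in a few lines of Python. This is a toy model, not how DCOM or IIOP is implemented: the class names are invented, JSON stands in for the wire format, and a direct function call stands in for the network.

```python
import json


class Calculator:
    """The 'remote' object that lives on the server side (hypothetical)."""

    def add(self, a, b):
        return a + b


class Stub:
    """Server side: unpacks a request stream and calls the real method."""

    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        # Step 3: unmarshal the method name and parameters, then call.
        call = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, call["method"])
        result = method(*call["params"])
        # Step 4: marshal the return value into a response stream.
        return json.dumps({"return": result}).encode("utf-8")


class Proxy:
    """Client side: looks like the remote object but only marshals calls."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking and returning bytes

    def __getattr__(self, name):
        def remote_call(*params):
            # Step 1: serialize the method name and parameters.
            request = json.dumps(
                {"method": name, "params": list(params)}
            ).encode("utf-8")
            # Steps 2 and 5: send the request and receive the response.
            response = self._transport(request)
            # Step 6: extract the return value for the caller.
            return json.loads(response.decode("utf-8"))["return"]

        return remote_call


stub = Stub(Calculator())
proxy = Proxy(stub.handle)  # in-process stand-in for a network connection
print(proxy.add(2, 3))
```

The client code calls proxy.add exactly as it would call the method on a local object, which is the point of the proxy: the marshaling is hidden.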

Most RPC implementations have several problems. Let's consider DCOM because I know a lot more about it than CORBA. In order to use DCOM, several pieces of proprietary software must exist on both the client and server. Both the client and the server must have something called an SCM, which establishes the communication between the proxy and stub. In addition, a DCOM-compliant proxy must reside on the client, and a stub must reside on the server. You can create a proxy and stub in three ways: (1) You can use a language called Interface Definition Language (IDL) to describe your interfaces. You then use an IDL compiler called MIDL to generate the proxy and stub from the IDL. (2) You can use MIDL to create a binary metadata file called a type library that describes the classes and interfaces in your server, and then you can use a system-provided proxy and stub. (3) You can use a system-provided interface called IDispatch to make your method call, in which case you can use a different system-provided proxy and stub. In all three cases, you are again dependent on a great deal of proprietary software. Although DCOM is available on several platforms, including Solaris, Linux, and Virtual Memory System (VMS), expect to have a lot of problems getting DCOM to work unless both your client and server are running some variant of Microsoft Windows. Expect to have even more problems if your DCOM server sits behind a firewall. I mentioned earlier that, by default, DCOM uses TCP/IP with a dynamically selected port in the range 1,024 to 65,535. Most firewalls block this traffic by default. The solution is to configure DCOM to use a user-defined port range and then to convince your network administrator to allow traffic through on these ports (good luck!).

Note

You can also use a technology called the Remote Data Service (RDS) that allows a limited subset of DCOM functionality over HTTP on port 80. Most firewalls are configured to allow HTTP traffic on port 80 to go through.

What makes SOAP unique is that the proxy and stub are implemented using software that is universally available on every platform. Sending or receiving SOAP messages requires only a Web server, a network library for sending and receiving HTTP messages, and an XML parser. Moreover, SOAP can use any of several Internet protocols, although usually HTTP on port 80 is used because it makes SOAP firewall friendly. No proprietary software is required to support SOAP on a particular platform.

With SOAP, the method call is serialized into an XML document using an XML schema that is defined in the SOAP specification.

Note

Microsoft has submitted SOAP to the World Wide Web Consortium (W3C). At the time that I am writing this book, you can find the latest SOAP specification (1.1) at www.w3.org/TR/SOAP/. Work on version 1.2 of the SOAP specification is also underway; see www.w3.org/TR/soap12-part1/ and www.w3.org/TR/soap12-part2/.

The XML document describes the method that the client wants to call as well as the name and (optionally) the type of each parameter. The following is an example SOAP request:

    <?xml version="1.0"?>
    <SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
                   SOAP:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP:Body>
        <MonthlyPayment>
          <numMonths>60</numMonths>
          <interestRate>8</interestRate>
          <loanAmt>10000</loanAmt>
        </MonthlyPayment>
      </SOAP:Body>
    </SOAP:Envelope>

In this example, I am calling a method called MonthlyPayment that takes three parameters: numMonths, interestRate, and loanAmt.
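A client-side proxy has to produce a document like this from a method name and a set of parameters. The following Python sketch shows one way to do that with the standard library; the build_request function is a hypothetical helper written for this example, not part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"

# Ask ElementTree to emit the "SOAP" prefix used in the example request.
ET.register_namespace("SOAP", SOAP_ENV)


def build_request(method, params):
    """Serialize a method call into a SOAP 1.1 envelope (hypothetical helper)."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The method becomes an element; each parameter becomes a child element.
    call = ET.SubElement(body, method)
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request(
    "MonthlyPayment",
    {"numMonths": 60, "interestRate": 8, "loanAmt": 10000},
)
print(request_xml)
```

The output is written on a single line and omits the XML declaration, but it describes the same call as the request shown above.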

Note

Version 1.0 of the .NET Framework SDK and Visual Studio .NET are compliant with version 1.1 of the SOAP specification.

In practice, the XML document is encoded into an HTTP POST message, which is then sent to an HTTP server application on the server machine. The following code shows how the XML document can be encoded into an HTTP POST message. The server could be either the standard HTTP server on port 80 or a custom high-performance HTTP server that runs on a different port. Notice the SOAPAction line in the HTTP POST; it is there so that a firewall can filter this message out if it wants to. However, if this message is sent on port 80, most firewalls will let it through.

    POST /financialwebservice/TimeValue.asp HTTP/1.0
    User-Agent: MSDNWS
    Content-Type: text/xml
    Content-Length: 325
    SOAPAction: http://gordon1/financialwebservice/TimeValue.asp#AmortizationTable

    <?xml version="1.0"?>
    <SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
                   SOAP:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP:Body>
        <MonthlyPayment>
          <numMonths>60</numMonths>
          <interestRate>8</interestRate>
          <loanAmt>10000</loanAmt>
        </MonthlyPayment>
      </SOAP:Body>
    </SOAP:Envelope>
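The encoding step itself is simple enough to sketch. The Python function below, a hypothetical helper named build_http_post, wraps a SOAP envelope in the request line and headers and computes Content-Length from the encoded body. It only builds the bytes; nothing is sent over the network. The path and SOAPAction value are copied from the message above, and the envelope is a shortened stand-in.

```python
def build_http_post(path, soap_action, body_xml):
    """Wrap a SOAP envelope in an HTTP/1.0 POST message (hypothetical helper)."""
    body = body_xml.encode("utf-8")
    header_lines = [
        f"POST {path} HTTP/1.0",
        "Content-Type: text/xml",
        # Content-Length counts the bytes of the body, not the characters.
        f"Content-Length: {len(body)}",
        f"SOAPAction: {soap_action}",
    ]
    # A blank line separates the headers from the body.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Shortened envelope used only to exercise the function.
ENVELOPE = (
    '<?xml version="1.0"?>'
    '<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP:Body><MonthlyPayment>"
    "<numMonths>60</numMonths>"
    "<interestRate>8</interestRate>"
    "<loanAmt>10000</loanAmt>"
    "</MonthlyPayment></SOAP:Body>"
    "</SOAP:Envelope>"
)

message = build_http_post(
    "/financialwebservice/TimeValue.asp",
    "http://gordon1/financialwebservice/TimeValue.asp#AmortizationTable",
    ENVELOPE,
)
print(message.decode("utf-8"))
```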

On the server machine, the stub (or listener, as it's usually called in SOAP) simply has to parse the XML document to find the name of the method that it should call and the parameters that it should pass to the method. It then must find a piece of software that can handle this request. Notice that no implementation form is assumed or implied by this protocol. The MonthlyPayment method could be implemented in a COM object, a CORBA object, or no object at all. It could be implemented directly within the listener. After calling the method, the listener serializes the return value into a SOAP response message, which is just another XML document, and sends it back to the client as the response to the HTTP POST. The SOAP response to the previous request is shown here.

    <?xml version="1.0"?>
    <SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
                   SOAP:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP:Body>
        <MonthlyPaymentResponse>
          <return>202.76</return>
        </MonthlyPaymentResponse>
      </SOAP:Body>
    </SOAP:Envelope>
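The listener's job can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation behind the example service: the HANDLERS table and handle_request function are invented names, the method is implemented directly in the listener, and I assume that interestRate is an annual percentage and that the payment follows the standard amortization formula, which reproduces the 202.76 shown above.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("SOAP", SOAP_ENV)

REQUEST = """<?xml version="1.0"?>
<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Body>
    <MonthlyPayment>
      <numMonths>60</numMonths>
      <interestRate>8</interestRate>
      <loanAmt>10000</loanAmt>
    </MonthlyPayment>
  </SOAP:Body>
</SOAP:Envelope>"""


def monthly_payment(numMonths, interestRate, loanAmt):
    """Standard amortization formula; assumes an annual percentage rate."""
    months = int(numMonths)
    monthly_rate = float(interestRate) / 100 / 12
    principal = float(loanAmt)
    payment = principal * monthly_rate / (1 - (1 + monthly_rate) ** -months)
    return round(payment, 2)


# Hypothetical dispatch table: maps method element names to Python callables.
HANDLERS = {"MonthlyPayment": monthly_payment}


def handle_request(request_xml):
    """Parse a SOAP request, call the named method, and build the response."""
    envelope = ET.fromstring(request_xml)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # the first child of the body names the method
    args = {child.tag: child.text for child in call}
    result = HANDLERS[call.tag](**args)

    response = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    response_body = ET.SubElement(response, f"{{{SOAP_ENV}}}Body")
    response_call = ET.SubElement(response_body, call.tag + "Response")
    ET.SubElement(response_call, "return").text = f"{result:.2f}"
    return ET.tostring(response, encoding="unicode")


print(handle_request(REQUEST))
```

Swapping the entry in HANDLERS for a call into a COM object or any other component would not change the messages on the wire, which is what it means for the protocol to imply no implementation form.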

If there is an error, you will see a message similar to this instead:

    <?xml version="1.0"?>
    <SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
                   SOAP:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP:Body>
        <SOAP:Fault>
          <faultcode>-2147221005</faultcode>
          <faultstring>006~ASP 0177~Server.CreateObject Failed~Invalid ProgID.
            For additional information specific to this message please visit
            the Microsoft Online Support site located at:
            http://www.microsoft.com/contentredirect.asp.
          </faultstring>
          <faultactor>-2147221005</faultactor>
          <detail>006~ASP 0177~Server.CreateObject Failed~Invalid ProgID.
            For additional information specific to this message please visit
            the Microsoft Online Support site located at:
            http://www.microsoft.com/contentredirect.asp.
          </detail>
        </SOAP:Fault>
      </SOAP:Body>
    </SOAP:Envelope>
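On the client, the proxy has to tell a normal response from a fault before it hands anything back to the caller. One plausible approach, sketched here in Python, is to look for a SOAP:Fault element in the body and turn it into an exception. The SoapFault class and extract_return function are names invented for this example, and the fault text is shortened from the message above.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


class SoapFault(Exception):
    """Raised when the response body carries a SOAP:Fault (hypothetical)."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def extract_return(response_xml):
    """Return the text of <return>, or raise SoapFault if the call failed."""
    envelope = ET.fromstring(response_xml)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    fault = body.find(f"{{{SOAP_ENV}}}Fault")
    if fault is not None:
        raise SoapFault(
            fault.findtext("faultcode"),
            fault.findtext("faultstring").strip(),
        )
    return body[0].findtext("return")


OK_RESPONSE = (
    '<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP:Body><MonthlyPaymentResponse><return>202.76</return>"
    "</MonthlyPaymentResponse></SOAP:Body></SOAP:Envelope>"
)

# Fault text shortened from the example message for readability.
FAULT_RESPONSE = (
    '<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP:Body><SOAP:Fault><faultcode>-2147221005</faultcode>"
    "<faultstring>Server.CreateObject Failed~Invalid ProgID.</faultstring>"
    "</SOAP:Fault></SOAP:Body></SOAP:Envelope>"
)

print(extract_return(OK_RESPONSE))
try:
    extract_return(FAULT_RESPONSE)
except SoapFault as error:
    print("call failed:", error)
```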

Of course, SOAP isn't perfect. Its primary benefit (that it will work on any platform) is also its greatest weakness. SOAP implements a least-common-denominator set of RPC features. SOAP does not address several issues that are crucial to building distributed systems. Issues like transactions, security, and the ability to support the notion of a logical thread that spans multiple machines (DCOM calls this causality) are not addressed by SOAP. These technologies are not implemented uniformly on all platforms, so they were left out of the SOAP specification for now. SOAP does provide the ability for you to add additional information to your SOAP messages beyond that required by the SOAP specification, and you can even specify that the server must understand the additional information if it is to process the message. Thus, a message that requires a transaction will only be processed if the server can honor the transaction.
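In SOAP 1.1, this additional information travels in an optional SOAP:Header element, and an entry marked with SOAP:mustUnderstand="1" must be understood by the receiver or the message must be rejected. The Python sketch below shows how a listener might check for such entries before processing the body. The Transaction header, its urn:example:transactions namespace, and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
MUST_UNDERSTAND = f"{{{SOAP_ENV}}}mustUnderstand"

# The Transaction header entry and its namespace are hypothetical.
REQUEST = """<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Header>
    <t:Transaction xmlns:t="urn:example:transactions"
                   SOAP:mustUnderstand="1">5</t:Transaction>
  </SOAP:Header>
  <SOAP:Body>
    <MonthlyPayment>
      <numMonths>60</numMonths>
      <interestRate>8</interestRate>
      <loanAmt>10000</loanAmt>
    </MonthlyPayment>
  </SOAP:Body>
</SOAP:Envelope>"""


def unsupported_mandatory_headers(request_xml, understood):
    """List header entries marked mustUnderstand that this server cannot honor."""
    envelope = ET.fromstring(request_xml)
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    return [
        entry.tag
        for entry in header
        if entry.get(MUST_UNDERSTAND) == "1" and entry.tag not in understood
    ]


# A server with no transaction support must refuse this message.
print(unsupported_mandatory_headers(REQUEST, understood=set()))
```

A listener that gets a non-empty list back would return a fault instead of calling the method, so a request that depends on a transaction is never processed without one.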

Note

Shortly before the publication of this book, Microsoft announced the Global XML Web Services Architecture (GXA), which is an initiative to define open, standards-based protocols to provide infrastructure services such as transactions, security, and reliable messaging to XML Web Services. IBM and BEA Systems are participating in this effort. You can find out more about GXA at the following URL: msdn.microsoft.com/library/default.asp?url=/library/en-us/dngxa/html/understandgxa.asp.

SOAP also has relatively poor performance compared to other types of RPCs. My own benchmarking showed a SOAP method call to be about 10 times slower than an equivalent DCOM method call on a 100 megabit per second (Mbps) local area network (LAN). Part of the problem is HTTP, which is a relatively verbose, text-based protocol. Part of the problem is SOAP itself, which adds extra bytes in the form of XML tags. More bytes must be transferred across the network to make a method call using HTTP and SOAP than to make a comparable DCOM method call. The remaining overhead is the cost of parsing the request XML and then creating a response XML document.

The numbers aren't as bad as they seem if you are using SOAP for the purposes for which it was intended. If the communication latency (the actual time to move the bytes across the wire from the client to the server through routers and so forth) is low, as it is on my home LAN, SOAP looks terrible next to DCOM because the additional overhead that SOAP causes is large compared to the network latency. However, on the Internet, network latency dwarfs the time to create and parse XML. Moreover, for a relatively small message size, the portion of the latency that is proportional to the size of the message is small compared to the fixed part of the network latency that you will incur regardless of whether you are sending a small message or a large message. The point is that, when used over the Internet, the performance difference between SOAP and DCOM will not look as bad as when it is used on a LAN. The comparison looks even better when you consider that, in many cases, you cannot use DCOM over the Internet at all because of firewall restrictions.
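A back-of-the-envelope model makes the argument easier to see. The Python sketch below treats the time for one call as a fixed network latency, plus a transfer time proportional to the message size, plus the time to build and parse the message. Every number in it is an assumption chosen for illustration, not a measurement.

```python
def call_time_ms(fixed_latency_ms, message_bytes, bytes_per_ms, processing_ms):
    """Fixed latency + size-proportional transfer time + processing time."""
    return fixed_latency_ms + message_bytes / bytes_per_ms + processing_ms


def soap_slowdown(fixed_latency_ms, bytes_per_ms):
    """Ratio of SOAP call time to DCOM call time under assumed costs."""
    # Assumed, illustrative costs: a compact binary DCOM message versus a
    # larger text-based SOAP message that also has to be parsed as XML.
    dcom = call_time_ms(fixed_latency_ms, 200, bytes_per_ms, 0.05)
    soap = call_time_ms(fixed_latency_ms, 1000, bytes_per_ms, 1.0)
    return soap / dcom


# Assumed LAN: 0.1 ms latency at 100 Mbps (12,500 bytes per millisecond).
lan_ratio = soap_slowdown(fixed_latency_ms=0.1, bytes_per_ms=12500)

# Assumed Internet link: 50 ms latency at 1 Mbps (125 bytes per millisecond).
internet_ratio = soap_slowdown(fixed_latency_ms=50.0, bytes_per_ms=125)

print(f"LAN: SOAP takes about {lan_ratio:.1f} times as long as DCOM")
print(f"Internet: SOAP takes about {internet_ratio:.1f} times as long as DCOM")
```

With these assumed figures, the fixed latency on the Internet link swamps both the extra bytes and the XML processing, so the ratio falls close to 1, while on the LAN the XML overhead dominates.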

