XML Web Service Design Considerations | Introducing Microsoft .NET (Pro-Developer)

When I asked Keith Ballinger, the Microsoft project manager who wrote the foreword to this book, what I should tell readers about designing their XML Web services, he said, “Two things: make them chunky, and think carefully about their state.” In this section of the chapter, I’ll discuss what he meant and why he said it, and throw in a few more suggestions for writing robust, high-performance XML Web services quickly.

Make Them Chunky

The simplest sample service shown in this chapter returns a single string. While an XML Web service method can return only one object as its return value, the object is not limited to being a simple type such as the string shown here. It can be any .NET object that can be serialized into XML. I discuss the requirements for XML serialization in Chapter 7, but basically it’s any type of object that contains a default constructor (one that requires no parameters), which means almost every type of object in existence. This means that you can return any amount of data from your XML Web service methods. In Chapter 6, for example, you’ll see a sample XML Web service that returns a DataSet representing the results of a database search.

An XML Web service can return almost any type of .NET object.

The overhead of making an XML Web service call is fairly high. A lot of work goes on at different levels even though you don’t have to write the code for it, as I’ve discussed earlier in this chapter. You want to incur this overhead as seldom as possible, which means bundling as much information as you think you’re going to want into each call. For example, you don’t want to make one call to get a customer’s first name, another to get his last name, and a third to get his middle initial. You’d rather make one call to obtain a customer object containing all this information. Even if the customer object contained some information that you didn’t care about in this particular case, maybe the prefix and suffix fields of the customer’s name, you still make a performance profit by avoiding multiple calls. On the other hand, you probably don’t want to transfer the customer’s entire 25-megabyte buying history record when all you want to do is send him a birthday card. It’s sort of like the shipping charges on Amazon.com, which they’ve cleverly structured to encourage you to buy more than one item per order.

The overhead of an XML Web service call can be significant.
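The customer example above might look something like the following. This is only a minimal sketch: the Customer type, the CustomerService class, the example.com namespace URI, and the placeholder data are all hypothetical names invented for illustration, not part of the sample code for this chapter.

```vbnet
Imports System.Web.Services

' Hypothetical data type. Public fields are serialized to XML, and the
' implicit parameterless constructor satisfies the serializer.
Public Class Customer
    Public FirstName As String
    Public MiddleInitial As String
    Public LastName As String
End Class

' Hypothetical service; the namespace URI is a placeholder.
<WebService(Namespace:="http://example.com/customers/")> _
Public Class CustomerService
    Inherits WebService

    ' Chunky: one round trip returns the whole name.
    <WebMethod()> _
    Public Function GetCustomer(ByVal customerId As Integer) As Customer
        ' Placeholder data; a real service would query a data store here.
        Dim c As New Customer()
        c.FirstName = "Ada"
        c.MiddleInitial = "K"
        c.LastName = "Lovelace"
        Return c
    End Function

    ' The chatty design to avoid would expose GetFirstName,
    ' GetMiddleInitial, and GetLastName as three separate Web methods,
    ' paying the per-call overhead three times.
End Class
```

The client makes one call and reads whichever fields it needs from the returned object, ignoring the rest.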

I wondered where the break-even point for chunkiness was, so I wrote a sample program, which you can see in Figure 4-14. The XML Web service method accepts an integer parameter and returns an array containing that number of bytes. The user specifies the number of calls to make and the number of bytes to fetch per call. The client program makes that series of calls and reports the elapsed time.

Figure 4-14: Sample program demonstrating the break-even point for chunkiness.

Going from one machine to another on a 100-Mbit/sec Ethernet in my own office, with authentication turned off and nothing else happening on the network or on either machine, I measured the results shown in the following table. The basic overhead (the cost of Amazon.com shipping an empty box) is about 6 milliseconds per call. Transferring data took about 1.5 milliseconds per kilobyte (the extra shipping cost based on the weight of your package’s contents). Dividing the 6-millisecond overhead by 1.5 milliseconds per kilobyte gives 4 kilobytes, which means that it’s worth transmitting about 4000 bytes of unneeded data to avoid one unnecessary call. Your mileage, of course, may vary significantly. Authentication will raise the per-call overhead; network congestion will raise the per-kilobyte transmission cost; faster machines and networks will lower them both. You’ll have to repeat this measurement on your target system to know exactly where the break-even point is for your project. But, in general, Keith is right: make them chunky.

The time required for a call itself is roughly equal to the time required to transfer a few thousand bytes.

Bytes transferred per call	Elapsed time per call, milliseconds
1	6
100	6
1000	7
4000	12
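To redo the break-even arithmetic with your own measurements, a small helper along these lines may be useful. It is a sketch that assumes the simple linear cost model implied by the table (a fixed cost per call plus a cost per 1000 bytes); the module and function names are hypothetical, and the two constants are the figures measured above.

```vbnet
Imports System

Module ChunkinessEstimate
    ' Figures measured in the text; remeasure them on your target system.
    Const PerCallOverheadMs As Double = 6.0
    Const PerKilobyteMs As Double = 1.5

    ' Estimated elapsed time in milliseconds for a series of calls,
    ' assuming cost = overhead + (bytes / 1000) * per-kilobyte cost.
    Function EstimateMs(ByVal calls As Integer, ByVal bytesPerCall As Integer) As Double
        Return calls * (PerCallOverheadMs + PerKilobyteMs * bytesPerCall / 1000.0)
    End Function

    ' Bytes of extra payload that cost as much as one additional call.
    Function BreakEvenBytes() As Double
        Return PerCallOverheadMs / PerKilobyteMs * 1000.0
    End Function

    Sub Main()
        Console.WriteLine("Break-even bytes: " & BreakEvenBytes().ToString())
        Console.WriteLine("Three 100-byte calls: " & EstimateMs(3, 100).ToString() & " ms")
        Console.WriteLine("One 300-byte call:    " & EstimateMs(1, 300).ToString() & " ms")
    End Sub
End Module
```

Under this model, three 100-byte calls cost roughly 18 milliseconds, while one 300-byte call costs between 6 and 7, which is the whole argument for chunkiness in two lines of output.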

Think Carefully About Their State

XML Web service objects in their natural state are stateless (groan). That means that ASP.NET constructs a new instance of the XML Web service object for each incoming call and destroys it at the end of the call. None of the data from one call is available to the next call unless you go out of your way to make it so. This situation is somewhat analogous to just-in-time activation of objects in COM+.

XML Web services are stateless by default.

Sometimes this behavior is what you want; sometimes it isn’t. For my time service shown in this chapter, it probably is. Whether a client wants to show the seconds digit has no bearing on what the next client wants or should get. Each function call is sufficient unto itself; there’s no reason to remember anything from one to the next. In the classic sense of an object as a combination of data and the code that operates on the data, you might argue that there’s no object here. The client is simply making a call that has no relation to anything else, and that’s not an object, that’s a function. You can waste your time arguing religious semantics; I’ve got a product to ship.

When this behavior isn’t what you want, an object can maintain state between one call and another. The physical instance of an object is still created and destroyed on demand, but you can maintain the data that it was working on in its previous life, as you might leave instructions in your will for your descendants. There are basically two ways to do this, each with its advantages and disadvantages.

First, we can use the internal state collections in ASP.NET. In Chapter 3 I discussed the Session and Application state collections available to objects used on ASP.NET pages. The same state management options are available to your XML Web service objects, as they run on top of ASP.NET. The base class WebService, from which your XML Web service is derived, contains two collections for holding state, called Session and Application. You put data into them and take data out by accessing the collections through string names of items, just as ASP.NET page code does. However, XML Web service code has one large difference from ASP.NET page code. Because of its overhead, Session state is turned off by default in an XML Web service. If you want to use it, you have to explicitly turn it on by passing a parameter to the WebMethod attribute declaration, as shown in Listing 4-7. You must do this in every method that you want to access session state. If you don’t, the Session object will be Nothing, and accessing it will cause an exception. I’ve written a sample service and client to demonstrate session state and exception handling in the next section. The screen shot is shown in Figure 4-15.

XML Web services can store state in the session-level and application-level state containers in ASP.NET.

Listing 4-7: Turning on session state using the WebMethod attribute.

<WebMethod(EnableSession:=True)> _
Public Function IncrementAndReturnSessionHitCount() As String

Figure 4-15: Sample service demonstrating session state and exception handling.
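Listing 4-7 shows only the method declaration. One plausible body for such a method is sketched below. The class name, the "HitCount" key, the NextHitCount helper, the namespace URI, and the wording of the returned string are assumptions made for illustration; the sample service's actual body may differ.

```vbnet
Imports System.Web.Services

' Hypothetical service; the namespace URI is a placeholder.
<WebService(Namespace:="http://example.com/hitcounter/")> _
Public Class HitCounterService
    Inherits WebService

    ' Pure helper: a missing entry (Nothing) means no hits so far.
    Public Shared Function NextHitCount(ByVal stored As Object) As Integer
        If stored Is Nothing Then Return 1
        Return CInt(stored) + 1
    End Function

    ' EnableSession:=True is required on every method that touches
    ' Session; without it the Session object is Nothing.
    <WebMethod(EnableSession:=True)> _
    Public Function IncrementAndReturnSessionHitCount() As String
        Dim count As Integer = NextHitCount(Session("HitCount"))
        Session("HitCount") = count
        Return "Hits in this session: " & count.ToString()
    End Function
End Class
```

ASP.NET session state is normally tracked with a cookie, so a .NET client proxy will only see the count rise from call to call if its CookieContainer property has been assigned; otherwise each call looks like a brand-new session.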

That sounds easy to program, and it is. However, excessive use of the Session collection can hinder your service’s scalability for several reasons. First, it takes time on each XML Web service call to create the session object, restore its state, and attach it to the XML Web service context. You also pay a space price to store the state information between calls. So don’t just store session state for the sake of storing it; think carefully about exactly how much you need and why. Make sure to omit it from methods that don’t require it.

Using ASP.NET Session state requires server CPU cycles and storage space.

The indeterminacy of the Session state lifetime is a subtler problem. Suppose a client calls method A on an XML Web service that stores some state in the Session object. Suppose the client then calls method B, whose functionality is dependent on that state, some time later. If the session state timeout interval has expired, then the state on which method B depends will no longer exist. You’ll have to program both client and server to handle that eventuality. You could increase the session state timeout interval, but this might mean keeping lots of state information around for XML Web service clients that have finished their business and gone away. Session state is a good idea in pages viewed by casual human users, but this indeterminacy is a bigger problem in functions called by other computer programs. If you go this route, think about having some sort of logout method that the client can call to clean up and dump the session state.

The indeterminacy of Session state lifetime can cause algorithm problems.
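The method A and method B scenario, together with the suggested logout method, might be sketched as follows. Every name here (OrderService, BeginOrder, SubmitOrder, Logout, the "CustomerId" key, the namespace URI) is hypothetical, and signaling the timeout with a SoapException is one possible choice rather than the only one.

```vbnet
Imports System.Web.Services
Imports System.Web.Services.Protocols

' Hypothetical service; the namespace URI is a placeholder.
<WebService(Namespace:="http://example.com/orders/")> _
Public Class OrderService
    Inherits WebService

    ' Pure helper: returns the stored ID, or reports that it has expired.
    Public Shared Function RequireCustomerId(ByVal stored As Object) As Integer
        If stored Is Nothing Then
            Throw New SoapException("Session expired; call BeginOrder again.", _
                                    SoapException.ClientFaultCode)
        End If
        Return CInt(stored)
    End Function

    ' Method A: stores the state that method B depends on.
    <WebMethod(EnableSession:=True)> _
    Public Sub BeginOrder(ByVal customerId As Integer)
        Session("CustomerId") = customerId
    End Sub

    ' Method B: must cope with the state having timed out in the meantime.
    <WebMethod(EnableSession:=True)> _
    Public Function SubmitOrder() As String
        Dim customerId As Integer = RequireCustomerId(Session("CustomerId"))
        Return "Order submitted for customer " & customerId.ToString()
    End Function

    ' Logout: lets a well-behaved client release its state right away
    ' instead of leaving it around until the timeout interval expires.
    <WebMethod(EnableSession:=True)> _
    Public Sub Logout()
        Session.Abandon()
    End Sub
End Class
```

The client still has to be written to recognize the expiration error and start over from method A.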

The alternative to storing object state in the Session collection is to maintain all state on the client. The client calls method A on an XML Web service, which returns some sort of object containing internal state information. The client then passes this object back to the service when it calls method B, and the server picks the state information out of the parameter and picks up where it left off. The advantages are obvious: neither party has to worry about session state timing out, and the server doesn’t have to buy space to store state for its thousands of clients. The disadvantages are also obvious: all the required state information has to travel over the network on each call, increasing the bandwidth requirements. Also, the client can examine and potentially change the state information between calls, so you’ll have to encrypt it if it contains any sensitive information. This is how Web Forms controls maintain their view state from one round trip to another. (You can download a chapter about Web Forms controls from the book’s Web site, http://www.introducingmicrosoft.net.)

Alternatively, all state can live on the client side, which has its own set of problems.
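A client-held state design could be sketched like this. The SearchState type, the SearchService class, its method names, the page size, and the namespace URI are all hypothetical, and the sketch only validates the state it gets back; it does not encrypt it.

```vbnet
Imports System
Imports System.Web.Services

' Hypothetical state object that travels to the client and back.
Public Class SearchState
    Public Query As String
    Public NextIndex As Integer
End Class

' Hypothetical service; the namespace URI is a placeholder.
<WebService(Namespace:="http://example.com/search/")> _
Public Class SearchService
    Inherits WebService

    Private Const PageSize As Integer = 10

    ' Method A: starts the work and hands the state to the client.
    <WebMethod()> _
    Public Function BeginSearch(ByVal query As String) As SearchState
        Dim state As New SearchState()
        state.Query = query
        state.NextIndex = 0
        Return state
    End Function

    ' Method B: picks up where it left off, using only the state
    ' that the client passes back. No Session object is involved.
    <WebMethod()> _
    Public Function Advance(ByVal state As SearchState) As SearchState
        ' The client could have altered the state, so check it first.
        If state Is Nothing OrElse state.NextIndex < 0 Then
            Throw New ArgumentException("Invalid search state.")
        End If
        ' A real service would fetch one page of results here.
        Dim nextState As New SearchState()
        nextState.Query = state.Query
        nextState.NextIndex = state.NextIndex + PageSize
        Return nextState
    End Function
End Class
```

Neither method enables session state, so nothing can time out on the server; the price is that the state object crosses the network on every call.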

Tips from the Trenches

While XML Web service objects are inherently stateless, remoting objects can be made inherently stateful. If you are designing a Web service for communicating between one Microsoft system and another, and you find yourself doing a lot of work to manage state, have a good look at using remoting instead of Web services. Chapter 10 discusses remoting.

Handling Exceptions

I discussed error handling at great length in Chapter 2, explaining the problems of communicating an error from method to caller and why .NET structured exception handling was such an advance. However, structured exception handling only works in .NET because all programs use the same MSIL implementation internally. The whole point of XML Web services is to receive and service calls from clients that are not running on the .NET Framework, so how can we signal errors to them?

Signaling errors to XML Web service clients can be a problem.

The one commonality that client and server have in XML Web services is SOAP. The SOAP specification provides for error handling with an element called <Fault>. When the XML Web service throws an exception of any type, ASP.NET catches it. ASP.NET then copies the exception’s message into a SOAP <Fault> element, and passes it back that way.

XML Web service exceptions are transmitted to clients by a SOAP <Fault> element.

I’ve written a sample program that causes an exception by accessing a nonexistent session property (I didn’t turn on session state, as I discussed in the previous section). Accessing the state object throws a null reference exception (System.NullReferenceException), which ASP.NET catches and converts into a SOAP <Fault> element, as shown in Listing 4-8.

Listing 4-8: SOAP <Fault> example.

<soap:Envelope>
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Server</faultcode>
      <faultstring>Server was unable to process request--&gt; Object reference not set to an instance of an object</faultstring>
      <detail />
    </soap:Fault>
  </soap:Body>
</soap:Envelope>

Once the SOAP fault returns to the client, it’s entirely up to the client to determine how to proceed with it. Any proxy generator worthy of the name will read the SOAP fault and translate it back into the language of the proxy. For example, a Java proxy will turn the SOAP fault into some kind of Java exception. The .NET proxy generator code turns it into a .NET exception of type System.Web.Services.Protocols.SoapException. Our client program catches it, as it would any other type of exception, and tries to make sense of it. The Message property of the exception contains the original message from the server with the location of the error on the server. The other properties of the exception refer to where it originated on the client, which doesn’t tell us anything about the source of the error on the server side. So if you want the client to understand the cause of the error, to be able to differentiate, say, a login failure from a file-not-found error, you’ll have to come up with some sort of encoding system to place that information in this string.

It is up to the client to map the SOAP <Fault> element into its own programming language.
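One possible encoding system for the message string is sketched below. The bracketed-code convention, the FaultCodes module, and the LOGIN_FAILED code are all invented for illustration; nothing in ASP.NET or SOAP prescribes them. The sample message uses the wrapper text shown in Listing 4-8.

```vbnet
Imports System

Module FaultCodes
    ' Hypothetical convention: "[CODE] human-readable text".
    ' The server would use this string as its exception message.
    Function EncodeError(ByVal code As String, ByVal text As String) As String
        Return "[" & code & "] " & text
    End Function

    ' Returns the code embedded in a fault message, or "" if none is
    ' found. It searches the whole string because ASP.NET may put its
    ' own text in front of the original message.
    Function DecodeErrorCode(ByVal message As String) As String
        If message Is Nothing Then Return ""
        Dim startIndex As Integer = message.IndexOf("[")
        If startIndex < 0 Then Return ""
        Dim endIndex As Integer = message.IndexOf("]", startIndex)
        If endIndex < 0 Then Return ""
        Return message.Substring(startIndex + 1, endIndex - startIndex - 1)
    End Function

    Sub Main()
        ' Simulates what a client might find in the exception's Message.
        Dim received As String = "Server was unable to process request--> " & _
                                 EncodeError("LOGIN_FAILED", "Bad password")
        Console.WriteLine(DecodeErrorCode(received))   ' prints LOGIN_FAILED
    End Sub
End Module
```

A .NET client would call DecodeErrorCode on the Message property inside its Catch block for SoapException and branch on the result; clients in other languages would need an equivalent parser for the same convention.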

What if the client hasn’t accessed the service with SOAP? Then throwing an exception from your XML Web service code causes ASP.NET to send back an uninformative HTTP error screen. If we want to be able to signal useful error information to HTTP clients, we have to use special-case return values. Deciding whether or not this is cost effective is a marketing call. I’d suggest probably not.

Replacing the Namespace URI

You’ve probably seen the message on the XML Web service test page that says, “This web service is using http://tempuri.org/ as its default namespace. Recommendation: Change the default namespace before the XML Web service is made public.” In the wild and woolly world that is the Web, it’s entirely possible that other vendors will create services with the same object and method name as yours, such as Service1.GetTime. We’d like to have some way of definitively marking our service so that callers who care are certain that they have ours and not someone else’s. You can argue that the URL at which a client accesses a service definitively differentiates yours from another’s, but we’d like a way of identifying the service itself regardless of its location. When the secret police kick down your door at 4:00 AM shouting, “KGB! You are Ivan Ivanovich?” you’d very much like to be able to say, “No, he lives down the hall.”

Note

In Visual Basic .NET 2003, you won’t see this warning because the project generator now adds an attribute specifying a different value (which you probably still ought to change to make it unique). C# still omits any attribute, so you do see this warning.

You do this by placing a namespace URI on your service by means of a .NET programming attribute. A namespace URI is nothing more nor less than a big, long, fancy string that is the unique family name of your service and all other services that you create. You are promising callers that any service and method name within this namespace is unique, that you are in control of the development of services marked with this namespace URI, and that you’ve made sure that there aren’t any conflicts. A proxy wanting to access an XML Web service can specify the namespace URI in addition to the service and method name to make even more sure that he’s gotten what he asks for. The server will compare the requested URI with the URI that it actually has, and throw an exception if they don’t match.

An XML Web service contains a namespace URI to distinguish it from other XML Web services.

The namespace URI can be anything at all that you’d like it to be, as long as it’s unique in the universe—for example “PoliticalFamilyFromHyannisportMassWithMoreMoneyThanBrains.” It often begins with your Web address because you own it and no one else should be using it, just as you often use your e-mail address as a login ID. Nothing prevents someone else from using yours just to mess you up, but they probably won’t. No one accesses the namespace URI as a Web address during an XML Web service call, so there doesn’t have to be anything there and often isn’t. However, some vendors do indeed put product or service information at the Web address used as a namespace URI so that potential customers who see the code know where to contact them.

Once you’ve chosen a namespace URI, you place it on your XML Web service by using the attribute Namespace on your XML Web service class declaration, as shown in Listing 4-9. The namespace URI appears in your XML Web service’s WSDL file, and thus makes its way into generated proxy clients.

You assign a namespace URI to your service with a .NET programming attribute.

Listing 4-9: Placing a namespace on your XML Web service.

<WebService(Namespace:="http://www.rollthunder.com/")> _
Public Class Service1

Note that specifying the namespace of the service the client is accessing does not provide any real check on the identity of the server. It only specifies who the server must claim to be, without requiring proof that the server really is that particular one, sort of like a limousine driver carrying a sign with the expected passenger’s name on it but not asking for a picture ID. If your client really cares about authenticating the server (as opposed to the other way around, which is much more common), then you’ll have to work with some sort of trust provider certificate program.

Tips from the Trenches

My customers suggest that you replace the namespace URI sooner rather than later, ideally when you first generate the service project, before you get a whole lot of clients pointed at the temporary URI and have to change them all.