XML and JMS | Professional JMS

The previous sections have looked at XML, and what it has to offer messaging applications in general. In this section, we will examine some of the special considerations associated with bringing JMS and XML together. Most application designers will attempt to decouple the two, because there really should not be any dependencies created between the two technologies. For example, most JMS-related application code should simply operate on the javax.jms.Message interface. The application can extract (or insert) text strings as appropriate. These strings may or may not be XML - the handler should encapsulate this so that the system could be easily adapted to accommodate non-XML content, if the need arises. Similarly, the XML code should be logically distinct from the JMS code. This allows reuse of message handlers in non-JMS applications (for example, as a web service handler). We will explore some of these design principles while developing a content-based router using JMS and XML in a later section.

Carrying this idea one step further, consider what happens when you need reliability in your messaging application, but cannot always delegate it to the JMS infrastructure. This might be the case when a message has to transit across multiple intermediates, some of which may not have access to robust messaging software for transport. In these cases, it becomes particularly important to design the message format to accommodate the fields needed to support reliable transfer across unreliable transports. Essentially, this means re-implementing some of what we take for granted from JMS providers. We will examine these interesting issues in the following sections.

Of course, there are also times when coupling between XML and JMS is entirely appropriate. In these cases, we should attempt to leverage what each technology does best. The following sections will consider these issues as well.

Requirements of Robust Messaging Systems

As messaging systems become more widely used, greater demands are made of them. Many of these demands need to be anticipated early because they can have a profound effect on message design. A typical requirement a mature messaging architecture must satisfy is durable message transport. This provides message resilience by ensuring that data in transit is not lost if a system - or even the entire network - fails. How to exchange errors and remote exceptions between the server and the client is another important point of consideration. The need to expose ACID (the transaction properties Atomicity, Consistency, Isolation, and Durability) server transactions to a client is also common. That is, the system provides a client API (possibly integral to the messaging infrastructure, and potentially interfacing with a downstream XA-protocol interface exported by a data store) to begin, end, monitor, commit, and roll back a distributed transaction. This interface may further coordinate two-phase commits among multiple data stores, thus making the messaging infrastructure into a resource manager. A good example is a bank doing a credit/debit between two partners. Either both operations must succeed or the transaction as a whole must roll back, undoing the changes on both systems.
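The credit/debit example can be made concrete with a toy two-phase commit. The sketch below is only an illustration of the voting idea: the `Participant` interface, the `Account` class, and the `run` method are hypothetical names invented here, and none of it is the XA or JTA API. It also ignores crashes between the two phases, which a real coordinator has to handle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical participant in a distributed transaction (not an XA or JTA type). */
interface Participant {
    boolean prepare();   // phase 1: vote yes (true) or no (false)
    void commit();       // phase 2: make the staged change permanent
    void rollback();     // phase 2: discard the staged change
}

/** Toy account that stages a balance change until it is told to commit. */
class Account implements Participant {
    private long balance;
    private final long delta;
    private boolean prepared;

    Account(long openingBalance, long delta) {
        this.balance = openingBalance;
        this.delta = delta;
    }

    @Override public boolean prepare() {
        prepared = balance + delta >= 0;   // vote no if the change would overdraw
        return prepared;
    }

    @Override public void commit() {
        if (prepared) {
            balance += delta;
            prepared = false;
        }
    }

    @Override public void rollback() {
        prepared = false;
    }

    long balance() {
        return balance;
    }
}

class TwoPhaseCommitSketch {
    /** Returns true if every participant voted yes and all changes were committed. */
    static boolean run(List<? extends Participant> participants) {
        List<Participant> voted = new ArrayList<>();
        for (Participant p : participants) {
            if (!p.prepare()) {
                // One "no" vote aborts the whole transaction.
                for (Participant v : voted) {
                    v.rollback();
                }
                p.rollback();
                return false;
            }
            voted.add(p);
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }
}
```

A debit of 30 from an account holding 100 and a matching credit both commit; the same debit from an account holding 10 causes both sides to be left unchanged.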

Reliability is also important: there must be a guarantee that messages will arrive at their destination, even if the receiving system (or any intermediate hop) is currently down. Furthermore, only a single copy of the message must arrive. Imagine a bank system where a single debit message arrived twice. Scalability and fault tolerance are always necessary; previous discussions touched on these issues. Advanced queuing functions that relate to Quality of Service (QoS) need to be considered in a mature and robust messaging environment. Some examples are:

Message prioritization
Time-to-live; cancellation
Globally unique message identification
Association linking responses to requests when communication channels are multiplexed
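To show how these items might surface in an application-level message, here is a minimal sketch of an envelope that carries a priority, an expiry time, a globally unique identifier, and a correlation identifier, together with a filter that drops expired or duplicate deliveries. The class and field names (`Envelope`, `DuplicateFilter`, and so on) are assumptions made for illustration; a JMS provider exposes the equivalent information through its own headers.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical application-level envelope carrying the QoS fields listed above. */
final class Envelope {
    final String messageId;       // globally unique identifier
    final String correlationId;   // id of the request this message answers, or null
    final int priority;
    final long expiresAtMillis;   // absolute expiry time derived from a time-to-live
    final String body;

    private Envelope(String correlationId, int priority, long expiresAtMillis, String body) {
        this.messageId = UUID.randomUUID().toString();
        this.correlationId = correlationId;
        this.priority = priority;
        this.expiresAtMillis = expiresAtMillis;
        this.body = body;
    }

    static Envelope request(String body, int priority, long ttlMillis, long nowMillis) {
        return new Envelope(null, priority, nowMillis + ttlMillis, body);
    }

    /** Builds a response linked to this request, so replies can share one channel. */
    Envelope reply(String body, long ttlMillis, long nowMillis) {
        return new Envelope(this.messageId, this.priority, nowMillis + ttlMillis, body);
    }

    boolean isExpired(long nowMillis) {
        return nowMillis >= expiresAtMillis;
    }
}

/** Accepts each live message once; the set grows without bound in this sketch. */
final class DuplicateFilter {
    private final Set<String> seen = new HashSet<>();

    boolean accept(Envelope envelope, long nowMillis) {
        if (envelope.isExpired(nowMillis)) {
            return false;                      // time-to-live has elapsed
        }
        return seen.add(envelope.messageId);   // false if this id was already seen
    }
}
```

A production filter would have to bound or persist the set of seen identifiers; that is left out to keep the idea visible.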

One needs to consider session establishment, maintenance, identification, and teardown on a remote server, as well as distributed naming for remote services (for example, associating a well-known service name with a transport instance, such as a particular message queue). More complicated message flow models (beyond hub-and-spoke and even Publish/Subscribe) begin to become necessary as messaging architectures grow. High-level infrastructure that can choreograph complicated lattice communication flows, where messages fan out and consolidate, becomes crucial as messaging architectures expand in the enterprise. Workflow applications exhibit this kind of behavior.

For example, consider an electronic form that requires several authorizations. Authorizations can be made serially (that is, A gives their approval, followed by B,…); however, if each is independent, they could proceed in parallel. The final step then involves consolidating all the individual authorizations into a completed electronic form. Messaging applications often exhibit similar behavior, and orchestrating these can be extremely complex. This demands complicated, high-level software that coordinates transport infrastructure and applications (such as IBM's MQSeries Workflow).
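The parallel form of the authorization flow can be sketched in a few lines using `java.util.concurrent.CompletableFuture` from current JDKs. This is a single-process illustration of fan-out and consolidation only; the `approve` method is a hypothetical stand-in for whatever each approver actually does, and a workflow product would coordinate the same pattern across queues and machines.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class ApprovalFanOutSketch {

    /** Hypothetical approver: approves any non-empty form. */
    static boolean approve(String approver, String form) {
        return !form.isEmpty();
    }

    /** Fans the form out to every approver in parallel, then consolidates the votes. */
    static boolean authorize(String form, List<String> approvers) {
        // Fan out: collecting to a list starts every request before any result is awaited.
        List<CompletableFuture<Boolean>> pending = approvers.stream()
                .map(approver -> CompletableFuture.supplyAsync(() -> approve(approver, form)))
                .collect(Collectors.toList());

        // Consolidate: the form is complete only if every approver said yes.
        return pending.stream().allMatch(CompletableFuture::join);
    }
}
```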

Application-based vs. MOM-based Robust Messaging

The point of enumerating these issues is to draw attention to a fundamental consideration when designing a messaging framework. Since we know that eventually, the above items will become the requirement list for any rigorous and comprehensive messaging deployment, the question becomes where to address them.

One strategy is to push these functions into the application, thereby greatly increasing its complexity and demanding that considerable message handling logic be built into it (a task as daunting as it is unrewarding). To accomplish this, however, the message must have visible fields built in to enable messages to be reliable, traceable, expirable, etc. The message, therefore, becomes quite complex.

This strategy allows the application to make use of a transport that is very fast, simple, and lightweight, such as connection-oriented TCP sockets or connectionless UDP datagrams. XML is effective in such messaging applications because it is flexible enough to readily accommodate the additional fields needed to support implementation of some of the above demands. We will shortly see how to accomplish this when we examine some existing XML message frameworks like BizTalk and ebXML. It is a basic requirement to decouple these messages from transport. Therefore, they must support the message hooks to implement reliability in the application itself.

Alternatively, the messaging infrastructure can assume full responsibility for all these items. This is exactly what robust Message-Oriented Middleware (MOM) attempts to do: take on the burden, in its role as transport, of all or many of the items listed above. There are many implementations of MOM, including BEA MessageQ, IBM's MQSeries, Microsoft's MSMQ, and Progress's SonicMQ. They take the focus of the above items away from the developer and turn it over to the MOM administrator. This results in much more flexible application code. Changes to systems or network topology become administrative changes, not coding changes. JMS is effective as an interface to messaging systems because it abstracts away the complexity of each different messaging product, helping developers decouple from a particular proprietary implementation.

It may seem like we have two competing agendas here, but in reality, when brought together, XML and JMS are actually very complementary. The next sections look at some of the considerations in making XML and JMS work effectively together.

JMS Headers and Properties

One of the fundamental design considerations when using JMS and XML is what data should go into JMS properties and what should go into the message. Each implementation will have slightly different requirements, and so will have to take different approaches. However, there are some important issues to consider.

Data put into JMS properties can be used for message selection. Fields that identify remote services by logical name are good candidates for inclusion in a JMS property. In the web services arena, it is common to use a URL to identify target services. A JMS equivalent could be a specific topic or queue; however, it is more common to publish multiple services off a single topic or queue. JMS defines a SQL-like syntax for selecting messages based on property contents that is highly efficient (depending on the service provider's implementation). This is available to both the Point-to-Point and Publish/Subscribe messaging domains. Pushing these data into the message will force the handler to examine each message individually. The message would have to be parsed and queried, probably using an XML query language like XPath, which has a very different syntax and query model from JMS message selection. This forces Java programmers to learn about XML, which may not be a bad thing, but it may also not be an effective allocation of resources. It is also not particularly efficient, particularly for simple routing applications.
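To make the cost of body-based routing concrete, the sketch below extracts a service name from the message text with the JAXP XPath API found in current JDKs. The document layout (`/order/header/service`) and the property name `service` are assumptions invented for this example. With a JMS property, the same decision could be left to the provider through a selector, as noted in the comment.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class BodyRoutingSketch {

    // With a JMS property the provider can do the filtering, for example:
    //   queueSession.createReceiver(queue, "service = 'billing'");
    // Here the same field lives in the body, so every message must be parsed.

    /** Returns the text of /order/header/service, or "" if the element is absent. */
    static String targetService(String xmlText) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate("/order/header/service",
                    new InputSource(new StringReader(xmlText)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Message body could not be queried", e);
        }
    }
}
```

Every received message pays for a full parse before the handler even knows whether the message is relevant, which is the inefficiency described above.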

On the other hand, using JMS properties couples you to JMS. The JMS specification does not try to define how JMS properties are exposed to non-JMS enabled applications. Thus, in a business-to-business MOM-messaging environment, where not all the applications are using JMS, this could create problems. Furthermore, if messages leave the control of the MOM software, for example, in a multi-hop system where the final hop uses simple HTTP as transport, then the JMS headers will be lost.

Note

It is important to remember that two systems that communicate using messaging may not be directly connected. For example, a message may have to traverse multiple intermediates to get to its destination. The term "multi-hop" refers to these systems. Each hop may use different transports, such as HTTP, or even a MOM system that is not JMS-compliant, and thus may drop JMS headers and properties. The presence of intermediates will have an effect on the choice of what goes into the message proper, and what goes into JMS properties and headers. Some of the messaging frameworks we will examine in Appendix C have very specific rules regarding how intermediates treat headers.

One solution to the problem of preserving JMS headers and properties would be to render these into the XML message for transport to non-JMS clients. This would be particularly appropriate if the messaging format used one of the detailed messaging frameworks like ebXML or BizTalk, which already define XML elements for most of the important JMS properties that implement reliable messaging.
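As a rough illustration of this idea, the following sketch copies a map of header values into a header section of an XML envelope and places the original payload beneath it. The element names (`envelope`, `header`, `field`, `body`) are invented for this example and do not follow ebXML, BizTalk, or any other framework; the map stands in for values that would be read from a `javax.jms.Message`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class HeaderEnvelopeSketch {

    /** Wraps payloadXml in a hypothetical envelope that also carries the given headers. */
    static String wrap(Map<String, String> headers, String payloadXml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element envelope = doc.createElement("envelope");
            doc.appendChild(envelope);

            // One <field name="..."> element per header or property.
            Element header = doc.createElement("header");
            envelope.appendChild(header);
            for (Map.Entry<String, String> entry : headers.entrySet()) {
                Element field = doc.createElement("field");
                field.setAttribute("name", entry.getKey());
                field.setTextContent(entry.getValue());
                header.appendChild(field);
            }

            // Import the parsed payload so it travels as real markup, not escaped text.
            Element body = doc.createElement("body");
            envelope.appendChild(body);
            Document payload = builder.parse(new InputSource(new StringReader(payloadXml)));
            body.appendChild(doc.importNode(payload.getDocumentElement(), true));

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException
                | TransformerException e) {
            throw new IllegalArgumentException("Could not build envelope", e);
        }
    }
}
```

The last JMS-aware hop would call something like `wrap` before handing the text to a non-JMS transport, and the first JMS-aware hop on the other side would reverse the process.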

The JMSType header, which is a standard JMS header, has potential use to identify XML message schemas residing in a repository. The JMS specification does not specify how a repository is to be implemented, and warns that service providers may have their own implementation of this header.

What follows is an example, using simple synchronous queuing, of sending a message with an accompanying schema entry:

    String text = "<?xml version=\"1.0\"?> ...";
    TextMessage textMessage = queueSession.createTextMessage();
    textMessage.setText(text);
    textMessage.setJMSType("http://someInternetAddress/example.dtd");
    queueSender.send(textMessage);

On the receiving end:

    TextMessage textMessage = (TextMessage) queueReceiver.receive();
    String text = textMessage.getText();
    String schema = textMessage.getJMSType();

The receiver would make use of this URI in validation (the URI could identify a DTD, XML Schema, or even some other schema definition language). Of course, the message itself could also carry the URI in its DOCTYPE declaration.
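One way the receiver might act on such a URI is sketched below, using the JAXP validation API from current JDKs and W3C XML Schema rather than a DTD. The in-memory `REPOSITORY` map, the URI `urn:example:order-schema`, and the tiny order schema are all assumptions made for illustration; the value passed as `schemaUri` would be whatever `getJMSType()` returned.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaLookupSketch {

    /** Hypothetical schema for a one-item order. */
    static final String ORDER_XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"order\"><xs:complexType><xs:sequence>"
          + "<xs:element name=\"item\" type=\"xs:string\"/>"
          + "</xs:sequence></xs:complexType></xs:element>"
          + "</xs:schema>";

    /** Stand-in for a schema repository keyed by the URI carried in JMSType. */
    static final Map<String, String> REPOSITORY =
            Map.of("urn:example:order-schema", ORDER_XSD);

    /** Returns true only if the URI is known and the text validates against its schema. */
    static boolean isValid(String xmlText, String schemaUri) {
        String xsd = REPOSITORY.get(schemaUri);
        if (xsd == null) {
            return false;                       // unknown schema: reject
        }
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xmlText)));
            return true;
        } catch (SAXException e) {
            return false;                       // malformed or invalid document
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real handler would cache the compiled `Schema` objects instead of rebuilding them for every message.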

JMS Body and XML Documents

XML documents are best transported using the JMS javax.jms.TextMessage interface. As an example, here is a simple XML text string sent to a queue:

    String text = "<?xml version=\"1.0\" encoding=\"UTF-8\"?> ... ";
    TextMessage textMessage = queueSession.createTextMessage();
    textMessage.setText(text);
    queueSender.send(textMessage);

The synchronous receiver is similar:

    QueueReceiver queueReceiver = queueSession.createReceiver(queue);
    TextMessage textMessage = (TextMessage) queueReceiver.receive();
    String text = textMessage.getText();

To prepare a message for parsing with DOM and SAX:

    // A Reader can be consumed only once, so build a fresh InputSource for each parse.
    try {
        // Parse with DOM
        Document doc = documentBuilder.parse(new InputSource(new StringReader(text)));
        // or parse with SAX
        xmlReader.parse(new InputSource(new StringReader(text)));
    } catch (java.io.IOException e) {
        System.err.println("IO Exception");
    } catch (org.xml.sax.SAXException e) {
        System.err.println("SAX Exception");
    }

Translation from an existing SAX or DOM representation to a text string takes more work. For DOM, it is reasonably straightforward to build a recursive utility method that visits every node in depth-first order, rendering the results into a java.io.StringWriter. If you build your own, the DOMEcho class, included in the JAXP distribution, is a good place to start. For SAX, consider the XMLWriter class from David Megginson's site at http://megginson.com/Software/index.html.
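A minimal sketch of such a recursive DOM utility follows. It is not the DOMEcho class; the names `DomWalkSketch`, `render`, and `parse` are invented here. It handles only elements, attributes, and text, and silently skips comments, processing instructions, and the DOCTYPE, so treat it as a starting point rather than a complete serializer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomWalkSketch {

    /** Renders a DOM subtree as text by visiting nodes depth-first. */
    static String render(Node node) {
        StringWriter out = new StringWriter();
        write(node, out);
        return out.toString();
    }

    private static void write(Node node, StringWriter out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                writeChildren(node, out);
                break;
            }
            case Node.ELEMENT_NODE: {
                out.write("<" + node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.write(" " + attr.getNodeName() + "=\""
                            + escape(attr.getNodeValue()).replace("\"", "&quot;") + "\"");
                }
                out.write(">");
                writeChildren(node, out);
                out.write("</" + node.getNodeName() + ">");
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE: {
                out.write(escape(node.getNodeValue()));
                break;
            }
            default:
                // Comments, processing instructions, and the DOCTYPE are skipped here.
                break;
        }
    }

    private static void writeChildren(Node node, StringWriter out) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            write(child, out);
        }
    }

    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Convenience helper for trying the sketch: parses text into a DOM Document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }
}
```

The string returned by `render` can be passed directly to `TextMessage.setText`.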

Another option is to use the JAXP TrAX API with a null transform, writing into a StreamResult object. A strategy similar to this - but with a real XSLT transform - is employed in the router example below.
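A minimal sketch of the null transform approach, assuming the `javax.xml.transform` packages that ship with current JDKs: a `Transformer` created without a stylesheet copies its source to its result unchanged, so pointing it at a `DOMSource` and a `StreamResult` wrapped around a `StringWriter` yields the text form. The class and method names are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NullTransformSketch {

    /** Serializes a DOM Document to a String using a transformer with no stylesheet. */
    static String toXmlString(Document doc) {
        try {
            // newTransformer() with no arguments performs an identity (null) transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    /** Convenience helper for trying the sketch: parses text into a DOM Document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }
}
```

Compared with a hand-written tree walker, this leaves escaping and encoding to the transformer.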

Another parsing and manipulation technology to keep an eye on is JDOM (http://www.jdom.org). This is a 100% pure Java API that approaches XML from the perspective of the Java programmer and leverages the features of the language. The focus is to make working with XML simple and intuitive. JDOM greatly simplifies translation of text representations of XML to/from models like DOM and SAX. This will be a hot technology.