What Do We Need in a Solution? | Using XML with Legacy Business Applications

When we ask, "What do we need?" we're talking about requirements. There are two types of requirements: functional and nonfunctional (the latter are also known as quality requirements or system constraints). The former have to do with what the system is supposed to do. The latter have to do with how it does what it does. Both are important, and both determine the overall approach of this book.

Beyond the overall dictate of solving the problem, two distinct sets of requirements are imposed on the solution by technical end users on the one hand and by application developers on the other. I'll talk a little later about why I'm dealing with both, but for now if you don't care about the other group you can just skip the relevant paragraphs.

Functional Requirements

The technical end user who has an application that doesn't speak XML more than likely needs the solution to do one or more of the following:

Convert an XML-formatted file to a flat file
Convert a flat file to an XML-formatted file
Convert an XML-formatted file to a CSV file
Convert a CSV file to an XML-formatted file
Convert an EDI-formatted file to an XML-formatted file
Convert an XML-formatted file to an EDI-formatted file

A user may want the solution to support other formats, but CSV, flat file, and EDI should handle most cases. For example, an end user may also need to get data out of a database (relational, hierarchical, or otherwise) and put it into an XML format, or go back the other way. Sorry, but these types of problems are a bit beyond our scope. I will, however, give in Chapter 12 an overview of some approaches for doing things like this. When I present the approaches, you'll understand why problems like this exceed our scope a bit.
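To make the CSV-to-XML case from the list above concrete, here is a minimal Python sketch using only the standard library. The function name csv_to_xml, the element names (Order, Line), and the sample data are all hypothetical, chosen just for illustration; the sketch also assumes that the CSV file has a header row whose column names are valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="Rows", row_tag="Row"):
    """Convert CSV text with a header row into an XML string.

    Each CSV record becomes one row element, and each column becomes a
    child element named after the column header (assumed to be a valid
    XML name).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element(root_tag)
    for record in reader:
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            ET.SubElement(row, column).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data: two order lines.
sample = "PartNumber,Quantity\nA100,5\nB200,12\n"
xml_text = csv_to_xml(sample, root_tag="Order", row_tag="Line")
print(xml_text)
```

Going the other way (XML to CSV) would walk the row elements and write one CSV record per row. Flat files and EDI need more care, because field positions and segment rules have to be described somewhere outside the data itself.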

The developer who has an application that doesn't speak XML has some fairly simple requirements:

Enable the application to export data in an XML format
Enable the application to import data from an XML format

The word "an" is very important here. I don't say "a specific XML format"; I say "an" XML format. Why? Because once we have data in an XML format, it is fairly easy to convert it to another XML format. You want to make your life simple? Don't try to anticipate all the XML formats your users will want. Give them one, and let them convert. I'll have more to say about this later.
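As a rough illustration of why one XML format is enough, the following Python sketch renames the elements of one hypothetical format to produce another. Both formats, the RENAME mapping, and the convert function are assumptions made up for this example. For anything beyond simple renaming, a transformation language such as XSLT is commonly used instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical format exported by the application.
exported = (
    "<Order><Line><PartNumber>A100</PartNumber>"
    "<Quantity>5</Quantity></Line></Order>"
)

# Hypothetical mapping from the application's element names to the
# names another party expects.
RENAME = {
    "Order": "PurchaseOrder",
    "Line": "Item",
    "PartNumber": "SKU",
    "Quantity": "Qty",
}


def convert(element):
    """Return a copy of element with tags renamed according to RENAME.

    Tags that are not in the mapping are copied unchanged.
    """
    target = ET.Element(RENAME.get(element.tag, element.tag), element.attrib)
    target.text = element.text
    target.tail = element.tail
    for child in element:
        target.append(convert(child))
    return target


converted = ET.tostring(convert(ET.fromstring(exported)), encoding="unicode")
print(converted)
```

The application only has to know about its own format; the mapping to someone else's format lives outside the application and can change without touching it.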

Those are the primary functional requirements. Both groups may have a secondary functional requirement to be able to validate the format of an XML document (which conventionally is referred to as an instance document). When validating an instance document, the format of the document is usually defined in an XML 1.0 Document Type Definition (DTD) or a schema written in the World Wide Web Consortium (W3C) XML Schema language. Both of these define things such as the Element and Attribute names used in the document and the overall structure of the document. The W3C XML Schema language allows documents to be defined in much greater detail than is possible with a DTD. These will both be discussed in Chapter 4.

Validation may need to occur either before the document is read or after it is produced. For going to and from EDI, end users may want to check that the EDI-formatted file complies with the relevant standard either before they read it or after they write it. The approach presented in this book satisfies most such requirements.
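To give a feel for what checking a document before reading it involves, here is a small Python sketch. It is only a stand-in: Python's standard xml.etree.ElementTree parser does not validate against a DTD or a W3C XML Schema, so the sketch checks that the document is well-formed and then applies a couple of hand-written structural rules. The element names, the rule list, and the check_document function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: every Line element must contain these children.
REQUIRED_LINE_CHILDREN = ("PartNumber", "Quantity")


def check_document(xml_text):
    """Return a list of problems found; an empty list means the checks passed.

    This is not DTD or schema validation. It checks well-formedness and
    a few hard-coded structural rules only.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return [f"not well-formed: {error}"]

    problems = []
    if root.tag != "Order":
        problems.append(f"unexpected root element: {root.tag}")
    for index, line in enumerate(root.findall("Line"), start=1):
        for name in REQUIRED_LINE_CHILDREN:
            if line.find(name) is None:
                problems.append(f"Line {index} is missing {name}")
    return problems


good = "<Order><Line><PartNumber>A100</PartNumber><Quantity>5</Quantity></Line></Order>"
bad = "<Order><Line><PartNumber>A100</PartNumber></Line></Order>"
print(check_document(good))
print(check_document(bad))
```

A DTD or schema expresses rules like these declaratively, so a validating parser can enforce them without any hand-written checks.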

Some other functional requirements have to do with enabling business applications to support the electronic exchange of business data with other organizations. Those requirements are within the scope of this book, and I'll discuss them in Chapters 12 and 13.

So, we should be pretty clear by now about what we want the solution to do. We now need to figure out how we want it to do it.

Nonfunctional Requirements: Good, Fast, and Cheap

Remember the old joke about good, fast, and cheap, that you can pick only two out of the three? Well, with any luck we might be able to get you all three. Good is pretty relative (as well as unspecific), so I won't go into a lot of detail here. The same applies for fast. So let's talk about cheap and a few other typical requirements.

If you're an end user and you're paying a relatively small price for this book instead of buying a full-featured Enterprise Application Integration (EAI) system like Mercator, we understand each other. If you find the word cheap a bit harsh, think of it this way: You can't cost-justify purchasing a full-featured package to convert a few files on an infrequent basis. You can't justify spending more on utilities than you did on the business application. Take your pick, but cheap is fine with me. If you're a developer, you want to support XML without spending the next release's entire new features budget. Return on investment is what it's all about. The bottom line is the bottom line.

So, beyond cheap, what's important? Simple usually goes hand in hand with cheap, as does easy. Beyond these biggies, there are several more that developers and information technology staff usually care about. I'll assume that you do too.

Maintainability: We need to be able to keep the solution running and add new functionality without too much trouble. Code that we add to the application should be easy to fix and enhance.
Reusability: When we develop code to solve one problem, we would like to be able to easily reuse it to solve similar problems. We don't want to keep reinventing the wheel or repeating the same snippets of code in several routines.
Flexibility: We shouldn't be straitjacketed into a single approach that handles just one type of situation. We should be able to modify the solution to handle changing circumstances.
Modularity: The solution should be nicely broken up into manageable pieces. This helps with reusability and maintainability.
Portability (or platform independence): The solution should work on a variety of platforms so we can move it to a different platform with few, if any, changes.

However, we do need to draw the line somewhere and note some things that are of lesser importance. In this book I assume that you don't care as much about the following requirements.

Performance: In performance we're concerned with resource usage such as CPU time required to perform conversions, memory usage, and disk space usage. This is most often a concern when a system is used for many purposes simultaneously or is otherwise somewhat constrained in a key resource such as disk space or memory. Due to the relatively low price of hardware these days, and due to the fact that the typical machine is a stand-alone PC, we're not going to be overly concerned about performance.
Real-time processing: If we think of performance as applying to resource usage, then real-time processing (as opposed to old-fashioned batch processing) is more relevant to the elapsed time the system takes to complete an action. We aren't concerned with a user clicking a mouse or hitting a key and waiting for an immediate response. We are concerned with batch, file-oriented interfaces.
Support for Web services, frameworks, and other bleeding-edge technology: We're talking legacy applications here. If you want to make your applications do WSDL, UDDI, .NET (to be discussed in Chapters 12 and 13), or whatever, that's not cheap, simple, and easy. If you're concerned about this type of stuff, you need something beyond this book. Many of the techniques discussed in this book are quite relevant to getting data into and out of the XML formats typically used by Web services, but what happens after the handoff to the Web service is beyond our scope.

Summing it all up into just a few words, in this book I present a fairly simple, pragmatic approach that can be implemented relatively quickly and at fairly low cost.