Where to Use XSLT


This section identifies the tasks XSLT is good at and, by implication, the tasks for which a different tool would be more suitable. I also look at alternative ways of using XSLT within the overall architecture of your application.

As I discussed at the beginning of the chapter, there are two main scenarios for using XSLT transformations: data conversion and publishing. We'll consider each of them separately.

Data Conversion Applications

Data conversion is not something that will go away just because XML has been invented. Even though an increasing number of data transfers between organizations or between applications within an organization are likely to be encoded in XML, there will still be different data models, different ways of representing the same thing, and different subsets of information that are of interest to different people (recall the example at the beginning of the chapter, where we were converting music between different XML representations and different presentation formats). So, however enthusiastic we are about XML, the reality is that there are going to be a lot of comma-separated-values files, EDI messages, and any number of other formats in use for a long time to come.

When you have the task of converting one XML data set into another, then XSLT is an obvious choice (Figure 1-4).

Figure 1-4

It can be used for extracting the data selectively, reordering it, turning attributes into elements or vice versa, or any number of similar tasks. It can also be used simply for validating the data. As a language, XSLT 1.0 was best at manipulating the structure of the information as distinct from its content: it was a good language for turning rows into columns, but for string handling (for example, removing any text that appears between square brackets) it was rather laborious compared with a language like JavaScript or Perl that offered support for regular expressions. This has changed considerably in version 2.0, and now there are few XML transformation tasks that I wouldn't tackle using XSLT.
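
To make the string-handling comparison concrete, here is a minimal sketch of the bracket-removal task in a language with regular expressions. I have used Python as a stand-in for the JavaScript or Perl mentioned above; the function name strip_bracketed is my own, and the pattern assumes brackets are never nested. In XSLT 2.0 the same job can be done with the XPath 2.0 replace() function and a similar pattern.

```python
import re

# Matches one bracketed run: "[", any characters except "]", then "]".
BRACKETED = re.compile(r"\[[^\]]*\]")


def strip_bracketed(text):
    """Remove every [bracketed] run from text (nested brackets are not handled)."""
    return BRACKETED.sub("", text)
```

Note that only the bracketed text is removed; any whitespace around it is left alone, so a real conversion would probably normalize spaces afterwards.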

XSLT is also useful for converting XML data into any text-based format, such as comma-separated values, or various EDI message formats (Figure 1-5). Text output is really just like XML output without the tags, so this creates no particular problems for the language.

Figure 1-5

Perhaps more surprising is that XSLT can often be useful to convert from non-XML formats into XML or something else (Figure 1-6). In this case you'll need to write some kind of parser that understands the input format, but you would have had to do that anyway. The benefit is that once you've written the parser, the rest of the data conversion can be expressed in a high-level language. This separation also increases the chances that you'll be able to reuse your parser next time you need to handle that particular input format. I'll show you an example in Chapter 11, where the input is a rather old-fashioned and distinctly non-XML format widely used for exchanging data between genealogy software packages. It turns out that it isn't even necessary to write the data out as XML before using the XSLT stylesheet to process it: all you need to do is to make your parser look like an XML parser, by making it implement one of the standard parser interfaces, SAX or DOM. Most XSLT processors will accept input from a program that implements the SAX or DOM interfaces, even if the data never saw the light of day as XML.

Figure 1-6
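
The idea of a parser that "looks like an XML parser" can be sketched as follows. This is only an illustration under stated assumptions: the input is a simplified, hypothetical line format of the form "level TAG value" (not the genealogy format itself), the names parse_level_lines and level_lines_to_xml are mine, and I have used Python's xml.sax module because it mirrors the SAX event interface. The parser never builds XML text; it simply calls the SAX event methods on whatever handler it is given.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def parse_level_lines(text, handler, root="records"):
    """Feed SAX events to `handler` for a level-numbered, non-XML text format.

    Each line is "<level> <TAG> [value]"; a deeper level means a child element.
    """
    no_attrs = AttributesImpl({})
    open_tags = []  # stack of currently open element names, outermost first
    handler.startDocument()
    handler.startElement(root, no_attrs)
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.split(None, 2)
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        # Close every element at this level or deeper before opening a new one.
        while len(open_tags) > level:
            handler.endElement(open_tags.pop())
        handler.startElement(tag, no_attrs)
        open_tags.append(tag)
        if value:
            handler.characters(value)
    while open_tags:
        handler.endElement(open_tags.pop())
    handler.endElement(root)
    handler.endDocument()


def level_lines_to_xml(text):
    """Serialize the event stream, purely to make the events visible."""
    out = io.StringIO()
    parse_level_lines(text, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()
```

Here the handler happens to be a serializer, but the point of the design is that the consumer of the events could be anything that accepts SAX input, so the data need never exist as an XML file.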

One caveat about data conversion applications: today's XSLT processors all rely on holding all the data in memory while the transformation is taking place. The tree structure in memory can be anything up to ten times the original data size, so if you have 512MB of memory, I wouldn't advise tackling a transformation larger than 50MB unless you do some performance tests first. Even at this size, a complex conversion can be quite time-consuming; it depends very much on the processing that you actually want to do.

One way around this is to split the data into chunks and convert each chunk separately, assuming, of course, that there is some kind of correspondence between chunks of input and chunks of output. But when this starts to get complicated, there comes a point where XSLT is no longer the best tool for the job. You would probably be better off loading the data into an XML database such as Tamino or Xindice, and using the database query language to extract it again in a different sequence.
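
The chunking approach might look something like the sketch below. It assumes that each chunk is a direct child of the document element and that chunks can be converted independently; the function convert_in_chunks, the element names, and the lambda that stands in for the real per-chunk conversion are all hypothetical. It uses Python's standard xml.etree.ElementTree.iterparse so that only one record is held in memory at a time.

```python
import io
import xml.etree.ElementTree as ET


def convert_in_chunks(source, record_tag, convert):
    """Yield convert(record) for each <record_tag> child of the document element."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the document element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            yield convert(elem)
            root.clear()  # discard processed records so memory use stays bounded


# Usage: the lambda is a placeholder for the real per-chunk transformation.
sample = io.BytesIO(
    b"<sales><sale id='1'><amount>10</amount></sale>"
    b"<sale id='2'><amount>25</amount></sale></sales>"
)
rows = list(convert_in_chunks(sample, "sale",
                              lambda e: (e.get("id"), e.findtext("amount"))))
```

Anything that needs to look across chunks, such as sorting or grouping the whole data set, does not fit this pattern, which is where the database approach becomes more attractive.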

If you need to process large amounts of data serially, for example extracting selected records from a log of retail transactions, then an application written using the SAX interface might take a little longer to write than the equivalent XSLT stylesheet, but it is likely to run many times faster. Very often the combination of a SAX filter application to do simple data extraction, followed by an XSLT stylesheet to do more complex manipulation, can be the best solution in such cases.
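
As a rough sketch of the first stage of such a pipeline, the filter below passes through only the records of interest and drops everything else as the events stream past. The element name transaction, the store attribute, and the class and function names are assumptions made for the example; the code uses Python's xml.sax module, whose handler interface follows SAX. The filtered output is what you would then hand to the stylesheet for the more complex manipulation.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator


class RecordFilter(ContentHandler):
    """Forward the document element plus the <transaction> records for one store."""

    def __init__(self, downstream, store):
        super().__init__()
        self.downstream = downstream  # any SAX ContentHandler
        self.store = store
        self.depth = 0        # current element nesting depth in the input
        self.keeping = False  # True while inside a record that was selected

    def startDocument(self):
        self.downstream.startDocument()

    def endDocument(self):
        self.downstream.endDocument()

    def startElement(self, name, attrs):
        self.depth += 1
        if self.depth == 2:  # a record: decide once, from its start tag
            self.keeping = (name == "transaction"
                            and attrs.get("store") == self.store)
        if self.depth == 1 or self.keeping:
            self.downstream.startElement(name, attrs)

    def endElement(self, name):
        if self.depth == 1 or self.keeping:
            self.downstream.endElement(name)
        if self.depth == 2:
            self.keeping = False
        self.depth -= 1

    def characters(self, content):
        if self.keeping:
            self.downstream.characters(content)


def extract_store(xml_bytes, store):
    """Return the filtered log as XML text, ready for a later transformation."""
    out = io.StringIO()
    handler = RecordFilter(XMLGenerator(out, encoding="utf-8"), store)
    xml.sax.parseString(xml_bytes, handler)
    return out.getvalue()
```

Because the decision is made from the record's start tag alone, the filter never needs to hold more than the current event, which is what makes the serial approach fast on large logs.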

Publishing

The difference between data conversion and publishing is that in the former case, the data is destined for input to another piece of software, while in the latter case it is destined to be read (you hope) by human beings. Publishing in this context doesn't just mean lavish text and multimedia; it also means data: everything from the traditional activity of producing and distributing reports so that managers know what's going on in the business, to producing online phone bills and bank statements for customers, and rail timetables for the general public. XML is ideal for such data publishing applications, as well as the more traditional text publishing, which was the original home territory of SGML.

XML was designed to enable information to be held independently of the way it is presented, which sometimes leads people into the fallacy of thinking that using XML for presentation details is somehow bad. Far from it: if you were designing a new format for downloading fonts to a printer today, you would probably make it XML-based. Presentation details have just as much right to be encoded in XML as any other kind of information. So we can see the role of XSLT in the publishing process as converting data-without-presentation to data-with-presentation, where both are, at least in principle, XML formats.

The two important vehicles for publishing information today are print-on-paper and the Web. The print-on-paper scene is the more difficult one, because of the high expectations of users for visual quality. XSL Formatting Objects attempts to define an XML-based model of a print file for high-quality display on paper or on screen. Because of the sheer number of parameters needed to achieve this, the standard has taken a while to come to maturity. But the Web is a less demanding environment, where all we need to do is convert the data to HTML and leave the browser to do the best it can on the display available. HTML, of course, is not XML, but it is close enough so that a simple mapping is possible. Converting XML to HTML is the most common application for XSLT today. It's actually a two-stage process: first convert to an XML-based model that is structurally equivalent to the target HTML, and then serialize this in HTML notation rather than strict XML.

The emergence of XHTML 1.0, of course, tidies up this process even further, because it is a pure XML format. But the emergence of XSLT has arguably reduced the need for XHTML, because once HTML becomes merely a transient protocol used to get information from the XSLT engine to the Web browser, its idiosyncrasies cease to matter so much.

When to Do the Conversion?

The process of publishing information to a user is illustrated in Figure 1-7.

Figure 1-7

There are several points in such a system where XSLT transformations might be appropriate:

  • Information entered by authors using their preferred tools, or customized form-filling interfaces, can be converted to XML and stored in that form in the content store.

  • XML information arriving from other systems might be transformed into a different flavor of XML for storage in the content store. For example, it might be broken up into page-size chunks.

  • XML can be translated into HTML on the server when a user requests a page. This can be controlled using technology such as Java servlets or JavaServer Pages. On a Microsoft server you can invoke the transformation from script on ASP.NET pages.

  • XML can be sent down to the client system and translated into HTML within the browser. This can give a highly interactive presentation of the information and remove a lot of the processing load from the server, but it relies on all the users having a browser that can do the job.

  • XML data can also be converted into its final display form at publishing time and stored as HTML within the content store. This minimizes the work that needs to be done at display time and is ideal when the same displayed page is presented to very many users.

There isn't one right answer, and often a combination of techniques may be appropriate. Conversion in the browser is an attractive option when XSLT is widely available within browsers, but we still don't have universal availability of XSLT 1.0 in all browsers, let alone 2.0. Even when client-side conversion is done, there may still be a need for some server-side processing to deliver the XML in manageable chunks and to protect secure information. Conversion at delivery time on the server is a popular choice, because it allows personalization, but it can be a heavy overhead for sites with high traffic. Some busy sites have found that it is more effective to generate a different set of HTML pages for each section of the target audience in advance, and at page request time to do nothing more than select the right preconstructed HTML page.

It's time now to take a closer look at the relationship between XSLT and XPath and other XML-related technologies.




XSLT 2.0 Programmer's Reference