The adoption of XML technologies is primarily being driven by enterprise applications, which are loosely defined as mission-critical, business-class software applications, as opposed to desktop applications geared for home users. Now that I’ve introduced the core technologies, this section provides an overview of the various industry focus areas that are being overhauled thanks to new XML technologies.
By using SOAP, XML Schema, and other related technologies (collectively referred to as Web services), companies can expose programmatic access to business logic over the Web. This business logic can subsequently be accessed by any device, remote process, desktop application, or Web application. Web services are transforming the World Wide Web from simple business-to-consumer applications, which require human interaction, to a distributed federation of loosely coupled services. A key area for growth will be enhancing business-to-business (B2B) application infrastructure, enabling the creation of virtual marketplaces, as well as streamlined order processing and back-office operations.
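To make the idea of programmatic access more concrete, here is a minimal sketch of how a client might build a SOAP 1.1 request message with Python's standard library. The service namespace, the `GetOrderStatus` operation, and the `OrderId` element are hypothetical names invented for illustration; only the SOAP envelope namespace comes from the SOAP 1.1 specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ORDER_NS = "http://example.com/orders"  # hypothetical service namespace

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("ord", ORDER_NS)


def build_order_status_request(order_id: str) -> str:
    """Return a SOAP envelope asking a (hypothetical) service for an order's status."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation and its parameter live in the service's own namespace.
    request = ET.SubElement(body, f"{{{ORDER_NS}}}GetOrderStatus")
    ET.SubElement(request, f"{{{ORDER_NS}}}OrderId").text = order_id
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_order_status_request("A-1001"))
```

Because the message is plain XML, any device or process that can produce and parse text like this can call the service, regardless of the language it is written in.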
The Web in its current form is growing at an astounding rate, with an estimated base of 3 billion HTML documents distributed across the world. These documents are primarily intended to be read by people through a browser. Because it could take thousands of years to read through these documents manually, software has to do much of the reading, and it becomes increasingly important to preserve a document’s semantics. The semantics provide the context or meaning of a document, allowing you (or a program) to better understand it. Contrast this with brute-force search engines that determine a document’s relevancy to a particular subject simply by calculating the number of times a keyword occurs.
Although search engines such as Google.com and Alltheweb.com have developed impressive algorithms for making sense of the vast amount of data on the Web, computers in general have quite a tough time deciphering the billions of documents out there. The challenge comes from all the miscellaneous things that clutter the actual document content: navigation bars, graphics, advertisements, applets, Flash files, and other things meant to enhance the human user experience but that don’t count as actual page content as far as a Web-bot is concerned.
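The weakness of keyword counting can be shown with a small sketch. The scoring function and the sample page text below are made up for illustration and are not how any particular search engine works; the point is only that clutter such as navigation links can dominate a naive count.

```python
import re


def keyword_score(text: str, keyword: str) -> int:
    """Count whole-word, case-insensitive occurrences of keyword in text."""
    pattern = rf"\b{re.escape(keyword)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


# Hypothetical page: navigation and advertising text surrounding the real content.
page_clutter = "XML books | XML tools | XML jobs | Buy XML training now"
page_content = "This recipe explains how to bake sourdough bread at home."
full_page = page_clutter + " " + page_content

if __name__ == "__main__":
    print(keyword_score(full_page, "xml"))     # 4: every hit comes from the clutter
    print(keyword_score(page_content, "xml"))  # 0: the actual content is about bread
```

A Web-bot that scores the whole page would rank this recipe as relevant to XML, which is exactly the kind of mistake that markup preserving the document's semantics helps avoid.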
XSL/XSLT stylesheets are commonly used by Web developers to separate data from presentation markup on a Web page. This separation can greatly simplify the indexing, sharing, and retrieving of data on the Web by both people and Web-bots. XSLT also enables the internationalization and localization of Web sites and the delivery of personalized Web site content to Web-enabled mobile devices. XSLT has the potential to radically change Web development. It’s likely to become a critical skill of future Web developers.
The publishing and news industries regularly work with volumes of documents, typically published in multiple output forms, most commonly in print and Web-based media. The goal has long been a single document source from which all derivative output could be generated. XML has many benefits as a storage format for the rich, structured content represented in printed publications and Web articles. Industry-standard XML vocabularies such as DocBook (an XML vocabulary for describing technical publications) and NewsML (an XML vocabulary for describing news articles) facilitate the preservation of the semantics and context of information and allow for efficient retrieval and repurposing of content. Using XSLT, an XML document can be transformed into several XML-based document-layout languages, including XSL Formatting Objects (XSL-FO), Scalable Vector Graphics (SVG), and XHTML. An XSL-FO formatter can then render the result as PDF or PostScript, which are not themselves XML formats.
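The single-source idea can be sketched in a few lines. This is not XSLT itself (Python's standard library does not include an XSLT processor); it is a stand-in that uses `xml.etree.ElementTree` to show the same principle of one XML source feeding several outputs. The `article`, `title`, and `para` element names are a simplified, hypothetical vocabulary rather than real DocBook or NewsML.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source document.
ARTICLE_XML = """
<article>
  <title>XML in the Enterprise</title>
  <para>XML separates content from presentation.</para>
  <para>One source can feed many outputs.</para>
</article>
""".strip()


def to_xhtml(article: ET.Element) -> str:
    """Render the article as an XHTML-style fragment for the Web."""
    div = ET.Element("div")
    ET.SubElement(div, "h1").text = article.findtext("title")
    for para in article.findall("para"):
        ET.SubElement(div, "p").text = para.text
    return ET.tostring(div, encoding="unicode")


def to_plain_text(article: ET.Element) -> str:
    """Render the same article as plain text, for example for an e-mail digest."""
    lines = [article.findtext("title").upper(), ""]
    lines.extend(para.text for para in article.findall("para"))
    return "\n".join(lines)


if __name__ == "__main__":
    source = ET.fromstring(ARTICLE_XML)
    print(to_xhtml(source))
    print(to_plain_text(source))
```

In a real publishing pipeline, each rendering function would typically be replaced by an XSLT stylesheet, so that adding a new output format means adding a stylesheet rather than touching the source document.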
Document management refers to storing a company’s documents in a document repository, thereby preserving the knowledge of a company. Document management systems have been around for a while, long before the relatively recent standardization of the XML specification. Historically, these systems have been both proprietary and costly to implement. Today, XML technologies make document management systems far easier to implement through the use of one or more industry-standard XML languages (or tag sets) for storing a particular type of information, an XML editor, and a database or XML server capable of storing XML documents. This standards-based approach to document management has the potential to unlock content that is currently held in proprietary content management systems.
The back-end processing systems of large companies are a heterogeneous mix of various distributed application platforms (J2EE, CORBA, DCOM, and so on). These applications are written in different programming languages, run on different operating systems, and use different data repositories. XML is being used in many areas to integrate enterprise applications. Most commonly, an XML document is employed as an intermediary format (or adaptor) between two or more systems. For example, an Electronic Data Interchange (EDI) message may be encoded into an XML format and then sent off to another application or database that processes the XML message. Software vendors such as Microsoft and Oracle have been adding support to their database product offerings to deal with such scenarios.
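As a rough illustration of XML as an intermediary format, the sketch below converts a simplified EDI-style message into an XML document that another application could parse. The delimiters (`~` between segments, `*` between elements), the sample purchase-order data, and the `ediMessage`, `segment`, and `element` names are all assumptions made for this example; real EDI standards and commercial adaptors are considerably more involved.

```python
import xml.etree.ElementTree as ET


def edi_to_xml(message: str, segment_sep: str = "~", element_sep: str = "*") -> ET.Element:
    """Wrap each segment and data element of an EDI-style message in XML."""
    root = ET.Element("ediMessage")  # hypothetical wrapper element
    for raw_segment in message.split(segment_sep):
        raw_segment = raw_segment.strip()
        if not raw_segment:
            continue  # skip the empty piece after the final terminator
        segment_id, *fields = raw_segment.split(element_sep)
        segment = ET.SubElement(root, "segment", id=segment_id)
        for position, value in enumerate(fields, start=1):
            ET.SubElement(segment, "element", pos=str(position)).text = value
    return root


if __name__ == "__main__":
    # Made-up sample: a header segment and one line item.
    sample = "BEG*00*NE*PO12345~PO1*1*10*EA*9.95~"
    document = edi_to_xml(sample)
    print(ET.tostring(document, encoding="unicode"))
```

Once the message is in XML, the receiving system no longer needs to understand the sender's delimiter conventions; it only needs an XML parser and agreement on the element names.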
Microsoft, the world’s largest software company, produces hundreds of products, Web-based services, and server applications. The challenge for the recently released .NET Framework is to make all these pieces work together and expose the combined functionality through Web services. The Microsoft .NET product vision encompasses various application servers, SQL Server 2000, the Windows operating system, multiple programming languages, mobile devices, and more. SOAP and XML Schema bring all the pieces together, allowing tremendous application interoperability. XML development skills are likely to become an essential requirement for Microsoft developers wanting to access the various .NET products and services.
The Java platform enables platform interoperability at a binary level. Java programs are compiled into an intermediate form (bytecode) and subsequently executed on any operating system that has a native Java Virtual Machine. The combination of Java and XML has the potential to improve interoperability by further decoupling the application from the underlying data storage format and opening up the application’s communication protocol; these are important milestones in realizing true application portability. At the time of this writing, Sun has recently released several powerful new standards for Web services, XML bindings, and XML messaging, which are intended to greatly improve application interoperability.
XML technologies are interrelated and pervasive across a wide spectrum of industry applications. Figure 1-2 graphically summarizes some of the most common uses and their relationships.
Figure 1-2: Common use of XML technologies in the enterprise.
It’s safe to say that any IT professional writing any kind of code has had (or will eventually have) the need to edit and work with XML documents effectively at some level. Clearly, the ability to develop using XML is a critical skill in today’s job market!