Chapter 1. XML: Extending the Enterprise |
At the beginning of this chapter we outlined several areas in which XML's impact has been felt. To understand the changes that are occurring in today's software world, it's helpful to look at XML in the context of three revolutions in which XML is playing a major role. As Figure 1.7 illustrates, the three areas of impact are data, which XML frees from the confines of fixed, program-dependent formats; architecture, with a change in emphasis from tightly coupled distributed systems to a more loosely coupled confederation based on the Web; and software, with the realization that software evolution is a better path to managing complexity than building monolithic applications. In the following sections we'll explore each in more detail.

Figure 1.7. The three XML revolutions: data, architecture, and software.
The Data Revolution
Prior to XML, data was largely proprietary, closely tied to the applications that understood how it was formatted and how to process it. Now, XML-based industry-specific data vocabularies provide alternatives to specialized Electronic Data Interchange (EDI) solutions by facilitating B2B data exchange and by serving as a messaging infrastructure for distributed computing.
XML's strength is its data independence. XML is pure data description, not tied to any programming language, operating system, or transport protocol. In the grand scheme of distributed computing this is a radical idea. The implication is that we don't require lock-in to programmatic infrastructures to make data available to Web-connected platforms. In effect, data is free to move about globally without the constraints imposed by tightly coupled, transport-dependent architectures. XML's sole focus on data means that a variety of transport technologies may be used to move XML across the Web. As a result, protocols such as HTTP have had a tremendous impact on XML's viability and have opened the door to alternatives to CORBA, RMI, and DCOM, each of which ties data exchange to its own wire protocol and object infrastructure. XML avoids such lock-in by focusing on data and leaving transport and other issues to supporting technologies.
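To make this concrete, here is a small, hypothetical purchase-order document. The element names and values are invented for illustration; the point is that nothing in it ties the data to a particular language, platform, or transport:

    <?xml version="1.0" encoding="UTF-8"?>
    <purchaseOrder date="2002-03-15">
      <buyer>Acme Corp</buyer>
      <item sku="K-29" quantity="12">
        <description>Stainless widget</description>
        <unitPrice currency="USD">4.95</unitPrice>
      </item>
    </purchaseOrder>

Any system with an XML parser can consume this document, whether it arrives over HTTP, a message queue, or e-mail.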
XML: Origin and Cultures

Although XML is a relatively new technology, its lineage extends back over several decades. Approved by the W3C in 1998, XML is an effort to simplify the Standard Generalized Markup Language (SGML), which, until XML, was the ISO standard for defining data vocabularies. Technically, XML is a subset of SGML designed to facilitate the exchange of structured documents over the Internet. Although SGML, which became an ISO standard in 1986, has been widely used by organizations seeking to structure their documents and documentation (for example, the General Motors parts catalog), its pre-Web complexity has been the main stumbling block to its widespread use and acceptance by the Web community. Figure 1.8 illustrates the relationship between SGML and XML and shows some of the languages derived from each.

Figure 1.8. XML is the successor to SGML. Both are metalanguages that are used to define new data-oriented vocabularies.
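Being a metalanguage means that XML does not fix a vocabulary; it provides the machinery for defining one. As a sketch, a minimal document type definition (DTD) for the hypothetical purchase-order vocabulary shown earlier might read:

    <!ELEMENT purchaseOrder (buyer, item+)>
    <!ATTLIST purchaseOrder date CDATA #REQUIRED>
    <!ELEMENT buyer (#PCDATA)>
    <!ELEMENT item (description, unitPrice)>
    <!ATTLIST item sku CDATA #REQUIRED
                   quantity CDATA #REQUIRED>
    <!ELEMENT description (#PCDATA)>
    <!ELEMENT unitPrice (#PCDATA)>
    <!ATTLIST unitPrice currency CDATA #IMPLIED>

A validating parser can then check any purchase-order document against these rules; industry groups use the same mechanism to publish shared B2B vocabularies.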
The designers of XML took the best parts of SGML and, based on their experience, produced a technology comparable to SGML but much simpler to use. In fact, simplicity and ease of programming were requirements imposed by the W3C on the Working Group responsible for the final XML specification.

The Code, Data, and Document Cultures
To understand XML's impact on the computing world, it's useful to place XML in perspective. As Figure 1.9 shows, XML comes out of a document culture that is distinct from the code and data cultures that are the hallmarks of the mainstream computer industry. The code culture is characterized by a focus on programming languages, beginning with FORTRAN and evolving through Algol to C, C++, and Java. The data culture is characterized by COBOL, data processing, and databases. Both the data and code cultures carry with them a built-in propensity to view the world through either a code or a data lens. From a code perspective, data is something to be transported by procedure calls. From a data perspective, data is something to be stored in databases and manipulated.

Figure 1.9. Evolution: from programming languages to objects to components.
The late 1980s and early 1990s saw code and data combine in the form of object-oriented languages such as C++, Smalltalk, Java, and Object COBOL. And yet object technology was only a partial answer. As practitioners in the data world had long realized, transactions (the ability to update multiple databases in an all-or-none manner) are essential to serious industrial-strength enterprise applications. Because component frameworks provide transactions as a service to applications regardless of language origins, the playing field quickly shifted from objects to components. Thus infrastructures such as CORBA, DCOM, and Enterprise JavaBeans (EJB) provide interconnection, security, and transaction-based services for extending the enterprise. In the mid-1990s, components were the only way to extend legacy systems. However, XML changed the rules of the game.
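To see what all-or-none means in practice, consider a funds transfer that must update two account rows together. The following sketch uses plain JDBC rather than a component framework; the connection URL, table, and amounts are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class AllOrNone {
        public static void main(String[] args) throws SQLException {
            // Hypothetical in-memory database; any JDBC source works the same way.
            try (Connection con =
                     DriverManager.getConnection("jdbc:hsqldb:mem:demo", "sa", "")) {
                con.setAutoCommit(false);  // group the updates into one transaction
                try (Statement st = con.createStatement()) {
                    st.executeUpdate(
                        "UPDATE accounts SET balance = balance - 100 WHERE id = 1");
                    st.executeUpdate(
                        "UPDATE accounts SET balance = balance + 100 WHERE id = 2");
                    con.commit();          // both updates take effect together...
                } catch (SQLException e) {
                    con.rollback();        // ...or neither does
                    throw e;
                }
            }
        }
    }

Component frameworks such as EJB offer the same guarantee declaratively and can extend it across multiple databases through distributed transaction coordinators.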
XML's emergence from the data-oriented document culture has forced a rethinking of application development, particularly for those accustomed to building applications from a code-based perspective. What XML brings to the computing world is a technology that allows data to be freed from the constraints created by code-centric infrastructures. Instead of requiring data to be subordinated to parameters in a procedure call, XML now permits data to stand on its own. More radically, it allows code to be treated as data, which has been the driving force behind using XML for remote procedure calls. As Figure 1.10 illustrates, XML offers an alternative to both EDI and technologies such as CORBA, RMI, and DCOM that lock data transfer into underlying networks and object infrastructures. It is this change in perspective that is driving the widespread use of XML across the entire computing industry and opening up new patterns of interaction, including Web services.

Figure 1.10. XML in combination with Web protocols allows data to be independent of network, programming language, or platform.
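XML-RPC, an early embodiment of this code-as-data idea, encodes a procedure call as an ordinary XML document carried over HTTP POST. The method name and parameter below are invented for illustration:

    <?xml version="1.0"?>
    <methodCall>
      <methodName>inventory.checkStock</methodName>
      <params>
        <param><value><string>K-29</string></value></param>
      </params>
    </methodCall>

Because the call is just text, any intermediary that can handle XML over HTTP can route, log, or inspect it; no shared object infrastructure is needed on the wire.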
The Architectural Revolution
Taken together, these XML-based technology initiatives open up new possibilities for distributed computing that leverage the existing infrastructure of the Web, and they mark a transition from object-based distributed systems to architectures based on Web services that can be discovered, accessed, and assembled using open Web technologies. The focal point of this change in architectural thinking has been a move from tightly coupled systems based on established infrastructures such as CORBA, RMI, and DCOM, each with its own transport protocol, to loosely coupled systems riding atop standard Web protocols such as HTTP. Although the transport protocols underlying CORBA, RMI, and DCOM provide for efficient communication between nodes, their drawback is their inability to communicate with other tightly coupled systems or directly with the Web.
Loosely coupled Web-based systems, on the other hand, provide what has long been considered the Holy Grail of computing: universal connectivity. Using TCP/IP as the transport, systems can establish connections with each other using common, open Web protocols. Although it is possible to build software bridges linking tightly coupled systems with each other and the Web, such efforts are not trivial and add another layer of complexity on top of an already complex infrastructure. As Figure 1.11 shows, the loose coupling of the Web makes possible new system architectures built around message-based middleware or less structured peer-to-peer interaction.

Figure 1.11. XML in combination with Web protocols has opened up new possibilities for distributed computing based on message passing as well as peer-to-peer interaction.
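As a concrete sketch of this loose coupling, the fragment below POSTs an XML message to an HTTP endpoint using only the standard Java library. The URL and payload are hypothetical; any HTTP-capable receiver, written in any language and running on any platform, could accept the message:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class XmlOverHttp {
        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint; any server that accepts XML over HTTP will do.
            URL url = new URL("http://example.com/orders");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/xml");

            String xml = "<?xml version=\"1.0\"?>"
                       + "<order><item sku=\"K-29\" quantity=\"12\"/></order>";
            try (OutputStream out = conn.getOutputStream()) {
                out.write(xml.getBytes("UTF-8"));  // the wire carries plain text
            }
            System.out.println("HTTP status: " + conn.getResponseCode());
        }
    }

Contrast this with CORBA or DCOM, where both ends must share an object infrastructure before any application data can flow.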
The Software Revolution
XML is also part of a revolution in how we build software. During the 1970s and 1980s, software was constructed as monolithic applications built to solve specific problems. The problem with large software projects is that, by trying to tackle multiple problems at once, the software is often ill-suited to adding new functionality and adapting to technological change. In the 1990s a different model for software emerged based on the concept of simplicity. As Figure 1.12 illustrates, instead of trying to define all requirements up front, this new philosophy was built around the concept of creating building blocks capable of combination with other building blocks that either already existed or were yet to be created.

Figure 1.12. The software revolution: simplicity and collaboration.
A case in point is the Web. After decades of attempts to build complex infrastructures for exchanging information across distributed networks, the Web emerged from an assemblage of foundational technologies such as HTTP, HTML, browsers, and a longstanding networking technology known as TCP/IP that had been put in place in the 1970s. Figure 1.13 illustrates how the Web as we know it was not something thought out in strict detail. Each of the contributing technologies focused on doing one thing well without inhibiting interconnection with other technologies. The essential idea was to maximize the possibility of interaction and watch systems grow. The result is the Web, a product of the confluence of forces that include the Internet, HTML, and HTTP. Let's now look at how these same forces of combination and collaboration are driving the revolution in software.

Figure 1.13. The Web itself is an example of combinatoric simplicity in action. HTTP, a simple protocol, combines with browser technology to give us the Web as we know it today.
Software and Surprise
One byproduct of this new way of thinking about software combination is the element of surprise. Conventional software built around an ongoing series of requirements poses few surprises (except if it comes in under budget and on time). The Web, however, was different. It took just about everyone by surprise. Like a chemical reaction, the elements reacted in combination, giving rise to totally new structures.
Another example of the power of combination and surprise is Napster, a radical way of distributing music over the Internet. Napster relied on peer-to-peer connectivity rather than centralized distribution. Napster wasn't the result of a team of dedicated software professionals, but was created by a twenty-something upstart drawing on the power of assembly. The music industry will never be the same.

Combination and Collaboration
The power of combination is finding its way not only into software construction but up the development chain to software specification and design. Rather than hoping to meet the needs of users, design is now more collaborative, bringing in stakeholders early to ensure maximum feedback and the benefits of collaborative thinking. Figure 1.14 illustrates how this collaborative model is used by the W3C, the Internet Engineering Task Force, and Sun in its Java Community Process.

Figure 1.14. Part of the software revolution includes collaboration on specification and design. Examples include the Internet Engineering Task Force, the W3C, and Sun's Java Community Process.