Data Documents


In this application category, XML serves as the data encoding framework for software communication. Such applications include interchangeable files, data integration, and remote interface applications. There are also opportunities to create new classes of Internet applications to solve specialized problems. Electronic software distribution (ESD), in which customers install a software program from the Internet instead of from a CD-ROM, is one example. Remote help desks, where a technical support representative remotely diagnoses problems with a computer's software configuration, are another. In fact, the Web services protocol stack itself is an excellent example of a Data Document application. Data Document applications are therefore particularly vital to the success of XML as a distributed computing platform.

In all these cases, two computers have to exchange a series of messages over the Internet or through a filesystem to accomplish the task. One of the barriers to implementing such protocols is agreeing on data formats. XML schemas provide a straightforward method of formally defining such formats. Another implementation barrier is building the software to encode and decode data and then deliver the results to the main application for processing. By using publicly available XML engines, software developers can avoid writing much of this code themselves. More important, they are much less likely to encounter errors in the encoding-decoding logic, because a publicly available engine has probably undergone far more testing than a custom engine.
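As a rough illustration of how little encoding and decoding code remains once an off-the-shelf engine is used, the sketch below relies on the XML parser that ships with Python. The `InstallRequest` record and the element names (`installRequest`, `product`, `version`, `licenseKey`) are hypothetical, invented here for an electronic software distribution message; they do not come from any published schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class InstallRequest:
    """Hypothetical application-level record for a software distribution message."""
    product: str
    version: str
    license_key: str


def encode(req: InstallRequest) -> bytes:
    """Encode the record as an XML Data Document using the stock engine."""
    root = ET.Element("installRequest")
    ET.SubElement(root, "product").text = req.product
    ET.SubElement(root, "version").text = req.version
    ET.SubElement(root, "licenseKey").text = req.license_key
    # The engine handles character escaping and the XML declaration.
    return ET.tostring(root, encoding="utf-8")


def decode(document: bytes) -> InstallRequest:
    """Parse a Data Document and hand the result to the application as a record."""
    root = ET.fromstring(document)
    if root.tag != "installRequest":
        raise ValueError(f"unexpected document type: {root.tag}")
    return InstallRequest(
        product=root.findtext("product", default=""),
        version=root.findtext("version", default=""),
        license_key=root.findtext("licenseKey", default=""),
    )
```

Calling `decode(encode(request))` should return a record equal to the original, including values that contain characters such as `&` or `<`, because the engine rather than the application performs the escaping.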

Software applications are the primary consumers of Data Documents. You could define a basic XSLT stylesheet for each document type so that software engineers could view data in a format convenient for debugging. Engineers get this feature essentially free. However, it is a secondary concern when compared with the advantage they get from using off-the-shelf technology. Of course, a key requirement for achieving this benefit is access to publicly available and commercial-grade XML engines, as well as the tools for rapidly integrating them with applications written in different programming languages.

Development Process

Data Document applications are very similar to traditional applications that manage input/output with the filesystem or over a network. The primary differences lie in the generalization of the information formats that XML allows. Moreover, the nature of Data Document applications usually makes them only one component of a larger application. They serve as the engine that enables information exchange; the higher-level application performs all the application-specific information processing. As Figure 6-3 shows, the development process for Data Document applications has the following five steps.

Figure 6-3. Development Process for Data Documents


  1. Adopt schemas. Because the primary goal of Data Document applications is to enable the exchange of information between different software applications, software developers do not usually have the luxury of designing all the schemas themselves. Most of the time, they want to exchange information either with an application that already uses particular schemas or with an application whose developers also want to have input into the schema design process. In the first case, the application developers simply adopt the schema used by the target applications. In the second case, they must work with either a formal standards body or an informal cooperative group to agree on a mutually acceptable set of schemas.

  2. Design schemas. In some cases, Data Document application developers may control all application components that exchange information. In others, their application may provide a natural focal point in a group of cooperating applications whose developers are looking for leadership in the specification of schemas. In these cases, Data Document application developers may design the necessary schemas themselves.

  3. Specify document processing. Because Data Document applications are essentially components of higher-level applications, developers must specify how the Data Document component interacts with higher-level applications. This specification may simply be an API. It may also include a set of utilities for manipulating the information contained in the Data Documents to facilitate processing by higher-level applications. For applications that use the filesystem to exchange documents, this document processing specification includes any conventions for directory and file use. For applications that use the network to exchange documents, this specification includes the messaging and network protocols as well as the allowable order of document exchange.

  4. Implement document processing. Generally, because different applications will communicate using Data Documents, there will be room for different implementations of the document processing specification. Therefore, application developers may either acquire a third-party implementation or develop one on their own. A good example of this decision is the construction of the distributed messaging component of a supply chain management package. Developers could either use an off-the-shelf SOAP implementation or write their own if they needed special features such as support for a rare transport protocol.

  5. Integrate application. Once developers have a Data Document component, they must integrate it with the higher-level application. This integration will consist of using the component's API and utilities. Some applications will use them to extract the information from Data Documents and integrate it with the data structures used by the higher-level application. Others will use them to take information from these higher-level application structures and encode it as Data Documents.
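One part of step 3, the allowable order of document exchange, can be sketched as a small table of permitted transitions that the component checks as each document arrives. The document types below (`quoteRequest`, `quote`, `order`, `orderConfirmation`) and the `ExchangeSession` class are hypothetical, chosen only to make the idea concrete; a real specification would take its document types and ordering rules from the adopted schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical exchange rules: which document type may follow which.
# None stands for "no document exchanged yet".
ALLOWED_NEXT = {
    None: {"quoteRequest"},
    "quoteRequest": {"quote"},
    "quote": {"order", "quoteRequest"},
    "order": {"orderConfirmation"},
    "orderConfirmation": set(),
}


class ExchangeSession:
    """Tracks one conversation and rejects out-of-order documents."""

    def __init__(self) -> None:
        self.last_type = None  # root element name of the last accepted document
        self.received = []     # parsed documents, in arrival order

    def accept(self, document: str) -> ET.Element:
        """Parse a document and accept it only if the exchange rules allow it."""
        root = ET.fromstring(document)
        if root.tag not in ALLOWED_NEXT.get(self.last_type, set()):
            raise ValueError(f"{root.tag!r} not allowed after {self.last_type!r}")
        self.last_type = root.tag
        self.received.append(root)
        return root
```

With these rules, a session that has accepted a `quoteRequest` and a `quote` would reject an `orderConfirmation` until an `order` has arrived. Keeping the rules in a table rather than in scattered conditionals makes the specification easy to review alongside the schemas.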

Both enterprise and vendor developers may use third-party tools and thereby skip all but the last step. These tools can't completely automate the process of mapping document data to application data, so developers may have to write some custom integration code. In most cases, developers can acquire complete protocol engines. Then they need only integrate the higher-level application logic with the protocol engine through its APIs. Of course, enabling developers to avoid all but the last step means that some vendor has completed the first four steps. This is precisely the situation that has led to the Web services movement.

Required Staff

Because the development process for a Data Document application essentially boils down to (1) format design and (2) the use of the format within different applications, projects require staff experienced in these two areas. The interesting change is in the format development roles: XML's more formal approach to format design makes a standards bearer and a format designer necessary.

  • Standards bearer. Developing Data Document components often requires cooperation with other parties that plan to use the protocol. As discussed in the Business Document section, this cooperation may take place through a formal standards body, an industry group, or informal meetings. In any case, the development team needs a representative in the cooperative development process. The standards bearer ensures that the data format schema meets the needs of the organization and that its future development plans coincide with the future development plans of the other participating organizations. The standards bearer will have a significant amount of development experience, including architectural design as well as experience cooperating with outside parties.

  • Format designer. If the Data Document application uses custom schemas, the format designer needs to design them. Whether the format schemas come from within the software development organization or from an outside organization, the format designer has to specify the document processing. This role includes designing the Data Document component APIs and specifying appropriate abstraction layers to allow replacement of messaging and network protocols as well as future extension of the component. The format designer will have experience in application design, API design, and wire protocol design.

  • Library developer. Each organization using a protocol needs an implementation of a corresponding Data Document component. During the document processing implementation phase, the library developer writes the code that implements the protocol, services API calls, and provides utilities. This implementation needs an appropriate level of abstraction so that different higher-level applications can use the library and application developers can swap the library out for a different implementation. The library developer will have experience implementing libraries in the necessary language, such as Java, C++, or C.

  • Application developer. The point of Data Document components is that developers of higher-level applications can use them to enable their applications to exchange information. Therefore, for any given Data Document application, there may be many higher-level application developers who have to integrate the corresponding component into their applications. These developers own the corresponding development tasks for their applications and need experience in the same language as the Data Document component's API.
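The abstraction layers mentioned for the format designer and library developer can be sketched as a component that depends only on a small transport interface, so that the messaging layer can be replaced without touching application code. Everything named here (`Transport`, `InMemoryTransport`, `DirectoryTransport`, `DataDocumentComponent`, and the `status` document type) is a hypothetical illustration, not an API from any particular product.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List, Protocol


class Transport(Protocol):
    """What the component needs from any messaging layer (hypothetical interface)."""

    def send(self, document: bytes) -> None: ...

    def receive_all(self) -> List[bytes]: ...


class InMemoryTransport:
    """Simple queue, useful for tests or same-process exchange."""

    def __init__(self) -> None:
        self._queue: List[bytes] = []

    def send(self, document: bytes) -> None:
        self._queue.append(document)

    def receive_all(self) -> List[bytes]:
        docs, self._queue = self._queue, []
        return docs


class DirectoryTransport:
    """Filesystem exchange: one numbered .xml file per document in a shared directory."""

    def __init__(self, directory: Path) -> None:
        self._dir = directory
        self._next = 0

    def send(self, document: bytes) -> None:
        (self._dir / f"{self._next:06d}.xml").write_bytes(document)
        self._next += 1

    def receive_all(self) -> List[bytes]:
        docs = []
        for path in sorted(self._dir.glob("*.xml")):
            docs.append(path.read_bytes())
            path.unlink()  # consume the file once it has been read
        return docs


class DataDocumentComponent:
    """API seen by application developers; it knows nothing about the transport."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def publish_status(self, status: str) -> None:
        root = ET.Element("status")
        root.text = status
        self._transport.send(ET.tostring(root, encoding="utf-8"))

    def read_statuses(self) -> List[str]:
        return [ET.fromstring(d).text or "" for d in self._transport.receive_all()]
```

An application developer would construct the component once, for example `DataDocumentComponent(DirectoryTransport(Path("exchange")))` with a directory name agreed on by both sides, and then call only `publish_status` and `read_statuses`. Swapping in a network-based transport later would require a new class with the same two methods, not changes to the application.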

None of the staff types required to implement Data Document applications is new. Format design has long been a recognized area of software development expertise. However, standards bearers, format designers, and library developers have to translate this expertise to XML. In most cases, the learning process should proceed quickly because self-describing data formats such as ASN.1 and s-expressions have existed for some time.



XML: A Manager's Guide (2nd Edition) (Addison-Wesley Information Technology Series)
ISBN: 0201770067
Year: 2002
Pages: 75
Authors: Kevin Dick
