The problem of achieving understanding among many parties is endemic to distributed computing. Over the years, people have developed many applications for distributed networks. As the reach of these networks has grown, the applications have encountered the information exchange problem. The heart of the problem is spontaneity. If people were able and willing to negotiate the format and interpretation of each message with each new party from the ground up, there would be no problem. However, the cost of such an arrangement is prohibitive on any but the smallest scale. People need a common infrastructure for describing how to conduct information exchanges that is flexible enough to serve many different domains but constrained enough so that parties can quickly agree on a limited set of parameters. The convergence of problems from multiple sources makes such a balanced solution critical in several areas.

Web Documents

The number of documents accessible via the Web is staggering. Unfortunately, it's hard to figure out what any one of the documents means without actually reading it. Search engines illustrate the difficulty. Suppose you wanted to find online versions of works by William Shakespeare. A search on "William Shakespeare" would return documents he has written, documents written about him, and documents describing people's preferred authors. In all probability, the number of documents in the latter two categories would be much greater than the number in the first.

What's the problem? There is no agreement on how to indicate what a particular word or phrase means. Therefore, documents where "William Shakespeare" means document author, document subject, and preferred author list element all look the same. Consequently, either people are unable to find what they need, or it takes much longer than they want. The problem lies with HTML, by far the most widely used means of delivering information over the Web.
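The ambiguity in the Shakespeare search can be made concrete with a small sketch. The Python below is illustrative only: the `Document` record, its `author`, `subject`, and `preferred_authors` fields, and the sample entries are all hypothetical, not part of HTML or of any real search engine. It contrasts a plain text search, which cannot tell the three meanings apart, with a search over labeled fields, which can.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    """Hypothetical record: page text plus labels saying what a name means."""
    title: str
    text: str
    author: str = ""
    subject: str = ""
    preferred_authors: List[str] = field(default_factory=list)


# Made-up sample documents, one for each meaning discussed in the text.
DOCS = [
    Document("Hamlet", "A play by William Shakespeare.",
             author="William Shakespeare"),
    Document("A Life of the Bard", "A biography of William Shakespeare.",
             author="Jane Doe", subject="William Shakespeare"),
    Document("My Reading List", "Favorites: William Shakespeare, Jane Austen.",
             author="John Roe",
             preferred_authors=["William Shakespeare", "Jane Austen"]),
]


def text_search(docs, phrase):
    """Plain text match: every document mentioning the phrase looks the same."""
    return [d.title for d in docs if phrase in d.text]


def author_search(docs, name):
    """Labeled match: only documents whose author field equals the name."""
    return [d.title for d in docs if d.author == name]


if __name__ == "__main__":
    print(text_search(DOCS, "William Shakespeare"))
    print(author_search(DOCS, "William Shakespeare"))
```

With only the text to go on, all three documents match the phrase. Once the role of the name is stated explicitly, the search can return just the work Shakespeare wrote.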
The way most people use it, HTML serves as an online page layout language. Authors place information in specific parts of the screen and control the format of information. HTML is an adequate page layout language. Web page authors have used it to create some nice looking pages. However, it offers inadequate features for describing the information itself.

Another area where HTML is inadequate is information analysis. Suppose you use a Web-based home search service to develop a list of homes that meet your criteria. Once you have the list, you might want to analyze it a number of different ways. You might want to sort the houses on the list by price, square footage, or school district. The Web site may be able to perform the transformations you want, but all the processing occurs on the Web server, which sends you formatted text. This approach limits you to the filtering options provided by the Web site and places a heavy processing load on the Web server, causing response delays. A more flexible and efficient approach would be to send the data itself to the browser and let a built-in spreadsheet give you the power to slice and dice the data any way you wished. Unfortunately, the Web server has no way to send the data to browsers in a format that it knows they can understand.

Electronic Commerce

Electronic commerce is perhaps the best example of the information exchange problem. It is clear why parties previously unknown to each other want to exchange information, and there is an obvious monetary value to the exchange. There are two fundamental types of electronic commerce: business-to-consumer and business-to-business.

In business-to-consumer commerce, the information exchange problem imposes a time penalty. As we saw in the CD-RW drive example, people can compensate for different information representations at different electronic commerce sites by manually developing knowledge of the site structure.
This process requires time and reduces the effectiveness of time-saving automated shopping agents. It also eliminates the possibility of higher-level automated tasks. What if you wanted to find CD-RW drives compatible with your existing SCSI controller card? This information is available at manufacturer sites. However, there is no standard way for you to describe the characteristics of your SCSI controller card or for manufacturers to describe compatibility information for their drives. This information is buried in paragraphs of text, so you have to read every CD-RW drive data sheet. Other barriers to information exchange make it difficult to eliminate mundane tasks such as online product registration. You have to type in your name, address, phone number, e-mail address, and other information for every product that you buy.

In business-to-business commerce, the information exchange problem imposes money and choice penalties. If two businesses want to conduct electronic commerce, they have to agree beforehand on formats for all messages and the order in which they can process these messages. Developing software code to implement every such agreement costs money. Requiring prior agreements also restricts choice. If an innovative new company that offers substantially lower prices and higher quality comes along, the enterprise is out of luck until it can reach an agreement on message processing with this new party.

The situation for business-to-business commerce becomes even more complex when more than two businesses must coordinate their actions. Suppose a company that designs clothing receives an order from a chain of retail stores. To fill the order, it needs to contract with at least one raw materials supplier, at least one manufacturing plant, and at least one shipping provider. Adding these three participants to the designer and retailer results in at least five companies participating in the process.
They all have to exchange information, and they must use the same overall plan for completing the process.

Database Access

Enterprises have accumulated large volumes of data in various databases over the years. The current problem is delivering it to people in a way that increases productivity. Certainly, it is easy for any one person to access any one database by installing the appropriate software on a client machine and executing queries against the database. However, in many cases, the information that a particular person needs to do a job resides in several different databases. Take the typical customer service representative as an example. The representative may need to help customers place orders, check on shipments, correct account information, configure products, and get replacement parts. Each of these tasks may require accessing a different database. In many cases, the representative simply cannot help customers with certain tasks. In others, it may be necessary to switch back and forth between different database access programs, writing notes on pieces of paper, to assemble the information necessary to assist the customer.

The problem is not really that the enterprise uses different database products, although that is a compounding factor. The real problem arises when employees have different models of the business than those encoded in database schemas. There is simply no way for a database designer to predict and accommodate the different uses for even a single database. With multiple databases involved, the problem gets exponentially worse. There are many examples where the difference between the user information model and the database data model hurts productivity, including sales representatives trying to check manufacturing schedules for multiple products, marketing analysts trying to assemble historical sales information for multiple products, and executives trying to analyze sales trends across divisions.
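As a rough illustration of the gap between the representative's view and the database schemas, the sketch below uses Python's standard `sqlite3` module with three in-memory databases. The table names, column names, sample rows, and the `make_db` and `customer_view` helpers are all assumptions made for this example; they do not describe any real enterprise schema.

```python
import sqlite3


def make_db(schema, insert_sql, rows):
    """Create a throwaway in-memory database standing in for a real one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Three independent databases, each with its own (hypothetical) schema.
orders_db = make_db(
    "CREATE TABLE orders (cust_id INTEGER, item TEXT)",
    "INSERT INTO orders VALUES (?, ?)",
    [(7, "router"), (7, "cable"), (9, "modem")],
)
shipping_db = make_db(
    "CREATE TABLE shipments (customer INTEGER, item TEXT, status TEXT)",
    "INSERT INTO shipments VALUES (?, ?, ?)",
    [(7, "router", "in transit")],
)
accounts_db = make_db(
    "CREATE TABLE accounts (id INTEGER, name TEXT, phone TEXT)",
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(7, "A. Customer", "555-0100")],
)


def customer_view(cust_id):
    """Assemble one representative-facing record from all three databases."""
    account = accounts_db.execute(
        "SELECT name, phone FROM accounts WHERE id = ?", (cust_id,)
    ).fetchone()
    orders = [row[0] for row in orders_db.execute(
        "SELECT item FROM orders WHERE cust_id = ? ORDER BY item", (cust_id,)
    )]
    shipments = {item: status for item, status in shipping_db.execute(
        "SELECT item, status FROM shipments WHERE customer = ?", (cust_id,)
    )}
    return {
        "name": account[0] if account else None,
        "phone": account[1] if account else None,
        "orders": orders,
        "shipments": shipments,
    }


if __name__ == "__main__":
    print(customer_view(7))
```

Each database names the customer key differently (`cust_id`, `customer`, `id`), so the glue code has to know all three schemas. Every additional user view, such as one for a sales representative or a marketing analyst, would need more hand-written mapping of this kind.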
The database access problem becomes even more acute as the enterprise expands the number of people who have access to the data. There is a trend toward empowering employees throughout the enterprise by giving them access to the information they need to make good decisions. But if they cannot process this information, it does them no good. Each employee has information requirements specific to the job. Simply throwing data at the employee doesn't help. There is also a trend toward making corporate data available to strategic partners and even customers so that they can integrate their operations more fully with the enterprise. Of course, if the enterprise has trouble providing its own employees with the data they need, delivering it to outsiders is nearly impossible.

For years, enterprises have searched for a technology that could help them synthesize data from different databases into different packages, depending on the needs of the particular user. Such a technology would help reduce costs as well as increase the capability to deliver innovative products and services.

Knowledge Sharing

Solving ever more complex problems requires vastly improved cooperation among large groups. A major source of difficulty is effectively sharing knowledge of a complex topic. Consider three scenarios: (1) a joint product development effort among five companies, (2) a research partnership among twelve laboratories to create medicines based on gene sequencing, and (3) joint combat exercises that include a dozen ships and a hundred aircraft from three countries.

All of these scenarios have a common problem. Individual participants want to share their knowledge with the rest of the group and integrate the group's knowledge into their understanding of the problem. Sharing the information itself is not too difficult. One approach is for each participant to do a presentation in a teleconference.
However, this approach does not make it easy for a participant to integrate the knowledge shared by everyone else. Although presentation slides may be an effective means of organizing high-level concepts, they are not very useful for exchanging detailed information. The product development participants need the three-dimensional models of product concepts and specifications for different prototype parts. The research partnership participants need gene sequences and protein folding simulations. The joint naval exercise participants need detailed sensor deployment plans, rules of engagement contingencies, and weapon delivery assignments.

A further refinement of the presentation approach is to follow up with electronic versions of software files that contain the detailed information. There is the small problem of different participants using different file formats. There is the large problem of relating these files to the original presentation and to each other. Fundamentally, there is no way for one participant to create an information package that includes all the connections among the different elements that make it truly meaningful. Such information packages along with standard file formats could accelerate the pace of innovation and enable the coordination of larger groups in real-time tasks.