To put some of the previous XML DBMS features into context, let's consider two simple applications where an XML database can be used: a data-centric invoice archive system and a document-centric content management system.

Data-centric XML is mostly concerned with data values and less concerned with sections of text written in a natural language like English. In the invoice archive example below, the invoice consists mainly of data values such as Name, Address, Quantity, and DollarAmount. This kind of data can be summed, referenced to other data, and generally manipulated in a standard data processing fashion. Data-centric XML often represents business documents or business forms: things like invoices, mortgage applications, and purchase orders. Data-centric XML typically has a simple meaning that can be manipulated and understood by either computers or people. Generally, the information in data-centric XML is factual and precise.

Document-centric XML includes text whose meaning depends on all the subtleties of a natural language. It is thus accessible to humans but less accessible to computers because these subtleties are not easily parsed by a computer. The information in document-centric XML is not always factual and not always precise; it is often simply the opinion of the author.

In practice many applications contain a mixture of the two kinds of information. For example, the personnel records in a company contain factual information about salaries, grades, and home addresses, alongside textual information such as the records of employee appraisals. Traditionally these two kinds of information have been handled separately in different applications and different databases, but one of the reasons for the success of XML is the increasing need to handle both kinds of information in an integrated way.
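As a rough illustration of what "can be summed" means for data-centric XML, the sketch below totals the amounts on a small invoice using Python's standard library. The document layout (the `Invoice`, `Item`, and `Description` elements) and the reading of `DollarAmount` as a per-line amount are assumptions made for this example; only the names Name, Address, Quantity, and DollarAmount come from the text.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical invoice document. The element structure is an assumption;
# DollarAmount is treated here as the amount for the whole line item.
INVOICE_XML = """
<Invoice>
  <Name>Acme Supplies</Name>
  <Address>12 Main Street</Address>
  <Item>
    <Description>Paper</Description>
    <Quantity>10</Quantity>
    <DollarAmount>4.50</DollarAmount>
  </Item>
  <Item>
    <Description>Toner</Description>
    <Quantity>2</Quantity>
    <DollarAmount>30.00</DollarAmount>
  </Item>
</Invoice>
"""


def invoice_total(xml_text: str) -> Decimal:
    """Sum the DollarAmount of every Item in the invoice."""
    root = ET.fromstring(xml_text)
    total = Decimal("0")
    for item in root.findall("Item"):
        # Decimal avoids binary floating-point rounding on currency values.
        total += Decimal(item.findtext("DollarAmount"))
    return total


if __name__ == "__main__":
    print(invoice_total(INVOICE_XML))
```

The same operation has no obvious counterpart for document-centric content such as an appraisal written in free text, which is the contrast the preceding paragraphs draw.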
However, because the two kinds of information are used in different ways, we will look at two applications that are dominated by one kind or the other. Applications that use data-centric XML tend to call upon different features of a native XML database than document-centric applications. Data-centric XML is more open to normalization and the kind of data engineering found in relational databases; XML databases therefore find it harder to compete in this territory. Users are reluctant to adopt new technology if old technology can do the job: Why not just normalize the data and store it as tables? But there are several reasons why a native XML database can provide benefits over a normalized relational database, even for this kind of information, including the following:
Invoice Archive

Our invoice archive system uses a native XML database to archive incoming vendor invoices. Vendor invoices are complete documents that contain different values such as customer name, date, item description, item amount, and total. The diagram in Figure 8.5 shows this flow of documents.

The vendor's computer sends an invoice for services rendered to the customer. This XML invoice is received by the customer's computer. The customer's computer stores the invoice in a native XML database with a status of "unqualified." Unqualified invoices are viewed by a clerk in accounting and are either rejected or qualified by the accounting department. Rejection of the invoice will trigger its own chain of events. If the accounting department qualifies the invoice, its status is set to "qualified," and the invoice in the database is updated. At this point the company's accounting system, perhaps running on another computer, is updated with the invoice number, date, and vendor payable amount. If questions arise later regarding the invoice, it can be retrieved from the invoice archive database. Finally, when the check run is done, the invoice is marked in the database as "paid."

Figure 8.5. The Flow of Invoice Data
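The status changes in this flow can be sketched as a small state machine applied to the stored invoice document. This is a minimal illustration, not a description of any particular XML DBMS: keeping the status in a `status` attribute on the root element, the `"rejected"` status name, and the function names `archive` and `set_status` are all assumptions, and an in-memory element stands in for the database record.

```python
import xml.etree.ElementTree as ET

# Allowed status transitions, taken from the flow described above.
# "rejected" is an assumed name; the text only says rejection triggers
# its own chain of events.
ALLOWED_TRANSITIONS = {
    "unqualified": {"qualified", "rejected"},
    "qualified": {"paid"},
    "rejected": set(),
    "paid": set(),
}


def archive(invoice_xml: str) -> ET.Element:
    """Parse an incoming invoice and mark it as unqualified."""
    invoice = ET.fromstring(invoice_xml)
    invoice.set("status", "unqualified")
    return invoice


def set_status(invoice: ET.Element, new_status: str) -> None:
    """Update the invoice status, refusing transitions the flow does not allow."""
    current = invoice.get("status")
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move invoice from {current!r} to {new_status!r}")
    invoice.set("status", new_status)


if __name__ == "__main__":
    inv = archive("<Invoice><Number>42</Number></Invoice>")
    set_status(inv, "qualified")   # accounting clerk approves the invoice
    set_status(inv, "paid")        # check run completed
    print(ET.tostring(inv, encoding="unicode"))
```

In a real deployment the update would be issued through whatever update mechanism the chosen XML database provides; the point here is only that the whole invoice stays intact as one document while a single value changes.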
XML DBMS Features Used
A Content Management Application

This scenario involves creating XML content using a full-featured XML editor and publishing the XML content to the web for further browsing and searching by a content reader (consumer). For the purposes of this example, imagine that you need a system for a market research company that wants to publish ten thousand documents on its web site. Consumers visit the web site and want to search on, browse for, and read content online as well as print this content. Figure 8.6 shows the flow of information between author and consumer in this scenario.

Figure 8.6. Content Management Flow
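To give a feel for the consumer's "search on" step, here is a minimal keyword search over a handful of XML documents. Everything in it is hypothetical: the `Report`, `Title`, `Body`, and `Para` element names, the document identifiers, and the `search` function are invented for illustration, and the linear scan stands in for the indexed full-text search a native XML DBMS would normally supply for a collection of ten thousand documents.

```python
import xml.etree.ElementTree as ET

# Hypothetical published documents, keyed by an assumed document id.
DOCUMENTS = {
    "report-001": (
        "<Report><Title>Coffee Market Outlook</Title>"
        "<Body><Para>Demand for specialty coffee is rising.</Para></Body></Report>"
    ),
    "report-002": (
        "<Report><Title>Tea Trends</Title>"
        "<Body><Para>Green tea sales are flat.</Para></Body></Report>"
    ),
}


def search(documents: dict, keyword: str) -> list:
    """Return (doc_id, title) for each document whose text contains the keyword."""
    needle = keyword.lower()
    hits = []
    for doc_id, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        # itertext() yields the text of every element, ignoring the markup.
        text = " ".join(root.itertext()).lower()
        if needle in text:
            hits.append((doc_id, root.findtext("Title")))
    return hits


if __name__ == "__main__":
    for doc_id, title in search(DOCUMENTS, "coffee"):
        print(doc_id, title)
```

Because the match is made against natural-language text rather than typed values, this is the document-centric counterpart of the arithmetic one would do on invoice data.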
A content management scenario involves at least three main roles:
In addition, a designer may be required to design the graphic layout of the content, and an administrator may register consumers as authorized users of the system. The role of author may also be specialized further; for example, there may be researchers, subeditors, and photographers.

Actual content management applications often use not one XML database, but two. One database is typically used to serve online published content, while another is used for material in the course of preparation. It is not uncommon to use a third database as a staging server.

Authors are likely to use a variety of desktop tools for the actual content preparation, using a check-in/check-out mechanism to update the database. Publishers need to manage the workflow, checking on the status of content and releasing it for publication when all the steps in the process are complete. The web application used by the consumer is likely to support the following features:
XML DBMS Technology Used

The author of the XML content
The consumer of the information
In addition, the system is likely to need some kind of replication technology to control the migration of data between one database and another, and it may require integration with a workflow engine to manage the authoring and publication process.

These two sample applications give an idea of the range of XML applications and some of the XML DBMS features that can be brought into play when building such an application. For an application builder who needs to write one of the above applications, well-placed features in a native XML DBMS will greatly ease the task.