Sample Applications

In order to put some of the previous XML DBMS features into context, let's consider two simple applications where an XML database can be used. The two applications are a data-centric invoice archive system and a document-centric content management system.

Data-centric XML is mostly concerned with data values and less concerned with sections of text written in a natural language like English. In the invoice archive example below, the invoice data is more concerned with data values such as Name, Address, Quantity, and DollarAmount. This kind of data can be summed, referenced to other data, and generally manipulated in a standard data processing fashion. Data-centric XML often represents business documents or business forms: things like invoices, mortgage applications, and purchase orders. Data-centric XML typically has a simple meaning that can be manipulated and understood by either computers or people. Generally, the information in data-centric XML is factual and precise.

Document-centric XML includes text whose meaning depends on all the subtleties of a natural language. It is thus accessible to humans but less accessible to computers because these subtleties are not easily parsed by a computer. The information in document-centric XML is not always factual and not always precise; it is often simply the opinion of the author.

In practice, many applications contain a mixture of the two kinds of information. For example, the personnel records in a company contain factual information about salaries, grades, and home addresses, alongside textual information such as the records of employee appraisals. Traditionally these two kinds of information have been handled separately in different applications and different databases, but one of the reasons for the success of XML is the increasing need to handle both kinds of information in an integrated way. However, because the two kinds of information are used in different ways, we will look at two applications that are dominated by one kind or the other.

Applications that use data-centric XML tend to call upon different features of a native XML database than document-centric applications. Data-centric XML is more open to normalization and the kind of data engineering found in relational databases; XML databases therefore find it harder to compete in this territory. Users are reluctant to adopt new technology if old technology can do the job: Why not just normalize the data and store it as tables?

But there are several reasons why a native XML database can provide benefits over a normalized relational database, even for this kind of information, including the following:

  • An increasing number of applications use both data-centric and document-centric information in an integrated way. It makes sense to use the same database technology for both.

  • Many applications handle data that is small in quantity yet complex in structure. An example might be a web site holding the results of a sports tournament, or a schedule of training courses. The design of a normalized relational database holding this information can be a significant task, which becomes an obstacle when new web applications must be produced in two weeks from start to finish. Designing the data as an XML document structure is often far simpler and more flexible.

  • In some applications, the problem is variability of the data. For example, a telecommunications company managing records of installed equipment may have to handle thousands of different types of equipment, with slightly different data maintained for each kind. Designing a relational database for this situation is notoriously difficult, and managing the frequent changes needed by the design can be a nightmare. Such an application can be vastly simplified if each piece of equipment is recorded as one (semi-)structured XML document containing all the relevant fields.

  • Sometimes the data originates as XML outside the organization, and the recipient company may want to store the information in that same form, knowing that the details may change from time to time as the sending organization changes its business processes. The recipient organization doesn't want to change its relational database design every time the incoming document format has a new field added. Far better to store the XML in its original form. This is the rationale we use for our example of a data-centric application, an archive of incoming invoices.
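
The last point in the list above can be made concrete with a small sketch. It is not a native XML database: a Python dictionary stands in for the archive, and the standard library's ElementTree stands in for the query engine. The element names Name, Quantity, and DollarAmount come from the text; the Invoice wrapper, the CostCenter field, and the invoice numbers are hypothetical. The point is only that a document with an extra field can be stored and queried without any change to the storage design.

```python
import xml.etree.ElementTree as ET

# Hypothetical incoming invoices. The second one arrives after the vendor
# has added a CostCenter field to its invoice format.
INCOMING = {
    "INV-1": "<Invoice><Name>Acme</Name><Quantity>2</Quantity>"
             "<DollarAmount>150.00</DollarAmount></Invoice>",
    "INV-2": "<Invoice><Name>Acme</Name><Quantity>1</Quantity>"
             "<DollarAmount>80.00</DollarAmount>"
             "<CostCenter>R-17</CostCenter></Invoice>",
}

# Stand-in for the archive: each document is kept verbatim, keyed by number.
archive = dict(INCOMING)


def amount(xml_text):
    """Read the DollarAmount value from one stored invoice."""
    return float(ET.fromstring(xml_text).findtext("DollarAmount"))


def field_names(xml_text):
    """List the child element names actually present in one invoice."""
    return [child.tag for child in ET.fromstring(xml_text)]


# The same query works on both versions of the document format.
total = sum(amount(doc) for doc in archive.values())
```

A relational design would need a schema change (or a catch-all column) to absorb CostCenter; here the new field is simply present in the documents that carry it.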

Invoice Archive

Our invoice archive system uses a native XML database to archive incoming vendor invoices. Vendor invoices are complete documents that contain different values such as customer name, date, item description, item amount, and total. The diagram in Figure 8.5 shows this flow of documents. The vendor's computer sends an invoice for services rendered to the customer. This XML invoice is received by the customer's computer. The customer's computer stores the invoice in a native XML database with a status of "unqualified." Unqualified invoices are viewed by a clerk in accounting and are either rejected or qualified by the accounting department. Rejection of the invoice will trigger its own chain of events. If the accounting department qualifies the invoice, its status is set to "qualified," and the invoice in the database is updated. At this point the company's accounting system, perhaps running on another computer, is updated with the invoice number, date, and vendor payable amount. If questions arise later regarding the invoice, it can be retrieved from the invoice archive database. Finally, when the check run is done, the invoice is marked in the database as "paid."
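
The status changes in this flow can be sketched as a small state machine. The statuses "unqualified," "qualified," and "paid" come from the description above; treating rejection as a fourth, terminal "rejected" status is an assumption made for illustration, since the text says only that rejection triggers its own chain of events. The names Status, ALLOWED, and advance are likewise hypothetical.

```python
from enum import Enum


class Status(Enum):
    UNQUALIFIED = "unqualified"
    QUALIFIED = "qualified"
    REJECTED = "rejected"   # assumed status; the text does not name one
    PAID = "paid"


# Which status changes the workflow permits.
ALLOWED = {
    Status.UNQUALIFIED: {Status.QUALIFIED, Status.REJECTED},
    Status.QUALIFIED: {Status.PAID},
    Status.REJECTED: set(),
    Status.PAID: set(),
}


def advance(current, new):
    """Return the new status, or raise if the workflow does not allow it."""
    if new not in ALLOWED[current]:
        raise ValueError(
            f"cannot move invoice from {current.value} to {new.value}"
        )
    return new
```

For example, `advance(Status.UNQUALIFIED, Status.QUALIFIED)` succeeds, while an attempt to mark an unqualified invoice as paid raises `ValueError`.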

Figure 8.5. The Flow of Invoice Data

graphics/08fig05.gif

XML DBMS Features Used

  • Storage as XML. The incoming invoice is in XML, and it is preferable to maintain the invoice in its original form. Placing the invoice in the database requires an XQuery Insert command.

  • Schema validation of the incoming XML. Incoming data must be validated to ensure that it is a valid instance of the invoice schema, which has presumably been agreed upon between vendor and customer.

  • Invoice searching and sorting. For example, we may need to display the invoice documents in date order by status, or by status alone. It might be necessary to examine all invoices from a particular vendor, or to find the invoices relating to a particular purchase order.

  • Update of the XML. The invoice document in the XML DBMS requires updating to set the status.

  • Two-phase commit. Used when the invoice document is updated as "qualified" in the XML DBMS, and the accounting system DBMS is updated with the payable amount.

  • Full-text. Occasionally, a simple full-text search may be required to find an invoice with a particular word in the description.
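
To show what the searching, sorting, updating, and full-text features amount to, here is a rough sketch that runs against an in-memory document using Python's standard ElementTree. In a native XML DBMS these operations would be expressed in the product's query and update language; the Python below is only a stand-in. The document layout (an invoices wrapper, a status attribute, Vendor, Date, and Description elements) and the sample values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive contents.
ARCHIVE = ET.fromstring("""<invoices>
  <invoice status="unqualified"><Vendor>Acme</Vendor><Date>2004-03-02</Date>
    <Description>Server rack installation</Description></invoice>
  <invoice status="qualified"><Vendor>Bolt</Vendor><Date>2004-02-20</Date>
    <Description>Cabling</Description></invoice>
  <invoice status="unqualified"><Vendor>Acme</Vendor><Date>2004-02-10</Date>
    <Description>Rack maintenance</Description></invoice>
</invoices>""")


def by_status_then_date(root):
    """Invoices grouped by status, in date order within each status."""
    return sorted(
        root.findall("invoice"),
        key=lambda inv: (inv.get("status"), inv.findtext("Date")),
    )


def from_vendor(root, vendor):
    """All invoices from one vendor."""
    return [i for i in root.findall("invoice") if i.findtext("Vendor") == vendor]


def containing_word(root, word):
    """Naive full-text search: invoices whose description contains the word."""
    word = word.lower()
    return [
        i for i in root.findall("invoice")
        if word in i.findtext("Description", "").lower().split()
    ]


def set_status(invoice, status):
    """Update the invoice document in place with a new status."""
    invoice.set("status", status)
```

ISO-style dates are used so that plain string comparison sorts them chronologically. Schema validation and two-phase commit are left out because the standard library offers no direct equivalent for either.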

A Content Management Application

This scenario involves creating XML content using a full-featured XML editor and publishing the XML content to the web for further browsing and searching by a content reader (consumer). For the purposes of this example, imagine that you need a system for a market research company that wants to publish ten thousand documents on its web site. Consumers visit the web site and want to search on, browse for, and read content online as well as print this content. Figure 8.6 shows the flow of information between author and consumer in this scenario.

Figure 8.6. Content Management Flow

graphics/08fig06.gif

A content management scenario involves at least three main roles:

  • Author: the person who authors the content.

  • Publisher: the person who makes the content available to the consumer.

  • Consumer: the person who uses the content.

In addition, a designer may be required to design the graphic layout of the content, and an administrator may register consumers as authorized users of the system. The role of author may also be specialized further; for example, there may be researchers, subeditors, and photographers.

Actual content management applications often use not one XML database, but two. One database is typically used to serve online published content, while another is used for material in the course of preparation. It is not uncommon to use a third database as a staging server.

Authors are likely to use a variety of desktop tools for the actual content preparation, using a check-in/check-out mechanism to update the database. Publishers need to manage the workflow, checking on the status of content and releasing it for publication when all the steps in the process are complete.

The web application used by the consumer is likely to support the following features:

  • Full-text search on document title with ranking of results

  • Full-text search on the full document

  • Browsing titles of documents by category

  • Option to download a printable PDF of the content

XML DBMS Technology Used

The author of the XML content

  • May use a WebDAV interface to the database to enable the remote editing of documents using a rich editor

  • Will want some sort of exclusive locking while multiple authors collaborate and work on documents

  • Will want to query for documents using XQuery to select them for editing

  • May need her own application that uses the database to categorize and organize documents if there are many documents

  • May need document or partial document checkout if many authors are working on the document

  • Will need a way to categorize documents for the consumer

  • May need storage for large documents

  • May require document versioning
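
The locking, checkout, and versioning requirements in the list above can be sketched together. This is a toy in-memory model, not the interface of any particular XML DBMS or of WebDAV; the class and method names (DocumentStore, check_out, check_in, versions) are hypothetical. It shows whole-document checkout only; partial document checkout would need locks on subtrees rather than on document identifiers.

```python
class CheckoutError(Exception):
    """Raised when a checkout or check-in conflicts with an existing lock."""


class DocumentStore:
    def __init__(self):
        self._versions = {}  # doc_id -> list of XML texts, oldest first
        self._locks = {}     # doc_id -> author currently holding the lock

    def add(self, doc_id, xml_text):
        """Store the first version of a new document."""
        self._versions[doc_id] = [xml_text]

    def check_out(self, doc_id, author):
        """Take an exclusive lock and return the latest version."""
        holder = self._locks.get(doc_id)
        if holder is not None and holder != author:
            raise CheckoutError(f"{doc_id} is checked out by {holder}")
        self._locks[doc_id] = author
        return self._versions[doc_id][-1]

    def check_in(self, doc_id, author, xml_text):
        """Store a new version and release the lock."""
        if self._locks.get(doc_id) != author:
            raise CheckoutError(f"{author} has not checked out {doc_id}")
        self._versions[doc_id].append(xml_text)
        del self._locks[doc_id]

    def versions(self, doc_id):
        """Return every stored version, oldest first."""
        return list(self._versions[doc_id])
```

While one author holds the lock, a second author's `check_out` call fails with `CheckoutError`; after `check_in`, the document is free again and the earlier text remains available through `versions`.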

The consumer of the information

  • Will want to access the documents using full-text search. This access may specify different search targets in the XML document such as titles, all text, or abstracts.

  • Will typically view the content as HTML or PDF, after it has gone through XSL transformation.

  • Will want to browse the documents by category. This might be implemented using a predefined set of XQuery stored functions.
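
The consumer-side features can also be sketched in a few lines. The example below uses Python's standard ElementTree on in-memory documents as a stand-in for the database's full-text and query facilities. The document layout (a report element with a category attribute and title, abstract, and body children) is an assumption, and the ranking is a deliberately crude count of matching words rather than what a real full-text engine would do.

```python
import xml.etree.ElementTree as ET

# Hypothetical published documents.
DOCS = [ET.fromstring(text) for text in (
    "<report category='retail'><title>Retail trends</title>"
    "<abstract>Online retail grows.</abstract>"
    "<body>Retail sales rose while wholesale fell.</body></report>",
    "<report category='energy'><title>Energy outlook</title>"
    "<abstract>Solar demand.</abstract>"
    "<body>Retail energy prices are stable.</body></report>",
)]


def text_of(doc, target):
    """Text of one search target: 'title', 'abstract', 'body', or 'all'."""
    if target == "all":
        return " ".join(doc.itertext())
    return doc.findtext(target, "")


def search(docs, term, target="all"):
    """Titles of matching documents, most occurrences of the term first."""
    term = term.lower()
    scored = []
    for doc in docs:
        words = text_of(doc, target).lower().split()
        hits = sum(1 for w in words if w.strip(".,;:") == term)
        if hits:
            scored.append((hits, doc.findtext("title")))
    scored.sort(key=lambda pair: -pair[0])
    return [title for _, title in scored]


def browse(docs, category):
    """Titles of all documents in one category."""
    return [d.findtext("title") for d in docs if d.get("category") == category]
```

Restricting the search target changes the result: searching all text for "retail" finds both reports, while `search(DOCS, "retail", target="title")` finds only the one with the word in its title.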

In addition, the system is likely to need some kind of replication technology to control the migration of data between one database and another, and it may require integration with a workflow engine to manage the authoring and publication process.

These two sample applications give an idea of the range of XML applications and some of the XML DBMS features that can be brought into play when building such an application. For an application builder who needs to write one of the above applications, well-placed features in a native XML DBMS will greatly ease the task.



XQuery from the Experts: A Guide to the W3C XML Query Language