XML Databases


The databases described previously all approach XML from a relational model; they may store XML, but they are designed with rows and columns in mind. Storing XML data is either a matter of dumping the content into a BLOB (Binary Large OBject) field or extracting the information and mapping it to columns. XML databases, on the other hand, store and manipulate XML in a more native form. Three examples of XML databases are the open source Xindice (pronounced Zeen-dee-chay), the eponymous Mark Logic Server, and Berkeley DB XML from Sleepycat Software. These databases store XML natively, without requiring conversion to and from relational tables. Rather than shoehorning SQL into service as a query language, they use XQuery or XPath as the query mechanism.

Xindice

Xindice is an open source native XML database. It is one of the tools developed and managed by the Apache Software Foundation. Although it is not as full-featured as some commercial databases (it has no support for XQuery, for example), binaries and source code are available for a number of operating systems. You can download it from the Apache Web site at http://www.xml.apache.org/xindice/. The samples in this section use Xindice 1.1 syntax, which is slightly different from that used by 1.0. Xindice directly supports access via Java; other languages are supported via its XML-RPC interface.

Setting up Xindice requires that you first install an application server. This can be either the freely available Tomcat (also from the Apache Software Foundation) or a commercial application server, such as BEA WebLogic or IBM WebSphere. Xindice is distributed as a WAR (Web ARchive) file for installation into the application server. After it is set up and configured, you can begin to communicate with the database via the included command-line tool or Java.

Retrieving XML

Because Xindice stores XML in its native form, retrieving XML is simpler than retrieving it from a relational database. You don't need to map columns to XML or use extensions to SQL to work with the collection. Instead, you process Xindice collections using the provided command-line tool or via Java. Other languages may also be used via an XML-RPC interface.

Xindice stores XML documents in collections; these structures serve as the databases, and they may be nested. Before performing any other processing with Xindice, you must first create a collection in which to store your documents. When querying or adding XML, you reference the collection that holds the documents. You create a collection either with the command-line interface or via code. From the command line, you use the add_collection (or ac) command:

      xindice ac -c xmldb:xindice://db -n {collection} 

Listing 11-20 shows the code used to create a new collection.

Listing 11-20: Creating a Xindice collection

      private void collectionButtonActionPerformed(java.awt.event.ActionEvent evt) {
          try {
              getDatabase();
              Collection col = DatabaseManager.getCollection(SERVICE_URL);
              CollectionManager service =
                  (CollectionManager) col.getService("CollectionManager", "1.0");
              String collectionName = this.collectionField.getText();
              String collectionConfig =
                  "<collection compressed=\"true\" " +
                  "            name=\"" + collectionName + "\">" +
                  "   <filer class=\"org.apache.xindice.core.filer.BTreeFiler\"/>" +
                  "</collection>";
              service.createCollection(collectionName,
                  DOMParser.toDocument(collectionConfig));
              this.collectionMessage.setText("Collection created");
          } catch (Exception e) {
              System.err.println("Error creating collection " + e.getMessage());
          }
      }

The first step when working with a Xindice collection is to register the database (see Listing 11-21). Next, the Collection Manager service is loaded and the new collection created. Notice that the parameters of the new collection are passed to the createCollection method. In this case the resulting collection is compressed and uses the BTreeFiler.
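Because the listing assembles the configuration by string concatenation, a collection name containing a quote or angle bracket would produce malformed XML. The following is a minimal sketch of one alternative, building the same configuration as a DOM document with only the JDK's XML classes. The class and method names (CollectionConfig, build, toXml) are hypothetical; the element names, attributes, and filer class are taken from Listing 11-20.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper; not part of Xindice.
public class CollectionConfig {

    /** Builds the <collection> configuration document for the given name. */
    public static Document build(String collectionName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element collection = doc.createElement("collection");
        collection.setAttribute("compressed", "true");
        // setAttribute stores the raw value; escaping happens on serialization.
        collection.setAttribute("name", collectionName);
        Element filer = doc.createElement("filer");
        filer.setAttribute("class", "org.apache.xindice.core.filer.BTreeFiler");
        collection.appendChild(filer);
        doc.appendChild(collection);
        return doc;
    }

    /** Serializes a DOM document to a string without an XML declaration. */
    public static String toXml(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(build("employees")));
    }
}
```

Since createCollection in the listing receives a DOM document, a document built this way could presumably be passed in place of the result of DOMParser.toDocument; treat that as an assumption to verify against your Xindice version.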

Listing 11-21: The getDatabase method

      private static void getDatabase() {
          try {
              Class c = Class.forName(driver);
              Database database = (Database) c.newInstance();
              DatabaseManager.registerDatabase(database);
          } catch (Exception e) {
              System.err.println("Error registering database " + e.getMessage());
          }
      }

The getDatabase routine loads the class for the Xindice implementation (org.apache.xindice.client.xmldb.DatabaseImpl), and then it creates a new instance of that class and registers the database. This is required whenever you work with the database, either storing or retrieving information.
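The load-by-name pattern used in getDatabase can be tried in isolation. The sketch below uses a hypothetical stand-in class (StandInDriver) because the Xindice classes are not assumed to be on the classpath; with Xindice installed, the class name would be the DatabaseImpl name given above. It also uses getDeclaredConstructor().newInstance(), because Class.newInstance() has been deprecated since Java 9.

```java
// Hypothetical demonstration class; not part of Xindice or the XML:DB API.
public class DriverLoading {

    /** Stand-in for a driver class such as the Xindice DatabaseImpl. */
    public static class StandInDriver {
        public String name() {
            return "stand-in";
        }
    }

    /** Loads a class by name and creates an instance via its no-argument constructor. */
    public static Object instantiate(String className)
            throws ReflectiveOperationException {
        Class<?> c = Class.forName(className);
        return c.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Object driver = instantiate(StandInDriver.class.getName());
        System.out.println("Loaded " + driver.getClass().getName());
    }
}
```

Loading the implementation by name keeps the calling code free of a compile-time dependency on a particular database implementation, which is why the driver is registered this way.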

Xindice collections are queried using XPath statements. This means that it is relatively simple to retrieve individual items or lists from the collection. However, the complex queries and calculations that XQuery would allow are not currently possible.

From the command line, you perform queries with the xpath command:

      xindice xpath -c xmldb:xindice://{server}:{port}/db/{collection} -q {query} 

For example, to retrieve the information in a collection named employees about Foo deBar, you use the following command:

      xindice xpath -c xmldb:xindice://{server}:{port}/db/employees -q emps[lname="deBar"]
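
To see what such a predicate selects without a running Xindice server, you can evaluate the same expression with the XPath engine built into the JDK. This is only a sketch: it uses javax.xml.xpath rather than Xindice, and the shape of the employee documents (an emps root with fname and lname children) is an assumption inferred from the query, as is the second sample employee.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demonstration class; uses the JDK XPath engine, not Xindice.
public class XPathSketch {

    /** Returns the fname text of every node selected by the query in each document. */
    public static List<String> matchingFirstNames(List<String> docs, String query)
            throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> names = new ArrayList<>();
        for (String xml : docs) {
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Evaluate the query relative to the document node.
            NodeList hits =
                    (NodeList) xpath.evaluate(query, doc, XPathConstants.NODESET);
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(xpath.evaluate("fname", hits.item(i)));
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Assumed document shape; the real employees collection may differ.
        List<String> docs = Arrays.asList(
                "<emps><fname>Foo</fname><lname>deBar</lname></emps>",
                "<emps><fname>Ann</fname><lname>Other</lname></emps>");
        System.out.println(matchingFirstNames(docs, "emps[lname=\"deBar\"]"));
    }
}
```

With the two sample documents, only the first satisfies the predicate, so the program prints [Foo].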

Figure 11-7 shows a simple application for working with a Xindice collection (see Listing 11-22).

Figure 11-7

Listing 11-22: Using XPath to query a Xindice collection with Java

      private void queryButtonActionPerformed(java.awt.event.ActionEvent evt) {
          try {
              getDatabase();
              this.queryResults.setText("");
              String url = SERVICE_URL + "/" + this.collectionField.getText();
              Collection col = DatabaseManager.getCollection(url);
              XPathQueryService service =
                  (XPathQueryService) col.getService("XPathQueryService", "1.0");
              String xpath = this.queryField.getText();
              ResourceSet resultSet = service.query(xpath);
              ResourceIterator results = resultSet.getIterator();
              StringBuffer buff = new StringBuffer();
              while (results.hasMoreResources()) {
                  Resource res = results.nextResource();
                  buff.append(res.getContent());
                  System.out.println((String) res.getContent());
              }
              this.queryResults.setText(buff.toString());
          } catch (Exception e) {
              System.err.println("Error querying collection " + e.getMessage());
          }
      }

The query functionality first connects to the database as before. After you have registered the database, the next step is to load the collection. This is performed by referencing its URL (xmldb:xindice://localhost:90/db/employees on my machine; the server and port depend on the URL of your application server). You then retrieve the XPath query service and execute the query. This returns a resource set that can be iterated to process each item in the result. Notice that the key is part of the result.
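
Because the server and port vary between installations, it can help to assemble the collection URL in one place. The following is a minimal sketch of such a helper; the class and method names are hypothetical, the URL pattern is the xmldb:xindice://{server}:{port}/db/{collection} form used by the commands in this section, and port 8080 in main is only an example value.

```java
// Hypothetical helper; not part of Xindice.
public class XindiceUri {

    /** Builds xmldb:xindice://{server}:{port}/db/{collection} from its parts. */
    public static String collectionUri(String server, int port, String collection) {
        if (server == null || server.isEmpty()) {
            throw new IllegalArgumentException("server must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        if (collection == null || collection.isEmpty() || collection.startsWith("/")) {
            throw new IllegalArgumentException("collection must be a relative name");
        }
        return "xmldb:xindice://" + server + ":" + port + "/db/" + collection;
    }

    public static void main(String[] args) {
        // Example values only; use the host and port of your application server.
        System.out.println(collectionUri("localhost", 8080, "employees"));
    }
}
```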

Storing XML

Storing XML in Xindice is similar to retrieving it: you access the collection and use the xindice ad (or add_document) command. The document is added to the collection identified by the URL.

      xindice ad -c xmldb:xindice://{server}:{port}/db/{collection} -f {file} 

You can also add a number of documents simultaneously with the -e switch, listing the extension of all the documents to be added. Listing 11-23 shows the code required to add documents to the collection.

Listing 11-23: Adding documents to a Xindice collection

      private void addButtonActionPerformed(java.awt.event.ActionEvent evt) {
          try {
              getDatabase();
              String url = SERVICE_URL + "/" + this.collectionField.getText();
              Collection col = DatabaseManager.getCollection(url);
              XMLResource document =
                  (XMLResource) col.createResource(null, "XMLResource");
              document.setContent(this.bodyField.getText());
              col.storeResource(document);
              this.itemMessage.setText("Item added");
          } catch (Exception e) {
              System.err.println("Error adding item " + e.getMessage());
          }
      }

The XMLResource class provides a number of methods for working with XML. You can set the content from a string, as I have done here, or supply it via DOM or SAX. To confirm that the item was added, you can either query for the new information or retrieve the full list of resources in the collection, as shown in Listing 11-24.
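
Because the content in Listing 11-23 comes straight from a text field, you may want to check that it is well-formed before handing it to the database. The sketch below does this with the JDK parser only; the class and method names are hypothetical, and whether Xindice performs an equivalent check itself when the resource is stored is not covered here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper; not part of Xindice or the XML:DB API.
public class WellFormedCheck {

    /** Returns true if the text parses as a well-formed XML document. */
    public static boolean isWellFormed(String xml) {
        if (xml == null) {
            return false;
        }
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler so parse errors are not echoed to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings do not affect well-formedness.
                }

                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<emps><lname>deBar</lname></emps>"));
        System.out.println(isWellFormed("<emps><lname>deBar</emps>"));
    }
}
```

A check like this could run before document.setContent so that the user sees a clear message instead of a generic storage error.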

Listing 11-24: Retrieving all resources in the Xindice collection

      private void refreshButtonActionPerformed(java.awt.event.ActionEvent evt) {
          DefaultListModel theList = new DefaultListModel();
          this.itemList.setModel(theList);
          try {
              //get all items from collection
              getDatabase();
              String url = SERVICE_URL + "/" + this.collectionField.getText();
              Collection col = DatabaseManager.getCollection(url);
              String[] items = col.listResources();
              //add each to list
              for (int i = 0; i < items.length; i++) {
                  theList.addElement(items[i]);
              }
          } catch (Exception e) {
              System.err.println("Error retrieving items " + e.getMessage());
          }
      }

Xindice provides an easy (and inexpensive) way to add a native XML database to your solution. If you are dealing with many small XML documents and don't want to add a relational database to your application, it can provide a useful data storage and query mechanism.




Professional XML (Programmer to Programmer)
ISBN: 0471777773
Year: 2004
Pages: 215
