Before we can explore how to develop applications using Xindice, let’s take a tour through the Xindice command-line tools. These tools let you administer the database and access its contents from the command line.
Some additional setup is required so you can run the command-line tools. You need to create a command/shell window (depending on your operating system type) and set the value of the environment variable XINDICE_HOME to the directory where Xindice is installed. You may also wish to add the XINDICE_HOME/bin directory to the PATH variable, so you can execute the Xindice commands no matter where you are in the directory hierarchy.
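A minimal sketch of this setup for a Bourne-style shell follows. It assumes Xindice was unpacked into /opt/xindice, which is a hypothetical location; substitute your own installation directory.

```shell
# Assumed install location; change this to wherever Xindice was unpacked.
XINDICE_HOME="${XINDICE_HOME:-/opt/xindice}"
export XINDICE_HOME

# Put the command-line tools on the PATH, but only once.
case ":$PATH:" in
  *":$XINDICE_HOME/bin:"*) ;;                        # already present
  *) PATH="$XINDICE_HOME/bin:$PATH"; export PATH ;;
esac

echo "XINDICE_HOME=$XINDICE_HOME"
```

In a Windows command window, the equivalent uses set rather than export.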
If you’re going to use JDK 1.4, then you’ll need to modify the batch (or shell) scripts in XINDICE_HOME/bin to use the Java Endorsed Standards Override Mechanism to get the proper versions of Xerces and Xalan into the classpath.
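As a rough sketch of what that change involves: JDK 1.4 reads the java.endorsed.dirs system property to locate overriding jars. The fragment below builds that option. It assumes the Xerces and Xalan jars live in XINDICE_HOME/java/lib and that the script invokes java directly; both are assumptions about the install layout, so check your own distribution before editing anything.

```shell
# Assumed install location and jar directory; verify both for your install.
XINDICE_HOME="${XINDICE_HOME:-/opt/xindice}"
ENDORSED_DIR="$XINDICE_HOME/java/lib"

# java.endorsed.dirs is the system property JDK 1.4 consults for overrides.
JAVA_OPTS="-Djava.endorsed.dirs=$ENDORSED_DIR"

# A script would then pass the option to the JVM, along the lines of:
#   java $JAVA_OPTS -classpath "$CLASSPATH" <main class used by the script>
printf '%s\n' "$JAVA_OPTS"
```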
The first thing you need to be able to do is start the Xindice server. To do this, go to your shell window, change to the XINDICE_HOME directory, and execute either the batch or shell script file named startup. If you don’t run this command in the background, you’ll need a second window to execute the Xindice command-line tools. In the sections that follow, we’ll look at the commands available from the xindiceadmin command. The xindice command is for regular users and allows a subset of these commands.
Four subcommands let you add information to a Xindice database.
The add_collection subcommand (you can use ac for brevity) creates a new collection. The collection can be specified as a top-level collection by providing the root collection /db for the collection argument, or as a child collection by specifying /db/parent-collection for the collection argument. This command is available only in xindiceadmin:
xindiceadmin add_collection -c collection -n name [-v]
-c collection—Collection to add a collection to.
-n name—Name for the new collection.
-v—Verbose mode (optional).
This example creates a new top-level collection called books:
xindiceadmin add_collection -c /db -n books
This example creates a collection called computer as a child collection of /db/books:
xindiceadmin add_collection -c /db/books -n computer
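Because each collection is created inside an existing parent, building a deeper path takes one add_collection call per level. The helper below is a sketch of that idea, not part of Xindice: make_collection_cmds is a hypothetical name, and it only prints the commands so you can inspect them before running anything.

```shell
# Print the add_collection commands needed to create every level of a path.
# Nothing is executed; pipe the output to sh once it looks right.
make_collection_cmds() {
  parent="/db"                       # root collection
  rest="${1#/db/}"                   # strip the leading /db/
  old_ifs="$IFS"; IFS="/"            # split the remainder on slashes
  for name in $rest; do
    echo "xindiceadmin add_collection -c $parent -n $name"
    parent="$parent/$name"
  done
  IFS="$old_ifs"
}

make_collection_cmds /db/books/computer
```

For the path shown, this prints the two commands from the examples above, parent first.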
The add_document subcommand (you can use ad for brevity) allows you to add a document to a collection. The collection must already exist:
xindiceadmin add_document -c collection -f file_path -n name [-v]
-c collection—Collection to add the document to.
-f file_path—File containing the document to add.
-n name—Name/ID for the document.
-v—Verbose mode (optional).
This example adds the document books.xml to the collection /db/books/computer. The document is retrievable by the ID inventory:
xindiceadmin add_document -c /db/books/computer -f books.xml -n inventory
The add_multiple command lets you add the contents of an entire directory to a collection. You can filter the directory contents by filename extension:
xindiceadmin add_multiple -c collection -f directory [-e extension] [-v]
-c collection—Collection to add the documents to.
-f directory—Directory containing the files you wish to add.
-e extension—Filename extension of the files you wish to add (optional).
-v—Verbose mode (optional).
This example adds all the .xml files in the current directory to the /db/books/computer collection:
xindiceadmin add_multiple -c /db/books/computer -f . -e .xml
The import command lets you add the contents of a directory tree to a collection. Unlike with add_multiple, any directories in the directory being imported are created as collections, and suitable files in those directories are imported into the corresponding collections. This command is available only in xindiceadmin:
xindiceadmin import -c collection -f directory [-e extension] [-v]
-c collection—Collection to add the documents and directories to.
-f directory—Root of the directory hierarchy to import.
-e extension—Filename extension of the files you want to import (optional).
-v—Verbose mode (optional).
This example adds all .xml files in the directory tree rooted at the current directory into a matching hierarchy of collections rooted at /db/books:
xindiceadmin import -c /db/books -f . -e .xml
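To see which collections an import would create, it can help to preview the mapping from directories to collections first. The sketch below does that with find; preview_import is a hypothetical helper, the directory names are made up for illustration, and the script never contacts Xindice.

```shell
# List the collection that each subdirectory of $1 would map to under $2.
# This only prints the mapping; it does not talk to Xindice.
preview_import() {
  root="${1%/}"                      # directory tree to import
  target="${2%/}"                    # collection to import into
  find "$root" -mindepth 1 -type d | sort | while read -r dir; do
    echo "$target/${dir#"$root"/}"
  done
}

# Illustration with a throwaway directory tree.
tmp="$(mktemp -d)"
mkdir -p "$tmp/fiction" "$tmp/computer/java"
preview_import "$tmp" /db/books
rm -rf "$tmp"
```

For this tree the preview lists /db/books/computer, /db/books/computer/java, and /db/books/fiction.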
The retrieval commands allow you to find out what collections there are, retrieve the contents of a document from a collection, and issue an XPath query over the contents of the collection.
The list_collections command (you can use lc for brevity) displays a listing of all the collections contained in the collection you specify. The root collection is called /db:
xindiceadmin list_collections -c collection [-v]
-c collection—The collection to list.
-v—Verbose mode (optional).
This example lists the collections in the root collection:
xindiceadmin list_collections -c /db
The list_documents command (you can use ld for brevity) displays a listing of all the documents contained in the collection you specify:
xindiceadmin list_documents -c collection [-v]
-c collection—The collection to list.
-v—Verbose mode (optional).
This example lists the documents in the books collection:
xindiceadmin list_documents -c /db/books
The retrieve_document command (you can use rd for brevity) retrieves the document named name from the collection. If you specify the -f option, the retrieved document is stored in the file named by the file-path argument:
xindiceadmin retrieve_document -c collection -n name [-f file-path] [-v]
-c collection—Collection from which to retrieve the document.
-n name—Name of the document to retrieve.
-f file-path—File to retrieve into (optional). If omitted, the document is sent to the standard output.
-v—Verbose mode (optional).
This example retrieves the document named inventory from the collection /db/books and stores it in a file called mybooks.xml:
xindiceadmin retrieve_document -c /db/books -n inventory -f mybooks.xml
The xpath command allows you to retrieve elements from documents in a collection that match an XPath expression:
xindiceadmin xpath -c collection -q query [-v]
-c collection—Collection to query.
-q query—XPath expression.
-v—Verbose mode (optional).
The xpath command has a limitation with documents that use namespaces: the command-line tool provides no way to bind namespace prefixes. This example retrieves entire documents using XPath:
xindiceadmin xpath -c /db/books -q /
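Queries more selective than / usually contain brackets and quotes, which the shell would otherwise interpret. The sketch below shows one way to quote such an expression. The query is hypothetical: it assumes documents whose book and author elements are not in a namespace, and the value Smith is made up.

```shell
# Hypothetical query: book elements whose author child is "Smith".
# Single quotes stop the shell from interpreting [, ], and the inner quotes.
query='//book[author="Smith"]'

# Print the command that would be run, with the query shown in quotes.
echo xindiceadmin xpath -c /db/books -q "'$query'"
```

To run the query for real, replace the echo line with: xindiceadmin xpath -c /db/books -q "$query"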
Of course, there are commands for deleting collections and documents.
The delete_collection command (you can use dc for brevity) deletes the named collection from its parent collection. This command is available only in xindiceadmin:
xindiceadmin delete_collection -c collection -n name [-y] [-v]
-c collection—The collection containing the collection to be deleted.
-n name—The name of the collection to be deleted.
-y—Don’t ask for confirmation (optional).
-v—Verbose mode (optional).
This example shows how to delete the books collection:
xindiceadmin delete_collection -c /db -n books
The delete_document command (you can use dd for brevity) deletes the named document from the collection:
xindiceadmin delete_document -c collection -n name [-v]
-c collection—The collection containing the document to delete.
-n name—The name of the document to delete.
-v—Verbose mode (optional).
This example shows how to delete the inventory document from the books collection:
xindiceadmin delete_document -c /db/books -n inventory
In order to speed access to documents stored in Xindice, you can create indexes.
The add_indexer command (you can use ai for brevity) allows you to index a collection for a particular pattern. To index more than one pattern, create multiple indexes. This command is available only in xindiceadmin:
xindiceadmin add_indexer -c collection -n name -p pattern [-pagesize pagesize] [-maxkeysize max-key-size] [-t index-type] [-v]
-c collection—The collection to index.
-n name—The name of the index being created.
-p pattern—The pattern used to create the index; the syntax of the pattern looks like this:
element-name indexes the value of the named element.
element-name@attribute-name indexes the value of the named attribute on the named element.
* indexes the values of all elements.
*@attribute-name indexes the value of the named attribute on any element.
element-name@* indexes the value of all the attributes of the named element.
*@* indexes the value of all attributes for all elements.
To specify elements that are in a namespace, write the element name like this: [namespace-uri]element-name. So, [http://sauria.com/schemas/apache-xml-book/books]book is the book element from the book inventory.
-maxkeysize max-key-size—The maximum key size for the index (defaults to 0) (optional).
-pagesize pagesize—The size of pages in the index (defaults to 4096 bytes) (optional).
-t index-type—The type of index (optional). If the -t option is omitted, then the type is either string or trimmed. If the pattern contains @, then the type is string; otherwise it’s trimmed. Possible values for index-type are:
string
trimmed
short
int
long
float
double
byte
char
boolean
name
-v—Verbose mode (optional).
This example shows how to build an index for the author elements of the book inventory schema:
xindiceadmin add_indexer -c /db/books -n authorIndex -p [http://sauria.com/schemas/apache-xml-book/books]author
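The pattern syntax and the default-type rule described above can be sketched as two small shell helpers. Both function names are hypothetical, and the isbn attribute is made up for illustration. Combining a namespace with an attribute, as index_pattern permits, is an assumption that the text above does not confirm.

```shell
# Build an index pattern from an element name, an optional attribute name,
# and an optional namespace URI. Use * for "any element" or "any attribute".
index_pattern() {
  element="$1"; attribute="$2"; namespace="$3"
  pattern="$element"
  [ -n "$namespace" ] && pattern="[$namespace]$pattern"
  [ -n "$attribute" ] && pattern="$pattern@$attribute"
  echo "$pattern"
}

# Default index type when -t is omitted: string if the pattern names an
# attribute (contains @), trimmed otherwise.
default_index_type() {
  case "$1" in
    *@*) echo string ;;
    *)   echo trimmed ;;
  esac
}

index_pattern book                  # prints: book
index_pattern book isbn             # prints: book@isbn
index_pattern '*' '*'               # prints: *@*
default_index_type 'book@isbn'      # prints: string
default_index_type author           # prints: trimmed
```

Quoting the pattern when you pass it to -p keeps the shell from expanding * or the square brackets.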
The list_indexers command (you can use li for brevity) displays all the indexers in use for a given collection. This command is available only in xindiceadmin:
xindiceadmin list_indexers -c collection [-v]
-c collection—The collection for which to list indexers.
-v—Verbose mode (optional).
This example shows how to list all the indexers on the books collection:
xindiceadmin list_indexers -c /db/books
The delete_indexer command (you can use di for brevity) allows you to remove an index from a collection. This command is available only in xindiceadmin:
xindiceadmin delete_indexer -c collection -n name [-v]
-c collection—The collection from which to delete the indexer.
-n name—The name of the indexer to delete.
-v—Verbose mode (optional).
This example shows how to remove the author index from the books collection:
xindiceadmin delete_indexer -c /db/books -n authorIndex
The next two commands don’t really fit into any of the categories that we’ve seen up to this point. The first deals with backing up the contents of the database, while the second is used to shut down the Xindice server.
The export command allows you to export the contents of a collection (including child collections and their children) to a directory:
xindiceadmin export -c collection -f directory-path [-v]
-c collection—The collection to be exported.
-f directory-path—The directory where the exported data should go; the directory must already exist.
-v—Verbose mode (optional).
This example shows how to export the books collection to a directory called archive:
xindiceadmin export -c /db/books -f archive
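Since the target directory must already exist, a backup script needs to create it before calling export. The sketch below does so with a dated subdirectory. backup_collection and the DRY_RUN switch are hypothetical conveniences, not Xindice features, and by default the function only prints the export command.

```shell
# Create a dated directory under $2 and export collection $1 into it.
# By default the export command is only printed; set DRY_RUN=0 to run it.
backup_collection() {
  collection="$1"
  dest="$2/$(date +%Y%m%d)"          # one subdirectory per day
  mkdir -p "$dest" || return 1       # export needs an existing directory
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "xindiceadmin export -c $collection -f $dest"
  else
    xindiceadmin export -c "$collection" -f "$dest"
  fi
}

# Usage (creates archive/<date> in the current directory):
#   backup_collection /db/books archive
```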
The shutdown command is used to shut down the Xindice server. This command is available only in xindiceadmin:
xindiceadmin shutdown -c collection [-v]
-c collection—The root collection of the database.
-v—Verbose mode (optional).
This example shows how to shut down the current Xindice server:
xindiceadmin shutdown -c /db
