Classify Native XML Databases by Content | Beginning XML Databases (Wrox Beginning Guides)

So, we know that XML documents contain data, metadata (data describing the data), and some semantics in the form of any inherent hierarchical structure. Now I want to describe XML document content a little further by describing its nature. XML documents can be document-centric, meaning they are written for human consumption. They can also be data-centric. A data-centric XML document is generally a chunk of data shared between computers. This also implies that data-centric information is more generic and is more likely to be processed under program control, using a transformation language such as XSL.

Document-Centric XML

A document-centric XML document is suitable for human consumption but is not easily understandable for a computer, if it is understandable at all. Document-centric XML covers the types of documents that would normally be written by hand by an author, such as a Word document, a PDF document, or even something like this book. These types of XML documents are generally stored in their entirety and are generally not accessed programmatically, or even by XML element content. Sometimes these types of documents are indexed for searching, such as in libraries of technical papers. However, some databases of technical papers tend to mix document-centric and data-centric data. For example, in a library database containing technical papers going back years, it would be sensible to categorize those documents based on subject matter, authors, dates written, and any other generic descriptive information. The content of those documents can also be indexed, creating generic information in the form of indexed phrases.
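As a rough illustration of the library example, the following Python sketch catalogs one document-centric paper by its generic descriptive information and builds a simple word index over its human-written body. The element names (paper, meta, body, para, em) and the sample text are assumptions made up for this sketch, not a format taken from any real library database, and the index is far simpler than what a real native XML database would build.

```python
import xml.etree.ElementTree as ET

# Hypothetical document-centric XML: a technical paper whose body is
# mixed content (text interleaved with inline markup).
PAPER = """<paper>
  <meta>
    <subject>Databases</subject>
    <author>A. Writer</author>
    <written>2005-03-01</written>
  </meta>
  <body>
    <para>Native XML databases store <em>whole documents</em> intact.</para>
    <para>Indexes make those documents <em>searchable</em>.</para>
  </body>
</paper>"""

root = ET.fromstring(PAPER)

# Generic descriptive information used to categorize the document.
catalog_entry = {child.tag: child.text for child in root.find("meta")}

# A simple inverted index over the body: word -> set of paragraph numbers.
word_index = {}
for number, para in enumerate(root.find("body").iter("para"), start=1):
    # itertext() flattens mixed content, so inline markup is ignored.
    text = "".join(para.itertext())
    for word in text.lower().replace(".", "").split():
        word_index.setdefault(word, set()).add(number)

print(catalog_entry)
print(sorted(word_index["documents"]))
```

The document itself is still stored and read as a whole. Only the small meta block and the derived index are the kind of generic information a program would query.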

A specialized type of document-centric native XML database is called a content management system. Content management database systems allow a certain amount of management and control over human-written XML data, which is stored in the XML data types of native XML databases.

Data-Centric XML

A data-centric XML document in its purest form uses XML as a method of transporting data between computers. In reality, XML is often a mixture of data-centric and document-centric content. The document-centric part is the human-written and human-readable part. The data-centric part is generic and program accessible because it is repetitive. Some of the best examples of data-centric documents are on websites such as Amazon and eBay. These sites contain pages and pages of information with varying levels of flexibility and different types of content. For example, for a book on Amazon, all the primary details such as title, ISBN, publisher, and dates are data-centric details. On the other hand, many books on Amazon are listed with PDF downloads of the actual book. The PDFs are document-centric: each is specific to a particular book, and a program can handle the PDF document as a whole but not the content inside it.
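The book example can be sketched in Python as follows. The catalog layout, element names, titles, ISBN placeholders, and PDF file names are all hypothetical, invented for this sketch, and do not reflect how Amazon or any other site actually structures its data. The point is only that the repetitive fields are easy to read into records, while the PDF is carried as an opaque reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-centric XML: the same fields repeat for every book.
CATALOG = """<books>
  <book>
    <title>Example Title One</title>
    <isbn>0000000001</isbn>
    <publisher>Example Press</publisher>
    <sample href="one.pdf"/>
  </book>
  <book>
    <title>Example Title Two</title>
    <isbn>0000000002</isbn>
    <publisher>Another House</publisher>
    <sample href="two.pdf"/>
  </book>
</books>"""


def load_books(xml_text):
    """Turn each repetitive <book> element into a plain record."""
    records = []
    for book in ET.fromstring(xml_text).iter("book"):
        records.append({
            "title": book.findtext("title"),
            "isbn": book.findtext("isbn"),
            "publisher": book.findtext("publisher"),
            # The PDF is only referenced here; its content is never parsed.
            "sample": book.find("sample").get("href"),
        })
    return records


books = load_books(CATALOG)

# Because every record has the same shape, generic queries are trivial.
from_example_press = [b["title"] for b in books if b["publisher"] == "Example Press"]
print(from_example_press)
```

A program can filter, sort, or transform these records without knowing anything about a particular book, whereas the PDF behind each sample reference remains something only a human reader would interpret.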