The first step in building your own custom markup language is to lay down the ground rules to which all XML documents belonging to your custom markup language must adhere. Specifically, you need to design a model of which elements, attributes, and entities can appear in your XML documents, in what order, and how many times. I refer to the model of a custom markup language as an XML content model or simply as a content model. A content model is the blueprint or schema for your own custom family of XML documents. You use it similarly to the way you use a database schema, which defines a data model for tabular data, or a class definition, which defines a model for software objects. The ground rules of a content model must be expressed in a schema language such as DTD or XML Schema, and in this chapter, I show you how to build content models by using DTDs.
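To make the idea concrete before the chapter develops its own model, here is a minimal sketch of what such ground rules might look like in a DTD. Every name in it (book, title, author, publisher, isbn, and the pub entity) is a hypothetical choice made for illustration, not the content model this chapter goes on to build.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical book markup language; all names are illustrative assumptions. -->
<!DOCTYPE book [
  <!-- Which elements, in what order, and how many times:
       one title, then one or more authors, then an optional publisher -->
  <!ELEMENT book (title, author+, publisher?)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT publisher (#PCDATA)>
  <!-- Which attributes: every book must carry an isbn attribute -->
  <!ATTLIST book isbn CDATA #REQUIRED>
  <!-- Which entities: a reusable piece of replacement text -->
  <!ENTITY pub "Example Press">
]>
<book isbn="0-000-00000-0">
  <title>An Example Title</title>
  <author>First Author</author>
  <author>Second Author</author>
  <publisher>&pub;</publisher>
</book>
```

The declarations between the square brackets play the role of the content model, and the book element that follows is one document that conforms to it.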
As it turns out, the XML documents you worked with in Chapters 1 and 2 were document fragments: well-formed snippets of XML code that didn’t belong to any particular class of document. In the book example in Chapter 2, you expressed a book’s description in an XML format, but you never explicitly specified any rules for how a book’s information should be represented. In this chapter, however, you build a content model for describing books (a book markup language) that lets you create XML documents that describe books in a clear, consistent, and structured manner. By defining a content model for your books, you can differentiate your book XML documents from any other XML document and determine whether a given XML document is valid (that is, whether it abides by all the rules and restrictions specified in the content model). Content models are also sometimes referred to simply as schemas. In this book, I use the term content model to avoid any potential confusion with XML Schemas.
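The difference between well-formed and valid can be illustrated with a small sketch. The element names and rules below are assumptions invented for this example; the point is only that a document can obey XML syntax and still break the rules of its content model.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book [
  <!-- Hypothetical rule: a book is one title followed by one or more authors -->
  <!ELEMENT book (title, author+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
]>
<!-- Well-formed, but not valid: author appears before title,
     and the price element is never declared in the DTD -->
<book>
  <author>First Author</author>
  <title>An Example Title</title>
  <price>10.00</price>
</book>
```

A non-validating parser would accept this document because every tag is properly nested and closed. A validating parser would be expected to report both the ordering problem and the undeclared element.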
Here are some of the reasons why you should develop a content model for your XML documents:
Improved interoperability: If everyone created individual markup languages without creating associated content models, the overall worldwide level of application interoperability would be practically the same as if no one used XML in the first place. Software application interoperability requires informing others (both programmers and applications) about the kind of input and output required to work with your application. All this information must be expressed in a content model.
Enhanced editing support: By working with XMLSPY and a content model (typically expressed in either a DTD or an XML Schema), you gain automated application development support, including visual editing, code completion, code generation, and database schema generation.
Good programming practice: A content model is your data interface. Any distributed Internet application will likely require programming of clients (consumers) and servers (producers) of data. If you specify a data interface (content model) prior to developing the client and server applications, different programming teams can independently implement their respective programming contracts, resulting in reduced development times.
Writing out information and data within angle brackets, as specified by the XML syntax, is necessary but not sufficient. For XML-encoded information to have meaning and be understood, you must first design a conceptual model of the information being conveyed. You must then clearly specify the set of permissible data elements and the structure to which an XML document must conform to be considered a member of a particular markup language.
Cross-Reference: This chapter explains how to use a DTD to build a straightforward content model for books. In industry today, the development of XML content models to describe business processes is of great interest to many companies. Global consortiums, such as the World Wide Web Consortium (W3C) and the Organization for the Advancement of Structured Information Standards (OASIS), bring together a diverse group of product, service, and systems integration companies. These consortiums drive the development, convergence, and adoption of XML content models to describe business information such as purchase orders, stock quotations, and research reports: essentially any information communicated from one source to another. A listing of the most widely used XML content models (both DTDs and XML Schemas) in industry is included in Appendix B.