DocBook (http://www.docbook.org/) is an SGML application designed for new documents, not old ones. It's especially common in computer documentation, and no special tools are required to author it. Several O'Reilly books have been written in DocBook, including Norm Walsh and Leonard Muellner's DocBook: The Definitive Guide. Much of the Linux Documentation Project (LDP, http://www.linuxdoc.org/) corpus is written in DocBook. The current version of DocBook, 4.3, is available as both an SGML and an XML application. Example 6-2 shows a simple DocBook XML document based on the book you're reading now. Needless to say, the full version of this document would be much longer.
Example 6-2. A DocBook document
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
                      "docbook/docbookx.dtd">
<book>
  <title>XML in a Nutshell</title>
  <bookinfo>
    <author>
      <firstname>Elliotte Rusty</firstname>
      <surname>Harold</surname>
    </author>
    <author>
      <firstname>W. Scott</firstname>
      <surname>Means</surname>
    </author>
  </bookinfo>
  <toc>
    <tocchap><tocentry>Introducing XML</tocentry></tocchap>
    <tocchap><tocentry>XML as a Document Format</tocentry></tocchap>
    <tocchap><tocentry>XML as a "better" HTML</tocentry></tocchap>
  </toc>
  <chapter>
    <title>Introducing XML</title>
    <para></para>
  </chapter>
  <chapter>
    <title>XML as a Document Format</title>
    <para>
      XML is first and foremost a document format. It was always intended
      for web pages, books, scholarly articles, poems, short stories,
      reference manuals, tutorials, texts, legal pleadings, contracts,
      instruction sheets, and other documents that human beings would
      read. Its use as a syntax for computer data in applications like
      syndication, order processing, object serialization, database
      exchange and backup, electronic data interchange, and so forth is
      mostly a happy accident.
    </para>
    <sect1>
      <title>SGML's Legacy</title>
      <para></para>
    </sect1>
    <sect1>
      <title>TEI</title>
      <para></para>
    </sect1>
    <sect1>
      <title>DocBook</title>
      <para>
        <ulink url="http://www.docbook.org/">DocBook</ulink> is an SGML
        application designed for new documents, not old ones. It's
        especially common in computer documentation. Several O'Reilly
        books have been written in DocBook including <citation>Norm Walsh
        and Leonard Muellner's <citetitle>DocBook: The Definitive
        Guide</citetitle></citation>. Much of the
        <ulink url="http://www.linuxdoc.org/">Linux Documentation Project
        (LDP)</ulink> corpus is written in DocBook.
      </para>
    </sect1>
  </chapter>
  <chapter>
    <title>XML on the Web</title>
    <para></para>
  </chapter>
  <index>
    <indexentry>
      <primaryie>SGML, 8, 89</primaryie>
    </indexentry>
    <indexentry>
      <primaryie>DocBook, 95-98</primaryie>
    </indexentry>
    <indexentry>
      <primaryie>TEI (Text Encoding Initiative), 92-95</primaryie>
    </indexentry>
    <indexentry>
      <primaryie>Text Encoding Initiative</primaryie>
      <seeie>TEI</seeie>
    </indexentry>
  </index>
</book>
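Because a DocBook XML document is ordinary well-formed XML, a generic XML parser can read its structure without any DocBook-specific software. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree module and a trimmed copy of Example 6-2 with the DOCTYPE omitted so that no DTD has to be available. The function name outline and the constant DOCBOOK are illustrative names, not part of DocBook or of any tool mentioned in this chapter.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of Example 6-2. The DOCTYPE declaration is left out
# (an assumption of this sketch) so that no DTD file is required.
DOCBOOK = """\
<book>
  <title>XML in a Nutshell</title>
  <chapter>
    <title>Introducing XML</title>
    <para></para>
  </chapter>
  <chapter>
    <title>XML as a Document Format</title>
    <sect1><title>SGML's Legacy</title><para></para></sect1>
    <sect1><title>TEI</title><para></para></sect1>
    <sect1><title>DocBook</title><para></para></sect1>
  </chapter>
</book>
"""


def outline(source):
    """Return (book title, [(chapter title, [sect1 titles])]) for a DocBook book."""
    book = ET.fromstring(source)
    chapters = []
    for chapter in book.findall("chapter"):
        # Collect the title of each first-level section in this chapter.
        sections = [sect.findtext("title") for sect in chapter.findall("sect1")]
        chapters.append((chapter.findtext("title"), sections))
    return book.findtext("title"), chapters


if __name__ == "__main__":
    title, chapters = outline(DOCBOOK)
    print(title)
    for chapter_title, sections in chapters:
        print("  " + chapter_title)
        for section in sections:
            print("    " + section)
```

The sketch only looks at title, chapter, and sect1 elements. A real tool would need to handle many more of DocBook's elements, but the approach of walking the element tree stays the same.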
DocBook offers many advantages to technical authors. First and foremost, it's open and nonproprietary, and it can be created with any text editor. It would feel a little silly to write open source documentation for open source software with closed and proprietary tools like Microsoft Word (which is not to say this hasn't been done). If your documents are written in DocBook, they aren't tied to any one platform, vendor, or application. They're portable across essentially any environment you can imagine.
Not only is DocBook theoretically editable with basic text editors, it's simple enough that such editing is practical as well. One of us (Harold) wrote an entire 1,200-page book in DocBook by hand in jEdit (Processing XML with Java, Addison-Wesley, 2002). Of course, if you'd like a little help, there are a number of free tools available, including an Emacs major mode (http://www.nwalsh.com/emacs/docbookide/index.html). Furthermore, like many good XML applications, DocBook is modular. You can use the pieces you need and ignore the rest. If you need tables, there's a very complete tables module. If you don't need tables, you don't need to know about or use this module. Other modules cover various entity sets and equations.
DocBook is an authoring format, not a format for finished presentation. Before a DocBook document is read by a person, it is converted to any of several formats, including the following:
HTML
XSL Formatting Objects
Rich Text Format (RTF)
TeX
For example, if you want high-quality printed documentation for a program, you can convert a DocBook document to TeX, then use the standard TeX tools to convert the resulting TeX file to a DVI and/or PostScript file and print that. If you just want to read it on your computer, then you'd probably convert it to HTML and load it into your web browser. For other purposes, you'd pick something else. With DocBook, all these formats come essentially for free. It's very easy to produce multiple output documents in different formats from a single DocBook source document. Indeed, this benefit isn't limited to DocBook. Most well-thought-out XML input formats are just as easy to publish in other formats.
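The idea of one source and several outputs can be illustrated with a toy converter. The sketch below is not how DocBook documents are normally published (existing stylesheets and converters do that job far more completely). It assumes Python's standard library only and handles just three elements: sect1, title, and para, plus ulink inside a para. The function names to_html and to_text and the sample SOURCE are hypothetical names chosen for this illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# A single DocBook section used as the shared source for both outputs.
SOURCE = """\
<sect1>
  <title>DocBook</title>
  <para><ulink url="http://www.docbook.org/">DocBook</ulink> is an SGML
  application designed for new documents, not old ones.</para>
</sect1>
"""


def inline_html(elem):
    """Render the mixed content of elem (text and child elements) as HTML."""
    parts = [escape(elem.text or "", quote=False)]
    for child in elem:
        if child.tag == "ulink":
            href = escape(child.get("url", ""), quote=True)
            parts.append('<a href="%s">%s</a>' % (href, inline_html(child)))
        else:
            # Unknown inline element: keep its text, drop the markup.
            parts.append(inline_html(child))
        parts.append(escape(child.tail or "", quote=False))
    return "".join(parts)


def to_html(section):
    """Convert a sect1 element to a heading followed by paragraphs."""
    out = ["<h1>%s</h1>" % inline_html(section.find("title"))]
    for para in section.findall("para"):
        # Collapse the line breaks and indentation of the source file.
        out.append("<p>%s</p>" % " ".join(inline_html(para).split()))
    return "\n".join(out)


def to_text(section):
    """Convert the same sect1 element to underlined plain text."""
    title = "".join(section.find("title").itertext())
    lines = [title, "=" * len(title)]
    for para in section.findall("para"):
        lines.append(" ".join("".join(para.itertext()).split()))
    return "\n".join(lines)


if __name__ == "__main__":
    section = ET.fromstring(SOURCE)
    print(to_html(section))
    print()
    print(to_text(section))
```

Both functions read the same parsed tree, so adding a third output format means writing one more function rather than maintaining a second copy of the document.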