6.1 SGML's Legacy
XML is a simplified form of the Standard Generalized Markup Language (SGML). The language that would eventually become SGML was invented by Charles F. Goldfarb, Ed Mosher, and Ray Lorie at IBM in the 1970s and developed by many people around the world until its eventual adoption as ISO standard 8879 in 1986. SGML was intended to solve many of the same problems XML solves, and in much the same way. It was and is a semantic and structural markup language for text documents. SGML is extremely powerful and achieved some success in the U.S. military and government, in the aerospace sector, and in other domains that needed ways of efficiently managing technical documents that were tens of thousands of pages long.
SGML's biggest success was HTML, which was and is an SGML application. However, HTML is just one SGML application. It does not have anything close to the full power of SGML itself. SGML has also been used to define many other document formats, including DocBook and TEI, both of which we'll discuss shortly.
However, SGML is complicated: very, very complicated. The official SGML specification is over 150 very technical pages. It covers many special cases and unlikely scenarios. It is so complex that almost no software has ever implemented it fully. Programs that implement or rely on different subsets of SGML are often incompatible. The special feature that one program considers essential is all too often considered extraneous fluff and omitted by the next program. Nonetheless, experience with SGML taught developers a lot about the proper design, implementation, and use of markup languages for a wide variety of documents. Much of that general knowledge applies equally well to XML.
One thing all this should make clear is that XML documents aren't just used on the Web. XML can easily handle the needs of publishing in a variety of media, including books, magazines, journals, newspapers, and pamphlets. XML is particularly useful when you need to publish the same information in several of these formats. By applying different stylesheets to the same source document, you can produce web pages, speaker's notes, camera-ready copy for printing, and more.
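The idea of one source document feeding several output formats can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `talk`, `title`, and `point` elements and the `note` attribute are invented for the example, and the two functions are hypothetical stand-ins for stylesheets, not a real stylesheet language such as XSLT or CSS.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the element and attribute names are
# invented for this sketch and do not come from any real vocabulary.
SOURCE = """<talk>
  <title>SGML's Legacy</title>
  <point note="Mention ISO 8879 (1986).">XML is a simplified form of SGML.</point>
  <point note="HTML is the best-known example.">SGML applications include HTML, DocBook, and TEI.</point>
</talk>"""


def to_web_page(root):
    """Stand-in for a web stylesheet: render the talk as an HTML fragment."""
    html = ET.Element("div")
    ET.SubElement(html, "h1").text = root.findtext("title")
    items = ET.SubElement(html, "ul")
    for point in root.iter("point"):
        # The presenter's notes are deliberately left out of the web view.
        ET.SubElement(items, "li").text = point.text
    return ET.tostring(html, encoding="unicode")


def to_speaker_notes(root):
    """Stand-in for a notes stylesheet: plain text including the notes."""
    lines = [root.findtext("title").upper()]
    for number, point in enumerate(root.iter("point"), start=1):
        lines.append(f"{number}. {point.text}")
        lines.append(f"   note: {point.get('note')}")
    return "\n".join(lines)


# Parse the source once, then apply each "stylesheet" to the same tree.
root = ET.fromstring(SOURCE)
web_page = to_web_page(root)
speaker_notes = to_speaker_notes(root)
print(web_page)
print(speaker_notes)
```

The source is parsed once and never modified. Each function decides for itself what to show and how, which is the separation of content from presentation that makes publishing to several media practical.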