SGML: The Good Stuff


The International Organization for Standardization (ISO) creates and maintains standards that help the world's businesses do business. The ISO owns screw-thread standards, for example, that make it possible to order a bolt from a vendor in Luxembourg that will be compatible with a nut from a company in Taiwan.

The Standard Generalized Markup Language (SGML) is a standard owned by the ISO. SGML was created to allow the sharing of information between companies that might have different systems. IBM, DEC, and the U.S. Internal Revenue Service (IRS) were the big players when the SGML specification was being developed in the early 1980s. As the standard developed, other large industrial companies and federal government agencies became involved in the specification. One of these big players was the U.S. Department of Defense (DoD).

The DoD wanted to lower the cost of transferring contracts from an incumbent to a new contractor. The cost of converting documents from an existing contractor's system to a new contractor's system was sometimes so high that incumbents could keep underbidding new contractors—the incumbents didn't have to factor the cost of conversion into their bids. This led the DoD to look for a standard document technology, the implementation of which could be used as a condition for winning a contract. SGML became that technology. Now contractors must deliver their supporting documents as SGML. Using this standard, new contractors taking over legacy projects can easily read these supporting documents. This early adoption by the DoD and IRS led some people to call SGML the "Standard Government Markup Language." (I've also heard SGML called "Sounds good, maybe later" by people who were reluctant to implement it because of its complexity. Actual implementers claim that it stands for "Someone get my lithium.") Other industries followed suit, using SGML as a syntax for communication between players in a given industry.

SGML was the first standard technology that allowed users to separate data from the processes that acted on it. With SGML, users can go through a process called information analysis to discover the structure and content of their data. A vocabulary called a document type definition (DTD) is then developed from that analysis. The DTD defines a class of information, so each DTD is customized for each set of data. The DTD indicates the contents of objects in the information set by using a precise syntax called the content model.

Because each information set has different requirements and different objects, the DTD for describing each set is different. For example, a DTD that describes maintenance procedures for aircraft might have elements such as access cover, procedure, and fuse. A DTD that describes a training course will have elements such as objectives, question, and answer. Neither of these DTDs would have the format-oriented elements that HTML has, such as center and bold. The DTD does not contain details about how these elements eventually appear, since that would limit the DTD's usefulness. The processing software adds formatting and processing instructions to the information only when the output method is known.
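To make the idea of a content model concrete, here is a minimal sketch in Python. It is not an SGML parser: the element names (course, item, objectives, question, answer), the CONTENT_MODELS table, and the validate function are all hypothetical, and the sample documents use fully tagged, XML-style markup so that Python's standard library can read them. Each content model is written as a regular expression over the names of an element's children, with the roughly equivalent DTD declaration shown in a comment.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models, one per element, written as regular
# expressions over the space-terminated names of an element's children.
# Rough DTD equivalents:
#   <!ELEMENT course     - - (objectives, item+)>
#   <!ELEMENT item       - - (question, answer)>
#   <!ELEMENT objectives - - (#PCDATA)>   (likewise question and answer)
CONTENT_MODELS = {
    "course": r"objectives (item )+",
    "item": r"question answer ",
    "objectives": r"",  # text only, no child elements
    "question": r"",
    "answer": r"",
}


def validate(element):
    """Return a list of content-model violations at or below element."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return [f"undeclared element: {element.tag}"]
    errors = []
    children = "".join(child.tag + " " for child in element)
    if not re.fullmatch(model, children):
        found = children.strip() or "(none)"
        errors.append(f"bad content in <{element.tag}>: {found}")
    for child in element:
        errors.extend(validate(child))
    return errors


GOOD = ("<course><objectives>Replace a fuse</objectives>"
        "<item><question>Q1</question><answer>A1</answer></item></course>")
BAD = "<course><item><answer>A1</answer></item></course>"

print(validate(ET.fromstring(GOOD)))  # no violations
print(validate(ET.fromstring(BAD)))   # course and item both break their models
```

Note that nothing in the table says how a question or an answer should look on the page; the vocabulary describes only what the information is and how it nests.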

SGML gives companies the ability to leverage a single information asset by applying different processes to it. In this way, companies can "create once, publish many": they create their data once and publish it many times. For example, the designer of a training course could use the same SGML source to create the student guide and the instructor guide, since most of the guides' information is shared. The typesetter would print only the questions in the student guide, but both the questions and the answers in the instructor guide.
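The student guide and instructor guide example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a real publishing pipeline: the source is written in XML-style markup as a stand-in for SGML (Python's standard library has no SGML parser), and the element names and the render function are hypothetical.

```python
import xml.etree.ElementTree as ET

# One source document; it says nothing about how either guide looks.
SOURCE = """\
<course>
  <item>
    <question>What does a fuse protect?</question>
    <answer>The circuit it is wired into.</answer>
  </item>
  <item>
    <question>What must be removed to reach the fuse?</question>
    <answer>The access cover.</answer>
  </item>
</course>
"""


def render(source, include_answers):
    """Format the same source as a student or an instructor guide."""
    course = ET.fromstring(source)
    lines = []
    for number, item in enumerate(course.iter("item"), start=1):
        lines.append(f"{number}. {item.findtext('question')}")
        if include_answers:
            lines.append(f"   Answer: {item.findtext('answer')}")
    return "\n".join(lines)


student_guide = render(SOURCE, include_answers=False)
instructor_guide = render(SOURCE, include_answers=True)
print(student_guide)
print(instructor_guide)
```

The formatting decisions live entirely in render; the source is untouched, so a third output (a web page, say) would need only another process, not another copy of the data.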

An SGML document consists of a simple ASCII stream of markup and content. A parser reads the document and determines the structure of the information by identifying the markup and noting the content inside it. Because an SGML document is plain ASCII, it is portable and can be processed on any platform that has a parser. You'll see that XML has many of these same capabilities.
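The way a parser recovers structure from a plain character stream can be sketched with a toy tokenizer in Python. This is a deliberately simplified illustration, not a conforming SGML parser: it assumes every element has an explicit start tag and end tag, and it ignores attributes, entities, and the tag omission that real SGML permits. The names TOKEN and parse are hypothetical.

```python
import re

# A token is a start tag, an end tag, or a run of character data.
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)>|([^<]+)")


def parse(stream):
    """Return the document as nested (name, children) tuples.

    Children are either text strings or further (name, children) tuples.
    """
    root = ("#document", [])
    stack = [root]  # the elements currently open, innermost last
    for closing, name, text in TOKEN.findall(stream):
        if text:
            if text.strip():
                stack[-1][1].append(text.strip())
        elif closing:
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError(f"unexpected end tag: </{name}>")
            stack.pop()
        else:
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"unclosed element: <{stack[-1][0]}>")
    return root[1]


tree = parse("<procedure><fuse>F12</fuse> Remove the access cover.</procedure>")
print(tree)
```

The input is nothing but characters, which is why it travels between systems so easily; the structure exists only once a parser has separated the markup from the content.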

Because of its status as an ISO standard, SGML is stable and difficult to change. Every five years, each ISO standard must be reviewed by the committee that created and maintains it to see whether that standard is still required and, if so, whether it needs to be updated. This review process works fine for screw threads, but the business of information management changes quite a bit faster than once every five years, which brings us to one of the problems with SGML that I'll discuss in the next section.



XML and SOAP Programming for BizTalk(TM) Servers (DV-MPS Programming)
ISBN: 0735611262
Year: 2000
Pages: 150
