Item 26. Version Documents, Schemas, and Stylesheets

XML applications evolve over time just like traditional software applications. It's almost inevitable that some months after releasing an application, you'll have a better understanding of the domain your documents represent. You'll undoubtedly discover mistakes and inefficiencies in the original design. Requirements will shift and grow and vanish. Laws, regulations, and business practices can all change in ways that require you to modify XML applications. XML may not be a programming language, but XML applications are subject to all the same software engineering headaches associated with programs written in Java, C++, COBOL, and other languages. Consequently, it's important to use the same practices to track the components of your XML systems as you would any other equivalently complex piece of software. Use version control systems like CVS for your schemas and stylesheets, and follow the same release naming and numbering conventions as you would for a program written in Java, C++, or any other language.

Within the XML documents, schemas, and stylesheets themselves, it's generally a good idea to identify the version of the application they adhere to. XML 1.0 itself recognized its likely need to evolve and included a version number in its XML declaration so that parsers could recognize what version of XML was in use. So far that hasn't been needed, but it probably will be soon.

You should do the same in your own documents. Each XML application should include a version element or attribute that can contain a reasonable version number. Typically this would be placed on the root element, as shown here.

    <?xml version="1.0"?>
    <Statement xmlns="" version="1.0">
      ...
    </Statement>
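On the consuming side, reading such a version attribute is straightforward. The following is a minimal Python sketch using the standard library's ElementTree. The namespace URI and the helper name `document_version` are hypothetical, chosen only for illustration, since the real namespace URI is not given here.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; substitute the one your application really uses.
STATEMENT_NS = "http://example.com/ns/statement"

SAMPLE = f"""<?xml version="1.0"?>
<Statement xmlns="{STATEMENT_NS}" version="1.0">
  <Owner>John Doe</Owner>
</Statement>"""


def document_version(xml_text, default=None):
    """Return the root element's version attribute, or default if it is absent."""
    root = ET.fromstring(xml_text)
    # Unprefixed attributes are in no namespace, so a plain key is enough.
    return root.get("version", default)


if __name__ == "__main__":
    print(document_version(SAMPLE))
```

Returning a caller-supplied default rather than raising is a design choice; a stricter application might prefer to reject documents that carry no version at all.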

However, in cases where a document combines elements from multiple applications and namespaces, you should probably put the version attribute on each top-most element from a particular vocabulary. For example, the following statement uses the Statement vocabulary, an Address vocabulary, and XHTML, all with separate versions.

    <Statement xmlns="" version="1.0">
      <Bank>
        <Logo href="logo.jpg" height="125" width="125"/>
        <Name>MegaBank</Name>
        <Motto>We Really Pretend to Care</Motto>
        <Branch>
          <Address xmlns=""
                   version="1.2">
            <Street>666 Fifth Ave.</Street>
            <City>New York</City>
            <State>NY</State>
            <PostalCode>10010</PostalCode>
            <Country>USA</Country>
          </Address>
        </Branch>
      </Bank>
      <Account>
        <Number>00003145298</Number>
        <Type>Savings</Type>
        <Owner>John Doe</Owner>
        <Address xmlns=""
                 version="1.2">
          <Street>123 Peon Way</Street>
          <Apt>28Z</Apt>
          <City>Brooklyn</City>
          <State>NY</State>
          <PostalCode>11239</PostalCode>
          <Country>USA</Country>
        </Address>
      </Account>
      <Date>2003-30-02</Date>
      <AccountActivity>
        ...
      </AccountActivity>
      <Legal version="1.0 Transitional">
        <html xmlns="">
          <body style="font-size: xx-small">
            ...
          </body>
        </html>
      </Legal>
    </Statement>

There's generally no need to repeat the version on each element.
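A program that receives such a mixed document may want to know which version of each vocabulary it is looking at. The sketch below walks the tree and records the version attribute wherever the namespace changes from parent to child, that is, on the topmost element of each vocabulary. It assumes each vocabulary has its own namespace URI. The two URIs shown are hypothetical placeholders, and the function names are mine.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for this illustration.
SAMPLE = """<Statement xmlns="http://example.com/ns/statement" version="1.0">
  <Bank><Branch>
    <Address xmlns="http://example.com/ns/address" version="1.2">
      <City>New York</City>
    </Address>
  </Branch></Bank>
  <Account>
    <Address xmlns="http://example.com/ns/address" version="1.2">
      <City>Brooklyn</City>
    </Address>
  </Account>
</Statement>"""


def namespace_of(element):
    """Extract the namespace URI from ElementTree's '{uri}local' tag form."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def versions_by_vocabulary(root):
    """Map each namespace URI to the set of versions found on its topmost elements."""
    found = {}

    def visit(element, parent_ns):
        ns = namespace_of(element)
        # Only an element that starts a new vocabulary is expected to carry a version.
        if ns != parent_ns and "version" in element.attrib:
            found.setdefault(ns, set()).add(element.get("version"))
        for child in element:
            visit(child, ns)

    visit(root, None)
    return found


if __name__ == "__main__":
    print(versions_by_vocabulary(ET.fromstring(SAMPLE)))
```

Collecting a set per namespace makes it easy to notice a document that mixes two versions of the same vocabulary, which is usually worth flagging.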

Most of the time, a simple numeric versioning system as used in most software (0.1, 0.2, 0.9, 1.1, 2.0, and so on) works well enough. This should normally be treated as a string because it may contain letters (e.g., 1.0a1, 1.0b2) or may not exactly fit as a decimal (e.g., 1.0.1, 1.5.3). There are a number of related conventions for how to version software. Any of them works as long as you're consistent.
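Treating the version as a string does not prevent ordering versions when you need to. Here is one possible Python sketch that turns a version string into a sort key. It assumes one particular convention, which is only one of several in use: dotted numeric components, optionally followed by a suffix such as a1 or b2 that marks a pre-release sorting before the plain release. The function name `version_key` is hypothetical.

```python
import re


def version_key(version):
    """Turn '1.0.1' or '1.0b2' into a tuple that sorts in release order."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(.*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    suffix = match.group(2)
    # Assumed convention: a suffix ("a1", "b2") means pre-release, so a
    # version with an empty suffix sorts after one with a suffix.
    return (numbers, suffix == "", suffix)


if __name__ == "__main__":
    versions = ["2.0", "1.0", "1.0.1", "1.0b2", "1.0a1", "1.5.3"]
    print(sorted(versions, key=version_key))
```

Comparing integer components rather than decimals also avoids the trap where 1.10 would sort before 1.9.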

Other options are possible. For example, some developers prefer to use the date when a particular application was released.

    <?xml version="1.0"?>
    <Statement xmlns=""
               version="20020824">
      ...
    </Statement>

Use whatever form works best for you, but do use something.
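If you do choose date-based versions, a fixed format such as YYYYMMDD keeps them easy to parse and compare. A minimal sketch, assuming that eight-digit format (the helper name `release_date` is hypothetical):

```python
from datetime import date, datetime


def release_date(version):
    """Parse a YYYYMMDD version string such as '20020824' into a date object."""
    return datetime.strptime(version, "%Y%m%d").date()


if __name__ == "__main__":
    print(release_date("20020824") < release_date("20030101"))
```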

Instead of an attribute, some applications use the public ID in the document type declaration. For example, XHTML documents can begin in any of the following ways, all of which identify different variations and profiles of XHTML. (See Item 23.)

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
        "">
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
        "">
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "">
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
        "">
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 2.0//EN" "xhtml20.dtd">
    <!DOCTYPE html PUBLIC
        "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
        "" >

This works well enough as long as you expect documents to carry document type declarations and to be fairly independent. However, it's much more problematic when you begin mixing and matching several DTDs into the same document. You need to define (and applications need to recognize) new public IDs for each separate combination. Furthermore, including a document type declaration may convince some tools that they actually have to read and process the DTD in order to handle the document. Worse yet, not all APIs provide reliable access to the public ID. Version attributes are much more reliable and portable and much less fraught with unintended side effects.
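To make the API point concrete, here is a small Python sketch that reads the public ID through the standard library's DOM implementation, which does expose the document type declaration. The system identifier xhtml11.dtd in the sample is a placeholder of mine, and `public_id` is a hypothetical helper name.

```python
from xml.dom import minidom

# The system identifier "xhtml11.dtd" is a placeholder for illustration.
SAMPLE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Statement</title></head>
  <body/>
</html>"""


def public_id(xml_text):
    """Return the public ID from the document type declaration, or None."""
    document = minidom.parseString(xml_text)
    doctype = document.doctype  # None when the document has no DOCTYPE
    return doctype.publicId if doctype is not None else None


if __name__ == "__main__":
    print(public_id(SAMPLE))
```

The same document loaded with ElementTree yields plain element objects that carry no doctype information, so code built on that API would need the version from somewhere else, such as an attribute.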

One thing you should probably not do is change the namespace URI with each new version. Too much other software depends on the namespace for proper operation. Changing it can unnecessarily break all these applications, many or most of which will function properly with the new version. Unless the changes in the new version of the application are so radical that essentially an entire new language has been invented, keep the namespace URI constant.
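The following Python sketch shows the kind of dependency at stake. A consumer typically recognizes documents by namespace URI first and only then looks at the version. The namespace URI, the rule that any 1.x document is acceptable, and the default of 1.0 for a missing attribute are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

STATEMENT_NS = "http://example.com/ns/statement"  # hypothetical namespace URI
SUPPORTED_MAJOR = 1  # assumed policy: accept any 1.x document


def can_process(xml_text):
    """Accept documents in the known namespace whose major version is supported."""
    root = ET.fromstring(xml_text)
    # Recognition keys on the namespace; a changed URI fails right here.
    if root.tag != f"{{{STATEMENT_NS}}}Statement":
        return False
    # Assumption: a missing version attribute is treated as version 1.0.
    major = root.get("version", "1.0").split(".", 1)[0]
    return major.isdigit() and int(major) <= SUPPORTED_MAJOR


if __name__ == "__main__":
    doc = f'<Statement xmlns="{STATEMENT_NS}" version="1.7"/>'
    print(can_process(doc))
```

With the namespace held constant, a version 1.7 document still reaches code written against 1.0. Had the URI changed with the release, the first test would reject the document even though the code might have handled it correctly.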

You should also identify the version in use in the ancillary documents for the application: schemas, stylesheets, documentation, and so on. Here you need to be careful not to confuse the version of the XML vocabulary in which the document is written and the version of the XML vocabulary for which the document is intended. For example, here's an XSLT stylesheet that's written in XSLT 1.0 but processes DocBook 4.1.2.

    <?xml version="1.0" encoding="US-ASCII"?>
    <xsl:stylesheet xmlns:xsl=""
                    version="1.0">
      <rdf:RDF
        xmlns:rdf=""
        xmlns:dc="">
        <rdf:Description about="">
          <dc:title>docbook.xsl</dc:title>
          <dc:date>2002-04-23</dc:date>
          <dc:relation>
            -//OASIS//DTD DocBook XML V4.1.2//EN
          </dc:relation>
        </rdf:Description>
      </rdf:RDF>
      <!-- rest of stylesheet... -->
    </xsl:stylesheet>

In this example, I placed the version information in a top-level element from a non-XSLT namespace, which an XSLT processor is guaranteed to ignore. I also identified the version of DocBook the stylesheet applies to using a formal public identifier. I got a little fancy here (possibly too fancy) and used RDF and the Dublin Core, but any non-XSLT namespace would suffice.

One of my favorite stylesheet tricks is to put the version information in global variables. This allows it to be referenced from the stylesheet itself so it can be moved to the output. Here we use two variables, one for the version of the stylesheet itself and one for the version of the application it processes.

    <!-- xsl:variable takes its value from select (or from its content);
         the inner quotes make each value a string rather than a number. -->
    <xsl:variable name="stylesheet-version" select="'1.73'"/>
    <xsl:variable name="application-version" select="'1.1.2'"/>

    <xsl:template name="VersionInfo">
      Output produced by the MegaBank Statement Stylesheet
      <xsl:value-of select="$stylesheet-version"/>
      for MegaBank Statement
      <xsl:value-of select="$application-version"/>
    </xsl:template>

CSS stylesheets present much less opportunity to embed version information. Here you essentially must use a comment, as demonstrated below.

    /* MegaBank Statement CSS stylesheets 1.73 for MBSML 1.1.2 */
    book { display: block; font-size: 12pt;
           font-family: Times, serif }
    /* ... */

DTDs are similarly limited. They provide no formal way to identify the version of the DTD. Normally a comment suffices.

 <!--  MegaBank Account Statement DTD, Version 1.1.2 --> 

See Item 5 for more suggestions about how to comment a DTD.
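Because a comment is free text, tools can recover the version only if the comment follows a predictable pattern. The Python sketch below assumes a house convention in which some comment contains the word Version followed by a dotted number, as in the DTD comment above. The convention and the helper name `dtd_version` are assumptions, not anything the DTD syntax defines.

```python
import re


def dtd_version(dtd_text):
    """Return the dotted number after 'Version' in the first comment that has one."""
    # Examine each comment separately so a match never spans two comments.
    for comment in re.findall(r"<!--(.*?)-->", dtd_text, re.DOTALL):
        match = re.search(r"\bVersion\s+(\d+(?:\.\d+)*)", comment)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    sample = (
        "<!--  MegaBank Account Statement DTD, Version 1.1.2 -->\n"
        "<!ELEMENT Statement ANY>"
    )
    print(dtd_version(sample))
```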

The W3C XML Schema Language, by contrast, provides an xsd:annotation element that is perfect for storing version information and other meta-information about the schema. Generally, its xsd:documentation child is used for human-readable metadata and its xsd:appinfo child is used for machine-processable metadata. These may contain any elements from any vocabulary other than the W3C XML Schema Language itself. For example, the following schema has a top-level xsd:annotation element providing both RDF and human-readable metadata about the schema.

    <xsd:schema xmlns:xsd="">
      <xsd:annotation>
        <xsd:appinfo>
          <rdf:RDF
            xmlns:rdf=""
            xmlns:dc="">
            <rdf:Description about="">
              <dc:title>docbook.xsd</dc:title>
              <dc:date>2002-04-23</dc:date>
              <dc:relation>
                -//OASIS//DTD DocBook XML V4.1.2//EN
              </dc:relation>
            </rdf:Description>
          </rdf:RDF>
        </xsd:appinfo>
        <xsd:documentation>
          Schema version 1.5 for DocBook XML 4.1.2
        </xsd:documentation>
      </xsd:annotation>
      <!-- rest of schema... -->
    </xsd:schema>

Schema processors will normally ignore xsd:annotation elements, but other processes reading the schema can easily extract the relevant metadata, including the version information.
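As a rough illustration of such a process, the following Python sketch pulls the human-readable note and two Dublin Core fields out of a schema's top-level annotation. The namespace URIs used are the standard ones for XML Schema, RDF, and Dublin Core, filled in on my own assumption since they are not shown here. The sample schema is abbreviated, and `schema_metadata` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs, assumed here for the sake of a runnable example.
NS = {
    "xsd": "http://www.w3.org/2001/XMLSchema",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

SAMPLE = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:annotation>
    <xsd:appinfo>
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
        <rdf:Description rdf:about="">
          <dc:title>docbook.xsd</dc:title>
          <dc:date>2002-04-23</dc:date>
        </rdf:Description>
      </rdf:RDF>
    </xsd:appinfo>
    <xsd:documentation>
      Schema version 1.5 for DocBook XML 4.1.2
    </xsd:documentation>
  </xsd:annotation>
</xsd:schema>"""


def schema_metadata(schema_text):
    """Collect version-related metadata from the schema's top-level annotation."""
    root = ET.fromstring(schema_text)
    annotation = root.find("xsd:annotation", NS)
    if annotation is None:
        return {}
    note = annotation.findtext("xsd:documentation", default="", namespaces=NS)
    return {
        # Collapse the indentation whitespace around the human-readable note.
        "documentation": " ".join(note.split()),
        "title": annotation.findtext(".//dc:title", namespaces=NS),
        "date": annotation.findtext(".//dc:date", namespaces=NS),
    }


if __name__ == "__main__":
    print(schema_metadata(SAMPLE))
```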

Other components of an XML-based system, such as processing software, RDDL documentation, schemas in other languages like RELAX NG, and so forth, can be handled similarly. The main thing to remember is that your applications will change, whether you expect them to or not. It only makes sense to be ready for them to do so.

Effective XML: 50 Specific Ways to Improve Your XML
ISBN: 0321150406
Year: 2002
Pages: 144