Style Location | Effective XML: 50 Specific Ways to Improve Your XML

The classic example of a processing instruction, far and away the most familiar one used in practice, is the xml-stylesheet processing instruction that tells web browsers where to find the stylesheet for a particular instance document. For example, consider the following document, which describes a chess game.

 <?xml version="1.0"?>
 <?xml-stylesheet type="text/css" href="chess.css"?>
 <game>
   <move>f3</move>
   <move>e5</move>
   <move>g4</move>
   <move>Qh4++</move>
 </game>

The content of the document is entirely about chess. It can be used for many purposes, not just for display to people but also as input to chess-playing programs like Deep Junior. The styles used to format this content for a person are not a fundamental part of the data. They are extra meta-information intended only for one class of process: browser display. There's no way for any one XML application to anticipate the needs of all the different processes that will operate on its documents. Thus it's completely reasonable to include process-specific information in processing instructions when that information extends beyond the bounds of the application itself.

In general, the xml-stylesheet processing instruction satisfies almost all the rules for when to use a processing instruction.

It provides extra information for processes such as web browsers that format a document for display to people. Other kinds of applications are uninterested in its content.
It locates the stylesheet to be applied to the document. It does not say anything about what is in the document.
It can be treated as a unit. The processing instruction contains all the information needed to process it. It is not context dependent. It does not have any structure beyond the processing instruction itself.
If the processing instruction format is wrong, nothing too terrible will happen. At worst a default stylesheet will be applied instead.
A stylesheet applies to the entire document, not just to one element in the document.
The xml-stylesheet processing instruction can be and is used in many different XML applications including DocBook, SVG, MathML, XHTML, and informally defined custom applications that authors invent on the fly.
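Several of these points can be seen in a few lines of code. The following is a minimal sketch, not a definitive implementation, of how a display-oriented process might pull the instruction out of the prolog using Python's standard `xml.dom.minidom`. The names `find_stylesheet`, `parse_pairs`, and `DEFAULT_STYLESHEET`, and the fallback file `default.css`, are assumptions made for illustration. The DOM hands back the instruction's data as one opaque string, so the `name="value"` pairs are split out by hand with a simple regular expression that does not resolve character or entity references.

```python
import re
from xml.dom import minidom

# Hypothetical fallback, used when no usable instruction is found.
DEFAULT_STYLESHEET = {"type": "text/css", "href": "default.css"}

# Matches name="value" or name='value' pairs inside the instruction's data.
PAIR = re.compile(r"""([^\s=]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pairs(data):
    """Turn 'type="text/css" href="chess.css"' into a dict."""
    return {name: dq or sq for name, dq, sq in PAIR.findall(data)}


def find_stylesheet(xml_text):
    """Return the first usable xml-stylesheet instruction in the prolog,
    or the default if there is none."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if node is doc.documentElement:
            break  # the prolog ends where the root element starts
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            pairs = parse_pairs(node.data)
            if "href" in pairs:
                return pairs
    return DEFAULT_STYLESHEET


GOOD = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="chess.css"?>
<game><move>f3</move><move>e5</move></game>"""

# Still well-formed XML, but the instruction's data is not in the expected format.
BAD = """<?xml version="1.0"?>
<?xml-stylesheet type=text/css href?>
<game><move>f3</move><move>e5</move></game>"""

good_style = find_stylesheet(GOOD)
bad_style = find_stylesheet(BAD)
print(good_style["href"])
print(bad_style["href"])
```

The function never looks at the `game` or `move` elements, and a chess-playing program reading the same document never needs to look at the instruction. A badly formatted instruction does not make the document unreadable; in this sketch it simply falls through to the default.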

The one criterion that's a little iffy is the complexity of the processing instruction. The xml-stylesheet processing instruction uses a pseudo-attribute format that makes it look a lot like an XML empty-element tag. However, of the major APIs only JDOM provides support for parsing these pseudo-attributes (and it's a little buggy here). If a real empty-element tag with real attributes were used instead, all XML parsers would be able to read the real attributes. You can imagine something like this:

 <?xml version="1.0"?>
 <game>
   <stylesheet type="text/css" href="chess.css" />
   <move>f3</move>
   <move>e5</move>
   <move>g4</move>
   <move>Qh4++</move>
 </game>

However, the positioning raises a lot of questions about what exactly the stylesheet element applies to. For instance, does it apply to the entire game element or just the move elements within it? Could each move element possess its own stylesheet element that overrides the parent element's?

Worse yet, this interferes with validation of the game element, which must now declare that it can contain a stylesheet element.

Perhaps attributes in some other namespace could be used instead, as shown below.

 <?xml version="1.0"?>
 <game xmlns:ss="http://www.example.com/stylesheet"
       ss:type="text/css" ss:href="chess.css">
   <move>f3</move>
   <move>e5</move>
   <move>g4</move>
   <move>Qh4++</move>
 </game>

This solves the problem of scope. Clearly each style attribute has scope within the element where it appears. However, this approach is still bulky, and it still interferes with validation. On the other hand, this makes it much easier to mix different stylesheets into different parts of the document.
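To make the scoping concrete, here is a rough sketch of how a process might resolve such attributes using Python's standard `xml.etree.ElementTree`. It assumes, purely for illustration, that the nearest enclosing `ss:href` wins and that a child inherits its parent's stylesheet when it declares none of its own; nothing in the markup above mandates that rule. The function name `styles_in_scope` and the second stylesheet `mate.css` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree spells a namespaced name as {namespace-uri}local-name.
SS = "{http://www.example.com/stylesheet}"

DOC = """<?xml version="1.0"?>
<game xmlns:ss="http://www.example.com/stylesheet"
      ss:type="text/css" ss:href="chess.css">
  <move>f3</move>
  <move>e5</move>
  <move>g4</move>
  <move ss:type="text/css" ss:href="mate.css">Qh4++</move>
</game>"""


def styles_in_scope(element, inherited=None):
    """Yield (element, href) pairs; the nearest declaration wins (assumed rule)."""
    href = element.get(SS + "href", inherited)
    yield element, href
    for child in element:
        yield from styles_in_scope(child, href)


root = ET.fromstring(DOC)
resolved = [(el.tag, el.text.strip(), href) for el, href in styles_in_scope(root)]
for tag, text, href in resolved:
    print(tag, repr(text), href)
```

Here the first three moves pick up `chess.css` from the `game` element and the final move overrides it. Any namespace-aware parser exposes these as ordinary attributes, with no hand-parsing of a data string.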

Still, I think a processing instruction in the document prolog is the right solution here. It neatly hits the 80/20 point. The few applications with more complex needs can define a more complex solution such as the namespaced attributes. Most applications don't need anything that complex.