The classic example of a processing instruction, and by far the most familiar one in practice, is the xml-stylesheet processing instruction that tells web browsers where to find the stylesheet for a particular instance document. For example, consider the following document, which describes a chess game.

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/css" href="chess.css"?>
    <game>
      <move>f3</move>
      <move>e5</move>
      <move>g4</move>
      <move>Qh4++</move>
    </game>

The content of the document is entirely about chess. It can be used for many purposes, not just for display to people but also as input to chess-playing programs like Deep Junior. The styles used to format this content for a person are not a fundamental part of the data. They are extra meta-information intended only for one class of process: browser display. There's no way for any one XML application to anticipate the needs of all the different processes that will operate on its documents. Thus it's completely reasonable to include process-specific information in processing instructions when that information extends beyond the bounds of the application itself.

In general, the xml-stylesheet processing instruction satisfies almost all the rules for when to use a processing instruction.
The one criterion that's a little iffy is the complexity of the processing instruction. The xml-stylesheet processing instruction uses a pseudo-attribute format that makes it look a lot like an XML empty-element tag. However, of the major APIs only JDOM provides support for parsing these pseudo-attributes (and it's a little buggy here). If a real empty-element tag with real attributes were used instead, all XML parsers would be able to read the real attributes. You can imagine something like this:

    <?xml version="1.0"?>
    <game>
      <stylesheet type="text/css" href="chess.css" />
      <move>f3</move>
      <move>e5</move>
      <move>g4</move>
      <move>Qh4++</move>
    </game>

However, the positioning raises a lot of questions about what exactly the stylesheet element applies to. For instance, does it apply to the entire game element or just the move elements within it? Could each move element possess its own stylesheet element that overrides the parent element's? Worse yet, this interferes with validation of the game element, which must now declare that it can contain a stylesheet element. Perhaps attributes in some other namespace could be used instead, as shown below.

    <?xml version="1.0"?>
    <game xmlns:ss="http://www.example.com/stylesheet"
          ss:type="text/css" ss:href="chess.css">
      <move>f3</move>
      <move>e5</move>
      <move>g4</move>
      <move>Qh4++</move>
    </game>

This solves the problem of scope. Clearly each style attribute has scope within the element where it appears. However, this approach is still bulky, and it still interferes with validation. On the other hand, this makes it much easier to mix different stylesheets into different parts of the document.

Still, I think a processing instruction in the document prolog is the right solution here. It neatly hits the 80/20 point. The few applications with more complex needs can define a more complex solution such as the namespaced attributes. Most applications don't need anything that complex.
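To make the earlier point about pseudo-attributes concrete, here is a minimal sketch in Python using the standard library's xml.dom.minidom. Like most DOM-style APIs, it hands back the processing instruction as a target plus one undifferentiated data string, so the client code has to pick the pseudo-attributes apart itself. The names CHESS_DOC, parse_pseudo_attributes, and find_stylesheets are hypothetical, chosen only for this illustration, and the regular expression is a simplifying assumption: it does not resolve character or entity references inside the values, which a complete implementation would need to handle.

```python
import re
from xml.dom import minidom, Node

# The chess document from the text, with the stylesheet in the prolog.
CHESS_DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="chess.css"?>
<game>
  <move>f3</move>
  <move>e5</move>
  <move>g4</move>
  <move>Qh4++</move>
</game>"""

# Assumption: a pseudo-attribute is name="value" or name='value'.
# References such as &amp; inside the value are left unresolved.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attributes(data):
    """Split a processing instruction's data string into a dict."""
    attrs = {}
    for name, double_quoted, single_quoted in PSEUDO_ATTR.findall(data):
        # Only one of the two quoted groups can be non-empty per match.
        attrs[name] = double_quoted or single_quoted
    return attrs


def find_stylesheets(xml_text):
    """Return the pseudo-attributes of each xml-stylesheet PI in the prolog."""
    doc = minidom.parseString(xml_text)
    stylesheets = []
    # Prolog processing instructions are children of the document node.
    for node in doc.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # node.data is the raw string: type="text/css" href="chess.css"
            stylesheets.append(parse_pseudo_attributes(node.data))
    return stylesheets


if __name__ == "__main__":
    print(find_stylesheets(CHESS_DOC))
    # [{'type': 'text/css', 'href': 'chess.css'}]
```

The extra parsing step is small, but every process that cares about the stylesheet has to write it or borrow it, which is the cost being weighed against the scoping and validation problems of the element and attribute alternatives.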