Another common use of processing instructions is in page formatting. Figuring out exactly where to break words, columns, pages, and so forth is a very difficult problem for a machine. TeX probably does a better job of this than anything else, and it still doesn't always get it right. Human intervention is normally necessary for high-quality typesetting. Processing instructions like those shown below are often used for this purpose.
<?xml version="1.0"?>
<game>
  <?begin-keep-together?>
  <date>2003-10-24</date>
  <white>Jane Smith</white>
  <black>Alice Jones</black>
  <?end-keep-together?>
  <?page-break?>
  <move>f3</move>
  <move>e5</move>
  <move>g4</move>
  <move>Qh4++</move>
</game>
I think this use of processing instructions is reasonable in applications not related to publishing, such as this simple chess vocabulary, in which a stylesheet will be used to produce pages. While ideally the stylesheet and formatting engine would be smart enough to figure out where to put everything, the reality is that these tools just aren't good enough to handle real-world documents. Indeed, it may not even be theoretically possible to do so. If you're using the common chain of XML → XSLT → XSL-FO → PDF, it may not be possible to determine where objects are placed until the final layout step that chooses sizes for everything. Without those sizes, however, you can't calculate from within the XSLT stylesheet whether or not a break is needed. Processing instructions are a necessary escape.
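To make this concrete, here is a minimal sketch of how a processing step might pick up these hints in document order before handing the content to a formatter. It assumes Python's standard xml.dom.minidom module rather than XSLT, and the function name layout_events is purely illustrative; a real pipeline would more likely match the processing instructions in the stylesheet itself.

```python
from xml.dom import minidom, Node

# The chess document from above, with its layout hints as processing instructions.
GAME_XML = """<?xml version="1.0"?>
<game>
  <?begin-keep-together?>
  <date>2003-10-24</date>
  <white>Jane Smith</white>
  <black>Alice Jones</black>
  <?end-keep-together?>
  <?page-break?>
  <move>f3</move>
  <move>e5</move>
  <move>g4</move>
  <move>Qh4++</move>
</game>"""


def layout_events(xml_text):
    """Hypothetical helper: walk the root element's children in document
    order and return (kind, value) pairs, where layout hints appear as
    ("hint", target) and elements as (tag name, text content)."""
    root = minidom.parseString(xml_text).documentElement
    events = []
    for node in root.childNodes:
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            # The target is the name right after "<?", e.g. "page-break".
            events.append(("hint", node.target))
        elif node.nodeType == Node.ELEMENT_NODE:
            text = "".join(
                child.data
                for child in node.childNodes
                if child.nodeType == Node.TEXT_NODE
            )
            events.append((node.tagName, text))
        # Whitespace-only text nodes between elements are ignored.
    return events


if __name__ == "__main__":
    for kind, value in layout_events(GAME_XML):
        print(kind, value)
```

Because the hints arrive interleaved with the elements, the consumer can start a keep, end it, or force a break at exactly the point the human editor chose, without knowing anything about final sizes.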
On the other hand, I do not think processing instructions are appropriate for this purpose in publishing-related vocabularies such as DocBook, XSL-FO, and OpenOffice. Keeps and breaks should be elements in these vocabularies, just like paragraphs, lists, tables, and other aspects of layout. XHTML would accomplish this by using CSS keep and break properties attached to elements with particular IDs. This works, but it does require at least one custom stylesheet for each document. It's sometimes (though not always) more convenient to keep all of a document's unique markup in one place.