12.10 Controlling XSLT Stylesheet Loading


Credit: Jürgen Hermann

12.10.1 Problem

You need to process XML documents and access external documents (e.g., stylesheets), but you can't use filesystem paths (to keep documents portable) or Internet-accessible URLs (for performance and security).

12.10.2 Solution

4Suite's xml.xslt package (http://www.4suite.org/) gives you all the power you need to handle XML stylesheets, including the hooks for sophisticated needs such as those met by this recipe:

# uses 4Suite Version 0.10.2 or later
from xml.xslt.Processor import Processor
from xml.xslt.StylesheetReader import StylesheetReader

class StylesheetFromDict(StylesheetReader):
    "A stylesheet reader that loads XSLT stylesheets from a python dictionary"

    def __init__(self, styles, *args):
        "Remember the dict we want to load the stylesheets from"
        StylesheetReader.__init__(self, *args)
        self.styles = styles
        self.__myargs = args

    def __getinitargs__(self):
        "Return init args for clone()"
        return (self.styles,) + self.__myargs

    def fromUri(self, uri, baseUri='', ownerDoc=None, stripElements=None):
        "Load stylesheet from a dict"
        parts = uri.split(':', 1)
        # the length check guards against a URI that has no colon at all
        if (len(parts) == 2 and parts[0] == 'internal'
                and self.styles.has_key(parts[1])):
            # Load the stylesheet from the internal repository (your dictionary)
            return StylesheetReader.fromString(self, self.styles[parts[1]],
                baseUri, ownerDoc, stripElements)
        else:
            # Revert to normal behavior
            return StylesheetReader.fromUri(self, uri,
                baseUri, ownerDoc, stripElements)

if __name__ == "__main__":
    # test and example of this stylesheet's loading approach

    # the sample stylesheet repository
    internal_stylesheets = {
        'second-author.xsl': """
            <person xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xsl:version="1.0">
            <xsl:value-of select="books/book/author[2]"/>
            </person>
        """
    }

    # the sample document, referring to an "internal" stylesheet
    xmldoc = """
      <?xml-stylesheet href="internal:second-author.xsl"
          type="text/xml"?>
      <books>
        <book title="Python Essential Reference">
          <author>David M. Beazley</author>
          <author>Guido van Rossum</author>
        </book>
      </books>
    """

    # Create XSLT processor and run it
    processor = Processor()
    processor.setStylesheetReader(StylesheetFromDict(internal_stylesheets))
    print processor.runString(xmldoc)

12.10.3 Discussion

If you get a lot of XML documents from third parties (via FTP, HTTP, or other means), problems could arise because the documents were created in their environments, and now you must process them in your environment. If a document refers to external files (such as stylesheets) in the filesystem of the remote host, these paths often do not make sense on your local host. One common solution is to refer to external documents through public URLs accessible via the Internet, but this, of course, incurs substantial overhead (you need to fetch the stylesheet from the remote server) and poses some risks. (What if the remote server is down? What about privacy and security?)

Another approach is to use private URL schemes, such as stylesheet:layout.xsl. These must be resolved to actual stylesheets, which this recipe's code does for XSLT processing (the recipe names its scheme internal: rather than stylesheet:). The recipe shows how to use a hook offered by 4Suite, a Python XSLT engine, to resolve such references when they appear in an XML-Stylesheet processing instruction (see http://www.w3.org/TR/xml-stylesheet/).
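To make the processing instruction concrete, here is a minimal sketch, independent of 4Suite, that pulls the href out of an xml-stylesheet processing instruction and splits off its scheme. It uses only the standard library's xml.dom.minidom and re modules; the function name stylesheet_hrefs and the sample document are assumptions made for illustration, not part of the recipe.

```python
import re
from xml.dom import minidom, Node

# Pseudo-attributes look like name="value" or name='value' inside the PI data.
_PSEUDO_ATTR = re.compile(r"""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")

def stylesheet_hrefs(xml_text):
    """Return the href of every xml-stylesheet PI at the top of the document.

    Hypothetical helper for illustration; 4Suite does this work internally.
    """
    doc = minidom.parseString(xml_text)
    hrefs = []
    for node in doc.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == 'xml-stylesheet'):
            attrs = {}
            for name, double_quoted, single_quoted in _PSEUDO_ATTR.findall(node.data):
                attrs[name] = double_quoted or single_quoted
            if 'href' in attrs:
                hrefs.append(attrs['href'])
    return hrefs

sample = ('<?xml-stylesheet href="internal:second-author.xsl" type="text/xml"?>'
          '<books/>')
hrefs = stylesheet_hrefs(sample)
# Split the private scheme from the stylesheet name at the first colon.
scheme, _, name = hrefs[0].partition(':')
```

With the sample above, scheme is the private scheme name and name is the key a resolver would look up in its stylesheet repository.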

A completely analogous approach can be used to load the stylesheet from a database or return a locally cached stylesheet previously fetched from a remote URL. The essence of this recipe is that you can subclass StylesheetReader and customize the fromUri method to perform whatever resolution of private URL schemes you require. The recipe looks at the URI's scheme, the part before the first colon. If the scheme is internal and the rest of the URI is a known key in the dictionary that maps names to stylesheets, fromUri returns the stylesheet by delegating the parsing of the dictionary entry's value to the fromString method of StylesheetReader. In all other cases, it leaves the URI alone and delegates to the parent class's fromUri method.
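The dispatch logic, together with the caching variation mentioned above, can be sketched without 4Suite. Everything in this example is hypothetical: the class StylesheetSource, its from_uri method, and the fake_fetch stand-in for a real download are illustrative names, and the class returns raw stylesheet text rather than a parsed stylesheet as StylesheetReader would.

```python
class StylesheetSource(object):
    """Hypothetical resolver: private scheme first, then a cached fallback."""

    def __init__(self, styles, fetch):
        self.styles = styles   # maps stylesheet name -> stylesheet text
        self.fetch = fetch     # callable(uri) -> stylesheet text
        self.cache = {}        # remembers stylesheets already fetched

    def from_uri(self, uri):
        scheme, sep, name = uri.partition(':')
        if sep and scheme == 'internal' and name in self.styles:
            # Known private name: serve it from the in-memory repository.
            return self.styles[name]
        # Anything else: fetch once, then serve later requests from the cache.
        if uri not in self.cache:
            self.cache[uri] = self.fetch(uri)
        return self.cache[uri]

calls = []

def fake_fetch(uri):
    "Stand-in for a real download; records each URI it is asked for."
    calls.append(uri)
    return '<fetched from %s>' % uri

source = StylesheetSource({'layout.xsl': '<xsl:stylesheet/>'}, fake_fetch)
internal_result = source.from_uri('internal:layout.xsl')
first = source.from_uri('http://example.com/a.xsl')
second = source.from_uri('http://example.com/a.xsl')
```

As in the recipe, an internal: URI whose name is not in the dictionary falls through to the normal path instead of raising an error; whether that is the right policy depends on how strictly you want to confine stylesheet loading.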

The output of the test code is:

<?xml version='1.0' encoding='UTF-8'?>
<person>Guido van Rossum</person>

This recipe requires at least Python 2.0 and 4Suite Version 0.10.2.

12.10.4 See Also

The XML-Stylesheet processing instruction is described in a W3C recommendation (http://www.w3.org/TR/xml-stylesheet/); the 4Suite tools from Fourthought are available at http://www.4suite.org/.



Python Cookbook
ISBN: 0596007973
Year: 2005
Pages: 346
