Parsers, DOM implementations, XSLT engines, and other XML-related tools vary widely in speed, efficiency, specification conformance, and bugginess. Sometimes these are innate characteristics of the code and the skill of the programmers who wrote it. Other times the relative quality depends on the environment. For instance, early versions of the Xerces XML parser for Java, written by IBM, tended to perform very well on IBM's Java virtual machine and very poorly on Sun's Java virtual machine, while the Crimson parser written by the Sun team had almost precisely opposite performance characteristics. Still other times the relative performance of tools depends on the documents they process. For instance, some DOM implementations are tuned for relatively small document sizes while others are tuned for very large documents. Benchmarks based primarily on large or small documents can come to very different conclusions about the same tools.
One of the best ways to improve performance is to try out different libraries on your code and pick the one that performs the best for your documents in your environment. However, this only works if you haven't tied yourself too closely to one parser's API. In SAX2 and DOM3 it's normally possible to write completely parser-independent code. Do so. For Java, JAXP extends this capability to DOM2. Always prefer implementation-independent APIs like DOM and SAX to parser-dependent APIs like the Xerces Native Interface or ElectricXML.
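As a rough illustration of what parser-independent code can look like in Java, the following sketch obtains a DOM parser through JAXP instead of instantiating any vendor's classes directly. The class name PortableDomExample and its helper methods are invented for this example. Only the javax.xml.parsers, org.w3c.dom, and org.xml.sax packages are referenced, so the same source should compile and run against whichever JAXP-compliant parser is installed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; the name and helpers are not from any library.
class PortableDomExample {

    // Parses XML text with whichever DOM parser JAXP locates at runtime.
    // No parser-specific class is named anywhere in this file.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the local name of the document's root element.
    static String rootName(String xml) {
        return parse(xml).getDocumentElement().getLocalName();
    }

    public static void main(String[] args) {
        System.out.println(rootName("<greeting>Hello</greeting>"));
    }
}
```

Because the factory is looked up at runtime, a different parser can usually be tried without recompiling, for example by putting another implementation on the classpath or by setting the javax.xml.parsers.DocumentBuilderFactory system property to that implementation's factory class.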
A few APIs such as XOM and JDOM fall somewhere in the middle. They allow you to choose a different parser but not a different implementation of their core classes. If parsing is the bottleneck, this can be helpful. However, if the bottleneck lies elsewhere, for instance, in an inefficient use of string buffers that bedeviled one JDOM beta, you're pretty much stuck with it.
Unfortunately, not all standard APIs are as complete as you might wish, so you may sometimes need to tie yourself to a specific implementation. If this proves necessary, try to clearly delineate the implementation-dependent parts of your code, and keep those parts as small as possible.
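One way to keep such a dependency contained is to hide it behind a small interface of your own, so that only one class ever names the implementation. The sketch below assumes, purely for illustration, that serialization is the feature in question. DocumentSerializer, JaxpSerializer, and defaultSerializer are hypothetical names invented here. The implementation shown happens to use the standard JAXP identity transform; a parser-specific serializer would be another class implementing the same interface, and nothing outside defaultSerializer would need to change.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; all names below are invented for illustration.
class IsolatedSerializerExample {

    // The seam: the rest of the program depends only on this interface.
    interface DocumentSerializer {
        String serialize(Document doc);
    }

    // One implementation, here built on the JAXP identity transform.
    // A vendor-specific serializer would live in a sibling class like this one.
    static final class JaxpSerializer implements DocumentSerializer {
        @Override
        public String serialize(Document doc) {
            try {
                Transformer identity = TransformerFactory.newInstance().newTransformer();
                identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                StringWriter out = new StringWriter();
                identity.transform(new DOMSource(doc), new StreamResult(out));
                return out.toString();
            } catch (TransformerException e) {
                throw new IllegalStateException("Serialization failed", e);
            }
        }
    }

    // The single place in the program that chooses an implementation.
    static DocumentSerializer defaultSerializer() {
        return new JaxpSerializer();
    }

    // Parses XML text through JAXP so the example is self-contained.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultSerializer().serialize(parse("<a>x</a>")));
    }
}
```

Callers ask defaultSerializer for an instance and never import the implementation class, so replacing it later should touch only that one method and the new class.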
Now let's investigate the details of writing implementation-independent code with different APIs and tools.