Tap into a powerful new way to find exactly what you're looking for on a page.

Firefox contains a little-known but powerful feature called XPath. XPath is a query language for searching the Document Object Model (DOM) that Firefox constructs from the source of a web page. As mentioned in "Add or Remove Content on a Page" [Hack #6], virtually every hack in this book revolves around the DOM. Many hacks work on a collection of elements. Without XPath, you would need to get a list of elements (for example, with document.getElementsByTagName) and then test each one to see if it's something of interest. With XPath expressions, you can find exactly the elements you want, all in one shot, and then immediately start working with them.
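1.9.1. Basic Syntax

To execute an XPath query, use the document.evaluate function. Here's the basic syntax:

    var snapshotResults = document.evaluate('XPath expression', document, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

The function takes five parameters:

1. The XPath expression to evaluate, as a string.
2. The context node at which the query starts. Pass document to search the entire page.
3. A namespace resolver, needed only when the expression uses namespace prefixes. Pass null for ordinary HTML pages.
4. The type of result to return. XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE returns a static snapshot of the matching nodes in no particular order; XPathResult.ORDERED_NODE_SNAPSHOT_TYPE returns them in document order.
5. An existing XPathResult object to reuse. Pass null to create a new one.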
With this result type, the document.evaluate function returns a snapshot, which is a static collection of DOM nodes. You can iterate through the snapshot or access its items in any order. The snapshot is static, which means it will never change, no matter what you do to the page. You can even delete DOM nodes as you move through the snapshot.

A snapshot is not an array, and it doesn't support the standard array properties or accessors. To get the number of items in the snapshot, use its snapshotLength property. To access a particular item, call its snapshotItem(index) method.

Here is the skeleton of a script that executes an XPath query and loops through the results:

    var snapResults = document.evaluate("XPath expression", document, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
    for (var i = snapResults.snapshotLength - 1; i >= 0; i--) {
        var elm = snapResults.snapshotItem(i);
        // do stuff with elm
    }

1.9.2. Examples

The following XPath query finds all the elements on a page with class="foo":

    var snapFoo = document.evaluate("//*[@class='foo']", document, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

The // means "search for things anywhere below the root node, including nested elements." The * matches any element, and [@class='foo'] restricts the search to elements whose class attribute is exactly foo.

You can use XPath to search for specific elements. The following query finds all <input type="hidden"> elements. (This example is taken from "Show Hidden Form Fields" [Hack #30].)

    var snapHiddenFields = document.evaluate("//input[@type='hidden']", document, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

You can also test for the presence of an attribute, regardless of its value. The following query finds all elements with an accesskey attribute. (This example is taken from "Add an Access Bar with Keyboard Shortcuts" [Hack #68].)

    var snapAccesskeys = document.evaluate("//*[@accesskey]", document, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

Not impressed yet?
Here's a query that finds images whose URL contains the string "MZZZZZZZ". (This example is taken from "Make Amazon Product Images Larger" [Hack #25].)

    var snapProductImages = document.evaluate("//img[contains(@src, 'MZZZZZZZ')]", document, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

You can also do combinations of attributes. This query finds all images with a width of 36 and a height of 14. (This query is taken from "Zap Ugly XML Buttons" [Hack #86].)

    var snapXMLImages = document.evaluate("//img[@width='36'][@height='14']", document, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

But wait, there's more! By using more advanced XPath syntax, you can actually find elements that are contained within other elements. This code finds all the links that are contained in a paragraph whose class is g. (This example is taken from "Refine Your Google Search" [Hack #96].)

    var snapResults = document.evaluate("//p[@class='g']//a", document, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

Finally, you can find a single element by passing XPathResult.FIRST_ORDERED_NODE_TYPE as the fourth parameter. Instead of a snapshot, the result then exposes a singleNodeValue property, which holds the first matching node in document order, or null if nothing matched. This line of code finds the first link whose class is "yschttl". (This example is taken from "Prefetch Yahoo! Search Results" [Hack #52].)

    var elmFirstResult = document.evaluate("//a[@class='yschttl']", document, null,
        XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;

If you weren't brain-fried by now, I'd be very surprised. XPath is, quite literally, a language all its own. Like regular expressions, XPath can make your life easier, or it can make your life a living hell. Remember, you can always get what you need (eventually) with standard DOM functions such as document.getElementById or document.getElementsByTagName. XPath's a good tool to have in your tool chest, but it's not always the right tool for the job.
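If you find yourself typing the same document.evaluate call over and over, the query-and-loop skeleton from "Basic Syntax" can be wrapped in two small helpers. What follows is only a minimal sketch, not code from any of the hacks: the names xpathAll and xpathFirst are hypothetical, and the numeric fallbacks 6 and 9 are the values the DOM XPath specification assigns to the two result types, used here only when XPathResult isn't defined.

```javascript
// Hypothetical convenience wrappers around document.evaluate.
// Neither function is part of Firefox or Greasemonkey.

// Run an XPath query and copy the snapshot into a real array, so the
// usual array properties (length, [i]) and methods are available.
function xpathAll(doc, expression, contextNode) {
    // Prefer the browser's constant; 6 is the value the DOM XPath
    // specification gives UNORDERED_NODE_SNAPSHOT_TYPE.
    var type = (typeof XPathResult !== 'undefined')
        ? XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE : 6;
    var snapshot = doc.evaluate(expression, contextNode || doc, null, type, null);
    var nodes = [];
    for (var i = 0; i < snapshot.snapshotLength; i++) {
        nodes.push(snapshot.snapshotItem(i));
    }
    return nodes;
}

// Run an XPath query and return only the first match in document
// order, or null when nothing matches.
function xpathFirst(doc, expression, contextNode) {
    // 9 is the specification's value for FIRST_ORDERED_NODE_TYPE.
    var type = (typeof XPathResult !== 'undefined')
        ? XPathResult.FIRST_ORDERED_NODE_TYPE : 9;
    return doc.evaluate(expression, contextNode || doc, null, type, null).singleNodeValue;
}
```

In a user script you would pass the page's own document, for example xpathAll(document, "//input[@type='hidden']") or xpathFirst(document, "//a[@class='yschttl']"). Because the snapshot is copied into an array up front, you can still delete or move nodes while you loop over the results.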