10.4 Parsing XML | Core JSTL: Mastering the JSP Standard Tag Library

Before we can manipulate data in an XML document, we must first parse the document with the <x:parse> action, which has the following syntax: ^[8]

^[8] Items in brackets are optional. See "<x:parse>" on page 543 for a complete description of <x:parse> syntax.

  <x:parse xml [systemId] [filter] {var [scope] | varDom [scopeDom]}/>

The preceding syntax for the <x:parse> action has two required attributes: xml and either var or varDom. The xml attribute, which represents an XML document, can be a string or a reader. The var and varDom attributes each specify the name of a scoped variable that references the parsed document.

If you specify the var attribute, the JSTL implementation is free to represent the parsed document with any data type. If you specify the varDom attribute instead, the JSTL implementation must make the parsed document available as a Document Object Model (DOM) object. If you specify the var attribute, you can set the scope for that variable with the scope attribute; likewise, you can specify the scopeDom attribute to set the scope for the variable specified with the varDom attribute.
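
To make the varDom case concrete, the following plain-Java sketch parses XML text from a reader into an org.w3c.dom.Document with the JDK's own parser. It is only an illustration of the kind of object a DOM variable refers to, not a description of how any JSTL implementation works internally; the class name ParseToDomSketch and the one-contact sample string are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ParseToDomSketch {
    // Hypothetical sample data; the real Rolodex file is Listing 10.1.
    static final String XML =
        "<rolodex><contact><firstName>Anna</firstName></contact></rolodex>";

    // Parses XML text into a DOM Document without validating it.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false); // no DTD validation (also the default)
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // Prints the name of the root element.
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Because the input is wrapped in a StringReader, the same method would accept any other java.io.Reader with a one-line change, which mirrors the string-or-reader choice offered by the xml attribute.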

The <x:parse> action also supports an alternative syntax that lets you specify the XML document in the body of the action:

  <x:parse [systemId] [filter] {var [scope] | varDom [scopeDom]}>
     xml
  </x:parse>

The preceding syntax is the same as the first syntax, except that the xml attribute is replaced by the body of the action.

Both <x:parse> syntaxes support two additional attributes: systemId and filter. The systemId attribute specifies a URI used to resolve external entities. The use of that attribute is discussed in "Accessing External Entities" on page 460. The filter attribute specifies a Simple API for XML (SAX) filter that's used to filter the XML parsed by the <x:parse> action. The use of that attribute is discussed in "Filtering XML" on page 452.

The <x:parse> action does not perform validation against XML DTDs or Schemas.

Figure 10-2 shows a JSP page that parses the Rolodex XML file listed in Listing 10.1 on page 424 and displays the information contained in that file.

Figure 10-2. Parsing XML

The JSP page shown in Figure 10-2 is listed in Listing 10.2.

Listing 10.2 index.jsp (Parsing XML)

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
  <html>
     <head>
        <title>Parsing XML</title>
     </head>

     <body>
        <%@ taglib uri='http://java.sun.com/jstl/core' prefix='c' %>
        <%@ taglib uri='http://java.sun.com/jstl/xml'  prefix='x' %>

        <%-- Import the XML file --%>
        <c:import var='rolodex_xml' url='rolodex.xml'/>

        <%-- Parse the XML file --%>
        <x:parse var='document' xml='${rolodex_xml}'/>

        <p>There are
        <x:out select='count($document//contact)'/>
        contacts in the rolodex.<p>

        <x:out
          select='count($document//contact/phone[@type="work"])'/>
        of those contacts have a work phone number and
        <x:out
          select='count($document//contact/phone[@type="home"])'/>
        of those contacts have a home phone number.<p>

        <%-- For each contact in the Rolodex... --%>
        <x:forEach select='$document//contact'>
           <table>
              <tr>
                 <td>First Name:</td>
                 <td><x:out select='firstName'/></td>
              </tr>
              <tr>
                 <td>Last Name:</td>
                 <td><x:out select='lastName'/></td>
              </tr>
              <tr>
                 <td>Email:</td>
                 <td><x:out select='email'/></td>
              </tr>
              <tr>
                 <td>Work Phone:</td>
                 <td><x:out select='phone[@type="work"]'/></td>
              </tr>

              <%-- Home phone is optional, so we check to see
                   if it exists before processing it --%>
              <x:if select='phone[@type="home"]'>
                 <tr>
                    <td>Home Phone:</td>
                    <td><x:out select='phone[@type="home"]'/></td>
                 </tr>
              </x:if>
           </table><p>
        </x:forEach>
     </body>
  </html>

Parsing an XML file with <x:parse> is typically a two-step process, as is the case for the preceding JSP page. ^[9] The first step, which is normally accomplished with the <c:import> action, imports the file and stores the contents of that file in a string or reader. In the preceding JSP page, the <c:import> action imports the file rolodex.xml and stores the contents of that file in a string that you can reference with a scoped variable named rolodex_xml. The second step parses the contents of the XML file with <x:parse>; in the preceding JSP page, the scoped variable named rolodex_xml is specified with the xml attribute and the parsed XML is stored in a page-scoped variable named document.

^[9] You can also parse XML in one step if you specify the XML in the body of the <x:parse> action; however, because the XML document typically resides in a separate file, the two-step approach is most often used.

Once you've parsed XML with the <x:parse> action, you can use the scoped variable created by that action to access the parsed XML file. The preceding JSP page shows you how to do that with the <x:if>, <x:out>, and <x:forEach> actions. That JSP page uses the <x:out> action to display the number of contacts in the Rolodex and the number of contacts that have work and home phone numbers. The JSP page uses the <x:forEach> action to iterate over every contact in the Rolodex and uses <x:if> and <x:out> actions in the body of the <x:forEach> action to display the information associated with each contact.
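
The three counts that the page displays can be reproduced outside a JSP container with the JDK's XPath API, which may help when you want to check an expression before putting it in a select attribute. This is a rough sketch under stated assumptions: the class name ContactCountSketch and the two-contact sample document are invented for illustration (the real data is in Listing 10.1), and the leading $document variable is replaced by passing the parsed document as the context of the evaluation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ContactCountSketch {
    // Hypothetical stand-in for rolodex.xml: two contacts, one home phone.
    static final String XML =
          "<rolodex>"
        + "<contact><firstName>Anna</firstName>"
        + "<phone type='work'>555-0100</phone>"
        + "<phone type='home'>555-0101</phone></contact>"
        + "<contact><firstName>Ben</firstName>"
        + "<phone type='work'>555-0102</phone></contact>"
        + "</rolodex>";

    // Evaluates count(<nodeSetExpr>) against the parsed document.
    static int count(String xml, String nodeSetExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Double n = (Double) XPathFactory.newInstance().newXPath()
                .evaluate("count(" + nodeSetExpr + ")", doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count(XML, "//contact"));                      // 2
        System.out.println(count(XML, "//contact/phone[@type='work']"));  // 2
        System.out.println(count(XML, "//contact/phone[@type='home']"));  // 1
    }
}
```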

Besides XML parsing, there are a few other points of interest in the preceding JSP page. First, the <x:forEach> action and the <x:out> actions that are not contained in the body of the <x:forEach> action establish a context node by specifying the name of the scoped variable (document) that references the parsed XML, with this syntax: $document//... The <x:out> actions and the <x:if> action contained in the body of the <x:forEach> action do not specify a context node, because the context node, which represents each contact in the order in which it appears in the XML file, is established by the <x:forEach> action.
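
The idea of a context node established by the iteration can be sketched in plain Java: an absolute expression selects the contacts, and a relative expression such as firstName is then evaluated against each selected node in turn. The class name ContextNodeSketch and the sample data are assumptions for this example, and the loop is only an analogy for what <x:forEach> does, not its actual implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContextNodeSketch {
    // Hypothetical stand-in for rolodex.xml.
    static final String XML =
          "<rolodex>"
        + "<contact><firstName>Anna</firstName></contact>"
        + "<contact><firstName>Ben</firstName></contact>"
        + "</rolodex>";

    // Returns the firstName of every contact, in document order.
    static List<String> firstNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Like select='$document//contact': the document is the context.
            NodeList contacts = (NodeList)
                xpath.evaluate("//contact", doc, XPathConstants.NODESET);

            List<String> names = new ArrayList<>();
            for (int i = 0; i < contacts.getLength(); i++) {
                // Like select='firstName' in the loop body: the relative
                // path is resolved against the current contact node.
                names.add(xpath.evaluate("firstName", contacts.item(i)));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNames(XML)); // [Anna, Ben]
    }
}
```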

Second, the double slash is XPath abbreviated syntax that selects all of the nodes that are descendants of a specified node; ^[10] for example, the XPath expression $document//contact evaluates to a node-set that contains all of the contact nodes within the document, regardless of where they appear in the document. For the Rolodex document, in which every contact is a child of the root rolodex element, that expression selects the same nodes as $document/rolodex/contact. The two expressions are not interchangeable in general: the second one selects only contact elements that are direct children of rolodex.

^[10] The specified node is also included in the node-set if it matches the location path criteria.
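
The difference between the double slash and an explicit child path only shows up when a matching element sits deeper in the tree. The sketch below uses a made-up document in which one contact is nested inside a hypothetical group element (the Rolodex file in this chapter has no such element); the class name DescendantPathSketch is likewise an assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DescendantPathSketch {
    // Hypothetical document: the second contact is nested inside <group>.
    static final String NESTED_XML =
          "<rolodex>"
        + "<contact><firstName>Anna</firstName></contact>"
        + "<group><contact><firstName>Ben</firstName></contact></group>"
        + "</rolodex>";

    // Counts the nodes selected by the given XPath expression.
    static int count(String xml, String nodeSetExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Double n = (Double) XPathFactory.newInstance().newXPath()
                .evaluate("count(" + nodeSetExpr + ")", doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Descendant search finds both contacts.
        System.out.println(count(NESTED_XML, "//contact"));         // 2
        // The explicit path finds only the direct child of <rolodex>.
        System.out.println(count(NESTED_XML, "/rolodex/contact"));  // 1
    }
}
```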

Third, notice the use of the <x:if> action. The body of that action is evaluated if the XPath expression specified with the select attribute evaluates to true. That expression, phone[@type="home"], specifies a location path (phone) that selects all phone nodes that are children of the context node. That expression also specifies a predicate, [@type="home"], which restricts that node-set to phone nodes that have a type attribute equal to home. The end result of that expression is a node-set of all phone nodes that are children of the context node and that have a type attribute whose value is home. That node-set is coerced to a boolean value according to the algorithm described in Table 10.2; if the node-set is not empty, the expression evaluates to true and the body of the <x:if> action is evaluated. Otherwise, it evaluates to false and the body of the <x:if> action is not evaluated.
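
The node-set-to-boolean coercion can be observed directly with the JDK's XPath API by asking for a boolean result: a non-empty node-set yields true and an empty one yields false. The following sketch is an approximation of the test that <x:if> performs, not its implementation; the class name HomePhoneSketch and the sample contacts are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class HomePhoneSketch {
    // Hypothetical stand-in for rolodex.xml: only Anna has a home phone.
    static final String XML =
          "<rolodex>"
        + "<contact><firstName>Anna</firstName>"
        + "<phone type='work'>555-0100</phone>"
        + "<phone type='home'>555-0101</phone></contact>"
        + "<contact><firstName>Ben</firstName>"
        + "<phone type='work'>555-0102</phone></contact>"
        + "</rolodex>";

    // For each contact, reports whether phone[@type='home'] is non-empty.
    static List<Boolean> hasHomePhone(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList contacts = (NodeList)
                xpath.evaluate("//contact", doc, XPathConstants.NODESET);

            List<Boolean> result = new ArrayList<>();
            for (int i = 0; i < contacts.getLength(); i++) {
                // Requesting BOOLEAN coerces the node-set: empty means false.
                result.add((Boolean) xpath.evaluate(
                    "phone[@type='home']", contacts.item(i),
                    XPathConstants.BOOLEAN));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasHomePhone(XML)); // [true, false]
    }
}
```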

The preceding JSP page specifies a scoped variable (document) in XPath expressions to establish a context node. JSTL XML actions let you access other JSP variables, including request parameters and cookies, as discussed in the next section.