Generating XML: The FOR XML Clause

The FOR XML clause extends a SELECT statement by allowing return of data from relational tables into an XML structured document. The basic syntax of the FOR XML clause is as follows :

   FOR XML {    RAW [('<element>')] [<directives>] [ELEMENTS]   AUTO [<directives>] [ELEMENTS]   EXPLICIT [<directives>]   PATH [('<element>')] [<directives>] [ELEMENTS] } <directives> = [ , TYPE ] [, ROOT  ('<root>') ]   

The FOR XML clause syntax can be described as follows:

  • RAW : Returns a single XML element for each record returned by a query. Thus a join returns an element for each record returned by the join, regardless of the number of tables in the join query. All fields become attributes with each record-element produced.

  • AUTO : A nested structure is created when there is more than one table in the query. Thus, for join queries, a nested structure of XML elements is returned. In other words, each table in the query returns a single layer of nested elements. Additionally, the order of fields in the selected list helps to determine XML document structure.

  • EXPLICIT : This mode allows the most flexibility but also more complex queries. However, explicit definition allows for explicit definition of an XML element hierarchical structure.

  • ('<element>') : Allows for an element name change.

  • ELEMENTS : Fields are returned as child elements as opposed to attribute name-value pairs.

  • <directives> : Some of these are the TYPE and ROOT directives:

    • TYPE : A query or subquery will return a string typecasted into a SQL Server XML data type.

    • ROOT : Adds a top layer root element to XML document output, creating a properly structured XML document. Not specifying a name creates the root element as <root> .

  • PATH : Allows more simplistic and comprehensive query construction than the EXPLICIT clause.

  • Other options are as follows:

    • XMLDATA : Returns an XML-Data Reduced (XDR) schema

    • XMLSCHEMA : Returns an XSD schema

    • BINARY BASE64 : Returns binary data

Lets begin with RAW mode.

FOR XML RAW Mode

This is a simple query executed against a single table, displaying regions :

   USE [newdem] GO SELECT * FROM region GO   

Figure 6-4 shows the result of the preceding query.

image from book
Figure 6-4: A single table query

The following query uses RAW mode to return regions. Each record is returned as a single XML element where each field becomes an attribute of each record-element:

   SELECT * FROM region FOR XML RAW GO   

RAW mode produces this:

   <row region_id="1" region="Africa" population="789548670" area="26780325" /> <row region_id="2" region="Asia" population="47382633" area="657741" /> <row region_id="3" region="Australasia" population="24340222" area="7886602" /> <row region_id="4" region="Caribbean" population="40417697" area="268857" /> <row region_id="5" region="Central America" population="142653392" area="2360325" /> <row region_id="6" region="Europe" population="488674441" area="4583335" /> <row region_id="7" region="Far East" population="2100636517" area="15357441" /> <row region_id="8" region="Middle East" population="294625718" area="6798768" /> <row region_id="9" region="Near East" population="1499157105" area="4721322" /> <row region_id="10" region="North America" population="331599508" area="18729272" /> <row region_id="11" region="Oceania" population="9133256" area="536238" /> <row region_id="12" region="Russian Federation" population="258037209" area="21237500" /> <row region_id="13" region="South America" population="375489788" area="17545171" />   

The element name can be renamed by doing this:

   SELECT * FROM region FOR XML RAW('region') GO   

This is the result showing the name of the element changed to region :

   <region region_id="1" region="Africa" population="789548670" area="26780325" /> <region region_id="2" region="Asia" population="47382633" area="657741" /> <region region_id="3" region="Australasia" population="24340222" area="7886602" /> <region region_id="4" region="Caribbean" population="40417697" area="268857" /> <region region_id="5" region="Central America" population="142653392" area="2360325" /> <region region_id="6" region="Europe" population="488674441" area="4583335" /> <region region_id="7" region="Far East" population="2100636517" area="15357441" /> <region region_id="8" region="Middle East" population="294625718" area="6798768" /> <region region_id="9" region="Near East" population="1499157105" area="4721322" /> <region region_id="10" region="North America" population="331599508" area="18729272" /> <region region_id="11" region="Oceania" population="9133256" area="536238" /> <region region_id="12" region="Russian Federation" population="258037209" area="21237500" /> <region region_id="13" region="South America" population="375489788" area="17545171" />   

Use the ELEMENTS clause to convert all attributes produced in RAW mode into child elements:

   SELECT * FROM region FOR XML RAW('region'), ELEMENTS GO   

This is a partial result:

   <region>   <region_id>1</region_id>   <region>Africa</region>   <population>789548670</population>   <area>26780325</area> </region> <region>   <region_id>2</region_id>   <region>Asia</region>   <population>47382633</population>   <area>657741</area> </region> ... <region>   <region_id>13</region_id>   <region>South America</region>   <population>375489788</population>   <area>17545171</area> </region>   

Now lets do a join without the ELEMENTS clause:

   SELECT r.region, co.country FROM region AS r JOIN country AS co ON co.region_id = r.region_id FOR XML RAW GO   

This is a partial result:

   ... <row region="Africa" country="Zambia" /> <row region="Africa" country="Zimbabwe" /> <row region="Asia" country="Burma" /> <row region="Australasia" country="Australia" /> <row region="Australasia" country="New Zealand" /> <row region="Caribbean" country="Bahamas" /> <row region="Caribbean" country="Barbados" /> ...   

And now lets add the ELEMENTS clause and this time join three tables:

   SELECT r.region, co.country, s.state FROM region AS r JOIN country AS co ON co.region_id = r.region_id JOIN state AS s ON s.country_id = co.country_id FOR XML RAW GO   

Here is a partial result showing the three tables joined, returning states, within countries , within regions:

   <row region="North America" country="Canada" state="British Columbia" /> <row region="North America" country="Canada" state="Nova Scotia" /> <row region="North America" country="Canada" state="Ontario" /> <row region="North America" country="Canada" state="Quebec" /> <row region="North America" country="United States" state="Alaska" /> <row region="North America" country="United States" state="Alabama" /> <row region="North America" country="United States" state="Arkansas" /> <row region="North America" country="United States" state="Arizona" /> ...   

Now the ELEMENTS clause is added back in:

   SELECT r.region, co.country, s.state FROM region AS r JOIN country AS co ON co.region_id = r.region_id JOIN state AS s ON s.country_id = co.country_id FOR XML RAW('region'), ELEMENTS GO   

This is the result. The problem is that there is only a single element layer:

   <region>   <region>North America</region>   <country>Canada</country>   <state>British Columbia</state> </region> <region>   <region>North America</region>   <country>Canada</country>   <state>Nova Scotia</state> </region> <region>   <region>North America</region>   <country>Canada</country>   <state>Ontario</state> </region> <region>   <region>North America</region>   <country>Canada</country>   <state>Quebec</state> </region> <region>   <region>North America</region>   <country>United States</country>   <state>Alaska</state> </region> ... <region>   <region>North America</region>   <country>United States</country>   <state>Vermont</state> </region>   

At this stage the limits of what RAW mode can do have more or less been reached, and it is better to proceed to using AUTO mode.

FOR XML AUTO Mode

As already stated, AUTO mode allows the creation of a nested XML structure. The previous example in the previous section on RAW mode showed that only a single nested layer could be created using RAW mode. There are two ways in which AUTO mode improves on using RAW mode:

  • Element names are not called row by default. And they do not have to be set as in FOR XML RAW('region') . With AUTO mode the table names in a query become the XML element names.

  • The order of tables in a join query, coupled with the order of fields in the selected field list determines the layer of the hierarchy in which elements are displayed.

The following query is the last example used in the previous section covering RAW mode, joining three tables. The query is now changed to return XML in AUTO as opposed to RAW mode:

   SELECT r.region, r.population, r.area, co.country, co.currency, s.state, s.nickname FROM region AS r JOIN country AS co ON co.region_id = r.region_id JOIN state AS s ON s.country_id = co.country_id FOR XML AUTO GO   

Again here is a complete result:

   <r region="North America" population="331599508" area="18729272">   <co country="Canada" currency="Dollars">     <s state="British Columbia" nickname="" />     <s state="Nova Scotia" nickname="" />     <s state="Ontario" nickname="" />     <s state="Quebec" nickname="" />   </co>   <co country="United States" currency="">     <s state="Alaska" nickname="" />     <s state="Alabama" nickname="" />     <s state="Arkansas" nickname="The Natural State" />     <s state="Arizona" nickname="" />     <s state="California" nickname="Golden State" />     <s state="Colorado" nickname="Centennial State" />     <s state="Connecticut" nickname="Nutmeg State" />     <s state="Dictrict of Columbia" nickname="" />     <s state="Delaware" nickname="Diamond State" />     <s state="Florida" nickname="Sunshine State" />     <s state="Georgia" nickname="Peach State" />     <s state="Hawaii" nickname="Aloha State" />     <s state="Iowa" nickname="Hawkeye State" />     <s state="Idaho" nickname="Gem State" />     <s state="Illinois" nickname="Prairie State" />     <s state="Indiana" nickname="Hoosier State" />     <s state="Kansas" nickname="Sunflower State" />     <s state="Kentucky" nickname="Bluegrass State" />     <s state="Louisiana" nickname="Pelican State" />     <s state="Massachusetts" nickname="Bay State" />     <s state="Maryland" nickname="Free State" />     <s state="Maine" nickname="Pine Tree State" />     <s state="Michigan" nickname="Wolverine State" />     <s state="Minnesota" nickname="North Star State" />     <s state="Missouri" nickname="Show-me State" />     <s state="Mississippi" nickname="Magnolia State" />     <s state="Montana" nickname="Treasure State" />     <s state="North Carolina" nickname="Tar Heel State" />     <s state="North Dakota" nickname="Sioux State" />     <s state="New Hampshire" nickname="Granite State" />     <s state="New Mexico" nickname="Land of Enchantment" />     <s state="Nevada" nickname="Sagebrush State" />     <s state="New York" nickname="Empire State" />     <s state="Ohio" nickname="Buckeye State" />     <s state="Oklahoma" nickname="Sooner State" />     <s state="Oregon" nickname="Beaver State" />     <s state="Pennsylvania" nickname="Keystone State" />     <s state="Rhode Island" nickname="The Ocean State" />     <s state="South Carolina" nickname="Palmetto State" />     <s state="South Dakota" nickname="Mount Rushmore State" />     <s state="Tennessee" nickname="Volunteer State" />     <s state="Texas" nickname="Lone Star State" />     <s state="Utah" nickname="Beehive State" />     <s state="Virginia" nickname="The Old Dominion" />     <s state="Washington" nickname="Evergreen State" />     <s state="Wisconsin" nickname="Badger State" />     <s state="West Virginia" nickname="Mountain State" />     <s state="Wyoming" nickname="Equality State" />   </co>   <co country="Canada" currency="Dollars">     <s state="Alberta" nickname="" />   </co>   <co country="United States" currency="">     <s state="Nebraska" nickname="Cornhusker State" />     <s state="New Jersey" nickname="Garden State" />     <s state="Vermont" nickname="Green Mountain State" />   </co> </r>   

A couple of things are odd about the preceding query result:

  • The ordering is completely messed up.

  • Table aliases are used to name elements.

These issues can be resolved by including an ORDER BY clause in the query and by removing aliases from the query. I have also changed the attribute names for regions, countries, and states to a generic value in the form of the name attributes:

   SELECT region.region AS name, region.population, region.area,    country.country AS name, country.currency, state.state AS name, state.nickname FROM region JOIN country ON country.region_id = region.region_id JOIN state ON state.country_id = country.country_id ORDER BY region.region, country.country, state.state FOR XML AUTO GO   

The result is now properly sorted with more sensibly named elements:

   <region name="North America" population="331599508" area="18729272">   <country name="Canada" currency="Dollars">     <state name="Alberta" nickname="" />     <state name="British Columbia" nickname="" />     <state name="Nova Scotia" nickname="" />     <state name="Ontario" nickname="" />     <state name="Quebec" nickname="" />   </country>   <country name="United States" currency="">     <state name="Alabama" nickname="" />     <state name="Alaska" nickname="" />     <state name="Arizona" nickname="" />     <state name="Arkansas" nickname="The Natural State" />     <state name="California" nickname="Golden State" />     <state name="Colorado" nickname="Centennial State" />     <state name="Connecticut" nickname="Nutmeg State" />     <state name="Delaware" nickname="Diamond State" />     <state name="Dictrict of Columbia" nickname="" />     <state name="Florida" nickname="Sunshine State" />     <state name="Georgia" nickname="Peach State" />     <state name="Hawaii" nickname="Aloha State" />     <state name="Idaho" nickname="Gem State" />     <state name="Illinois" nickname="Prairie State" />     <state name="Indiana" nickname="Hoosier State" />     <state name="Iowa" nickname="Hawkeye State" />     <state name="Kansas" nickname="Sunflower State" />     <state name="Kentucky" nickname="Bluegrass State" />     <state name="Louisiana" nickname="Pelican State" />     <state name="Maine" nickname="Pine Tree State" />     <state name="Maryland" nickname="Free State" />     <state name="Massachusetts" nickname="Bay State" />     <state name="Michigan" nickname="Wolverine State" />     <state name="Minnesota" nickname="North Star State" />     <state name="Mississippi" nickname="Magnolia State" />     <state name="Missouri" nickname="Show-me State" />     <state name="Montana" nickname="Treasure State" />     <state name="Nebraska" nickname="Cornhusker State" />     <state name="Nevada" nickname="Sagebrush State" />     <state name="New Hampshire" nickname="Granite State" />     <state name="New Jersey" nickname="Garden State" />     <state name="New Mexico" nickname="Land of Enchantment" />     <state name="New York" nickname="Empire State" />     <state name="North Carolina" nickname="Tar Heel State" />     <state name="North Dakota" nickname="Sioux State" />     <state name="Ohio" nickname="Buckeye State" />     <state name="Oklahoma" nickname="Sooner State" />     <state name="Oregon" nickname="Beaver State" />     <state name="Pennsylvania" nickname="Keystone State" />     <state name="Rhode Island" nickname="The Ocean State" />     <state name="South Carolina" nickname="Palmetto State" />     <state name="South Dakota" nickname="Mount Rushmore State" />     <state name="Tennessee" nickname="Volunteer State" />     <state name="Texas" nickname="Lone Star State" />     <state name="Utah" nickname="Beehive State" />     <state name="Vermont" nickname="Green Mountain State" />     <state name="Virginia" nickname="The Old Dominion" />     <state name="Washington" nickname="Evergreen State" />     <state name="West Virginia" nickname="Mountain State" />     <state name="Wisconsin" nickname="Badger State" />     <state name="Wyoming" nickname="Equality State" />   </country> </region>   

And now for something really odd I can change the hierarchical structure of the output by rearranging the sequence of the fields in the selected list of fields:

   SELECT country.country AS name, country.currency,    state.state AS name, state.nickname,    region.region AS name, region.population, region.area FROM region JOIN country ON country.region_id = region.region_id JOIN state ON state.country_id = country.country_id ORDER BY region.region, country.country, state.state FOR XML AUTO GO   

The partial result is nonsensical but it shows what can be done with AUTO mode:

   <country name="Canada" currency="Dollars">   <state name="Alberta" nickname="">     <region name="North America" population="331599508" area="18729272" />   </state>   <state name="British Columbia" nickname="">     <region name="North America" population="331599508" area="18729272" />   </state>   <state name="Nova Scotia" nickname="">     <region name="North America" population="331599508" area="18729272" />   </state>   <state name="Ontario" nickname="">     <region name="North America" population="331599508" area="18729272" />   </state>   <state name="Quebec" nickname="">     <region name="North America" population="331599508" area="18729272" />   </state> </country> <country name="United States" currency="">   <state name="Alabama" nickname="">     <region name="North America" population="331599508" area="18729272" />   </state>   <state name="Alaska" nickname="">     <region name="North America" population="331599508" area="18729272" />   </state> ...   <state name="Wyoming" nickname="Equality State">     <region name="North America" population="331599508" area="18729272" />   </state> </country>   

The AUTO clause works better when using subqueries. In the following query the TYPE clause is effectively used to typecast a subquery into the class of the calling (parent) query. Additionally, the ELEMENTS clause is used in this query to push all attributes into child elements, and the ROOT clause is used to add a root element as a wrapper for the entire XML document:

   SELECT region.region AS name, region.population, region.area,    (SELECT country.country AS name, country.currency,       (SELECT state.state AS name, state.nickname        FROM state WHERE state.country_id = country.country_id FOR XML AUTO, TYPE)     FROM country WHERE country.region_id = region.region_id FOR XML AUTO, TYPE) FROM region FOR XML AUTO, ROOT, ELEMENTS GO   

Here is another partial result:

   <root>   <region>     <name>Africa</name>     <population>789548670</population>     <area>26780325</area>     <country name="Algeria" currency="Algeria Dinars" />     <country name="Angola" currency="Kwanza" /> ...   <region>     <name>Near East</name>     <population>1499157105</population>     <area>4721322</area>     <country name="Bangladesh" currency="Taka" />     <country name="India" currency="Rupees" />     <country name="Pakistan" currency="Rupees" />     <country name="Sri Lanka" currency="Rupees" />     <country name="Turkey" currency="New Lira" />   </region>   <region>     <name>North America</name>     <population>331599508</population>     <area>18729272</area>     <country name="Canada" currency="Dollars">       <state name="British Columbia" nickname="" />       <state name="Nova Scotia" nickname="" />       <state name="Ontario" nickname="" />       <state name="Quebec" nickname="" />       <state name="Alberta" nickname="" />     </country>     <country name="Greenland" currency="" />     <country name="United States" currency="">       <state name="Alaska" nickname="" />       <state name="Alabama" nickname="" />       <state name="Arkansas" nickname="The Natural State" /> ...       <state name="Vermont" nickname="Green Mountain State" />     </country>   </region>   <region>     <name>Oceania</name>     <population>9133256</population>     <area>536238</area> ...     <country name="Guyana" currency="Dollars" />     <country name="Paraguay" currency="Guarani" />     <country name="Peru" currency="Nuevos Soles" />     <country name="Surinam" currency="" />     <country name="Uruguay" currency="Pesos" />     <country name="Venezuela" currency="Bolivares" />   </region> </root>   

Thats enough about AUTO mode. EXPLICIT mode allows for more precise XML document output.

FOR XML EXPLICIT Mode

As stated previously, EXPLICIT mode provides the most flexibility in creation of XML documents but with more complexity when building queries. The EXPLICIT clause allows for complete control of XML document structure and content. The EXPLICIT clause has very specific syntax in that the first field in the selected list is called TAG and the second field is called PARENT . This applies to the primary calling query, and all subqueries that use the EXPLICIT clause.

The TAG and PARENT fields are used to determine the level of each query in the hierarchy. The TAG field is a positive number determining the level of a query in the hierarchy. The PARENT field is also a positive number but is the tag number of the field under which the current element will be placed into the hierarchy. The PARENT of the first element can also be NULL or because the topmost element does not effectively have a parent in the resulting query. Data fields appear after the TAG and PARENT fields.

Every data field retrieved must be aliased using the following syntax notation:

   [<element>]!<tag#>[!<attribute>][!<directive>]   

This is a simple example query finding records in a single table:

   SELECT 1 AS TAG, 0 AS PARENT,   region AS [region!1!name] FROM region FOR XML EXPLICIT GO   

This is the result:

   <region name="Africa" /> <region name="Asia" /> <region name="Australasia" /> <region name="Caribbean" /> <region name="Central America" /> <region name="Europe" /> <region name="Far East" /> <region name="Middle East" /> <region name="Near East" /> <region name="North America" /> <region name="Oceania" /> <region name="Russian Federation" /> <region name="South America" />   

Now lets build a hierarchy. Each layer in the hierarchy is constructed as a separate query. The individual queries for each level are merged together using a UNION clause:

   SELECT 1 AS TAG, 0 AS PARENT,    region AS [region!1!name],    NULL AS [region!2!population], NULL AS [region!2!area] FROM region UNION ALL SELECT 2 AS TAG, 1 AS PARENT,    region AS [region!1!name],    population AS [region!2!population], area AS [region!2!area] FROM region ORDER BY [region!1!name] FOR XML EXPLICIT GO   

This is the result:

   <region name="Africa">   <region population="789548670" area="26780325" />   <region population="47382633" area="657741" /> </region> <region name="Asia" /> <region name="Australasia">   <region population="24340222" area="7886602" />   <region population="40417697" area="268857" /> </region> ... <region name="Russian Federation" /> <region name="South America">   <region population="375489788" area="17545171" /> </region>   

Doing the same thing with a multiple table join would look something like this:

   SELECT 1 AS TAG, 0 AS PARENT,    region AS [region!1!name],     population AS [region!1!population], area AS [region!1!area],    NULL AS [country!2!name], NULL AS [country!2!currency],    NULL AS [state!3!name], NULL AS [state!3!nickname] FROM region UNION ALL SELECT 2 AS TAG, 1 AS PARENT,    NULL AS [region!1!name],    NULL AS [region!1!population], NULL AS [region!1!area],    country AS [country!2!name], currency AS [country!2!currency],    NULL AS [state!3!name], NULL AS [state!3!nickname] FROM region JOIN country ON country.region_id = region.region_id UNION ALL SELECT 3 AS TAG, 2 AS PARENT,    NULL AS [region!1!name],    NULL AS [region!1!population], NULL AS [region!1!area],    NULL AS [country!2!name], NULL AS [country!2!currency],    state AS [state!3!name], nickname AS [state!3!nickname] FROM country JOIN state ON state.country_id = country.country_id ORDER BY [region!1!name], [country!2!name], [state!3!name] FOR XML EXPLICIT GO   

The preceding query is ugly and complicated. The better way is by using the PATH clause.

FOR XML PATH Mode

The PATH clause is a better option than the EXPLICIT clause because queries are just as powerful, but much easier to code, and much more efficient.

This example reads a single table, allowing specification of both attributes and elements:

   SELECT region_id AS "@region_id", region AS "name",    population AS "population", area AS "area" FROM region FOR XML PATH('region'), ROOT GO   

This is a partial result:

   <root>   <region region_id="1">     <name>Africa</name>     <population>789548670</population>     <area>26780325</area>   </region>   <region region_id="2">     <name>Asia</name>     <population>47382633</population>     <area>657741</area>   </region> ...   <region region_id="13">     <name>South America</name>     <population>375489788</population>     <area>17545171</area>   </region> </root>   

The next example uses nesting subqueries within the selected fields list in order to create a hierarchical structure. As you can see, using the PATH clause makes the query look a lot more like simple SQL, as opposed to the horrendous complexity of using the EXPLICIT clause:

   SELECT region_id AS "@region_id", region AS "name",    population AS "population", area AS "area", (    SELECT country_id AS "@country_id", country.country AS name, country.currency,    (       SELECT state_id AS "@state_id", state.state AS name, state.nickname       FROM state       WHERE state.country_id = country.country_id       FOR XML PATH('state'), TYPE    )    FROM country    WHERE country.region_id = region.region_id    FOR XML PATH('country'), TYPE ) FROM region FOR XML PATH('region'), ROOT GO   

This is a partial result:

   <root>   <region region_id="1">     <name>Africa</name>     <population>789548670</population>     <area>26780325</area>     <country country_id="1">       <name>Algeria</name>       <currency>Algeria Dinars</currency>     </country> ...   </region> ...   <region region_id="10">     <name>North America</name>     <population>331599508</population>     <area>18729272</area>     <country country_id="125">       <name>Canada</name>       <currency>Dollars</currency>       <state state_id="1">         <name>British Columbia</name>         <nickname />       </state>       <state state_id="2">         <name>Nova Scotia</name>         <nickname />       </state>       <state state_id="3">         <name>Ontario</name>         <nickname />       </state> ...     </country>   </region> </root>   

Overall, the FOR XML clause is quite an effective and simplistic method of producing XML documents data from relational database tables.



Beginning XML Databases
Beginning XML Databases (Wrox Beginning Guides)
ISBN: 0471791202
EAN: 2147483647
Year: 2006
Pages: 183
Authors: Gavin Powell

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net