Implementing XML in an Oracle Database | Beginning XML Databases (Wrox Beginning Guides)

There are a number of things that you need to know about using XML documents in a database:

Creating XML documents from a database: This is specialized functionality allowing the creation of XML documents in Oracle Database. Some of this was functionality described generically, using very basic SQL commands, in Chapter 4. This chapter will get a little more sophisticated, and obviously Oracle Database specific, on this particular topic.
Creating XML documents inside a database: In this chapter, you examine how to create XML documents within Oracle Database. This involves use of XMLType data types in tandem with Oracle Database ˆ specific functionality to create those XML document structures.
Retrieving data from XML documents: Again, there is specialized Oracle Database functionality for retrieving XML Document content from XMLType data types.
Changing XML document content: And once again, the XMLType data type comes packaged with specialized functionality. This functionality allows changes to XML documents while they are stored in an Oracle database.

Lets begin.

Creating XML Documents from an Oracle Database

When creating an XML document you are going to read data in an Oracle database. Reading data from any relational database generally involves reading records from tables. There are various Oracle Database ˆ specific methods of doing this:

The SQL/XML Standard: There are numerous SQL/XML functions and attributes that adhere to the SQL Standard for XML.

The SQL Standard for XML is a standard created and supported by INCITS (International Committee for Information Technology Standards).

Basic functions create elements (XML tags) and assign attributes to elements (attributes within individual tags), amongst other functionality.

The DBMS_XMLGEN Package: This package is complex and creates an XML document based on an entire query.

A package is a programming modular structure used by Oracle PL/SQL. A package is simply a group of stored procedures or functions, grouped together as a single block of code. Thus a package allows access to contained procedures and functions, by referencing procedures or functions as being within that package. This is a programming concept similar to creating class libraries in a language such as Java.
The SYS_XMLGEN Function: This function creates an XML document for each record as retrieved by a query to an Oracle database.
The XSU utility: This is an XML SQL utility specifically built for Java. This utility is too advanced to be covered in this book. Oracle Database allows for Java-written code to be both stored and executed within an Oracle database. Java can be used to code stored procedures and functions within an Oracle database. Additionally, that Java code can be compiled to a less interpretive form of machine code. This allows Java-created stored procedures and functions to help the overall performance of an Oracle database.

Because the SQL/XML functions are the accepted standard, lets examine these functions in detail.

The SQL/XML Standard

As already stated, SQL/XML as used in Oracle Database, does adhere to INCITS established standards. Lets first of all introduce some of the basic functionality of SQL/XML as it is available in Oracle Database. After that you will see examples to explain it all:

XMLELEMENT : Allows creation of XML tag elements. The syntax is as follows :

   XMLELEMENT ([ NAME ] identifier [, attributes ] [, expression [ , ... ] ])

XMLATTRIBUTES : Assigns attribute values to tags. The syntax is as follows:
```
   XMLATTRIBUTES (expression [ AS alias ] [ , ... ])   
```
XMLCONCAT : Concatenates multiple XML element tags. The syntax is as follows:
```
   XMLCONCAT (XMLType object)..   
```
XMLAGG : Creates a single column or expression from multiple rows by aggregating them into a single row and XML tag. The syntax is as follows:
```
   XMLAGG (XMLType object [ ORDER BY ... ])   
```
XMLCOLATTVAL : Attempts to standardize for relational tables. Every subset unit is given the tag column and the original name of the tag becomes an attribute of the column tag. For example, <name>Jim Jones</name> becomes <column name = "NAME">Jim Jones</column> .
```
   XMLCOLATTVAL (expression [ AS alias ] [ , ... ])   
```
XMLFOREST : Functions the same way as multiple XMLELEMENT executions where each element is created as a tag, containing their respective values. The syntax is as follows:
```
   XMLFOREST (expression [ AS alias ] [ , ... ])   
```
XMLTRANSFORM : Executes a transformation for repeating groups in an XML document, applying an XSL (eXtensible Style Sheet) to each repeating group item in the XML document. The XMLTRANSFORM method will not be demonstrated in the examples that follow because all it does is apply an XSL style sheet to an XML document. Applying XSL formatting and templating to XML documents is covered in Chapter 3. The syntax is as follows:
```
   XMLTRANSFORM (XMLType object, XMLType object)   
```

Now let me explain each method by example.

This example uses the XMLELEMENT function to create XML elements (tags) directly from an SQL statement (a much easier method than shown at the end of Chapter 4):

   SELECT XMLELEMENT("region", REGION) FROM REGION;

This is the result:

   <region>Africa</region> <region>Asia</region> <region>Australasia</region> <region>Caribbean</region> <region>Central America</region> <region>Europe</region> <region>Far East</region> <region>Middle East</region> <region>Near East</region> <region>North America</region> <region>Oceania</region> <region>Russian Federation</region> <region>South America</region>

And multiple XMLELEMENT functions can even be embedded within one another to produce a hierarchy of multiple layers , as an XML document should be constructed :

   SELECT    XMLELEMENT("region"       ,XMLELEMENT("name", REGION)       ,XMLELEMENT("population", POPULATION)) FROM REGION;

This is the result:

   <region><name>Africa</name><population>789548670</population></region> <region><name>Asia</name><population>47382633</population></region> <region><name>Australasia</name><population>24340222</population></region> <region><name>Caribbean</name><population>40417697</population></region> <region><name>Central America</name><population>142653392</population></region> <region><name>Europe</name><population>488674441</population></region> <region><name>Far East</name><population>2100636517</population></region> <region><name>Middle East</name><population>294625718</population></region> <region><name>Near East</name><population>1499157105</population></region> <region><name>North America</name><population>331599508</population></region> <region><name>Oceania</name><population>9133256</population></region> <region><name>Russian Federation</name><population>258037209</population></region> <region><name>South America</name><population>375489788</population></region>

In Oracle SQL*Plus tools you may have to execute SET LONG 2000 WRAP ON LINESIZE 5000 .

The resulting output, shown in the preceding code, is a little messy but the point is made. The preceding query is much easier than similar queries performed in Chapter 4.

This next example mixes elements and attributes together by adding in the XMLATTRIBUTES function:

   SELECT    XMLELEMENT("region", XMLATTRIBUTES(REGION AS "name")      ,XMLELEMENT("population", POPULATION)) FROM REGION;

This is the result with the name attribute highlighted for the first region:

   <region  name="Africa"  ><population>789548670</population></region> <region name="Asia"><population>47382633</population></region> <region name="Australasia"><population>24340222</population></region> <region name="Caribbean"><population>40417697</population></region> <region name="Central America"><population>142653392</population></region> <region name="Europe"><population>488674441</population></region> <region name="Far East"><population>2100636517</population></region> <region name="Middle East"><population>294625718</population></region> <region name="Near East"><population>1499157105</population></region> <region name="North America"><population>331599508</population></region> <region name="Oceania"><population>9133256</population></region> <region name="Russian Federation"><population>258037209</population></region> <region name="South America"><population>375489788</population></region>

And this query places all the fields applicable to each region into the element for each region:

   SELECT    XMLELEMENT("region",      XMLATTRIBUTES(REGION AS "name", POPULATION AS "population")) FROM REGION;

This is the result with the attributes highlighted for the first region again:

   <region  name="Africa" population="789548670"  ></region> <region name="Asia" population="47382633"></region> <region name="Australasia" population="24340222"></region> <region name="Caribbean" population="40417697"></region> <region name="Central America" population="142653392"></region> <region name="Europe" population="488674441"></region> <region name="Far East" population="2100636517"></region> <region name="Middle East" population="294625718"></region> <region name="Near East" population="1499157105"></region> <region name="North America" population="331599508"></region> <region name="Oceania" population="9133256"></region> <region name="Russian Federation" population="258037209"></region>

The XML_CONCAT function concatenates multiple XML fragments into a single XML pattern, which will be demonstrated in a roundabout way.

Concatenate means to add multiple strings together into a single string.

This example simply retrieves three fields from the COUNTRY table, all as separate elements:

   SELECT XMLELEMENT("region", REGION) AS "Region"    , XMLELEMENT("population", POPULATION) AS "Population"    , XMLELEMENT("area", AREA) AS "Area" FROM REGION;

The result is just nasty (showing only the first two rows, with huge and ugly long strings representing each XMLELEMENT method. The dots, .........., represent hundreds of space characters :

   <region>Asia</region>............<population>47382633</population>............<area >657741</area> <region>Australasia</region>............<population>24340222</population>.......... ..<area>7886602</area>

A TRIM function could be used, where the TRIM function removes all space characters and white space from either side of each of the three separate string values:

   SELECT TRIM(XMLELEMENT("region", REGION)) AS "Region"    , TRIM(XMLELEMENT("population", POPULATION)) AS "Population"    , TRIM(XMLELEMENT("area", AREA)) AS "Area" FROM REGION;

Again, the following shows the two regions in the previous example. This time the space characters are removed:

   <region>Asia</region> <population>47382633</population> <area>657741</area> <region>Australasia</region> <population>24340222</population> <area>7886602</area>

Now add the XMLCONCAT method and the benefit is obvious. There are now no spaces:

   SELECT XMLCONCAT(XMLELEMENT("region", REGION)    ,XMLELEMENT("population", POPULATION)    ,XMLELEMENT("area", AREA)) FROM REGION;

And now it makes sense to display all of the regions again. This is a much neater result:

   <region>Africa</region><population>789548670</population><area>26780325</area> <region>Asia</region><population>47382633</population><area>657741</area> <region>Australasia</region><population>24340222</population><area>7886602</area> <region>Caribbean</region><population>40417697</population><area>268857</area> <region>Central America</region><population>142653392</population><area>2360325</ area> <region>Europe</region><population>488674441</population><area>4583335</area> <region>Far East</region><population>2100636517</population><area>15357441</area> <region>Middle East</region><population>294625718</population><area>6798768</area> <region>Near East</region><population>1499157105</population><area>4721322</area> <region>North America</region><population>331599508</population><area>18729272</area> <region>Oceania</region><population>9133256</population><area>536238</area> <region>Russian Federation</region><population>258037209</population><area>21237500 </area> <region>South America</region><population>375489788</population><area>17545171</area>

The XMLAGG method aggregates separate lines of output, all into a single string. The XMLAGG method also allows sorting:

   SELECT XMLELEMENT("demographics", REGION' 'POPULATION' 'AREA) FROM REGION;

The result shows multiple lines returned:

   <demographics>Africa 789548670 26780325</demographics> <demographics>Asia 47382633 657741</demographics> <demographics>Australasia 24340222 7886602</demographics> <demographics>Caribbean 40417697 268857</demographics> <demographics>Central America 142653392 2360325</demographics> <demographics>Europe 488674441 4583335</demographics> <demographics>Far East 2100636517 15357441</demographics> <demographics>Middle East 294625718 6798768</demographics> <demographics>Near East 1499157105 4721322</demographics> <demographics>North America 331599508 18729272</demographics> <demographics>Oceania 9133256 536238</demographics> <demographics>Russian Federation 258037209 21237500</demographics> <demographics>South America 375489788 17545171</demographics>

Including the XMLAGG method returns the output as a single string and sorted in order of decreasing population. The POPULATION field is returned first in each demographics element to demonstrate this:

   SELECT XMLAGG(XMLELEMENT("demographics"    ,POPULATION' 'REGION' 'AREA) ORDER BY POPULATION DESC) FROM REGION;

This is the result, containing only the first three regions and clearly showing a single wrapped line in descending order of populations:

   <demographics>2100636517 Far East 15357441</demographics><demographics> 1499157105 N ear East 4721322</demographics><demographics>789548670 Africa 26780325</demographic s> ...

The disadvantage of XMLAGG is that it can only aggregate a single element, and thus fields are concatenated . The only real benefit of the XMLAGG method is to remove new line characters from output and resort. In general, unless XML documents are completely monstrous, or perhaps never viewed by a human eye, then it is not advantageous to remove new line characters. The following query would simply return regions, containing an executable XML document, simply aggregated into a single line:

   SELECT XMLAGG(XMLELEMENT("region", XMLATTRIBUTES(REGION AS "name")       ,XMLELEMENT("population", POPULATION)       ,XMLELEMENT("area", AREA))) FROM REGION;

The XMLCOLATTVAL method can be used to generate a form of a relational structure in an XML document format. The following query pulls three fields from the region table generating XML into a relational database recognizable format:

   SELECT    XMLELEMENT("region"      ,XMLCOLATTVAL(REGION, POPULATION, AREA)) FROM REGION;

This is the output of the query, containing only the first three regions with each separate line wrapped for the purposes of demonstration:

   <region><column name = "REGION">Africa</column><column name = "POPULATION">78954867 0</column><column name = "AREA">26780325</column></region> <region><column name = "REGION">Asia</column><column name = "POPULATION">47382633</ column><column name = "AREA">657741</column></region> <region><column name = "REGION">Australasia</column><column name = "POPULATION">243 40222</column><column name = "AREA">7886602</column></region>

Before going any further lets introduce the use of a function called EXTRACT , which has been mentioned previously in this chapter. This is the same query again but passing the entire resulting XMLType object produced by the SELECT statement through the EXTRACT function:

   SELECT    XMLELEMENT("region"      ,XMLCOLATTVAL(REGION, POPULATION, AREA)).EXTRACT('/*') FROM REGION;

Only the first two regions are shown but the output is so much easier to read in a standard XML document display format:

   <region>   <column name="REGION">Africa</column>   <column name="POPULATION">789548670</column>   <column name="AREA">26780325</column> </region> <region>   <column name="REGION">Asia</column>   <column name="POPULATION">47382633</column>   <column name="AREA">657741</column> </region>

In the future, the EXTRACT function will be used where it is appropriate.

The XMLFOREST method allows creation of multiple XML elements in a single method. The following query creates multiple elements by selecting fields. It is not creating each field as an individual element:

   SELECT    XMLELEMENT("region"      ,XMLFOREST(REGION AS "name", POPULATION AS "population", AREA AS "area")).EXTRACT('/*') FROM REGION;

This is the result showing only the first three regions:

   <region>   <name>Africa</name>   <population>789548670</population>   <area>26780325</area> </region> <region>   <name>Asia</name>   <population>47382633</population>   <area>657741</area> </region>

There is really one fundamental problem, which will be demonstrated shortly. You may have already surmised what this problem is. If not, then dont worry about it for it will be explained shortly. The first thing to discuss about this so far possibly unknown problem is the reason why there is a problem. There is a very fundamental disparity between relational database structure and object structures. A relational database and SQL working together output what are called tuples. A tuple is really a two-dimensional structure, where parent values are duplicated . An XML document can duplicate parent values by duplicating parent tags, but it is not supposed to. An XML document is supposed to be a hierarchical structure, where there is only ever one unique copy of each parent tag. This has already been explained in Chapter 2. Lets demonstrate the difference with some examples and show why SQL/XML falls short of the needs of proper XML document hierarchical structure. Examine the following example join query:

   SELECT R.REGION, R.POPULATION, R.AREA,    CO.COUNTRY, CO.POPULATION, CO.AREA, CO.CURRENCY, CO.RATE FROM REGION R JOIN COUNTRY CO ON (CO.REGION_ID=R.REGION_ID) WHERE R.REGION_ID IN (3,9) ORDER BY R.REGION, CO.COUNTRY;

In the partial result that follows, note how the regions are duplicated in all records (there are two occurrences of Australasia and five of Near East):

   REGION       POPULATION       AREA COUNTRY      POPULATION       AREA CURRENCY ------------ ---------- ---------- ------------ ---------- ---------- -------- Australasia    24340222    7886602 Australia      20264082    7617931 Dollars Australasia    24340222    7886602 New Zealand     4076140     268671 Dollars Near East    1499157105    4721322 Bangladesh    147365352     133911 Taka Near East    1499157105    4721322 India        1095351995    2973190 Rupees Near East    1499157105    4721322 Pakistan      165803560     778720 Rupees Near East    1499157105    4721322 Sri Lanka      20222240      64740 Rupees Near East    1499157105    4721322 Turkey         70413958     770761 New Lira

As you see in the preceding result, the parent regions for each country are duplicated. This is the equivalent SQL/XML query:

   SELECT    XMLELEMENT("region", XMLATTRIBUTES(R.REGION "name"),       XMLFOREST(R.POPULATION AS "population", R.AREA AS "area"),       XMLELEMENT("country", XMLATTRIBUTES(CO.COUNTRY "name"),          XMLFOREST(CO.POPULATION "population", CO.AREA "area",                    CO.CURRENCY "currency", CO.RATE "rate"))).EXTRACT('/*') FROM REGION R JOIN COUNTRY CO ON (CO.REGION_ID=R.REGION_ID) WHERE R.REGION_ID IN (3,9) ORDER BY R.REGION, CO.COUNTRY;

And this is the result of the SQL/XML query, showing the first four countries :

   <region  name="Australasia">   <population>24340222</population>   <area>7886602</area>  <country name="Australia">     <population>20264082</population>     <area>7617931</area>     <currency>Dollars</currency>     <rate>1.30141</rate>   </country> </region> <region  name="Australasia">   <population>24340222</population>   <area>7886602</area>  <country name="New Zealand">     <population>4076140</population>     <area>268671</area>     <currency>Dollars</currency>     <rate>1.42369</rate>   </country> </region> <region  name="Near East">   <population>1499157105</population>   <area>4721322</area>  <country name="Bangladesh">     <population>147365352</population>     <area>133911</area>     <currency>Taka</currency>     <rate>0</rate>   </country> </region> <region  name="Near East">   <population>1499157105</population>   <area>4721322</area>  <country name="India">     <population>1095351995</population>     <area>2973190</area>     <currency>Rupees</currency>     <rate>43.62</rate>   </country> </region>

Note how in the preceding XML output the highlighted parts show duplicated information. Imagine an XML document of this nature, where there are many layers of parent-child relationships between tags. The result could be XML data that is physically very much larger than it should be. Physical file sizes can affect performance and scanning efficiency drastically because I/O costs can go up.

I/O means Input/Output and refers to the activity of reading and writing from and to disk. Disk storage is much slower than communication between RAM (Random Access Memory) and the CPU (Central Processing Unit the processor) of a computer. I/O is very significant and should always be considered seriously.

Of course, a GROUP BY clause could be used in the join query to return only a single record for each region:

   SELECT R.REGION, R.POPULATION, R.AREA,    SUM(CO.POPULATION), SUM(CO.AREA) FROM REGION R JOIN COUNTRY CO ON (CO.REGION_ID=R.REGION_ID) WHERE R.REGION_ID IN (3,9) GROUP BY R.REGION, R.POPULATION, R.AREA ORDER BY R.REGION;

But as you can see in the result, the population and area values for each country are summarized, and the currency is completely listed:

   REGION       POPULATION       AREA SUM(CO.POPULATION) SUM(CO.AREA) ------------ ---------- ---------- ------------------ ------------ Australasia    24340222    7886602           24340222      7886602 Near East    1499157105    4721322         1499157105      4721322

And, of course, the GROUP BY clause could be applied to the SQL/XML query as well:

   SELECT    XMLELEMENT("region", XMLATTRIBUTES(R.REGION "name"),       XMLFOREST(R.POPULATION AS "population",          R.AREA AS "area",          SUM(CO.POPULATION) AS "populationSum",          SUM(CO.AREA) AS "areaSum")).EXTRACT('/*') FROM REGION R JOIN COUNTRY CO ON (CO.REGION_ID=R.REGION_ID) WHERE R.REGION_ID IN (3,9) GROUP BY R.REGION, R.POPULATION, R.AREA ORDER BY R.REGION;

The following produces unique regions. Again, there are no details on the countries because they have been aggregated into their respective parent regions:

   <region name="Australasia">   <population>24340222</population>   <area>7886602</area>   <populationSum>24340222</populationSum>   <areaSum>7886602</areaSum> </region> <region name="Near East">   <population>1499157105</population>   <area>4721322</area>   <populationSum>1499157105</populationSum>   <areaSum>4721322</areaSum> </region>

As you can see in the preceding result, the GROUP BY clause summarizes, removes country records, and becomes something that is not XML. What the XML document should really look like is shown here. The GROUP BY clause does not really help in this situation:

   <regions>   <region name="Australasia">     <population>24340222</population><area>7886602</area>     <country name="Australia">       <population>20264082</population><area>7617931</area>       <currency>Dollars</currency><rate>1.30141</rate>     </country>     <country name="New Zealand">       <population>4076140</population><area>268671</area>       <currency>Dollars</currency><rate>1.42369</rate>     </country>   </region>   <region name="Near East">     <population>1499157105</population><area>4721322</area>     <country name="Bangladesh">       <population>147365352</population><area>133911</area>       <currency>Taka</currency><rate>0</rate>     </country>     <country name="India">       <population>1095351995</population><area>2973190</area>       <currency>Rupees</currency><rate>43.62</rate>     </country>     <country name="Pakistan">       <population>165803560</population><area>778720</area>       <currency>Rupees</currency><rate>0</rate>     </country>     <country name="Sri Lanka">        <population>20222240</population><area>64740</area>        <currency>Rupees</currency><rate>99.4</rate>     </country>     <country name="Turkey">       <population>70413958</population><area>770761</area>       <currency>New Lira</currency><rate>0</rate>     </country>   </region> </regions>

The preceding example has some start-end tag sets placed onto the same line to keep the output smaller, and perhaps on a single page. The fundamental difference with the preceding XML document is that there is no duplication of regional information. Just imagine how much duplication of parent information would be contained in a join of the REGION , COUNTRY , STATE and CITY tables. The XML document using SQL/XML could easily be thousands of times larger. And Oracle Database does have an XMLROOT function. It does not appear to be much use as of Oracle10 g Database (10.2.0.1). I took the liberty of manually adding a root tag to the preceding properly structured XML document.

So the problem discussed earlier is essentially that even though relational databases store data in hierarchical table structures, queries return into flattened two-dimensional structures. XML document object hierarchical structure is not really equivalent to a SQL query as it is hierarchical and thus more three-dimensional from a structural perspective. So therefore, SQL and even SQL/XML are of limited use with respect to XML. The only way to produce an XML document like the preceding well- formed XML document is to have full program control with the language querying the database. SQL is a scripting language. A scripting language does not allow full program control because successive commands cannot use the results of previous commands and adapt accordingly . A proper programming language does allow full program control with communication between any two points in a piece of code.

Oracle Database has all sorts of other tricks such as CAST(MULTISET( into nested tables. A nested table allows the creation of a table within a table, a little like an object collection structural relationship. CAST(MULTISET( essentially allows type casting between different object layers of a class hierarchy. The only problem with a solution like this is that one will be attempting to impose an object architecture onto a relational database. That is just as bad as imposing a relational structure onto an object structure. You have already seen this with the previous examples generating potentially tremendous amounts of data duplication from relational queries into XML documents using SQL/XML.

The SYS_XMLGEN Function

The SYS_XMLGEN function returns an XMLType data type object instance, which contains an XML document. Each row in a query is returned as a separate XML document. The following query converts each record retrieved from the REGION table into an XML document:

   SELECT SYS_XMLGEN(REGION) FROM REGION;

This result shows the first two records returned:

   <?xml version="1.0"?> <REGION>Africa</REGION> <?xml version="1.0"?> <REGION>Asia</REGION>

The input to the SYS_XMLGEN function should be a scalar expression. In other words, a list of fields cannot be used. Finding multiple fields requires retrieval of multiple expressions:

   SELECT SYS_XMLGEN(REGION), SYS_XMLGEN(POPULATION) FROM REGION;

This is the result, with a few obvious issues:

   <?xml version="1.0"?> <?xml version="1.0"?> <REGION>Africa</REGION> <POPULATION>789548670</POPULATION> <?xml version="1.0"?> <?xml version="1.0"?> <REGION>Asia</REGION> <POPULATION>47382633</POPULATION>

A single scalar expression is required for a single SYS_XMLGEN function execution. Thus the fields that are to be retrieved can be structured into new user-defined types. In setting up queries to join regions, countries, and cities, the following user -defined types can be created. The first type is of three fields in the CITY table:

   CREATE OR REPLACE TYPE tCITY AS OBJECT(CITY VARCHAR2(32), POPULATION INTEGER,    AREA INTEGER); /

The second type creates a collection of the tCITY type:

   CREATE OR REPLACE TYPE tCITIES AS TABLE OF tCITY; /

Next create a type to represent some of the fields in the COUNTRY table, and also embed the collection of cities (tCITIES) as a collection structure. Oracle Database allows a fully indexable, single column table data type, called a nested table (TABLE):

   CREATE OR REPLACE TYPE tCOUNTRY AS OBJECT(COUNTRY VARCHAR2(32), POPULATION INTEGER,    AREA INTEGER, CURRENCY VARCHAR2(32), RATE FLOAT, CITIES tCITIES); /

It follows that a collection of countries is created, containing multiple countries, where each country contains multiple cities:

   CREATE OR REPLACE TYPE tCOUNTRIES AS TABLE OF tCOUNTRY; /

Last, create a region type, containing region fields, and a collection of countries, which in turn contain a collection of cities:

   CREATE OR REPLACE TYPE tREGION AS OBJECT(REGION VARCHAR2(32), POPULATION INTEGER,    AREA INTEGER, COUNTRIES tCOUNTRIES); / CREATE OR REPLACE TYPE tREGIONS AS TABLE OF tREGION; /

If Oracle Database TYPE objects as shown in the preceding code are improperly coded, replacing the TYPE objects requires dropping them in the reverse order that they were created. This is relevant in this case because a three-layer hierarchy of TYPE objects is created:

   DROP TYPE tREGIONS; DROP TYPE tREGION; DROP TYPE tCOUNTRIES; DROP TYPE tCOUNTRY; DROP TYPE tCITIES; DROP TYPE tCITY;

To begin with, the following query finds all cities within countries. The CAST function converts one type into another. In object parlance this is called a type-caste. The MULTISET function treats the result as a list, which in this case is a list of cities:

   SELECT SYS_XMLGEN (tCOUNTRY    (CO.COUNTRY, CO.POPULATION, CO.AREA, CO.CURRENCY, CO.RATE,       CAST       (MULTISET          (SELECT tCITY(CI.CITY, CI.POPULATION, CI.AREA)             FROM CITY CI             WHERE CI.COUNTRY_ID=CO.COUNTRY_ID) AS tCITIES))) AS COUNTRIES FROM COUNTRY CO;

If you executed the DROP TYPE commands shown previously, you will have to recreate the types again as shown prior to the DROP TYPE commands. Otherwise, the preceding query will not function.

The result that follows shows a single XML document for each record. This result shows only the first two countries in the full result of the query, which contains multiple cities within each country, such as Oran and Algiers, both cities in the country of Algeria:

   <?xml version="1.0"?> <ROW>  <COUNTRY>Algeria</COUNTRY>  <POPULATION>32930091</POPULATION>  <AREA>2381741</AREA>  <CURRENCY>Algeria Dinars</CURRENCY>  <RATE>0</RATE>  <CITIES>   <TCITY>  <CITY>Oran</CITY>  <POPULATION>1200000</POPULATION>    <AREA>0</AREA>   </TCITY>   <TCITY>  <CITY>Algiers</CITY>  <POPULATION>4100000</POPULATION>    <AREA>0</AREA>   </TCITY>  </CITIES> </ROW> <?xml version="1.0"?> <ROW>  <COUNTRY>Angola</COUNTRY>  <POPULATION>12127071</POPULATION>  <AREA>1246699</AREA>  <CURRENCY>Kwanza</CURRENCY>  <RATE>0</RATE>  <CITIES>   <TCITY>    <CITY>Luanda</CITY>    <POPULATION>2800000</POPULATION>    <AREA>0</AREA>   </TCITY>  </CITIES> </ROW>

If elements such as <CURRENCY> do not appear in the preceding script, that is because the default is that an element be omitted if it has no entry in the XML document as stored in the database. A record in a table is called a row in Oracle Database.

This next example goes one step further by returning all cities embedded within all parent countries, embedded within all parent regions:

   SELECT SYS_XMLGEN (tREGION    (R.REGION, R.POPULATION, R.AREA,       CAST       (MULTISET          (SELECT tCOUNTRY             (CO.COUNTRY, CO.POPULATION, CO.AREA, CO.CURRENCY, CO.RATE,                CAST                (MULTISET                   (SELECT tCITY(CI.CITY, CI.POPULATION, CI.AREA)                      FROM CITY CI                      WHERE CI.COUNTRY_ID=CO.COUNTRY_ID) AS tCITIES))             FROM COUNTRY CO             WHERE CO.REGION_ID=R.REGION_ID) AS tCOUNTRIES))) AS REGIONS FROM REGION R;

This is once again a partial result, showing the XML document for the region of Australasia. The region Australasia in the demographics database for this book includes only the countries of Australia and New Zealand. This example contains all cities, with the countries, within the region of Australasia:

   <?xml version="1.0"?> <ROW>  <REGION>Australasia</REGION>  <POPULATION>24340222</POPULATION>  <AREA>7886602</AREA>  <COUNTRIES>   <TCOUNTRY>  <COUNTRY>Australia</COUNTRY>  <POPULATION>20264082</POPULATION>    <AREA>7617931</AREA>    <CURRENCY>Dollars</CURRENCY>    <RATE>1.30141</RATE>    <CITIES>     <TCITY>  <CITY>Adelaide</CITY>  <POPULATION>1125000</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Brisbane</CITY>  <POPULATION>1775000</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Canberra</CITY>  <POPULATION>0</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Darwin</CITY>  <POPULATION>0</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Hobart</CITY>  <POPULATION>0</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Melbourne</CITY>  <POPULATION>3600000</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Perth</CITY>  <POPULATION>1450000</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Sydney</CITY>  <POPULATION>4300000</POPULATION>      <AREA>0</AREA>     </TCITY>    </CITIES>   </TCOUNTRY>   <TCOUNTRY>  <COUNTRY>New Zealand</COUNTRY>  <POPULATION>4076140</POPULATION>    <AREA>268671</AREA>    <CURRENCY>Dollars</CURRENCY>    <RATE>1.42369</RATE>    <CITIES>     <TCITY>  <CITY>Auckland</CITY>  <POPULATION>1250000</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Christchurch</CITY>  <POPULATION>0</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>  <CITY>Wellington</CITY>  <POPULATION>0</POPULATION>      <AREA>0</AREA>     </TCITY>    </CITIES>   </TCOUNTRY>  </COUNTRIES> </ROW>

If you dont see this output, then scroll up in the SQL*Plus tool.

Still, you get separate XML documents for each record returned from a query using the SYS_XMLGEN function, but at least you now have properly structured XML documents, containing appropriate XML hierarchical object structures.

PL/SQL and XML

Another option for generating XML documents using Oracle Database is the DBMS_XMLGEN PL/SQL built-in package. This package is somewhat complex so there is only a single example. An example multiple hierarchical layered query is shown here:

   CREATE TABLE TMP_CLOB(obj CLOB);

The CREATE TABLE statement creates a table to contain an XML document in a binary text object.

   DECLARE    qry DBMS_XMLGEN.CTXHANDLE;    obj CLOB; BEGIN    DBMS_XMLGEN.SETROWTAG(qry,NULL);    qry := DBMS_XMLGEN.NEWCONTEXT('SELECT tCOUNTRY(CO.COUNTRY, CO.POPULATION, CO.AREA, CO.CURRENCY, CO.RATE, CAST(MULTISET(SELECT CI.CITY, CI.POPULATION, CI.AREA    FROM CITY CI WHERE CI.COUNTRY_ID=CO.COUNTRY_ID) AS tCITIES)) AS countryXML FROM COUNTRY CO WHERE CO.COUNTRY     obj := DBMS_XMLGEN.GETXML(qry);    DELETE FROM TMP_CLOB;    INSERT INTO TMP_CLOB VALUES(obj);    COMMIT;    DBMS_XMLGEN.CLOSECONTEXT(qry); END; / SELECT * FROM TMP_CLOB;

The benefit of using the DBMS_XMLGEN package rather than SQL/XML, is that it produces a single root element called <ROWSET> and a single XML tag, and thus a complete and usable XML document. This is the result of the preceding query:

    <?xml version="1.0"?>   <ROWSET>  <ROW>   <COUNTRYXML>    <COUNTRY>Algeria</COUNTRY>    <POPULATION>32930091</POPULATION>    <AREA>2381741</AREA>    <CURRENCY>Algeria Dinars</CURRENCY>    <RATE>0</RATE>    <CITIES>     <TCITY>      <CITY>Oran</CITY>      <POPULATION>1200000</POPULATION>      <AREA>0</AREA>     </TCITY>     <TCITY>      <CITY>Algiers</CITY>      <POPULATION>4100000</POPULATION>      <AREA>0</AREA>     </TCITY>    </CITIES>   </COUNTRYXML>  </ROW>  <ROW>   <COUNTRYXML>    <COUNTRY>Angola</COUNTRY>    <POPULATION>12127071</POPULATION>    <AREA>1246699</AREA>    <CURRENCY>Kwanza</CURRENCY>    <RATE>0</RATE>    <CITIES>     <TCITY>      <CITY>Luanda</CITY>      <POPULATION>2800000</POPULATION>      <AREA>0</AREA>     </TCITY>    </CITIES>   </COUNTRYXML>  </ROW>  </ROWSET>  1 row selected.

The preceding method using the DBMS_XMLGEN package creates lots of fluff in the way tags represent user-defined type names , and other tags such as <ROW> and <ROWSET> . These tags are all essentially meaningless and superfluous to XML. Also generating XML tags in uppercase characters is quite often non-standard. One of the plus sides to using XML is its readability with the human eye. It is also a known fact that mixed case characters are easier to read than just lowercase or uppercase characters. Also, lowercase is better than CAPITALS.

All of the methods for generating XML documents in this chapter so far are either inadequate (not creating completed and usable XML documents), or they are convoluted. So far, the pseudocode example presented at the end of Chapter 4 remains the best method for generating XML documents.