XML::DBMS Perl Module


The XML::DBMS Perl module is middleware for transferring data between XML and relational databases. Originally written by Ronald Bourret as a Java library, it was later ported to Perl by Nick Semenov. In this section, I'll be discussing version 1.1 of this library. Although it is a very powerful module, it is still lacking some important features. Most (if not all) of the missing features will be addressed in Version 2.0.

One of the missing features in the current version is type checking. The type-checking feature will allow users to transfer data between different database types. Using Version 1.1, you have to manually manage the different data types. As of this writing,Version 2.0 is still in alpha stages, so I won't discuss it here. As the development progresses, information becomes available on the official XML-DBMS web site, http://www.rpbourret.com/xmldbms/,as well as available on a number of mailing lists.

XMl::DBMS Perl Module Example

Let's revisit the scenario discussed earlier of transferring data from one database to another. The XML::DBMS Perl module can be used to develop a middleware component using a mapping syntax to bind database column names to XML elements and vice versa.

In this example, we'll use XML as an intermediate storage facility during the translation. Why, you might ask, do I need an XML-based middleware component if I can just query one database and insert it into another? Well, if you have simultaneous access to both databases, you can easily develop a Perl DBI-based application to perform that task. However, the problem is a little more difficult if you have separate applications (one to read the source database and another to write to the source database), or if they don't have access to both databases at the same time. Let's say that your company is teaming with another company to compete for a contract and you both need to exchange some of your research and development work, and both databases reside behind firewalls.You'll need to use some type of a two-step process to transfer the data.

This example will consist of two databases, a source database named from_db and a destination database named to_db . The from_db database has a table named membership , and the to_db database has a table named user_info . For this example, we'll use a middleware XML component to transfer the contents of the membership table in the from_db database to the user_info table in the to_db database. The process is illustrated in Figure 6.2.

Figure 6.2. Using XML middleware to translate between two databases.

graphics/06fig02.gif

Using XML as the intermediate data format has several benefits over a standard text file. For example, XML supports data type mapping and validation. In the next section, we'll take a look at the example based on the XML::DBMS Perl module.

Create and Populate the Source Database Tables

To test the application, we'll need to first create two databases with tables and populate one table with sample seed data. The SQL statements must be entered at the mysql prompt or using the Perl DBI. To access the mysql prompt, type mysql at the command prompt on a Unix machine or from a DOS console. Another option is to download the MySQL GUI available for download at http://www.mysql.com.

Using a RDBMS for This Example

This example was developed using a MySQL database. MySQL is an open source database available for a number of platforms from http://www.mysql.com. You can follow along with any RDBMS as long as the Perl DBI supports your RDBMS (that is, a database driver is available for it). If you are using a RDBMS other than MySQL, you may need to modify the following SQL statements, but the majority of SQL standard-compliant databases process the SQL statements without any problems. An excellent MySQL book is MySQL , by Paul DuBois , available from New Riders.

Listing 6.5 SQL statements to create and populate required databases and tables. (Filename: ch6_sql_statements.txt)
 1.   create database from_db;  2.   create database to_db;  3.   use from_db;  4.  5.   create table if not exists membership (  6.   user_id varchar(20),  7.   first_name varchar(40),  8.   last_name varchar(40));  9.  10.  insert into membership values ('isterin', 'Ilya', 'Sterin');  11.  insert into membership values ('mriehl', 'Mark', 'Riehl');  12.  insert into membership values ('pdubois', 'Paul', 'DuBois');  13.  14.  use to_db; 15.  16.  create table if not exists user_info (  17.  unique_user varchar(20),  18.  first_name varchar(40),  19.  last_name varchar(40)); 

Let's walk through the SQL statements shown in Listing 6.5. As I mentioned earlier, these are generic SQL statements that should work on any standards-compliant RDBMS.

1 “3 The first two lines create two databases named from_db and to_db . The use from_db statement selects the from_db as the current database. Until we select another database (by employing the use command again), all of our commands are directed at from_db .

 1.   create database from_db;  2.   create database to_db;  3.   use from_db; 

5 “12 This group of SQL commands creates a table named membership that has three columns : user_id , first_name , and last_name . All three columns are of the varchar data type. A varchar is a variable length character string, and the argument represents the maximum possible size . After creating the tables, we insert data. Note that the data is inserted in the same order that the columns were created (that is, user_id , first_name , and last_name ). After these statements, we have a table named membership , and the contents are shown in Table 6.3.

 5.   create table if not exists membership (  6.   user_id varchar(20),  7.   first_name varchar(40),  8.   last_name varchar(40));  9.  10.  insert into membership values ('isterin', 'Ilya', 'Sterin');  11.  insert into membership values ('mriehl', 'Mark', 'Riehl');  12.  insert into membership values ('pdubois', 'Paul', 'DuBois'); 
Table 6.3. Contents of the membership table.

user_id

first_name

last_name

Isterin

Ilya

Sterin

Mriehl

Mark

Riehl

Pdubois

Paul

DuBois

a more flexible approach. 185XML::DBMS Perl 3. <XMLToDBMS Version="1.0"> 4. <Options> 5.

14 “19 As I mentioned earlier, the use statement selects the current database. Now our commands are directed at the to_db database. We then create a table named user_info in the to_db database that has unique_user , first_name , and last_name columns defined. Note that the name of the first column has changed from user_id in the membership table to unique_user in the user_info table.

 14.  use to_db;  15.  16.  create table if not exists user_info (  17.  unique_user varchar(20),  18.  first_name varchar(40),  19.  last_name varchar(40)); 
XML::DBMS Perl Application

The development of this application has three distinct steps. First, we create an XML document that holds the data pulled from the source database. Second, we generate an XML document that contains the mapping relationship between the source and destination databases. Finally, we develop the Perl program that actually performs the mapping. Let's take a look at the first step in the process.

Develop the XML Storage Document

We'll first need to create an XML::DBMS map file for the from_db database and specify the way it maps to the resulting XML file. The resulting XML file, as shown in Listing 6.6, is used as a temporary storage facility, which we'll then use to upload to the to_db database.

Listing 6.6 XML document that will be used to store the contents of the source database. (Filename: ch6_from_db.xml)
 1.   <?xml version="1.0" ?>  2.   <!DOCTYPE XMLToDBMS SYSTEM "xmldbms.dtd">  3.   <XMLToDBMS Version="1.0">  4.     <Options>  5.       <EmptyStringIsNull/>  6.     </Options>  7.     <Maps>  8.       <IgnoreRoot>  9.         <ElementType Name="users"/>  10.        <PseudoRoot>  11.          <ElementType Name="user"/>  12.          <CandidateKey Generate="No">  13.            <Column Name="user_id"/>  14.          </CandidateKey>  15.        </PseudoRoot>  16.      </IgnoreRoot>  17.      <ClassMap>  18.        <ElementType Name="user"/>  19.        <ToClassTable>  20.          <Table Name="membership"/>  21.        </ToClassTable>  22.        <PropertyMap>  23.          <ElementType Name="userid"/>  24.          <ToColumn>  25.            <Column Name="user_id"/>  26.          </ToColumn> 27.        </PropertyMap>  28.        <PropertyMap>  29.          <ElementType Name="firstname"/>  30.          <ToColumn>  31.            <Column Name="first_name"/>  32.          </ToColumn>  33.        </PropertyMap>  34.        <PropertyMap>  35.          <ElementType Name="lastname"/>  36.          <ToColumn>  37.            <Column Name="last_name"/>  38.          </ToColumn>  39.        </PropertyMap>  40.      </ClassMap>  41.    </Maps>  42.  </XMLToDBMS> 

1 “3 Here we declare the XML document, specify an external DTD, and then declare the XMLToDBMS root tag, with which we specify the XML-DBMS version number.

 1.   <?xml version="1.0" ?>  2.   <!DOCTYPE XMLToDBMS SYSTEM "xmldbms.dtd">  3.   <XMLToDBMS Version="1.0"> 

4 “6 The <Options> branch enables you to specify global options. In this example, the EmptyStringIsNull element indicates that we want to map empty values in the XML document to NULL columns in the database.

 4.     <Options>  5.       <EmptyStringIsNull/>  6.     </Options> 

7 “16 The IgnoreRoot element enables us to define the root element and the row element that will be used to wrap each individual row of data extracted from the database. Setting the ElementType 's Name attribute to "users" indicates that we want the <users> element to be our root tag. The PseudoRoot element deals with the row element instances. We first define the element name we want to use for each row of data, so we set our ElementType 's Name attribute to "user" . Notice that we used the ElementType tag twice. We're allowed to do this because each use is in a different context because each case belongs to a different parent node. The CandidateKey construct enables us to specify the database column to use as our key. To define the key, we define a Column tag and set its Name attribute to "user_id" . As you can see, "user_id" is one of our columns names in the from_db database.

 7.     <Maps>  8.       <IgnoreRoot>  9.         <ElementType Name="users"/>  10.        <PseudoRoot>  11.          <ElementType Name="user"/>  12.          <CandidateKey Generate="No">  13.            <Column Name="user_id"/>  14.          </CandidateKey>  15.        </PseudoRoot>  16.      </IgnoreRoot> 

17 “42 We then proceed to define our ClassMap structure. The number of class maps should equal the number of tables from which the data is extracted. So, if data is extracted from two tables, we'll have to define two ClassMap structures, with each containing each table's map information. The ElementType tag appears again, and here it acts as a hint to map the rows from this particular ClassMap to the <user> element, which we already specified earlier to act as our row instance tag. The ToClassTable definition is used to specify the name of the table to which this ClassMap is mapping, therefore, we define the Table element and set its Name attribute to "member ship" . Our last task is to define the PropertyMap construct(s) to map the column names to XML tag names. We must define one PropertyMap for each column name we want to map. The first instance of PropertyMap maps the user_id column within the membership table to userid tag in the resulting XML. The next two do the same for the first_name and last_name database columns.

 17.      <ClassMap> 18.        <ElementType Name="user"/>  19.        <ToClassTable>  20.          <Table Name="membership"/>  21.        </ToClassTable>  22.        <PropertyMap>  23.          <ElementType Name="userid"/>  24.          <ToColumn>  25.            <Column Name="user_id"/>  26.          </ToColumn>  27.        </PropertyMap>  28.        <PropertyMap> 29.          <ElementType Name="firstname"/>  30.          <ToColumn>  31.            <Column Name="first_name"/>  32.          </ToColumn>  33.        </PropertyMap>  34.        <PropertyMap>  35.          <ElementType Name="lastname"/>  36.          <ToColumn>  37.            <Column Name="last_name"/>  38.          </ToColumn>  39.        </PropertyMap>  40.      </ClassMap>  41.    </Maps>  42.  </XMLToDBMS> 
Develop the XML Mapping Document

You now have a pretty good idea of how to map a database to a resulting XML file. Now, let's look at the other half of the problem, mapping the resulting XML document to a different database. After that has been finished, you can easily transfer data from the source database to an XML file and back into the destination database. Listing 6.7 shows you how we can map the XML document that was built from the source database membership table to the destination database user_info table.

Listing 6.7 XML document that defines the mapping from the old database to the new database. (Filename: ch6_to_db.xml)
 1.   <?xml version="1.0" ?>  2.   <!DOCTYPE XMLToDBMS SYSTEM "xmldbms.dtd">  3.   <XMLToDBMS Version="1.0"> 4.    <Options>  5.                   <EmptyStringIsNull/>  6.      </Options>  7.      <Maps>  8.         <IgnoreRoot>  9.            <ElementType Name="users"/>  10.           <PseudoRoot>  11.              <ElementType Name="user"/>  12.              <CandidateKey Generate="No">  13.                 <Column Name="unique_user"/>  14.              </CandidateKey>  15.           </PseudoRoot>  16.        </IgnoreRoot> 17.           <ClassMap>  18.           <ElementType Name="user"/>  19.           <ToClassTable>  20.              <Table Name="user_info"/>  21.           </ToClassTable>  22.           <PropertyMap>  23.              <ElementType Name="userid"/>  24.              <ToColumn>  25.                 <Column Name="unique_user"/>  26.              </ToColumn>  27.           </PropertyMap>  28.           <PropertyMap>  29.              <ElementType Name="firstname"/>  30.              <ToColumn>  31.                 <Column Name="first_name"/>  32.              </ToColumn>  33.           </PropertyMap>  34.           <PropertyMap>  35.              <ElementType Name="lastname"/>  36.              <ToColumn>  37.                 <Column Name="last_name"/>  38.              </ToColumn>  39.           </PropertyMap>  40.        </ClassMap>  41.     </Maps>  42.  </XMLToDBMS> 

8 “16 Starting with the IgnoreRoot element, we notice there is some similarity to Listing 6.6 (ch6_from_db.xml). The only change is that the Name attribute in the <Column> element is set to "unique_user" instead of "user_id" . This column is the key column in the to_db database. The root tags and the record tag remain the same as users and user , respectively.

 8.         <IgnoreRoot>  9.            <ElementType Name="users"/>  10.           <PseudoRoot>  11.              <ElementType Name="user"/>  12.              <CandidateKey Generate="No">  13.                 <Column Name="unique_user"/>  14.              </CandidateKey>  15.           </PseudoRoot>  16.        </IgnoreRoot> 

19 “21 The Table 's Name attribute is also modified to contain the table name in the destination ( to_db) database.

 19.           <ToClassTable>  20.              <Table Name="user_info"/>  21.           </ToClassTable> 

22 “42 The PropertyMap elements are used as they were in the previous map file to define the relationships between the tag names from the XML file containing the data and the database column's receiving the data.

 22.           <PropertyMap>  23.              <ElementType Name="userid"/>  24.              <ToColumn>  25.                 <Column Name="unique_user"/>  26.              </ToColumn>  27.           </PropertyMap>  28.           <PropertyMap>  29.              <ElementType Name="firstname"/>  30.              <ToColumn>  31.                 <Column Name="first_name"/>  32.              </ToColumn>  33.           </PropertyMap>  34.           <PropertyMap>  35.              <ElementType Name="lastname"/>  36.              <ToColumn>  37.                 <Column Name="last_name"/>  38.              </ToColumn>  39.           </PropertyMap>  40.        </ClassMap> 41.     </Maps>  42.  </XMLToDBMS> 

After we define both map files ”the first to map our from_db database to the XML file and the second to map the XML file's data to the to_db database ”we can write a program to test our mappings and see if all is working as expected.

Developing the XML::XMLtoDBMS Perl Program

Now that we've defined the XML document for data storage and data mapping, we can focus on the Perl program that actually does the transformation, as shown in Listing 6.8. Our program will first read the data from the source database ( from_db ), and then insert it into the destination database ( to_db ).

Keep in mind that in a real world application, this process might be divided into two steps. Remember, I mentioned the case where both databases might not be available due to a firewall or some other reason. In that case, we would retrieve the information and create the intermediate XML file using the XML document shown in Listing 6.6 (ch6_from_db.xml) then send it using the network connection to another location (possibly behind the destination RDBMS firewall). Then, we would use the XML document shown in Listing 6.7 (ch6_to_db.xml) to perform the mapping and to upload the contents into the destination database.

Listing 6.8 Perl program that performs mapping between two databases. (Filename: ch6_xmltodbms_app.pl)
 1.   use strict;  2.   use DBI;  3.   use XML::XMLtoDBMS;  4.  5.   # Instantiate a DBI connection to the source database.  6.   my $dbh_from = DBI->connect(  7.                     "dbi:mysql:database=from_db;host=localhost",  8.                     "username",  9.                     "password",  10.                    {RaiseError => 1, PrintError => 0}); 11.  12.  # Instantiate a new XMLtoDBMS object.  13.  my $xmlToDbms = new XML::XMLtoDBMS($dbh_from);  14.  15.  # Identify the first XML document required.  16.  $xmlToDbms->setMap('ch6_from_db.xml');  17.  18.  my $doc = $xmlToDbms->retrieveDocument('membership');  19.  20.  # Open an intermediate output file.  21.  open (XML, ">ch_6_middleware.xml") or  22.    die "could not open file for writing: $! \n";  23.  print XML $doc->toString; 24.  close XML or die "error closing file 'middleware.xml': $!";  25.  26.  # Instantiate a DBI connection to the destination database.  27.  my $dbh_to = DBI->connect(  28.                  "dbi:mysql:database=to_db;host=localhost",  29.                  "username",  30.                  "password",  31.                  {RaiseError => 1, PrintError => 0});  32.  33.  # Perform the mapping.  34.  $xmlToDbms->{DBh} = $dbh_to;  35.  $xmlToDbms->setMap('ch6_to_db.xml');  36.  $doc = $xmlToDbms->storeDocument( Source => {File =>  'middleware.xml'} );  37.  38.  $xmlToDbms->destroy;  39.  40.  # Disconnect from the source and destination databases.  41.  $dbh_to->disconnect;  42.  $dbh_from->disconnect; 

1 “3 The opening section of the Perl program contains the standard use strict pragma. For this particular application, we need to load both the DBI and XML::XMLtoDBMS Perl modules.

 1.   use strict;  2.   use DBI;  3.   use XML::XMLtoDBMS; 

5 “10 We first make a database connection to the from_db database, from which we'll be retrieving our data. We also set the RaiseError attribute to 1 to turn on error checking, and the PrintError attribute is set to to turn off printing errors to the console. Our program will now exit with an error message if anything goes wrong while communicating with the database. For more information on DBI methods and attributes (there are quite a few), see perldoc DBI.

 5.   # Instantiate a DBI connection to the source database.  6.   my $dbh_from = DBI->connect(  7.                     "dbi:mysql:database=from_db;host=localhost",  8.                     "root",  9.                     "",  10.                    {RaiseError => 1, PrintError => 0}); 

12 “18 Now we create a new XMLtoDBMS object and pass it the database handle that we've just created, so that it will use this database connection handle to retrieve the data. Then we set the map using the setMap() function and pass it the path to the map that we are using for the from_db database. After we have provided the XMLtoDBMS object with a database handle and the XML mapping document (ch6_from_db.xml), we call the retrieveDocument() function and pass it the name of the table that contains the source data. Because we haven't specified any criteria to the SQL query (for example, select user_id from membership ), all the contents of the table are retrieved by our query.

 12.  # Instantiate a new XMLtoDBMS object.  13.  my $xmlToDbms = new XML::XMLtoDBMS($dbh_from);  14.  15.  # Identify the first XML document required.  16.  $xmlToDbms->setMap('ch6_from_db.xml');  17.  18.  my $doc = $xmlToDbms->retrieveDocument('membership'); 

20 “24 Here we write the resulting XML data to a file named ch6_middleware.xml. We first open a file for writing, then we call a toString() function on the object that was returned by the retrieveDocument() function. The toString() function returns XML data, which we then write to the file.

 20.  # Open an intermediate output file.  21.  open (XML, ">ch_6_middleware.xml") or  22.    die "could not open file for writing: $! \n";  23.  print XML $doc->toString;  24.  close XML or die "error closing file 'middleware.xml': $!"; 

Listing 6.9 shows the contents of the intermediate XML document. This XML document stores the data retrieved from the source database ( from_db ) based on our custom XML mapping document (ch6_from_db.xml).

Listing 6.9 Contents of the XML middleware document. (Filename: ch6_middleware.xml)
 <?xml version="1.0"?>  <users>     <user>        <firstname>Ilya</firstname>        <userid>isterin</userid>        <lastname>Sterin</lastname>     </user>     <user>        <firstname>Mark</firstname>        <userid>mriehl</userid>        <lastname>Riehl</lastname>     </user>     <user>        <firstname>Paul</firstname>        <userid>pdubois</userid>        <lastname>Dubois</lastname>     </user>  </users> 

26 “31 After we have generated our intermediate XML data storage file, we upload it to the destination database ( to_db ). Note that we need to create a new DBI connection because we're communicating with a different database. Using the Perl DBI, this connection could be to a different database (for example, Oracle) running on a different host.

 26.  # Instantiate a DBI connection to the destination database.  27.  my $dbh_to = DBI->connect(  28.                  "dbi:mysql:database=to_db;host=localhost",  29.                  "username",  30.                  "password",  31.                  {RaiseError => 1, PrintError => 0}); 

33 “36 This section of the application is where we finalize our data upload process. We first assign our new database connection handle for the to_db database to the DBh attribute of the $xmlToDbms instance of the XMLToDBMS object. Notice, we don't need to create a new object instance, but rather we can dynamically change the DBh attribute and use it to perform operations on a different database. We also reset the map file using the setMap() function to ch6_to_db.xml, which is the XML map file that holds mapping information between our XML file and destination ( to_db ) database. After our new environment has been set up to upload the data, we call the storeDocument() function and pass it the ch6_middleware.xml file path through the Source => {File => 'ch6_middleware.xml'} hash of hashes data structure.

 33.  # Perform the mapping.  34.  $xmlToDbms->{DBh} = $dbh_to;  35.  $xmlToDbms->setMap('ch6_to_db.xml');  36.  $doc = $xmlToDbms->storeDocument( Source => {File => 'ch6_middleware.xml'} ); 

38 “42 After all the processing has completed, we call the destroy function on the XMLToDBMS object instance, which deallocates and finalizes anything else it has outstanding and frees the resources. Finally, we disconnect from both of the databases.

 38.  $xmlToDbms->destroy;  39.  40.  # Disconnect from the source and destination databases.  41.  $dbh_to->disconnect;  42.  $dbh_from->disconnect; 

This was a fairly short example that performed a translation between two database tables. Granted, the tables were small, but you should be able to see how this process can easily be extended to larger tables, multiple tables, table filtering, and communication between multiple databases using the Perl DBI.



XML and Perl
XML and Perl
ISBN: 0735712891
EAN: 2147483647
Year: 2002
Pages: 145

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net