Importing XML into MySQL

10.43.1 Problem

You want to import an XML document into a MySQL table.

10.43.2 Solution

Set up an XML parser to read the document. Then use the records in the document to construct and execute INSERT statements.

10.43.3 Discussion

Importing an XML document depends on being able to parse the document and extract record contents from it. The way you do this will depend on how the document is written. For example, one format might represent column names and values as attributes of elements:



 
 
 
 
 
 
 
 
 
...

Another format is to use column names as element names and column values as the contents of those elements:



 
 Jane
 A
 47
 
 
 Jane
 B
 50
 
...

Due to the various structuring possibilities, it's necessary to make some assumptions about the format you expect the XML document to have. For the example here, I'll assume the second format just shown. One way to process this kind of document is to use the XML::XPath module, which allows you to refer to elements within the document using path expressions. For example, the path //row selects all the elements under the document root, and the path * selects all children of a given element. We can use these paths with XML::XPath to obtain first a list of all the elements, and then for each row a list of all its columns.

The following script, xml_to_mysql.pl, takes three arguments:

% xml_to_mysql.pl  db_name tbl_name xml_file 

The filename argument indicates which document to import, and the database and table name arguments indicate which table to import it into.

xml_to_mysql.pl processes the command-line arguments and connects to MySQL (not shown), then processes the document:

#! /usr/bin/perl -w
# xml_to_mysql.pl - read XML file into MySQL

use strict;
use DBI;
use XML::XPath;

# ... process command-line options (not shown) ...

# ... connect to database (not shown) ...

# Open file for reading
my $xp = XML::XPath->new (filename => $file_name);
my $row_list = $xp->find ("//row"); # find set of  elements
print "Number of records: " . $row_list->size ( ) . "
";
foreach my $row ($row_list->get_nodelist ( )) # loop through rows
{
 my @name; # array for column names
 my @val; # array for column values
 my $col_list = $row->find ("*"); # children (columns) of row
 foreach my $col ($col_list->get_nodelist ( )) # loop through columns
 {
 # save column name and value
 push (@name, $col->getName ( ));
 push (@val, $col->string_value ( ));
 }
 # construct INSERT statement, then execute it
 my $stmt = "INSERT INTO $tbl_name ("
 . join (",", @name)
 . ") VALUES ("
 . join (",", ("?") x scalar (@val))
 . ")";
 $dbh->do ($stmt, undef, @val);
}

$dbh->disconnect ( );

exit (0);

The script creates an XML::XPath object, which opens and parses the document. Then the object is queried for the set of elements, using the path //row. The size of this set indicates how many records the document contains.

To process each row, the script uses the path * to ask for all the children of the row object. Each child corresponds to a column within the row; using * as the path for get_nodelist( ) this way is convenient because we need not know in advance which columns to expect. xml_to_mysql.pl obtains the name and value from each column and saves them in the @name and @value arrays. After all the columns have been processed, the arrays are used to construct an INSERT statement that names those columns that were found to be present in the row and that includes a placeholder for each data value. (Recipe 2.7 discusses placeholder list construction.) Then the script issues the statement, passing the column values to do( ) to bind them to the placeholders.

In the previous section, we used mysql_to_xml.pl to export the contents of the expt table as an XML document. xml_to_mysql.pl can be used to perform the converse operation of importing the document back into MySQL:

% xml_to_mysql.pl cookbook expt expt.xml

As it processes the document, the script generates and executes the following set of statements:

INSERT INTO expt (subject,test,score) VALUES ('Jane','A','47')
INSERT INTO expt (subject,test,score) VALUES ('Jane','B','50')
INSERT INTO expt (subject,test) VALUES ('Jane','C')
INSERT INTO expt (subject,test) VALUES ('Jane','D')
INSERT INTO expt (subject,test,score) VALUES ('Marvin','A','52')
INSERT INTO expt (subject,test,score) VALUES ('Marvin','B','45')
INSERT INTO expt (subject,test,score) VALUES ('Marvin','C','53')
INSERT INTO expt (subject,test) VALUES ('Marvin','D')

Note that these statements do not all insert the same number of columns. Statements with "missing" columns correspond to rows with NULL values.

Using the mysql Client Program

Writing MySQL-Based Programs

Record Selection Techniques

Working with Strings

Working with Dates and Times

Sorting Query Results

Generating Summaries

Modifying Tables with ALTER TABLE

Obtaining and Using Metadata

Importing and Exporting Data

Generating and Using Sequences

Using Multiple Tables

Statistical Techniques

Handling Duplicates

Performing Transactions

Introduction to MySQL on the Web

Incorporating Query Resultsinto Web Pages

Processing Web Input with MySQL

Using MySQL-Based Web Session Management

Appendix A. Obtaining MySQL Software

Appendix B. JSP and Tomcat Primer

Appendix C. References



MySQL Cookbook
MySQL Cookbook
ISBN: 059652708X
EAN: 2147483647
Year: 2005
Pages: 412
Authors: Paul DuBois

Flylib.com © 2008-2020.
If you may any questions please contact us: flylib@qtcs.net