The Photo Dumper Application | Perl Database Programming

This program takes the data from our photo-album script, which comes from the Web-Based Photo Album you’ll be reading about later in the book. (Yes, this is only Chapter 11, but the data derived from the photo album is ideal for dumping into the XML format.)

This program is designed to be fairly generic; it should be able to work with other simple databases with minimal changes. This is not an end-all XML dumper however. It is just very simple.

photo_dumper.pl

01: #!/usr/bin/perl -w 02: use strict; 03: use DBI;

Line 1 tells the system where to find Perl and turns on the warning flag.

Line 2 loads the strict module.

Line 3 loads the DBI module so that we can connect to the database.

04: my $dbh = DBI->connect("DBI:mysql:photoalbum",                  "bookuser","testpass")  05:     or die("Cannot connect: $DBI::errstr\n");

Line 4 declares a scalar variable named $dbh that is used to store a handle to the database connection. The DBI->connect method passes the arguments to the DBI and makes a connection to the database.

Line 5 is a continuation of line 4 and causes the program to abort if there is a problem connecting to the database.

06: my %db; 07: $db{name}  = "photoalbum"; 08: $db{album} = "album_id"; 09: $db{photo} = "img_id"; 10: $db{f_key} = "album_id";

Line 6 declares a hash named %db.

Lines 7–10 populate the %db hash with some data we’ll be using shortly.

11: my $main_table = "album"; 12: my @sub_tables = qw{photo};

Line 11 declares a scalar variable named $main_table and sets it to the string album.

Line 12 declares an array and adds one item to it: photo. This array can be populated with more tables to loop through, if your database has more tables.

13: Gen_Output();

Line 13 is simply a call to the Gen_Output function. This function begins gathering the data and printing the results.

14: sub Gen_Output { 15:     print qq(<?xml version="1.0"?>\n); 16:     print start_tag($db{name}), "\n";

Line 14 begins the Gen_Output function.

Line 15 prints the XML header tag. This is needed for the output to be a valid XML document.

Line 16 prints the value returned by the start_tag subroutine with $db{name} being passed to it. This is the tag that wraps the entire document.

The start_tag and end_tag subroutines are used to ensure that we create valid XML tags.

17:     my @albums = Get_Keys($main_table); 18:     for my $tmp (@albums) { 19:         print start_tag($main_table), "\n";

Line 17 calls the Get_Keys function and passes the value in $main_table. This function queries the database and gets all of the index fields for the $main_table, storing them in the array @albums. This causes the array @albums to contain a list of all of the photo albums in the database.

Line 18 begins a for loop that iterates through each of the albums; the current album index is stored in the $tmp variable.

Line 19 prints the value returned by the start_tag subroutine with $main_table being passed to it. This wraps each album with the tag <album>.

20:         Print_Data($main_table, $tmp, 3); 21:         for my $sub_tbl (@sub_tables){ 22:             Do_SubTable($sub_tbl, $tmp);     23:         }

Line 20 calls the Print_Data function and passes it three values, $main_table, $tmp, and 3. 3 is the number of spaces to indent the output (which is optional, but makes the file easier to look at when printed).

Line 21 begins a loop. This loop iterates through all of the values in the array @sub_tables. Each time through the loop, the current table name is stored in the $sub_tbl scalar variable.

Line 22 calls the Do_SubTable subroutine and passes the $sub_tbl and $tmp values.

Line 23 ends the loop that begins on line 21.

24:         print end_tag($main_table),"\n"; 25:     } 26:     print end_tag($db{name}), "\n";  27: }

Line 24 prints the value returned by the end_tag subroutine when it is passed $main_table. The end_tag subroutine returns a closing tag to match the tag we begin on line 19.

Line 25 closes the block that begins on line 18.

Line 26 uses the end_tag subroutine to close the tag that wraps the entire output.

Line 27 ends this subroutine.

28: sub Do_SubTable { 29:     my $table = shift; 30:     my $rel   = shift;

Line 28 begins the Do_SubTable subroutine. This subroutine handles getting the data from the subordinate tables and printing their values.

Lines 29–30 declare some scalar variables and use the shift function to store the values passed to this function into these variables.

31:     my @keys = Get_Keys($table, $rel);

Line 31 calls the Get_Keys function and stores the result into the @arr array.

32:     for my $key (@keys) { 33:         print " ", start_tag($table), "\n"; 34:         Print_Data($table, $key, 4); 35:         print " ", end_tag($table), "\n\n"; 36:     } 37: }

Line 32 begins a loop that iterates through all of the values in the @keys array. The current value is stored in the $key variable.

Line 33 prints an XML tag with the value in the $table variable.

Line 34 calls the Print_Data function to print the data for this item. The values $table, $key, and 4 are passed to Print_Data.

Line 35 prints the closing XML tag to correspond to the tag we print on line 33.

Line 36 closes this for loop.

Line 37 closes this subroutine.

38: sub Get_Keys { 39:     my $table     = shift; 40:     my $index     = shift; 41:     my @values    = ();

Line 38 begins the Get_Keys subroutine.

Lines 39–40 declare a scalar variable and use the shift function to read in the first two values passed to this subroutine.

Line 41 declares an array named @values and ensures that it is empty by setting it to ().

42:     my $sql = qq{SELECT $db{$table} FROM $table};  43:     $sql   .= qq{ WHERE $db{f_key} = $index} if($index); 44:     my $sth = $dbh->prepare($sql);

Line 42 declares a scalar variable named $sql and fills it with a string that makes up the SQL statement we want to execute.

Line 43 adds a WHERE clause to the SQL statement if the $index variable contains data.

Line 44 prepares the SQL statement and stores a handle to the prepared statement in the new scalar variable $sth.

45:     $sth->execute(); 46:     while(my @arr = $sth->fetchrow_array)             { push @values, @arr; } 47:     return(@values); 48: }

Line 45 calls the execute() method on the statement handle to run the SQL query.

Line 46 is a while loop that loops through the values returned by the SQL call and stores all of the results into the array named @values. The first part of this line calls the fetchrow_array method on the data returned by the SQL query and stores the array of data in the @arr array. Then, the push @values, @arr takes the data in @arr and pushes it onto the @values array. This effectively concatenates the data onto the array. So, after this line has executed, all of the data returned by the SQL query should be in the @values array.

Line 47 returns the @values array.

Line 48 closes this subroutine.

49: sub Print_Data { 50:     my $table   = shift; 51:     my $key     = shift; 52:     my $indent  = shift; 53:     my @values  = ();

Line 49 begins the Print_Data subroutine.

Lines 50–52 declare new scalar variables and use the shift function to read in the data passed to the subroutine.

Line 53 declares an array named @values and ensures it is empty by setting it to ().

54:     my $sql = qq{SELECT * FROM $table WHERE $db{$table}           = ?}; 55:     my $sth = $dbh->prepare($sql); 56:     $sth->execute($key);

Line 54 creates the SQL query needed to get all of the data from the table and stores the SQL query string in $sql.

Line 55 uses the prepare method to get the SQL statement ready for execution. The result of the prepare call is stored in $sth.

Line 56 executes the query and passes the value in $key to replace the placeholder (?).

57:     while(my $data = $sth->fetchrow_hashref){  58:        for(keys(%$data)){ 59:         $data->{$_} = "" unless defined $data->{$_}; 60:         print " " x $indent, start_tag($_), 61:           xml_esc($data->{$_}), end_tag($_), "\n"; 62:        } 63:     }

Line 57 begins a while loop that gets one row of data and passes that row back as a reference to a hash. We store this hash reference in $data.

Line 58 begins a for loop that iterates through each of the keys in the hash referenced by $data.

Line 59 sets the value in $data->{$_} to an empty string if it is not yet defined.

Line 60 prints $indent number of spaces and then, using the start_tag subroutine to generate a valid XML tag, prints out the next starting XML tag. Line 61 is a continuation of line 60. At this point, we should have a valid XML opening tag. Next we use the xml_esc subroutine to print the data that $data->{$_} holds. The call to xml_esc ensures that we will not have and invalid characters in our output. Finally, we make a call to the end_tag subroutine to create a valid XML closing tag for us.

So, lines 60 and 61 create something like this:

<valid_tag>valid xml data</valid_tag>

Line 62 ends the for loop that begins on line 58.

Line 63 ends the while loop that begins on line 57.

64:     return 1; 65: }

Line 64 returns a true value.

Line 65 closes this subroutine.

66: sub start_tag { "<"  . xml_esc_name($_[0]) . ">" } 67: sub end_tag   { "</" . xml_esc_name($_[0]) . ">" }

Lines 66 and 67 are two different subroutines that are used to create a beginning and an ending XML tag, respectively. These subroutines simply take the value that is passed to them ($_[0]), and pass it to the xml_esc_name subroutine. The value returned by xml_esc_name is a valid XML tag. This valid XML tag is then wrapped in the <> or </> characters to create the entire, valid, XML tag.

68: sub xml_esc_name { 69:     my $name = shift; 70:     return ‘_’ unless length $name;

Line 68 begins the xml_esc_name subroutine. This subroutine processes the string passed to it and ensures that it is a valid name for an XML tag.

Line 69 creates a my variable called $name and uses the shift function to load it with the value that was passed to this subroutine.

Line 70 returns an underscore if there was no data passed to this subroutine. This ensures that even though no data was passed, a valid XML tag is still generated. (A lone underscore is a valid XML tag.)

71:     $name =~ s/[^-\._a-zA-Z0-9]/_/g; 72:     $name =~ s/^(\d)/_$1/;

Line 71 checks to see if $name contains anything that is not a dash, period, underscore, or an alphanumeric character; the caret ^ means not. Any characters that do not meet this criteria are replaced with an underscore character.

Line 72 then checks to see if $name begins with a digit. If so, we add an underscore to the beginning of $name. The caret ^ in this context means the beginning of the string. And, since we placed parentheses around the character we are matching, any matching data gets captured and stored in the $1 variable.

73:     return $name; 74: }

Line 73 returns the value stored in $name.

Line 74 ends this subroutine.

75: sub xml_esc { 76:     my $it = shift;

Line 75 begins the xml_esc subroutine. This subroutine is used to turn anything even remotely odd into &#nnn; entities.

Line 76 creates a my variable named $it and uses the shift function to load it with the data that was passed to this subroutine.

77:     $it =~ s{([^\x20\x21\x23-\x25\x28-\x3b\x3d\x3F-\x5B\x5D-\x7E])} 78:             {‘&#’.(ord($1)).’;’}eg;

Lines 77 and 78 are a single Perl line, split onto two lines. This substitution command looks for any characters that do not match those listed in between the square brackets []. Anything that does not match gets changed into an &#nnn; entity. The ord($1) command returns the numeric value of the item passed to it. The value that was returned is wrapped between the &# and the ; characters. The e at the end of the substitution command tell Perl to evaluate the right side as an expression (execute the ord()), and the g means to make the substitutions globally to all values that match the criteria.

79:     return $it; 80: }

Line 79 returns the current value stored in $it.

Line 80 ends this subroutine.

The output of this program should look something like this:

<?xml version="1.0"?> <photoalbum> <album>    <album_id>1020740752</album_id>    <comments>BOKTWD, round "2".</comments>    <name>BOKTWD 2</name>   <photo>     <img_date>2002-04-25 09:37:58</img_date>     <album_id>1020740752</album_id>     <img_title>untitled</img_title>     <focal_length>6.6</focal_length>     <aperture>2.83</aperture>     <width>1600</width>     <fnumber>2.5</fnumber>     <iso_speed>100</iso_speed>     <img_location>/photos/Family_Photos_1/24250020.jpg</img_location>     <height>1200</height>     <img_id>200204250937581020740773</img_id>     <exposure>10/881</exposure>     <resolution>375/1 dpi</resolution>     <make>SEIKO EPSON CORP.</make>     <flash>Yes</flash>     <model>PhotoPC 850Z</model>   </photo>   <photo>     <img_date>2002-04-25 09:38:38</img_date>     <album_id>1020740752</album_id>     <img_title>untitled</img_title>     <focal_length>6.6</focal_length>     <aperture>2.83</aperture>     <width>1600</width>     <fnumber>2.5</fnumber>     <iso_speed>100</iso_speed>     <img_location>/photos/Family_Photos_1/24250021.jpg</img_location>     <height>1200</height>     <img_id>200204250938381020740795</img_id>     <exposure>10/1113</exposure>     <resolution>375/1 dpi</resolution>     <make>SEIKO EPSON CORP.</make>     <flash>Yes</flash>     <model>PhotoPC 850Z</model>   </photo> </album> </photoalbum>

That is it for this simple database-to-XML dumper. Once this program is run, you should have a nice XML output. You can then use other XML applications to transform the data, convert the data, or do whatever you wish with it.