Recipe 10.17. Writing Your Own Export Programs


Problem

MySQL's built-in export capabilities don't suffice.

Solution

Write your own utilities.

Discussion

When existing export software doesn't do what you want, you can write your own programs. This section describes a Perl script, mysql_to_text.pl, that executes an arbitrary statement and exports it in the format you specify. It writes output to the client host and can include a row of column labels (two things that SELECT ... INTO OUTFILE cannot do). It produces multiple output formats more easily than by using mysql with a postprocessor, and it writes to the client host, unlike mysqldump, which can write only SQL-format output to the client. You can find mysql_to_text.pl in the transfer directory of the recipes distribution.

mysql_to_text.pl is based on the Text::CSV_XS module, which you'll need to obtain if it's not installed on your system. Once that module has been installed, you can read its documentation like so:

% perldoc Text::CSV_XS             

This module is convenient because it makes conversion of query output to CSV format relatively trivial. All you have to do is provide an array of column values, and the module packages them up into a properly formatted output line. This makes it relatively trivial to convert query output to CSV format. But the real benefit of using the Text::CSV_XS module is that it's configurable; you can tell it what kind of delimiter and quote characters to use. This means that although the module produces CSV format by default, you can configure it to write a variety of output formats. For example, if you set the delimiter to tab and the quote character to undef, Text::CSV_XS generates tab-delimited output. We'll take advantage of that flexibility in this section for writing mysql_to_text.pl, and later in Section 10.18 to write a file-processing utility that converts files from one format to another.

mysql_to_text.pl accepts several command-line options. Some of them are used for specifying MySQL connection parameters (such as --user, --password, and --host). You're already familiar with these, because they're used by the standard MySQL clients like mysql. The script also can obtain connection parameters from an option file, if you specify a [client] group in the file. mysql_to_text.pl also accepts the following options:


--execute = query, -e query

Execute query and export its output.


--table = tbl_name, -t tbl_name

Export the contents of the named table. This is equivalent to using --execute to specify a query value of SELECT * FROM tbl_name.


--labels

Include an initial row of column labels in the output


--delim = str

Set the column delimiter to str. The option value can consist of one or more characters. The default is to use tabs.


--quote = c

Set the column value quote character to c. The default is to not quote anything.


--eol = str

Set the end-of-line sequence to str. The option value can consist of one or more characters. The default is to use linefeeds.

The defaults for the --delim, --quote, and --eol options correspond to those used by LOAD DATA and SELECT ... INTO OUTFILE.

The final argument on the command line should be the database name, unless it's implicit in the statement. For example, these two commands are equivalent; each exports the passwd table from the cookbook database in colon-delimited format:

% mysql_to_text.pl --delim=":" --table=passwd cookbook % mysql_to_text.pl --delim=":" --table=cookbook.passwd             

To generate CSV output with CRLF line terminators instead, use a command like this:

% mysql_to_text.pl --delim="," --quote="\"" --eol="\r\n" \                --table=cookbook.passwd             

That's a general description of how you use mysql_to_text.pl. Now let's discuss how it works. The initial part of the mysql_to_text.pl script declares a few variables, and then processes the command-line arguments, using option-processing techniques developed in Section 2.8. As it happens, most of the code in the script is devoted to processing the command-line arguments and getting set up to run the query. Very little of it involves interaction with MySQL.

#!/usr/bin/perl # mysql_to_text.pl - export MySQL query output in user-specified text format # Usage: mysql_to_text.pl [ options ] [db_name] > text_file use strict; use warnings; use DBI; use Text::CSV_XS; use Getopt::Long; $Getopt::Long::ignorecase = 0; # options are case sensitive $Getopt::Long::bundling = 1;   # allow short options to be bundled # ... construct usage message variable $usage (not shown) ... # Variables for command line options - all undefined initially # except for options that control output structure, which is set # to be tab-delimited, linefeed-terminated. my $help; my ($host_name, $password, $port_num, $socket_name, $user_name, $db_name); my ($stmt, $tbl_name); my $labels; my $delim = "\t"; my $quote; my $eol = "\n"; GetOptions (   # =i means an integer argument is required after the option   # =s means a string value is required after the option   "help"          => \$help,        # print help message   "host|h=s"      => \$host_name,   # server host   "password|p=s"  => \$password,    # password   "port|P=i"      => \$port_num,    # port number   "socket|S=s"    => \$socket_name, # socket name   "user|u=s"      => \$user_name,   # username   "execute|e=s"   => \$stmt,        # statement to execute   "table|t=s"     => \$tbl_name,    # table to export   "labels|l"      => \$labels,      # generate row of column labels   "delim=s"       => \$delim,       # column delimiter   "quote=s"       => \$quote,       # column quoting character   "eol=s"         => \$eol          # end-of-line (record) delimiter ) or die "$usage\n"; die  "$usage\n" if defined $help; $db_name = shift (@ARGV) if @ARGV; # One of --execute or --table must be specified, but not both die "You must specify a query or a table name\n\n$usage\n"   if !defined ($stmt) && !defined ($tbl_name); die "You cannot specify both a query and a table name\n\n$usage\n"   if defined ($stmt) && defined ($tbl_name); # If table name was given, use it to create query that selects entire table $stmt = "SELECT * FROM $tbl_name" if defined ($tbl_name); # interpret special chars in the file structure options $quote = interpret_option ($quote); $delim = interpret_option ($delim); $eol = interpret_option ($eol); 

The interpret_option⁠(⁠ ⁠ ⁠) function (not shown) processes escape and hex sequences for the --delim, --quote, and --eol options. It interprets \n, \r, \t, and \0 as linefeed, carriage return, tab, and the ASCII NUL character. It also interprets hex values, which can be given in 0x nn form (for example, 0x0d indicates a carriage return).

After processing the command-line options, mysql_to_text.pl constructs the data source name (DSN) and connects to the server:

my $dsn = "DBI:mysql:"; $dsn .= ";database=$db_name" if $db_name; $dsn .= ";host=$host_name" if $host_name; $dsn .= ";port=$port_num" if $port_num; $dsn .= ";mysql_socket=$socket_name" if $socket_name; # read [client] group parameters from standard option files $dsn .= ";mysql_read_default_group=client"; my %conn_attrs = (PrintError => 0, RaiseError => 1, AutoCommit => 1); my $dbh = DBI->connect ($dsn, $user_name, $password, \%conn_attrs); 

The database name comes from the command line. Connection parameters can come from the command line or an option file. (Use of MySQL option files is covered in Section 2.8.)

After establishing a connection to MySQL, the script is ready to execute the query and produce some output. This is where the Text::CSV_XS module comes into play. First, create a CSV object by calling new⁠(⁠ ⁠ ⁠), which takes an optional hash of options that control how the object handles data lines. The script prepares and executes the query, prints a row of column labels (if the user specified the --labels option), and writes the rows of the result set:

my $csv = Text::CSV_XS->new ({               sep_char => $delim,               quote_char => $quote,               escape_char => $quote,               eol => $eol,               binary => 1             }); my $sth = $dbh->prepare ($stmt); $sth->execute (); if ($labels)                # write row of column labels {   $csv->combine (@{$sth->{NAME}}) or die "cannot process column labels\n";   print $csv->string (); } my $count = 0; while (my @val = $sth->fetchrow_array ()) {   ++$count;   $csv->combine (@val) or die "cannot process column values, row $count\n";   print $csv->string (); } 

The sep_char and quote_char options in the name⁠(⁠ ⁠ ⁠) call set the column delimiter and quoting character. The escape_char option is set to the same value as quote_char so that instances of the quote character occurring within data values are doubled in the output. The eol option indicates the line-termination sequence. Normally, Text::CSV_XS leaves it to you to print the terminator for output lines. By passing a non-undef eol value to new⁠(⁠ ⁠ ⁠), the module adds that value to every output line automatically. The binary option is useful for processing data values that contain binary characters.

The column labels are available in $sth->{NAME} after invoking execute⁠(⁠ ⁠ ⁠). Each line of output is produced using combine⁠(⁠ ⁠ ⁠) and string⁠(⁠ ⁠ ⁠). The combine⁠(⁠ ⁠ ⁠) method takes an array of values and converts them to a properly formatted string. string⁠(⁠ ⁠ ⁠) returns the string so we can print it.




MySQL Cookbook
MySQL Cookbook
ISBN: 059652708X
EAN: 2147483647
Year: 2004
Pages: 375
Authors: Paul DuBois

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net