Converting Datafiles from One Format to Another

10.19.1 Problem

You want to convert a file to a different format to make it easier to work with, or so that another program can understand it.

10.19.2 Solution

Use the cvt_file.pl converter script described here.

10.19.3 Discussion

The mysql_to_text.pl script discussed in Recipe 10.18 uses MySQL as a data source and produces output in the format you specify via the --delim, --quote, and --eol options. This section describes cvt_file.pl, a utility that provides similar formatting options, but for both input and output. It reads data from a file rather than from MySQL, and converts it from one format to another. For example, to read a tab-delimited file data.txt, convert it to colon-delimited format, and write the result to tmp, you would invoke cvt_file.pl like this:

% cvt_file.pl --idelim="	" --odelim=":" data.txt > tmp

The cvt_file.pl script has separate options for input and output. Thus, whereas mysql_to_text.pl has just a --delim for specifying the column delimiter, cvt_file.pl has separate --idelim and --odelim options to set the input and output line column delimiters. But as a shortcut, --delim is also supported; it sets the delimiter for both input and output. The full set of options that cvt_file.pl understands is as follows:

--idelim= str, --odelim= str, --delim= str

Set the column delimiter sequence for input, output, or both. The option value may consist of one or more characters.

--iquote= c, --oquote= c, --quote= c

Set the column quote character for input, output, or both.

--ieol= str, --oeol= str, --eol= str

Set the end-of-line sequence for input, output, or both. The option value may consist of one or more characters.

--iformat= format, --oformat= format, --format= format,

Specify an input format, an output format, or both. This option is shorthand for setting the quote and delimiter values. For example, --iformat=csv sets the input quote and delimiter characters to double quote and comma. --iformat=tab sets them to "no quotes" and tab.

--ilabels, --olabels, --labels

Expect an initial line of column labels for input, write an initial line of labels for output, or both. If you request labels for the output but do not read labels from the input, cvt_file.pl uses column labels of c1, c2, and so forth.

cvt_file.pl assumes the same default file format as LOAD DATA and SELECT INTO ... OUTFILE, that is, tab-delimited lines terminated by linefeeds.

cvt_file.pl can be found in the transfer directory of the recipes distribution. If you expect to use it regularly, you should install it in some directory that's listed in your search path so that you can invoke it from anywhere. Much of the source for the script is similar to mysql_to_text.pl, so rather than showing the code and discussing how it works, I'll just give some examples illustrating how to use it:

  • Read a file in CSV format with CRLF line termination, write tab-delimited output with linefeed termination:

    % cvt_file.pl --iformat=csv --ieol="
    " --oformat=tab --oeol="
    " 
     data.txt > tmp
  • Read and write CSV format, converting CRLF line terminators to carriage returns:

    % cvt_file.pl --format=csv --ieol="
    " --oeol="
    " data.txt > tmp
  • Produce a tab-delimited file from the colon-delimited /etc/passwd file:

    % cvt_file.pl --idelim=":" /etc/passwd > tmp
  • Convert tab-delimited query output from mysql into CSV format:

    % mysql -e "SELECT * FROM profile" cookbook 
     | cvt_file.pl --oformat=csv > profile.csv

Using the mysql Client Program

Writing MySQL-Based Programs

Record Selection Techniques

Working with Strings

Working with Dates and Times

Sorting Query Results

Generating Summaries

Modifying Tables with ALTER TABLE

Obtaining and Using Metadata

Importing and Exporting Data

Generating and Using Sequences

Using Multiple Tables

Statistical Techniques

Handling Duplicates

Performing Transactions

Introduction to MySQL on the Web

Incorporating Query Resultsinto Web Pages

Processing Web Input with MySQL

Using MySQL-Based Web Session Management

Appendix A. Obtaining MySQL Software

Appendix B. JSP and Tomcat Primer

Appendix C. References



MySQL Cookbook
MySQL Cookbook
ISBN: 059652708X
EAN: 2147483647
Year: 2005
Pages: 412
Authors: Paul DuBois

Flylib.com © 2008-2020.
If you may any questions please contact us: flylib@qtcs.net