Hack 19. Convert Strings to Dates

The SQL standard includes a complete set of rules which govern how dates should be represented and manipulated. Each vendor implementation of SQL has a variation of these rules.

The SQL standard has a DATE type for days and a TIMESTAMP type to represent a date and time. Examples of literals are DATE '2006-05-20' and TIMESTAMP '2006-06-18 10:09:05'. The ISO format used in both examples (the year followed by the month followed by the day) has the advantage of sorting correctly even when it's represented as a string data type. It is also visibly different from both the American convention that puts the month first, and the European style that puts the day first.

Oracle, PostgreSQL, and MySQL adhere to the SQL standard for representing dates and timestamps, but Microsoft's SQL Server and Access use a slightly different approach. SQL Server and Access will accept a date literal such as '2006-06-08', but they cannot handle the DATE prefix.

The DATE type does not exist in SQL Server; you should use the DATETIME type to represent both a date and a moment in time. SQL Server uses the term TIMESTAMP for an entirely different purpose.

4.1.1. Convert Your Dates

Suppose you have dates in user-supplied input in this format6/18/2006and you need to create date literals for an INSERT statement such as this one: DATE '2006-06-18'. Here's how you can accomplish this in Perl:

foreach ('6/18/2006', '12/13/2006'){
 if (/(d+)/(d+)/(dddd)/){ # Capture date parts into $1, $2, $3
 my $m = substr("0$1", -2); # Left-pad with zeros if needed
 my $d = substr("0$2", -2);
 my $y = $3;
 $sql = "INSERT INTO d VALUES (DATE '$y-$m-$d')";
 print "$sql
";
 } else {
 warn "Could not parse date: $!";
 }
}

Note that we in-lined the user-supplied values directly into the INSERT statement. In theory, this would have opened us up to an SQL injection attack [Hack #48]. However, the input is fully sanitized in that the regular expression guarantees that $y, $m, and $d contain only digits (d matches any one character between 0 and 9).

The output from this code is ready for use in MySQL, Oracle, PostgreSQL, or another engine that uses the SQL standard:

INSERT INTO d VALUES (DATE '2006-06-18');
INSERT INTO d VALUES (DATE '2006-12-13');

For Microsoft SQL Server, you need only drop the word DATE.

Table 4-1 shows some common variations of date formats.

Table 4-1. Finding a common date format

Engine DATE '2006-06-01' '2006-6-1' '1 JUN 2006'
MySQL OK OK Error
SQL Server Error OK OK
Oracle OK Error OK
PostgreSQL OK OK OK
DB2 Error OK Error
Mimer OK Error Error
Standard OK Error Error

No single format works with every engine; you can't do better than satisfy any two of the three most popular platforms (SQL Server, MySQL, and Oracle).

You also cannot publish even the simplest SQL data in a format that everyone can read. For a nasty solution you can publish dates, but you must capitalize the word date in an odd wayfor example, DaTe '2006-06-01'. SQL Server and DB2 users must do a case-sensitive search and replace to remove the string DaTe, but users of other engines can just slurp the file into their engines directly. The advantage of using an unusual form of capitalization is that the SQL engines don't care, but the string DaTe is unlikely to occur in any other part of the file, so it's really easy to pick up with a conversion script (you also could pipe or redirect your SQL to a one-liner, such as perl -pe 's/DaTe//g').

If MySQL comes across a date format it doesn't recognize (such as '1 JUN 2006'), it accepts it without raising an error and puts in the value DATE '0000-00-00'. However, if you check your warnings, you'll see that something went wrong:

mysql> insert into d values ('1 JUN 2006');Query OK, 1 row affected, 1 warning (0.13 sec)mysql> show warningsG*************** 1. row *************** Level: Warning Code: 1265Message: Data truncated for column 'd' at row 11 row in set (0.00 sec)

If you are reading data in from another system you may be able to pass date strings in their original format and do the parsing in SQL. The Oracle example shows the general technique, but you'll need to use different SQL functions for your database. We'll show you those after the Oracle example.

4.1.2. Parse Dates with Oracle

Oracle has a neat function called TO_DATE, which allows you to specify the pattern used in your input string:

INSERT INTO d VALUES (TO_DATE('1 Jun 2006', 'dd Mon yyyy'))

You can specify a wide range of formats that include "filler" characters other than a space.

Using this technique, you could write a simple Perl script, for example. If Perl has read a string such as '1 Jun 2006' into the variable $v, you could generate the SQL as:

my $sql = "INSERT INTO d VALUES (TO_DATE('$v', 'dd Mon yyyy'))";

If your dates are coming from an untrusted source, you should still check the pattern to guard against SQL injection attacks:

if ($v !~ /^d+ www dddd$/) {
 warn "Injection attack.";
}

If you were using XSLT, you might want to use code such as this:

 
 INSERT INTO dd VALUES (
 TO_DATE('',
 ,'dd Mon yyyy'))
 

That sheet would take care of input such as bar='1 Jun 2006'/>.

4.1.3. Parse Dates with MySQL

MySQL has a similar function, called STR_TO_DATE. This works with the format strings in MySQL format:

INSERT INTO d VALUES (STR_TO_DATE('1 Jun 2006', '%d %b %Y'));

%b represents the abbreviated month name, %d is the day of the month, and %Y is a four-digit year.

4.1.4. Parse Dates with SQL Server

If your input format is a fixed size (with leading zeros), combine the SUBSTRING function to build the string. Convert a string such as '06/18/2006' into a date:

INSERT INTO d
 SELECT SUBSTRING(x,7,4)+'-'+
 SUBSTRING(x,1,2)+'-'+
 SUBSTRING(x,4,2)
 FROM (SELECT '06/18/2006' AS x) y;


Hack 20 Uncover Trends in Your Data





SQL Hacks
SQL Hacks
ISBN: 0596527993
EAN: 2147483647
Year: 2004
Pages: 147
Simiral book on Amazon

Flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net