Recipe 9.8. Using Table Structure Information in Applications

Problem

It's all well and good to be able to obtain table structure information, but what can you use it for?

Solution

Lots of things are possible: you can display lists of table columns, create web form elements, produce ALTER TABLE statements for modifying ENUM or SET columns, and more.

Discussion

This section describes some uses for the table structure information that MySQL makes available.

Displaying column lists

Probably the simplest use of table information is to present a list of the table's columns. This is common in web-based or GUI applications that allow users to construct statements interactively by selecting a table column from a list and entering a value against which to compare column values. The get_column_names⁠(⁠ ⁠ ⁠) routines shown in Section 9.6 can serve as the basis for such list displays.

Interactive record editing

Knowledge of a table's structure can be very useful for interactive record-editing applications. Suppose that you have an application that retrieves a record from the database, displays a form containing the record's content so a user can edit it, and then updates the record in the database after the user modifies the form and submits it. You can use table structure information for validating column values. For example, if a column is an ENUM, you can find out the valid enumeration values and check the value submitted by the user against them to determine whether it's legal. If the column is an integer type, check the submitted value to make sure that it consists entirely of digits, possibly preceded by a + or - sign character. If the column contains dates, look for a legal date format.

But what if the user leaves a field empty? If the field corresponds to, say, a CHAR column in the table, do you set the column value to NULL or to the empty string? This too is a question that can be answered by checking the table's structure. Determine whether the column can contain NULL values. If it can, set the column to NULL; otherwise, set it to the empty string.

Mapping column definitions onto web page elements

Some data types such as ENUM and SET correspond naturally to elements of web forms:

An ENUM has a fixed set of values from which you choose a single value. This is analogous to a group of radio buttons, a pop-up menu, or a single-pick scrolling list.
A SET column is similar, except that you can select multiple values; this corresponds to a group of checkboxes or a multiple-pick scrolling list.

By using the table metadata to access the definitions for these types of columns, you can easily determine a column's legal values and map them onto the appropriate form element automatically. This enables you to present users with a list of applicable values from which selections can be made easily without any typing. Section 9.7 discussed how to get definitions for these types of columns. The methods developed there are used in Chapter 19, which discusses form generation in more detail.

Adding elements to ENUM or SET column definitions

When you need to modify a column definition, you can use ALTER TABLE. However, it's really a pain to add a new element to an ENUM or SET column definition because you must list not only the new element, but all the existing elements, the default value, and NOT NULL if the column cannot contain NULL values. Suppose that you want to add "hot pink" to the colors column of an item table that has this structure:

CREATE TABLE item (   id      INT UNSIGNED NOT NULL AUTO_INCREMENT,   name    CHAR(20),   colors  SET('chartreuse','mauve','lime green','puce') DEFAULT 'puce',   PRIMARY KEY (id) );

To change the column definition, use ALTER TABLE as follows:

ALTER TABLE item   MODIFY colors   SET('chartreuse','mauve','lime green','puce','hot pink')   DEFAULT 'puce';

The ENUM definition doesn't contain many elements, so that statement isn't very difficult to enter manually. However, the more elements a column has, the more difficult and error prone it is to type statements like that. To avoid retyping the existing definition just to add a new element, you have a choice of strategies:

Write a script that does the work for you. It can examine the table definition and use the column metadata to generate the ALTER TABLE statement.
Use mysqldump to get a CREATE TABLE statement that contains the current column definition, and modify the statement in a text editor to produce the appropriate ALTER TABLE statement that changes the definition.

As an implementation of the first approach, let's develop a Python script add_element.py that generates the appropriate ALTER TABLE statement automatically when given database and table names, an ENUM or SET column name, and the new element value. add_element.py will use that information to figure out the correct ALTER TABLE statement and display it:

% add_element.py cookbook item colors "hot pink" ALTER TABLE `cookbook`.`item`   MODIFY `colors`   set('chartreuse','mauve','lime green','puce','hot pink')   DEFAULT 'puce';

By having add_element.py produce the statement as its output, you have the choice of shoving it into mysql for immediate execution or saving the output into a file:

% add_element.py cookbook item colors "hot pink" | mysql cookbook % add_element.py cookbook item colors "hot pink" > stmt.sql

The first part of the add_element.py script imports the requisite modules and checks the command-line arguments. This is fairly straightforward:

#!/usr/bin/python # add_element.py - produce ALTER TABLE statement to add an element # to an ENUM or SET column import sys import MySQLdb import Cookbook if len (sys.argv) != 5:   print "Usage: add_element.py db_name tbl_name col_name new_element"   sys.exit (1) (db_name, tbl_name, col_name, new_elt) = (sys.argv[1:5])

After connecting to the MySQL server (code not shown, but is present in the script), the script checks INFORMATION_SCHEMA to retrieve the column definition, whether it allows NULL values, and its default value. The following code does this, checking to make sure that the column really exists in the table:

stmt = """          SELECT COLUMN_TYPE, IS_NULLABLE, COLUMN_DEFAULT          FROM INFORMATION_SCHEMA.COLUMNS          WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s AND COLUMN_NAME = %s          """ cursor = conn.cursor () cursor.execute (stmt, (db_name, tbl_name, col_name)) info = cursor.fetchone () cursor.close if info == None:   print "Could not retrieve information for table %s.%s, column %s" \                       % (db_name, tbl_name, col_name)   sys.exit (1)

At this point, if the SELECT statement succeeded, the information produced by it is available as a tuple stored in the info variable. We'll need to use several elements from this tuple. The most important is the COLUMN_TYPE value, which provides the enum(...) or set(...) string containing the column's definition. We can use this string to verify that the column really is an ENUM or SET, and then add the new element to the string just before the closing parenthesis. For the colors column, we want to change this:

set('chartreuse','mauve','lime green','puce')

To this:

set('chartreuse','mauve','lime green','puce','hot pink')

It's also necessary to check whether column values can be NULL and what the default value is so that the program can add the appropriate information to the ALTER TABLE statement. The code that does all this is as follows:

# get data type string; make sure it begins with ENUM or SET type = info[0] if type[0:5].upper() != "ENUM(" and type[0:4].upper() != "SET(":   print "table %s.%s, column %s is not an ENUM or SET" % \         (db_name, tbl_name, col_name)   sys.exit(1) # insert comma and properly quoted new element just before closing paren type = type[0:len(type)-1] + "," + conn.literal (new_elt) + ")" # if column cannot contain NULL values, add "NOT NULL" if info[1].upper() == "YES":   nullable = "" else:   nullable = "NOT NULL "; # construct DEFAULT clause (quoting value as necessary) default = "DEFAULT " + conn.literal (info[2]) print "ALTER TABLE `%s`.`%s`\n  MODIFY `%s`\n  %s\n  %s%s;" \         % (db_name, tbl_name, col_name, type, nullable, default)

That's it. You now have a working ENUM- or SET-altering program. Still, add_element.py is fairly basic and can be improved in various ways:

Make sure that the element value you're adding to the column isn't already there.
Modify add_element.py to take more than one argument after the column name and add all of them to the column definition at the same time.
Add an option to indicate that the named element should be deleted rather than added.

Another approach to altering ENUM or SET columns involves capturing the current definition in a file and editing the file to produce the proper ALTER TABLE statement.

Run mysqldump to get the CREATE TABLE statement that contains the column definition:

% mysqldump --no-data cookbook item > test.txt

The --no-data option tells mysqldump not to dump the data from the table; it's used here because only the table-creation statement is needed. The resulting file, test.txt, should contain this statement:

CREATE TABLE `item` (   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,   `name` char(20) DEFAULT NULL,   `colors` set('chartreuse','mauve','lime green','puce') DEFAULT 'puce',   PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Edit the test.txt file to remove everything but the definition for the colors column:
```
`colors` set('chartreuse','mauve','lime green','puce') DEFAULT 'puce', 
```
Modify the definition to produce an ALTER TABLE statement that has the new element and a semicolon at the end:
```
ALTER TABLE item MODIFY `colors` set('chartreuse','mauve','lime green','puce','hot pink') DEFAULT 'puce'; 
```
Write test.txt back out to save it, and then get out of the editor and feed test.txt as a batch file to mysql:
```
% mysql cookbook < test.txt                      
```

For simple columns, this procedure is more work than just typing the ALTER TABLE statement manually. However, for ENUM and SET columns with long and ungainly definitions, using an editor to create a mysql batch file from mysqldump output makes a lot of sense. This technique also is useful when you want to delete or reorder members of an ENUM or SET column, or to add or delete members from the column definition.

Selecting all except certain columns

Sometimes you want to retrieve "almost all" the columns from a table. Suppose that you have an image table that contains a BLOB column named data used for storing images that might be very large, and other columns that characterize the BLOB column, such as its ID, a description, and so forth. It's easy to write a SELECT * statement that retrieves all the columns, but if all you need is the descriptive information about the images and not the images themselves, it's inefficient to drag the BLOB values over the connection along with the other columns. Instead, you want to select everything in the row except the data column.

Unfortunately, there is no way to say directly in SQL, "select all columns except this one." You must explicitly name all the columns except data. On the other hand, it's easy to construct that kind of statement by using table structure information. Extract the list of column names, delete the one to be excluded, and then construct a SELECT statement from those columns that remain. The following example shows how to do this in PHP, using the get_column_names⁠(⁠ ⁠ ⁠) function developed earlier in the chapter to obtain the column names for a table:

$names = get_column_names ($conn, $db_name, $tbl_name); $stmt = ""; # construct list of columns to select: all but "data" foreach ($names as $index => $name) {   if ($name == "data")     continue;   if ($stmt != "") # put commas between column names     $stmt .= ", ";   $stmt .= "`$name`"; } $stmt = "SELECT $stmt FROM `$db_name`.`$tbl_name`";

The equivalent Perl code for constructing the statement is a bit shorter (and correspondingly more cryptic):

my @names = get_column_names ($dbh, $db_name, $tbl_name); my $stmt = "SELECT `"             . join ("`, `", grep (!/^data$/, @names))             . "` FROM `$db_name`.`$tbl_name`";

Whichever language you use, the result is a statement that you can use to select all columns but data. It will be more efficient than SELECT * because it won't pull the BLOB values over the network. Of course, this process does involve an extra round trip to the server to execute the statement that retrieves the column names, so you should consider the context in which you plan to use the SELECT statement. If you're going to retrieve only a single row, it might be more efficient simply to select the entire row than to incur the overhead of the extra round trip. But if you're selecting many rows, the reduction in network traffic achieved by skipping the BLOB columns will be worth the overhead of the additional query for getting table structure.