You can use SQL to solve the kinds of pattern-matching riddles that are typical of crossword puzzles. But first you have to load in a dictionary.
Suppose you have a table called words that contains a few thousand words. If you know some characters in some positions, you can use the underscore (_) wildcard. For example, say you are looking for an eight-letter word with the following pattern:
Second letter: a
Fourth letter: l
Seventh letter: o
An underscore means "any character" in LIKE:
mysql> SELECT * FROM words WHERE word LIKE '_a_l_ _o_'; +------+----------+ | id | word | +------+----------+ | 3823 | ballroom | | 3826 | ballyhoo | | 7255 | Carleton | | 7480 | cauldron | +------+----------+ 4 rows in set (0.04 sec)
|
Here's how to find words in your dictionary that have the same three letters at the beginning and at the end:
mysql> SELECT word FROM words -> WHERE word LIKE CONCAT('%',SUBSTR(word,1,3)) -> AND LENGTH(word) > 3; +---------------+ | word | +---------------+ | Ababa | | antiformant | | booboo | | Einstein | | entertainment | | Giorgio | | Ionicization | | murmur | | Oshkosh | | redeclared | | restores | | restructures | | Tsunematsu | | underground | +---------------+ 14 rows in set (0.09 sec)
1.5.1. SQL Server Variation
SELECT word FROM words WHERE (word LIKE '%' + SUBSTRING(word,1,3)) AND LEN(word) > 3
1.5.2. Access Variation
SELECT word FROM words WHERE (word LIKE '*' + LEFT(word,3)) AND LEN(word) > 3
1.5.3. PostgreSQL Variation
PostgreSQL will accept the ANSI standard syntax:
SELECT word FROM words WHERE word LIKE '%' || SUBSTR(word,1,3) AND LENGTH(word) > 3
1.5.4. Filling a Table with Words
To perform word searches you need to build the words table. You can create it with a statement such as CREATE TABLE WORDS (word VARCHAR(255)). If you start with a plain-text file and you want to put it into the database, you have many options. Perhaps the simplest is to "top and tail" each line to make it into an INSERT statement. You need to go from this:
Aarhus Aaron Ababa aback O'Brien
to this:
INSERT INTO words VALUES ('Aarhus') INSERT INTO words VALUES ('Aaron') INSERT INTO words VALUES ('Ababa') INSERT INTO words VALUES ('aback') INSERT INTO words VALUES ('O''Brien')
Notice that the single quote must be "escaped." The name O'Brien becomes O''Brien. The following Perl one-liner will take care of that (you could pipe it into your SQL command-line utility [Hack #1] if you want):
$ perl pe "s/'/''/g;s/.*/INSERT INTO words VALUES ('$&');/" words
This command assumes words is a text file containing a list of words, such as /usr/share/dict/words found on most Linux, Unix, and Mac OS X systems. Various word lists are available from http://wordlist.sourceforge.net.
Another approach is to use a spreadsheet such as Excel to manipulate the data, as shown in Figure 1-2.
Figure 1-2. Using Excel to preprocess SQL
The first column, A, contains the original data from a text file. You can enter this data using the copy and paste tools or by selecting Fileimages/U2192.jpg border=0>Open. Column B uses the SUBSTITUTE function to escape the single quotes:
=SUBSTITUTE(A1,"'","''")
Column C uses the append operator, &, to construct the required SQL INSERT statement:
="INSERT INTO words VALUES ('" & B1 & "');"
When you've copied both formulas down the whole word list, you can copy and paste column C into your SQL command prompt or into a .sql file for later use.
SQL Fundamentals
Joins, Unions, and Views
Text Handling
Date Handling
Number Crunching
Online Applications
Organizing Data
Storing Small Amounts of Data
Locking and Performance
Reporting
Users and Administration
Wider Access
Index