Performing Phrase Searches with a FULLTEXT Index

4.15.1 Problem

You want to perform a FULLTEXT search for a phrase, that is, for words that occur adjacent to each other and in a specific order.

4.15.2 Solution

Use the FULLTEXT phrase search capability, or combine a non-phrase FULLTEXT search with regular pattern matching.

4.15.3 Discussion

To find records that contain a particular phrase, you can't use a simple FULLTEXT search:

mysql> SELECT COUNT(*) FROM kjv
 -> WHERE MATCH(vtext) AGAINST('still small voice');
+----------+
| COUNT(*) |
+----------+
| 548 |
+----------+

The query returns a result, but it's not the result you're looking for. A FULLTEXT search computes a relevance ranking based on the presence of each word individually, no matter where it occurs within the vtext column, and the ranking will be nonzero as long as any of the words are present. Consequently, this kind of query tends to find too many records.

As of MySQL 4.0.2, FULLTEXT searching supports phrase searching in Boolean mode. To use it, just place the phrase within double quotes.

mysql> SELECT COUNT(*) FROM kjv
 -> WHERE MATCH(vtext) AGAINST('"still small voice"' IN BOOLEAN MODE);
+----------+
| COUNT(*) |
+----------+
| 1 |
+----------+

Prior to 4.0.2, a workaround is necessary. You could use an IN BOOLEAN MODE search to require each word to be present, but that doesn't really solve the problem, because the words can still occur in any order:

mysql> SELECT COUNT(*) FROM kjv
 -> WHERE MATCH(vtext)
 -> AGAINST('+still +small +voice' IN BOOLEAN MODE);
+----------+
| COUNT(*) |
+----------+
| 3 |
+----------+

If you use a SQL pattern match instead, it returns the correct result:

mysql> SELECT COUNT(*) FROM kjv
 -> WHERE vtext LIKE '%still small voice%';
+----------+
| COUNT(*) |
+----------+
| 1 |
+----------+

However, using a SQL pattern match is likely to be slower than a FULLTEXT search. So it seems you have the unpleasant choice of using a method that is faster but doesn't produce the desired results, or one that works properly but is slower. Fortunately, those aren't your only options. You can combine both methods in the same query:

mysql> SELECT COUNT(*) FROM kjv
 -> WHERE MATCH(vtext) AGAINST('still small voice')
 -> AND vtext LIKE '%still small voice%';
+----------+
| COUNT(*) |
+----------+
| 1 |
+----------+

What this gains you is the best of both types of matching:

  • Using the MATCH( ) expression, MySQL can perform a FULLTEXT search to produce a set of candidate rows that contain words in the phrase. This narrows the search considerably.
  • Using the SQL pattern test, MySQL can search the candidate rows to produce only those records that have all the words arranged in the proper order.

This technique will fail if all the words are less than the indexing engine's minimum word length or occur in more than half the records. In that case, the FULLTEXT search returns no records at all. You can find the records using a SQL pattern match.

Using the mysql Client Program

Writing MySQL-Based Programs

Record Selection Techniques

Working with Strings

Working with Dates and Times

Sorting Query Results

Generating Summaries

Modifying Tables with ALTER TABLE

Obtaining and Using Metadata

Importing and Exporting Data

Generating and Using Sequences

Using Multiple Tables

Statistical Techniques

Handling Duplicates

Performing Transactions

Introduction to MySQL on the Web

Incorporating Query Resultsinto Web Pages

Processing Web Input with MySQL

Using MySQL-Based Web Session Management

Appendix A. Obtaining MySQL Software

Appendix B. JSP and Tomcat Primer

Appendix C. References



MySQL Cookbook
MySQL Cookbook
ISBN: 059652708X
EAN: 2147483647
Year: 2005
Pages: 412
Authors: Paul DuBois

Flylib.com © 2008-2020.
If you may any questions please contact us: flylib@qtcs.net