Recipe 5.9. Controlling Case Sensitivity in String Comparisons


Problem

You want to know whether strings are equal or unequal, or which one appears first in lexical order.

Solution

Use a comparison operator. But remember that strings have properties such as case sensitivity that you must take into account. For example, a string comparison might be case-sensitive when you don't want it to be, or vice versa.

Discussion

As for other data types, you can compare string values for equality, inequality, or relative ordering:

mysql> SELECT 'cat' = 'cat', 'cat' = 'dog'; +---------------+---------------+ | 'cat' = 'cat' | 'cat' = 'dog' | +---------------+---------------+ |             1 |             0 | +---------------+---------------+ mysql> SELECT 'cat' != 'cat', 'cat' != 'dog'; +----------------+----------------+ | 'cat' != 'cat' | 'cat' != 'dog' | +----------------+----------------+ |              0 |              1 | +----------------+----------------+ mysql> SELECT 'cat' < 'awk', 'cat' < 'dog'; +---------------+---------------+ | 'cat' < 'awk' | 'cat' < 'dog' | +---------------+---------------+ |             0 |             1 | +---------------+---------------+ mysql> SELECT 'cat' BETWEEN 'awk' AND 'egret'; +---------------------------------+ | 'cat' BETWEEN 'awk' AND 'egret' | +---------------------------------+ |                               1 | +---------------------------------+ 

However, comparison and sorting properties of strings are subject to certain complications that don't apply to other types of data. For example, sometimes you need to make sure a string operation is case-sensitive that would not otherwise be, or vice versa. This section describes how to do that for ordinary comparisons. Section 5.12 covers case sensitivity in pattern-matching operations.

String comparison properties depend on whether the operands are binary or nonbinary strings:

  • A binary string is a sequence of bytes and is compared using numeric byte values. Lettercase has no meaning. However, because letters in different cases have different byte values, comparisons of binary strings effectively are case-sensitive. (That is, a and A are unequal.) If you want to compare binary strings so that lettercase does not matter, convert them to nonbinary strings that have a case-insensitive collation.

  • A nonbinary string is a sequence of characters and is compared in character units. (Depending on the character set, some characters might have multiple bytes.) The string has a character set that defines the legal characters and a collation that defines their sort order. The collation also determines whether to consider characters in different lettercases the same in comparisons. If the collation is case-sensitive, and you want a case-insensitive collation (or vice versa), convert the strings to use a collation with the desired case-comparison properties.

By default, strings have a character set of latin1 and a collation of latin1_swedish_ci. This results in case-insensitive string comparisons.

The following example shows how two binary strings that compare as unequal can be handled so that they are equal when compared as case-insensitive nonbinary strings:

mysql> SET @s1 = BINARY 'cat', @s2 = BINARY 'CAT'; mysql> SELECT @s1 = @s2; +-----------+ | @s1 = @s2 | +-----------+ |         0 | +-----------+ mysql> SET @s1 = CONVERT(@s1 USING latin1) COLLATE latin1_swedish_ci; mysql> SET @s2 = CONVERT(@s2 USING latin1) COLLATE latin1_swedish_ci; mysql> SELECT @s1 = @s2; +-----------+ | @s1 = @s2 | +-----------+ |         1 | +-----------+ 

In this case, because latin1_swedish_ci is the default collation for latin1, you can omit the COLLATE operator:

mysql> SET @s1 = CONVERT(@s1 USING latin1); mysql> SET @s2 = CONVERT(@s2 USING latin1); mysql> SELECT @s1 = @s2; +-----------+ | @s1 = @s2 | +-----------+ |         1 | +-----------+ 

The next example shows how to compare two strings that are not case-sensitive (as demonstrated by the first SELECT) in case-sensitive fashion (as demonstrated by the second):

mysql> SET @s1 = _latin1 'cat', @s2 = _latin1 'CAT'; mysql> SELECT @s1 = @s2; +-----------+ | @s1 = @s2 | +-----------+ |         1 | +-----------+ mysql> SELECT @s1 COLLATE latin1_general_cs = @s2 COLLATE latin1_general_cs     ->   AS '@s1 = @s2'; +-----------+ | @s1 = @s2 | +-----------+ |         0 | +-----------+ 

If you compare a binary string with a nonbinary string, the comparison treats both operands as binary strings:

mysql> SELECT _latin1 'cat' = BINARY 'CAT'; +------------------------------+ | _latin1 'cat' = BINARY 'CAT' | +------------------------------+ |                            0 | +------------------------------+ 

Thus, if you want to compare two nonbinary strings as binary strings, apply the BINARY operator to either one when comparing them:

mysql> SET @s1 = _latin1 'cat', @s2 = _latin1 'CAT'; mysql> SELECT @s1 = @s2, BINARY @s1 = @s2, @s1 = BINARY @s2; +-----------+------------------+------------------+ | @s1 = @s2 | BINARY @s1 = @s2 | @s1 = BINARY @s2 | +-----------+------------------+------------------+ |         1 |                0 |                0 | +-----------+------------------+------------------+ 

If you find that you've declared a column using a type that is not suitable for the kind of comparisons for which you typically use it, use ALTER TABLE to change the type. Suppose that you have a table in which you store news articles:

CREATE TABLE news (   id      INT UNSIGNED NOT NULL AUTO_INCREMENT,   article BLOB,   PRIMARY KEY (id) ); 

Here the article column is declared as a BLOB, which is a binary string type. This means that if you store text in the column, comparisons are made without regard to character set. (In effect, they are case-sensitive.) If that's not what you want, you can convert the column to a nonbinary type that has a case-insensitive collation using ALTER TABLE:

ALTER TABLE news   MODIFY article TEXT CHARACTER SET utf8 COLLATE utf8_general_ci; 




MySQL Cookbook
MySQL Cookbook
ISBN: 059652708X
EAN: 2147483647
Year: 2004
Pages: 375
Authors: Paul DuBois

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net