MySQL's string functions fall into two categories: the string processing functions and the string comparison functions. You will find the latter more useful than the former. String Processing FunctionsA list of the more useful string functions is shown in Table 8.3. There are many more in the manual. Table 8.3. String Processing Functions
String Comparison FunctionsIn addition to offering the equality operator for comparing two strings, MySQL provides various comparison functions we can also use:
We will discuss LIKE , RLIKE , and STRCMP in this section. Full-text searching is a feature specific to MyISAM tables. It is discussed in Chapter 9, "Understanding MySQL's Table Types." Using LIKE for Wildcard MatchingLet's consider an example using LIKE : select * from department where name like '%research%'; Rather than looking for names that are equal to 'research' , we are looking for names containing 'research' . This produces the following results: +---------------+--------------------------+ departmentID name +---------------+--------------------------+ 128 Research and Development +---------------+--------------------------+ 1 row in set (0.04 sec) The LIKE function supports two kinds of wildcard matching. The percentage sign ( % ), as used in the preceding example, matches any number of characters (including zero). Hence, the expression '%research%' matches all strings that have the word research in them somewhere. Note that string matching is generally case insensitive, as discussed earlier in this chapter. The other wildcard character is the underscore ( _ ). This matches any single character. For example '_at' matches the strings 'cat' , 'mat' , 'bat' , and so on. Using RLIKE for Regular Expression MatchingThe RLIKE function can be used to match on the basis of regular expressions. A regular expression is a pattern that describes the general shape of a string. There is a special notation for describing the features we would like to see in matching strings. We will take a brief look at this notation. First, a literal string matches that string. So, the pattern 'cat' matches 'cat' . However, it also matches 'catacomb' and 'the cat sat on the mat' . The pattern 'cat' matches 'cat' anywhere inside the target string. If we want to match only the word 'cat' , then the pattern would need to be '^cat$' . The caret ( ^ ) means "anchor to the start of the string;" in other words, the first thing at the start of a matching string is the word 'cat' . The dollar sign ( $ ) means "anchor to the end of the string;" in other words, the last thing in the string must be the word 'cat' . So, the pattern '^cat$' can match only the string 'cat' and nothing else. Regular expressions also support wildcards, just as LIKE does. However, the wildcard is different. We have only one, the dot ( . ), that will match any single character. So, '.at' matches 'cat' , 'bat' , 'mat' and so on. We only need a single wildcard character because we can also specify how often characters (including wildcards) can appear in a string. For example, the special * character after a character means that character may appear zero or more times. So, the pattern 'n*' matches '' , 'n' , 'nn' , 'nnn' , and so on. We can group characters with parentheses, so '(cat)*' matches '' , 'cat' , 'catcat' , 'catcatcat' , and so on. We can also use a wildcard, so '.*' matches any number of any characters ”basically anything. Similarly, the plus sign ( + ) means that the character or string before it should be repeated one or more times, and the question mark ( ? ) means to match zero times or one time. You can also list a specific range, so, for example, '(cat)(2,4)' matches 'catcat' , 'catcatcat' , and 'catcatcatcat' . As well as listing specific characters and strings, we can list sets of characters. These appear in square brackets. For example, the pattern '[a-z]' matches any single letter, and '[a-z]*' matches any number of letters. Finally, there are a number of character classes, which are predefined sets. For example, [[:alnum:]] matches any alphanumeric character. If you are a Perl programmer, it is worth noting that MySQL uses POSIX-style regular expressions, which have a different syntax from Perl regular expressions. Now, let's look at an example of how to use these patterns with RLIKE . Consider the following query: select * from department where name rlike 'an'; This will find all the department names that contain the string 'an' somewhere inside them: +---------------+--------------------------+ departmentID name +---------------+--------------------------+ 42 Finance 128 Research and Development 129 Human Resources +---------------+--------------------------+ 3 rows in set (0.00 sec) Regular expressions can be very powerful and can get quite complex. If you would like more examples and syntax detail, there are many good tutorials on the Web. Using STRCMP() for String ComparisonThe STRCMP() function in MySQL uses the same conventions as it does in other programming languages, such as C or PHP. It has the prototype STRCMP( s1, s2 ) and returns the following values:
For example, the following queries return the following results: mysql> select strcmp('cat', 'cat'); +----------------------+ strcmp('cat', 'cat') +----------------------+ 0 +----------------------+ 1 row in set (0.42 sec) mysql> select strcmp('cat', 'dog'); +----------------------+ strcmp('cat', 'dog') +----------------------+ -1 +----------------------+ 1 row in set (0.00 sec) mysql> select strcmp('cat', 'ant'); +----------------------+ strcmp('cat', 'ant') +----------------------+ 1 +----------------------+ 1 row in set (0.00 sec) Note that sort order changes with the character set. If you are using a non-English language such as German, strings will still sort in the order you expect them to, as long as you have set the character encoding when creating the table. |