MySQL supports a vast number of character sets. To see the complete list of supported character sets, use this statement: • Input SHOW CHARACTER SET; • Analysis This statement displays all available character sets, along with the description and default collation for each. To see the complete list of supported collations, use this statement: • Input SHOW COLLATION; • Analysis This statement displays all available collations, along with the character sets to which they apply. You will notice that several character sets have more than one collation. latin1, for example, has several for different European languages, and many appear twice, once case sensitive (designated by _cs) and once not case sensitive (designated by _ci). A default character set and collation are defined (usually by the system administration at installation time). In addition, when databases are created, default character sets and collations may be specified, too. To determine the character sets and collations in use, use these statements: • Input SHOW VARIABLES LIKE 'character%'; SHOW VARIABLES LIKE 'collation%'; In practice, character sets can seldom be server-wide (or even database-wide) settings. Different tables, and even different columns, may require different character sets, and so both may be specified when a table is created. To specify a character set and collation for a table, CREATE TABLE (seen in Chapter 21, "Creating and Manipulating Tables") is used with additional clauses: • Input CREATE TABLE mytable ( columnn1 INT, columnn2 VARCHAR(10) ) DEFAULT CHARACTER SET hebrew COLLATE hebrew_general_ci; • Analysis This statement creates a two column table, and specifies both a character set and a collate sequence. In this example both CHARACTER SET and COLLATE were specified, but if only one (or neither) is specified, this is how MySQL determines what to use:
In addition to being able to specify character set and collation table wide, MySQL also allows these to be set per column, as seen here: • Input CREATE TABLE mytable ( columnn1 INT, columnn2 VARCHAR(10), column3 VARCHAR(10) CHARACTER SET latin1 COLLATE latin1_general_ci ) DEFAULT CHARACTER SET hebrew COLLATE hebrew_general_ci; • Analysis Here CHARACTER SET and COLLATE are specified for the entire table as well as for a specific column. As mentioned previously, the collation plays a key role in sorting data that is retrieved with an ORDER BY clause. If you need to sort specific SELECT statements using a collation sequence other than the one used at table creation time, you may do so in the SELECT statement itself: • Input SELECT * FROM customers ORDER BY lastname, firstname COLLATE latin1_general_cs; • Analysis This SELECT uses COLLATE to specify an alternate collation sequence (in this example, a case-sensitive one). This will obviously affect the order in which results are sorted. Tip Occasional Case Sensitivity The SELECT statement just seen demonstrates a useful technique for performing case-sensitive searches on a table that is usually not case sensitive. And of course, the reverse works just as well. Note Other SELECT COLLATE Clauses In addition to being used in ORDER BY clauses, as seen here, COLLATE can be used with GROUP BY, HAVING, aggregate functions, aliases, and more. One final point worth noting is that strings may be converted between character sets if absolutely needed. To do this, use the Cast() or Convert() functions. |