Recipe 5.5. Checking a String's Character Set or CollationProblemYou want to know the character set or collation of a string. SolutionUse the CHARSET( ) or COLLATION( ) function. DiscussionIf you create a table using the following definition, you know that values stored in the column will have a character set of utf8 and a collation of utf8_danish_ci: CREATE TABLE t (c CHAR(10) CHARACTER SET utf8 COLLATE utf8_danish_ci); But sometimes it's not so clear what character set or collation applies to a string. Server configuration affects literal strings and some string functions, and other string functions return values in a specific character set. Symptoms that you have the wrong character set or collation are that a collation-mismatch error occurs for a comparison operation, or a lettercase conversion doesn't work properly. This section shows how to check what character set or collation a string has. Section 5.6 shows how to convert strings to a different character set or collation. To find out what character set or collation a string has, use the CHARSET( ) or COLLATION( ) function. For example, did you know that the USER( ) function returns a Unicode string? mysql> SELECT USER(), CHARSET(USER()), COLLATION(USER()); +------------------+-----------------+-------------------+ | USER() | CHARSET(USER()) | COLLATION(USER()) | +------------------+-----------------+-------------------+ | cbuser@localhost | utf8 | utf8_general_ci | +------------------+-----------------+-------------------+ String values that take their character set and collation from the current configuration may change properties if the configuration changes. This is true for literal strings: mysql> SET NAMES 'latin1'; mysql> SELECT CHARSET('abc'), COLLATION('abc'); +----------------+-------------------+ | CHARSET('abc') | COLLATION('abc') | +----------------+-------------------+ | latin1 | latin1_swedish_ci | +----------------+-------------------+ mysql> SET NAMES latin7 COLLATE 'latin7_bin'; mysql> SELECT CHARSET('abc'), COLLATION('abc'); +----------------+------------------+ | CHARSET('abc') | COLLATION('abc') | +----------------+------------------+ | latin7 | latin7_bin | +----------------+------------------+ For a binary string, the CHARSET( ) or COLLATION( ) functions return a value of binary, which means that the string is compared and sorted based on numeric byte values, not character collation values. Several functions return binary strings, such as MD5( ) and PASSWORD( ): mysql> SELECT CHARSET(MD5('a')), COLLATION(MD5('a')); +-------------------+---------------------+ | CHARSET(MD5('a')) | COLLATION(MD5('a')) | +-------------------+---------------------+ | binary | binary | +-------------------+---------------------+ mysql> SELECT CHARSET(PASSWORD('a')), COLLATION(PASSWORD('a')); +------------------------+--------------------------+ | CHARSET(PASSWORD('a')) | COLLATION(PASSWORD('a')) | +------------------------+--------------------------+ | binary | binary | +------------------------+--------------------------+ It can be useful to know that a function or string expression produces a binary string if you're trying to perform lettercase conversion on the result and it's not working. See Section 5.8 for details. |