Using Different Languages in MySQL

The data you store in MySQL can, of course, be in any language you want, but MySQL itself is also used by many people around the world who do not speak English as a first language. MySQL AB, the company now responsible for MySQL, is based in Sweden, and most of the primary developers are Scandinavian. So it comes as no surprise that MySQL distributions come with support for other languages. The following languages are currently supported, and more are likely to be added: Czech, Danish, Dutch, English (the default), Estonian, French, German, Greek, Hungarian, Italian, Korean, Norwegian, Norwegian-ny, Polish, Portuguese, Romanian, Russian, Slovak, Spanish, and Swedish.

Displaying Error Messages in Another Language

Starting MySQL so that it displays error messages in one of these languages is as easy as using the --language or -L option at startup. To do so in the config file, simply add a line such as the following:

 language=french 
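As a minimal sketch, the line belongs in the server's option group. The snippet below writes a throwaway option file to a temporary path and prints it. The temporary path and the placement under [mysqld] are assumptions about a typical setup rather than something stated above; the two command-line forms appear only as comments.

```shell
# Write a sample option file to a temporary location (hypothetical path;
# a real server usually reads /etc/my.cnf or a similar file).
cnf="$(mktemp)"
cat > "$cnf" <<'EOF'
[mysqld]
language=french
EOF

# Equivalent command-line forms, shown for reference and not run here:
#   mysqld --language=french
#   mysqld -L french
cat "$cnf"
```

The server only reads the option file at startup, so the setting takes effect after a restart.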

You can also edit the error messages yourself (perhaps you want your database to have that personal touch) or contribute your own set in another language and give something back to the MySQL community. To change the error messages, simply edit the errmsg.txt file in the appropriate language directory (usually /share/language_name from the MySQL base directory), run the comp_err utility, and restart the server. For example:

 % cp errmsg.txt errmsg.bak
 % vi errmsg.txt

Here I edited the error message that read as follows:

 "No Database Selected", 

to read as follows instead:

 "Haven't you forgotten something - No Database Selected", 

and then saved it:

 "errmsg.txt" 229 lines, 12060 characters written
 % comp_err errmsg.txt errmsg.sys
 Found 226 messages in language file errmsg.sys

Then restart the server, and the new error messages will take effect:

 % mysqladmin shutdown
 % /etc/rc.d/init.d/mysql start
 % mysql -uroot -pg00r002b
 mysql> SELECT * FROM a;
 ERROR 1046: Haven't you forgotten something - No Database Selected

You'll have to repeat the changes if you upgrade to a newer version of MySQL.
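Because an upgrade replaces errmsg.txt, it can help to record exactly what you changed. The following is a minimal sketch, assuming the backup made earlier is still named errmsg.bak and that the standard diff utility is available. The file contents and the name errmsg.custom.diff are stand-ins created in a temporary directory so that the snippet is self-contained.

```shell
# Work in a scratch directory with stand-in files (the real files live in
# the language directory of your MySQL installation).
work="$(mktemp -d)"
printf '%s\n' '"No Database Selected",' > "$work/errmsg.bak"
printf '%s\n' "\"Haven't you forgotten something - No Database Selected\"," > "$work/errmsg.txt"

# diff exits with status 1 when the files differ, which is the expected case.
diff -u "$work/errmsg.bak" "$work/errmsg.txt" > "$work/errmsg.custom.diff" || true

# Review the recorded change; after an upgrade, reapply it to the new file.
cat "$work/errmsg.custom.diff"
```

Keep the resulting diff outside the MySQL directory tree so that an upgrade does not overwrite it.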

Using a Different Character Set

By default, MySQL uses the Latin1 (ISO-8859-1) character set. The character set determines what characters can be used, as well as the sorting order for queries. You can change the character set by changing the value of the --default-character-set option when you start the server. The available character sets currently include the following:

latin1      dos       estonia
big5        german1   hungarian
czech       hp8       koi8_ukr
euc_kr      koi8_ru   win1251ukr
gb2312      latin2    greek
gbk         swe7      win1250
latin1_de   usa7      croat
sjis        cp1251    cp1257
tis620      danish    latin5
ujis        hebrew
dec8        win1251

You can see what character sets are available in your distribution by looking at the value of the character_sets variable.

When you change a character set, you'll need to rebuild your indexes to ensure they sort according to the rules of the new character set.
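For MyISAM tables, the rebuild is typically done with myisamchk while the server is stopped. The following is a dry-run sketch: it only prints the command for each index file rather than executing it. The function name, the data directory, the table names, and the target character set latin2 are placeholders made up for illustration; -r, -q, and --set-character-set are myisamchk's own options.

```shell
# Print (not run) the myisamchk command needed for every MyISAM index file.
# $1 = data directory, $2 = new character set name.
print_rebuild_commands() {
  for idx in "$1"/*/*.MYI; do
    [ -e "$idx" ] || continue   # skip the unexpanded pattern if nothing matches
    echo "myisamchk -r -q --set-character-set=$2 $idx"
  done
}

# Demonstration against a scratch directory with empty stand-in files.
datadir="$(mktemp -d)"
mkdir "$datadir/shop"
: > "$datadir/shop/customer.MYI"
: > "$datadir/shop/orders.MYI"
print_rebuild_commands "$datadir" latin2
```

Printing the commands first lets you review the list of tables before running anything against real data.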

By default, MySQL is compiled with --with-extra-charsets=complex, which makes the other character sets available if necessary. If you are compiling MySQL yourself, and you know you are never going to need other character sets, you can use the --with-extra-charsets=none option.

Adding Your Own Character Set

You can add your own character set as well. If it is a simple character set and does not need multibyte character support or string collating routines for sorting, adding it is easy. It becomes more complex if these extras are required. To add a character set, perform the following steps:

  1. Add the new character set to the sql/share/charsets/Index file, and give it a unique ID. The path may differ on some distributions, but it'll always be the Index file. Here, you can call the new character set martian, with an ID of 31:

    # sql/share/charsets/Index
    #
    # This file lists all of the available character sets.
    big5               1
    czech              2
    dec8               3
    dos                4
    german1            5
    hp8                6
    koi8_ru            7
    latin1             8
    latin2             9
    swe7              10
    usa7              11
    ujis              12
    sjis              13
    cp1251            14
    danish            15
    hebrew            16
    # The win1251 character set is deprecated.  Please use cp1251 instead.
    win1251           17
    tis620            18
    euc_kr            19
    estonia           20
    hungarian         21
    koi8_ukr          22
    win1251ukr        23
    gb2312            24
    greek             25
    win1250           26
    croat             27
    gbk               28
    cp1257            29
    latin5            30
    martian           31

  2. Create the .conf file and place it in the same charsets directory, for example, sql/share/charsets/martian.conf. Use one of the existing .conf files as a starting point.

    In the .conf file, lines beginning with a # are comments, words are separated by any amount of whitespace, and every word must be in hexadecimal format. There are four arrays. In order, they are ctype (containing 257 elements), to_lower and to_upper (each containing 256 elements), and sort_order (also containing 256 elements). The following is a sample .conf file (this is the standard latin1.conf):

    # Configuration file for the latin1 character set

    # ctype array (must have 257 elements)
      00
      20  20  20  20  20  20  20  20  20  28  28  28  28  28  20  20
      20  20  20  20  20  20  20  20  20  20  20  20  20  20  20  20
      48  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10
      84  84  84  84  84  84  84  84  84  84  10  10  10  10  10  10
      10  81  81  81  81  81  81  01  01  01  01  01  01  01  01  01
      01  01  01  01  01  01  01  01  01  01  01  10  10  10  10  10
      10  82  82  82  82  82  82  02  02  02  02  02  02  02  02  02
      02  02  02  02  02  02  02  02  02  02  02  10  10  10  10  20
      00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
      00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
      48  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10
      10  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10
      01  01  01  01  01  01  01  01  01  01  01  01  01  01  01  01
      01  01  01  01  01  01  01  10  01  01  01  01  01  01  01  02
      02  02  02  02  02  02  02  02  02  02  02  02  02  02  02  02
      02  02  02  02  02  02  02  10  02  02  02  02  02  02  02  02

    # to_lower array (must have 256 elements)
      00  01  02  03  04  05  06  07  08  09  0A  0B  0C  0D  0E  0F
      10  11  12  13  14  15  16  17  18  19  1A  1B  1C  1D  1E  1F
      20  21  22  23  24  25  26  27  28  29  2A  2B  2C  2D  2E  2F
      30  31  32  33  34  35  36  37  38  39  3A  3B  3C  3D  3E  3F
      40  61  62  63  64  65  66  67  68  69  6A  6B  6C  6D  6E  6F
      70  71  72  73  74  75  76  77  78  79  7A  5B  5C  5D  5E  5F
      60  61  62  63  64  65  66  67  68  69  6A  6B  6C  6D  6E  6F
      70  71  72  73  74  75  76  77  78  79  7A  7B  7C  7D  7E  7F
      80  81  82  83  84  85  86  87  88  89  8A  8B  8C  8D  8E  8F
      90  91  92  93  94  95  96  97  98  99  9A  9B  9C  9D  9E  9F
      A0  A1  A2  A3  A4  A5  A6  A7  A8  A9  AA  AB  AC  AD  AE  AF
      B0  B1  B2  B3  B4  B5  B6  B7  B8  B9  BA  BB  BC  BD  BE  BF
      E0  E1  E2  E3  E4  E5  E6  E7  E8  E9  EA  EB  EC  ED  EE  EF
      F0  F1  F2  F3  F4  F5  F6  D7  F8  F9  FA  FB  FC  FD  FE  DF
      E0  E1  E2  E3  E4  E5  E6  E7  E8  E9  EA  EB  EC  ED  EE  EF
      F0  F1  F2  F3  F4  F5  F6  F7  F8  F9  FA  FB  FC  FD  FE  FF

    # to_upper array (must have 256 elements)
      00  01  02  03  04  05  06  07  08  09  0A  0B  0C  0D  0E  0F
      10  11  12  13  14  15  16  17  18  19  1A  1B  1C  1D  1E  1F
      20  21  22  23  24  25  26  27  28  29  2A  2B  2C  2D  2E  2F
      30  31  32  33  34  35  36  37  38  39  3A  3B  3C  3D  3E  3F
      40  41  42  43  44  45  46  47  48  49  4A  4B  4C  4D  4E  4F
      50  51  52  53  54  55  56  57  58  59  5A  5B  5C  5D  5E  5F
      60  41  42  43  44  45  46  47  48  49  4A  4B  4C  4D  4E  4F
      50  51  52  53  54  55  56  57  58  59  5A  7B  7C  7D  7E  7F
      80  81  82  83  84  85  86  87  88  89  8A  8B  8C  8D  8E  8F
      90  91  92  93  94  95  96  97  98  99  9A  9B  9C  9D  9E  9F
      A0  A1  A2  A3  A4  A5  A6  A7  A8  A9  AA  AB  AC  AD  AE  AF
      B0  B1  B2  B3  B4  B5  B6  B7  B8  B9  BA  BB  BC  BD  BE  BF
      C0  C1  C2  C3  C4  C5  C6  C7  C8  C9  CA  CB  CC  CD  CE  CF
      D0  D1  D2  D3  D4  D5  D6  D7  D8  D9  DA  DB  DC  DD  DE  DF
      C0  C1  C2  C3  C4  C5  C6  C7  C8  C9  CA  CB  CC  CD  CE  CF
      D0  D1  D2  D3  D4  D5  D6  F7  D8  D9  DA  DB  DC  DD  DE  FF

    # sort_order array (must have 256 elements)
      00  01  02  03  04  05  06  07  08  09  0A  0B  0C  0D  0E  0F
      10  11  12  13  14  15  16  17  18  19  1A  1B  1C  1D  1E  1F
      20  21  22  23  24  25  26  27  28  29  2A  2B  2C  2D  2E  2F
      30  31  32  33  34  35  36  37  38  39  3A  3B  3C  3D  3E  3F
      40  41  42  43  44  45  46  47  48  49  4A  4B  4C  4D  4E  4F
      50  51  52  53  54  55  56  57  58  59  5A  5B  5C  5D  5E  5F
      60  41  42  43  44  45  46  47  48  49  4A  4B  4C  4D  4E  4F
      50  51  52  53  54  55  56  57  58  59  5A  7B  7C  7D  7E  7F
      80  81  82  83  84  85  86  87  88  89  8A  8B  8C  8D  8E  8F
      90  91  92  93  94  95  96  97  98  99  9A  9B  9C  9D  9E  9F
      A0  A1  A2  A3  A4  A5  A6  A7  A8  A9  AA  AB  AC  AD  AE  AF
      B0  B1  B2  B3  B4  B5  B6  B7  B8  B9  BA  BB  BC  BD  BE  BF
      41  41  41  41  5C  5B  5C  43  45  45  45  45  49  49  49  49
      44  4E  4F  4F  4F  4F  5D  D7  D8  55  55  55  59  59  DE  DF
      41  41  41  41  5C  5B  5C  43  45  45  45  45  49  49  49  49
      44  4E  4F  4F  4F  4F  5D  F7  D8  55  55  55  59  59  DE  FF

    The ctype array contains bit values, with one element for each character. The to_lower and to_upper arrays simply hold the lowercase and uppercase characters that correspond to each member of the character set. For example, to_lower['A'] contains a, while to_upper['z'] contains Z.

    The sort_order array indicates the order in which characters are to be sorted (it usually corresponds to to_upper, in which case the sorting will be case insensitive). All of the arrays are indexed by character value, except ctype, which is indexed by character value + 1 (for legacy reasons).

  3. Add the name of the new character set (martian) to the CHARSETS_AVAILABLE and COMPILED_CHARSETS lists in the file configure.in.

  4. Reconfigure and recompile MySQL, and test the new character set.
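Before reconfiguring, it may be worth checking the two files from steps 1 and 2 mechanically. The sketch below is a rough pre-flight check, not part of MySQL: check_index and check_conf are hypothetical helper names, and the sample files are stand-ins generated in a temporary location. It assumes only what the steps above state: Index entries need unique IDs, and a .conf file holds 257 + 256 + 256 + 256 = 1025 hexadecimal words once comment lines are ignored.

```shell
# check_index FILE: fail if a character set name or ID appears more than once.
check_index() {
  awk '
    /^#/ || NF == 0 { next }            # skip comments and blank lines
    {
      if (names[$1]++ || ids[$2]++) { dup = 1; print "duplicate entry: " $0 }
    }
    END { exit dup + 0 }
  ' "$1"
}

# check_conf FILE: fail unless the file holds exactly 1025 hexadecimal words.
check_conf() {
  awk '
    /^#/ { next }                       # skip comment lines
    {
      for (i = 1; i <= NF; i++) {
        if ($i !~ /^[0-9A-Fa-f]+$/) bad++
        words++
      }
    }
    END {
      printf "%d words, %d not hexadecimal\n", words, bad
      exit !(words == 1025 && bad == 0)
    }
  ' "$1"
}

# Stand-in files for demonstration only.
index="$(mktemp)"
printf '%s\n' '# stand-in Index' 'latin1 8' 'latin5 30' 'martian 31' > "$index"
conf="$(mktemp)"
{
  echo '# stand-in arrays, all zero'
  awk 'BEGIN { for (i = 0; i < 1025; i++) print "00" }'
} > "$conf"

check_index "$index" && echo "Index OK"
check_conf "$conf" && echo "conf OK"
```

The word count cannot tell which array a stray value belongs to, so a failure here only says that something is missing or extra somewhere in the file.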

If you're brave enough to tackle adding a new complex character set, there are a few more steps to this process. See the MySQL documentation for what is required (as well as the documentation in the existing complex character sets: czech, gbk, sjis, and tis620).
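To make the ctype bit values from step 2 more concrete, here is a small sketch that decodes a single entry. The flag meanings are an assumption: they are taken to match the definitions in the m_ctype.h header of the MySQL source (in hexadecimal: 01 uppercase, 02 lowercase, 04 digit, 08 spacing, 10 punctuation, 20 control, 40 blank, 80 hexadecimal digit), so check the header in your own source tree before relying on them. The function name ctype_flags is made up for this example.

```shell
# ctype_flags HEX: print the flag names set in one ctype entry.
# Bit meanings are assumed from m_ctype.h; verify against your source tree.
ctype_flags() {
  v=$((0x$1))
  out=""
  if [ $((v & 0x01)) -ne 0 ]; then out="$out upper"; fi
  if [ $((v & 0x02)) -ne 0 ]; then out="$out lower"; fi
  if [ $((v & 0x04)) -ne 0 ]; then out="$out digit"; fi
  if [ $((v & 0x08)) -ne 0 ]; then out="$out space"; fi
  if [ $((v & 0x10)) -ne 0 ]; then out="$out punct"; fi
  if [ $((v & 0x20)) -ne 0 ]; then out="$out control"; fi
  if [ $((v & 0x40)) -ne 0 ]; then out="$out blank"; fi
  if [ $((v & 0x80)) -ne 0 ]; then out="$out hex"; fi
  echo "${out# }"                       # drop the leading separator
}

ctype_flags 81   # entry for 'A' in the sample latin1.conf; prints: upper hex
ctype_flags 48   # entry for the space character; prints: space blank
```

Remember that ctype is indexed by character value + 1, so the entry for 'A' (0x41) is the 66th element after the leading 00.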

Summary

To get the most out of your database server, it's important to understand the many options you have when fine-tuning it. To see how an existing server has been set up, use the SHOW VARIABLES statement, and use SHOW STATUS to see how it has been performing. The output of these two statements can reveal many hidden problems, including queries that are not optimized, poor use of available memory, or simply that it's time for an upgrade.

MySQL supplies four sample configuration files that can help you get better performance from the server. Just choose whichever of my-huge.cnf, my-large.cnf, my-medium.cnf, or my-small.cnf is closest to your server situation.

Two of the easiest and most important variables to tweak are table_cache (the number of tables MySQL can keep open) and the key_buffer_size (how much of the indexes MySQL can keep in memory, minimizing disk access).

InnoDB tables have their own vagaries and work in a fundamentally different way than MyISAM tables, where each table is related to specific files. InnoDB configuration requires careful planning because disk space is allocated in advance.

Hardware too can be an easy way of improving the performance of a server, with memory, CPU, and disks being of primary importance.

MySQL comes with a benchmark suite, which can be used to compare the performance of various platforms, including other databases.

MySQL was developed in Scandinavia and has good support for languages other than English. It is easy to display error messages in other languages or add a character set.



Mastering MySQL 4
ISBN: 0782141625
Year: 2003
Pages: 230
Authors: Ian Gilfillan
