38.3. MyISAM-Specific Optimizations


38.3. MyISAM-Specific Optimizations

Several of the structural features of MyISAM tables enable you to optimize how you use them:

  • The MyISAM storage engine supports several different table storage formats that have differing characteristics. You can take advantage of these characteristics by choosing the storage format that best matches how you intend to use a table. For example, if you have a table that you'll only read and never update, you can make it a compressed table. It will take less disk space, and internal index optimizations might make retrievals faster. A dynamic-row table for which most queries are made against fixed-length columns can be split into fixed and dynamic tables.

  • Perform table maintenance operations to keep optimizer information up to date and eliminate wasted space.

  • Create FULLTEXT indexes to enable fast text searching.

  • Specify the MAX_ROWS table option to size the internal row pointers appropriately. Increase the pointer size to allow a table to contain more data than the default amount of 256TB. Decrease the pointer size for smaller tables to save on pointer storage.

  • To distribute disk activity, use table symlinking to move some of your MyISAM tables to different disks than the one where the data directory is located.

The following discussion covers most of the items in the preceding list. Section 41.2.2, "MyISAM Table Symlinking Table Symlinking," describes how to use table symlinking.

38.3.1. MyISAM Row-Storage Formats

The MyISAM storage engine supports three row formats for storing table contents. These row-storage formats have an impact on query efficiency. The three allowable formats are fixed-length, dynamic-length, and compressed:

  • With fixed-length row format, every row in a table has the same size. Consequently, every row in the table's data file is stored at a position that is a multiple of the row size. This makes it easier to look up rows, with the result that MySQL typically can process fixed-row tables more quickly than dynamic-row tables. However, fixed-row tables on average take more space than dynamic-row tables. Fixed-row tables are not very subject to fragmentation when deletes occur, because the hole left by any deleted row can be exactly filled by any new row.

  • With dynamic-length row format, rows in a table use varying amounts of storage. As a result, rows are not stored at fixed positions within data files. Each row has extra information that indicates how long the row is, and it's also possible for a row to be stored non-contiguously with different pieces in different locations. This makes retrievals more complex, and thus slower. Dynamic-row tables generally take less space than fixed-row tables. However, if a table is updated frequently, this storage format can result in fragmentation and wasted space. It can be useful to run OPTIMIZE TABLE from time to time to defragment the table.

  • Compressed tables are packed to save space and stored in optimized form that allows quick retrievals. Compressed tables are read-only, so this table format cannot be used for tables that will be updated. To create a compressed table, use the myisampack utility. It can create compressed tables from either fixed-row or dynamic-row MyISAM tables, and can compress columns of any data type.

Before MySQL 5, a MyISAM table could use fixed-row table format only if the table contained no columns with variable-length data types (VARCHAR, VARBINARY, TEXT, or BLOB). If any column had a variable-length type, dynamic-row format was used. Those rules describe the row format you should expect to see for MyISAM tables that were created with an older version of MySQL.

As of MySQL 5, fixed-row format can be used as long as the table does not contain any TEXT or BLOB columns.

To specify a row format explicitly for a new table, include a ROW_FORMAT table option in the CREATE TABLE statement. The value can be FIXED or DYNAMIC. The following statement creates t as a fixed-row table:

 CREATE TABLE t (c CHAR(50)) ROW_FORMAT = FIXED; 

To convert a MyISAM table from one format to another, use the ROW_FORMAT option with ALTER TABLE:

 ALTER TABLE t ROW_FORMAT = DYNAMIC; 

To determine what storage format a table has, use the SHOW TABLE STATUS statement and examine the value of the Row_format field:

 mysql> SHOW TABLE STATUS LIKE 'Country'\G *************************** 1. row ***************************            Name: Country          Engine: MyISAM         Version: 10      Row_format: Fixed            Rows: 239  Avg_row_length: 261     Data_length: 62379 Max_data_length: 1120986464255    Index_length: 5120       Data_free: 0  Auto_increment: NULL     Create_time: 2005-05-05 12:30:25     Update_time: 2005-05-05 12:30:25      Check_time: NULL       Collation: latin1_swedish_ci        Checksum: NULL  Create_options:          Comment:  

The Row_format value will be Fixed, Dynamic, or Compressed. (Only the myisampack utility can set the format to Compressed.)

You can also obtain storage format information from the INFORMATION_SCHEMA database:

 mysql> SELECT TABLE_NAME, ROW_FORMAT FROM INFORMATION_SCHEMA.TABLES     -> WHERE TABLE_SCHEMA = 'world'; +-----------------+------------+ | TABLE_NAME      | ROW_FORMAT | +-----------------+------------+ | City            | Fixed      | | Country         | Fixed      | | CountryLanguage | Fixed      | +-----------------+------------+ 

38.3.1.1 Using Compressed MyISAM Tables

A fixed-row or dynamic-row MyISAM table can be converted to compressed form to save storage space. In many cases, compressing a table improves lookup speed as well because the compression operation optimizes the internal structure of the table to make retrievals faster.

To be a good candidate for compression, a table should contain records that will not be updated in the future, such as archival data or log records. If you log records into different tables by year or month, for example, you can compress all the log tables except the one for the current year or month. To treat the tables as a single logical table, group them by using a MERGE table.

A compressed table is read-only, so a MyISAM table should be compressed only if its content will not change after it has been populated. If you must modify a compressed table, you can uncompress it, modify it, and compress it again. But if you have to do this often, the extra processing tends to negate the benefits of using a compressed table, especially because the table is unavailable for querying while it is being uncompressed and recompressed.

To compress a MyISAM table, use the myisampack utility. It's also necessary to use myisamchk afterward to update the indexes. The following example demonstrates how to perform this procedure, using the tables in the world database. Note: A table must not be in use by other programs (including the server) while you compress or uncompress it. The easiest thing to do is to stop the server while using myisampack or myisamchk.

1.

Back up the tables, just in case:

 shell> mysqldump world > world.sql 

2.

Stop the server so that it won't use the tables while you're packing them.

3.

Change location into the database directory where the world tables are stored, and then use myisampack to compress them:

 shell> myisampack Country City CountryLanguage Compressing Country.MYD: (239 records) - Calculating statistics - Compressing file 72.95% Compressing City.MYD: (4079 records) - Calculating statistics - Compressing file 70.94% Compressing CountryLanguage.MYD: (984 records) - Calculating statistics - Compressing file 71.42% Remember to run myisamchk -rq on compressed tables 

myisampack also understands index filenames as arguments:

 shell> myisampack *.MYI 

Using index filenames does not affect the way myisampack works. It simply gives you an easier way to name a group of tables, because you can use filename patterns.

4.

After compressing a table, you should run myisamchk to rebuild the indexes (as the final line of myisampack output indicates). Like myisampack, myisamchk understands index filename arguments for naming tables, so you can rebuild the indexes as follows:

 shell> myisamchk -rq *.MYI 

The equivalent long-option command is:

 shell> myisamchk --recover --quick *.MYI 

5.

Restart the server.

If you want to assess how effective a table-packing operation is, use SHOW TABLE STATUS before and after. (The server must be running when you use this statement.) The Data_length and Index_length values should be smaller afterward, and the Row_format value should change from Fixed or Dynamic to Compressed. The following examples show the results for the City table.

Before packing:

 mysql> SHOW TABLE STATUS FROM world LIKE 'City'\G *************************** 1. row ***************************            Name: City          Engine: MyISAM         Version: 10      Row_format: Fixed            Rows: 4079  Avg_row_length: 67     Data_length: 273293 Max_data_length: 18858823439613951    Index_length: 43008       Data_free: 0  Auto_increment: 4080     Create_time: 2005-06-09 11:53:30     Update_time: 2005-06-09 11:53:30      Check_time: NULL       Collation: latin1_swedish_ci        Checksum: NULL  Create_options:          Comment:  

After packing:

 mysql> SHOW TABLE STATUS FROM world LIKE 'City'\G *************************** 1. row ***************************            Name: City          Engine: MyISAM         Version: 10      Row_format: Compressed            Rows: 4079  Avg_row_length: 19     Data_length: 79418 Max_data_length: 281474976710655    Index_length: 30720       Data_free: 0  Auto_increment: 4080     Create_time: 2005-06-09 11:53:30     Update_time: 2005-06-09 11:53:30      Check_time: 2005-06-09 11:54:12       Collation: latin1_swedish_ci        Checksum: 2011482258  Create_options:          Comment:  

The results show that compressing the City table compressed the index moderately and that data storage requirements became less than a third of the uncompressed amount.

To uncompress a compressed table, use myisamchk in the database directory where the table files are located:

 shell> myisamchk --unpack table_name 

table_name should be either the table name or the name of its index (.MYI) file.

If you do not run myisampack or myisamchk in the database directory where the table files are located, you must specify the pathname to the files, using either absolute pathnames or pathnames relative to your current directory.

Another way to uncompress a table is to dump it, drop it, and re-create it. Do this while the server is running. For example, if the Country table is compressed, you can uncompress it with the following commands:

 shell> mysqldump world Country > dump.sql shell> mysql world < dump.sql 

By default, mysqldump output written to the dump file includes a DROP TABLE statement. When you process the file with mysql, that statement drops the compressed table, and the rest of the dump file re-creates the table in uncompressed form.

38.3.1.2 Splitting Dynamic-Row MyISAM Tables

If a dynamic-row MyISAM table contains a mix of fixed-length and variable-length columns but many of the queries on the table access only its fixed-length columns, it is sometimes possible to gain advantages both of fixed-row tables (faster retrieval) and of dynamic-row tables (lower storage requirements) by splitting the table into two tables. Use a fixed-row table to hold the fixed-length columns and a dynamic-row table to hold the variable-length columns. Use the following procedure to split a table into two tables:

1.

Make sure that the table contains a primary key that allows each record to be uniquely identified. (You might use an AUTO_INCREMENT column, for example.)

2.

Create a second table that has columns for all the variable-length columns in the original table, plus a column to store values from the primary key of the original table. (This column should be a primary key as well, but should not be an AUTO_INCREMENT column.) Specify the ROW_FORMAT = DYNAMIC option when you create the table.

3.

Copy the contents of the primary key column and the variable-length columns from the original table to the second table.

4.

Use ALTER TABLE to drop the variable-length columns (but not the primary key) from the original table. Include a ROW_FORMAT = FIXED option as well, to make sure the table is converted to fixed-row format.

After modifying the table structure this way, queries that retrieve only fixed-length columns can use the fixed-row table, and will be quicker. For queries that retrieve both fixed-length and variable-length columns, join the two tables using the primary key values to match up rows.

38.3.2. Keep Optimizer Information Up to Date

You can help the optimizer process queries for a MyISAM table more effectively if you keep the table's internal index statistics up to date. Use the ANALYZE TABLE statement for this:

 ANALYZE TABLE table_name; 

Update the index statistics after a table has been has been loaded initially, and periodically thereafter if the table continues to be modified.

If the table contains columns with variable-length data types such as BLOB, TEXT, or VARCHAR, updates and deletes can cause the table to become fragmented. Optimizing the table periodically reorganizes its contents by defragmenting it to eliminate wasted space and coalescing values that might have gotten split into non-contiguous pieces. To optimize a MyISAM table, use OPTIMIZE TABLE:

 OPTIMIZE TABLE table_name; 

For MyISAM tables, the OPTIMIZE TABLE statement also updates index statistics; you don't need to use ANALYZE TABLE if you already are using OPTIMIZE TABLE.

38.3.3. FULLTEXT Indexes

FULLTEXT searching is a feature that can be used with MyISAM tables. FULLTEXT indexes are designed to make text searching fast and easy. They have the following characteristics:

  • Each column in a FULLTEXT index must have a non-binary string data type (CHAR, VARCHAR, or TEXT). You cannot use binary string data types (BINARY, VARBINARY, or BLOB).

  • FULLTEXT indexes can be case sensitive or not, depending on the collation of the indexed columns.

  • The syntax for defining a full-text index is much like that for other indexes: an index-type keyword (FULLTEXT), an optional index name, and a parenthesized list of one or more column names to be indexed. A FULLTEXT index may be created with CREATE TABLE, added to a table with ALTER TABLE or CREATE INDEX, and dropped from a table with ALTER TABLE or DROP INDEX. The following are all legal statements for FULLTEXT index manipulation:

     CREATE TABLE t (name CHAR(40), FULLTEXT (name)); ALTER TABLE t ADD FULLTEXT name_idx (name); ALTER TABLE t DROP INDEX name_idx; CREATE FULLTEXT INDEX name_idx ON t (name); DROP INDEX name_idx ON t; 

    See Section 8.6, "Indexes," for general information on index-creation syntax.

  • Column prefixes are not applicable to FULLTEXT indexes, which always index entire columns. If you specify a prefix length for a column in a FULLTEXT index, MySQL ignores it.

  • FULLTEXT index indexes can be constructed on multiple columns, allowing searches to be conducted simultaneously on all the indexed columns. However, leftmost index prefixes are not applicable for FULLTEXT indexes. You must construct one index for every column or combination of columns you want to search. Suppose that you want to search for text sometimes only in column c1 and sometimes in both columns c1 and c2. You must construct two FULLTEXT indexes: one on column c1 and another on columns c1 and c2.

To perform a FULLTEXT search, use MATCH and AGAINST(). For example, to search the table t for records that contain 'Wendell' in the name column, use this query:

 SELECT * FROM t WHERE MATCH(name) AGAINST('Wendell'); 

The MATCH operator names the column or columns you want to search. As mentioned earlier, there must be a FULLTEXT index on exactly those columns. If you want to search different sets of columns, you'll need one FULLTEXT index for each set. If a table people has name and address columns and you want to search them either separately or together, three FULLTEXT indexes are needed:

 CREATE TABLE people (     name    CHAR(40),     address CHAR(40),     FULLTEXT (name),        # index for searching name only     FULLTEXT (address),     # index for searching address only     FULLTEXT (name,address) # index for searching name and address ); 

The indexes allow queries such as the following to be formulated:

 SELECT * FROM people WHERE MATCH(name) AGAINST('string'); SELECT * FROM people WHERE MATCH(address) AGAINST('string'); SELECT * FROM people WHERE MATCH(name,address) AGAINST('string'); 

For more information on FULLTEXT indexing and searching, see the MySQL Reference Manual.

38.3.4. Specifying MyISAM Maximum Row Count

Internally, the MyISAM storage engine represents pointers to rows within a table using values that take from two to seven bytes each. The size for a given table is determined at table-creation time, but can be changed with ALTER TABLE.

Before MySQL 5, the default pointer size was four bytes, which allows for up to 4GB of data in MyISAM tables . In MySQL 5, the default was increased to six bytes to better accommodate the trend toward use of larger tables. The six-byte size allows for up to 256TB of data. You can provide hints to MyISAM about how large the table might become, or set the pointer size directly. With larger pointer sizes, MyISAM tables can contain up to 65,536TB of data.

When you expect a table to contain many rows, MAX_ROWS is useful for telling MyISAM that it needs to use larger internal row pointers so that the amount of table data can be larger than the 256TB allowed by the default six-byte pointer size. Conversely, if you know a table will be small, specifying a small MAX_ROWS value tells MyISAM to use smaller pointers. This saves space and improves table processing efficiency.

To provide the server a hint when you create the table, specify an option in the CREATE TABLE statement that indicates how many rows the table must be able to hold. You can change the option later with ALTER TABLE should the table need to become larger.

To "pre-size" a table when you create it, use a MAX_ROWS option to indicate how many rows the table must be able to hold. The following statement indicates to MySQL that the table must be able to contain at least two million rows:

 CREATE TABLE t (i INT) MAX_ROWS = 2000000; 

If a table reaches the row limit allowed by its row pointer size, a data file full error occurs and you cannot add any more rows. This error is unrelated to running of out disk space or reaching the maximum file size allowed by MyISAM or the filesystem. It means that you need to increase the row pointer size. To set or change the MAX_ROWS value for an existing table, use ALTER TABLE:

 ALTER TABLE t MAX_ROWS = 4000000; 

MAX_ROWS = n does not place an absolute limit of n on the number of rows a table can contain. Rather, it means that the table must be able to contain a maximum of at least n rows. The table might well be able to hold more than n rows.

A related option, AVG_ROW_LENGTH, also gives the server information that it can use to estimate how large the table may become. This option might be helpful for tables with variable-length rows. It is unnecessary for tables with fixed-length rows because the server knows how long each row is.

The MAX_ROWS and AVG_ROW_LENGTH options may be used separately or together. For example, if a table has a BIGINT column (8 bytes each) and a VARCHAR(200) column where you expect the average string length to be 100 bytes, you can specify an AVG_ROW_LENGTH value of 108. If you also want to make sure that the table can hold four million rows, create it like this:

 CREATE TABLE t (i BIGINT, c VARCHAR(200)) AVG_ROW_LENGTH = 108 MAX_ROWS = 4000000; 

Note that using MAX_ROWS and AVG_ROW_LENGTH does not allow the size of MyISAM table files to be expanded beyond the limit of what the filesystem allows. For example, if you create a MyISAM table on a filesystem that only allows file sizes up to 2GB, you cannot add more rows once the data file or index file reaches 2GB, no matter what value you set MAX_ROWS to.

To determine the values of MAX_ROWS and AVG_ROW_LENGTH for a table, use SHOW TABLE STATUS and check the Create_options field of the output. If the field is empty, the options have never been set explicitly.

 mysql> SHOW TABLE STATUS LIKE 't'\G *************************** 1. row ***************************            Name: t          Engine: MyISAM         Version: 10      Row_format: Dynamic            Rows: 0  Avg_row_length: 0     Data_length: 0 Max_data_length: 4294967295    Index_length: 1024       Data_free: 0  Auto_increment: NULL     Create_time: 2005-06-09 12:00:01     Update_time: 2005-06-09 12:00:01      Check_time: NULL       Collation: latin1_swedish_ci        Checksum: NULL  Create_options: max_rows=4000000 avg_row_length=108         Comment:  

The default row pointer size is determined from the value of the myisam_data_pointer_size system variable. This variable has a value of 6 initially, but you can set it to any value from 2 to 7. For example, if you routinely create tables that must be larger than 256TB, make the value larger than 6. You can do this at server startup by setting the value in an option file. The following lines increase the default size to seven bytes:

 [mysqld] myisam_data_pointer_size=7 



MySQL 5 Certification Study Guide
MySQL 5.0 Certification Study Guide
ISBN: 0672328127
EAN: 2147483647
Year: 2006
Pages: 312

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net