Optimal Data Storage

 < Day Day Up > 

Prior to exploring how you can best configure MyISAM's data storage features, you can look at several key administrative reports and utilities at your disposal.

Table Reporting and Control

Recall that to help get a better understanding of the information MySQL tracks about MyISAM tables, the first part of Chapter 7, "Indexing Strategies," examined the SHOW INDEX output for a sample table, filled with 1.25 million rows, and then updated MySQL's internal statistics via OPTIMIZE TABLE. Of course, this command does much more than simply analyze and update statistics, including repairing any damage to the table as well as sorting index pages. To simply refresh statistics, you can use ANALYZE TABLE instead. In either case, here is the structure for this table:

 CREATE TABLE demo_show_index (         col1 INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,         col2 VARCHAR(30) NOT NULL,         col3 DATE NOT NULL,         col4 ENUM ('Mercury', 'Venus', 'Mars', 'Earth', 'Jupiter') NOT NULL,         col5 TEXT NOT NULL,          col6 DATETIME NOT NULL,         col7 BLOB NOT NULL,         INDEX (col2),         INDEX (col3),         INDEX (col4),         FULLTEXT (col5),         INDEX (col6),         INDEX (col7(150)),         INDEX (col3,col2) ) ENGINE = MYISAM; 

The following is a portion of the output from the SHOW INDEX command for this table:

 mysql> SHOW INDEX FROM demo_show_index\ G *************************** 1. row ***************************        Table: demo_show_index   Non_unique: 0     Key_name: PRIMARY Seq_in_index: 1  Column_name: col1    Collation: A  Cardinality: 1250000     Sub_part: NULL       Packed: NULL         Null:   Index_type: BTREE      Comment: 

Although this is helpful, you can get a much more informative report by running the myisamchk utility. Remember that myisamchk is designed to work only when the mysqld process is not running; otherwise you may corrupt your tables:

 myisamchk -i demo_show_index Checking MyISAM file: demo_show_index Data records: 1250000   Deleted blocks:       0 - check file-size - check record delete-chain - check key delete-chain - check index reference - check data record references index: 1 Key:  1:  Keyblocks used:  98%  Packed:    0%  Max levels:  4 - check data record references index: 2 Key:  2:  Keyblocks used:  54%  Packed:   84%  Max levels:  4 - check data record references index: 3 Key:  3:  Keyblocks used:  81%  Packed:    0%  Max levels:  4 - check data record references index: 4 Key:  4:  Keyblocks used:  87%  Packed:    0%  Max levels:  3 - check data record references index: 5 Key:  5:  Keyblocks used:  83%  Packed:    0%  Max levels:  4 - check data record references index: 6 Key:  6:  Keyblocks used:  62%  Packed:    6%  Max levels:  9 - check data record references index: 7 Key:  7:  Keyblocks used:  68%  Packed:   70%  Max levels:  4 - check data record references index: 8 Key:  8:  Keyblocks used:  50%  Packed:   96%  Max levels:  4 Total:    Keyblocks used:  56%  Packed:   93% - check record links Records:           1250000    M.recordlength:      885   Packed:             0% Recordspace used:      100%   Empty space:           0%  Blocks/Record:   1.00 Record blocks:     1250000    Delete blocks:         0 Record data:    1106732346    Deleted data:          0 Lost space:         937638    Linkdata:        4687784 myisamchk  -verbose  -description demo_show_index MyISAM file:         demo_show_index Record format:       Packed Character set:       latin1_general_ci (8) File-version:        1 Creation time:       2004-11-26 19:28:59 Recover time:        2005-05-27  5:40:42 Status:              checked,analyzed,sorted index pages Auto increment key:              1  Last value:               1250000 Data records:              1250000  Deleted blocks:                 0 Datafile parts:            1250000  Deleted data:                   0 Datafile pointer (bytes):        4  Keyfile pointer (bytes):        4 Datafile length:        1112357768  Keyfile length:         950180864 Max datafile length:    4294967294  Max keyfile length: 4398046510079 Recordlength:                   67 table description: Key Start Len Index   Type                     Rec/key         Root  Blocksize 1   1     4   unique  unsigned long                  1         1024       1024 2   5     30  multip. char packed stripped         625     10260480       1024 3   35    3   multip. uint24                       579     22009856       1024 4   38    1   multip. binary                    250000     32770048       1024 5   49    8   multip. ulonglong                      1     39958528       1024 6   57    150 multip. varbin prefix BLOB             1     57952256       1024 7   35    3   multip. uint24 prefix                579    344858624       1024     5     30          char stripped                  1 8   5     254 fulltext varchar packed              8621    364686336       2048     1     4           float                          0 

As you can see, these reports provide more extensive details about the table and its indexes than the SHOW INDEX command reviewed earlier. In fact, you'll get even more information if you pass in added "v" characters along with the dv directive (such as dvv). Additional items of interest from a performance point of view include the following:

  • Status You can see that this table has already received an integrity check as well as a full statistical analysis. In addition, the table has been sorted by the primary key.

  • Deleted blocks Recall that a MyISAM table can end up in a fragmented state after a period of data modifications in which blocks in the middle of the table are deleted. You can remove this fragmentation with OPTIMIZE TABLE. However, this is a new table that has not had any data alterations, so there aren't any deleted blocks.

  • Index key compression The packed portion of the report shows the percentage compression that MySQL was able to achieve on this key.

  • Index depth The max_levels section reports on the depth of the index B-tree the number of non-leaf index nodes before reaching the leaf node. In the case of key number six, which is a large BLOB column that contains all unique values, the index is nine levels deep.

  • Datafile parts and Data records The ratio of these two numbers can provide insight as to the degree of fragmentation for this table. For example, if the number of Datafile parts significantly exceeds the number of Data records, then there is a good chance that the table is fragmented. On the other hand, equivalency between these two values means that the table has no fragmentation.

Storing a Table in Index Order

If you frequently retrieve large ranges of indexed data from a table or consistently sort results on the same index key, you might want to consider running myisamchk with the -sort-records option. Doing so tells MySQL to store the table's data in the same physical order as the index, and can help speed these kinds of operations. Alternatively, you can combine the ALTER TABLE statement with an ORDER BY a particular column option to achieve the same results.

For example, imagine that you have a table containing millions of rows of frequent flyer detail information:

 CREATE TABLE customer_mileage_details (         customer_id INT NOT NULL,         ff_number CHAR(10) NOT NULL,         transaction_date DATE NOT NULL,         mileage SMALLINT NOT NULL,         INDEX (customer_id),         INDEX (ff_number, transaction_date) ) ENGINE = MYISAM; 

As you analyze usage patterns for the table, you realize that users typically employ the second index when running queries that locate and retrieve large blocks of consecutive rows. To speed access, you choose to have the table sorted by the second index:

 myisamchk  v  R 2 

From this point forward, all rows in the table will be stored in ff_number, transaction_date order until new rows are added. If you want to make this sequencing more permanent, consider running this command periodically.

This is just a fraction of the information that myisamchk provides. Although a full exploration of this vital utility is beyond the scope of this book, it's a good investment for any MySQL administrator to spend the time to learn more about myisamchk.

Table Maintenance and Repair

Table and index maintenance and repair are vital functions. The following sections look at a few examples of how to improve the speed of these necessary operations.

Compressing Tables

MySQL features a number of internal compression algorithms that can shrink the amount of space required by certain fields within indexes. However, administrators can also elect to compress an entire table by using the myisampack utility. Note that this option is only available for read-only tables, so a heavily updated transactional table is not a good candidate for compression.

Compression provides two main benefits. First, if your query uses either a primary key or unique index to locate a row, MySQL only decompresses the specific row in question. This happens very quickly and efficiently. Secondly, a compressed table consumes less disk space, freeing up resources for other uses.

If you still want to realize the benefits of compression from a frequently modified table, one idea is to occasionally create an empty MyISAM table, insert relevant records from your transactional table, and then compress the newly filled table. You can also elect to decompress a MyISAM table, fill it with additional data, and then rerun myisampack to recompress the table. However, these steps may be too time consuming if the table is very large or very frequently updated.

Parallel Index Creation

Although currently considered an alpha-quality feature, you may elect to have myisamchk build indexes in parallel. To do so, pass in the p or --parallel-recover parameters. Another option would be to boost the value of the myisam_repair_threads server setting, which will affect the behavior of the REPAIR TABLE statement, forcing it to attempt index rebuilding in parallel.

In terms of choosing a value, don't raise this value to more than the number of CPUs found on your database server.

Defragmenting Tables

Over time, tables that feature variable length rows (such as those containing TEXT, VARCHAR, or BLOB columns) can become fragmented. This happens because these variable-length rows often span noncontiguous pages if there isn't available space on consecutive pages. Locating and retrieving this dispersed information taxes the database server. Fortunately, administrators have at least options to defragment these types of tables:

  • The OPTIMIZE TABLE command Periodically running this command causes MySQL to restructure the table, reclaiming space and removing fragmentation at the same time. Note that this statement locks the table for the duration of the process, so be mindful of when you choose to initiate defragmentation.

  • The myisamchk utility Like its preceding counterpart, you can use myisamchk to reclaim space and defragment the table. However, to run this command, you must shut down the database server, possibly inconveniencing your users.

  • The mysqlchk utility The main drawback of myisamchk is that you must shut down the mysqld process to avoid potential corruption. However, mysqlchk addresses this limitation, letting you keep your database server active while still defragmenting its tables.

Controlling Sorting Resources

MyISAM lets administrators dictate the amount of resources available for a number of important database operations, including sorting. The following list describes two ways to affect this behavior:

  • Setting up a memory buffer for sorting or creating indexes Provide a value for myisam_sort_buffer_size to control the amount of memory available to MySQL when either building a new index or sorting an index during table repairs.

    Because this variable is configurable at both the GLOBAL and SESSION levels, it's a good idea to leave the setting at its default globally; you can then raise it substantially for a session if you're about to embark on a significant table repair or big index creation operation.

  • Determining temporary storage selection during index creation This book has continually touted the fast processing benefits of working in memory versus on disk. However, there is at least one exception in which your goal should be to use the disk. This happens when you create or repair an index.

    If MyISAM estimates that its temporary storage needs will exceed the value of myisam_max_sort_file_size, it performs the sorting necessary to build the index via the memory-based key cache. If you are experiencing sluggish performance when undertaking an index build or rebuild, try boosting this setting. However, be certain you have enough disk space before forcing MySQL to use a disk-based sort file.

    Finally, if you want to ensure that MySQL uses your requested disk-based approach, make sure that the myisam_max_extra_sort_file_size variable is not set too low. In fact, it should be set to the same value (or even larger) as myisam_max_sort_file_size. Otherwise, MySQL might still elect to use the more costly key cache method of index creation. This determination is made as follows: If the sort file is larger than the key_buffer_size setting by myisam_max_extra_sort_file_size bytes, then use the slower keycache method.

    How can you tell if you have indeed forced MySQL to use the disk-based method? Simply run either the MySQL Administrator or the SHOW PROCESSLIST command:

     *************************** 3. row ***************************      Id: 4    User: enorton    Host: db-server1      db: high_hat Command: Query    Time: 439   State: Repair by sorting    Info: create fulltext index cl_ix1 on customer_letters(letter_body) 

    If you have configured things correctly, you will see "Repair by sorting" in the "State" column rather than "Repair by Keycache." For administrators using myisamchk, you may force this behavior with either the n/--sort-recover or p/--parallel-recover options.

    The correct settings can make an enormous difference when building a large index. This chapter's section on FULLTEXT indexing provides a dramatic example of this.

     < Day Day Up > 


    MySQL Database Design and Tuning
    MySQL Database Design and Tuning
    ISBN: 0672327651
    EAN: 2147483647
    Year: 2005
    Pages: 131

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net