29.2. The MyISAM Engine

29.2. The `MyISAM` Engine

The MyISAM storage engine manages tables that have the following characteristics:

On disk, MySQL represents each MyISAM table using three files: a format file that stores the definition of the table structure, a data file that stores the contents of table rows, and an index file that stores any indexes on the table. These files are distinguished from one another by their suffixes. For example, the format, data, and index files for a table named mytable are called mytable.frm, mytable.MYD, and mytable.MYI. MySQL normally stores all three files in the database directory for the database that contains the table. On systems that support appropriate symlinking capabilities, MyISAM table data and index files can be placed in a different location than the database directory.
MyISAM has the most flexible AUTO_INCREMENT column handling of all the storage engines.
MyISAM tables can be used to set up MERGE tables.
MyISAM tables can be converted into fast, compressed, read-only tables to save space.
MyISAM supports FULLTEXT searching and spatial data types.
MySQL manages contention between queries for MyISAM table access using table-level locking. Query performance is very fast for retrievals. Multiple queries can read the same table simultaneously. For a write query, an exclusive table-level lock is used to prevent use of the table by other read or write queries, leading to reduced performance in environments with a mix of read and write queries. Deadlock cannot occur with table-level locking. (Deadlock occurs when two or more queries are blocked, or stopped from completing, because each is waiting for one of the others to finish.)
You can influence the scheduling mechanism for queries that use MyISAM tables by using a query modifier such as LOW_PRIORITY or HIGH_PRIORITY. Inserts into a table can be buffered on the server side until the table isn't busy by using INSERT DELAYED; this allows the client to proceed immediately instead of blocking until the insert operation completes.
The table storage format is portable, so table files can be copied directly to another host and used by a server there. (The conditions for MyISAM portability are given at Section 32.3.4, "Conditions for Binary Portability.")
You can specify that a MyISAM table must be able to hold at least a certain number of rows, which allows MyISAM to adjust the table's internal row pointer size accordingly. It's also possible to configure the default pointer size that the server uses.
When loading data into an empty MyISAM table, you can disable updating of non-unique indexes and enable the indexes after loading. This is faster than updating the indexes for each row inserted. In fact, when LOAD DATA INFILE is used for loading an empty MyISAM table, it automatically disables and enables index updating. LOAD DATA INFILE is faster than INSERT anyway, and this optimization speeds it up even more.
If you run out of disk space while adding rows to a MyISAM table, no error occurs. The server suspends the operation until space becomes available, and then completes the operation.

MyISAM tables use the indexed sequential access method for indexing, as did the older ISAM table format. MyISAM offers better performance and more features than ISAM, so MyISAM is preferred over ISAM, and ISAM is unavailable as of MySQL 5.

MyISAM was introduced in MySQL 3.23.0 and has been the built-in default storage engine since (although you can change the default engine at server startup or while the server runs). Because MyISAM is the built-in default engine, it is always available and cannot be disabled.

29.2.1. `MyISAM` Locking Characteristics

MyISAM locking occurs at the table level. This is not as desirable as page or row locking for concurrency in a mixed read/write environment. However, deadlock cannot occur with table locking as it can with page or row locking.

When processing queries on MyISAM tables, the server manages contention for the tables by simultaneous clients by implicitly acquiring any locks it needs. You can also lock tables explicitly with the LOCK TABLES and UNLOCK TABLES statements. Explicit table locking has concurrency and performance advantages over implicit locking in certain situations, as discussed in Chapter 28, "Locking."

MyISAM tables support concurrent inserts. If a MyISAM table has no holes in the middle resulting from deleted or updated records, inserts always take place at the end of the table and can be performed while other clients are reading the table. Concurrent inserts can take place even for a table that has been read-locked explicitly if the locking client acquired a READ LOCAL lock rather than a regular READ lock.

If a table does have holes, concurrent inserts cannot be performed. However, you can remove the holes by using OPTIMIZE TABLE to defragment the table. (Note that a record deleted from the end of the table does not create a hole and does not prevent concurrent inserts.)

For applications that use MyISAM tables, you can change the priority of statements that retrieve or modify data. This can be useful in situations where the normal scheduling priorities do not reflect the application's requirements.

By default, the server schedules queries for execution as follows:

Write requests (such as UPDATE and DELETE statements) take priority over read requests (such as SELECT statements).
The server tries to perform write requests in the order that it receives them.

However, if a table is being read from when a write request arrives, the write request cannot be processed until all current readers have finished. Any read requests that arrive after the write request must wait until the write request finishes, even if they arrive before the current readers finish. That is, a new read request by default does not jump ahead of a pending write request.

When working with MyISAM tables, certain scheduling modifiers are available to change the priority of requests:

The LOW_PRIORITY modifier may be applied to statements that update tables (INSERT, DELETE, REPLACE, or UPDATE). A low-priority write request waits not only until all current readers have finished, but for any pending read requests that arrive while the write request itself is waiting. That is, it waits until there are no pending read requests at all. It is therefore possible for a low-priority write request never to be performed, if read requests keep arriving while the write request is waiting.
HIGH_PRIORITY may be used with a SELECT statement to move it ahead of updates and ahead of other SELECT statements that do not use the HIGH_PRIORITY modifier.
DELAYED may be used with INSERT (and REPLACE). The server buffers the rows in memory and inserts them when the table is not being used. Delayed inserts increase efficiency because they're done in batches rather than individually. While inserting the rows, the server checks periodically to see whether other requests to use the table have arrived. If so, the server suspends insertion of delayed rows until the table becomes free again. Using DELAYED allows the client to proceed immediately after issuing the INSERT statement rather than waiting until it completes.

Consider an application consisting of a logging process that uses INSERT statements to record information in a log table, and a summary process that periodically issues SELECT queries to generate reports from the log table. Normally, the server will give table updates priority over retrievals, so at times of heavy logging activity, report generation might be delayed. If the application places high importance on having the summary process execute as quickly as possible, it can use scheduling modifiers to alter the usual query priorities. Two approaches are possible:

To elevate the priority of the summary queries, use SELECT HIGH_PRIORITY rather than SELECT with no modifier. This will move the SELECT ahead of pending INSERT statements that have not yet begin to execute.
To reduce the priority of record logging statements, use INSERT with either the LOW_PRIORITY or DELAYED modifier.

If you use DELAYED, keep the following points in mind:

Delayed rows tend to be held for a longer time on a very busy server than on a lightly loaded one.
If a crash occurs while the server is buffering delayed rows in memory, those rows are lost.

The implication is that DELAYED is more suitable for applications where loss of a few rows is not a problem, rather than applications for which each row is critical. For example, DELAYED can be appropriate for an application that logs activity for informational purposes only and for which it is not important if a small number of rows is lost.

29.2.2. `MyISAM` Row-Storage Formats

The MyISAM storage engine has the capability of storing rows in three formats: fixed-row, dynamic-row, and compressed. These formats have differing characteristics:

Fixed-row format:

All rows have the same size.
Rows are stored within the table at positions that are multiples of the row size, making them easy to look up.
Fixed-size rows take more space.

Dynamic-row format:

Rows take varying amounts of space.
Rows cannot be looked up as efficiently.
Dynamic-rows tables usually take less space because rows are not padded to a fixed size.
Fragmentation can occur more easily than for fixed-row tables.

Compressed format:

Tables are packed to save space.
Storage is optimized for quick retrieval.
Tables are read-only.

For more information, see Section 38.3.1, "MyISAM Row-Storage Formats Row-Storage Formats," which also describes how to use the myisampack utility to create compressed MyISAM tables.