Section 22.2. MySQL Backup and Recovery Methodologies | Backup & Recovery: Inexpensive Backup Solutions for Open Systems

22.2. MySQL Backup and Recovery Methodologies

There are two methodologies you can use to back up MySQL databases without spending money. The first is to create a series of SQL statements that rebuild one or more databases. This is usually accomplished via the mysqldump script. The second is to copy the database files themselves if you can ensure they're not being changed during the backup. This is currently possible only for MyISAM archive tables using the mysqlhotcopy script.

Whichever backup method you choose, you should ensure that you have enabled the binary log prior to beginning backups. This allows you to perform a point-in-time recovery if you choose to do so.

If you're using InnoDB tables, you can also purchase a commercial hot backup tool from Oracle. This tool is not covered in this book, but it offers a number of advantages that help warrant its price.

22.2.1. SQL-Level Backup and Recovery

mysqldump is designed to automate the creation of a SQL-level backup of one or more databases in a MySQL environment. It produces a human-readable text file containing all the SQL commands needed to recover MySQL.

To do a SQL-level dump, you should lock all tables before you start and unlock them when you're done. This can be done using SQL commands.

If you're backing up MyISAM tables, you can use the --lock-tables or --lock-all-tables options to mysqldump. The --lock-tables option locks tables one at a time and so does not guarantee table consistency across multiple tables. The --lock-all-tables option locks all tables in all databases during the time of the backup, ensuring consistency across multiple databases; however, this may have a greater impact on your applications because the lock remains in effect for a much longer time. Although the databases stay up while backing up with these options, the locking of tables can have a dramatic impact on the usability of the database. For example, inserts, updates, and deletes are not allowed to execute when a table is locked. They lock, or freeze, and then execute once the backup is completed.

If you're backing up InnoDB tables, you should use the --single-transaction option instead. It creates a short lock in the beginning of the dump but maintains consistency through the rest of the backup without having to lock any tables.

The --lock-tables, --lock-all-tables, and --single- TRansaction options are mutually exclusive. Specifying more than one of these options may cause your dump to fail. It may also cause one or more of the specified options to be ignored.

If you are using the binary log to restore your database up to the point of failure, you'll also want to add --flush-logs and --master-data=2. Flushing the logs when you make your full backup starts a new binary logfile, ensuring that the new binary log starts with SQL statements that were issued since the backup. The --master-data=2 option ensures the dump file contains the name of the new current binary log.

22.2.1.1. Backing up MyISAM tables

The following command creates a SQL dump of all MySQL databases in a given instance, locks all tables as necessary, flushes the logs, and writes the name of the current binary log to the output. The second command causes MySQL to switch binary logfiles so that you can back up the logfile that was used during the backup.

$ mysqldump --all-databases --lock-all-tables --flush-logs --master-data=2  > all_databases.sql $ mysqladmin --flush-logs

22.2.1.2. Backing up InnoDB tables

If you are backing up InnoDB tables, you should replace the --lock-all-tables option with -- single-transaction. This treats the mysqldump process as a transaction, complete with begin transaction and end transaction statements. This first command backs up all databases, flushes the logs, and write the name of the current binary log to the output. The second command backs up just the mysql database using the --lock-all-tables option, which is the proper way to back up that database because it uses the MyISAM storage engine. The third command causes MySQL to switch binary logfiles so that you can back up the logfile that was used during the backup.

$ mysqldump --all-databases --single-transaction --flush-logs --master-data=2  > all_databases.sql $ mysqldump --database=mysql --lock-all-tables --flush-logs --master-data=2  > system.sql $ mysqladmin --flush-logs

22.2.1.3. Repairing corrupted MyISAM tables

Databases are restored for two reasons. The first reason is physical damage; the database was deleted, or the device on which it was being stored was damaged. The second reason is logical damage; one or more tables in the database got corrupted. If you have corrupted MyISAM tables, you should try to repair the tables before restoring them. It should be faster. The following command fixes most problems. First, cd to the directory that holds the MyISAM datafiles, then run this:

$ myisamchk -r -q table_name

If that doesn't work, try this command; it fixes some problems that the quick repair just shown does not:

$ myisamchk --safe-recover table_name

22.2.1.4. SQL-level MySQL restores

Restoring MySQL databases that were backed up with mysqldump is easy. Simply tell the mysql command to read the file:

$ mysql u username p < all_databases.sql

Of course, you can edit the output of mysqldump, deleting any databases or tables that you don't want to restore.

If you want to apply the changes made to the database since the time of the backup, follow the instructions in the section "Using Point-in-Time Recovery" later in this chapter.

22.2.2. File-Level Backup and Recovery

Since MyISAM tables are ultimately stored in files on the filesystem, you can copy them using a backup program if you can ensure that they're not being changed.

22.2.2.1. Build your own file-level backup

There are dozens of ways that you can use this method to back up your MySQL databases, but they're all essentially the same: get MySQL to stand still for a moment while you copy its datafiles. This can be done in three different ways:

Flush tables and logs, get a read lock on the tables, back up the datafiles, then unlock the tables.
Flush tables and logs, get a read lock on the tables, make a snapshot, then unlock the tables.
Shut down the database, back up the datafiles (or take a snapshot), then start the database.

If you're going to back up MyISAM files live, you need to make sure MySQL isn't changing them. To do this, make sure the database has been flushed to disk and that you've obtained a read lock on the appropriate tables. Then you can do whatever you want to back up the files.

To obtain a read lock and flush MySQL to disk, issue the following commands:

mysql> flush tables with read lock; mysql> flush logs

Once you've done that, you can do whatever you want to create the backup, realizing that any inserts, updates, or deletes will be blocked during this operation. (Therefore, you might want to do the backup as quickly as possible, depending on your environment.) Suggestions offered by the MySQL community include a typical backup or the creation of a snapshot.

You can do the typical backup just about any way you want to. You can create a tar or cpio backup image on disk or tape. You can also just run your typical open-source or commercial backup automation program.

Make sure to unlock the tables when you're done:

mysql> unlock tables;

Another interesting idea is to use the Linux Logical Volume Manger (LVM) to quickly create a snapshot. This allows you to take a quick virtual backup, then take as long as you want to back up that snapshot to other media. You still need to use mysql to lock the tables and flush the logs:

mysql> flush tables with read lock; mysql> flush logs

Next, tell LVM to create the snapshot, where somename is a string you assign to the snapshot, somesize is how large you allow the cache to get for this snapshot, and volgroup/lvol is the name of the volume group and logical volume, such as vg01/lvol1:

# lvcreate --snapshot --size=somesize --name=somename /dev/volgroup/lvol

Once you've created the snapshot, mount it:

# mkdir -p /mnt/somename # mount -o -ro /dev/volgroup/somename /mnt/somename

Once you've done that, you can take all the time you want to back up /mnt/ somename, unmounting it when you're done. Then remove the snapshot with the following command:

# lvremove -f /dev/volgroup/somename

Finally, make sure you unlock the tables when you have finished. The following command does that:

mysql> unlock tables;

You could, of course, also perform a cold backup. That is, you could back up the database while it is down. The only preparation for a cold backup is to stop mysqld. Once your backup is done, or a snapshot is taken, you can start up the database again.

22.2.2.2. File-level backup with mysqlhotcopy

If you're using MyISAM or Archive tables, the mysqlhotcopy command automates the creation of a file-level backup for you. It locks the database and copies the database files to another location. To execute a mysqlhotcopy backup, run the following command, where database_name is the database you wish to back up, and somedirectory is the name of the directory you want to back up to:

$ mysqlhotcopy database_name /somedirectory

One challenge with this method is that the default authentication method requires you to pass the username and password on the command line, which is not very secure. A workaround for this is to edit the [client] subheader in the my.cnf file to include the following lines:

[client] username=root password=yourpassword

22.2.2.3. Restoring from file-level backups

Restoring from file-level backups is easy. All you have to do is stop mysqld and follow your typical restore procedure to restore the files from the backup system into your mysql directory. One downside to this method is that there's no easy way to merge data from such backups into existing tables. You can restore them to a different directory and run mysqldump against them to create SQL statements to merge them.

If you want to apply the changes made to the database since the time of the backup, follow the instructions in the next section.

MySQL Projects

The MySQL community is working on a few things for the future. One is an online backup tool designed to make mysqldump and mysqlhotcopy obsolete. It is at least a year away from being generally available as of this writing (hopefully, it will be available in 2007), but you can read about it at http://forge.mysql.com/wiki/OnlineBackup. Another project is another ACID-compliant storage engine code-named Falcon, which should also be generally available sometime in 2007.

22.2.3. Using Point-in-Time Recovery

If you enabled the binary log, you have the option of a point-in-time recovery, regardless of which backup and recovery method you choose. If the binary log is running, you will find a series of files using the string that you specified in the --log-bin= base_name argument to mysqld. For example, if you specified /backups/binarylog as your base name, you will find a series of logs in /backups named binarylog.001, binarylog.002, and so on.

The typical practice for the binary log is to specify as the base name the same directory as datadir, where database files are kept. I prefer to store this important log in another directory, perhaps on another filesystem, along with the output from mysqldump or any other backup utilities.

There are two ways to apply the binary logfiles to a restored database. The first is to simply apply them directly without converting them to text. The second option is to convert them to text, possibly edit the text file, and apply them as SQL commands.

The first thing that you do, of course, is restore the database using one of the methods covered earlier in the chapter. Use mysqldump, mysqlhotcopy, or your own custom method to back up and recover the database in question. Once that is done, the database has been restored to the point in time the backup was taken. What you need then are all binary logs that were modified since that point.

22.2.3.1. Directly applying binary logs

The simplest method of applying binary logs to an already restored database is to pipe them into mysql. Suppose, for example, you had two binary logs that have been modified since your last backup, and they are called /backups/binarylog.093 and /backups/binarylog.094. All you have to do is tell the mysqlbinlog command to process the logs, and to pass the results of that process directly to mysql:

$ mysqlbinlog --database=database /backups/binarylog.093 /backups/binarylog.094|mysql

This command passes to mysqlbinlog's output (and mysql's input) a series of SQL statements that reflect changes that were made to the specified database since the point in time that the database was restored.

22.2.3.2. Applying binary logs via temporary SQL files

Since mysqlbinlog sends to standard output the SQL commands it finds in each log, you can also create a temporary file that contains those SQL commands using this command:

$ mysqlbinlog --database=database /backups/binarylog.093 \ /backups/binarylog.094 >/backups/myredo.sql

That temporary file can then be passed to mysql, resulting in the SQL statements being reexecuted:

$ mysql < /backups/myredo.sql

These two commands accomplish the same thing as the single command in the previous section, but they also allow you to edit the SQL file if you so desire.

22.2.3.3. Applying binary logs using date ranges

Sometimes you don't want the database restored all the way to the point of failure, especially when the "failure" was a human one. For example, suppose a DBA accidentally dropped a very important table, and you were not able to roll that transaction back. (If you use MyISAM tables, for example, you can't roll transactions back.)

What you can do is restore the database to the last full backup, then apply the binary logs until right before you dropped the table. There are two ways to accomplish this. If you know exactly when the drop table command was issued, you can pass a date range to mysqlbinlog:

$ mysqlbinlog --start-date="2006-07-09 15:00:00" --stop-date="2006-07-10 15:00:00" \ /backups/binarylog.093 /backups/binarylog.094 |mysql

Another way would be to use the command from the previous section that creates a text file containing all the SQL statements from the bin log:

$ mysqlbinlog /backups/binarylog.093 /backups/binarylog.094 >/backups/myredo.sql

Look at the text file /backups/myredo.sql to determine the position number of the drop table command. (Position numbers are logged in the binary log starting with the string log_pos.) Identify the position number of the bad command, and tell mysqlbinlog to stop before that position:

$ mysqlbinlog --start-position="337280" --stop-position="337302" \ /backups/binarylog.093 /backups/binarylog.094 |mysql

Finally, you can also create the text file with all the SQL commands and simply edit out the one command you don't like. You can then pass the remaining SQL statements to mysql. First create the file:

$ mysqlbinlog /backups/binarylog.093 /backups/binarylog.094 >/backups/myredo.sql

Edit it with the text editor of your choice, and pass the remaining commands to mysql:

$ cat /backups/myredo.sql|mysql

22.2.4. MySQL Cluster Hot Backup and Recovery

Since MySQL cluster databases actually reside on several nodes, backups work a bit differently. A single command causes each cluster node to write three files to a specified directory. Once you've received notification that the backup is complete, you can back up or copy the three files using any method you wish.

As mentioned previously, the backup consists of three files that contain different types of data. The names of these backup files start with BACKUP and contain the <backup_id> and <node_id> to which that file belongs. The <backup_id>is a unique identifier for each backup that is assigned by the backup process, and <node_id> is the unique identifier for each node that creates a backup file.

Metadata is stored in the BACKUP-<backup_id>.<node_id>.ctl file.

This backup file contains the names and definitions of all database tables. Each node backs up the definitions for all tables in the cluster to its metadata backup file.

Table records are stored in the BACKUP-<backup_id>-0.<node_id>.datafile.

This backup file contains the actual data stored in the database (at the time the backup was made). Each node stores a fragment of the entire database, and the backup file starts with a header specifying the tables to which the records in a given backup file belong.

The transaction log is stored in the BACKUP-<backup_id>.<node_id>.logfile.

This backup file contains a transactional record specifying how data was stored in the database. Each node backs up only the transaction log for the tables that it backed up in the table records backup file.

22.2.4.1. Initial configuration

If you want to perform a MySQL cluster backup, there are four configuration parameters in MySQL. Their default values typically work for most environments.

BackupDataBufferSize: The amount of memory used when writing data to disk.
BackupLogBufferSize: The amount of memory that should be used for log records before they're written to disk.
BackupMemory: The total amount of memory assigned to backup; it should be the sum of BackupDataBufferSize and BackupLogBufferSize.
BackupWriteSize: The block size of data written to disk, which applies to all parts of the backup.

Optionally, you can specify a value for BackupDataDir to specify that all cluster backup files be sent to this directory. One idea is to use an NFS directory and send there all backup datafiles using this parameter. If you do not specify a value for BackupDataDir, backups are placed in a subdirectory called BACKUP in the location specified by the FileSystemPath parameter.

22.2.4.2. Performing a backup of MySQL cluster

Performing a backup is relatively easy: you essentially have to run only one command and watch for its output:

Start the backup by issuing this command at the shell prompt:

                             ndb_mgm e "start backup"

Once the management client has submitted a request to the cluster to start the backup, it responds with the phrase Start of backup ordered. At this point it has submitted only a request, and it has not received a response from the cluster.
Once the backup actually starts, the management client responds that Backup backup_id started, where backup_id is the unique identifier assigned to this backup. As mentioned previously, this unique identifier is used as part of the name of the backup files. At this point, the backup has started, but it has not finished.

This backup may take a while to complete. If you want to cancel a backup you have started, enter the command:

ndb_mgm e "abort backup backup_id"

where backup_id is the unique identifier you were previously given.

Once the backup is completed, the management client responds with the message Backup backup_id completed.
At this point, each node has created the three files mentioned earlier in the directory you previously specified.

22.2.4.3. Restoring a MySQL cluster

MySQL cluster backups are restored with the backups mentioned previously using the ndb_restore command. The command must be executed once for each set of three backup files. In other words, it must be run as many times as there are nodes in the cluster.

To perform a MySQL cluster restore, the cluster must be operational, and you should have an empty database to restore to. There must also be a free connection to the cluster to perform the restore. This can be verified by issuing the ndb_mgm e show command.

The following steps can be used to restore a MySQL cluster database:

Run the command ndbd --initial on all storage nodes.
Place the cluster in single user mode.
Once you have an empty database to restore to and have verified that you can connect to the database, it's time to restore it.
Run ndb_restore and tell it to restore the first node. Specify the host and port number of the MySQL Cluster Management Server with the c mgm_host:port option, followed by the first node in the cluster (for example, n 1).

You can actually restore your nodes in any order. Starting with the first node just keeps it orderly. Just remember to specify the m option for the first node.

You also need to specify a backup_id number b backup_id (for example, b 7.) and the name of the directory where you created the backup files (for example, / backups/mysqlcluster). The first time you run ndb_restore, you have to specify the m option to rebuild the tables. Here is an example ndb_restore command to restore the first node in the cluster from backup_id 7, with the backups located in /backups/mysqlcluster:

$ ndb_restore -c                                                            mgm_host                                                        :                                                           port                                                         -n 1 -m b 7 -r /backups/mysqlcluster

Once that's done, repeat the command for each node in the cluster. The following code is for a four-node cluster. Since you already restored the first node in the previous step, there are three more nodes to restore.

$ ndb_restore -c                                                            mgm_host                                                        :                                                           port                                                         -n 2 -b 7 -r /backups/mysqlcluster $ ndb_restore -c                                                            mgm_host                                                        :                                                           port                                                         -n 3 -b 7 -r /backups/mysqlcluster $ ndb_restore -c                                                            mgm_host                                                        :                                                           port                                                         -n 4 -b 7 -r /backups/mysqlcluster

Once you've successfully executed the first ndb_restore, you can run the other restores in parallel if you wish.

BackupCentral.com has a wiki page for every chapter in this book. Read or contribute updated information about this chapter at http://www.backupcentral.com.