|< Day Day Up >|
As is the case with engine tuning, replication speed is largely dependent on the quality of the work you did when you designed your databases and selected an indexing strategy.
With those caveats stipulated, MySQL replication performance suggestions can be categorized into one of three classes: network, master, and slave.
The quality and speed of your network has the biggest potential impact on replication responsiveness; MySQL replication is, at its core, a distributed record-and-playback mechanism.
A slow or already overloaded network is likely to have trouble digesting the traffic generated by replication. If you have no way to improve your network or reduce competing traffic load, one idea is to take the slave servers offline until times when your network is running with a lightened load. You don't need to worry about the slaves losing their place; they reconnect to the master and resynchronize. Of course, this presents two potential concerns:
The MySQL Administrator provides details on network activity. The following example examines the network traffic for a master server. In this case, only two activities are under way. First, the server is being used to log visits to a website, so there is a steady trickle of data updates. Second, the server is replicating its information to a slave server. This accounts for most of the network traffic (see Figure 16.6).
Figure 16.6. Network traffic in a replicated environment.
Compression and Replication
One way to reduce network traffic is to employ a compression algorithm. By enabling the --slave_compressed_protocol server variable, you instruct MySQL to compress replication traffic if both master and slave are able to support this capability.
Encryption and Replication
Security is always a hot topic in distributed computing, and MySQL replication is no exception. Administrators might elect to enable encryption via SSL. This requires that the master server be configured correctly, and all slaves wanting to connect to the master need to provide several additional parameters to be able to understand the encrypted traffic.
Should you incur the extra processing and network costs imposed by encryption? The answer is very dependent on the profile of your replication topography:
Timing and Error Handling
MySQL offers two important settings to control how slave servers behave when encountering network issues. The first parameter, --slave-net-timeout, instructs the slave to wait a specified number of seconds for additional data from the master before timing out or aborting its read.
If the threshold passes, the slave treats the connection as broken, and tries to reconnect. The amount of time that is allowed to pass before this attempt is controlled by the second parameter, --master-connect-retry, which is set to a default value of 60 seconds.
Try to keep your network's unique circumstances in mind when setting these parameters. For example, if a slave is connected via a low-speed and/or unreliable dial-up, you might want to increase the -slave-net-timeout setting. Setting it too low might cause the slave to think that the connection is broken when, in fact, it is simply a bandwidth or other session problem that will pass momentarily.
Master Server Performance Considerations
As the source of all replicated information, it's important that your master server be as efficient as possible. The next section reviews several suggestions toward that goal.
Monitoring the Master
MySQL offers a collection of server status variables and graphs (via the MySQL Administrator) to help give you a better idea of what is happening in your replication environment. You saw Figure 16.6's network activity graph earlier. At this point, take a look at two additional graphs.
The first graph provides some details about the master server, including its status, log file, and position within the binary log (see Figure 16.7).
Figure 16.7. Status information for the master server.
The next example shows thread activity on the master, including information about its replication actions (see Figure 16.8).
Figure 16.8. Thread activity on master server, including replication activity.
You can also use the SHOW MASTER STATUS command to get an idea of what's happening on the master server from an examination of its binary log files.
mysql> SHOW MASTER STATUS\G *************************** 1. row *************************** File: high_hat_main-bin.000009 Position: 155326663 Binlog_Do_DB: Binlog_Ignore_DB: 1 row in set (0.00 sec)
The SHOW MASTER LOGS command gives you an overview of the current collection of binary logs stored on this master:
mysql> SHOW MASTER LOGS; +--------------------------+ | Log_name | +--------------------------+ | high_hat_main-bin.000001 | | high_hat_main-bin.000002 | | high_hat_main-bin.000003 | | high_hat_main-bin.000004 | | high_hat_main-bin.000005 | | high_hat_main-bin.000006 | | high_hat_main-bin.000007 | | high_hat_main-bin.000008 | | high_hat_main-bin.000009 | +--------------------------+ 9 rows in set (0.00 sec)
Running SHOW BINARY LOGS would provide the same details; the two commands are synonymous.
If you're curious about what operations are being processed during replication, you can view the contents of the binary log by running the SHOW BINLOG EVENTS command. Be careful when launching this request: Unless you restrict the amount of returned data, you could be swamped with information:
mysql> SHOW BINLOG EVENTS IN 'high_hat_main-bin.000003' LIMIT 100\G *************************** 1. row *************************** Log_name: high_hat_main-bin.000003 Pos: 4 Event_type: Start Server_id: 1 Orig_log_pos: 4 Info: Server ver: 4.1.8-nt-max-log, Binlog ver: 3 . . . . . . *************************** 94. row *************************** Log_name: high_hat_main-bin.000003 Pos: 10551 Event_type: Query Server_id: 1 Orig_log_pos: 10551 Info: use `high_hat_repl`; insert into transactions(transaction_date,customer_id ,amount,transaction_type) values('2004-07-03 01:15:08',11045,2,'Purchase') *************************** 95. row *************************** Log_name: high_hat_main-bin.000003 Pos: 10724 Event_type: Query Server_id: 1 Orig_log_pos: 10724 Info: use `high_hat_repl`; SET ONE_SHOT CHARACTER_SET_CLIENT=8 ,COLLATION_CONNECTION=8,COLLATION_DATABASE=33,COLLATION_SERVER=33 . . . . . .
Implementing Best Tuning Practices
It stands to reason that a badly tuned master server will see its performance problems reflected on its slaves. For an example, a poorly designed indexing strategy on the master translates into sluggish queries, not only on the primary server, but also on all slaves that receive replicated information from this machine. The same holds true for other key design and configuration decisions, such as database structure and engine parameters.
For this reason alone, it's important that you take the time and apply sufficient effort into optimization when designing your database, application code, index, and engine configuration.
Only Log Necessary Information
As you saw earlier, as one of its many roles, the binary log is the mechanism that MySQL uses to replicate data from master servers to their associated slaves. Data manipulation activity on the master is written into this log, which is then read and performed by slaves.
In many cases, you will want your slaves to look just like their master. However, what if you have master server activity that you don't want to replicate to slaves? For example, suppose that the same server is managing production, test, and development databases. Replicating the test and development databases is a waste of time and bandwidth. To reduce this unnecessary overhead, you can restrict which databases participate in the binary log (and associated replication) via the --binlog-ignore-db setting in your master server's MySQL configuration file.
With this parameter set, MySQL does not log any data modifications to these databases on the master, but it does log activity from all other databases. Slaves that connect to this master do not see any data alterations: It is as if these ignored databases don't exist. This absence of logging and replication also reduces network bandwidth consumption; statements that aren't logged can't be replicated.
Conversely, you can explicitly name any master databases that you want logged and replicated via the --binlog-do-db setting. This is the inverse of --binlog-ignore-db: When set, MySQL only logs and replicates those databases cited by --binlog-do-db. Be careful when using these options: There are potential side effects for statements that embed database names, rather than simply specifying USE their SQL.
Both of these parameters have partners with similar purposes on the slave servers. The --replicate-do-db and --replicate-ignore-db parameters are discussed a little later.
Managing Slave Resources
Just because a server is acting as a slave doesn't mean that database administrators are freed from their responsibility to tune the machine, including its caching settings. In fact, slave server platform capabilities often vary widely, and are also frequently less powerful than their masters. This means that it will likely be hard to come up with a "one size fits all" configuration. In this environment, proper tuning and monitoring is crucial. Monitoring is reviewed in a moment. For now, take a look at some configuration guidelines.
Monitoring the Slave
After you've set up a replication strategy and configured your master and slaves, MySQL offers a number of helpful tools and utilities to help you monitor the slave's health and performance.
First, even though a server is functioning as a slave doesn't mean that you can't make use of the standard monitoring tools such as the MySQL Administrator. For example, look at the following two screenshots. In the first case, observe the internal replication threads at work (see Figure 16.9).
Figure 16.9. Connection activity on a slave.
However, this server has other responsibilities. In particular, it contains its own rollup databases that are being scrutinized by users employing business intelligence tools (see Figure 16.10).
Figure 16.10. A diverse set of activities running at the same time on a slave server.
For the character-minded, the SHOW SLAVE STATUS command provides practical information for administrators:
mysql> SHOW SLAVE STATUS \G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: high_hat_main Master_User: root Master_Port: 3306 Connect_Retry: 60 Master_Log_File: Replication-2-bin.000009 Read_Master_Log_Pos: 155326663 Relay_Log_File: slave_server1-relay-bin.000004 Relay_Log_Pos: 155440260 Relay_Master_Log_File: Replication-2-bin.000009 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: high_hat_repl Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 155326663 Relay_Log_Space: 1229182150 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 11 1 row in set (0.00 sec)
Slave Server Performance Considerations
The next section provides a collection of proposals to help your slave servers deliver optimal performance.
Sharing the Workload
With today's commodity-priced hardware and continually dropping bandwidth costs, often the easiest and most cost-effective way to improve an individual slave server's responsiveness is simply to add additional slaves to your environment.
After you have multiple slave servers in place, you can configure your clients to connect to a specific server, using whatever performance criteria you like as the deciding factor. You can even chain replication servers together, configuring a slave to act as a master to additional slave servers.
If you are concerned about erroneous updating of slave servers, use the --read-only option in the slave's configuration file. When present, this blocks any local updates of the slave server by any users lacking the SUPER privilege.
Only Replicate Necessary Data
You just reviewed how the --binlog-ignore-db and --binlog-do-db parameters affect what gets logged on the master server. Slave servers have their own parameters that affect their processing of replicated information. The --replicate-do-db and --replicate-ignore-db settings determine whether the slave will reproduce actions for a given database.
For example, suppose that a master server logs activity from all databases. However, because of operational, disk, or CPU resource issues, you decide not to process activity from a particular database on a slave server. In this case, you include --replicate-ignore-db in this slave server's configuration file. Note that the slave server still receives traffic for this unwanted database: It is inscribed in the master's binary log, after all. The slave will ignore these directives, however. If you want to eliminate the traffic altogether from the point of inception, use the --binlog-ignore-db directive on the server.
Slave servers have even more discretion over what to replicate. The --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table, and --replicate-wild-ignore-table settings give administrators fine-grained control over which tables to process and which tables to ignore. However, if used incorrectly or inconsistently, these settings can introduce data anomalies into your slave server database.
Taking the time to minimize your slave servers' overhead delivers two notable performance advantages. First, because the slave server is running more efficiently, it simply takes less time to complete the replication process. Second, a more efficient slave server can assist more users more quickly. The following takes a look at a few simple things you can do to diminish this processing burden.
First, recall that Chapter 11, "MyISAM Performance Enhancement," discussed the MyISAM engine and that MySQL gives you the option to defer key buffer synchronization to disk until a table is closed, rather than every time an index is updated. The overhead savings can be significant, especially when the master server (and, hence, the slave server) are subject to very frequent data updates. You can control this behavior via the --delay_key_write server setting.
Speaking of MyISAM, you can instruct MySQL to only use this rapid storage engine on one or more slave servers. To do so, include the --skip-innodb and -skip-bdb directives on your configuration file. When these options are enabled, MySQL replicates all master tables, regardless of their storage engine, into its most efficient disk-based storage engine: MyISAM. Note that this introduces the potential for transaction and referential integrity anomalies should something fail during replication. The InnoDB and BDB storage engines will be able to maintain integrity; MyISAM will not.
Finally, during Chapter 8's ("Advanced SQL Tips") appraisal of optimal SQL, you learned that MySQL gives you the option to set data modification priority (INSERT/UPDATE/DELETE) below that of queries. When you include the --low-priority-updates directive in your slave's MySQL configuration file, database readers take precedence over database writers. However, be aware of one potential latency-causing side effect: The slave server(s) might not be able to catch up on its replication work in a timely manner if the query load is very high.
|< Day Day Up >|