You've saved the most difficult-to-pin-down performance obstacle for last. The IT help desk receives high volumes of support calls at sporadic times throughout the day. During these periods of intense activity, users complain of system response problems across the board. These delays affect everything from saving new transaction records to updating existing package status details to running reports. To make matters worse, there doesn't appear to be a correlation with user activity load: Some of the most severe slowdowns happen during off-peak hours.
Fortunately, when faced with such a fuzzy, hard-to-define problem, you have a wide variety of tools at your disposal. These range from operating system monitors to network traffic indicators to MySQL utilities. In circumstances in which there doesn't appear to be a consistent problem, it's often best to arrive at a diagnosis by the process of elimination. You can work through a list of possible causes of the transient performance issue:
Insufficient hardware If your server is underpowered, it's likely that this deficiency is most prominent during periods of peak activity. That isn't the case here. To be certain, it is wise to turn on server load tracking and then correlate that with MySQL response issues.
Network congestion This is a little harder to rule out, but the performance problems are not always happening during busy hours. Still, a slight possibility exists that some user or process is hogging the network at seemingly random times, which incorrectly gives the appearance of a MySQL problem. Matching a saved trace of network activity with reports of performance problems goes a long way toward completely eliminating this as a possible cause.
Poor database design Performance problems are the norm, rather than the exception, if the database designers made key strategic errors when laying out the schema. The same holds true for indexes: Generally, an inefficient index strategy is easy to identify and correct.
Badly designed queries Given the broad constituency that is complaining about these sporadic slowdowns, it seems unlikely that a single protracted query, or even a group of sluggish queries, could be the culprit. The slow query log goes a long ways toward definitively ruling this out.
Unplanned user data access The widespread availability of user-driven data access tools has brought untold joys into the lives of many IT professionals. Nothing can drag a database server down like poorly constructed, Cartesian product-generating unrestricted queries written by untrained users.
Aside from examining the username or IP address of the offending client, it's difficult to quickly identify these types of query tools within the slow query log or active user list. However, by asking around, you learn that a number of marketing analysts have been given business intelligence software and unrestricted access to the production database server.
Now that you've established that decision support users are likely the root cause of these issues, you look at several alternatives at your disposal to reduce the impact of this class of user. Think of the choices as the "Four R" strategy: replication, rollup, and resource restriction. The following list looks at these choices in descending order of convenience.
Replication This is, by far, the most convenient solution to your problem. Dedicating one or more slave servers is a great way to satisfy these hard-to-please users. There will be some initial setup and testing, but no code or server settings need to be changed, significantly minimizing the workload for developers and administrators.
Rollup Another way to diminish the negative performance impact of open-ended reports is to aggregate and summarize information into rollup tables. This approach requires no application or configuration changes, but it does necessitate some effort on your part, and might not completely solve the resource contention issues. Moreover, reporting users will be faced with a lag between live data and their view of this information.
Resource restriction MySQL offers a variety of mechanisms to constrain user resource consumption. These options were discussed in Chapter 10, "General Server Performance and Parameters Tuning," exploration of general engine configuration. They include max_queries_per_hour, max_updates_per_hour, max_connections_per_hour, max_join_size, SQL_BIG_SELECTS, and max_tmp_tables.
This is the least desirable approach. First, it likely requires configuring these settings at either a global or session level. Second, there is a significant possibility that these changes will prove ineffective in your particular environment.