Patching for Performance

Patching for Performance

Now that you know what's to blame and where it is, you can concentrate on patching your code to improve performance.

Mercifully, there is one thing of which you can be certain: You won't be the only PHP developer who's ever encountered bottlenecks like yours. As a result, there are some tried and tested techniques you can follow to try to patch them.

Fixing Database Holdups

Earlier, you saw how to identify which queries were holding up execution of your script. But why a query is running badly could be a result of any number of factors. Optimizing SQL statements is a massive topic, and there simply isn't the space to go into much detail here. There are numerous excellent volumes on PostgreSQL, MySQL, and other databases that deal with query optimization. However, the following headline points are worth bearing in mind:

  • Primary keys that are numbers will always be faster as a primary index than primary keys that are alphanumeric, even if there are unique alphanumeric values that would ordinarily be an obvious candidate for primary key. In general, an "id'' column is what is required.

  • Ensure that columns and, perhaps more important, combinations of columns by which you will filter or order in SELECT statements have corresponding indices defined.

  • Scheduled scripts to clean up tables (using, for example, the VACUUM statement in PostgreSQL) whose contents are updated frequently may well prove to be sound investments.

  • Never use SELECT *. If you know the names of the columns you want, specify them. This will improve performance.

  • Try to minimize the number of queries you are performing to achieve your desired result. Selecting a string of matching id values and then using a brace of queries to retrieve their contents may fit better with your object model, but it is very inefficient. Is there a way to find a middle ground? In Chapter 7, for example, you met the GenericObjectCollection class, which let you do just that adopt an OOP-compliant approach to development while minimizing the number of SQL queries necessary to effect that approach.

  • Subselects are easier to put together than INNER or OUTER joins, but nowhere near as efficient. Avoid them if possible (this will also, as a side effect, increase RDBMS portability). If you suspect that subselects may be slowing your application down, try replacing them with either INNER or OUTER joins as appropriate. Consult the documentation of your preferred flavor of database for more information.

Most tips such as the preceding increase speed by only fractions of a second. But this really is a case of every little bit helping, especially when you are dealing with pages with a large number of queries.

Don't forget that you will almost certainly need to look outside the PHP script itself to track down the offending SQL query, particularly if you are following the best practice principles as set out in this book. If you have devised an object model that is heavily dependent on a database backend, you may find that performance tweaks in queries used in that model have unexpected performance gains elsewhere. By the same token, however, when changes are being made to such high-level classes, care should be taken to thoroughly test all components that depend on them to ensure that functionality and behavior remain consistent with expectations.

Fixing Code Holdups

Ultimately, the same principles as those for optimizing database queries apply here. It's not good enough to know simply which block of code is causing a holdup; you need to work out which PHP statement is being slow and under what circumstances.

Logic Errors

However, it is sometimes the case that simple errors in application logic are to blame. In such cases, the application does function as expected, so ordinary QA techniques have not registered a bug. However, some human error in the code has caused performance to be adversely affected. These are the easiest to fix.

Consider the following example:

   switch ($i) {     case 0:      array_pop($arMediumArray);      break;     case 1:      array_reverse($arAnotherHugeArray);     case 2:      $arHugeArray = array_unique($arHugeArray);      break;    }; 

As you can probably spot, we are missing a break statement under the case where $i is equal to 1. This means that the case where $i is equal to 2 will also be executed when $i is equal to 1. In that case, all that is necessary is for $arAnotherHugeArray to be reversed. But, in fact, $arHugeArray is also being made unique, as a side effect. It is quite possibly the case that $arHugeArray will never be used again after this switch statement should $i equal 1, so this method being erroneously called may not cause a malfunction per se. What it likely does, however, is double the execution time for the block. Adding the break statement stops the redundant method call and speeds execution as a result.

The methods alluded to previously should still allow you to track down such problems. By printing the execution time for the entire block, and the execution time for the statements you think should be getting executed, you will often see a discrepancy between the two. Although this discrepancy may be the result of any number of problems, it is entirely possible that a logic error is to blame. Accordingly, it is worthwhile checking for such problems, particularly when using prone constructs such as the switch statement, before getting too in-depth with query optimization and other more intense performance tuning.

Bottleneck Avoidance

At all other times, judicious use of the techniques met earlier should allow you to narrow your bottleneck down to a single offender, or at least, a collection of offenders.

The most common methods built into PHP, which can be very slow for one reason or another, include:

There are many more examples of methods that can cause bottlenecks. The PHP online documentation often refers to potential performance problems in its entries for particular methods and classes.

When you're relying on PEAR classes, it is also worth checking that your server is installed with the latest supported versions. As with all distributed components, bugs do crop up, and it's worth ensuring that you have the latest versions to rule out any performance problems that might arise from those bugs.

Testing

Any changes you make to your code or SQL queries as a result of the discoveries discussed previously need to be tested thoroughly. Make sure that you apply the same basic first principles of QA as those discussed in Chapter 23, "Quality Assurance,'' when testing the scripts and components you have changed as you would when testing the application as a whole.

This includes testing not just the input parameters and usage scenarios you used while making your changes, but all conceivable parameters and scenarios where it is conceivably possible that your changes could have an impact. Err on the side of overtesting rather than undertesting.

If you have made changes on a development environment rather than a live environment, try to snatch a copy of your live database or data store and test locally. If this is not possible, try to set up a Virtual Server in Apache (or your preferred Web server) to use as a temporary staging environment, using your new code base against the live database but running distinctly from the live environment.

Either way, be sure that when you go live, you are confident not only that your changes will have the desired effect in improving performance but also that they will not adversely affect functionality in any way.



Professional PHP5 (Programmer to Programmer Series)
Professional PHP5 (Programmer to Programmer Series)
ISBN: N/A
EAN: N/A
Year: 2003
Pages: 182
BUY ON AMAZON

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net