14.2. Design for Performance

The right place to start planning for the required performance is in the design process. You should avoid belated code optimization, which could lead to unwanted side effects, bugs, or code that is harder to read and maintain.

Although the design gives you a more abstract impression of your application, you need to align it with constraints, such as hardware capacity or operational budgets, as well as the scaling characteristics you want and the expected amount of initial traffic.

Whether you are a cowboy coder or a process geek, this section contains useful information, because it discusses designing PHP 5 applications in particular.

14.2.1. PHP Design Tip #1: Beware of State

This is the first design rule because avoiding server-side state between requests, as far as possible, helps your application scale. State is information carried over from one request to the next, ranging from simple things such as a user id and password to more complex things such as the user's progress in a multi-page form.

Of course, an application without any kind of state would be useless; this design rule is about moving state to the right place rather than eliminating it. This allows you to scale your application efficiently by simply adding servers as traffic grows.

14.2.1.1 Session State

The most common form of server-side state is the session, where the browser obtains a cookie that refers to information stored on the server. By default, PHP stores session information in local files, so when you deploy a second server, each session may end up with different information stored on each server, as shown in Figure 14.1.

Figure 14.1. Locally stored session data (state) causes problems after you go beyond one server.


This application is running on two servers that are load balanced by a simple round-robin rule in the router. Both use the default (file) storage back-end for PHP sessions. The user's browser first sends a request (Request1), along with the session id "1234abc…", that is routed to Web Server 1. When Web Server 1 responds, the session variables a and b have the values 1 and 2, respectively. Then, the browser sends another request (Request2) that the load balancer sends to Web Server 2. However, this server has different values stored for the session variables a and b, so the user receives a different result. In fact, the result may vary every time the user reloads the page.

14.2.1.2 Isolating State

So, how do you fix this problem? One possibility is to store data in the user's browser via cookies. Doing so would avoid the entire state issue on the server side, but you should not store any confidential information in cookies. Cookies are easily faked, and they are stored in plain-text files on the user's computer.
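If you do keep some non-confidential state in a cookie, one common way to at least detect faked values is to sign them on the server. The following is a minimal sketch, not code from this chapter: the function names and the secret are hypothetical, and hash_equals() requires PHP 5.6 or later. Note that signing only detects tampering; the user can still read the value.

```php
<?php
// Hypothetical helpers: sign a cookie value with an HMAC so that the
// server can detect tampering. The value itself stays readable, so this
// does NOT make it safe to store confidential data in a cookie.

function sign_cookie_value($value, $secret)
{
    $mac = hash_hmac('sha256', $value, $secret);
    return $mac . ':' . $value;
}

// Returns the original value, or null if the signature does not match.
function verify_cookie_value($cookie, $secret)
{
    $parts = explode(':', $cookie, 2);
    if (count($parts) !== 2) {
        return null;
    }
    list($mac, $value) = $parts;
    $expected = hash_hmac('sha256', $value, $secret);
    // hash_equals() compares in constant time (PHP 5.6+).
    return hash_equals($expected, $mac) ? $value : null;
}

$secret = 'replace-with-a-long-random-server-side-secret'; // assumption
$cookie = sign_cookie_value('theme=dark', $secret);
// setcookie('prefs', $cookie);   // would send the signed value to the browser
var_dump(verify_cookie_value($cookie, $secret));
```

Because every web server only needs the shared secret to verify the cookie, this approach keeps no per-user state on the servers.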

The other option is to isolate the data comprising the state on the server side. You can store the session data in a database on a dedicated server, or use a dedicated session back-end server such as msession. Figure 14.2 shows how this architecture would look using a custom session handler that stores session data in a MySQL database on a different machine.

Figure 14.2. Session data is moved off web server machines, which allows you to scale by adding hardware.


This makes the database server a single point of failure, but you can at least handle replication and failover for the database separately from scaling the web servers.
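To make the idea of a custom session handler more concrete, here is a rough sketch of a database-backed session store. It is not the handler used for the figure: the class name, table name, and column names are assumptions, and it uses PDO so that the same code can talk to MySQL on a separate machine.

```php
<?php
// A sketch of a database-backed session store. Table and column names
// are assumptions; the PDO connection would point at the dedicated
// database server, for example:
//   new PDO('mysql:host=db.example.com;dbname=app', $user, $pass)
//
// Assumed schema:
//   CREATE TABLE sessions (
//       id      VARCHAR(128) PRIMARY KEY,
//       data    TEXT NOT NULL,
//       updated INTEGER NOT NULL
//   )

class DbSessionStore
{
    private $pdo;

    function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    // Nothing to do here: the connection is already open.
    function open($savePath, $sessionName) { return true; }

    function close() { return true; }

    // Must return a string; an empty string means "no session data yet".
    function read($id)
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute(array($id));
        $data = $stmt->fetchColumn();
        return $data === false ? '' : (string)$data;
    }

    function write($id, $data)
    {
        // REPLACE INTO is understood by both MySQL and SQLite.
        $stmt = $this->pdo->prepare(
            'REPLACE INTO sessions (id, data, updated) VALUES (?, ?, ?)');
        return $stmt->execute(array($id, $data, time()));
    }

    function destroy($id)
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE id = ?');
        return $stmt->execute(array($id));
    }

    // Remove sessions idle for more than $maxLifetime seconds.
    function gc($maxLifetime)
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE updated < ?');
        return $stmt->execute(array(time() - $maxLifetime));
    }
}
```

To use a store like this, you would pass its six methods as callbacks to session_set_save_handler() before calling session_start(). Newer PHP versions prefer an object that implements SessionHandlerInterface instead. Because every web server reads and writes the same table, it no longer matters which server the load balancer picks.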

14.2.2. PHP Design Tip #2: Cache!

Caching is a great way to reduce the response time of your site. By having caching in mind during the design phase, you can layer your application so that adding caching is straightforward. When you design for caching, consider issues like expiration schemes from the beginning rather than hacking them in as an afterthought.

Figure 14.3 shows a high-level diagram of an application separated into a Database Server, an Application Logic layer, and a Display Logic layer.

Figure 14.3. A cleanly layered web application.


Here, the Database Server includes the database itself (such as MySQL or Oracle). The Application Logic layer hides SQL and database details behind a PHP-based API. Finally, the Display Logic layer interfaces with the user, manages forms and templates, and communicates with the database through the Application Logic layer.

You may add caching between every layer of your application, as shown in Figure 14.4.

Figure 14.4. A cleanly layered application with a cache between each layer.


This design captures four distinct types of cache functionality:

  • Database query/result caching

  • Call/return value caching

  • Template caching/code generation

  • Output caching

14.2.2.1 Database Query/Result Caching

Caching the results of database queries can speed up your site and reduce the load on the database server. The biggest challenge is to determine the best caching strategy. Should you cache the results from every single query? Do you know in advance which queries are going to be expensive?

The following example demonstrates an approach to this using the Cache_DB class, which is part of the Cache PEAR package. It wraps a DB connection object inside a proxy object that intercepts query() calls and uses a Strategy pattern to determine a caching strategy for each query:

 <?php
 require_once 'DB.php';
 require_once 'Cache/DB.php';

 abstract class QueryStrategy {
     protected $cache;
     abstract function query($query, $params);
 }

 // Caches query results for one hour.
 class Cache1HourQueryStrategy extends QueryStrategy {
     function __construct($dsn, $cache_options) {
         $this->cache = new Cache_DB('file', $cache_options, 3600);
         $this->cache->setConnection($dsn);
     }
     function query($query, $params = array()) {
         $hitmiss = $this->cache->isCached(md5($query), 'db_cache') ? " HIT" : "MISS";
         print "Cache 1h $hitmiss: $query\n";
         return $this->cache->query($query, $params);
     }
 }

 // Caches query results for five minutes.
 class Cache5MinQueryStrategy extends QueryStrategy {
     function __construct($dsn, $cache_options) {
         $this->cache = new Cache_DB('file', $cache_options, 300);
         $this->cache->setConnection($dsn);
     }
     function query($query, $params = array()) {
         $hitmiss = $this->cache->isCached(md5($query), 'db_cache') ? " HIT" : "MISS";
         print "Cache 5m $hitmiss: $query\n";
         return $this->cache->query($query, $params);
     }
 }

 // Passes queries straight to the database.
 class UncachedQueryStrategy extends QueryStrategy {
     function __construct($dsn) {
         $this->cache = DB::connect($dsn);
     }
     function query($query, $params = array()) {
         print "Uncached:      $query\n";
         return $this->cache->query($query, $params);
     }
     // Forward any other DB method to the underlying connection.
     function __call($method, $args) {
         return call_user_func_array(array($this->cache, $method), $args);
     }
 }

 class QueryCacheStrategyWrapper {
     private $cache_1h = null;
     private $cache_5m = null;
     private $direct = null;

     function __construct($dsn) {
         $opts = array(
             'cache_dir' => '/tmp',
             'filename_prefix' => 'query');
         $this->cache_1h = new Cache1HourQueryStrategy($dsn, $opts);
         $this->cache_5m = new Cache5MinQueryStrategy($dsn, $opts);
         $this->direct = new UncachedQueryStrategy($dsn);
     }

     // Pick a strategy based on the FROM clause, then run the query.
     function query($query, $params = array()) {
         $obj = $this->cache_5m;
         $re = '/\s+FROM\s+(\S+)\s*((AS\s+)?([A-Z0-9_]+))?(,*)/i';
         if (preg_match($re, $query, $m)) {
             if ($m[1] == 'bids') {
                 $obj = $this->direct;
             } elseif ($m[5] == ',') { // a join
                 $obj = $this->cache_1h;
             }
         }
         return $obj->query($query, $params);
     }

     // Everything except query() goes to the uncached connection.
     // (The original forwarded to an undefined $this->dbh property.)
     function __call($method, $args) {
         return call_user_func_array(array($this->direct, $method), $args);
     }
 }

 $dbh = new QueryCacheStrategyWrapper(getenv("DSN"));
 test_query($dbh, "SELECT * FROM vendors");
 test_query($dbh, "SELECT v.name, p.name FROM vendors v, products p".
            " WHERE p.vendor = v.id");
 test_query($dbh, "SELECT * FROM bids WHERE product = 42");

 // Run one query and print how long it took.
 function test_query($dbh, $query) {
     $u1 = utime();
     $r = $dbh->query($query);
     $u2 = utime();
     printf("elapsed: %.04fs\n\n", $u2 - $u1);
 }

 // Current time in seconds, with microsecond resolution.
 function utime() {
     list($usec, $sec) = explode(" ", microtime());
     return $sec + (double)$usec;
 }

The QueryCacheStrategyWrapper class implements the Strategy wrapper, and uses a regular expression to determine whether the query should be cached and, if so, whether it should be cached for five minutes or one hour. If the query contains a join across multiple database tables (detected here as a comma after the first table in the FROM clause), it is cached for one hour; if it is a SELECT on the bids table (for an auction), the query is not cached. The rest will be cached for five minutes.
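To experiment with the selection rule without a database or the PEAR packages, you can isolate it in a small function. This is only a sketch: choose_cache_policy() is a hypothetical name, and it returns a label instead of a strategy object, but the regular expression and the comparisons are the same as in the wrapper.

```php
<?php
// Standalone version of the selection logic only. It returns a label
// rather than a strategy object, so no database is needed to try it.
function choose_cache_policy($query)
{
    $re = '/\s+FROM\s+(\S+)\s*((AS\s+)?([A-Z0-9_]+))?(,*)/i';
    if (preg_match($re, $query, $m)) {
        if ($m[1] == 'bids') {
            return 'uncached';       // auction bids must always be fresh
        } elseif ($m[5] == ',') {
            return '1 hour';         // comma after the first table: a join
        }
    }
    return '5 minutes';              // default for everything else
}

$queries = array(
    "SELECT * FROM vendors",
    "SELECT v.name, p.name FROM vendors v, products p WHERE p.vendor = v.id",
    "SELECT * FROM bids WHERE product = 42",
);
foreach ($queries as $q) {
    printf("%-10s %s\n", choose_cache_policy($q), $q);
}
```

Keep in mind that this rule only recognizes comma-separated joins in which the first table has an alias. A query written with explicit JOIN syntax falls through to the five-minute default.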

Here is the output from this example the first time the queries are run, when the results are not yet cached:

 Cache 5m MISS: SELECT * FROM vendors
 elapsed: 0.0222s

 Cache 1h MISS: SELECT v.name, p.name FROM vendors v, products p WHERE p.vendor = v.id
 elapsed: 0.0661s

 Uncached:      SELECT * FROM bids WHERE product = 42
 elapsed: 0.0013s

As you can see, the join is relatively expensive compared to the other queries. Now, look at the timings on the second run:

 Cache 5m  HIT: SELECT * FROM vendors
 elapsed: 0.0098s

 Cache 1h  HIT: SELECT v.name, p.name FROM vendors v, products p WHERE p.vendor = v.id
 elapsed: 0.0055s

 Uncached:      SELECT * FROM bids WHERE product = 42
 elapsed: 0.0015s

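The cache gave a speed-up of roughly 125 percent for the first query (it ran about 2.3 times faster), and a whopping 1,100 percent speed-up for the join (about 12 times faster). The uncached query on the bids table took about the same time on both runs.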
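If you want to check such figures for your own measurements, the arithmetic is simple. The helper name below is hypothetical; it expresses the speed-up as "percent faster" using the elapsed times printed above, and the results land close to the rounded figures quoted in the text.

```php
<?php
// Speed-up expressed as "percent faster": (before / after - 1) * 100.
function speedup_percent($before, $after)
{
    return ($before / $after - 1) * 100;
}

// Elapsed times taken from the two runs shown above.
printf("vendors: %.0f%%\n", speedup_percent(0.0222, 0.0098));
printf("join:    %.0f%%\n", speedup_percent(0.0661, 0.0055));
```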

A good exercise to complete after reading the APD section, "Profiling with APD," later in this chapter would be to adapt the caching strategy to your own database (just change the "bids" table name), and use APD to compare the performance of the wrapped caching solution with a regular non-caching approach.

14.2.2.2 Call Caching

Call caching means caching the return value of a function given a set of parameters. Both the Cache and Cache_Lite PEAR packages provide this. Chapter 11, "Important PEAR Packages," contains an example of call caching.
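The principle is easy to see in a few lines of plain PHP. The following sketch is not the PEAR Cache or Cache_Lite API; cached_call() and slow_square() are hypothetical names, and the cache lives only for the duration of one request.

```php
<?php
// Hypothetical helper, not the PEAR Cache API: remember return values
// per (function name, arguments) pair for the duration of one request.
// The function name and arguments must be serializable.
function cached_call($function, $args = array())
{
    static $cache = array();
    $key = md5(serialize(array($function, $args)));
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = call_user_func_array($function, $args);
    }
    return $cache[$key];
}

// Stand-in for an expensive function.
function slow_square($n)
{
    usleep(100000); // pretend this takes 0.1 seconds
    return $n * $n;
}

echo cached_call('slow_square', array(12)), "\n"; // computed
echo cached_call('slow_square', array(12)), "\n"; // served from the cache
```

This only makes sense for functions whose result depends on nothing but their parameters. The PEAR packages add what this sketch leaves out, such as persistent storage and expiration.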

14.2.2.3 Compiled Templates

Most template systems today compile templates to PHP code before displaying them. This not only makes the template display faster, but it also allows an opcode cache to cache them between requests so they do not need to be parsed on every request.

The only template packages in PEAR that do not compile to PHP code are HTML_Template_IT and HTML_Template_PHPLIB. If you use one of the others, such as Smarty or HTML_Template_Flexy, everything will be taken care of for you.
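To illustrate what "compiling a template to PHP code" means, here is a deliberately tiny sketch. It is not how Smarty or HTML_Template_Flexy work internally; the {name} placeholder syntax and both function names are assumptions made for the example.

```php
<?php
// Toy template compiler, for illustration only. It turns {name}
// placeholders into PHP echo statements, so the compiled result is an
// ordinary PHP script that PHP (and an opcode cache) can handle directly.
function compile_template($source)
{
    return preg_replace(
        '/\{([a-z_][a-z0-9_]*)\}/i',
        '<?php echo htmlspecialchars($vars[\'$1\']); ?>',
        $source
    );
}

// Display a compiled template: include the generated PHP file and
// capture what it prints. $vars is visible inside the included file.
function render_template($compiledFile, $vars)
{
    ob_start();
    include $compiledFile;
    return ob_get_clean();
}

// Compile once and store the result in a file...
$compiled = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents($compiled, compile_template("Hello, {user}!\n"));

// ...then each display only needs to include that file.
echo render_template($compiled, array('user' => 'Ada & Bob'));
unlink($compiled);
```

In a real system, the compiled file would be kept on disk and regenerated only when the template source changes.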

14.2.2.4 Output Caching

Finally, you may cache the printed output of an entire script or just parts of it using PHP's output buffering functions. Again, the PEAR caching packages have wrappers in place for output caching. See the Cache_Lite example in Chapter 11.
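As a rough idea of what such a wrapper does, the following sketch captures printed output with ob_start() and ob_get_clean() and stores it in a file. It is not the Cache_Lite API; cached_output(), print_vendor_list(), and the cache file location are hypothetical.

```php
<?php
// Hypothetical helper (not the Cache_Lite API): capture whatever
// $producer prints and reuse it from a file until the file is older
// than $ttl seconds.
function cached_output($cacheFile, $ttl, $producer)
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        return file_get_contents($cacheFile);   // cache hit
    }
    ob_start();                                 // start capturing output
    call_user_func($producer);
    $output = ob_get_clean();                   // stop and fetch the buffer
    file_put_contents($cacheFile, $output, LOCK_EX);
    return $output;
}

// Stand-in for an expensive page fragment.
function print_vendor_list()
{
    echo "<ul><li>Acme</li><li>Globex</li></ul>\n";
}

$file = sys_get_temp_dir() . '/vendor_list.cache'; // assumed location
echo cached_output($file, 300, 'print_vendor_list');
```

The same function works for a whole page or for a single fragment; only the producer and the cache file differ.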

14.2.3. PHP Design Tip #3: Do Not Over Design!

With PHP 5's new OO features, it is easier to make clean object-oriented designs. However, PHP has a vast number of built-in functions and functions provided by various extensions, most of which are procedural (calling functions rather than working with objects).

14.2.3.1 OO Wrappers for Built-In Functions

To make interfaces "cleaner," it may be tempting to wrap a class layer around built-in functions. Unless these wrappers provide real value, they just add bloat and complexity. "Real value" could be providing a unified API to different extensions (similar to, for example, PEAR DB), or it could be adding new, higher-level functionality (similar to PEAR Net_Socket).

14.2.3.2 Generalize Carefully

Generalization is expensive (although saying it is cheap). Know why you make something more general or abstract, and think about what you expect to gain from doing it. If you add abstractions without knowing exactly why you need them, chances are that the abstraction you build is not the one you will need further down the road.

14.2.3.3 Do Not Pretend PHP Is Java!

PHP and languages such as Java or C++ are vastly different. For one thing, PHP is compiled at runtime; for another, PHP has a huge amount of low-level, built-in functionality that Java provides through higher-level packages. Even though PHP 5 has a vastly improved object model, object instantiation in Java is several times faster than in PHP. Java has String objects, while PHP has a string type. Java has a Vector class, and PHP has arrays. Writing a Vector class for PHP could be interesting as an exercise, but for production use, it is just silly because PHP has built-in functionality for doing the same thing much faster.
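As a small illustration, here are some operations you might expect from a Vector class, written with PHP's built-in array type and array functions instead. The variable names and values are arbitrary.

```php
<?php
// Operations a Java programmer might reach for a Vector class to do,
// written with PHP's built-in array type and functions instead.
$v = array();
$v[] = 'apple';                    // append
$v[] = 'cherry';
array_splice($v, 1, 0, 'banana');  // insert at index 1
$size  = count($v);                // number of elements
$found = in_array('cherry', $v);   // membership test
$last  = array_pop($v);            // remove and return the last element

print_r($v);
```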

PHP applications need to be designed as PHP applications that accommodate PHP's different strengths and weaknesses.



    PHP 5 Power Programming
    ISBN: 013147149X
    Year: 2003
    Pages: 240
