Introduction to Performance Problems

The first time you'll hear about an application running slowly is probably from a user report. Unfortunately, users are rarely particularly useful in this respect; the cry of "this thing is so slow" across the office tells you virtually nothing. It's important, therefore, to do a little digging.

Types of Performance Bottlenecks

Let's go back to basics for a minute and look at the structure of an HTTP GET or POST request. When the user's Web browser makes a request, it makes a socket connection to the Web server, usually on port 80 or, for SSL connections, 443. This is a blocking activity in the Web browser, so called because the browser cannot do anything else until the connection is successfully made. In practice, however, most modern Web browsers will allow the user to cancel while the connection attempt is being made, and on faster connections the connection is made in literally fractions of a second. But if the server is heavily loaded the connection may take a while to establish. If this is the case, it points to other applications or processes on that server slowing things down (not necessarily the application in question).

After the connection is made, the Web browser does not wait for any response, because the HTTP protocol does not dictate that there should be one. It immediately sends a very small request, usually no more than a few hundred bytes in size. This request names, among other data, the document that the Web browser requires, and carries any GET or POST parameters the user has offered as part of the request.

This request is, in itself, small. The time between the socket connection's being established and the request's being sent to the Web server is likely to be minimal.
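To make the size of the request concrete, here is a minimal sketch that builds the raw text a client would send for a GET request. The function name buildGetRequest() and the sample host, path, and parameter are illustrative assumptions; a real browser adds further headers (User-Agent, Accept, cookies, and so on), which is why real requests tend to be a few hundred bytes rather than a few dozen.

```php
<?php
/**
 * Build the raw text of a minimal HTTP/1.1 GET request.
 * Hypothetical helper for illustration only.
 */
function buildGetRequest(string $host, string $path, array $params = []): string
{
    // http_build_query() URL-encodes the parameters, e.g. foo=bar
    $query  = http_build_query($params);
    $target = ($query === '') ? $path : $path . '?' . $query;

    return "GET $target HTTP/1.1\r\n"   // request line
         . "Host: $host\r\n"            // mandatory in HTTP/1.1
         . "Connection: close\r\n"      // ask the server to close when done
         . "\r\n";                      // blank line ends the request
}

$request = buildGetRequest('www.example.com', '/example.php', ['foo' => 'bar']);
echo $request;
echo 'Request size: ', strlen($request), " bytes\n";
```

Even with a handful of extra headers, a request of this kind fits comfortably inside a single network packet, so sending it is very unlikely to be where the time goes.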

The time between when the Web server has received the request and when it starts to return data is known as the processing time for your script. In most cases, PHP will not attempt to send any output to the Web browser until the entire script has finished executing, unless the volume of your output exceeds the value of output_buffering in php.ini. This means that the processing time is roughly equal to the time between when PHP starts executing your script and the time it finishes executing. This is the most likely place for a delay.

The time between when data starts to be returned to the Web browser and when that data is finished transferring is the delivery time and is not likely to be related to PHP in any way. The time for this data to be transferred is much more likely to be tied to network performance, either at the server side (for example, an overloaded connection) or client side (a horrifyingly slow modem). Unless your page weight exceeds 55K, which is generally regarded as the limit for sensible Web pages, this is unlikely to be a cause of delays.
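The relationship between buffered output and processing time can be sketched as follows. This is an illustration under stated assumptions rather than a description of PHP's internals: runBuffered() is a hypothetical helper, and it uses an explicit ob_start() call to stand in for the output_buffering setting in php.ini. Nothing is released until the callback has finished, so the elapsed time is effectively the processing time.

```php
<?php
/**
 * Run a callable with output buffering on and report what it produced
 * and how long it took. Hypothetical helper for illustration only.
 */
function runBuffered(callable $script): array
{
    $start = microtime(true);   // current time in seconds, as a float
    ob_start();                 // hold all output in memory
    $script();
    $output  = ob_get_clean();  // collect the buffer and stop buffering
    $elapsed = microtime(true) - $start;

    return [
        'output'  => $output,
        'bytes'   => strlen($output),   // the "page weight" of the response
        'seconds' => $elapsed,          // approximates the processing time
    ];
}

$result = runBuffered(function () {
    echo '<HTML>';
    usleep(20000);              // simulate 20 ms of slow work
    echo '</HTML>';
});

error_log(sprintf('DEBUG: %d bytes in %.6f s', $result['bytes'], $result['seconds']));
```

The bytes figure gives a rough page weight to compare against the delivery-time discussion, while the seconds figure is the part PHP is actually responsible for.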

The easiest way to identify where a performance bottleneck is occurring is to use some manual tool, rather than a bona fide Web browser, to make the HTTP request, and analyze the results yourself.

Differentiating between Different Types of Bottleneck

Suppose you want to analyze where poor performance might be occurring in requests for the page /example.php, on the server www.example.com, with GET parameter foo equaling bar. This is, of course, equivalent to http://www.example.com/example.php?foo=bar. Start a console session and use telnet as follows:

   ed@genesis:~$ telnet www.example.com 80
   Trying 192.168.1.2...
   Connected to www.example.com
   Escape character is '^]'.
   GET /example.php?foo=bar HTTP/1.1
   Host: www.example.com

   <HTML>
    <BODY>
     Hello, World!
    </BODY>
   </HTML>

In Windows, the same approach applies: start a command prompt and use telnet in exactly the same manner.

To get real-life output, you will need to substitute the hostname and URL for real-life examples. You need to press Enter where you see a blank line, and enter spaces exactly as above. You may find it easier to write it all out in Notepad and paste it into telnet.

Have a stopwatch handy when you do this. Observe where the delay lies, and infer as follows:

A delay between pressing Enter and observing Trying 192.168.1.2 indicates a delay in resolving the IP address of the server against a name server. This is unusual. It could indicate unresponsive name servers, either those you are using yourself (typically those of your own ISP) or those serving the domain of the server in question (typically the ISP hosting the server). Why this unresponsiveness exists is outside the scope of this book, but you can at least reassure yourself PHP is not to blame. To a real-world user, this delay would be experienced only once, when first accessing the site, because most Web browsers (and, indeed, operating systems) cache the results of name server lookups.
A delay between the Trying . . . line and the Connected line indicates that the server itself took a while to respond to your request to connect. A delay here is massively damaging, because a page with several images could easily consist of twenty or thirty HTTP requests. If each one has a delay attached, the page could appear dramatically sluggish even if script execution time is markedly quick. Unfortunately, the delay could exist in one of two places: either the network to/from the server or the server's ability to respond to connections in a timely manner. The latter of these could be caused by server load, which could be caused by poorly optimized PHP (not necessarily this script) or other processes bringing the server to its knees. A quick check of memory and CPU utilization on the server can reveal the truth here. If it's the former, then it's outside the scope of this book; if it's the latter, then you should try to track down which script is causing the problem and, if it's PHP, apply the same methods seen here to that script.
A delay between pressing Enter twice after having entered your HTTP request and seeing the HTML of your response almost certainly indicates poor performance at processing time. This can be validated and verified by adding watches in code.
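If a stopwatch feels too imprecise, the same three observations can be approximated in code. The sketch below is one possible approach, not a definitive tool: timeHttpPhases() and slowestPhase() are hypothetical names, the ten-second timeout is an arbitrary assumption, and the "processing" figure also includes the time to send the request and one network round trip, so treat it as an upper bound.

```php
<?php
/**
 * Time each phase of one HTTP GET request, in seconds.
 * Hypothetical helper; mirrors the manual telnet session.
 */
function timeHttpPhases(string $host, string $path, int $port = 80, float $timeout = 10.0): array
{
    $t0 = microtime(true);
    $ip = gethostbyname($host);        // name lookup (before "Trying ...")
    $t1 = microtime(true);

    // Open the socket (between "Trying ..." and "Connected")
    $sock = @fsockopen($ip, $port, $errno, $errstr, $timeout);
    if ($sock === false) {
        throw new RuntimeException("Connection failed: $errstr ($errno)");
    }
    stream_set_timeout($sock, (int) ceil($timeout));
    $t2 = microtime(true);

    fwrite($sock, "GET $path HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n");
    fread($sock, 1);                   // blocks until the first byte arrives
    $t3 = microtime(true);

    // Drain the rest of the response; stop on EOF or read timeout
    while (!feof($sock)) {
        $chunk = fread($sock, 8192);
        if ($chunk === false || $chunk === '') {
            break;
        }
    }
    $t4 = microtime(true);
    fclose($sock);

    return [
        'dns'        => $t1 - $t0,
        'connect'    => $t2 - $t1,
        'processing' => $t3 - $t2,
        'delivery'   => $t4 - $t3,
    ];
}

/** Return the name of the phase that took longest. */
function slowestPhase(array $timings): string
{
    return (string) array_search(max($timings), $timings, true);
}

// Example usage (makes a real network request, so it is left commented out):
// $timings = timeHttpPhases('www.example.com', '/example.php?foo=bar');
// error_log('DEBUG: slowest phase: ' . slowestPhase($timings));
```

A result of dns or connect points at the name servers, the network, or server load; processing points at the script itself; delivery points at page weight or bandwidth.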

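Assuming that the delay appears to be down to script processing time, you can now determine what in the script is causing the delay (or delays). One way is to add a watch around a suspect operation, such as a database query, and write the elapsed time to the error log:

   // microtime(true) returns the current time in seconds as a float;
   // without the argument it returns a string that cannot be subtracted.
   $intTimeNow = microtime(true);
   // pg_exec() is the older alias of pg_query()
   $q_handle = pg_exec($this->link_ident, $sql);
   $intTimeTaken = microtime(true) - $intTimeNow;
   error_log("DEBUG: QUERY: $sql\n");
   error_log(sprintf("DEBUG: TIME TAKEN: %.6f\n", $intTimeTaken));

This writes entries such as the following to the Web server's error log:

   [Sun May 16 22:10:19 2004] [error] DEBUG: QUERY: SELECT id,logged_in,user_id FROM "user_sessions" WHERE session_id='98ce552be0a2ea6b6f69fbebcd14997c' AND user_agent='Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)' AND ip_address='192.168.4.3'
   [Sun May 16 22:10:19 2004] [error] DEBUG: TIME TAKEN: 0.003752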
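When a script has several suspect stages, repeating the timing lines around each one quickly becomes untidy. The class below is a minimal sketch of one way to collect several watches and rank them; the Stopwatch class, its method names, and the stage labels are all hypothetical, and the usleep() calls merely stand in for real work such as queries or rendering.

```php
<?php
/**
 * Collect named timings and report them slowest first.
 * Hypothetical helper for illustration only.
 */
final class Stopwatch
{
    /** @var array<string, float> seconds spent per label */
    private array $laps = [];
    private float $last;

    public function __construct()
    {
        $this->last = microtime(true);
    }

    /** Record the time elapsed since the previous lap under $label. */
    public function lap(string $label): void
    {
        $now = microtime(true);
        $this->laps[$label] = ($this->laps[$label] ?? 0.0) + ($now - $this->last);
        $this->last = $now;
    }

    /** Return all laps sorted so the slowest stage comes first. */
    public function report(): array
    {
        $laps = $this->laps;
        arsort($laps);          // descending by value, keys preserved
        return $laps;
    }
}

$watch = new Stopwatch();
usleep(2000);                   // stand-in for loading the session
$watch->lap('load session');
usleep(20000);                  // stand-in for a slow database query
$watch->lap('run query');
usleep(1000);                   // stand-in for rendering the page
$watch->lap('render page');

foreach ($watch->report() as $label => $seconds) {
    error_log(sprintf('DEBUG: %s took %.6f s', $label, $seconds));
}
```

Because the report is sorted, the first entry in the log is the stage most worth investigating.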
