As we mentioned previously, a browser could naively process each embedded object serially by completely requesting the original HTML page, then the first embedded object, then the second embedded object, etc. But this is too slow!
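To make the contrast concrete, here is a minimal sketch (in Python, with hypothetical example URLs) of that naive serial approach: each embedded object is requested only after the previous transaction completes.

```python
# Naive serial loading: fetch the HTML page, then each embedded image in turn.
# The URLs are hypothetical placeholders.
from urllib.request import urlopen

page_url = "http://www.example.com/index.html"                   # the enclosing HTML page
image_urls = [f"http://www.example.com/image{i}.gif" for i in range(1, 5)]

html = urlopen(page_url).read()          # transaction 1: the HTML itself
for url in image_urls:                   # transactions 2-5, strictly one after another
    data = urlopen(url).read()           # each request waits for the previous one to finish
```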
HTTP allows clients to open multiple connections and perform multiple HTTP transactions in parallel, as sketched in Figure 4-11. In this example, four embedded images are loaded in parallel, with each transaction getting its own TCP connection. [12]
[12] The embedded components do not all need to be hosted on the same web server, so the parallel connections can be established to multiple servers.
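A sketch of the parallel approach in Figure 4-11 might look like the following (again with hypothetical URLs): after the HTML arrives, the four embedded images are requested concurrently, with each request opening its own TCP connection.

```python
# Parallel loading: the four embedded images are fetched concurrently,
# each transaction over its own connection. URLs are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

page_url = "http://www.example.com/index.html"
image_urls = [f"http://www.example.com/image{i}.gif" for i in range(1, 5)]

html = urlopen(page_url).read()                      # load the enclosing page first

with ThreadPoolExecutor(max_workers=4) as pool:      # four transactions in flight at once
    images = list(pool.map(lambda u: urlopen(u).read(), image_urls))
```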
Composite pages consisting of embedded objects may load faster when parallel connections exploit the dead time and leftover bandwidth of a single connection. The delays can be overlapped, and if a single connection does not saturate the client's Internet bandwidth, the unused bandwidth can be allocated to loading additional objects.
Figure 4-12 shows a timeline for parallel connections, which is significantly faster than the serial timeline in Figure 4-10. The enclosing HTML page is loaded first, and then the remaining three transactions are processed concurrently, each with its own connection. [13] Because the images are loaded in parallel, the connection delays are overlapped.
[13] There will generally still be a small delay between each connection request due to software overheads, but the connection requests and transfer times are mostly overlapped.
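The savings can be estimated with a back-of-the-envelope model. The sketch below assumes each transaction costs a fixed connection-setup delay plus a transfer time, and that the three parallel image transfers overlap completely; the numbers are made up for illustration only.

```python
# Rough timeline comparison of Figures 4-10 and 4-12 (illustrative numbers only).
setup = 0.1      # seconds of TCP setup and request latency per transaction
transfer = 0.4   # seconds to transfer one object
objects = 4      # the HTML page plus three embedded images

serial_time = objects * (setup + transfer)               # every delay paid in sequence
parallel_time = (setup + transfer) + (setup + transfer)  # HTML first, then the images overlap

print(f"serial:   {serial_time:.1f} s")    # 2.0 s
print(f"parallel: {parallel_time:.1f} s")  # 1.0 s
```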
Parallel connections may be faster, but they are not always faster. When the client's network bandwidth is scarce (for example, a browser connected to the Internet through a 28.8-Kbps modem), most of the time might be spent just transferring data. In this situation, a single HTTP transaction to a fast server could easily consume all of the available modem bandwidth. If multiple objects are loaded in parallel, each object will just compete for this limited bandwidth, so each object will load proportionally slower, yielding little or no performance advantage. [14]
[14] In fact, because of the extra overhead from multiple connections, it's quite possible that parallel connections could take longer to load the entire page than serial downloads.
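A quick worked example shows why. On a bandwidth-bound link, the total number of bytes divided by the link speed puts a floor under the page load time no matter how many connections share the link; the object sizes below are invented for illustration.

```python
# Lower bound on load time over a 28.8-Kbps modem, serial or parallel.
link_bytes_per_sec = 28_800 / 8                     # 28.8 Kbps is about 3,600 bytes/second
object_sizes = [15_000, 20_000, 20_000, 20_000]     # HTML page plus three images (made-up sizes)

best_case = sum(object_sizes) / link_bytes_per_sec
print(f"best-case load time: {best_case:.0f} s")    # about 21 seconds either way
```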
Also, a large number of open connections can consume a lot of memory and cause performance problems of their own. Complex web pages may have tens or hundreds of embedded objects. Clients might be able to open hundreds of connections, but few web servers will want to support that many per client, because they often are processing requests for many other users at the same time. A hundred simultaneous users, each opening 100 connections, will put the burden of 10,000 connections on the server. This can cause significant server slowdown. The same situation is true for high-load proxies.
In practice, browsers do use parallel connections, but they limit the total number of parallel connections to a small number (often four). Servers are free to close excessive connections from a particular client.
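One way a client might enforce such a cap is sketched below: a fixed-size worker pool keeps at most four requests in flight, no matter how many embedded objects the page contains (the URL list is hypothetical).

```python
# Browser-style cap on parallelism: many embedded objects, at most four
# connections in flight at once. URLs are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

image_urls = [f"http://www.example.com/image{i}.gif" for i in range(1, 41)]

MAX_PARALLEL = 4                                       # typical browser-style limit

with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
    images = list(pool.map(lambda u: urlopen(u).read(), image_urls))
```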
Okay, so parallel connections don't always make pages load faster. But even if they don't actually speed up the page transfer, as we said earlier, parallel connections often make users feel that the page loads faster, because they can see progress being made as multiple component objects appear onscreen in parallel. [15] Human beings perceive that web pages load faster if there's lots of action all over the screen, even if a stopwatch actually shows the aggregate page download time to be slower!
[15] This effect is amplified by the increasing use of progressive images that produce low-resolution approximations of images first and gradually increase the resolution.