As you learned in the "Content Compression" section, browsers and servers hold a brief conversation about what kind of content they prefer to accept and deliver. The browser tells the server that it can accept encoded content, and if the server is capable, it compresses the data and transmits it. The browser then decompresses the data and renders the page. Clients that don't understand compressed content don't request encoded files and thus receive the files uncompressed (assuming that the content is offered conditionally). By definition, HTTP 1.1-compliant browsers support gzip compression. Most modern browsers support gzip content encoding (see Table 18.3).
| Browser | Encoding Support |
|---|---|
| Microsoft Internet Explorer 4.x+ | gzip, deflate. Macintosh versions do not understand gzip or deflate encoding and do not send the "Accept-Encoding" header. There is also a caching issue with compressed content in Internet Explorer; the content compression software vendors are aware of this and know how to work around it. The only software that works incorrectly with MSIE is Microsoft Internet Information Server. |
| Netscape 4.06+ | Supports HTTP/1.0, but Netscape 4.06 and later versions send "Accept-Encoding: gzip" in the header. There are some limitations, however. It works consistently only for the content types "text/html" and "text/plain"; JavaScript and CSS files ("application/x-javascript" and "text/css") will not be decompressed properly. |
| Mozilla m14-m18, 0.6-0.9.3, Netscape 6.0-6.1, Galeon, and SkipStone | Error in implementation. |
| Mozilla 0.9.4+, Netscape 6.2+ | Good |
| Opera 5.12+ | Good |
| Lynx 2.6+ | Good |
| Konqueror | gzip only |
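The negotiation summarized in the table can be sketched in a few lines of server-side code. The helper below is hypothetical (the function name and arguments are mine, not from this chapter), shown in JavaScript as it might run in a Node.js-style server:

```javascript
// Hypothetical helper: given the client's Accept-Encoding header,
// pick an encoding the server also supports, or fall back to sending
// the file uncompressed ("identity").
function chooseEncoding(acceptEncoding, supported) {
  if (!acceptEncoding) return "identity"; // client sent no header: send raw
  var offered = acceptEncoding.split(",").map(function (token) {
    return token.split(";")[0].trim().toLowerCase(); // drop any ;q=... weight
  });
  for (var i = 0; i < supported.length; i++) {
    if (offered.indexOf(supported[i]) !== -1) return supported[i];
  }
  return "identity"; // no overlap: send the file uncompressed
}
```

For example, `chooseEncoding("gzip, deflate", ["gzip"])` returns `"gzip"`, while a client that never sends the header gets `"identity"` and receives the file as-is.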
What is the difference between gzip and deflate? Both are based on the same compression algorithm, deflate, [5] implemented in the compression library zlib. [6] Deflate encoding assumes that you are sending only the compressed data. gzip [7] adds a 10-byte header in front of the compressed data, and appends a CRC32 checksum and the length of the uncompressed data (4 + 4 = 8 bytes) to the end of the compressed file. The transferred data is thus a valid .gz file.
There is one more content-encoding compression algorithm: compress, which uses the compression algorithm of the UNIX compress utility. It is supported only by Lynx, Netscape, and Mozilla.
Because some versions of Konqueror have an error in deflate decoding and gzip is widely supported, most compression solutions use gzip content encoding.
HTML and other text files can be compressed on the server and automatically decompressed with HTTP 1.1-compliant browsers. Because HTML files must be downloaded before your content appears, fast delivery of this page framework is critical to user satisfaction.
Because HTML text files can be highly redundant (especially tables), compression rates for HTML files can be dramatic, with savings up to 90 percent. Most modern browsers support decompression of HTML files compressed with gzip.
The ZLIB Saga

After CompuServe and Unisys rattled their GIF copyright sabers in late 1995, browser manufacturers rushed to add PNG support to their browsers. [8] Luckily, the PNG format uses the public domain GZIP/ZLIB [6] compression algorithms (deflate and inflate), which are based on the older, non-proprietary Lempel-Ziv algorithm (LZ77). [9] GIFs use the less efficient Lempel-Ziv-Welch algorithm (LZW), [10] which is based on LZ78. [11] So in order to receive and display PNG files, the browser manufacturers had to add ZLIB inflation to their browsers. CompuServe subsequently backed down, but the deed was done. Now browsers had ZLIB support. Developers at Microsoft and Netscape realized that they already had ZLIB on board to handle inflating PNG files. Why not implement IETF content encoding? Why not indeed. Their first attempts went badly (browsers would report "Accept-Encoding" but then botch things when the compressed data arrived), but after a few more browser releases, they both got it right. The outcome is that any browser that can display PNG files can usually decompress anything sent with IETF content encoding: gzip.
In theory, you can also compress external style sheets using content encoding. In practice, webmasters have found that browsers inconsistently decompress .css files. Apparently, style sheets were hacked into some browsers in a non-HTTP-compliant way. So when these browsers receive a "Content-Encoding: gzip" header in the response for a .css file, they don't realize that they are supposed to decompress it first.
This is not always the case, however, and no one to my knowledge has been able to nail down which browsers can actually handle the decompression of style sheets, and under what circumstances. The problem seems to involve a mixture of variables. Therefore, I recommend that you exclude .css files from compression in the configuration files of programs such as mod_gzip:
mod_gzip_item_exclude file \.css$
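In context, that exclusion would sit alongside the rules that enable compression and select which files to compress. The directive names below are mod_gzip's own, but the rule set itself is only an illustrative sketch, not a recommended production configuration:

```apache
mod_gzip_on            Yes
# Compress markup and scripts...
mod_gzip_item_include  file  \.html$
mod_gzip_item_include  file  \.js$
mod_gzip_item_include  mime  ^text/html$
# ...but leave style sheets alone, per the caveat above.
mod_gzip_item_exclude  file  \.css$
```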
Most .css files are smaller than .js files anyway, so the need for compression is usually greater for .js files. In fact, CSS files are often so small that the HTTP headers needed for the request and response can add up to a significant portion of the total traffic (up to 750 to 1,000 bytes). So for smaller CSS files on high-traffic pages, it may be more efficient to embed them directly into your (X)HTML files or include them with SSI, where they can then be compressed along with the page.
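The SSI approach might look like the following (the filename is hypothetical). The style rules are inlined into the page at serve time, so they travel inside the compressed HTML instead of costing a separate request and response:

```html
<!-- Hypothetical SSI inclusion: the style sheet is merged into the
     HTML before delivery, so it is gzipped along with the page and
     incurs no extra HTTP headers of its own. -->
<style type="text/css">
<!--#include file="small-screen.css" -->
</style>
```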
Like HTML files, external JavaScript files can be compressed with IETF content encoding. Unlike external .css files, support for decompressing compressed JavaScript files is good in modern HTTP 1.1-compliant browsers, as long as the files are referenced within the head of your HTML documents. Although it is possible, I don't recommend using proprietary compression methods to deliver external JavaScript files (.jar, CHM/ITS, etc.). The standards-based method described in this chapter requires at most one additional file, not four, to maintain.
External scripts must be referenced in the head element of (X)HTML documents to be reliably decompressed by modern browsers. The story goes like this. Netscape's original specification for JavaScript 1.1 implied that the inclusion of JavaScript source files should take place in the head section, because that is the only place where they are pre-loaded and pre-processed. [12] For some reason, browser manufacturers stopped decompressing any compressed files referenced beyond the head.
As scripts grew larger, developers started moving script elements down into the body to satisfy impatient users. In HTML, the " head -only" rule was then relaxed to allow script elements within the body , but the die was cast. Developers subsequently discovered that certain JavaScript inclusion operations must be in the head section or problems can occur.
Browsers continue to decompress scripts only when they are located within the head element. Some companies get around this limitation by adding "_h" to the names of JavaScript include files in the head section of HTML documents. Using this technique, a script author can use server-side filtering logic to determine whether a request for a certain JavaScript file is coming from the head section of an HTML document (where it is OK to send it compressed) or from somewhere in the body (where it is not). You can optionally use the defer attribute to compensate for this requirement.
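The filtering logic amounts to a filename test. A minimal sketch follows; the helper name and the exact naming pattern are assumptions based on the convention described above:

```javascript
// Hypothetical server-side check: allow compression only for .js files
// whose names carry the "_h" head-section marker, e.g. "menu_h.js".
function okToCompress(filename) {
  return /_h\.js$/.test(filename);
}
```

A server-side filter would call a check like this before choosing between the compressed and uncompressed copy of the script.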
JavaScript Compression Gotcha: Premature onload Events

Internet Explorer 5 has a known bug when loading compressed JavaScript files. [13] IE mistakenly triggers the onload event after it downloads the compressed file, but before it is decompressed. This can lead to unexpected behavior. The way around this bug is to include an additional variable at the end of your external file and poll for its presence in the onload event handler.
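The workaround can be sketched as follows; the sentinel variable name and the polling interval are assumptions for illustration, not part of the documented bug report:

```javascript
// The compressed external file ends with a sentinel assignment, e.g.:
//   var mylibReady = true;   // hypothetical last line of the .js file

// Returns true once the sentinel exists on the global object
// (window in a browser), i.e. once the script has actually been
// decompressed and evaluated.
function isScriptReady(globalObj, sentinelName) {
  return typeof globalObj[sentinelName] !== "undefined";
}

// A browser onload handler would poll until the check passes:
//   function onloadHandler() {
//     if (!isScriptReady(window, "mylibReady")) {
//       setTimeout(onloadHandler, 50); // still inflating; try again
//       return;
//     }
//     initPage(); // now safe to call into the external script
//   }
```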