Problem
You need to optimize your web pages to improve load time.
Solution
Remove whitespace, hidden characters, and other unnecessary tags from your code using simple regular expression searches, or a full-fledged code optimization utility.
Discussion
Even though high-speed Internet access has a firm foothold in U.S. homes and offices, everyone still likes a fast-loading web page. Unnecessarily large files also consume disk space and bandwidth resources on your web server, which can cost you if your web site starts to exceed the limits of your account quotas.
There's some slack in your HTML code, too, and the good news is that getting rid of some or all of it won't affect how the page looks in a browser. Depending on the coding techniques you used in creating the original file, and the extent of the optimization techniques you use, an optimized web page can be 5 to 25 percent smaller than the original. The one downside: fully optimized HTML code is noticeably unfriendly to the hand-coder, since all line feeds, extraneous spaces and tabs, and even comments are stripped away. Scanning over a dense, unformatted block of HTML code looking for the place to make a change can be maddening.
To make a modest impact on the file sizes of your web pages, you can use regular expressions in the find-and-replace dialog of your web page editor to remove extra spaces between tags, after tag attributes or punctuation, or at the beginnings of lines. Using an HTML editor capable of performing regular expression, or grep, searches (such as BBEdit, HomeSite, or Dreamweaver), type >\s+< into the search field and >< into the replace field to push all your tags up close together. Using just this technique on what I thought was a well-coded page of my own reduced its file size more than 5 percent. For a full list of special characters and wildcards that can be used in a regular expression search, see the tutorial site in the "See Also" section of this Recipe.
You can also use Perl to execute regular expression searches directly on a batch of files on your web server, or combine several Perl find-and-replace commands in a shell script. The -0777 switch makes Perl read each file in one piece, so the pattern also catches whitespace that spans a line break, as it does in an editor's search dialog:
perl -0777 -pi -e 's/>\s+</></g' /full/path/to/your/files/*.html
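Combine more than one Perl command, each on its own line, and save them in a file on your server called optimize_files.sh:
#!/bin/bash
# Close up whitespace between tags; -0777 reads each file whole so matches can span line breaks
perl -0777 -pi -e 's/>\s+</></g' /full/path/to/your/files/*.html
# Replace two whitespace characters after a period with a single space
perl -pi -e 's/\.\s\s/. /g' /full/path/to/your/files/*.html
# Collapse runs of tabs into a single tab
perl -pi -e 's/\t+/\t/g' /full/path/to/your/files/*.html
# Collapse runs of carriage returns into a single carriage return
perl -pi -e 's/\r+/\r/g' /full/path/to/your/files/*.html
Then run the script from the command-line prompt on your web server:
sh optimize_files.sh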
To squeeze every last byte out of your HTML files, you can turn to one of the numerous PC applications and online tools that cut the fat out of web pages. I tried one of each on the original file mentioned above and got an overall file size reduction of about 12 percent with each of them. But in both cases, the code bore only a scant resemblance to its former self (see Figures 4-6 and 4-7). Both procedures approached file optimization more or less the same way: remove everything that's not absolutely necessary. The online tool (links are in the "See Also" section of this Recipe) offers no way to tweak its routine. Just enter the URL of the page in the form, and it returns the optimized code. The PC application (also mentioned in the "See Also" section of this Recipe) will optimize one file, a batch of files, or an entire site, and offers a long list of settings that let the user dictate what stays and what goes.
Heavy-duty optimization complements the model of web sites as software. By that, I mean you as the designer work on a version of the site with easy-to-read formatting and comments, and then deliver an optimized version to your customers, which in this case are your site's visitors.
Web page optimization is all about speed and visitor satisfaction. After all, the comments and neatly aligned tags are for your benefit, not the web surfer's. If you want to go as far as you can with optimization and keep a version that's easy to edit, you could maintain two versions of your site: an offline version that's easy to edit by hand and an optimized "live" version that is uploaded to the web server. (Software mentioned in the "See Also" section of this Recipe can help you set this up.)
Figure 4-6. My original, pre-optimization file; maybe a few too many line feeds and tab indents, but easy to read for a hand-coder
The amount of optimization you'll want to do depends on your work habits and web site needs. If you prefer to edit HTML code by hand (and you have to do it frequently), then you'll probably want to pick and choose how and what to optimize. For example, you might want to get rid of unneeded spaces, tabs, and tag attributes that specify a default setting, but leave your comments and line feeds so the files remain more manageable. Or, you could get the best of both worlds by structuring your pages as optimized shell files, while leaving the content you edit most often in a more-readable, lightly optimized include file.
You can optimize to the fullest extent possible if you don't edit the pages very often or you do most of your site editing in the WYSIWYG or design view of your web page editor, rather than in code view.
Figure 4-7. Post-optimization; a smaller file…but a bigger headache?
See Also
For a good tutorial on using regular expressions, see http://www.anybrowser.org/bbedit/grep.shtml. The two heavy-duty optimization tools I used for this Recipe are HTML Code Cleaner (online form at http://www.yook.de/webmaster/clean/) and HTML-Optimizer Pro (download from http://www.tonbrand.nl/products.htm). Port80 Software also offers a full-featured web page optimization application called w3compiler (http://www.w3compiler.com/).