Producing Something Worth Serving


Although this chapter is concerned mainly with running and maintaining a Web server, it's important that you understand something of how the Web pages served by a Web server come into being. Some of the preceding sections have already described some types of Web pages (particularly dynamic content). The most common type of static content is HTML, which is a text-based format with formatting extensions. There are several different tools available to help you create HTML, as well as the file formats upon which HTML pages frequently rely, such as graphics file formats. Understanding how to use these tools, and how Web browsers interpret HTML, will help you create Web sites that can be handled by most Web browsers available today.

HTML and Other Web File Formats

Although there are many tools for creating Web files, as discussed in the next section, "Tools for Producing Web Pages," it helps to understand something about the various file formats that are common on static Web pages. File formats that are common on the Web include various text file formats, graphics files, and assorted data files.

Most Web pages are built around an HTML text file. This file is a plain text file that you can edit in an ordinary text editor. Listing 20.2 shows a simple HTML file as an example. Most text in an HTML file is displayed in the Web browser's window, but text enclosed in angle brackets ( <> ) is formatting information. Many of these codes come in pairs, with the second bearing the same name as the first but using a slash ( / ) to indicate it's the end of the formatted area. The opening code sometimes includes parameters that fine-tune its behavior, such as setting the size and filename of a graphic or specifying the color of text and background. Some of these codes reference other documents on the Web (both on the main document's server and on other Web servers).

Listing 20.2 A sample HTML file
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>Sample Web Page</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<CENTER><H1 ALIGN="CENTER">Sample Web Page</H1></CENTER>
<IMG SRC="graphics/logo.jpg" ALT="Logo" WIDTH="197" HEIGHT="279">
<P>This is a sample Web page, including
<A HREF="http://www.threeroomco.com/anotherpage.html">a link.</A></P>
</BODY></HTML>

Some of the formatting codes in Listing 20.2 should be self-explanatory, but others are more obscure. A handful of the more important codes include the following:

  • <HTML>: This code identifies the page as being HTML. Most browsers don't actually require this code, but it's best to use it in order to create proper HTML.

  • <HEAD>: HTML pages contain both a header and a body. The header provides information that's not usually displayed in the Web page proper, such as the <TITLE> code in Listing 20.2. The body includes the bulk of the text that appears in the Web browser's window. <HEAD> identifies the header.

  • <TITLE>: Most Web browsers display a Web page's title in the window's drag bar, and enter it as the default title if a user enters the Web page in a bookmark list.

  • <BODY>: This code denotes the main body of the Web page. It frequently includes parameters to set text color, background color, and similar attributes.

  • <H1>: Headings let you break up a Web page into sections. They're usually displayed in a larger font than regular text. You can create several levels of headings, starting with 1 (<H1>). <H2> is a heading below <H1>, and so on. Listing 20.2's <H1> code includes an ALIGN option to tell the Web browser to center the text. Unfortunately, not all Web browsers respond to the same alignment codes, so some redundancy is required for consistent display.

  • <CENTER>: The <H1> heading in Listing 20.2 is centered both by an option within the <H1> code and by a <CENTER> code that surrounds the <H1> code. Some browsers respond to one or the other of these codes, but not both. The <CENTER> code is usually not needed for modern browsers, but some older ones do need it.

  • <IMG>: You can include a graphic on a Web page by using the <IMG> code, as shown in Listing 20.2. These codes usually include several parameters: SRC (which points to the source of the image, either its filename on the server or a complete URL if it's stored on another server), ALT (text that's associated with the image, for those who have automatic image display disabled or for display when the user moves a mouse over the image), and WIDTH and HEIGHT (the width and height of the image, which allow the Web browser to lay out the text of the document before the image has loaded, or to dynamically scale an image if it's some other size).

  • <P>: This code denotes a paragraph. A Web browser will automatically rewrap text within a paragraph to fit the window size, font size, and other characteristics of the browser's window.

  • <A HREF>: This code denotes a link. The text or image enclosed by this code usually appears in the Web browser underlined or in a different color, and users can click the link to view the page specified by the URL within the code.

It's possible to use nothing but these codes to create a Web page, but HTML supports many more options, including the ability to format tables, specify fonts, display bulleted or numbered lists, break the document into multiple independent frames, and so on. It's possible to over-use advanced HTML features, though. The upcoming section, "Web Page Design Tips," includes some information on this matter.
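To see how a program separates the formatting codes from the displayed text, the following minimal sketch feeds the page from Listing 20.2 to Python's standard html.parser module. The class name TagLister and the variable names are illustrative choices, not part of any tool described in this chapter, and the link URL is written here as http://www.threeroomco.com/anotherpage.html on the assumption that this is what the listing intends.

```python
from html.parser import HTMLParser

# The page from Listing 20.2, stored as a string for this sketch.
SAMPLE_PAGE = """<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>Sample Web Page</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<CENTER><H1 ALIGN="CENTER">Sample Web Page</H1></CENTER>
<IMG SRC="graphics/logo.jpg" ALT="Logo" WIDTH="197" HEIGHT="279">
<P>This is a sample Web page, including
<A HREF="http://www.threeroomco.com/anotherpage.html">a link.</A></P>
</BODY></HTML>
"""


class TagLister(HTMLParser):
    """Collect opening codes (with their parameters) and the displayed text."""

    def __init__(self):
        super().__init__()
        self.tags = []  # (code name, {parameter: value}) in document order
        self.text = []  # non-blank runs of text shown in the browser window

    def handle_starttag(self, tag, attrs):
        # html.parser reports code and parameter names in lowercase.
        self.tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


lister = TagLister()
lister.feed(SAMPLE_PAGE)
lister.close()

for tag, attrs in lister.tags:
    print(tag, attrs)
print(lister.text)
```

Running the sketch prints each opening code with its parameters, followed by the text a browser would display. The same approach could be extended to list every file a page references through SRC and HREF.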

In addition to HTML, Web servers can deliver other document types to browsers. Indeed, HTML documents often refer to these documents directly, as in the <IMG> option in Listing 20.2. You can link to plain text pages, graphics, downloadable program files, scripts, or any other type of file. One important caveat is that your Web server should have an appropriate MIME type set for your documents, usually in the mime.types file described in the earlier section, "Understanding Apache Configuration Files." If Apache can't determine the MIME type of the file, it usually sends it as plain text, which can cause problems because the target OS may alter certain characters in the file, thus corrupting it.
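The mapping from filename extension to MIME type can be illustrated with Python's standard mimetypes module, which keeps a table similar in spirit to a mime.types file. This is only a sketch of the idea, not Apache's own lookup code; the function name guess_content_type and the text/plain fallback are assumptions chosen to mirror the behavior described above.

```python
import mimetypes

# Assumed fallback for this sketch: treat unknown files as plain text,
# mirroring the server behavior described in the text.
FALLBACK_TYPE = "text/plain"


def guess_content_type(path, fallback=FALLBACK_TYPE):
    """Return the MIME type implied by a filename's extension."""
    content_type, _encoding = mimetypes.guess_type(path)
    return content_type or fallback


for name in ("index.html", "graphics/logo.jpg", "graphics/ccc.gif",
             "chart.png", "data.unknownext"):
    print(name, "->", guess_content_type(name))
```

The last filename uses a made-up extension to show the fallback path: a binary file sent under that type is exactly the case in which corruption can occur.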

Because many Web pages incorporate extensive graphics, it's important to understand something of the graphics file formats that are common on the Web. The three most common formats are as follows:

  • GIF: The Graphics Interchange Format has been popular since the 1980s. It uses a lossless compression scheme, which means that it compresses data, but in a way that allows the exact input data to be displayed. GIF supports images with color depths of up to 8 bits (256 colors). One unique drawback to this format is that it uses a compression scheme that is covered by a patent (which is due to expire in 2003). Some people object to using a graphics file format that's so encumbered.

  • PNG: The Portable Network Graphics file format is another one that uses a lossless compression scheme. It also supports greater color depth (up to 64-bit, but 24-bit is a more common depth), and hence many more colors than GIF. Its compression scheme isn't covered by patents. On the down side, some older Web browsers don't support PNG graphics. There's a Web site devoted to PNG at http://www.libpng.org/pub/png/.

  • JPEG: The Joint Photographic Experts Group format uses a lossy compression scheme, meaning that it can attain greater compression than a lossless format, but the compressed image may not exactly match the original when displayed. JPEG supports true-color (up to 24-bit) images.

As a general rule, a lossless format is best for line art and cartoon-like images that use just a handful of colors. These images tend to acquire ugly-looking artifacts when converted to JPEG format. Digitized photos, by contrast, usually look best in a true-color format (PNG or JPEG), and JPEG's lossy compression scheme doesn't impact such images as much. Therefore, JPEGs are common for digitized photos displayed on the Web.
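The trade-off between lossless and lossy compression can be seen in a toy experiment. The sketch below is not how GIF or JPEG actually work: it uses Python's standard zlib module (the deflate method, which PNG also uses) as the lossless scheme, and simulates a lossy scheme by discarding the low four bits of each value before compressing. The "image" is synthetic data invented for this example, a smooth ramp with some noise, loosely resembling a row of a digitized photo.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic 4096-value grayscale data: a repeating ramp plus noise.
pixels = bytes((i % 64) * 3 + rng.randrange(32) for i in range(4096))

# Lossless: compress, then recover the input exactly.
lossless = zlib.compress(pixels, 9)
restored = zlib.decompress(lossless)

# Toy lossy step: throw away the low 4 bits of every value, then compress.
quantized = bytes(p & 0xF0 for p in pixels)
lossy = zlib.compress(quantized, 9)
approx = zlib.decompress(lossy)

# Largest difference between an original value and its lossy version.
max_error = max(abs(a - b) for a, b in zip(pixels, approx))

print("original bytes:", len(pixels))
print("lossless bytes:", len(lossless), "exact match:", restored == pixels)
print("lossy bytes:   ", len(lossy), "exact match:", approx == pixels)
print("largest per-value error:", max_error)
```

The lossy version is smaller because the discarded detail no longer has to be stored, at the cost of a small error in each value. Real JPEG compression makes a more sophisticated version of the same bargain.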

When you use JPEG, your graphics package will give you an option for a compression level. You can save your graphics file with little compression, which produces a large but good-looking image, or use a great deal of compression, which produces a much smaller file that degrades more in quality. The exact scale used to describe the level of compression varies from one package to another, but a 1-100 scale is not uncommon, with 100 representing the best quality (and the least compression). Most images you're likely to put on the Web look acceptable at a moderate setting on such a scale (say, around 50), and compressing these images can help reduce the load on your Web server and cause the images to appear more quickly in your users' Web browsers. You may want to experiment with different types of graphics files to learn what compression level works best for you.

Tools for Producing Web Pages

Although you can create Web pages by hand by editing the raw HTML in a text editor and using separate tools like The GIMP (http://www.gimp.org) to create or edit graphics, many Web page designers prefer to use GUI HTML design tools. These tools let you type in and edit text much as you can in a what-you-see-is-what-you-get (WYSIWYG) word processor, using buttons or special keystrokes to indicate centering, bold text, new paragraphs, and so on. This approach is certainly convenient, and Apache doesn't really care how you generate your files, so from a server operation point of view, there's no reason to avoid such tools. One exception is that Microsoft's FrontPage can create Web pages that depend on special server extensions, so it's best to avoid it when using Apache.

NOTE

Creating Web pages with a design tool isn't normally a problem for Apache, but some creation tools include interfaces to automatically upload a Web page to a Web server. These upload features might not work with Apache, at least not directly. You may need to save your Web pages in local files, then transfer them by floppy, FTP, or some other means to the Web server computer.


Examples of Web page creation tools include the following:

  • Word processors: Many modern GUI word processors include a feature to export documents as HTML, or special HTML formatting modes. Such HTML exports may lose some formatting features if the files were generated as normal word processor documents. This can be a convenient way to generate HTML documents if you're already familiar with a word processor that supports such a feature. Linux word processors with HTML export capabilities include Applix Words, StarOffice, and WordPerfect.

  • Web browsers: Many Web browsers, including Netscape for Linux, come with document-creation modules. As a general rule, these are more finely tuned to the needs of Web page design than are word processors, but if you're already familiar with a word processor, the browser tools represent another program to master.

  • Standalone Web page creation tools: These tools are designed from the ground up to do nothing but create Web pages. Examples in Linux include ASHE (http://www.cs.rpi.edu/pub/puninj/ASHE/), August (http://www.lls.se/~johanb/august/), Bluefish (http://bluefish.openoffice.nl), and WebSphere (http://www-4.ibm.com/software/webservers/hpbuilder/). Some of these are very basic tools, whereas others are extremely complex.

If you use a Web page development tool, you should be aware of the limitations of these tools. Because of the nature of the Web, no two browsers are likely to display the same page in precisely the same way, but working with these tools makes it easy to overlook this fact. If the tool creates HTML that's optimized for particular browsers, your Web site's visitors may find your site difficult to read because of the assumptions your HTML editor made.

Web Page Design Tips

Some Web designers like to use HTML features to their fullest, thus creating a layout that can be almost as complex as anything that could be created on a printed page. There are drawbacks to using the more advanced HTML features, though. Specifically, it's impossible to predict precisely how a given browser will handle a code. Indeed, even the codes in Listing 20.2 aren't entirely consistent in their application; as noted in the preceding descriptions, different browsers respond differently to the various codes used to center text, for instance. Font specifications work only if the font is installed on the client's computer; if it's not, the usual result is a fallback to an ugly default, such as Courier. Color specifications may interact poorly with a user's own color choices. (One particularly annoying error is specifying a background color without specifying a text color. If you specify a white background color but no text color, a user who has defaults set to white text on black background will be unable to read your page. Listing 20.2 specifies background and foreground colors, but it doesn't specify link colors, which can also be important in this equation.)
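The color pitfall just described is easy to check for mechanically. The following sketch uses Python's standard html.parser module to report which color parameters a <BODY> code leaves unset once it sets at least one of them. The function name missing_body_colors is hypothetical, and the parameter list (BGCOLOR, TEXT, LINK, VLINK, ALINK) is the set assumed for this check.

```python
from html.parser import HTMLParser

# <BODY> color parameters this sketch looks for.
COLOR_ATTRS = ("bgcolor", "text", "link", "vlink", "alink")


class BodyColorCheck(HTMLParser):
    """Record which color parameters the first <BODY> code leaves unset."""

    def __init__(self):
        super().__init__()
        self.missing = None  # stays None until a <BODY> code is seen

    def handle_starttag(self, tag, attrs):
        if tag == "body" and self.missing is None:
            present = {name for name, _value in attrs}
            if present & set(COLOR_ATTRS):
                # Some colors are set, so any unset ones fall back to the
                # user's defaults and may clash with the page's choices.
                self.missing = [a for a in COLOR_ATTRS if a not in present]
            else:
                # No colors set at all: the user's defaults apply throughout.
                self.missing = []


def missing_body_colors(html_text):
    """Return the unset color parameters, or [] if there is nothing to flag."""
    checker = BodyColorCheck()
    checker.feed(html_text)
    checker.close()
    return checker.missing or []


print(missing_body_colors('<BODY BGCOLOR="#FFFFFF" TEXT="#000000">'))
print(missing_body_colors('<BODY BGCOLOR="#FFFFFF">'))
print(missing_body_colors('<BODY>'))
```

Applied to the <BODY> code from Listing 20.2, the check reports the three link-color parameters. A page that sets no colors at all is not flagged, because it relies consistently on the user's own defaults.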

Because Web browsers vary wildly, it's best to test your Web pages on multiple browsers. At the very least, you should test on both Netscape Navigator and Microsoft Internet Explorer. If possible, you should test on multiple versions of these browsers. Other browsers that are popular, particularly in the Linux community, include Mozilla (http://www.mozilla.org, an open source cousin to Netscape Navigator), Opera (http://www.opera.com), Konqueror (a part of the KDE project), and Lynx (http://lynx.browser.org, a text-based Web browser). Lynx is particularly important if you want your site to be accessible to all users. Because it's text-based, it will turn up problems you might not notice in a GUI browser, but that might be important to somebody who uses Lynx, or to a visually impaired person who uses a speech synthesizer with a computer. Also, keep in mind that many (perhaps most) of your Web server users won't be using Linux. On Windows, Internet Explorer is the most popular browser, but others (including many of the preceding browsers) are available. MacOS, BeOS, OS/2, and many other platforms all sport their own browsers, some of which are shared with other platforms and some of which are not.

TIP

You can examine your server log files, as described shortly, to determine what types of browsers are most often used with your Web site.
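As a rough sketch of this idea, the following Python code tallies browser identification strings from access log lines. It assumes the server writes logs in the combined log format, in which each line ends with the quoted referrer and the quoted user-agent string; the sample lines, addresses, and the function name count_user_agents are invented for illustration.

```python
import re
from collections import Counter

# In the combined log format, a line ends with: "referrer" "user-agent"
LOG_TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')

# Hypothetical log lines, made up for this example.
SAMPLE_LOG = """\
192.168.1.5 - - [10/Oct/2002:13:55:36 -0500] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/4.79 [en] (X11; U; Linux 2.4.18 i686)"
192.168.1.9 - - [10/Oct/2002:13:56:02 -0500] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
192.168.1.9 - - [10/Oct/2002:13:56:03 -0500] "GET /graphics/logo.jpg HTTP/1.1" 200 10480 "http://www.threeroomco.com/index.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
192.168.1.7 - - [10/Oct/2002:14:01:44 -0500] "GET /index.html HTTP/1.0" 200 2326 "-" "Lynx/2.8.4rel.1 libwww-FM/2.14"
"""


def count_user_agents(lines):
    """Count how many requests each user-agent string made."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if match:  # skip lines that don't end with the two quoted fields
            counts[match.group("agent")] += 1
    return counts


counts = count_user_agents(SAMPLE_LOG.splitlines())
for agent, hits in counts.most_common():
    print(hits, agent)
```

Counting requests rather than visitors overstates browsers that fetch many images per page, so treat the totals as a rough guide to which browsers deserve testing.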




Advanced Linux Networking
ISBN: 0201774232
Year: 2002
Pages: 203
