Dodging the Limitations of HTML | Cascading Style Sheets: Designing for the Web (3rd Edition)

HTML is a simple, easy-to-learn markup language designed for hypertext documents on the Web. In fact, a computer-literate person can learn to write basic HTML in less than a day. This simplicity is one reason for the huge success of the Web.

From the beginnings of HTML, Web-page designers have tried to sidestep its stylistic limitations. Their intentions have been the best (to improve the presentation of documents), but the techniques have often had unfortunate side effects. Typically, the techniques work for some of the people some of the time, but never for all the people all the time. They include the following:

Using proprietary HTML extensions
Converting text into images
Placing text in tables
Writing a program instead of using HTML

We discuss these techniques and their side effects in the next sections.

Proprietary HTML Extensions

One way to sidestep HTML's limitations has been for browser vendors to create their own tags that give designers who use their browser a little more control over the appearance of a Web page. At some point, it seemed that every new version of a browser introduced a few new elements that designers could play with. For example, Netscape introduced the CENTER element in 1994 to allow text to be centered on the screen, and more recently, the SPACER element for, among other things, indenting the first line of a paragraph. Microsoft introduced the MARQUEE element in 1995 to make text slide across the screen. (Chapter 17, "Tables," shows more extensions with their CSS replacements.)

But these HTML extensions have their problems. First, they are not universal. Although the W3C officially added CENTER to HTML 3.2 to avoid problems with browsers behaving differently at a time when CSS was not ready, the others remain specific to a particular browser.

Another problem is that the extensions are meaningless on non-visual browsers, such as speech browsers (that read pages aloud) or Braille browsers. Some of the extensions, such as the FONT element, won't even work on certain visual browsers, such as the text-only browser Lynx or browsers on hand-held devices. The standard HTML elements, on the other hand, are all designed to be device-independent. An element such as EM ("emphasis"), which is usually shown as italic text on graphical browsers, can be rendered underlined or in reverse video by Lynx or with a more emphatic voice by a speech browser.

Luckily, CSS offers more powerful alternatives to these HTML extensions, as we will show in this book. Moreover, the CSS equivalents are standardized by W3C, which means all major browser makers agree on them. CSS also offers control over non-visual presentations, which none of the extensions do.
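As a rough sketch of what such alternatives can look like, the following small document uses standard CSS properties where a designer might otherwise have reached for CENTER, SPACER, or FONT. The class names (title and intro), the chosen values, and the sample text are placeholders of our own, not taken from any real page.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>CSS in place of proprietary extensions</title>
  <style type="text/css">
    /* Instead of CENTER: center the text of the heading */
    h1.title { text-align: center }

    /* Instead of SPACER: indent the first line of the paragraph */
    p.intro { text-indent: 2em }

    /* Instead of FONT: choose a typeface and a color for the paragraph */
    p.intro { font-family: Georgia, serif; color: navy }
  </style>
</head>
<body>
  <h1 class="title">A placeholder heading</h1>
  <p class="intro">The first line of this placeholder paragraph is
  indented, and the whole paragraph is set in a serif typeface.</p>
</body>
</html>
```

Because the markup contains only the standard H1 and P elements, a text-only or speech browser can ignore the style sheet and still present the heading and the paragraph sensibly.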

The availability of CSS has even made possible the removal from HTML of the oldest extensions. Now that there is a better place to put layout information, elements such as CENTER and FONT are no longer needed in HTML. In HTML 4 (and XHTML 1, its cousin in XML), which is the current version at the time of writing, they have been relegated to a special "transitional module" and are no longer part of the main standard. In the next version of HTML, they may disappear completely.

Converting Text into Images

A second way by which designers have sought to get around the limitations of HTML has been to make text into images. With an image, the designer can fully control colors, spacing, and fonts, among other features. Then, the designer simply inserts the appropriate link in the document where the image is to appear on the page, thereby linking the image's file to the page. When the browser displays the page, the text in the form of an image appears on the page.

This method, too, has downsides: It compromises accessibility to a page, and it requires readers to wait longer for documents to display.

Accessibility is the ability of people or programs to use the information on a page. Accessibility of a page is compromised in two ways when you use images to hold text. First, certain types of software called robots (also known as crawlers or spiders) roam the Web (so to speak) seeking what's out there and then creating and updating indexes that users can use to find Web pages. Indexing services, such as AltaVista, AllTheWeb, and Google, use robots to build their indexes.

Robots work by loading a Web page, and then automatically loading all the pages that are linked from that page, and then loading all the pages that are linked from those pages, and so on, usually for the purpose of creating a database of all the words on all the pages. When a user searches for a particular word or set of words, all the pages containing that selection are made available. Robots, however, cannot read images, so they just skip them. Hence, they simply miss text that is part of an image.

Accessibility of your page is compromised in a second way. Not all users have a browser that provides a graphical user interface (GUI) such as that provided by Navigator and Explorer. Some browsers can display only text, not images. Also, some people may have configured their browser not to display images. In both cases, the user loses the content of those images. Some people do have a graphical browser that displays images, but need to set the fonts to a large size to be able to read them. They will find that the text in the images is too small or doesn't have enough contrast.

Currently, the only way around these accessibility problems, apart from CSS, is to enclose a textual description of the image that robots and text-only browsers can use. In the latter, for example, the user would receive this textual description of the image rather than the image itself. It is not a great substitute for the real thing, but it's better than nothing.
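In HTML, a textual description of this kind is written in the ALT attribute of the IMG element. The fragment below is a minimal sketch; the file name welcome.png and the wording of the description are hypothetical.

```html
<!-- The heading text is drawn inside the image file;
     ALT repeats the same words as plain text for robots
     and for browsers that do not display images. -->
<h1><img src="welcome.png" alt="Welcome to our bakery"></h1>
```

A text-only browser would typically show the words "Welcome to our bakery" in place of the image.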

The second downside to using images to hold text is that images take longer to load and draw on the screen than text. The user may become impatient and back out of a page before it's loaded completely. Also, the preponderance of images as a substitute for attractive type can account for much of the reputed slowness of the Web to respond when drawing pages onscreen.

Placing Text in a Table

A third technique designers have used to bypass HTML's limitations is to put text into a table. Doing this enables the designer to control the layout of the text. For example, to add a margin of a certain width on the left side of a page, you would put the entire document inside a table and then add an empty column along the left side to create the "margin."

The downside (you knew there would be one): Not all browsers support tables, so pages that use tables do not display well on those browsers. Depending on how you use tables, the results on such browsers can be somewhere between "weird" and "disastrous."

The use of tables also complicates the writing of HTML. You have to add many more tags even for a simple table. The more complex the table or table structure is (you can create tables within tables to any depth you want), the more complex your code becomes.

Tables also have severe accessibility problems. Tables used for layout pose problems to programs that try to read pages without displaying them visually. For example, a browser that gives access to the Web over the phone (by reading the pages out loud) would indicate to the listener that it enters a table and then make some specific sound at the start of every cell, which is rather disturbing if the text isn't actually made up of tabular data. The voice browser has to do it that way, however, because it has to assume that a table contains data for which it is important to know the precise arrangement in rows and columns, such as price lists or sales figures. Browsers with a limited display area, such as browsers in mobile phones, Braille browsers, or browsers set to display text with a large font, have the same problem. They often display only one table cell at a time. Users won't like it when they have to navigate through the cells of a table that isn't one.

Used with care, tables can sometimes be the right solution. CSS can nearly always replace tables, so the designer has a choice: Is the arrangement in rows and columns a matter of style (and thus for CSS), or is it an intrinsic part of the structure of the text, one that even non-visual browsers need to know about?
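To make the choice concrete, here is a sketch of the left-margin example done both ways. The width of 100 pixels and the sample text are arbitrary values chosen for illustration.

```html
<!-- Version 1: a layout table. The empty first cell exists only
     to push the text to the right and fake a margin. -->
<table>
  <tr>
    <td width="100"></td>
    <td><p>The text of the document goes here.</p></td>
  </tr>
</table>

<!-- Version 2: the same margin expressed as style.
     This rule belongs in the HEAD of the document: -->
<style type="text/css">
  body { margin-left: 100px }
</style>

<!-- ...and the BODY then needs nothing but the paragraph itself: -->
<p>The text of the document goes here.</p>
```

In the second version, the markup says nothing about rows and columns, so a voice browser or a small-screen browser has no table to announce or to navigate.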

Writing a Program Instead of Using HTML

A fourth technique designers use to bypass HTML's limitations is to create a program that displays pages. Although it's more complex than any other alternative, this technique has the advantage of giving designers control over every pixel on the screen, something not even CSS style sheets can do. However, this technique shares some of the drawbacks of the previous three that we just discussed. A program cannot be searched by robots, and it cannot be used by text-only browsers. Furthermore, because the program is written in an actual programming language (which HTML is not), this technique is more difficult to learn. The program may contain a computer virus. It is questionable whether, 15 years from now, there will be computers that can run the program. (Examples of programming languages for creating Web documents are Java and JavaScript.)

Why should all this matter?

HTML has become a universal data format for publishing information. Thanks to its simplicity, anyone with a computer and Internet connection can publish in HTML without expensive DTP applications. Likewise, on the user side, HTML documents can be shown on various devices without the user having to buy proprietary software. Also, perhaps the strongest point in HTML's favor: It allows for electronic documents that have a much higher chance of withstanding the years than proprietary data formats. The methods of dodging the limitations of HTML undermine these benefits: The "extended HTML" that you all too often find on the Web is a complicated proprietary data format that cannot be freely exchanged. By allowing authors to express their desire for influence over document presentation, CSS will help HTML remain the simple little language that it's meant to be.

This is why we developed CSS.

There are aesthetic and commercial reasons why the Web needs a powerful style-sheet language. Today, placing a page on the Web is no longer just a matter of posting some text and hoping someone stumbles across it. Web pages have become an important means whereby people around the world can get together to share ideas, hobbies, interests, and more. The Web is also becoming an increasingly important medium for advertising products and services. A page needs to attract and stimulate as well as inform. It needs to stand out among the enormous and rapidly growing repertoire of pages that make up the Web. Aesthetics have become more important. The old HTML tools simply aren't enough for the Web-page designer who wants to make good-looking pages.

Let's get started. In the next section, we review the basics of writing HTML. In Chapter 2, we introduce CSS and show you how it works with HTML. From there, we lead you on an exploration of CSS and explain how to use it to create distinctive and manageable Web pages.