What HTML Is and What It Isn't
Take note of just one more thing before you dive into actually writing web pages. You should know what HTML is, what it can do, and most importantly what it can't do.
HTML stands for Hypertext Markup Language. HTML is based on the Standard Generalized Markup Language (SGML), a much larger document-processing system. To write HTML pages, you won't need to know a whole lot about SGML. However, it does help to know that one of the main features of SGML is that it describes the general structure of the content inside documents, rather than its actual appearance on the page or onscreen. This concept might be a bit foreign to you if you're used to working with WYSIWYG (What You See Is What You Get) editors like Adobe Dreamweaver or Microsoft FrontPage, so let's go over the information carefully.
HTML Describes the Structure of a Page
HTML, by virtue of its SGML heritage, is a language for describing the structure of a document, not its actual presentation. The idea here is that most documents have common elements, such as titles, paragraphs, and lists. Before you start writing, therefore, you can identify and define the set of elements in that document and give them appropriate names (see Figure 3.1).
Figure 3.1. Document elements.
If you've worked with word processing programs that use style sheets (such as Microsoft Word) or paragraph catalogs (such as FrameMaker), you've done something similar; each section of text conforms to one of a set of styles that are predefined before you start working.
HTML defines a set of common styles for web pages: headings, paragraphs, lists, and tables. It also defines character styles such as boldface and code examples. These styles are indicated inside HTML documents using tags. Each tag has a specific name and is set off from the content of the document using a notation that I'll get into a bit later.
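As a rough sketch of what such a tagged document looks like, here is a short fragment. The text and the file name are made up for illustration, and the tag notation itself is explained later; for now, just notice that each tag names what a piece of content is (a heading, a paragraph, a list, some bold text, a code example) and says nothing about how it should look.

```html
<!-- A heading -->
<h1>Growing Tomatoes</h1>

<!-- A paragraph containing a boldface character style -->
<p>Tomatoes need <b>full sun</b> and regular watering.</p>

<!-- A list with two items -->
<ul>
<li>Plant after the last frost.</li>
<li>Water at the base of the plant.</li>
</ul>

<!-- A paragraph containing a code example (hypothetical file name) -->
<p>Save the file as <code>tomatoes.html</code>.</p>
```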
HTML Does Not Describe Page Layout
When you're working with a word processor or page layout program, styles are not just named elements of a page; they also include formatting information such as the font size and style, indentation, underlining, and so on. So, when you write some text that's supposed to be a heading, you can apply the Heading style to it, and the program automatically formats that paragraph for you in the correct style.
HTML doesn't go this far. For the most part, HTML doesn't say anything about how a page looks when it's viewed. HTML tags just indicate that an element is a heading or a list; they say nothing about how that heading or list is to be formatted. As in the magazine example, where a layout person formats your article, deciding how big the heading should be and what font it should be in is the layout person's job, not yours. The only thing you have to worry about is marking which section is supposed to be a heading.
Although HTML doesn't say much about how a page looks when it's viewed, Cascading Style Sheets (CSS) enable you to apply advanced formatting to HTML tags. Many changes in HTML 4.0 favor the use of CSS. And XHTML, which is the current version of HTML, eliminates almost all tags that are associated with formatting in favor of Cascading Style Sheets. I'll talk about both XHTML and CSS later today.
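To give a feel for that division of labor, here is a minimal sketch of a page in which the HTML tags mark the structure and a single style sheet rule handles the formatting. The page text, the font, and the size are arbitrary choices made for illustration, and the document type declaration is left out to keep the example short.

```html
<html>
<head>
<title>Structure and Style</title>
<style type="text/css">
/* Formatting lives here, in the style sheet, not in the tags below */
h1 { font-family: Georgia, serif; font-size: 200%; }
</style>
</head>
<body>
<!-- The markup says only that this is a heading and that is a paragraph -->
<h1>Growing Tomatoes</h1>
<p>Tomatoes need full sun and regular watering.</p>
</body>
</html>
```

If you removed the style rule, the page would still contain the same heading and paragraph; only the way the browser draws the heading would change.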
Web browsers, in addition to providing the networking functions to retrieve pages from the Web, double as HTML formatters. When you read an HTML page into a browser such as Netscape or Internet Explorer, the browser interprets, or parses, the HTML tags and formats the text and images on the screen. The browser has mappings between the names of page elements and actual styles on the screen; for example, headings might be in a larger font than the text on the rest of the page. The browser also wraps all the text so that it fits into the current width of the window.
Different browsers running on diverse platforms might have various style mappings for each page element. Some browsers might use different font styles than others. For example, a browser on a desktop computer might display italics as italics, whereas a handheld device or mobile phone might use reverse text or underlining on systems that don't have italic fonts. Or it might put a heading in all capital letters instead of a larger font.
What this means to you as a web page designer is that the pages you create with HTML might look radically different from system to system and from browser to browser. The actual information and links inside those pages are still there, but the onscreen appearance changes. You can design a web page so that it looks perfect on your computer system, but when someone else reads it on a different system, it might look entirely different (and it might very well be entirely unreadable).
Why It Works This Way
If you're used to writing and designing documents that will wind up printed on paper, this concept might seem almost perverse. No control over the layout of a page? The whole design can vary depending on where the page is viewed? This is awful! Why on earth would a system work like this?
Remember in Lesson 1, "Navigating the World Wide Web," when I mentioned that one of the cool things about the Web is that it's cross-platform and that web pages can be viewed on any computer system, on any size screen, with any graphics display? If the final goal of web publishing is for your pages to be readable by anyone in the world, you can't count on your readers having the same computer systems, the same size screens, the same number of colors, or the same fonts that you have. The Web takes into account all these differences and enables all browsers and all computer systems to be on equal ground.
The Web, as a design medium, is not a new form of paper. The Web is an entirely different medium, with its own constraints and goals that are very different from working with paper. The most important rules of web page design, as I'll keep harping on throughout this book, are the following:
Throughout this book, I'll show you examples of HTML code and what they look like when displayed. In examples in which browsers display code very differently, I'll give you a comparison of how a snippet of code looks in two very different browsers. Through these examples, you'll get an idea for how different the same page can look from browser to browser.
Although this rule of designing by structure and not by appearance is the way to produce good HTML, when you surf the Web, you might be surprised that the vast majority of websites seem to have been designed with appearance in mind, usually appearance in a particular browser such as Microsoft Internet Explorer. Don't be swayed by these designs. If you stick to the rules I suggest, in the end, your web pages and websites will be even more successful simply because more people can easily read and use them.
How Markup Works
HTML is a markup language. Writing in a markup language means that you start with the text of your page and add special tags around words and paragraphs. The tags indicate the different parts of the page and produce different effects in the browser. You'll learn more about tags and how they're used in the next section.
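Here is a small sketch of that process using made-up text. The comment at the top shows the plain text you might start with; below it, the same words appear with tags added. The particular tags chosen here (one around the whole paragraph and one around a single word to emphasize it) are just for illustration.

```html
<!-- Plain text before markup:
     Water the seedlings every morning. Never let the soil dry out completely.
-->

<!-- The same text with tags added around the paragraph and around one word -->
<p>Water the seedlings every morning. <em>Never</em> let the soil dry out completely.</p>
```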
HTML has a defined set of tags you can use. You can't make up your own tags to create new styles or features. And just to make sure that things are really confusing, various browsers support different sets of tags. To further explain this, take a brief look at the history of HTML.
A Brief History of HTML Tags
The base set of HTML tags, the lowest common denominator, is referred to as HTML 2.0. HTML 2.0 is the old standard for HTML (a written specification for it is developed and maintained by the W3C) and the set of tags that all browsers must support. In the next few lessons, you'll primarily learn to use tags that were first introduced in HTML 2.0.
The HTML 3.2 specification was developed in early 1996. Several software vendors, including IBM, Microsoft, Netscape Communications Corporation, Novell, SoftQuad, Spyglass, and Sun Microsystems, joined with the W3C to develop this specification. The primary additions in HTML 3.2 included tables, applets, and text flow around images. HTML 3.2 also provided full backward compatibility with the existing HTML 2.0 standard.
The enhancements introduced in HTML 3.2 are covered later in this book. You'll learn more about tables in Lesson 8, "Building Tables." Lesson 11, "Integrating Multimedia: Sound, Video, and More," tells you how to use Java applets.
HTML 4.0, first introduced in 1997, incorporated many new features that gave designers greater control over page layout than HTML 2.0 and 3.2 did. Like the HTML 2.0 and 3.2 standards, the HTML 4.0 standard is maintained by the W3C.
Framesets (originally introduced in Netscape 2.0) and floating frames (originally introduced in Internet Explorer 3.0) became an official part of the HTML 4.0 specification. Framesets are discussed in more detail in Lesson 14, "Working with Frames and Linked Windows." HTML 4.0 also brought additional improvements to table formatting and rendering. By far, however, the most important change in HTML 4.0 was its increased integration with style sheets.
If you're interested in how HTML development is working and just exactly what's going on at the W3C, check out the pages for HTML at the Consortium's site at http://www.w3.org/pub/WWW/MarkUp/.
At one time, Microsoft and Netscape were releasing new versions of their browsers frequently, competing to see who could add the most compelling new features to HTML without waiting for the standards process to catch up. These days, browser releases are less frequent, and HTML is more "finished" than it was in the late nineties. Now developers must mostly concern themselves with slight differences in how browsers handle the HTML they support, rather than choosing between competing sets of features. Confused yet? You're not alone. The extra work involved in dealing with variations between browsers has been a headache for web developers for a very long time. Throughout this book, as I introduce each tag, I'll explain any browser-specific issues you'll run into.