Now that you know how to peer into existing HTML files and create your own, the next step is to understand what goes inside the average HTML file. It all revolves around a single concept: tags.
HTML tags are the formatting instructions that tell the browser how to transform ordinary text into something that's visually appealing. If you were to take all the tags out of an HTML document, you'd be left with nothing more than plain, unformatted text.
You can recognize a tag by looking for the angle brackets, which are two special characters that look like this: < >. The angle brackets contain a code. This code is for the browser's eyes only, and it's never shown to Web surfers (unless they use the View Source trick to peek at the HTML). Essentially, the code is an instruction that conveys some information to the browser about how it should format the text that follows the code.
For example, one simple tag is the <b> tag, which stands for bold. When the browser encounters this tag, it switches on boldface formatting, which affects all the text that appears after this tag. Here's an example:
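A minimal sketch of such a line follows. The sentence wording is made up for illustration, and the tag is deliberately left unclosed to show that it affects everything after it:

```html
This text is normal. <b>This text is bold.
```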
In a browser, you don't see the <b>. You're just left with the end result: all the text that follows the tag appears in boldface.
As you can see, the browser has a fairly simple job. It scans through an HTML document, looking for tags and switching on and off various formatting settings. It sends everything else (everything that isn't a tag) straight to the Web browser window.
Many tags come in pairs. That means there's a starting tag and an ending tag. The end tag marks the end of the instruction that was given by the start tag. In the bold text example, that means the end tag switches off the bold formatting, returning the text to normal.
End tags are easy to recognize. They always look the same as the start tag, except they start with the characters </ instead of <. So the end tag for bold formatting is </b>. Here's an example:
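As an illustration (the sentence wording is assumed, not taken from the original), a start tag and its matching end tag might wrap one sentence like this:

```html
This text is normal. <b>This text is bold.</b> This text is normal again.
```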
The browser displays only the text between the two tags in bold. The text that comes after </b> returns to normal.
This example demonstrates another important principle in how a browser works. The browser always processes the tags in order, based on where they show up in your text. To get the bold formatting in the right place, you need to make sure you position the <b> and </b> tags appropriately.
It's considered good HTML style to always use tags in pairs. If you don't, it could conceivably confuse some browsers (and anyway, it's lazy). To get into the right habit, it helps to think of the start and end tags as a container into which you insert some text. In other words, when you use the <b> and </b> tags, you aren't exactly telling the browser to turn bold formatting on and off. More accurately, you're telling it to bold a specific piece of text.
Of course, life wouldn't be much fun (and computer books wouldn't be nearly as thick) without exceptions. When you get right down to it, there are really two types of tags: container tags and standalone tags.
The container tag is, by far, the most common type of tag. With a container tag, you're usually applying some sort of formatting that affects only the content that's nestled in between the start and the end tags. The <b> tag is a container tag, and should always be accompanied by a </b>.
There are some tags that don't come in pairs. These standalone tags don't turn formatting on or off. Instead, they insert something on the page, like an image. One example is the <hr> tag, which inserts a horizontal line on the page. Standalone tags are often called empty tags because there's no way to put any text inside them.
Figure 2-4 puts it in perspective.
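The two kinds of tags can also be seen side by side in a short fragment. This is only a sketch with made-up text, using the <b> and <hr> tags described above:

```html
<!-- Container tag: the formatting applies only to the text between <b> and </b> -->
<b>This sentence sits inside a container tag.</b>

<!-- Standalone tag: it inserts a horizontal line and has no end tag -->
<hr>

This sentence appears below the line, with no special formatting.
```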
In the previous example, you saw how to apply a simple <b> tag for bold formatting. Between the <b> and </b> tags, you place the text that you want to make bold. However, text isn't the only thing that you can put between a start and an end tag. You can also nest one tag inside another. In fact, nesting tags is one of the basic building block techniques of Web pages. Nesting lets you apply more detailed formatting (for example, bold, underlined, italicized text), by piling in all the tags you need in the same place. Nesting is also required for more complicated structures (like bulleted lists).
To see nesting in action, you need another tag to work with. For the next example, consider both the familiar <b> tag and the <i> tag, which lets you italicize text.
What happens if you want to make a piece of text both bold and italicized? HTML doesn't include a tag for this purpose, so you need to combine the two. Here's an example:
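One way to combine the two tags is sketched below. The sentence is illustrative, with <i> nested inside <b>:

```html
This <b><i>word</i></b> has italic and bold formatting.
```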
When the browser chews through this scrap of HTML, it displays the text inside both tags in bold italics and leaves the rest of the sentence unformatted.
Incidentally, it doesn't matter if you reverse the order of the <i> and <b> tags. The following HTML produces exactly the same result.
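A sketch of the reversed nesting, again with made-up text, puts <b> inside <i> and still closes the inner tag first:

```html
This <i><b>word</b></i> has italic and bold formatting.
```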
However, you should always make sure that you close tags in the reverse order that you opened them. In other words, if you apply italic formatting and then bold formatting, you should always switch off bold formatting first, and then italic formatting next. Here's an example that breaks this rule:
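An illustrative fragment that breaks the rule might look like the following (the wording is assumed):

```html
<!-- Incorrect: <i> is opened first, so it should be closed last -->
This <i><b>word</i></b> has italic and bold formatting.
```

Here </i> arrives while <b> is still open, so the two tags overlap instead of nesting cleanly.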
FREQUENTLY ASKED QUESTION
Telling the Browser to Ignore a Tag
What if I really do want the text "<b>" to appear on my Web page?
The tag system works great until you actually want to use an angle bracket (< or >) in your text. Then you're in a tricky position.
For example, imagine you want to write the following bit of text as part of a remarkable insight you've achieved:
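A line along these lines fits the description. The exact wording is an assumption, and the only part that matters is the bare < character in the middle of ordinary text:

```html
The expression 5 < 2 is clearly false, because 5 is bigger than 2.
```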
When the browser reaches the less than (<) symbol, it becomes utterly bewildered. Its first instinct is to assume you're starting a tag, and that the text following it ("2 is clearly false...") is part of a long tag name. Obviously, this isn't what you intended. The end result is unpredictable, but usually the text after the < character disappears into a nonexistent tag.
To solve this problem, you need to replace angle brackets with the corresponding HTML character entity. Character entities always begin with an ampersand (&) and end with a semicolon (;). The character entity for the less than symbol is &lt; because the lt stands for "less than." Similarly, &gt; is the character entity for the greater than symbol.
Here's the corrected example:
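A sketch of such a line, assuming the illustrative sentence "5 < 2 is clearly false", writes the less than symbol as a character entity:

```html
The expression 5 &lt; 2 is clearly false, because 5 is bigger than 2.
```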
In your text editor, this doesn't look like what you want. However, when the browser interprets this document, it automatically changes the &lt; into a < character, without confusing it with a tag. You'll learn more about character entities at the end of this chapter.
Most Web browsers are savvy enough to figure out what you're trying to do and give you the right result. However, in some cases, violating this rule can cause different browsers to render the same document in different ways. To avoid these glitches, always close your tags in the reverse order that you open them.
Finally, it's worth noting that HTML gives you many more complex ways to nest tags. For example, you can nest one tag inside another, and then nest another tag inside that one, and so on, indefinitely. Just to give you some ideas, consider the following example, which uses a combination of italic, bold, and underline formatting with the <i>, <b>, and <u> tags.
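A possible three-level nesting is sketched below with made-up text. Each phrase names the formatting that is in effect at that point:

```html
<!-- Tags are closed in the reverse order they were opened: </u>, then </b>, then </i> -->
<i>Italic, <b>bold italic, <u>bold italic underlined,</u> bold italic again,</b> and plain italic.</i>
```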
If you follow through all the tags, you'll discover that the formatting of each phrase depends on which tags are open at that point, which produces a dizzying line of text.
To break down complex snippets of HTML like this, it's often handy to use a tree model. You'll use the tree model later in this chapter to analyze a complete HTML document.