2.2. The HTML Tag

Now that you know how to peer into existing HTML files and create your own, the next step is to understand what goes inside the average HTML file. It all revolves around a single concept: tags.

HTML tags are the formatting instructions that tell the browser how to transform ordinary text into something that's visually appealing. If you were to take all the tags out of an HTML document, you'd be left with nothing more than plain, unformatted text.


Note: Technically, HTTP (HyperText Transfer Protocol) is the low-level communication system that allows two computers to exchange data over the Internet. If you were to apply the analogy of a phone conversation, the telephone line would use HTTP, and the juicy tidbits of gossip you're exchanging with your Aunt Martha would be the HTML documents.

Figure 2-3. The address bar indicates where the Web page you're viewing is really located (in geek-speak this is known as the file path). If you see "http://" your page is on a Web server somewhere out on the Internet (top). If you're looking at a Web page on your own computer, you'll just see an ordinary file path instead (middle, showing a Windows PC), or you'll see a URL that starts with the prefix "file:///" (bottom, showing a Mac). It all depends on the browser and operating system you're using.


2.2.1. What's in a Tag

You can recognize a tag by looking for the angle brackets, which are two special characters that look like this: < >. The angle brackets contain a code. This code is for the browser's eyes only, and it's never shown to Web surfers (unless they use the View Source trick to peek at the HTML). Essentially, the code is an instruction that conveys some information to the browser about how it should format the text that follows the code.

For example, one simple tag is the <b> tag, which stands for bold. When the browser encounters this tag, it switches on boldface formatting, which affects all the text that appears after this tag. Here's an example:

This text isn't bold. <b>This text is bold.

In a browser, you don't see the <b>. You're just left with the end result, in which the second sentence appears in boldface:

This text isn't bold. This text is bold.

As you can see, the browser has a fairly simple job. It scans through an HTML document, looking for tags and switching on and off various formatting settings. It sends everything else (everything that isn't a tag) straight to the Web browser window.


Note: Adding tags to ordinary text is known as marking up a document, and the tags themselves are known as HTML markup. When you look at raw HTML, you may be interested in looking at the content (the text that's nestled between the tags), or the markup (the HTML tags themselves).

Many tags come in pairs. That means there's a starting tag and an ending tag. The end tag marks the end of the instruction that was given by the start tag. In the bold text example, that means the end tag switches off the bold formatting, returning the text to normal.

End tags are easy to recognize. They always look the same as the start tag, except they start with the characters </ instead of <. So the end tag for bold formatting is </b>. Here's an example:

This isn't bold. <b>Pay attention!</b> Now we're back to normal.

Which the browser displays as follows, with only "Pay attention!" in boldface:

This isn't bold. Pay attention! Now we're back to normal.

This example demonstrates another important principle in how a browser works. The browser always processes the tags in order, based on where they show up in your text. To get the bold formatting in the right place, you need to make sure you position the <b> and </b> tags appropriately.

2.2.2. Container Tags and Standalone Tags

It's considered good HTML style to always use tags in pairs. If you don't, it could conceivably confuse some browsers (and anyway, it's lazy). To get into the right habit, it helps to think of the start and end tags as a container into which you insert some text. In other words, when you use the <b> and </b> tags, you aren't exactly telling the browser to turn bold formatting on and off; more accurately, you're telling it to bold a specific piece of text.

Of course, life wouldn't be much fun (and computer books wouldn't be nearly as thick) without exceptions. When you get right down to it, there are really two types of tags:

  • Container tags

    The container tag is, by far, the most common type of tag. With a container tag, you're usually applying some sort of formatting that affects only the content that's nestled in between the start and the end tags. The <b> tag is a container tag, and should always be accompanied by a </b>.

  • Standalone tags

    There are some tags that don't come in pairs. These standalone tags don't turn formatting on or off. Instead, they insert something on the page, like an image. One example is the <hr> tag, which inserts a horizontal line on the page. Standalone tags are often called empty tags because there's no way to put any text inside them.

Figure 2-4 puts it in perspective.

Figure 2-4. Top: This snippet of HTML shows both a container tag and a standalone tag.
Bottom: The browser shows the resulting Web page.
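
As a rough stand-in for the two tag types side by side, here's a minimal sketch. The wording is an invented example and isn't necessarily the snippet shown in the figure:

```html
<!-- Assumed example text; only the tags matter here. -->
This line has one <b>bold</b> word, thanks to a container tag.
<hr>
This line appears below the horizontal line drawn by the standalone tag.
```

In a browser, the word between <b> and </b> shows up in boldface, the <hr> tag draws a horizontal line across the page, and the second sentence appears underneath that line. Notice that <hr> has no end tag and nothing inside it.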



Note: Standalone tags sometimes include a slash character, like this: <hr /> (sort of like an opening and a closing tag rolled into one). This syntax is handy, because it clearly indicates that you have a standalone tag on your hands. It isn't official HTML, but it's used for a new standard called XHTML, which you'll learn about at the end of this chapter (Section 2.4).

2.2.3. Nesting Tags

In the previous example, you saw how to apply a simple <b> tag for bold formatting. Between the <b> and </b> tags, you place the text that you want to make bold. However, text isn't the only thing that you can put between a start and an end tag. You can also nest one tag inside another. In fact, nesting tags is one of the basic building block techniques of Web pages. Nesting lets you apply more detailed formatting (for example, bold, underlined, italicized text), by piling in all the tags you need in the same place. Nesting is also required for more complicated structures (like bulleted lists).

To see nesting in action, you need another tag to work with. For the next example, consider both the familiar <b> tag and the <i> tag, which lets you italicize text.

The question is, what happens if you want to make a piece of text both bold and italicized? HTML doesn't include a tag for this purpose, so you need to combine the two. Here's an example:

This <b><i>word</i></b> has italic and bold formatting.

When the browser chews through this scrap of HTML, it produces text that looks like this, with "word" in bold italics:

This word has italic and bold formatting.

Incidentally, it doesn't matter if you reverse the order of the <i> and <b> tags. The following HTML produces exactly the same result.

This <i><b>word</b></i> has italic and bold formatting.

However, you should always make sure that you close tags in the reverse order that you opened them. In other words, if you apply italic formatting and then bold formatting, you should always switch off bold formatting first, and then italic formatting next. Here's an example that breaks this rule:

This <i><b>word</i></b> has italic and bold formatting.

FREQUENTLY ASKED QUESTION
Telling the Browser to Ignore a Tag

What if I really do want the text "<b>" to appear on my Web page?

The tag system works great until you actually want to use an angle bracket (< or >) in your text. Then you're in a tricky position.

For example, imagine you want to write the following bit of text as part of a remarkable insight you've achieved:

The expression 5 < 2 is clearly false,
because 5 is bigger than 2.

When the browser reaches the less than (<) symbol, it becomes utterly bewildered. Its first instinct is to assume you're starting a tag, and that the text following it ("2 is clearly false...") is part of a long tag name. Obviously, this isn't what you intended. The end result is unpredictable, but usually the text after the < character disappears into a nonexistent tag.

To solve this problem, you need to replace angle brackets with the corresponding HTML character entity. Character entities always begin with an ampersand (&) and end with a semicolon (;). The character entity for the less than symbol is &lt; because the lt stands for "less than." Similarly, &gt; is the character entity for the greater than symbol.

Here's the corrected example:

The expression 5 &lt; 2 is clearly false,
because 5 is bigger than 2.

In your text editor, this doesn't look like what you want. However, when the browser interprets this document, it automatically changes the &lt; into a < character, without confusing it with a tag. You'll learn more about character entities in Section 2.3.5.3 (at the end of this chapter).
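
The same trick answers the question at the top of this box. As a quick sketch (the sentence is an invented example), here's how you could show the literal text "<b>" and still use a real <b> tag later in the same line:

```html
<!-- &lt;b&gt; is displayed as the characters <b>; the real <b>...</b> pair still formats text. -->
Use the &lt;b&gt; tag to make text <b>bold</b>.
```

The browser displays the first "<b>" as ordinary characters and treats only the second one as a tag, so just the last word ends up in boldface.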


When tags are closed in the wrong order (as in the <i><b>word</i></b> example shown earlier), most Web browsers are savvy enough to figure out what you're trying to do and give you the right result. However, in some cases, violating this rule can cause different browsers to render the same document in different ways. To avoid these glitches, always close your tags in the reverse order that you open them.
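
The reverse-order rule works the same way no matter how many tags you pile up. Here's a minimal sketch with three of them, using <u> for underlining (the sentence itself is made up for illustration):

```html
<!-- Opened in the order b, u, i; closed in the reverse order i, u, b. -->
This <b><u><i>phrase</i></u></b> is bold, underlined, and italicized.
```

A handy way to check your work is to read the tags from the inside out: each end tag should match the start tag nearest to it that hasn't been closed yet.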

Finally, it's worth noting that HTML gives you many more complex ways to nest tags. For example, you can nest one tag inside another, and then nest another tag inside that one, and so on, indefinitely. Just to give you some ideas, consider the following example, which uses a combination of italic, bold, and underline formatting with the <i>, <b>, and <u> tags.

<u>The <b>easiest</b> way to <b>confuse</b> a Web surfer is with <i>too much
<b>formatting</b></i>.</u>

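If you follow through all the tags, you'll discover that this example produces the following dizzying line of text. The whole line is underlined, "easiest" and "confuse" are bold, "too much formatting" is italic, and "formatting" is both bold and italic:

The easiest way to confuse a Web surfer is with too much formatting.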

To break down complex snippets of HTML like this, it's often handy to use a tree model. You'll use the tree model later in this chapter to analyze a complete HTML document.
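
As a rough preview of the idea, you can get a tree-like view just by indenting the formatting example above, so that each tag's contents sit one level deeper than the tag itself. This layout is only a reading aid sketched here, not a required way to write HTML:

```html
<!-- Same markup as the example above, indented to show which tag contains which. -->
<u>
  The
  <b>easiest</b>
  way to
  <b>confuse</b>
  a Web surfer is with
  <i>
    too much
    <b>formatting</b></i>.
</u>
```

Because browsers collapse runs of spaces and line breaks in ordinary text into a single space, this indented version displays the same line as the compact one. The indentation makes it easy to see that everything lives inside <u>, and that the last <b> sits inside <i>.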


Tip: If you're a graphic-design type, you're probably itching to get your hands on more powerful formatting tags to change alignment, spacing, and fonts. Unfortunately, in the Web world you can't always control everything you want. Chapter 5 has the lowdown, and Chapter 6 introduces the best solution (style sheets).


Creating Web Sites: The Missing Manual
ISBN: B0057DA53M
EAN: N/A
Year: 2003
Pages: 135