Chapter 16. Essential XHTML

CONTENTS
  •  XHTML Versions
  •  XHTML Checklist
  •  XHTML Programming

Probably the biggest XML application today is XHTML, which is W3C's implementation of HTML 4.0 in XML. XHTML is a true XML application, which means that XHTML documents are XML documents that can be checked for well-formedness and validity.

There are two big advantages to using XHTML. First, HTML predefines all its elements and attributes, and that's not something you can change unless you use XHTML. Because XHTML is really XML, you can extend it with your own elements, and we'll see how to do that in the next chapter. Need <INVOICE>, <DELIVERY_DATE>, and <PRODUCT_ID> elements in your Web page? Now you can add them. (This aspect of XHTML isn't supported by the major browsers yet, but it's coming.) The other big advantage, as far as HTML authors are concerned, is that you can display XHTML documents in today's browsers without modification. That's the whole idea behind XHTML: it's supposed to provide a bridge between XML and HTML. XHTML is true XML, but you can use it today in browsers, and that has made it very popular.

Here's an example; this page is written in standard HTML:

<HTML>
    <HEAD>
        <TITLE>
            Welcome to my page
        </TITLE>
    </HEAD>
    <BODY>
        <H1>
            Welcome to HTML!
        </H1>
    </BODY>
</HTML>

Here's the same page, written in XHTML, with the message changed from Welcome to HTML! to Welcome to XHTML!:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>

I'll go through exactly what's happening here in this chapter.

You save XHTML documents with the extension .html to make sure that browsers treat those documents as HTML. This document produces the same result as the previous HTML document, except that this document says Welcome to XHTML! instead, as you can see in Figure 16.1.

Figure 16.1. An XHTML document in Netscape.

Take a look at this XHTML document; as you can see, it's true XML, starting with the XML declaration:

<?xml version="1.0"?>
    .
    .
    .

Next comes a <!DOCTYPE> element:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    .
    .
    .

This is just a standard <!DOCTYPE> element (strictly speaking, a document type declaration rather than an element); in this case, it indicates that the document element is <html>. Note the lowercase here: <html>, not <HTML>. All elements in XHTML (except the <!DOCTYPE> element) are lowercase. That's the XHTML standard, so if you're accustomed to using uppercase tag names, it'll take a little getting used to.

The DTDs that XHTML uses are public DTDs, created by the W3C. Here, the formal public identifier (FPI) for the DTD that I'm using is "-//W3C//DTD XHTML 1.0 Transitional//EN", which is one of several DTDs available, as we'll see. I'm also giving the URL for the DTD, which for this DTD is "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd".

Using an XHTML DTD, browsers can validate XHTML documents, at least theoretically (and, in fact, browsers such as Internet Explorer will read in the DTD and check the document against it, although as we've seen, you must explicitly check whether errors occurred because the browser won't announce them).

Note also that the URI for the DTD is at W3C itself: "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd". Now imagine 40 million browsers trying to validate XHTML documents all at the same time by downloading XHTML DTDs like this one from the W3C site: quite a problem. To avoid bottlenecks like this, you can copy the XHTML DTDs and store them locally (I'll give their URIs and discuss this in a few pages), or do without a DTD in your documents. However, my guess is that when we get fully enabled validating XHTML browsers, they'll have the various XHTML DTDs stored internally for immediate access, without having to download the XHTML DTDs from the Internet. (As it stands now, it takes Internet Explorer 10 to 20 seconds to download a typical XHTML DTD on a typical modem line.)

After the <!DOCTYPE> element comes the <html> element, which is the document element. It starts the actual document content:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    .
    .
    .

I'm using three attributes of this XHTML element here, as is usual:

  • xmlns defines an XML namespace for the document.

  • xml:lang sets the language for the document when it's interpreted as XML.

  • The standard HTML attribute lang sets the language when the document is treated as HTML.

Note in particular the namespace used for XHTML: http://www.w3.org/1999/xhtml, which is the official XHTML namespace. All the XHTML elements must be in this namespace.
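To see what the namespace declaration means to an XML processor, here's a minimal sketch in Python. Python's standard xml.etree.ElementTree parser is my choice for illustration only (nothing in XHTML requires it), and the helper names describe_root and all_in_xhtml_namespace are hypothetical. The parser reports each element name qualified by its namespace URI, and it reports xml:lang under the namespace reserved for the xml: prefix.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Namespace that XML reserves for the xml: prefix (used by xml:lang).
XML_NS = "http://www.w3.org/XML/1998/namespace"

PAGE = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>Welcome to my page</title>
    </head>
    <body>
        <h1>Welcome to XHTML!</h1>
    </body>
</html>"""


def describe_root(source):
    """Return the qualified tag name and language attributes of the root."""
    root = ET.fromstring(source)
    return {
        "tag": root.tag,
        "xml:lang": root.get("{%s}lang" % XML_NS),
        "lang": root.get("lang"),
    }


def all_in_xhtml_namespace(source):
    """True if every element in the document is in the XHTML namespace."""
    root = ET.fromstring(source)
    prefix = "{%s}" % XHTML_NS
    return all(element.tag.startswith(prefix) for element in root.iter())


if __name__ == "__main__":
    print(describe_root(PAGE))
    print(all_in_xhtml_namespace(PAGE))
```

Because the default namespace is inherited by child elements, declaring xmlns once on <html> is enough to place <head>, <title>, <body>, and <h1> in the XHTML namespace as well.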

The remainder of the page is very like the HTML document we saw earlier; the only real difference is that the tag names are now lowercase:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>

XHTML Versions

As you see in the <!DOCTYPE> element, I'm using the XHTML DTD that's called "XHTML 1.0 Transitional". That's only one of the XHTML DTDs available, although it's currently the most popular one. So, what XHTML DTDs are available, and what do they mean? That all depends on which version of XHTML you are using.

XHTML Version 1.0

The standard version of XHTML, version 1.0, is just a rewrite of HTML 4.0 in XML. You can find the W3C recommendation for XHTML 1.0 at http://www.w3.org/TR/xhtml1. Essentially, it's just a set of DTDs that provide validity checks for documents that are supposed to mimic HTML 4.0 (actually, HTML 4.01). The W3C has created several DTDs for HTML 4.0, and the XHTML DTDs are based on those, translated into straight XML. As with HTML 4.0, XHTML 1.0 has three versions, which correspond to three DTDs here:

  • The strict XHTML 1.0 DTD. The strict DTD is based on straight HTML 4.0 and does not include support for elements and attributes that the W3C considers deprecated. This is the version of XHTML 1.0 that the W3C hopes people will migrate to in time.

  • The transitional XHTML 1.0 DTD. The transitional DTD is based on the transitional HTML 4.0 DTD. This DTD has support for the many elements and attributes that were deprecated in HTML 4.0 but that are still popular, such as the <CENTER> and <FONT> elements. This DTD is also named the "loose" DTD. It is the most popular version of XHTML at the moment.

  • The frameset XHTML 1.0 DTD. The frameset DTD is based on the frameset HTML 4.0 DTD. This is the DTD you should work with when you're creating pages based on frames: in that case, you replace the <body> element with a <frameset> element. The DTD must reflect that, so you use the frameset DTD when working with frames. That's the difference between the XHTML 1.0 transitional and frameset DTDs: the frameset DTD replaces the <body> element with the <frameset> element.

Here are the actual <!DOCTYPE> elements you should use in XHTML for these various DTDs (strict, transitional, and frameset), including the URIs for these DTDs:

<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

Because I'm giving the DTDs' URIs here, you can copy them and cache a local copy if you want for faster access. For example, if you place the DTD files in a directory named DTD in your Web site, your <!DOCTYPE> elements might look more like this:

<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "DTD/xhtml1-strict.dtd">
<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "DTD/xhtml1-transitional.dtd">
<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
     "DTD/xhtml1-frameset.dtd">

If you cache these DTDs locally, there should be less of a bottleneck when XHTML becomes very popular and users try to download your documents.

XHTML Version 1.1

There's also a new version of XHTML available, version 1.1. This version is not yet in W3C recommendation form; it's a working draft. You can find the current working draft of XHTML 1.1 at http://www.w3.org/TR/xhtml11.

XHTML 1.1 is a strict version of XHTML, and it's clear that the W3C wants to wean HTML authors away from their loose ways into writing very tight XML. How far those HTML authors will follow is yet to be determined. XHTML 1.1 removes all the elements and attributes deprecated in HTML 4.0, and a few more as well.

<APPLET> Versus <OBJECT>

There's another interesting thing going on in XHTML 1.1: The W3C has long said that it wants to replace the <APPLET> and other elements with the Microsoft-supported <OBJECT> element. However, and surprisingly, <OBJECT> is missing from XHTML 1.1. And, surprise, the <APPLET> element is back.

XHTML 1.1 is so far ahead of the pack that many of the features that today's HTML authors and browsers use aren't supported there at all. Therefore, I'm going to stick to XHTML 1.0 transitional in the examples in this chapter and the next chapter. However, I'll also indicate which elements and attributes are supported by what versions of XHTML, including XHTML 1.1, throughout these chapters.

XHTML 1.0 Versus XHTML 1.1

You can find the differences between XHTML 1.0 and XHTML 1.1 at http://www.w3.org/TR/xhtml11/changes.html#a_changes.

When you want to use XHTML 1.1, here's the <!DOCTYPE> element you should use (there's only one XHTML 1.1 DTD, not three, as in XHTML 1.0):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
     "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

Another big difference between XHTML 1.1 and XHTML 1.0 goes beyond the support offered to various elements and attributes. XHTML is designed to be modular. In practice, this means that the XHTML 1.1 DTD is actually relatively short: it's a driver DTD, which inserts various other DTDs as modules. The benefit of modular DTDs is that you can omit the modules that your application doesn't support.

For example, if you're supporting XHTML 1.1 on a nonstandard device such as a PDA or even a cell phone or pager (the W3C has all kinds of big ideas for the future), you might not be able to support everything, such as tables or hyperlinks. With XHTML 1.1, all you need to do is to omit the DTD modules corresponding to tables and hyperlinks. (Several modules are marked as required in the XHTML 1.1 DTD, and those cannot be omitted.)

XHTML Basic

In fact, there's another version of XHTML that is also in the working draft stage: XHTML Basic. XHTML Basic is a very small subset of XHTML, reduced to a bare minimum so that it can be supported by devices considerably simpler than standard PCs. You can find the current working draft for XHTML Basic at http://www.w3.org/TR/xhtml-basic.

If you want to use XHTML Basic, here's the <!DOCTYPE> element you should use:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
     "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">

XHTML Checklist

The W3C has a number of requirements for documents before they can be called true XHTML documents. Here's the list of requirements that documents must meet:

  • The document must successfully validate against one of the W3C XHTML DTDs.

  • The document element must be <html>.

  • The document element, <html>, must set an XML namespace for the document, using the xmlns attribute. This namespace must be "http://www.w3.org/1999/xhtml".

  • There must be a <!DOCTYPE> element, and it must appear before the document element.
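Most of this checklist can be checked mechanically. Here's a minimal sketch in Python using the standard xml.dom.minidom module; the language and the function name checklist_problems are my own choices for illustration, not part of any W3C tool. Note the loud assumption: this sketch does not validate against a DTD (the first requirement), so it covers only the last three items, and it only tests that the public identifier begins with "-//W3C//DTD XHTML".

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"

GOOD = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head><title>Welcome to my page</title></head>
    <body><h1>Welcome to XHTML!</h1></body>
</html>"""


def checklist_problems(source):
    """Return a list of checklist items the document fails.

    Raises xml.parsers.expat.ExpatError if the text is not well-formed XML.
    DTD validation is NOT performed here.
    """
    document = minidom.parseString(source)
    problems = []

    root = document.documentElement
    if root.localName != "html":
        problems.append("document element is not <html>")
    if root.namespaceURI != XHTML_NS:
        problems.append("xmlns is not the XHTML namespace")

    # In well-formed XML, a <!DOCTYPE> can appear only before the root.
    doctype = document.doctype
    if doctype is None:
        problems.append("no <!DOCTYPE> before the document element")
    elif not (doctype.publicId or "").startswith("-//W3C//DTD XHTML"):
        problems.append("<!DOCTYPE> does not name a W3C XHTML DTD")

    return problems


if __name__ == "__main__":
    print(checklist_problems(GOOD))
```

An empty list means the document passed these three checks; it still might not be valid XHTML, which only validation against the DTD can establish.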

XHTML is designed to be displayed in today's browsers, and it works well (largely because those browsers ignore elements that they don't understand, such as <?xml?> and <!DOCTYPE>). However, because XHTML is also XML, a number of differences exist between legal HTML and legal XHTML.

XHTML Versus HTML

As you know, XML is more particular about many aspects of writing documents than HTML is. For example, you need to place all attribute values in quotes in XML, although HTML documents can use unquoted values because HTML browsers will accept that. One of the problems the W3C is trying to solve with XHTML, in fact, is the thicket of nonstandard HTML that's out there on the Web, mostly because browsers support it. Some observers estimate that half of the code in browsers is there to handle nonstandard use of HTML, and that discourages any but the largest companies from creating HTML browsers. XHTML is supposed to be different: if a document isn't in perfect XHTML, the browser is supposed to quit loading it and display an error, not guess what the document author was trying to do. Hopefully, that will make it easier to write XHTML browsers.

Here are some of the major differences between HTML and XHTML:

  • XHTML documents must be well-formed XML documents.

  • Element and attribute names must be in lowercase.

  • Elements that aren't empty need end tags; end tags can't be omitted as they can sometimes in HTML.

  • Attribute values must always be quoted.

  • You cannot use "standalone" attributes that are not assigned values. If need be, assign the attribute's own name as its value, as in checked="checked".

  • Empty elements must end with the /> characters. In practice, this does not seem to be a problem for the major browsers, which is a lucky thing for XHTML because it's definitely not standard HTML.

  • The <a> element cannot contain other <a> elements.

  • The <pre> element cannot contain the <img>, <object>, <big>, <small>, <sub>, or <sup> elements.

  • The <button> element cannot contain the <input>, <select>, <textarea>, <label>, <button>, <form>, <fieldset>, <iframe>, or <isindex> elements.

  • The <label> element cannot contain other <label> elements.

  • The <form> element cannot contain other <form> elements.

  • You must use the id attribute, not the name attribute, even on elements that also have a name attribute. In XHTML 1.0, the name attribute of the <a>, <applet>, <form>, <frame>, <iframe>, <img>, and <map> elements is formally deprecated. In practice, this is a little difficult in browsers such as Netscape that support name and not id; in that case, you should use both attributes in the same element, even though it's not legal XHTML.

  • You must escape sensitive characters. For example, when an attribute value contains an ampersand (&), the ampersand must be expressed as a character entity reference, as &amp;.
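Several of the differences listed above are plain XML well-formedness rules, so any XML parser will catch them. Here's a minimal sketch in Python (my choice for illustration; the name is_well_formed and the sample fragments are hypothetical) that feeds an HTML-style fragment and its XHTML-style equivalent to the standard xml.etree.ElementTree parser. It checks well-formedness only; rules such as lowercase names or the nesting restrictions come from the DTDs and are not tested here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# (rule, HTML-style fragment, XHTML-style fragment)
CASES = [
    ("attribute values must be quoted",
     '<p align=center>Hi</p>',
     '<p align="center">Hi</p>'),
    ("end tags cannot be omitted",
     '<ul><li>one<li>two</ul>',
     '<ul><li>one</li><li>two</li></ul>'),
    ("no standalone attributes",
     '<input type="checkbox" checked>',
     '<input type="checkbox" checked="checked" />'),
    ("empty elements must end with />",
     '<p>line<br>break</p>',
     '<p>line<br />break</p>'),
    ("ampersands must be escaped",
     '<a href="search?a=1&b=2">go</a>',
     '<a href="search?a=1&amp;b=2">go</a>'),
]


def check_cases():
    """Return (rule, html_ok, xhtml_ok) for every case."""
    return [(rule, is_well_formed(html), is_well_formed(xhtml))
            for rule, html, xhtml in CASES]


if __name__ == "__main__":
    for rule, html_ok, xhtml_ok in check_cases():
        print(rule, html_ok, xhtml_ok)
```

In every case the HTML-style fragment is rejected and the XHTML-style fragment is accepted, which is exactly the stricter behavior XHTML browsers are supposed to have.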

As we'll see in the next chapter, there are some additional requirements; for example, if you use < characters in your scripts, you should either escape such characters as &lt;, or, if the browser can't handle that, place the script in an external file. (The W3C's suggestion to place scripts in CDATA sections is definitely not understood by any major browser today.)

Automatic Conversion from HTML to XHTML

You may already have a huge Web site full of HTML pages, and you might be reading all this with some trepidation: how are you going to convert all those pages to the far more strict XHTML? In fact, a utility out there can do it for you: the Tidy utility, created by Dave Raggett. This utility is available for a wide variety of platforms, and you can download it for free from http://www.w3.org/People/Raggett/tidy. There's also a complete set of instructions on that page.

Here's an example: I'll use Tidy in Windows to convert a file from HTML to XHTML. In this case, I'll use the example HTML file we developed earlier, as saved in a file named index.html:

<HTML>
    <HEAD>
        <TITLE>
            Welcome to my page
        </TITLE>
    </HEAD>
    <BODY>
        <H1>
            Welcome to XHTML!
        </H1>
    </BODY>
</HTML>

After downloading Tidy, you run it at the command prompt. Here are the command-line switches, or options, that you can use with Tidy:

Switch            Description
-config file      Use the configuration file named file
-indent or -i     Indent element content
-omit or -o       Omit optional end tags
-wrap 72          Wrap text at column 72 (default is 68)
-upper or -u      Force tags to uppercase (default is lowercase)
-clean or -c      Replace font, nobr, and center tags with CSS
-raw              Don't substitute entities for characters 128 to 255
-ascii            Use ASCII for output, and Latin-1 for input
-latin1           Use Latin-1 for both input and output
-utf8             Use UTF-8 for both input and output
-iso2022          Use ISO2022 for both input and output
-numeric or -n    Output numeric rather than named entities
-modify or -m     Modify original files
-errors or -e     Show only error messages
-quiet or -q      Suppress nonessential output
-f file           Write errors to file
-xml              Use this when input is in XML
-asxml            Convert HTML to XML
-slides           Burst into slides on h2 elements
-help             List command-line options
-version          Show release date

In this example, I'll use three switches:

  • -m indicates that I want Tidy to modify the file I pass to it, which will be index.html

  • -i indicates that I want it to indent the resulting XHTML elements

  • -config indicates that I want to use a configuration file named configuration.txt.

Here's how I use Tidy from the command line:

%tidy -m -i -config configuration.txt index.html

Tidy is actually a utility that cleans up HTML, as you might gather from its name. To make it create XHTML, you must use a configuration file, which I've named configuration.txt here. You can see all the configuration file options on the Tidy Web site. Here are the contents of configuration.txt, which I'll use to convert index.html to XHTML:

output-xhtml: yes
add-xml-pi: yes
doctype: loose

Here, output-xhtml indicates that I want Tidy to create XHTML output. Using add-xml-pi indicates that the output should also include an XML declaration, and doctype: loose means that I want to use the transitional XHTML DTD. If you don't specify what DTD to use, Tidy will guess, based on your HTML.
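If you convert pages regularly, you might want to script this step. Here's a minimal sketch in Python, which is my choice for illustration and not something Tidy requires. The helper names write_config, tidy_command, and run_tidy are hypothetical; the switches and configuration options are only the ones shown above, and I'm assuming an executable named tidy is on your path. Your Tidy release may spell some options differently, so check its documentation.

```python
import shutil
import subprocess
from pathlib import Path

# The three configuration options discussed in the text, one per line.
CONFIG_TEXT = "output-xhtml: yes\nadd-xml-pi: yes\ndoctype: loose\n"


def write_config(config_path):
    """Write the configuration file that asks Tidy for XHTML output."""
    Path(config_path).write_text(CONFIG_TEXT, encoding="ascii")


def tidy_command(config_path, html_path):
    """Build the command line used in the text: modify in place, indent."""
    return ["tidy", "-m", "-i", "-config", str(config_path), str(html_path)]


def run_tidy(config_path, html_path):
    """Run Tidy if it is installed; return its exit code, or None if absent.

    Because of -m, this overwrites html_path, so work on a copy.
    """
    if shutil.which("tidy") is None:
        return None
    completed = subprocess.run(tidy_command(config_path, html_path))
    return completed.returncode


if __name__ == "__main__":
    print(" ".join(tidy_command("configuration.txt", "index.html")))
```

Keeping the command in a list, rather than one string, avoids shell quoting problems if a file name contains spaces.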

Here's the resulting XHTML document:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta name="generator" content="HTML Tidy, see www.w3.org" />
    <title>Welcome to my page</title>
  </head>
  <body>
    <h1> Welcome to XHTML!</h1>
  </body>
</html>

You can even teach Tidy about new XHTML tags that you've added. If you're ever stuck and want a quick way of translating HTML into XHTML, check out Tidy; it's fast, it's effective, and it's free.

Validating Your XHTML Document

The W3C has a validator you can use to check the validity of your XHTML document, and you can find this validator at http://validator.w3.org. To use the XHTML validator, you just enter the URI of your document and click the Validate This Page button. The W3C validator checks the document and gives you a full report. Here's an example response:

Congratulations, this document validates as XHTML1.0 Transitional!

To show your readers that you have taken the care to create an interoperable Web page, you may display this icon on any page that validates. Here is the HTML you could use to add this icon to your Web page:

  <p>
    <a href="http://validator.w3.org/check/referer"><img
        src="http://validator.w3.org/images/vxhtml10"
        alt="Valid XHTML 1.0!" height="31" width="88" /></a>
  </p>

In this case, the document I tested validated properly, and the W3C validator says that I can add the official W3C XHTML 1.0 Transitional logo to the document. That logo appears in Figure 16.2.

Figure 16.2. The W3C transitional XHTML logo.

Actually, the W3C XHTML validator does not do a complete job: it doesn't check to see if values are supplied for required attributes, for example, or make sure that child elements are allowed to be nested inside the particular type of their parents. However, it does a reasonably good job.

XHTML Programming

In the remainder of this chapter, I'm going to get to the actual XHTML programming, starting with the document element, <html>.

Document Element (<html>)

This element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are its attributes:

Attribute   Description
dir         Sets the direction of text that doesn't have an inherent direction in which you should read it, called directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
lang        Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
xml:lang    Specifies the base language for the element when the document is interpreted as XML.
xmlns       Is required. Set this attribute to "http://www.w3.org/1999/xhtml".

The document element for all XHTML documents is <html>, which is how XHTML matches the <HTML> element in HTML documents. This element must contain all the content of the document, as in this example:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>

The document element is very important in XML documents, of course. Note that this is one of the big differences between XHTML and HTML: in HTML, the <HTML> tag is optional because it's the default. To be valid XHTML, a document must have an <html> element.

Of all the attributes of this element, only one is required: xmlns, which sets the XML namespace. Most XML applications set up their own namespace to avoid overlap, and XHTML is no exception; you must set the xmlns attribute to "http://www.w3.org/1999/xhtml" in XHTML documents.

This tag also supports the lang and xml:lang attributes to let you specify the language of the document. If you specify values for both these attributes, the xml:lang attribute takes precedence in XHTML.

In XHTML, the <html> element can contain a <head> and a <body> element (or a <head> and a <frameset> element, in the XHTML 1.0 frameset document).

Creating a Web Page Heading (<head>)

The <head> element contains the head of an XHTML document, which should contain at least a <title> element. The <head> element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes of this element:

Attribute   Description
dir         Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
lang        Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
profile     Specifies the location of one or more whitespace-separated metadata profile URIs.
xml:lang    Specifies the base language for the element when the document is interpreted as an XML document.

Each XHTML document should have a <head> element, like the one in this example:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>

The head of an XHTML document holds information that isn't directly displayed in the document itself, such as a title for the document (which usually appears in the browser's title bar), keywords that search engines can pick up, the base address for URIs, and so on. The head of every XHTML document is supposed to contain a <title> element, which holds the title of the document.

This element also supports the usual attributes, such as lang and xml:lang, as well as one attribute that is specific to <head> elements: profile. The profile attribute can hold a whitespace-separated list of URIs that hold information about the document, such as a description of the document, the author's name, copyright information, and so forth. (None of the major browsers implement this attribute yet.)

Here are the elements that can appear in the head:

Element     Description
<base>      Specifies the base URI for the document
<isindex>   Supports rudimentary input control
<link>      Specifies the relationship between the document and an external object
<meta>      Contains information about the document
<noscript>  Contains text that appears only if the browser does not support the <script> tag
<object>    Embeds an object
<script>    Contains programming scripts, such as JavaScript code
<style>     Contains style information used for rendering elements
<title>     Gives the document's title, which appears in the browser

As mentioned, each <head> element should contain exactly one <title> element.

Document Title (<title>)

As in HTML, the <title> element contains a title for the document, stored as simple text. Most browsers will read the document's title and display it in its title bar. The title of a document is also used by search engines. This element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes for this element:

Attribute   Description
dir         Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
lang        Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
xml:lang    Specifies the base language for the element when the document is interpreted as an XML document.

You use the <title> element to specify the document's title to browsers and search engines; browsers usually display the title in the title bar. The W3C XHTML DTDs say, "Exactly one title is required per document." However, the W3C XHTML validator doesn't complain if you omit a title. Nonetheless, you should put a <title> element in every document.

We saw an example <title> element at the beginning of this chapter:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>

No major browser will react badly if you don't give a document a title. However, XHTML documents should have one, according to the W3C.

We've completed the head section of XHTML documents; next comes the body.

Document Body (<body>)

The document's body is where the action is; that is, it holds the content that the document is designed to contain. The <body> element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, and XHTML 1.1. Here are this element's attributes, all of which are supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, and XHTML 1.1, unless otherwise noted:

Attribute   Description
alink       Is deprecated in HTML 4.0. Sets the color of hyperlinks when they're being activated. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
background  Is deprecated in HTML 4.0. Holds the URI of an image to be used in tiling the browser's background. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
bgcolor     Is deprecated in HTML 4.0. Sets the color of the browser's background. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
class       Gives the style class of the element.
dir         Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
id          Use the ID to refer to the element; set this attribute to a unique identifier.
lang        Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
link        Is deprecated in HTML 4.0. Sets the color of hyperlinks that have not yet been visited. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
style       Set to an inline style to specify how the browser should display the element.
text        Is deprecated in HTML 4.0. Sets the color of the text in the document. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
title       Contains the title of the body (which might be displayed in ToolTips).
vlink       Is deprecated in HTML 4.0. Sets the color of hyperlinks that have been visited already. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
xml:lang    Specifies the base language for the element when the document is interpreted as an XML document.

This element also supports these events in XHTML: onclick, ondblclick, onload, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, onkeyup, and onunload. You can use scripts like JavaScript with events like these, and I'll take a look at how in the next chapter.

If you place descriptions of your document in the <head> element, you place the actual content of the document in the <body> element, unless you're sectioning your page into frames, in which case you should use the <frameset> element instead of the <body> element.

We've already seen a simple example in which the content of a page is just an <h1> heading, like this:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>

If you've written HTML before, you may be startled to discover that many cherished attributes are now considered deprecated in XHTML, which means that they're omitted from XHTML 1.0 Strict and XHTML 1.1. Deprecated attributes of the <body> element include these:

  • alink

  • background

  • bgcolor

  • link

  • text

  • vlink

Instead of using these attributes, you're now supposed to use style sheets. Here's an example showing how to replace deprecated attributes. In this case, I'll set the browser's background to white, the color of displayed text to black, the color of hyperlinks (created with the <a> element, which I'll take a look at in the next chapter) to red, the color of activated links to blue, and the color of visited links to green, all using deprecated attributes of the <body> element:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body bgcolor="white" text="black" link="red" alink="blue"
        vlink="green">
        Welcome to my XHTML document.
        Want to check out more about XHTML?
        Go to
        <a href="http://www.w3c.org">W3C</a>.
    </body>
</html>

You can see this document displayed in Netscape in Figure 16.3. It works as it should, but the fact is that it's not strict XHTML.

Figure 16.3. Displaying a hyperlink in Netscape.

graphics/16fig03.gif

To make the same page adhere to the XHTML strict standard, you use style sheets. Here's how this page looks using a <style> element to set up the same colors (I'll take a look at the <style> element more closely in the next chapter):

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
        <style type="text/css">
            body {background: white; color: black}
            a:link {color: red}
            a:visited {color: green}
            a:active {color: blue}
        </style>
    </head>
    <body>
        Welcome to my XHTML document.
        Want to check out more about XHTML?
        Go to
        <a href="http://www.w3c.org">W3C</a>.
    </body>
</html>

In this case, I'm using CSS to style this document. To separate content from presentation, the W3C is relying on style sheets a great deal these days. Note, however, that the contents of a <style> element are still part of the XHTML document. This means that if you use characters that are sensitive in XML, such as & or <, inside it, you should either escape those characters or use an external style sheet, which I'll take a look at in the next chapter.
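
As a minimal sketch of the external style sheet option, the same page might pull its rules from a separate file with a <link> element in the head. The file name colors.css is an assumption made up for this illustration; it would hold the same four rules shown in the <style> element above.

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
        <!-- colors.css is a hypothetical file name used only for this sketch -->
        <link rel="stylesheet" type="text/css" href="colors.css" />
    </head>
    <body>
        Welcome to my XHTML document.
        Want to check out more about XHTML?
        Go to
        <a href="http://www.w3c.org">W3C</a>.
    </body>
</html>
```

Because the rules now live outside the document, any & or < characters in them are never seen by the XML processor that reads the page.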

Comments (<!-- -->)

Because XHTML documents are actually XML, they support XML comments, which you can use to annotate your document. These annotations will not be displayed by the browser. XHTML comments are supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Comments have no attributes.

We're familiar with comments from XML; you enclose the text in a comment like this: <!--This page was last updated July 3.-->. Using comments, you can describe to readers what's going on in your document.

Here's how I might add comments to an XHTML document:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <!-- This is the document head -->
    <head>
        <!-- This is the document title -->
        <title>
            Welcome to my page
        </title>
    </head>
    <!-- This is the document body element -->
    <body>
        <h1>
            <!-- This is an h1 heading element -->
            Welcome to XHTML!
        </h1>
    </body>
</html>

As with any XML document, the comments are supposed to be stripped out by the XML processor that reads the document. On the other hand, keep in mind that comments are text; if you have a lot of them in a lengthy document, you can increase the download time of your document significantly.
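
Two details are easy to trip over, so here is a small sketch of a document body (the heading text is made up for illustration). A comment may span several lines and may wrap markup you want to hide temporarily, but in XML the text of a comment must not contain two adjacent hyphens, so avoid decorative runs of hyphens inside comments.

```html
<body>
    <!--
        A comment can span several lines.
        Nothing in here is displayed by the browser.
    -->
    <h1>
        Welcome to XHTML!
    </h1>
    <!-- <h2>Draft heading, hidden for now</h2> -->
</body>
```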

Headings (<h1> Through <h6>)

You use the <h1> through <h6> elements to create headings in your documents. These are the familiar headings from HTML: <h1> creates the largest text, and <h6> creates the smallest. These elements are supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. This table lists the possible attributes of these elements. Unless otherwise noted, all four versions support them:

Attribute Description
align Gives the alignment of text in the heading. The possible values are left (the default), center, right, and justify. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
class Gives the style class of the element.
dir Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
id Use the ID to refer to the element; set this attribute to a unique identifier.
lang Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

These elements also support these events in XHTML: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.

Headings act much like headlines in newspapers. They are block elements that present text in bold and that often are larger than other text. Six heading tags exist: <h1>, <h2>, <h3>, <h4>, <h5>, and <h6>. Because headings are block elements, they get their own line in a displayed XHTML document.

Here's an example that shows these headings in action:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            The &lt;h1&gt; - &lt;h6&gt; Elements
        </title>
    </head>
    <body>
        <center>
            <h1>This is an &lt;h1&gt; heading</h1>
            <h2>This is an &lt;h2&gt; heading</h2>
            <h3>This is an &lt;h3&gt; heading</h3>
            <h4>This is an &lt;h4&gt; heading</h4>
            <h5>This is an &lt;h5&gt; heading</h5>
            <h6>This is an &lt;h6&gt; heading</h6>
        </center>
    </body>
</html>

You can see this XHTML displayed in Netscape in Figure 16.4. Headings such as these help break up the text in a page, just as they do in newspapers, and they let the structure of your document stand out.

Figure 16.4. Displaying the six levels of headings in Netscape.

graphics/16fig04.gif
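
The heading example relies on the <center> element, and the align attribute in the table is limited to the Transitional and Frameset DTDs. As a hedged sketch of the style sheet alternative, the following page centers its headings with the CSS text-align property instead; the heading text is made up for illustration, and I'm assuming the XHTML 1.0 Strict DOCTYPE here to show that no deprecated markup is needed.

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Aligning Headings with CSS
        </title>
        <style type="text/css">
            h1, h2 {text-align: center}
        </style>
    </head>
    <body>
        <h1>This &lt;h1&gt; heading is centered</h1>
        <h2>This &lt;h2&gt; heading is centered</h2>
        <!-- An inline style overrides the alignment for a single heading -->
        <h3 style="text-align: right">This &lt;h3&gt; heading is right-aligned</h3>
    </body>
</html>
```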

Handling Text

Displaying simple text works the same way in XHTML as it does in HTML: You just place the text directly in a document. XHTML elements that display text have mixed content models, so they can contain both text and other elements, as in the example we saw earlier:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body bgcolor="white" text="black" link="red" alink="blue"
        vlink="green">
        Welcome to my XHTML document.
        Want to check out more about XHTML?
        Go to
        <a href="http://www.w3c.org">W3C</a>.
    </body>
</html>

The text in this document is displayed directly in the browser, as you see in Figure 16.3. It's up to you to format the text the way you want it. In early versions of HTML, you used elements such as <b> (bold), <i> (italic), and <u> (underline) to format text, but as formatting has become more sophisticated, the emphasis has switched to using style sheets.

As you know, there are five predefined entity references in XML, and they stand for characters that can be interpreted as markup or other control characters:

This Displays This
&amp; &
&apos; '
&gt; >
&lt; <
&quot; "

There are a great many more character entities in HTML 4.0, and they're supported in XHTML as well. You can find them in Table 16.1.

Table 16.1. Character Entities in XHTML (Support Varies by Browser)
Entity Number Description
Aacute &#193; Latin capital letter A with acute accent
aacute &#225; Latin small letter a with acute accent
Acirc &#194; Latin capital letter A with circumflex
acirc &#226; Latin small letter a with circumflex
acute &#180; Acute accent
AElig &#198; Latin capital letter AE
aelig &#230; Latin small letter ae
Agrave &#192; Latin capital letter A with grave accent
agrave &#224; Latin small letter a with grave accent
alefsym &#8501; Alef symbol = first transfinite cardinal
Alpha &#913; Greek capital letter alpha
alpha &#945; Greek small letter alpha
amp &#38; Ampersand
and &#8743; Logical and
ang &#8736; Angle
Aring &#197; Latin capital letter A with ring above
aring &#229; Latin small letter a with ring above
asymp &#8776; Almost equal to = asymptotic to
Atilde &#195; Latin capital letter A with tilde
atilde &#227; Latin small letter a with tilde
Auml &#196; Latin capital letter A with diaeresis (umlaut)
auml &#228; Latin small letter a with diaeresis (umlaut)
bdquo &#8222; Double low-9 quotation mark
Beta &#914; Greek capital letter beta
beta &#946; Greek small letter beta
brvbar &#166; Broken bar = broken vertical bar
bull &#8226; Bullet = black small circle
cap &#8745; Intersection = cap
Ccedil &#199; Latin capital letter C with cedilla
ccedil &#231; Latin small letter c with cedilla
cedil &#184; Cedilla
cent &#162; Cent sign
Chi &#935; Greek capital letter chi
chi &#967; Greek small letter chi
circ &#710; Modifier letter circumflex accent
clubs &#9827; Black club suit = shamrock
cong &#8773; Approximately equal to
copy &#169; Copyright sign
crarr &#8629; Downward arrow with corner leftward
cup &#8746; Union = cup
curren &#164; Currency sign
dagger &#8224; Dagger
Dagger &#8225; Double dagger
darr &#8595; Downward arrow
dArr &#8659; Downward double arrow
deg &#176; Degree sign
Delta &#916; Greek capital letter delta
delta &#948; Greek small letter delta
diams &#9830; Black diamond suit
divide &#247; Division sign
Eacute &#201; Latin capital letter E with acute
eacute &#233; Latin small letter e with acute
Ecirc &#202; Latin capital letter E with circumflex
ecirc &#234; Latin small letter e with circumflex
Egrave &#200; Latin capital letter E with grave accent
egrave &#232; Latin small letter e with grave accent
empty &#8709; Empty set = null set = diameter
emsp &#8195; Em space
ensp &#8194; En space
Epsilon &#917; Greek capital letter epsilon
epsilon &#949; Greek small letter epsilon
equiv &#8801; Identical to
Eta &#919; Greek capital letter eta
eta &#951; Greek small letter eta
ETH &#208; Latin capital letter ETH
eth &#240; Latin small letter eth
Euml &#203; Latin capital letter E with diaeresis (umlaut)
euml &#235; Latin small letter e with diaeresis
euro &#8364; Euro sign
exist &#8707; There exists
fnof &#402; Latin small f with hook = function
forall &#8704; For all
frac12 &#189; Vulgar fraction one-half
frac14 &#188; Vulgar fraction one-quarter
frac34 &#190; Vulgar fraction three-quarters
frasl &#8260; Fraction slash
Gamma &#915; Greek capital letter gamma
gamma &#947; Greek small letter gamma
ge &#8805; Greater than or equal to
gt &#62; Greater than sign
harr &#8596; Left right arrow
hArr &#8660; Left right double arrow
hearts &#9829; Black heart suit = valentine
hellip &#8230; Horizontal ellipsis = three-dot leader
Iacute &#205; Latin capital letter I with acute accent
iacute &#237; Latin small letter i with acute accent
Icirc &#206; Latin capital letter I with circumflex
icirc &#238; Latin small letter i with circumflex
iexcl &#161; Inverted exclamation mark
Igrave &#204; Latin capital letter I with grave accent
igrave &#236; Latin small letter i with grave accent
image &#8465; Blackletter capital I = imaginary part
infin &#8734; Infinity
int &#8747; Integral
Iota &#921; Greek capital letter iota
iota &#953; Greek small letter iota
iquest &#191; Inverted question mark
isin &#8712; Element of
Iuml &#207; Latin capital letter I with diaeresis (umlaut)
iuml &#239; Latin small letter i with diaeresis
Kappa &#922; Greek capital letter kappa
kappa &#954; Greek small letter kappa
Lambda &#923; Greek capital letter lambda
lambda &#955; Greek small letter lambda
lang &#9001; Left-pointing angle bracket = bra
laquo &#171; Left-pointing double angle quotation mark
larr &#8592; Leftward arrow
lArr &#8656; Leftward double arrow
lceil &#8968; Left ceiling = apl upstile
ldquo &#8220; Left double quotation mark
le &#8804; Less than or equal to
lfloor &#8970; Left floor = apl downstile
lowast &#8727; Asterisk operator
loz &#9674; Lozenge
lrm &#8206; Left-to-right mark
lsaquo &#8249; Single left-pointing angle quotation mark
lsquo &#8216; Left single quotation mark
lt &#60; Less than
macr &#175; Macron = spacing macron
mdash &#8212; Em dash
micro &#181; Micro sign
middot &#183; Middle dot
minus &#8722; Minus sign
Mu &#924; Greek capital letter mu
mu &#956; Greek small letter mu
nabla &#8711; Nabla = backward difference
nbsp &#160; No-break space = nonbreaking space
ndash &#8211; En dash
ne &#8800; Not equal to
ni &#8715; Contains as member
not &#172; Not sign
notin &#8713; Not an element of
nsub &#8836; Not a subset of
Ntilde &#209; Latin capital letter N with tilde
ntilde &#241; Latin small letter n with tilde
Nu &#925; Greek capital letter nu
nu &#957; Greek small letter nu
Oacute &#211; Latin capital letter O with acute accent
oacute &#243; Latin small letter o with acute accent
Ocirc &#212; Latin capital letter O with circumflex
ocirc &#244; Latin small letter o with circumflex
OElig &#338; Latin capital ligature OE
oelig &#339; Latin small ligature oe
Ograve &#210; Latin capital letter O with grave accent
ograve &#242; Latin small letter o with grave accent
oline &#8254; Overline = spacing overscore
Omega &#937; Greek capital letter omega
omega &#969; Greek small letter omega
Omicron &#927; Greek capital letter omicron
omicron &#959; Greek small letter omicron
oplus &#8853; Circled plus = direct sum
or &#8744; Logical or = vee
ordf &#170; Feminine ordinal indicator
ordm &#186; Masculine ordinal indicator
Oslash &#216; Latin capital letter O with stroke
oslash &#248; Latin small letter o with stroke
Otilde &#213; Latin capital letter O with tilde
otilde &#245; Latin small letter o with tilde
otimes &#8855; Circled times = vector product
Ouml &#214; Latin capital letter O with diaeresis (umlaut)
ouml &#246; Latin small letter o with diaeresis (umlaut)
para &#182; Pilcrow sign
part &#8706; Partial differential
permil &#8240; Per mille sign
perp &#8869; Up tack = orthogonal to = perpendicular
Phi &#934; Greek capital letter phi
phi &#966; Greek small letter phi
Pi &#928; Greek capital letter pi
pi &#960; Greek small letter pi
piv &#982; Greek pi symbol
plusmn &#177; Plus-minus sign
pound &#163; Pound sign
prime &#8242; Prime = minutes = feet
Prime &#8243; Double prime = seconds = inches
prod &#8719; N-ary product = product sign
prop &#8733; Proportional to
Psi &#936; Greek capital letter psi
psi &#968; Greek small letter psi
quot &#34; Quotation mark = APL quote
radic &#8730; Square root = radical sign
rang &#9002; Right-pointing angle bracket = ket
raquo &#187; Right-pointing double angle quotation mark
rarr &#8594; Rightward arrow
rArr &#8658; Rightward double arrow
rceil &#8969; Right ceiling
rdquo &#8221; Right double quotation mark
real &#8476; Blackletter capital R = real part symbol
reg &#174; Registered sign
rfloor &#8971; Right floor
Rho &#929; Greek capital letter rho
rho &#961; Greek small letter rho
rlm &#8207; Right-to-left mark
rsaquo &#8250; Single right-pointing angle quotation mark
rsquo &#8217; Right single quotation mark
sbquo &#8218; Single low-9 quotation mark
Scaron &#352; Latin capital letter S with caron
scaron &#353; Latin small letter s with caron
sdot &#8901; Dot operator
sect &#167; Section sign
shy &#173; Soft hyphen
Sigma &#931; Greek capital letter sigma
sigma &#963; Greek small letter sigma
sigmaf &#962; Greek small letter final sigma
sim &#8764; Tilde operator
spades &#9824; Black spade suit
sub &#8834; Subset of
sube &#8838; Subset of or equal to
sum &#8721; N-ary summation
sup &#8835; Superset of
sup1 &#185; Superscript 1
sup2 &#178; Superscript 2
sup3 &#179; Superscript 3
supe &#8839; Superset of or equal to
szlig &#223; Latin small letter sharp s
Tau &#932; Greek capital letter tau
tau &#964; Greek small letter tau
there4 &#8756; Therefore
Theta &#920; Greek capital letter theta
theta &#952; Greek small letter theta
thetasym &#977; Greek small letter theta symbol
thinsp &#8201; Thin space
THORN &#222; Latin capital letter THORN
thorn &#254; Latin small letter thorn
tilde &#732; Small tilde
times &#215; Multiplication sign
trade &#8482; Trademark sign
Uacute &#218; Latin capital letter U with acute accent
uacute &#250; Latin small letter u with acute accent
uarr &#8593; Upward arrow
uArr &#8657; Upward double arrow
Ucirc &#219; Latin capital letter U with circumflex
ucirc &#251; Latin small letter u with circumflex
Ugrave &#217; Latin capital letter U with grave accent
ugrave &#249; Latin small letter u with grave accent
uml &#168; Diaeresis (umlaut)
upsih &#978; Greek upsilon with hook symbol
Upsilon &#933; Greek capital letter upsilon
upsilon &#965; Greek small letter upsilon
Uuml &#220; Latin capital letter U with diaeresis (umlaut)
uuml &#252; Latin small letter u with diaeresis
weierp &#8472; Script capital P = power set
Xi &#926; Greek capital letter xi
xi &#958; Greek small letter xi
Yacute &#221; Latin capital letter Y with acute accent
yacute &#253; Latin small letter y with acute accent
yen &#165; Yen sign = yuan sign
Yuml &#376; Latin capital letter Y with diaeresis
yuml &#255; Latin small letter y with diaeresis
Zeta &#918; Greek capital letter zeta
zeta &#950; Greek small letter zeta
zwj &#8205; Zero-width joiner
zwnj &#8204; Zero-width nonjoiner
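
To put Table 16.1 to work, you wrap an entity name in & and ; or use the number form directly. Here's a small sketch of a paragraph that does both; the text itself is made up for illustration. The named forms depend on the entity declarations in the XHTML DTDs, while the numeric forms are plain XML character references.

```html
<p>
    Copyright &copy; 2001 (named entity)
    <br />
    Copyright &#169; 2001 (numeric reference)
    <br />
    Caf&eacute; prices: &euro;5 &ndash; &euro;10
    <br />
    Use &lt;br /&gt; to break a line.
</p>
```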

As I mentioned before, as with HTML, XHTML supports the various text-formatting tags, such as <b> for bold text, <i> for italic text, and <u> for underlined text. I'll take a look at them briefly because they're still very popular.

Making Text Bold (<b>) or Italic (<i>)

The <b> element gives you a simple inline way of bolding text. Although plenty of experts would prefer that you use style sheets to display text in bold, you can still use the <b> element. Like the <b> element, the <i> element offers some rudimentary text formatting: in this case, italic text. Both elements are supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are their attributes:

Attribute Description
class Gives the style class of the element.
dir Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
id Use the ID to refer to the element; set this attribute to a unique identifier.
lang Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

Here are the official XHTML events that these elements support: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.

Here's an example that displays text in both italic and bold (I'm using the line-break element, <br>, which we'll see later, to separate the lines of text):

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Bold and Italic Text
        </title>
    </head>
    <body>
        <i>This text is italic.</i>
        <br />
        <b>This text is bold.</b>
        <br />
        <b><i>This text is both.</i></b>
    </body>
</html>

The results of this XHTML appear in Figure 16.5, where you can see text that's bold, italic, and both bold and italic. The <b> and <i> tags are favorites among Web page authors because they're so easy to use.

Figure 16.5. Displaying bold and italic text in Netscape.

graphics/16fig05.gif
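
For authors who would rather follow the style sheet route, here's a rough sketch of the same three lines built with the class attribute from the table above. The class names strong-text and slanted-text are hypothetical names chosen for this illustration; font-weight and font-style are standard CSS properties.

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Bold and Italic Text with CSS
        </title>
        <style type="text/css">
            .strong-text {font-weight: bold}
            .slanted-text {font-style: italic}
        </style>
    </head>
    <body>
        <p>
            <span class="slanted-text">This text is italic.</span>
            <br />
            <span class="strong-text">This text is bold.</span>
            <br />
            <!-- A class attribute can list several class names separated by spaces -->
            <span class="strong-text slanted-text">This text is both.</span>
        </p>
    </body>
</html>
```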

Underlining Text (<u>)

The <u> element displays underlined text. This element was deprecated in HTML 4.0, so it is not supported in XHTML 1.0 Strict or XHTML 1.1. It is supported in XHTML 1.0 Transitional and XHTML 1.0 Frameset, however. Note that because browsers usually underline hyperlinks, readers might mistake underlined text for a link. Here are the attributes of this element:

Attribute Description
class Gives the style class of the element.
dir Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
id Use the ID to refer to the element; set this attribute to a unique identifier.
lang Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

Here are the official XHTML events this element supports: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.

The <u> element offers another easy formatting option, underlining its enclosed text. Because it is deprecated, the example uses the XHTML 1.0 Transitional DTD. Here's an example putting <u> to work:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;u&gt; Element
        </title>
    </head>
    <body>
        You can <u>underline</u> text for a little more emphasis.
    </body>
</html>

The results of this XHTML appear in Figure 16.6.

Figure 16.6. Displaying underlined text in Netscape.

graphics/16fig06.gif
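
Because <u> is unavailable in XHTML 1.0 Strict and XHTML 1.1, one possible stand-in is the CSS text-decoration property applied through a <span> element. This is only a sketch of the body content, not a full document:

```html
<p>
    You can <span style="text-decoration: underline">underline</span>
    text for a little more emphasis.
</p>
```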

Specifying a Text Font (<font>)

Using the <font> element, you can select text size, color, and face. The <font> element has always been very popular among HTML authors, but with the new emphasis on handling styles in style sheets, you can imagine that it was headed for extinction. And it has indeed been deprecated in HTML 4.0, so it's not available in XHTML 1.1 or XHTML 1.0 Strict. It's supported in XHTML 1.0 Transitional and XHTML 1.0 Frameset. Because it's so popular still, I'll cover it here briefly.

Here are this element's attributes:

Attribute Description
class Gives the style class of the element.
color Is deprecated. Sets the color of the text.
dir Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
face Is deprecated. You can set this attribute to a single font name or a list of names separated by commas. The browser will select the first font face from the list that it can find.
id Use the ID to refer to the element; set this attribute to a unique identifier.
lang Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
size Is deprecated. Gives the size of the text. Possible values range from 1 through 7.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

This element does not support any XHTML events.

You can use the <font> element to set a font face, size, and color for text. Here's an example; in this case, I'm displaying text in a large red Arial font:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;font&gt; Element
        </title>
    </head>
    <body>
        <font size="6" color="#ff0000" face="Arial">
        Putting the &lt;font&gt; element to work.
        </font>
    </body>
</html>

The results of this XHTML appear in Figure 16.7.

Figure 16.7. Using the <font> element in Netscape.

graphics/16fig07.gif

You specify font sizes by using the values 1 through 7. In practice, size 3 is the usual default (typically around 12 points); smaller values shrink the text and larger values enlarge it, but actual sizes vary by browser and system. Here's an example showing the range of possible sizes:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;font&gt; Element
        </title>
    </head>
    <body>
        <center>
            <h1>
                Using the &lt;font&gt; Element
            </h1>
            <font size="1">This is font size 1.</font>
            <br />
            <font size="2">This is font size 2.</font>
            <br />
            <font size="3">This is font size 3.</font>
            <br />
            <font size="4">This is font size 4.</font>
            <br />
            <font size="5">This is font size 5.</font>
            <br />
            <font size="6">This is font size 6.</font>
            <br />
            <font size="7">This is font size 7.</font>
        </center>
    </body>
</html>

The results of this XHTML appear in Figure 16.8. As mentioned earlier, <font> has been deprecated in HTML 4.0 in favor of style sheets. So how should you replace the <font> element? See the section "Formatting Text Inline (<span>)," at the end of this chapter, for a good substitute.

Besides the simple text formatting elements, HTML also contains elements to arrange text in the display; XHTML supports those elements as well.

Figure 16.8. Displaying various font sizes in Netscape.

graphics/16fig08.gif
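
As a preview of the style sheet approach, here's a minimal sketch of how the large red Arial example might look without <font>. The 24pt size is an assumed stand-in for a "large" font, not an exact equivalent of size="6", and sans-serif is listed as a fallback in case Arial isn't installed.

```html
<p>
    <span style="font-size: 24pt; color: #ff0000; font-family: Arial, sans-serif">
        Putting a styled &lt;span&gt; element to work.
    </span>
</p>
```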

Line Breaks (<br>) and Text Paragraphs (<p>)

The <br> element is an empty element that inserts a line break into text. Because this element is empty, you use it like this in XHTML: <br />. This element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes of this element:

Attribute Description
class Gives the style class of the element.
clear Is used to move past aligned images or other elements. Set this to none (the default, just a normal break), left (breaks the line and moves down until there is a clear left margin past the aligned element), right (breaks the line and moves down until there is a clear right margin past the aligned element), or all (breaks the line and moves down until both margins are clear of the aligned element). (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
id Use the ID to refer to the element; set this attribute to a unique identifier.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).

This element does not support any XHTML events.

You use the <br> element to arrange the text in a document by adding a line break, making the browser skip to the next text line. Writing the element as <br /> does not cause any problems in the major browsers, and the fact that those browsers can handle empty elements with the usual XML /> closing characters is one of the reasons that XHTML works as it should in HTML browsers. You can also insert line breaks as <br></br>, but that usage turns out to be confusing to some browsers and XML validators.
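
The clear attribute in the table is limited to the Transitional and Frameset DTDs. A rough CSS counterpart is the clear property, whose values are none, left, right, and both (both plays the role of all). In this sketch, photo.gif is a hypothetical image file, and the image is floated with CSS rather than with a deprecated align attribute:

```html
<p>
    <img src="photo.gif" alt="A sample photo" style="float: left" />
    This text flows beside the floated image.
    <br style="clear: left" />
    This text starts below the image, where the left margin is clear.
</p>
```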

Letting the Browser Handle the Formatting

Ideally, you should let the browser handle text formatting as much as possible. The text flow is supposed to be handled by the browser to display that text as best as possible to fit the display area. This means that if you add a lot of line breaks, you may interfere with the best possible display (unless you're adding line breaks to specifically separate discrete elements, such as images). It's usually best to format your text into paragraphs that the browser can handle as appropriate, rather than expressly adding line breaks to text yourself.

The <p> element enables you to break text up into paragraphs. Paragraphs are block elements that you can format as you like in style sheets or with style attributes, including indenting the first line and so forth. If you're coming to XHTML from HTML, one thing to remember is that every <p> tag needs a corresponding </p> tag, which is easy to forget because HTML doesn't require that. In addition, note that paragraphs are block elements, which in XHTML means that you cannot nest other block elements, such as headings, inside them. The <p> element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are this element's attributes:

Attribute Description
align Is deprecated in HTML 4. Sets the alignment of the text. Possible values include left (the default), right, center, and justify. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
class Gives the style class of the element.
dir Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
id Use the ID to refer to the element; set this attribute to a unique identifier.
lang Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

This element supports these XHTML events: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.

You use the <p> element to organize your text. The browser adds a little vertical space above paragraphs to separate them from other elements, and it formats the text in a paragraph to fit the current page width.

Here's an example; in this case, I'm using <br> elements to introduce line breaks, and a <p> element to create a new paragraph:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;br&gt; and &lt;p&gt; Elements
        </title>
    </head>
    <body>
        <center>
            <h1>
                Using the &lt;br&gt; and &lt;p&gt; Elements
            </h1>
        </center>
        This is a line of text.
        <br />
        Using a line break skips to the next line.
        <p style="font-weight: bold">
            This is a line of bold text in a paragraph.
            <br />
            Here's a new line of text in the same paragraph.
        </p>
    </body>
</html>

The results of this code appear in Figure 16.9. As you can see, inserting a <br> element makes the browser move to the next line of text.

Figure 16.9. Using line breaks and paragraphs in Netscape.

graphics/16fig09.gif

This example points out the difference between <br> and <p>. The <br> element is empty and just makes the flow of text skip to the next line. The <p> element, on the other hand, is a block element that encloses content. You can apply styles to the content in a <p> element, and those styles are applied to all text in the paragraph, even if it's broken up with line breaks, as you see in Figure 16.9, where the bold style applies to both lines in the paragraph.
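
Earlier I mentioned that you can indent the first line of a paragraph and that the align attribute is deprecated. Here's a hedged sketch of both ideas using the CSS text-indent and text-align properties; the class name centered and the 2em indent are assumptions chosen for this illustration.

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Styling Paragraphs
        </title>
        <style type="text/css">
            p {text-indent: 2em}
            p.centered {text-align: center; text-indent: 0}
        </style>
    </head>
    <body>
        <p>
            The first line of this paragraph is indented, and the
            browser wraps the rest to fit the page width.
        </p>
        <p class="centered">
            This paragraph is centered without the align attribute.
        </p>
    </body>
</html>
```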

Creating Horizontal Rules (<hr>)

Another handy element for arranging text is <hr>, the horizontal rule element. This element just causes the browser to draw a horizontal line to separate or group elements vertically. It's supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes of this element; note that it includes a few attributes that have been deprecated:

Attribute Description
align Is deprecated. Sets the alignment of the rule; set this to left, center (the default), or right. To set this attribute, you must also set the width attribute. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
class Gives the style class of the element.
id Use the ID to refer to the element; set this attribute to a unique identifier.
noshade Is deprecated. Displays the rule with a two-dimensional, not three-dimensional (the default), appearance. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
size Is deprecated. Sets the vertical size of the horizontal rule in pixels. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
width Is deprecated. Sets the horizontal width of the rule. You can set this attribute to a pixel measurement or a percentage of the display area. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

These are the XHTML events supported by this element: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.

It's easy to break your text up with horizontal rules, using the <hr> element. This can be very useful in longer documents, and it serves to organize your document visually into sections. This element is empty and just instructs the browser to insert a horizontal rule.

As with many style attributes in HTML 4.0, the <hr> element's align, width, noshade, and size attributes are all deprecated. However, they're still in the XHTML 1.0 Transitional and Frameset DTDs. Here's an example that displays a few horizontal rules of varying width and alignment:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;hr&gt; Element
        </title>
    </head>
    <body>
        <center>
            <h1>
                Using the &lt;hr&gt; Element
            </h1>
        </center>
        This is &lt;hr /&gt;:
        <hr />
        <br />
        This is &lt;hr align="left" width="60%" /&gt;:
        <hr align="left" width="60%" />
        <br />
        This is &lt;hr align="center" width="60%" /&gt;:
        <hr align="center" width="60%" />
        <br />
        This is &lt;hr align="right" width="60%" /&gt;:
        <hr align="right" width="60%" />
        <br />
    </body>
</html>

You can see the results of this XHTML in Figure 16.10, which shows a number of ways to configure horizontal rules. Note that when you set the align attribute, you must also set the width attribute; a rule that spans the full width of the display area has no room to be aligned.

Figure 16.10. Displaying horizontal rules in Netscape.

graphics/16fig10.gif
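Because align and width are deprecated, style sheets are the usual replacement when you want a document that validates against the XHTML 1.0 Strict DTD. Here's a minimal sketch, assuming that the browser applies the CSS width and margin properties to <hr> (older browsers varied on this point). The class names left60, center60, and right60 are invented for this illustration, and the text is wrapped in <p> elements because the Strict DTD doesn't allow bare text in the body:

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Styling the &lt;hr&gt; Element
        </title>
        <style type="text/css">
            /* Hypothetical classes that stand in for align and width. */
            /* An auto margin absorbs the leftover horizontal space.   */
            hr.left60   {width: 60%; margin-left: 0;    margin-right: auto}
            hr.center60 {width: 60%; margin-left: auto; margin-right: auto}
            hr.right60  {width: 60%; margin-left: auto; margin-right: 0}
        </style>
    </head>
    <body>
        <h1>
            Styling the &lt;hr&gt; Element
        </h1>
        <p>
            A left-aligned rule, 60% wide:
        </p>
        <hr class="left60" />
        <p>
            A centered rule, 60% wide:
        </p>
        <hr class="center60" />
        <p>
            A right-aligned rule, 60% wide:
        </p>
        <hr class="right60" />
    </body>
</html>
```

The style rules do the work that the align and width attributes did in the Transitional example, so the markup itself uses only attributes that the Strict DTD allows.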

Centering Displayed Text (<center>)

The <center> element does just what its name implies: It centers text and elements in the browser's display area. The W3C deprecated <center> in HTML 4, so you won't find it in the XHTML 1.0 Strict or XHTML 1.1 DTDs; it's available only in XHTML 1.0 Transitional and XHTML 1.0 Frameset. Nonetheless, <center> remains a favorite element and will be in use for a long time to come.

Here are the attributes of this element:

Attribute Description
class Gives the style class of the element.
dir Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
id You use the ID to refer to the element; set this attribute to a unique identifier.
lang Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

This element supports the following XHTML events: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.

Here's an example of <center> at work centering multiline text:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;center&gt; Element
        </title>
    </head>
    <body>
        <center>
            <h1>
                Using the &lt;center&gt; Element
            </h1>
        </center>
        <center>
            The &lt;center&gt; element is a
            <br />
            useful one for centering
            <br />
            text made up of
            <br />
            multiple lines.
        </center>
    </body>
</html>

You can see the results of this XHTML in Figure 16.11.

Figure 16.11. Using the <center> element.

graphics/16fig11.gif

The <center> element is still in widespread use, which is why I'm taking a look at it here; however, it has been deprecated, which means that it will disappear from XHTML one day. So, what are you supposed to use instead? Take a look at the next topic, the <div> element.

Formatting Text Blocks (<div>)

You can use the <div> element to select or enclose a block of text, usually so that you can apply styles to it. This element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are its attributes:

Attribute Description
align Is deprecated. Sets the horizontal alignment of the element. Set this to left (the default), right, center, or justify. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.)
class Gives the style class of the element.
dir Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
id Use the ID to refer to the element; set this attribute to a unique identifier.
lang Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

This element supports these XHTML events: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.

The <div> element enables you to refer to an entire section of your document by name, so you can replace the text in it from JavaScript code. We did that in Chapter 7, "Handling XML Documents with JavaScript," where we read in XML documents and displayed the results using the <div> element's innerHTML property in Internet Explorer, as in this HTML document:

<HTML>
    <HEAD>
         <TITLE>
             Reading XML element values
         </TITLE>
         <SCRIPT LANGUAGE="JavaScript">
              function readXMLDocument()
              {
                  var xmldoc, meetingsNode, meetingNode, peopleNode
                  var personNode, first_nameNode, last_nameNode, outputText

                  // Load the document with Internet Explorer's XML parser
                  xmldoc = new ActiveXObject("Microsoft.XMLDOM")
                  xmldoc.load("meetings.xml")

                  // Walk down the tree to the last person in the first meeting
                  meetingsNode = xmldoc.documentElement
                  meetingNode = meetingsNode.firstChild
                  peopleNode = meetingNode.lastChild
                  personNode = peopleNode.lastChild
                  first_nameNode = personNode.firstChild
                  last_nameNode = first_nameNode.nextSibling

                  // Display the name in the <DIV> element
                  outputText = "Third name: " +
                        first_nameNode.firstChild.nodeValue + ' '
                      + last_nameNode.firstChild.nodeValue
                  messageDIV.innerHTML=outputText
              }
         </SCRIPT>
    </HEAD>
    <BODY>
        <CENTER>
            <H1>
                Reading XML element values
            </H1>
            <INPUT TYPE="BUTTON" VALUE="Get the name of the third person"
                ONCLICK="readXMLDocument()">
            <P>
            <DIV ID="messageDIV"></DIV>
        </CENTER>
    </BODY>
</HTML>
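That listing refers to the <DIV> element directly by its ID, which relies on Internet Explorer behavior. As a rough sketch of the same technique in XHTML, the following page looks up the element with the standard document.getElementById method instead. It assumes that the page is served as HTML (the usual case in this chapter) and that the browser supports innerHTML; the function name showMessage and the message text are made up for this illustration:

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Replacing the Text in a &lt;div&gt; Element
        </title>
        <script type="text/javascript">
            // Hypothetical helper: finds the <div> by its id and
            // replaces whatever content it currently holds.
            function showMessage()
            {
                var target = document.getElementById("messageDIV");
                target.innerHTML = "This text was placed here by script.";
            }
        </script>
    </head>
    <body>
        <h1>
            Replacing the Text in a &lt;div&gt; Element
        </h1>
        <p>
            <input type="button" value="Show the message"
                onclick="showMessage()" />
        </p>
        <div id="messageDIV"></div>
    </body>
</html>
```

Note the XHTML details: element and attribute names are in lowercase, every attribute value is quoted, and the empty <input> element is closed with />. The script deliberately avoids the < and & characters, which would otherwise have to be escaped or placed in a CDATA section when the document is treated as XML.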

Here's an XHTML example; in this case, I'm enclosing some text in a <div> element and styling the text in bold red italics with an XHTML <style> element. (More on the <style> element comes in the next chapter.)

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;div&gt; Element
        </title>
        <style type="text/css">
            div {color: red; font-weight: bold; font-style: italic}
        </style>
    </head>

    <body>
        <center>
            <h1>
                Using the &lt;div&gt; Element
            </h1>
        </center>

        <div>
            This text, which
            <br />
            takes up multiple lines,
            <br />
            was formatted all at once
            <br />
            in a single &lt;div&gt; element.
        </div>
    </body>
</html>

You can see the results of this XHTML in Figure 16.12 where, as you see, all the lines in the <div> element were styled in the same way.

Figure 16.12. Styling text with the <div> element.

graphics/16fig12.gif

The W3C suggests that you use the <div> element's align attribute to replace the now deprecated <center> element by setting align to "center". That would look like this, where I'm modifying the example from the previous section:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;div&gt; Element
        </title>
    </head>

    <body>
        <div align="center">
            <h1>
                Using the &lt;div&gt; Element
            </h1>
        </div>
        <div align="center">
            The &lt;div&gt; element is a
            <br />
            useful one for centering
            <br />
            text made up of
            <br />
            multiple lines.
        </div>
    </body>
</html>

In fact, although W3C documentation suggests that you use the align attribute, the W3C seems to have forgotten that it deprecated that attribute in HTML 4.0. The way to center text now is to set a <div> element's text-align style property to center. That might look like this:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;div&gt; Element
        </title>
        <style type="text/css">
            div {text-align: center}
        </style>
    </head>

    <body>
        <div>
            <h1>
                Using the &lt;div&gt; Element
            </h1>
        </div>
        <div>
            The &lt;div&gt; element is a
            <br />
            useful one for centering
            <br />
            text made up of
            <br />
            multiple lines.
        </div>
    </body>
</html>

This works as planned: the text is indeed centered in the browser.

Using the positioning style properties, you can also position text with the <div> tag, even overlapping displayed text blocks. There's another handy element that you can use to select text and apply styles: <span>.
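To make the positioning idea concrete, here's a minimal sketch that overlaps two <div> elements using the CSS position, top, and left properties. The IDs back and front, the pixel offsets, and the colors are arbitrary choices for this illustration, and how the overlap is drawn depends on the browser's CSS support:

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Positioning &lt;div&gt; Elements
        </title>
        <style type="text/css">
            /* Both blocks are taken out of the normal flow and    */
            /* placed at offsets from the top left of the page.    */
            #back  {position: absolute; top: 80px;  left: 40px;
                    color: gray; font-size: 36pt}
            #front {position: absolute; top: 100px; left: 70px;
                    color: red; font-weight: bold}
        </style>
    </head>
    <body>
        <div id="back">
            Background text
        </div>
        <div id="front">
            This text overlaps the block behind it.
        </div>
    </body>
</html>
```

Because the second <div> appears later in the document, it's normally drawn on top of the first where the two overlap.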

Formatting Text Inline (<span>)

The <span> element lets you select inline text to apply styles. It's supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes of this element:

Attribute Description
class Gives the style class of the element.
dir Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text.
id Use the ID to refer to the element; set this attribute to a unique identifier.
lang Specifies the base language used in the element. Applies only when the document is interpreted as HTML.
style Set this to an inline style to specify how the browser should display the element.
title Contains the title of the element (which might be displayed in ToolTips).
xml:lang Specifies the base language for the element when the document is interpreted as an XML document.

This element supports these XHTML events: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.

You usually use <span> to apply styles inline, for example, in the middle of a sentence, to a few words or even characters. When styling blocks of text, you can use <div>; for individual characters, words, or sentences, use <span>.

As we saw, you can use <div> to replace the deprecated <center> element; there's also a deprecated element that you can replace with <span>: the <font> element. Using <span>, you can apply styles inline to a few characters or words, which is what Web authors previously used <font> for. For example, here I'm applying a style to a section of text using <span>, displaying that text in bold red italic:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;span&gt; Element
        </title>
        <style type="text/css">
            span {color: red; font-weight: bold; font-style: italic}
        </style>
    </head>

    <body>
        <center>
            <h1>
                Using the &lt;span&gt; Element
            </h1>
        </center>
        <h2>
            Sometimes, for <span>emphasis</span>, you might want to
            target <span>specific words</span> in your text.
        </h2>
    </body>
</html>

You can see the results of this XHTML in Figure 16.13, where the words we want styled in a specific way are indeed styled as we want them.

Figure 16.13. Using <span> to style text.

graphics/16fig13.gif
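The example above styles every <span> element in the page the same way. If you want different inline styles in the same document, the class attribute listed in the table earlier lets you choose among them. Here's a sketch along those lines; the class names warning and term are invented for this illustration:

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using Classes with the &lt;span&gt; Element
        </title>
        <style type="text/css">
            /* Each rule applies only to spans with the matching class. */
            span.warning {color: red; font-weight: bold}
            span.term    {font-style: italic}
        </style>
    </head>
    <body>
        <p>
            Use a <span class="term">block element</span> such as
            &lt;div&gt; for whole sections of text, and
            <span class="warning">don't nest one</span> inside a
            paragraph. A span with no class, <span>like this one</span>,
            is left unstyled.
        </p>
    </body>
</html>
```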

There's more XHTML to come; take a look at the next chapter.

Inside XML
Authors: Steve Holzner