Probably the biggest XML application today is XHTML, which is W3C's implementation of HTML 4.0 in XML. XHTML is a true XML application, which means that XHTML documents are XML documents that can be checked for well-formedness and validity.
There are two big advantages to using XHTML. First, HTML predefines all its elements and attributes, and that's not something you can change unless you use XHTML. Because XHTML is really XML, you can extend it with your own elements, and we'll see how to do that in the next chapter. Need <INVOICE>, <DELIVERY_DATE>, and <PRODUCT_ID> elements in your Web page? Now you can add them. (This aspect of XHTML isn't supported by the major browsers yet, but it's coming.) The other big advantage, as far as HTML authors are concerned, is that you can display XHTML documents in today's browsers without modification. That's the whole idea behind XHTML: it's supposed to provide a bridge between XML and HTML. XHTML is true XML, but you can use it today in browsers, and that has made it very popular.
Here's an example; this page is written in standard HTML:
<HTML>
    <HEAD>
        <TITLE>
            Welcome to my page
        </TITLE>
    </HEAD>
    <BODY>
        <H1>
            Welcome to HTML!
        </H1>
    </BODY>
</HTML>
Here's the same page, written in XHTML, with the message changed from Welcome to HTML! to Welcome to XHTML!:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
    </head>
    <body>
        <h1>
            Welcome to XHTML!
        </h1>
    </body>
</html>
I'll go through exactly what's happening here in this chapter.
You save XHTML documents with the extension .html to make sure that browsers treat those documents as HTML. This document produces the same result as the previous HTML document, except that this document says Welcome to XHTML! instead, as you can see in Figure 16.1.
Take a look at this XHTML document; as you can see, it's true XML, starting with the XML declaration:
<?xml version="1.0"?> . . .
Next comes a <!DOCTYPE> element:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> . . .
This is just a standard <!DOCTYPE> element; in this case, it indicates that the document element is <html>. Note the lowercase here: <html>, not <HTML>. All elements in XHTML (except the <!DOCTYPE> element) are lowercase. That's the XHTML standard; if you're accustomed to using uppercase tag names, it'll take a little getting used to.
The DTDs that XHTML uses are public DTDs, created by W3C. Here, the formal public identifier (FPI) for the DTD that I'm using is "-//W3C//DTD XHTML 1.0 Transitional//EN", which is one of several DTDs available, as we'll see. I'm also giving the URL for the DTD, which for this DTD is "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd".
Using an XHTML DTD, browsers can validate XHTML documents, at least theoretically (and, in fact, browsers such as Internet Explorer will read in the DTD and check the document against it, although as we've seen, you must explicitly check whether errors occurred because the browser won't announce them).
Note also that the URI for the DTD is at W3C itself: "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd". Now imagine 40 million browsers trying to validate XHTML documents all at the same time by downloading XHTML DTDs like this one from the W3C site. That would be quite a problem. To avoid bottlenecks like this, you can copy the XHTML DTDs and store them locally (I'll give their URIs and discuss this in a few pages), or do without a DTD in your documents. However, my guess is that when we get fully enabled validating XHTML browsers, they'll have the various XHTML DTDs stored internally for immediate access, without having to download the XHTML DTDs from the Internet. (As it stands now, it takes Internet Explorer 10 to 20 seconds to download a typical XHTML DTD on a typical modem line.)
After the <!DOCTYPE> element comes the <html> element, which is the document element. It starts the actual document content:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> . . .
I'm using three attributes of this XHTML element here, as is usual:
xmlns defines an XML namespace for the document.
xml:lang sets the language for the document when it's interpreted as XML.
The standard HTML attribute lang sets the language when the document is treated as HTML.
Note in particular the namespace used for XHTML: http://www.w3.org/1999/xhtml, which is the official XHTML namespace. All the XHTML elements must be in this namespace.
The remainder of the page is very like the HTML document we saw earlier; the only real difference is that the tag names are now lowercase:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Welcome to my page </title> </head> <body> <h1> Welcome to XHTML! </h1> </body> </html>
As you see in the <!DOCTYPE> element, I'm using the XHTML DTD that's called "XHTML 1.0 Transitional". That's only one of the XHTML DTDs available, although it's currently the most popular one. So, what XHTML DTDs are available, and what do they mean? That all depends on which version of XHTML you are using.
The standard version of XHTML, version 1.0, is just a rewrite of HTML 4.0 in XML. You can find the W3C recommendation for XHTML 1.0 at http://www.w3.org/TR/xhtml1. Essentially, it's just a set of DTDs that provide validity checks for documents that are supposed to mimic HTML 4.0 (actually, HTML 4.01). The W3C has created several DTDs for HTML 4.0, and the XHTML DTDs are based on those, translated into straight XML. As with HTML 4.0, XHTML 1.0 has three versions, which correspond to three DTDs here:
The strict XHTML 1.0 DTD. The strict DTD is based on straight HTML 4.0 and does not include support for elements and attributes that the W3C considers deprecated. This is the version of XHTML 1.0 that the W3C hopes people will migrate to in time.
The transitional XHTML 1.0 DTD. The transitional DTD is based on the transitional HTML 4.0 DTD. This DTD has support for the many elements and attributes that were deprecated in HTML 4.0 but that are still popular, such as the <CENTER> and <FONT> elements. This DTD is also named the "loose" DTD. It is the most popular version of XHTML at the moment.
The frameset XHTML 1.0 DTD. The frameset DTD is based on the frameset HTML 4.0 DTD. This is the DTD you should work with when you're creating pages based on frames. In that case, you replace the <body> element with a <frameset> element, and the DTD must reflect that. That's the difference between the XHTML 1.0 transitional and frameset DTDs: the frameset DTD replaces the <body> element with the <frameset> element.
Here are the actual <!DOCTYPE> elements you should use in XHTML for these various DTDs (strict, transitional, and frameset), including the URIs for these DTDs:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
Because I'm giving the DTDs' URIs here, you can copy them and cache a local copy if you want for faster access. For example, if you place the DTD files in a directory named DTD in your Web site, your <!DOCTYPE> elements might look more like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "DTD/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "DTD/xhtml1-transitional.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
    "DTD/xhtml1-frameset.dtd">
If you cache these DTDs locally, there should be less of a bottleneck when XHTML becomes very popular and users try to download your documents.
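To keep the W3C and local forms of these declarations consistent, you could generate them instead of typing them. What follows is only a minimal sketch in Python; the names XHTML10_DTDS, W3C_BASE, and doctype_for are my own inventions for illustration, and the identifiers and file names are the ones listed above.

```python
# Public identifiers and DTD file names, copied from the declarations above.
XHTML10_DTDS = {
    "strict": ("-//W3C//DTD XHTML 1.0 Strict//EN", "xhtml1-strict.dtd"),
    "transitional": ("-//W3C//DTD XHTML 1.0 Transitional//EN",
                     "xhtml1-transitional.dtd"),
    "frameset": ("-//W3C//DTD XHTML 1.0 Frameset//EN", "xhtml1-frameset.dtd"),
}

# Location of the DTD files on the W3C site.
W3C_BASE = "http://www.w3.org/TR/xhtml1/DTD/"


def doctype_for(variant: str, base: str = W3C_BASE) -> str:
    """Return the <!DOCTYPE> line for an XHTML 1.0 variant.

    `base` is where the DTD files live; pass "DTD/" for a local copy.
    """
    public_id, filename = XHTML10_DTDS[variant]
    return f'<!DOCTYPE html PUBLIC "{public_id}" "{base}{filename}">'


print(doctype_for("strict"))
print(doctype_for("transitional", base="DTD/"))
```

Only the system identifier (the URI) changes when you switch to a local copy; the formal public identifier stays the same.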
There's also a new version of XHTML available, version 1.1. This version is not yet in W3C recommendation form; it's a working draft. You can find the current working draft of XHTML 1.1 at http://www.w3.org/TR/xhtml11.
XHTML 1.1 is a strict version of XHTML, and it's clear that the W3C wants to wean HTML authors away from their loose ways into writing very tight XML. How far those HTML authors will follow is yet to be determined. XHTML 1.1 removes all the elements and attributes deprecated in HTML 4.0, and a few more as well.
<APPLET> Versus <OBJECT>
There's another interesting thing going on in XHTML 1.1: The W3C has long said that it wants to replace the <APPLET> and other elements with the Microsoft-supported <OBJECT> element. However, and surprisingly, <OBJECT> is missing from XHTML 1.1. And, surprise, the <APPLET> element is back.
XHTML 1.1 is so far ahead of the pack that many of the features that today's HTML authors and browsers use aren't supported there at all. Therefore, I'm going to stick to XHTML 1.0 transitional in the examples in this chapter and the next chapter. However, I'll also indicate which elements and attributes are supported by what versions of XHTML, including XHTML 1.1, throughout these chapters.
XHTML 1.0 Versus XHTML 1.1
You can find the differences between XHTML 1.0 and XHTML 1.1 at http://www.w3.org/TR/xhtml11/changes.html#a_changes.
When you want to use XHTML 1.1, here's the <!DOCTYPE> element you should use (there's only one XHTML 1.1 DTD, not three, as in XHTML 1.0):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
Another big difference between XHTML 1.1 and XHTML 1.0 goes beyond the support offered to various elements and attributes. XHTML is designed to be modular. In practice, this means that the XHTML 1.1 DTD is actually relatively short; it's a driver DTD, which inserts various other DTDs as modules. The benefit of modular DTDs is that you can omit the modules that your application doesn't support.
For example, if you're supporting XHTML 1.1 on a nonstandard device such as a PDA or even a cell phone or pager (the W3C has all kinds of big ideas for the future), you might not be able to support everything, such as tables or hyperlinks. With XHTML 1.1, all you need to do is to omit the DTD modules corresponding to tables and hyperlinks. (Several modules are marked as required in the XHTML 1.1 DTD, and those cannot be omitted.)
In fact, there's another version of XHTML that is also in the working draft stage: XHTML Basic. XHTML Basic is a very small subset of XHTML, reduced to a bare minimum so that it can be supported by devices considerably simpler than standard PCs. You can find the current working draft for XHTML Basic at http://www.w3.org/TR/xhtml-basic.
If you want to use XHTML Basic, here's the <!DOCTYPE> element you should use:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
The W3C has a number of requirements for documents before they can be called true XHTML documents. Here's the list of requirements that documents must meet:
The document must successfully validate against one of the W3C XHTML DTDs.
The document element must be <html>.
The document element, <html>, must set an XML namespace for the document, using the xmlns attribute. This namespace must be "http://www.w3.org/1999/xhtml".
There must be a <!DOCTYPE> element, and it must appear before the document element.
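Most of these requirements can be checked mechanically. Here's a rough sketch using Python's standard xml.dom.minidom module; the function name structural_problems and the SAMPLE string are hypothetical, made up for this illustration. Note the assumption: minidom does not validate against a DTD, so the first requirement in the list is not covered here, and the ordering requirement is enforced by XML's own grammar (a document with its <!DOCTYPE> after the document element is not well-formed and fails to parse).

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The chapter's sample page; the string must start with the XML declaration.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
    '    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n'
    '<head><title>Welcome to my page</title></head>\n'
    '<body><h1>Welcome to XHTML!</h1></body>\n'
    '</html>\n'
)


def structural_problems(markup: str) -> list:
    """List violations of the structural requirements (no DTD validation)."""
    doc = minidom.parseString(markup)  # raises ExpatError if not well-formed
    problems = []
    if doc.doctype is None:
        problems.append("no <!DOCTYPE> before the document element")
    elif "XHTML" not in (doc.doctype.publicId or ""):
        problems.append("<!DOCTYPE> does not name an XHTML DTD")
    root = doc.documentElement
    if root.tagName != "html":
        problems.append("document element is not <html>")
    if root.namespaceURI != XHTML_NS:
        problems.append("<html> is not in the XHTML namespace")
    return problems


print(structural_problems(SAMPLE))
```

A page that passes these checks can still be invalid; for the validation step you need a validating parser or the W3C validator.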
XHTML is designed to be displayed in today's browsers, and it works well (largely because those browsers ignore elements that they don't understand, such as <?xml?> and <!DOCTYPE>). However, because XHTML is also XML, a number of differences exist between legal HTML and legal XHTML.
As you know, XML is more particular about many aspects of writing documents than HTML is. For example, you need to place all attribute values in quotes in XML, although HTML documents can use unquoted values because HTML browsers will accept that. One of the problems the W3C is trying to solve with XHTML, in fact, is the thicket of nonstandard HTML that's out there on the Web, mostly because browsers support it. Some observers estimate that half of the code in browsers is there to handle nonstandard use of HTML, and that discourages any but the largest companies from creating HTML browsers. XHTML is supposed to be different: if a document isn't in perfect XHTML, the browser is supposed to quit loading it and display an error, not guess what the document author was trying to do. Hopefully, that will make it easier to write XHTML browsers.
Here are some of the major differences between HTML and XHTML:
XHTML documents must be well-formed XML documents.
Element and attribute names must be in lowercase.
Elements that aren't empty need end tags; end tags can't be omitted as they can sometimes in HTML.
Attribute values must always be quoted.
You cannot use "standalone" attributes that are not assigned values. If need be, assign a dummy value to an attribute, as in action = "action".
Empty elements must end with the /> characters. In practice, this does not seem to be a problem for the major browsers, which is a lucky thing for XHTML because it's definitely not standard HTML.
The <a> element cannot contain other <a> elements.
The <pre> element cannot contain the <img>, <object>, <big>, <small>, <sub>, or <sup> elements.
The <button> element cannot contain the <input>, <select>, <textarea>, <label>, <button>, <form>, <fieldset>, <iframe>, or <isindex> elements.
The <label> element cannot contain other <label> elements.
The <form> element cannot contain other <form> elements.
You must use the id attribute, not the name attribute, even on elements that have also had a name attribute. In XHTML 1.0, the name attribute of the <a>, <applet>, <form>, <frame>, <iframe>, <img>, and <map> elements is formally deprecated. In practice, this is a little difficult in browsers such as Netscape that support name and not id; in that case, you should use both attributes in the same element, even though it's not legal XHTML.
You must escape sensitive characters. For example, when an attribute value contains an ampersand (&), the ampersand must be expressed as a character entity reference, as &amp;.
As we'll see in the next chapter, there are some additional requirements. For example, if you use < characters in your scripts, you should either escape such characters as &lt;, or, if the browser can't handle that, place the script in an external file. (The W3C's suggestion to place scripts in CDATA sections is definitely not understood by any major browser today.)
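Several of the differences listed above are really just XML well-formedness rules, so any XML parser will catch them. As a small sketch, here's how you might confirm that with Python's standard xml.etree.ElementTree module. The helper is_well_formed and the markup fragments are my own illustrations, not part of any XHTML tool, and rules that depend on the DTD (lowercase names, forbidden nesting) are not checked here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Pairs of HTML-style markup and the XHTML equivalent.
examples = {
    "unquoted attribute": '<img src=logo.gif alt="Logo" />',
    "quoted attribute": '<img src="logo.gif" alt="Logo" />',
    "unclosed empty element": '<p>One<br>Two</p>',
    "closed empty element": '<p>One<br />Two</p>',
    "standalone attribute": '<input type="checkbox" checked />',
    "attribute with value": '<input type="checkbox" checked="checked" />',
    "bare ampersand": '<a href="find?a=1&b=2">Search</a>',
    "escaped ampersand": '<a href="find?a=1&amp;b=2">Search</a>',
}

for label, markup in examples.items():
    print(f"{label}: {is_well_formed(markup)}")
```

In each pair, the first fragment is rejected by the parser and the second is accepted.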
You may already have a huge Web site full of HTML pages, and you might be reading all this with some trepidation: how are you going to convert all those pages to the far more strict XHTML? In fact, a utility out there can do it for you: the Tidy utility, created by Dave Raggett. This utility is available for a wide variety of platforms, and you can download it for free from http://www.w3.org/People/Raggett/tidy. There's also a complete set of instructions on that page.
Here's an example: I'll use Tidy in Windows to convert a file from HTML to XHTML. In this case, I'll use the example HTML file we developed earlier, as saved in a file named index.html:
<HTML> <HEAD> <TITLE> Welcome to my page </TITLE> </HEAD> <BODY> <H1> Welcome to XHTML! </H1> </BODY> </HTML>
After downloading Tidy, you run it at the command prompt. Here are the command-line switches, or options, that you can use with Tidy:
Switch | Description |
---|---|
-config file | Use the configuration file named file |
-indent or -i | Indent element content |
-omit or -o | Omit optional end tags |
-wrap 72 | Wrap text at column 72 (default is 68) |
-upper or -u | Force tags to uppercase (default is lowercase) |
-clean or -c | Replace font, nobr, and center tags with CSS |
-raw | Don't substitute entities for characters 128 to 255 |
-ascii | Use ASCII for output, and Latin-1 for input |
-latin1 | Use Latin-1 for both input and output |
-utf8 | Use UTF-8 for both input and output |
-iso2022 | Use ISO2022 for both input and output |
-numeric or -n | Output numeric rather than named entities |
-modify or -m | Modify original files |
-errors or -e | Show only error messages |
-quiet or -q | Suppress nonessential output |
-f file | Write errors to file |
-xml | Use this when input is in XML |
-asxml | Convert HTML to XML |
-slides | Burst into slides on h2 elements |
-help | List command-line options |
-version | Show release date |
In this example, I'll use three switches:
-m indicates that I want Tidy to modify the file I pass to it, which will be index.html
-i indicates that I want it to indent the resulting XHTML elements
-config indicates that I want to use a configuration file, which I'll name configuration.txt.
Here's how I use Tidy from the command line:
%tidy -m -i -config configuration.txt index.html
Tidy is actually a utility that cleans up HTML, as you might gather from its name. To make it create XHTML, you must use a configuration file, which I've named configuration.txt here. You can see all the configuration file options on the Tidy Web site. Here are the contents of configuration.txt, which I'll use to convert index.html to XHTML:
output-xhtml: yes
add-xml-pi: yes
doctype: loose
Here, output-xhtml indicates that I want Tidy to create XHTML output. Using add-xml-pi indicates that the output should also include an XML declaration, and doctype: loose means that I want to use the transitional XHTML DTD. If you don't specify what DTD to use, Tidy will guess, based on your HTML.
Here's the resulting XHTML document:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta name="generator" content="HTML Tidy, see www.w3.org" />
    <title>Welcome to my page</title>
  </head>
  <body>
    <h1> Welcome to XHTML!</h1>
  </body>
</html>
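If you have many pages to convert, the Tidy run shown above could be scripted. The following is a hedged sketch in Python, not part of Tidy itself: the function names are hypothetical, the switches and configuration options are exactly the ones used in this section, and I'm assuming an executable named tidy is on your path. Option names may differ in other Tidy releases, so check the documentation for the version you install.

```python
import shutil
import subprocess
from pathlib import Path

# Configuration options from this section, one per line in the file.
CONFIG_LINES = ["output-xhtml: yes", "add-xml-pi: yes", "doctype: loose"]


def write_config(path: Path) -> None:
    """Write the Tidy configuration file used in this section."""
    path.write_text("\n".join(CONFIG_LINES) + "\n", encoding="ascii")


def tidy_command(config: Path, target: Path) -> list:
    """Build the command line: modify in place, indent, use the config file."""
    return ["tidy", "-m", "-i", "-config", str(config), str(target)]


def convert_in_place(config: Path, target: Path) -> bool:
    """Run Tidy on one file; return False if Tidy is not installed."""
    if shutil.which("tidy") is None:
        return False
    # check=False: Tidy can exit with a nonzero status for mere warnings.
    subprocess.run(tidy_command(config, target), check=False)
    return True


print(" ".join(tidy_command(Path("configuration.txt"), Path("index.html"))))
```

Because -m overwrites the original file, it's worth keeping a backup copy of your HTML before running a conversion like this over a whole site.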
You can even teach Tidy about new XHTML tags that you've added. If you're ever stuck and want a quick way of translating HTML into XHTML, check out Tidy; it's fast, it's effective, and it's free.
The W3C has a validator you can use to check the validity of your XHTML document, and you can find this validator at http://validator.w3.org. To use the XHTML validator, you just enter the URI of your document and click the Validate This Page button. The W3C validator checks the document and gives you a full report. Here's an example response:
Congratulations, this document validates as XHTML1.0 Transitional! To show your readers that you have taken the care to create an interoperable Web page, you may display this icon on any page that validates. Here is the HTML you could use to add this icon to your Web page: <p> <a href="http://validator.w3.org/check/referer"><img src="http://validator.w3.org/images/vxhtml10" alt="Valid XHTML 1.0!" height="31" width="88" /></a> </p>
In this case, the document I tested validated properly, and the W3C validator says that I can add the official W3C XHTML 1.0 Transitional logo to the document. That logo appears in Figure 16.2.
Actually, the W3C XHTML validator does not do a complete job. It doesn't check to see if values are supplied for required attributes, for example, or make sure that child elements are allowed to be nested inside the particular type of their parents. However, it does a reasonably good job.
In the remainder of this chapter, I'm going to get to the actual XHTML programming, starting with the document element, <html>.
This element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are its attributes:
Attribute | Description |
---|---|
dir | Sets the direction of text that doesn't have an inherent direction in which you should read it, called directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
xml:lang | Specifies the base language for the element when the document is interpreted as XML. |
xmlns | Is required. Set this attribute to "http://www.w3.org/1999/xhtml". |
The document element for all XHTML documents is <html>, which is how XHTML matches the <HTML> element in HTML documents. This element must contain all the content of the document, as in this example:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Welcome to my page </title> </head> <body> <h1> Welcome to XHTML! </h1> </body> </html>
The document element is very important in XML documents, of course. Note that this is one of the big differences between XHTML and HTML: in HTML, the <HTML> tag is optional because it's the default. To be valid XHTML, a document must have an <html> element.
Of all the attributes of this element, only one is required: xmlns, which sets the XML namespace. Most XML applications set up their own namespace to avoid overlap, and XHTML is no exception; you must set the xmlns attribute to "http://www.w3.org/1999/xhtml" in XHTML documents.
This tag also supports the lang and xml:lang attributes to let you specify the language of the document. If you specify values for both these attributes, the xml:lang attribute takes precedence in XHTML.
In XHTML, the <html> element can contain a <head> and a <body> element (or a <head> and a <frameset> element, in the XHTML 1.0 frameset document).
The <head> element contains the head of an XHTML document, which should contain at least a <title> element. The <head> element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes of this element:
Attribute | Description |
---|---|
dir | Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text, or RTL, for right-to-left text. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
profile | Specifies the location of one or more whitespace-separated metadata profile URIs. |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
Each XHTML document should have a <head> element, like the one in this example:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Welcome to my page </title> </head> <body> <h1> Welcome to XHTML! </h1> </body> </html>
The head of an XHTML document holds information that isn't directly displayed in the document itself, such as a title for the document (which usually appears in the browser's title bar), keywords that search engines can pick up, the base address for URIs, and so on. The head of every XHTML document is supposed to contain a <title> element, which holds the title of the document.
This element also supports the usual attributes, such as lang and xml:lang, as well as one attribute that is specific to <head> elements: profile. The profile attribute can hold a whitespace-separated list of URIs that hold information about the document, such as a description of the document, the author's name, copyright information, and so forth. (None of the major browsers implement this attribute yet.)
Here are the elements that can appear in the head:
Elements | Description |
---|---|
<base> | Specifies the base URI for the document |
<isindex> | Supports rudimentary input control |
<link> | Specifies the relationship between the document and an external object |
<meta> | Contains information about the document |
<noscript> | Contains text that appears only if the browser does not support the <script> tag |
<object> | Embeds an object |
<script> | Contains programming scripts, such as JavaScript code |
<style> | Contains style information used for rendering elements |
<title> | Gives the document's title, which appears in the browser |
As mentioned, each <head> element should contain exactly one <title> element.
As in HTML, the <title> element contains a title for the document, stored as simple text. Most browsers will read the document's title and display it in its title bar. The title of a document is also used by search engines. This element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes for this element:
Attribute | Description |
---|---|
dir | Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text or RTL, for right-to-left text. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
You use the <title> element to specify the document's title to browsers and search engines; browsers usually display the title in the title bar. The W3C XHTML DTDs say, "Exactly one title is required per document." However, the W3C XHTML validator doesn't complain if you omit a title. Nonetheless, you should put a <title> element in every document.
We saw an example <title> element at the beginning of this chapter:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Welcome to my page </title> </head> <body> <h1> Welcome to XHTML! </h1> </body> </html>
No major browser will react badly if you don't give a document a title. However, XHTML documents should have one, according to the W3C.
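Because the validator may not flag a missing title, you might want a check of your own. Here's a minimal sketch with Python's standard xml.etree.ElementTree module; the function name title_count is hypothetical, and the check assumes the document puts its elements in the XHTML namespace as required.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def title_count(markup: str) -> int:
    """Count the <title> elements that are direct children of <head>."""
    root = ET.fromstring(markup)
    # ElementTree writes namespaced names as {namespace-uri}local-name.
    path = f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title"
    return len(root.findall(path))


with_title = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>Welcome to my page</title></head>'
    '<body><h1>Welcome to XHTML!</h1></body></html>'
)
without_title = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head></head>'
    '<body><h1>Welcome to XHTML!</h1></body></html>'
)

print(title_count(with_title), title_count(without_title))
```

A result other than 1 means the document breaks the "exactly one title" rule.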
We've completed the head section of XHTML documents; next comes the body.
The document's body is where the action is; it holds the content that the document is designed to contain. The <body> element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, and XHTML 1.1. Here are this element's attributes, all of which are supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, and XHTML 1.1, unless otherwise noted:
Attribute | Description |
---|---|
alink | Is deprecated in HTML 4.0. Sets the color of hyperlinks when they're being activated. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
background | Is deprecated in HTML 4.0. Holds the URI of an image to be used in tiling the browser's background. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
bgcolor | Is deprecated in HTML 4.0. Sets the color of the browser's background. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
class | Gives the style class of the element. |
dir | Sets the direction of directionally neutral text. You can set this attribute to LTR, for left-to-right text or RTL, for right-to-left text. |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
link | Is deprecated in HTML 4.0. Sets the color of hyperlinks that have not yet been visited. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
style | Set to an inline style to specify how the browser should display the element. |
text | Is deprecated in HTML 4.0. Sets the color of the text in the document. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
title | Contains the title of the body (which might be displayed in ToolTips). |
vlink | Is deprecated in HTML 4.0. Sets the color of hyperlinks that have been visited already. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
This element also supports these events in XHTML: onclick, ondblclick, onload, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, onkeyup, and onunload. You can use scripts like JavaScript with events like these, and I'll take a look at how in the next chapter.
If you place descriptions of your document in the <head> element, you place the actual content of the document in the <body> element, unless you're sectioning your page into frames, in which case you should use the <frameset> element instead of the <body> element.
We've already seen a simple example in which the content of a page is just an <h1> heading, like this:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Welcome to my page </title> </head> <body> <h1> Welcome to XHTML! </h1> </body> </html>
If you've written HTML before, you may be startled to discover that many cherished attributes are now considered deprecated in XHTML, which means that they're omitted from XHTML 1.0 Strict and XHTML 1.1. Deprecated attributes of the <body> element include these:
alink
background
bgcolor
link
text
vlink
Instead of using these attributes, you're now supposed to use style sheets. Here's an example showing how to replace deprecated attributes. In this case, I'll set the browser's background to white, the color of displayed text to black, the color of hyperlinks (created with the <a> element, which I'll take a look at in the next chapter) to red, the color of activated links to blue, and the color of visited links to green, all using deprecated attributes of the <body> element:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Welcome to my page </title> </head> <body bgcolor="white" text="black" link="red" alink="blue" vlink="green"> Welcome to my XHTML document. Want to check out more about XHTML? Go to <a href="http://www.w3c.org">W3C</a>. </body> </html>
You can see this document displayed in Netscape in Figure 16.3. It works as it should, but the fact is that it's not strict XHTML.
To make the same page adhere to the XHTML strict standard, you use style sheets. Here's how this page looks using a <style> element to set up the same colors (I'll take a look at the <style> element more closely in the next chapter):
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Welcome to my page
        </title>
        <style type="text/css">
            body {background: white; color: black}
            a:link {color: red}
            a:visited {color: green}
            a:active {color: blue}
        </style>
    </head>
    <body>
        Welcome to my XHTML document.
        Want to check out more about XHTML?
        Go to <a href="http://www.w3c.org">W3C</a>.
    </body>
</html>
In this case, I'm using CSS to style this document. To separate content from markup, the W3C is relying on style sheets a great deal these days. Note, however, that the contents of a <style> element are still part of the XHTML document. This means that if you use sensitive characters such as & or < in it, you should either escape those characters or use an external style sheet, which I'll take a look at in the next chapter.
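To see what escaping does to the contents of a <style> element, here's a short sketch using the escape function from Python's standard xml.sax.saxutils module. The style text is a made-up example; the point is only that an XML parser hands back the original characters after reading the escaped form.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical style text containing the sensitive characters & and <.
raw_css = "/* links & headings < 2em */ a:link {color: red}"

# escape() replaces &, <, and > with &amp;, &lt;, and &gt;.
escaped_css = escape(raw_css)
print(escaped_css)

# Embedded in a <style> element, the escaped text is well-formed XML,
# and the parser turns the entity references back into the characters.
style = ET.fromstring(f'<style type="text/css">{escaped_css}</style>')
print(style.text == raw_css)
```

Keep in mind that a browser treating the page as HTML may not undo the escaping inside <style>, which is one reason an external style sheet is the safer choice.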
Because XHTML documents are actually XML, they support XML comments, which you can use to annotate your document. These annotations will not be displayed by the browser. XHTML comments are supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Comments have no attributes.
We're familiar with comments from XML; you enclose the text in a comment like this: <!--This page was last updated July 3.-->. Using comments, you can describe to readers what's going on in your document.
Here's how I might add comments to an XHTML document:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <!-- This is the document head --> <head> <!-- This is the document title --> <title> Welcome to my page </title> </head> <!-- This is the document body element --> <body> <h1> <!-- This is an h1 heading element --> Welcome to XHTML! </h1> </body> </html>
As with any XML document, the comments are not part of the document's content, so the XML processor that reads the document does not pass them on to be displayed. On the other hand, keep in mind that comments are still text that must be downloaded; if you have a lot of them in a lengthy document, you can increase the download time of your document significantly.
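Here's a short Python sketch of both points, assuming the standard xml.etree.ElementTree parser (which discards comments by default) as the XML processor. The fragment is a cut-down version of the commented page above, without the XML declaration and DOCTYPE:

```python
import xml.etree.ElementTree as ET

commented = (
    '<body><!-- This is the document body element -->'
    '<h1><!-- This is an h1 heading element --> Welcome to XHTML! </h1>'
    '</body>'
)
plain = '<body><h1> Welcome to XHTML! </h1></body>'

# The default parser drops comments, so only the heading text survives.
root = ET.fromstring(commented)
visible_text = "".join(root.itertext()).strip()

# The comments still travel over the wire: count the extra characters.
overhead = len(commented) - len(plain)

print(visible_text)
print(overhead, "extra characters spent on comments")
```

The comment text never reaches the application, but the extra characters are still part of the file that the browser has to download.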
You use the <h1> through <h6> elements to create headings in your documents. These are the familiar headings from HTML: <h1> creates the largest text, and <h6> creates the smallest. These elements are supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. This table lists the possible attributes of these elements. Unless otherwise noted, XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1 all support them:
Attribute | Description |
---|---|
align | Gives the alignment of text in the heading. The possible values are left (the default), center, right, and justify. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
class | Gives the style class of the element. |
dir | Sets the direction of directionally neutral text. You can set this attribute to ltr, for left-to-right text, or rtl, for right-to-left text. (XHTML is case-sensitive, so these values must be lowercase.) |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
These elements also support these events in XHTML: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.
Headings act much like headlines in newspapers. They are block elements that present text in bold and that often are larger than other text. Six heading tags exist: <h1>, <h2>, <h3>, <h4>, <h5>, and <h6>. Because headings are block elements, they get their own line in a displayed XHTML document.
Here's an example that shows these headings in action:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> The &lt;h1&gt; - &lt;h6&gt; Elements </title> </head> <body> <center> <h1>This is an &lt;h1&gt; heading</h1> <h2>This is an &lt;h2&gt; heading</h2> <h3>This is an &lt;h3&gt; heading</h3> <h4>This is an &lt;h4&gt; heading</h4> <h5>This is an &lt;h5&gt; heading</h5> <h6>This is an &lt;h6&gt; heading</h6> </center> </body> </html>
You can see this XHTML displayed in Netscape in Figure 16.4. Headings such as these help break up the text in a page, just as they do in newspapers, and they let the structure of your document stand out.
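Because XHTML is XML, a program can read that heading structure directly. Here's a minimal Python sketch, assuming the standard xml.etree.ElementTree parser; the heading_outline function and the sample page are hypothetical, made up for illustration, and the page leaves out the XML declaration and DOCTYPE to keep things short:

```python
import xml.etree.ElementTree as ET

# ElementTree reports namespaced tags as "{namespace}localname".
XHTML_NS = "{http://www.w3.org/1999/xhtml}"
HEADING_TAGS = {XHTML_NS + "h%d" % level: level for level in range(1, 7)}

def heading_outline(markup):
    """Return (level, text) pairs for every h1-h6 element, in document order."""
    root = ET.fromstring(markup)
    return [
        (HEADING_TAGS[element.tag], "".join(element.itertext()).strip())
        for element in root.iter()
        if element.tag in HEADING_TAGS
    ]

# Hypothetical sample page, not one of the chapter's listings.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Outline</title></head>'
    '<body><h1>Welcome to XHTML!</h1><p>Some text.</p>'
    '<h2>Using &lt;h2&gt; headings</h2></body></html>'
)

# Print the headings indented by level, like a table of contents.
for level, text in heading_outline(sample):
    print("  " * (level - 1) + text)
```

Note that the tag names are qualified with the XHTML namespace from the xmlns attribute, which is why the sketch looks for names such as {http://www.w3.org/1999/xhtml}h1 rather than plain h1.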
Displaying simple text works the same way in XHTML as it does in HTML: You just place the text directly in a document. XHTML elements that display text have mixed content models, so they can contain both text and other elements, as in the example we saw earlier:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Welcome to my page </title> </head> <body bgcolor="white" text="black" link="red" alink="blue" vlink="green"> Welcome to my XHTML document. Want to check out more about XHTML? Go to <a href="http://www.w3c.org">W3C</a>. </body> </html>
The text in this document is displayed directly in the browser, as you see in Figure 16.3. It's up to you to format the text the way you want it. In early versions of HTML, you used elements such as <b> (bold), <i> (italic), and <u> (underline) to format text, but as formatting has become more sophisticated, the emphasis has switched to using style sheets.
As you know, there are five predefined entity references in XML, and they stand for characters that can be interpreted as markup or other control characters:
This | Displays This |
---|---|
&amp; | & |
&apos; | ' |
&gt; | > |
&lt; | < |
&quot; | " |
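If you generate XHTML from a program, you can let a library insert the five predefined entity references for you. Here's a minimal Python sketch using escape and unescape from the standard xml.sax.saxutils module; the to_xml_text and from_xml_text helper names and the sample string are my own, chosen for illustration:

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, <, and > on its own; the two quote characters
# have to be passed in as an extra mapping.
QUOTES_OUT = {'"': "&quot;", "'": "&apos;"}
QUOTES_IN = {"&quot;": '"', "&apos;": "'"}

def to_xml_text(text):
    """Replace all five sensitive characters with entity references."""
    return escape(text, QUOTES_OUT)

def from_xml_text(markup):
    """Turn the five predefined entity references back into characters."""
    return unescape(markup, QUOTES_IN)

sample = 'AT&T\'s "<b>" tag'
encoded = to_xml_text(sample)

print(encoded)
print(from_xml_text(encoded) == sample)
```

Escaping the quotation marks matters mostly inside attribute values; in element content, only & and < strictly need it.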
There are a great many more character entities in HTML 4.0, and they're supported in XHTML as well. You can find them in Table 16.1.
Entity | Character | Description |
---|---|---|
Aacute | Á | Latin capital letter A with acute accent |
aacute | á | Latin small letter a with acute accent |
Acirc | Â | Latin capital letter A with circumflex |
acirc | â | Latin small letter a with circumflex |
acute | ´ | Acute accent |
AElig | Æ | Latin capital letter AE |
aelig | æ | Latin small letter ae |
Agrave | À | Latin capital letter A with grave accent |
agrave | à | Latin small letter a with grave accent |
alefsym | ℵ | Alef symbol = first transfinite cardinal |
Alpha | Α | Greek capital letter alpha |
alpha | α | Greek small letter alpha |
amp | & | Ampersand |
and | ∧ | Logical and |
ang | ∠ | Angle |
Aring | Å | Latin capital letter A with ring above |
aring | å | Latin small letter a with ring above |
asymp | ≈ | Almost equal to = asymptotic to |
Atilde | Ã | Latin capital letter A with tilde |
atilde | ã | Latin small letter a with tilde |
Auml | Ä | Latin capital letter A with diaeresis (umlaut) |
auml | ä | Latin small letter a with diaeresis (umlaut) |
bdquo | „ | Double low-9 quotation mark |
Beta | Β | Greek capital letter beta |
beta | β | Greek small letter beta |
brvbar | ¦ | Broken bar = broken vertical bar |
bull | • | Bullet = black small circle |
cap | ∩ | Intersection = cap |
Ccedil | Ç | Latin capital letter C with cedilla |
ccedil | ç | Latin small letter c with cedilla |
cedil | ¸ | Cedilla |
cent | ¢ | Cent sign |
Chi | Χ | Greek capital letter chi |
chi | χ | Greek small letter chi |
circ | ˆ | Modifier letter circumflex accent |
clubs | ♣ | Black club suit = shamrock |
cong | ≅ | Approximately equal to |
copy | © | Copyright sign |
crarr | ↵ | Downward arrow with corner leftward |
cup | ∪ | Union = cup |
curren | ¤ | Currency sign |
dagger | † | Dagger |
Dagger | ‡ | Double dagger |
darr | ↓ | Downward arrow |
dArr | ⇓ | Downward double arrow |
deg | ° | Degree sign |
Delta | Δ | Greek capital letter delta |
delta | δ | Greek small letter delta |
diams | ♦ | Black diamond suit |
divide | ÷ | Division sign |
Eacute | É | Latin capital letter E with acute |
eacute | é | Latin small letter e with acute |
Ecirc | Ê | Latin capital letter E with circumflex |
ecirc | ê | Latin small letter e with circumflex |
Egrave | È | Latin capital letter E with grave accent |
egrave | è | Latin small letter e with grave accent |
empty | ∅ | Empty set = null set = diameter |
emsp |   | Em space |
ensp |   | En space |
Epsilon | Ε | Greek capital letter epsilon |
epsilon | ε | Greek small letter epsilon |
equiv | ≡ | Identical to |
Eta | Η | Greek capital letter eta |
eta | η | Greek small letter eta |
ETH | Ð | Latin capital letter ETH |
eth | ð | Latin small letter eth |
Euml | Ë | Latin capital letter E with diaeresis (umlaut) |
euml | ë | Latin small letter e with diaeresis |
euro | € | Euro sign |
exist | ∃ | There exists |
fnof | ƒ | Latin small f with hook = function |
forall | ∀ | For all |
frac12 | ½ | Vulgar fraction one-half |
frac14 | ¼ | Vulgar fraction one-quarter |
frac34 | ¾ | Vulgar fraction three-quarters |
frasl | ⁄ | Fraction slash |
Gamma | Γ | Greek capital letter gamma |
gamma | γ | Greek small letter gamma |
ge | ≥ | Greater than or equal to |
gt | > | Greater than sign |
harr | ↔ | Left right arrow |
hArr | ⇔ | Left right double arrow |
hearts | ♥ | Black heart suit = valentine |
hellip | … | Horizontal ellipsis = three-dot leader |
Iacute | Í | Latin capital letter I with acute accent |
iacute | í | Latin small letter i with acute accent |
Icirc | Î | Latin capital letter I with circumflex |
icirc | î | Latin small letter i with circumflex |
iexcl | ¡ | Inverted exclamation mark |
Igrave | Ì | Latin capital letter I with grave accent |
igrave | ì | Latin small letter i with grave accent |
image | ℑ | Blackletter capital I = imaginary part |
infin | ∞ | Infinity |
int | ∫ | Integral |
Iota | Ι | Greek capital letter iota |
iota | ι | Greek small letter iota |
iquest | ¿ | Inverted question mark |
isin | ∈ | Element of |
Iuml | Ï | Latin capital letter I with diaeresis (umlaut) |
iuml | ï | Latin small letter i with diaeresis |
Kappa | Κ | Greek capital letter kappa |
kappa | κ | Greek small letter kappa |
Lambda | Λ | Greek capital letter lambda |
lambda | λ | Greek small letter lambda |
lang | 〈 | Left-pointing angle bracket = bra |
laquo | « | Left-pointing double angle quotation mark |
larr | ← | Leftward arrow |
lArr | ⇐ | Leftward double arrow |
lceil | ⌈ | Left ceiling = apl upstile |
ldquo | “ | Left double quotation mark |
le | ≤ | Less than or equal to |
lfloor | ⌊ | Left floor = apl downstile |
lowast | ∗ | Asterisk operator |
loz | ◊ | Lozenge |
lrm | ‎ | Left-to-right mark |
lsaquo | ‹ | Single left-pointing angle quotation mark |
lsquo | ‘ | Left single quotation mark |
lt | < | Less than |
macr | ¯ | Macron = spacing macron |
mdash | — | Em dash |
micro | µ | Micro sign |
middot | · | Middle dot |
minus | − | Minus sign |
Mu | Μ | Greek capital letter mu |
mu | μ | Greek small letter mu |
nabla | ∇ | Nabla = backward difference |
nbsp |   | No-break space = nonbreaking space |
ndash | – | En dash |
ne | ≠ | Not equal to |
ni | ∋ | Contains as member |
not | ¬ | Not sign |
notin | ∉ | Not an element of |
nsub | ⊄ | Not a subset of |
Ntilde | Ñ | Latin capital letter N with tilde |
ntilde | ñ | Latin small letter n with tilde |
Nu | Ν | Greek capital letter nu |
nu | ν | Greek small letter nu |
Oacute | Ó | Latin capital letter O with acute accent |
oacute | ó | Latin small letter o with acute accent |
Ocirc | Ô | Latin capital letter O with circumflex |
ocirc | ô | Latin small letter o with circumflex |
OElig | Œ | Latin capital ligature OE |
oelig | œ | Latin small ligature oe |
Ograve | Ò | Latin capital letter O with grave accent |
ograve | ò | Latin small letter o with grave accent |
oline | ‾ | Overline = spacing overscore |
Omega | Ω | Greek capital letter omega |
omega | ω | Greek small letter omega |
Omicron | Ο | Greek capital letter omicron |
omicron | ο | Greek small letter omicron |
oplus | ⊕ | Circled plus = direct sum |
or | ∨ | Logical or = vee |
ordf | ª | Feminine ordinal indicator |
ordm | º | Masculine ordinal indicator |
Oslash | Ø | Latin capital letter O with stroke |
oslash | ø | Latin small letter o with stroke |
Otilde | Õ | Latin capital letter O with tilde |
otilde | õ | Latin small letter o with tilde |
otimes | ⊗ | Circled times = vector product |
Ouml | Ö | Latin capital letter O with diaeresis (umlaut) |
ouml | ö | Latin small letter o with diaeresis (umlaut) |
para | ¶ | Pilcrow sign |
part | ∂ | Partial differential |
permil | ‰ | Per mille sign |
perp | ⊥ | Up tack = orthogonal to = perpendicular |
Phi | Φ | Greek capital letter phi |
phi | φ | Greek small letter phi |
Pi | Π | Greek capital letter pi |
pi | π | Greek small letter pi |
piv | ϖ | Greek pi symbol |
plusmn | ± | Plus-minus sign |
pound | £ | Pound sign |
prime | ′ | Prime = minutes = feet |
Prime | ″ | Double prime = seconds = inches |
prod | ∏ | N-ary product = product sign |
prop | ∝ | Proportional to |
Psi | Ψ | Greek capital letter psi |
psi | ψ | Greek small letter psi |
quot | " | Quotation mark = APL quote |
radic | √ | Square root = radical sign |
rang | 〉 | Right-pointing angle bracket = ket |
raquo | » | Right-pointing double angle quotation mark |
rarr | → | Rightward arrow |
rArr | ⇒ | Rightward double arrow |
rceil | ⌉ | Right ceiling |
rdquo | ” | Right double quotation mark |
real | ℜ | Blackletter capital R = real part symbol |
reg | ® | Registered sign |
rfloor | ⌋ | Right floor |
Rho | Ρ | Greek capital letter rho |
rho | ρ | Greek small letter rho |
rlm | ‏ | Right-to-left mark |
rsaquo | › | Single right-pointing angle quotation mark |
rsquo | ’ | Right single quotation mark |
sbquo | ‚ | Single low-9 quotation mark |
Scaron | Š | Latin capital letter S with caron |
scaron | š | Latin small letter s with caron |
sdot | ⋅ | Dot operator |
sect | § | Section sign |
shy | ­ | Soft hyphen |
Sigma | Σ | Greek capital letter sigma |
sigma | σ | Greek small letter sigma |
sigmaf | ς | Greek small letter final sigma |
sim | ∼ | Tilde operator |
spades | ♠ | Black spade suit |
sub | ⊂ | Subset of |
sube | ⊆ | Subset of or equal to |
sum | ∑ | N-ary summation |
sup | ⊃ | Superset of |
sup1 | ¹ | Superscript 1 |
sup2 | ² | Superscript 2 |
sup3 | ³ | Superscript 3 |
supe | ⊇ | Superset of or equal to |
szlig | ß | Latin small letter sharp s |
Tau | Τ | Greek capital letter tau |
tau | τ | Greek small letter tau |
there4 | ∴ | Therefore |
Theta | Θ | Greek capital letter theta |
theta | θ | Greek small letter theta |
thetasym | ϑ | Greek small letter theta symbol |
thinsp |   | Thin space |
THORN | Þ | Latin capital letter THORN |
thorn | þ | Latin small letter thorn |
tilde | ˜ | Small tilde |
times | × | Multiplication sign |
trade | ™ | Trademark sign |
Uacute | Ú | Latin capital letter U with acute accent |
uacute | ú | Latin small letter u with acute accent |
uarr | ↑ | Upward arrow |
uArr | ⇑ | Upward double arrow |
Ucirc | Û | Latin capital letter U with circumflex |
ucirc | û | Latin small letter u with circumflex |
Ugrave | Ù | Latin capital letter U with grave accent |
ugrave | ù | Latin small letter u with grave accent |
uml | ¨ | Diaeresis (umlaut) |
upsih | ϒ | Greek upsilon with hook symbol |
Upsilon | Υ | Greek capital letter upsilon |
upsilon | υ | Greek small letter upsilon |
Uuml | Ü | Latin capital letter U with diaeresis (umlaut) |
uuml | ü | Latin small letter u with diaeresis |
weierp | ℘ | Script capital P = power set |
Xi | Ξ | Greek capital letter xi |
xi | ξ | Greek small letter xi |
Yacute | Ý | Latin capital letter Y with acute accent |
yacute | ý | Latin small letter y with acute accent |
yen | ¥ | Yen sign = yuan sign |
Yuml | Ÿ | Latin capital letter Y with diaeresis |
yuml | ÿ | Latin small letter y with diaeresis |
Zeta | Ζ | Greek capital letter zeta |
zeta | ζ | Greek small letter zeta |
zwj | ‍ | Zero-width joiner |
zwnj | ‌ | Zero-width nonjoiner |
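These named entities are declared in the XHTML DTDs, so an XML processor that doesn't read the DTD won't recognize them; numeric character references work everywhere. Here's a minimal Python sketch that rewrites the named entities as numeric references, assuming the name2codepoint table from the standard html.entities module (which covers the HTML 4 entity names). The to_numeric_refs function name is hypothetical:

```python
import re
from html.entities import name2codepoint

# The five entities every XML processor already understands stay as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def to_numeric_refs(markup):
    """Rewrite named HTML 4 entities such as &eacute; as &#233;."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined and unknown names alone
        return "&#%d;" % name2codepoint[name]
    return ENTITY.sub(replace, markup)

converted = to_numeric_refs("caf&eacute; &amp; cr&egrave;me &copy; 2001")
print(converted)
```

Names that aren't in the table are left untouched rather than guessed at, so a typo in an entity name still shows up as an error when the document is parsed.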
As I mentioned before, as with HTML, XHTML supports the various text-formatting tags, such as <b> for bold text, <i> for italic text, and <u> for underlined text. I'll take a look at them briefly because they're still very popular.
The <b> element gives you a simple inline way of bolding text. Although plenty of experts would prefer that you use style sheets to display text in bold, you can still use the <b> element. Like the <b> element, the <i> element offers some rudimentary text formatting: in this case, creating italic text. Both elements are supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are their attributes:
Attribute | Description |
---|---|
class | Gives the style class of the element. |
dir | Sets the direction of directionally neutral text. You can set this attribute to ltr, for left-to-right text, or rtl, for right-to-left text. |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
Here are the official XHTML events that these elements support: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.
Here's an example that displays text in both italic and bold (I'm using the line break, <br> element, which we'll see later, to separate the lines of text):
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Bold and Italic Text </title> </head> <body> <i>This text is italic.</i> <br /> <b>This text is bold.</b> <br /> <b><i>This text is both.</i></b> </body> </html>
The results of this XHTML appear in Figure 16.5, where you can see text that's bold, italic, and both bold and italic. The <b> and <i> tags are favorites among Web page authors because they're so easy to use.
The <u> element displays underlined text. This element was deprecated in HTML 4.0, so it is not supported in XHTML 1.0 Strict or XHTML 1.1. It is supported in XHTML 1.0 Transitional and XHTML 1.0 Frameset, however. Note, of course, that if your readers are very traditional, they might mistake underlined text for a hyperlink. Here are the attributes of this element:
Attribute | Description |
---|---|
class | Gives the style class of the element. |
dir | Sets the direction of directionally neutral text. You can set this attribute to ltr, for left-to-right text, or rtl, for right-to-left text. |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
Here are the official XHTML events this element supports: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.
The <u> element offers another easy formatting option, underlining its enclosed text. This element is deprecated in HTML 4.0, so you can't use it in strict XHTML 1.0 or XHTML 1.1. Here's an example putting <u> to work:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Using the &lt;u&gt; Element </title> </head> <body> You can <u>underline</u> text for a little more emphasis. </body> </html>
The results of this XHTML appear in Figure 16.6.
Using the <font> element, you can select text size, color, and face. The <font> element has always been very popular among HTML authors, but with the new emphasis on handling styles in style sheets, you can imagine that it was headed for extinction. And it has indeed been deprecated in HTML 4.0, so it's not available in XHTML 1.1 or XHTML 1.0 Strict. It's supported in XHTML 1.0 Transitional and XHTML 1.0 Frameset. Because it's so popular still, I'll cover it here briefly.
Here are this element's attributes:
Attribute | Description |
---|---|
class | Gives the style class of the element. |
color | Is deprecated. Sets the color of the text. |
dir | Sets the direction of directionally neutral text. You can set this attribute to ltr, for left-to-right text, or rtl, for right-to-left text. |
face | Is deprecated. You can set this attribute to a single font name or a list of names separated by commas. The browser will select the first font face from the list that it can find. |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
size | Is deprecated. Gives the size of the text. Possible values range from 1 through 7. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
This element does not support any XHTML events.
You can use the <font> element to set a font face, size, and color for text. Here's an example; in this case, I'm displaying text in a large red Arial font:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Using the &lt;font&gt; Element </title> </head> <body> <font size="6" color="#ff0000" face="Arial"> Putting the &lt;font&gt; element to work. </font> </body> </html>
The results of this XHTML appear in Figure 16.7.
You specify font sizes by using the values 1 through 7. These are relative sizes, not point sizes: size 3 is the usual default, and each step up or down makes the text larger or smaller, but the actual sizes vary by browser and system. Here's an example showing the range of possible sizes:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Using the &lt;font&gt; Element </title> </head> <body> <center> <h1> Using the &lt;font&gt; Element </h1> <font size="1">This is font size 1.</font> <br /> <font size="2">This is font size 2.</font> <br /> <font size="3">This is font size 3.</font> <br /> <font size="4">This is font size 4.</font> <br /> <font size="5">This is font size 5.</font> <br /> <font size="6">This is font size 6.</font> <br /> <font size="7">This is font size 7.</font> </center> </body> </html>
The results of this XHTML appear in Figure 16.8. As mentioned earlier, <font> has been deprecated in HTML 4.0 in favor of style sheets. So how should you replace the <font> element? See the section "Formatting Text Inline (<span>)," at the end of this chapter, for a good substitute.
Besides the simple text formatting elements, HTML also contains elements to arrange text in the display; XHTML supports those elements as well.
The <br> element is an empty element that inserts a line break into text. Because this element is empty, you use it like this in XHTML: <br />. This element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes of this element:
Attribute | Description |
---|---|
class | Gives the style class of the element. |
clear | Is used to move past aligned images or other elements. Set this to none (the default; just a normal break), left (breaks line and moves down until there is a clear left margin past the aligned element), right (breaks line and moves down until there is a clear right margin past the aligned element), or all (breaks line and moves down until both margins are clear of the aligned element). (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
This element does not support any XHTML events.
You use the <br> element to arrange the text in a document by adding a line break, making the browser skip to the next text line. Writing the element as <br /> does not cause any problems in the major browsers, and the fact that those browsers are capable of handling empty elements with the usual XML /> closing characters is one of the reasons that XHTML actually works as it should in HTML browsers. You can also insert line breaks as <br></br>, which is equally well-formed XML, but that form can confuse some HTML browsers.
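To an XML processor, the two XHTML forms of the line break are the same empty element, while the HTML-style <br> with no close is not well-formed at all. Here's a minimal Python sketch of that, assuming the standard xml.etree.ElementTree parser as the XML processor; the paragraph fragments are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Both XHTML spellings of the empty element produce the same tree.
minimized = ET.fromstring("<p>line one<br />line two</p>")
paired = ET.fromstring("<p>line one<br></br>line two</p>")
same_tree = ET.tostring(minimized) == ET.tostring(paired)

# The HTML-style unclosed tag is rejected by the XML parser.
try:
    ET.fromstring("<p>line one<br>line two</p>")
    html_style_ok = True
except ET.ParseError:
    html_style_ok = False

print(same_tree, html_style_ok)
```

The difference between <br /> and <br></br> matters only to browsers that read the page as HTML, which is why the minimized form with a space before the slash is the safer choice.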
Letting the Browser Handle the Formatting
Ideally, you should let the browser handle text formatting as much as possible. The browser is supposed to flow the text to fit the display area as well as it can. This means that if you add a lot of line breaks, you may interfere with the best possible display (unless you're adding line breaks specifically to separate discrete elements, such as images). It's usually best to format your text into paragraphs that the browser can handle as appropriate, rather than expressly adding line breaks to text yourself.
The <p> element enables you to break text up into paragraphs. Paragraphs are block elements that you can format as you like in style sheets or with style attributes, including indenting the first line and so forth. If you're coming to XHTML from HTML, one thing to recall is that every <p> tag needs a corresponding </p> tag, which is easy to forget because HTML doesn't require that. In addition, note that paragraphs are block elements, which in XHTML means that you cannot display other block elements, such as headings, in them. The <p> element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are this element's attributes:
Attribute | Description |
---|---|
align | Is deprecated in HTML 4. Sets the alignment of the text. Possible values include left (the default), right, center, and justify. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
class | Gives the style class of the element. |
dir | Sets the direction of directionally neutral text. You can set this attribute to ltr, for left-to-right text, or rtl, for right-to-left text. |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
This element supports these XHTML events: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.
You use the <p> element to organize your text. The browser adds a little vertical space above paragraphs to separate them from other elements, and it formats the text in a paragraph to fit the current page width.
Here's an example; in this case, I'm using <br> elements to introduce line breaks, and a <p> element to create a new paragraph:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Using the &lt;br&gt; and &lt;p&gt; Elements </title> </head> <body> <center> <h1> Using the &lt;br&gt; and &lt;p&gt; Elements </h1> </center> This is a line of text. <br /> Using a line break skips to the next line. <p style="font-weight: bold"> This is a line of bold text in a paragraph. <br /> Here's a new line of text in the same paragraph. </p> </body> </html>
The results of this code appear in Figure 16.9. As you can see, inserting a <br> element makes the browser move to the next line of text.
This example points out the difference between <br> and <p>. The <br> element is empty and just makes the flow of text skip to the next line. The <p> element, on the other hand, is a block element that encloses content. You can apply styles to the content in a <p> element, and those styles are applied to all text in the paragraph, even if it's broken up with line breaks, as you see in Figure 16.9, where the bold style applies to both lines in the paragraph.
Another handy element to arrange text is the <hr> horizontal rule element. This element just causes the browser to draw a horizontal line to separate or group elements vertically. It's supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes of this element; note that it includes a few attributes that have been deprecated:
Attribute | Description |
---|---|
align | Is deprecated. Sets the alignment of the rule; set this to left, center (the default), or right. To set this attribute, you must also set the width attribute. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
class | Gives the style class of the element. |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
noshade | Is deprecated. Displays the rule with a two-dimensional, not three-dimensional (the default), appearance. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
size | Is deprecated. Sets the vertical size of the horizontal rule in pixels. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
width | Is deprecated. Sets the horizontal width of the rule. You can set this attribute to a pixel measurement or a percentage of the display area. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
These are the XHTML events supported by this element: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.
It's easy to break your text up with horizontal rules, using the <hr> element. This can be very useful in longer documents, and it serves to organize your document visually into sections. This element is empty and just instructs the browser to insert a horizontal rule.
As with many style attributes in HTML 4.0, the <hr> element's align, width, noshade, and size attributes are all deprecated. However, they're still in the XHTML 1.0 Transitional and Frameset DTDs. Here's an example that displays a few horizontal rules of varying width and alignment:
<?xml version="1.0"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title> Using the &lt;hr&gt; Element </title> </head> <body> <center> <h1> Using the &lt;hr&gt; Element </h1> </center> This is &lt;hr /&gt;: <hr /> <br /> This is &lt;hr align="left" width="60%" /&gt;: <hr align="left" width="60%" /> <br /> This is &lt;hr align="center" width="60%" /&gt;: <hr align="center" width="60%" /> <br /> This is &lt;hr align="right" width="60%" /&gt;: <hr align="right" width="60%" /> <br /> </body> </html>
You can see the results of this XHTML in Figure 16.10, which shows a number of ways to configure horizontal rules. Note that the align attribute has a visible effect only when you also set the width attribute to less than the full width of the display area; a full-width rule has no room to move.
The <center> element does just what its name implies: It centers text and elements in the browser's display area. The W3C deprecated <center> in HTML 4, so you won't find it in the XHTML 1.0 Strict or XHTML 1.1 DTDs; it's still available in XHTML 1.0 Transitional and XHTML 1.0 Frameset. Nonetheless, <center> remains a favorite element and will be in use for a long time to come.
Here are the attributes of this element:
Attribute | Description |
---|---|
class | Gives the style class of the element. |
dir | Sets the direction of directionally neutral text. You can set this attribute to ltr, for left-to-right text, or rtl, for right-to-left text. |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
This element supports the following XHTML events: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.
Here's an example of <center> at work centering multiline text:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;center&gt; Element
        </title>
    </head>
    <body>
        <center>
            <h1>
                Using the &lt;center&gt; Element
            </h1>
        </center>
        <center>
            The &lt;center&gt; element is a
            <br />
            useful one for centering
            <br />
            text made up of
            <br />
            multiple lines.
        </center>
    </body>
</html>
You can see the results of this XHTML in Figure 16.11.
The <center> element is still in widespread use, which is why I'm taking a look at it here; however, it has been deprecated, which means that it will disappear from XHTML one day. So, what are you supposed to use instead? Take a look at the next topic, the <div> element.
You can use the <div> element to select or enclose a block of text, usually so that you can apply styles to it. This element is supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are its attributes:
Attribute | Description |
---|---|
align | Is deprecated. Sets the horizontal alignment of the element. Set this to left (the default), right, center, or justify. (XHTML 1.0 Transitional, XHTML 1.0 Frameset.) |
class | Gives the style class of the element. |
dir | Sets the direction of directionally neutral text. You can set this attribute to ltr, for left-to-right text, or rtl, for right-to-left text. (XHTML attribute values like these are case-sensitive, so use lowercase.) |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
This element supports these XHTML events: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.
The <div> element enables you to refer to an entire section of your document by name, which means that you can replace its text from JavaScript code. We did that in Chapter 7, "Handling XML Documents with JavaScript," where we read in XML documents and displayed the results using the <div> element's innerHTML property in Internet Explorer, as in this HTML document:
<HTML>
    <HEAD>
        <TITLE>
            Reading XML element values
        </TITLE>
        <SCRIPT LANGUAGE="JavaScript">
            function readXMLDocument()
            {
                var xmldoc, meetingsNode, meetingNode, peopleNode, personNode;
                var first_nameNode, last_nameNode, outputText;

                // Load the document with Internet Explorer's XML parser.
                // Loading synchronously ensures the data is there before we read it.
                xmldoc = new ActiveXObject("Microsoft.XMLDOM");
                xmldoc.async = false;
                xmldoc.load("meetings.xml");

                // Walk down the tree to the last person in the first meeting.
                meetingsNode = xmldoc.documentElement;
                meetingNode = meetingsNode.firstChild;
                peopleNode = meetingNode.lastChild;
                personNode = peopleNode.lastChild;
                first_nameNode = personNode.firstChild;
                last_nameNode = first_nameNode.nextSibling;

                outputText = "Third name: " +
                    first_nameNode.firstChild.nodeValue + ' ' +
                    last_nameNode.firstChild.nodeValue;

                // Internet Explorer exposes the <DIV> by its ID.
                messageDIV.innerHTML = outputText;
            }
        </SCRIPT>
    </HEAD>
    <BODY>
        <CENTER>
            <H1>
                Reading XML element values
            </H1>
            <INPUT TYPE="BUTTON" VALUE="Get the name of the third person"
                ONCLICK="readXMLDocument()">
            <P>
            <DIV ID="messageDIV"></DIV>
        </CENTER>
    </BODY>
</HTML>
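The listing above reaches the <div> through Internet Explorer's habit of exposing element IDs as global variables. A more portable sketch, assuming a browser that supports the W3C DOM's getElementById method, looks the element up explicitly. The showMessage function and its message text are made up for illustration, and the XML loading is left out to keep the focus on the <div>:

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Replacing text in a &lt;div&gt; element
        </title>
        <script type="text/javascript">
        //<![CDATA[
        // Hypothetical helper: writes a fixed message into the <div>.
        function showMessage()
        {
            var target = document.getElementById("messageDIV");
            target.innerHTML = "Hello from JavaScript";
        }
        //]]>
        </script>
    </head>
    <body>
        <h1>
            Replacing text in a &lt;div&gt; element
        </h1>
        <p>
            <input type="button" value="Show the message"
                onclick="showMessage()" />
        </p>
        <div id="messageDIV"></div>
    </body>
</html>
```

The commented-out CDATA markers keep the script well-formed as XML while still letting an HTML browser run it as ordinary JavaScript.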
Here's an XHTML example; in this case, I'm enclosing some text in a <div> element and styling the text in bold red italics with an XHTML <style> element. (More on the <style> element comes in the next chapter.)
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;div&gt; tag
        </title>
        <style type="text/css">
            div {color: red; font-weight: bold; font-style: italic}
        </style>
    </head>
    <body>
        <center>
            <h1>
                Using the &lt;div&gt; Element
            </h1>
        </center>
        <div>
            This text, which
            <br />
            takes up multiple lines,
            <br />
            was formatted all at once
            <br />
            in a single &lt;div&gt; element.
        </div>
    </body>
</html>
You can see the results of this XHTML in Figure 16.12 where, as you see, all the lines in the <div> element were styled in the same way.
The W3C suggests that you use the <div> element's align attribute to replace the now deprecated <center> element by setting align to "center". That would look like this, where I'm modifying the example from the previous section:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;div&gt; Element
        </title>
    </head>
    <body>
        <div align="center">
            <h1>
                Using the &lt;div&gt; Element
            </h1>
        </div>
        <div align="center">
            The &lt;div&gt; element is a
            <br />
            useful one for centering
            <br />
            text made up of
            <br />
            multiple lines.
        </div>
    </body>
</html>
In fact, although W3C documentation suggests that you use the align attribute, the W3C seems to have forgotten that it deprecated that attribute in HTML 4.0. The preferred way to center text now is to set a <div> element's text-align style property to center. That might look like this:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;div&gt; Element
        </title>
        <style type="text/css">
            div {text-align: center}
        </style>
    </head>
    <body>
        <div>
            <h1>
                Using the &lt;div&gt; Element
            </h1>
        </div>
        <div>
            The &lt;div&gt; element is a
            <br />
            useful one for centering
            <br />
            text made up of
            <br />
            multiple lines.
        </div>
    </body>
</html>
This works as planned: the text is indeed centered in the browser.
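One caveat: a rule written as div {text-align: center} centers every <div> element in the page. If only some blocks should be centered, a class selector narrows the rule. Here's a minimal sketch of that idea; the class name centered is an arbitrary choice of mine:

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Centering selected &lt;div&gt; elements
        </title>
        <style type="text/css">
            /* Applies only to <div class="centered">, not to every <div>. */
            div.centered {text-align: center}
        </style>
    </head>
    <body>
        <div class="centered">
            <h1>
                Centering selected &lt;div&gt; elements
            </h1>
        </div>
        <div class="centered">
            This block is centered.
        </div>
        <div>
            This block keeps the default left alignment.
        </div>
    </body>
</html>
```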
Using the positioning style properties, you can also position text with the <div> tag, even overlapping displayed text blocks. There's another handy element that you can use to select text and apply styles: <span>.
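To make the positioning remark concrete before moving on to <span>, here's a minimal sketch that overlaps two text blocks using the CSS position, top, and left properties. The IDs and the pixel offsets are invented for this example, and it assumes a browser that supports CSS positioning:

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Positioning &lt;div&gt; elements
        </title>
        <style type="text/css">
            /* Hypothetical IDs; offsets are measured from the page corner. */
            div#back  {position: absolute; top: 80px; left: 40px;
                       color: silver; font-size: 36pt}
            div#front {position: absolute; top: 100px; left: 60px;
                       color: red; font-size: 18pt}
        </style>
    </head>
    <body>
        <h1>
            Positioning &lt;div&gt; elements
        </h1>
        <div id="back">
            Background text
        </div>
        <div id="front">
            This text overlaps the block behind it.
        </div>
    </body>
</html>
```

The second block comes later in the document, so it is drawn on top of the first where the two overlap.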
The <span> element lets you select inline text to apply styles. It's supported in XHTML 1.0 Strict, XHTML 1.0 Transitional, XHTML 1.0 Frameset, and XHTML 1.1. Here are the attributes of this element:
Attribute | Description |
---|---|
class | Gives the style class of the element. |
dir | Sets the direction of directionally neutral text. You can set this attribute to ltr, for left-to-right text, or rtl, for right-to-left text. (XHTML attribute values like these are case-sensitive, so use lowercase.) |
id | Use the ID to refer to the element; set this attribute to a unique identifier. |
lang | Specifies the base language used in the element. Applies only when the document is interpreted as HTML. |
style | Set this to an inline style to specify how the browser should display the element. |
title | Contains the title of the element (which might be displayed in ToolTips). |
xml:lang | Specifies the base language for the element when the document is interpreted as an XML document. |
This element supports these XHTML events: onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup.
You usually use <span> to apply styles inline, for example, in the middle of a sentence, to a few words or even characters. When styling blocks of text, you can use <div>; for individual characters, words, or sentences, use <span>.
As we saw, you can use <div> to replace the deprecated <center> element; there's also a deprecated element that you can replace with <span>: the <font> element. Using <span>, you can apply styles inline to a few characters or words, which is what Web authors previously used <font> for. For example, here I'm applying a style to a section of text using <span>, displaying that text in bold red italic:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using the &lt;span&gt; Element
        </title>
        <style type="text/css">
            span {color: red; font-weight: bold; font-style: italic}
        </style>
    </head>
    <body>
        <center>
            <h1>
                Using the &lt;span&gt; Element
            </h1>
        </center>
        <h2>
            Sometimes, for <span>emphasis</span>, you might want
            to target <span>specific words</span> in your text.
        </h2>
    </body>
</html>
You can see the results of this XHTML in Figure 16.13, where the words we want styled in a specific way are indeed styled as we want them.
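The style rule in the listing above applies to every <span> element in the page. When different runs of text need different treatment, much as separate <font> elements once provided, classes keep them apart. Here's a small sketch, with class names (warning and aside) chosen only for illustration:

```html
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Using classes with the &lt;span&gt; Element
        </title>
        <style type="text/css">
            /* Hypothetical class names, one rule per kind of inline text. */
            span.warning {color: red; font-weight: bold}
            span.aside   {color: gray; font-style: italic}
        </style>
    </head>
    <body>
        <h1>
            Using classes with the &lt;span&gt; Element
        </h1>
        <p>
            Back up your files <span class="warning">before</span> you
            upgrade <span class="aside">(it only takes a minute)</span>.
        </p>
    </body>
</html>
```

A <span> element with no class attribute is left unstyled here, which is usually what you want when only a few words need attention.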
There's more XHTML to come; take a look at the next chapter.