5.4 XHTML


The Web has reached a point where many pages contain "bad" HTML (see Example 5.13). The eXtensible HTML (XHTML) markup language was designed to combine the best of HTML and XML: It supports all of the elements of HTML 4.0, combined with the well-formed syntax of XML. Essentially, XHTML is a reformulation of HTML 4.0 in XML. The HTML code in Example 5.15 will work fine with some browsers, even though it does not follow the HTML rules. The correct equivalent code is presented in Example 5.16.

Example 5.13 Bad HTML with missing closing </p> tags.
  <p>This is a paragraph
  <p>This is another paragraph
Example 5.14 The revision in XHTML, adding closing </p> tags.
  <p>This is a paragraph</p>
  <p>This is another paragraph</p>
Example 5.15 A bad HTML document that is missing closing tags.
  <html>
  <head>
  <title>This is bad HTML</title>
  <body>
  <h1>Bad HTML
Example 5.16 The revision in good XHTML.
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <html>
  <head><title>This is good HTML</title></head>
  <body>
  <h1>Good HTML</h1>
  </body>
  </html>
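
The difference between the bad and the revised markup can be checked mechanically, because an XML parser rejects anything that is not well-formed. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module; the helper name is_well_formed is an assumption made for illustration, and the paragraphs are wrapped in a body element only because XML requires a single root element.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Example 5.13, wrapped in a root element: the <p> elements are never closed.
bad = "<body><p>This is a paragraph<p>This is another paragraph</body>"

# Example 5.14, wrapped in a root element: every <p> has a matching </p>.
good = "<body><p>This is a paragraph</p><p>This is another paragraph</p></body>"

print(is_well_formed(bad))
print(is_well_formed(good))
```

Note that this only tests well-formedness; it does not validate the document against an XHTML DTD.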

This section contains a summary of the XHTML DTD as specified and published by the W3C [WWW]. Readers are encouraged to review the detailed specification available from www.W3C.org. Some of the most noteworthy differences between HTML and XHTML include the following:

  • XHTML uses a new document type (i.e., new DTD).

  • XHTML is case sensitive; tags and attribute names are required to be lower case.

  • In XHTML, the values of all attributes must be specified and must be quoted.

  • Unclosed empty elements are not allowed in XHTML; an empty element written in HTML as <xyz> must be written as <xyz/> (or as <xyz></xyz>).
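
The syntactic differences listed above can be illustrated with a small checker. This is a sketch only, not a validator: the function name xhtml_syntax_problems is hypothetical, it relies on the XML parser to reject unquoted or minimized attributes and unclosed empty elements, and it adds a lowercase check of its own for element and attribute names.

```python
import xml.etree.ElementTree as ET

def xhtml_syntax_problems(markup):
    """Return a list of problems found (an empty list means none were found)."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError as err:
        # Unquoted values, minimized attributes, and unclosed empty
        # elements are all well-formedness errors caught by the parser.
        return ["not well-formed: %s" % err]
    problems = []
    for element in root.iter():
        if element.tag != element.tag.lower():
            problems.append("element name not lowercase: " + element.tag)
        for name in element.attrib:
            if name != name.lower():
                problems.append("attribute name not lowercase: " + name)
    return problems

html_style = '<p><input type=checkbox checked><br></p>'
xhtml_style = '<p><input type="checkbox" checked="checked"/><br/></p>'
upper_case = '<P ALIGN="left">text</P>'

for sample in (html_style, xhtml_style, upper_case):
    print(xhtml_syntax_problems(sample))
```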

Beyond these differences, because XHTML is based on XML, it is relatively easy to introduce new elements or additional element attributes. The XHTML family is designed to accommodate such extensions through XHTML modules and techniques for developing new XHTML-conforming modules. As an example, the XHTML namespace can be used together with other XML namespaces to specify modules with a completely different set of tags. For iTV applications in particular, not all of the XHTML elements will be required on all receivers. Through modularization and namespaces, the XHTML framework enables combinations of existing and new feature sets to coexist when developing content and when designing new iTV browsers.
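
As an illustration of how namespaces keep an extension's elements apart from the XHTML elements, the sketch below parses a document that mixes the two. The XHTML namespace URI is the one defined by the W3C; the second namespace, its itv prefix, and the channel element are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
ITV_NS = "http://example.org/itv"  # hypothetical namespace, for illustration only

document = """
<html xmlns="%s" xmlns:itv="%s">
  <head><title>Mixed namespaces</title></head>
  <body>
    <p>Standard XHTML paragraph</p>
    <itv:channel number="7">Hypothetical iTV extension element</itv:channel>
  </body>
</html>
""" % (XHTML_NS, ITV_NS)

def local_names(root, namespace):
    """Return the local names of all elements that belong to the namespace."""
    prefix = "{" + namespace + "}"
    return [el.tag[len(prefix):] for el in root.iter() if el.tag.startswith(prefix)]

root = ET.fromstring(document.strip())
xhtml_elements = local_names(root, XHTML_NS)
itv_elements = local_names(root, ITV_NS)

print(xhtml_elements)
print(itv_elements)
```

A receiver that implements only a subset of modules could use the same kind of test to decide which elements it is able to process.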

SGML gives the author of a DTD the ability to exclude specific elements from being contained within an element. Such prohibitions (called exclusions) cannot be expressed in an XML DTD. For example, the HTML 4 Strict DTD forbids the nesting of an a element within another a element to any descendant depth; an XML DTD cannot state this. In XHTML, such nesting remains prohibited even though the restriction cannot be defined in the DTD.

There are three key XHTML DTDs:

  • Strict XHTML: Markup free of presentational clutter, intended to be used together with CSS.

     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

  • Transitional XHTML: Use this when targeting a browser that does not yet support full XHTML, or does not yet support CSS, but does support all the presentational features of HTML.

     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

  • Frameset XHTML: Needed for pages that partition the display area using frames.

     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
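
A tool that processes XHTML pages may need to know which of the three DTDs a document declares. The sketch below looks up the public identifier from the DOCTYPE declaration; the names XHTML_DTDS and xhtml_variant are assumptions made for this example, and the regular expression handles only the simple declaration form shown above.

```python
import re

# Public identifiers of the three XHTML 1.0 DTDs.
XHTML_DTDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "frameset",
}

DOCTYPE_PATTERN = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"')

def xhtml_variant(document):
    """Return 'strict', 'transitional', 'frameset', or None if not recognized."""
    match = DOCTYPE_PATTERN.search(document)
    if match is None:
        return None
    return XHTML_DTDS.get(match.group(1))

page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html><head><title>This is good HTML</title></head>'
    '<body><h1>Good HTML</h1></body></html>'
)
print(xhtml_variant(page))
```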

Following XML, the XHTML standard defines the id attribute for the elements <a/>, <applet/>, <form/>, <frame/>, <iframe/>, <img/>, and <map/>. In addition, the name attribute, although formally deprecated, is still supported by XHTML 1.0 for backward compatibility with HTML.

The id attributes are fragment identifiers of type ID, which must be unique within the scope of a single document. To ensure that XHTML documents are well-structured XML documents, the id attribute must be used to define fragment identifiers even on elements that historically have also had a name attribute. See the HTML Compatibility Guidelines for additional information.
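
The uniqueness requirement on id values can be checked with a few lines of code. The following sketch, with the hypothetical helper duplicate_ids, reports every id value that occurs more than once in a document; a validating XML parser would report the same condition as a validity error.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def duplicate_ids(markup):
    """Return the sorted list of id values used by more than one element."""
    root = ET.fromstring(markup)
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, count in counts.items() if count > 1)

page = (
    '<body>'
    '<a id="top" name="top">Top of page</a>'
    '<p id="intro">Introduction</p>'
    '<img id="top" src="logo.png" alt="Logo"/>'
    '</body>'
)
print(duplicate_ids(page))
```

In this sample the a element carries both id and name for backward compatibility, while the img element reuses the same id value, which is the error being detected.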

5.4.1 Character Set

Character sets are a critical aspect of a markup language: One needs to make sure that every character used in the content is supported by the font selected for rendering it. Such validation could be performed automatically, as all the information needed should be available within the selected font. XHTML defines its named character entities using entity sets derived from ISO 8879, including the Latin 1 set. The XHTML DTD specifies the mapping between these ISO 8879 entity names and ISO 10646 (Unicode) code positions.
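
The mapping from entity names to ISO 10646 code positions can be explored with Python's standard html.entities module. This is a sketch under the assumption that Python's table of HTML 4 entity names coincides with the entity sets discussed here; the helper describe_entity is hypothetical.

```python
from html.entities import name2codepoint

def describe_entity(name):
    """Format an entity name together with its Unicode code point."""
    code_point = name2codepoint[name]
    return "&%s; -> U+%04X (%s)" % (name, code_point, chr(code_point))

# A few names from the Latin 1 and special-character entity sets.
for name in ("nbsp", "copy", "eacute", "euro"):
    print(describe_entity(name))
```

A renderer could compare such code points against the glyph coverage of the selected font to detect characters that cannot be displayed.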

5.4.2 Elements

DTD authors need to define the content model for their DTD. XHTML provides a variety of tools, including a set of support modules, instantiated by a main Framework module specified by the XHTML DTD. That framework module is essentially a reformulation of HTML as a modular XML application. It requires that the DTD author specify notations, data types, namespace, qualified names, events, common attributes, document model, and character entities.

The XHTML DTD follows the same principles described earlier. For example, the specification of the anchor element <a/> is given in Figure 5.11: Its qualified name is defined by a.qname and its content model by a.content, and it has the optional (i.e., #IMPLIED) attributes href, charset, type, hreflang, rel, rev, accesskey, and tabindex, as well as additional attributes common to many other elements.

A very important aspect of XHTML is the object element, which is used to embed external objects in XHTML pages. Although its functionality overlaps that of the applet element, it is more general and is typically used to embed iTV (JavaTV) Xlets. The classid attribute typically specifies a reference to a JavaTV Xlet class, and the data attribute specifies the input data for that Xlet. In the context of iTV, the codebase attribute is not often used, because the code is not always available locally at the receiver or otherwise accessible through a return channel (i.e., remote interactivity channel).

An important use of the object element is to invoke a content-rendering plug-in. The data attribute points to the image or data file to be rendered, and the classid attribute points to the plug-in to be used for rendering that data (see the ATSC DASE specification). This concept could be further extended to render future versions of XHTML, or regional variants such as the Broadcast Markup Language (BML) used in Japan: The data attribute points to the BML file and the classid attribute points to the BML renderer or browser.
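
To make the roles of the two attributes concrete, the sketch below generates an object element whose classid names a renderer and whose data names the content to be rendered. The classid value, the file name, and the helper build_object are all hypothetical; real receivers define their own conventions for referring to Xlet classes and plug-ins.

```python
import xml.etree.ElementTree as ET

def build_object(classid, data, fallback_text):
    """Serialize an object element with classid, data, and fallback content."""
    obj = ET.Element("object", {"classid": classid, "data": data})
    # Text inside the element is shown when the object cannot be rendered.
    obj.text = fallback_text
    return ET.tostring(obj, encoding="unicode")

markup = build_object(
    "java:com.example.itv.BmlRendererXlet",  # hypothetical renderer class
    "content/program-guide.bml",             # hypothetical data file
    "This receiver cannot render the embedded content.",
)
print(markup)
```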

Figure 5.11 XHTML DTD definition of the Anchor element.
 <!ENTITY % a.element  "INCLUDE" >
 <![%a.element;[
 <!ENTITY % a.content
      "( #PCDATA | %InlNoAnchor.mix; )*" >
 <!ENTITY % a.qname  "a" >
 <!ELEMENT %a.qname;  %a.content; >
 <!-- end of a.element -->]]>
 <!ENTITY % a.attlist  "INCLUDE" >
 <![%a.attlist;[
 <!ATTLIST %a.qname;
       %Common.attrib;
       href         %URI.datatype;           #IMPLIED
       charset      %Charset.datatype;       #IMPLIED
       type         %ContentType.datatype;   #IMPLIED
       hreflang     %LanguageCode.datatype;  #IMPLIED
       rel          %LinkTypes.datatype;     #IMPLIED
       rev          %LinkTypes.datatype;     #IMPLIED
       accesskey    %Character.datatype;     #IMPLIED
       tabindex     %Number.datatype;        #IMPLIED>
 <!-- end of a.attlist -->]]>

As mentioned earlier, specifications of standards and their implementations may select a subset of the XHTML elements, because each element (or group of elements) is regarded as a module. The subset required by an XHTML standard is specified by its framework module, which includes a list of required modules. The list of predefined XHTML elements is derived from the HTML 4.0 elements (see Table 5.3).

Table 5.3. Elements Supported by XHTML

Element            Description
<a/>               An anchor
<abbr/>            An abbreviation
<acronym/>         An acronym
<address/>         An address
<applet/>          An applet (transitional support only)
<area/>            An area inside an image map
<b/>               All text within this element is bold
<base/>            Base URL for all relative URLs specified within the document
<basefont/>        Defines a base font (transitional)
<bdo/>             Overrides the bidirectional direction of all text within this element
<big/>             All text within this element is big
<blockquote/>      A long quotation
<body/>            The body element
<br/>              Line break
<button/>          A push button
<caption/>         A table caption
<center/>          Centered text (transitional)
<cite/>            Defines citation
<code/>            Code text to be distinguished from free text
<col/>             Table column
<colgroup/>        Defines group of table columns
<dd/>              Definition description
<del/>             Deleted text
<dir/>             Directory list (transitional)
<div/>             A block-level section of the document
<dl/>              Definition list
<dt/>              Definition term
<em/>              Emphasized text
<fieldset/>        A fieldset
<font/>            Font definition (transitional)
<form/>            An input form
<frame/>           A subwindow (a frame)
<frameset/>        A set of frames
<h1/> ... <h6/>    Headings, levels 1 to 6
<head/>            The header that specifies information about the document
<hr/>              Horizontal rule
<html/>            The root element of an HTML document
<i/>               Italic text
<iframe/>          Defines an in-line sub-window (frame)
<img/>             An image
<input/>           An input field
<ins/>             Inserted text
<kbd/>             Keyboard text
<label/>           A label
<legend/>          A title in a fieldset
<li/>              List item
<link/>            A reference to a resource
<map/>             An image map
<menu/>            A menu list (transitional)
<meta/>            Meta-data information
<noframes/>        A noframes section
<noscript/>        A noscript section
<object/>          An embedded object such as an Applet or image
<ol/>              Defines an ordered list
<option/>          An option within a drop down list
<optgroup/>        Option group
<p/>               Paragraph
<param/>           A parameter object
<pre/>             Preformatted text
<q/>               A short quotation
<s/>               Strikethrough text (transitional)
<samp/>            Sample text
<script/>          This element contains a script
<select/>          Defines a selection drop list
<small/>           Small text
<span/>            An inline section in a document
<strike/>          Strikethrough text (transitional)
<strong/>          Strong text
<style/>           Style definition
<sub/>             Defines subscripted text
<table/>           A table
<tbody/>           Table body
<td/>              Table cell
<textarea/>        A text area
<tfoot/>           Table footer
<th/>              Table header cell
<thead/>           Table header group
<title/>           Document title
<tr/>              Table row
<tt/>              Teletype text
<u/>               Underlined text
<ul/>              Unordered list
<var/>             A variable

Table 5.4. Standard XHTML Element Attributes

Attribute     Description                                                     Not valid for (or as noted)
class         The class of the element                                        base, head, html, meta, param, script, style, title
id            A unique identifier for this element                            base, head, html, meta, param, script, style, title
title         A text to display in a tool tip                                 base, head, html, meta, param, script, style, title
dir           Sets the direction of the text                                  base, br, frame, frameset, hr, iframe, param, script
lang          Sets the language code                                          base, br, frame, frameset, hr, iframe, param, script
accesskey     A keyboard shortcut to an element
tabindex      Sets the tab order of an element
onload        Script invoked when the document loads                          Only valid for <body> and <frameset> elements
onunload      Script invoked when the document unloads                        Only valid for <body> and <frameset> elements
onchange      Invoked when element changes                                    Only valid for elements within forms
onsubmit      Invoked when form is submitted                                  Only valid for a form element
onselect      Invoked when element is selected                                Only valid for certain form elements
onreset       Invoked when form is reset                                      Only valid for a form element
onfocus       When element receives focus                                     Only valid for elements within forms
onblur        When element loses focus                                        Only valid for elements within forms
onkeydown     When key pressed over element                                   base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
onkeypress    When key pressed and released over element                      base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
onkeyup       When key released over element                                  base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
onclick       When a click occurs over element                                base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
ondblclick    When an element is selected via double-click equivalent         base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
onmousedown   When mouse is pressed over the rendering area of an element     base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
onmousemove   When mouse is moved within the rendering area of the element    base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
onmouseover   When mouse is moved over the rendering area of the element      base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
onmouseout    When mouse leaves the rendering area of the element             base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title
onmouseup     When mouse is released over the rendering area of the element   base, bdo, br, frame, frameset, head, html, iframe, meta, param, script, style, title



ITV Handbook. Technologies and Standards
ISBN: 0131003127
Year: 2003
Pages: 170