5.2 HTML

Tutorials on using HTML to author content such as Web pages are out of scope for this section; they can be found in countless books and Internet sites [HTML]. Instead, the focus is on the nuts and bolts of the specification, the DTD, and some of the associated issues. This section takes a historical perspective, presenting the HTML DTD as it was specified in the SGML days. The current state of affairs is described in the XHTML section.

HTML has been in use by the World Wide Web global information initiative since 1990 [HTML]. HTML 2.0 represents the industry consensus as of 1994 and is specified by RFC 1866, which was later made obsolete by RFC 2854. HTML 3.2 represented the industry consensus in 1996, and HTML 4.01 was finalized in 1999.

HTML has an SGML definition [SGML] comprising the four parts described above. Its DTD, often referred to as the HTML DTD, though cryptic and dissuasive at first, gives concise information about an element and its attributes. To render the DTD readable, comments can be used, which may spread over one or more lines. In the DTD, comments are delimited by a pair of "--" marks.

The SGML definition of HTML also specifies that some HTML elements are not required to have end tags [HTML]. The definition of each element in the reference manual indicates whether it requires an end tag. Some HTML elements have no content. For example, the line break element BR has no content; its only role is to terminate a line of text. Such "empty" elements never have end tags. The definition of each element in the reference manual indicates whether it is empty (has no content) or, if it can have content, what is considered legal content. For example, the empty HR (horizontal rule) element is declared as follows:

 <!ELEMENT HR    - O EMPTY>
 <!-- <HR>       Horizontal rule -->

Here, the comment Horizontal rule explains the use of the HR element.

5.2.1 Element Definitions

The bulk of the HTML DTD consists of the definitions of elements and their attributes [HTML]. The <!ELEMENT keyword begins an element definition and the > character ends it. Between these are specified:

The element's name.
Which element tags are optional. Two hyphens that appear after the element name mean that the start and end tags are mandatory. One hyphen followed by the letter O (not zero) indicates that the end tag can be omitted. A pair of Os indicates that both the start and end tags can be omitted. The remaining combination, in which only the start tag can be omitted, is not allowed in HTML.
The element's content, if any. The allowed content for an element is called its content model. Elements with no content are called empty elements. Empty elements are defined with the keyword EMPTY.

For example:

 <!ELEMENT DL    - -  (DT|DD)+>
 <!ELEMENT (OL|UL) - -  (LI)+>

The first statement defines a single element, DL. The second statement defines two elements, OL and UL. The two hyphens indicate that both the start tag and the end tag for these elements are required. The content model, specified by the ()+ notation, is defined to be "at least one element"; (LI)+ specifies at least one LI element, and (DT|DD)+ specifies at least one DT or DD element.

The next example illustrates the definition of an empty element:

 <!ELEMENT IMG - O EMPTY>

The element being defined is IMG . The hyphen and the following "O" indicate that the end tag can be omitted, but together with the content model EMPTY , this is strengthened to the rule that the end tag must be omitted. The EMPTY keyword means the element must not have content.
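
The three parts of an element definition can be pulled apart mechanically. The following Python sketch is only an illustration and not part of any HTML tool: the function name parse_element is hypothetical, and the sketch assumes a declaration without embedded "--" comments and with parameter entities left unexpanded.

```python
import re

# Matches declarations such as: <!ELEMENT (OL|UL) - - (LI)+>
ELEMENT_RE = re.compile(
    r"<!ELEMENT\s+(?P<names>\([^)]*\)|\S+)\s+"
    r"(?P<start>[-O])\s+(?P<end>[-O])\s+"
    r"(?P<model>.*?)\s*>",
    re.DOTALL,
)

def parse_element(declaration):
    """Return (names, start_tag_optional, end_tag_optional, content_model)."""
    match = ELEMENT_RE.match(declaration.strip())
    if match is None:
        raise ValueError("not an ELEMENT declaration: %r" % declaration)
    # A parenthesized name group declares several elements at once.
    names = [n.strip() for n in match.group("names").strip("()").split("|")]
    return (
        names,
        match.group("start") == "O",   # "O" means the start tag may be omitted
        match.group("end") == "O",     # "O" means the end tag may be omitted
        match.group("model"),
    )
```

For the IMG declaration above, the sketch reports one element name, a mandatory start tag, an omissible end tag, and the content model EMPTY.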

5.2.1.1 Content Model Definitions

The content model describes what can be contained by an element. Content definitions may include:

The names of allowed or forbidden elements (e.g., the UL element includes instances of the LI element).
DTD entities (e.g., the LABEL element includes instances of the %inline entity).
Document text (indicated by the SGML construct #PCDATA). Text may contain numeric and named character entities. Recall that these begin with & and end with a semicolon.

The content model uses the following syntax to define what markup is allowed for the content of the element:
- (...) specifies a group.
- A|B specifies that either A or B may occur; when the group is repeated, as in (A|B)*, the two may therefore appear in any order.
- A,B specifies that A must occur before B.
- A&B specifies that A and B must both occur once, but may do so in any order.
- A? specifies that A can occur zero or one times.
- A* specifies that A can occur zero or more times.
- A+ specifies that A can occur one or more times.
- -(A) specifies that A is not allowed to occur (an exclusion).
- +(A) specifies that A may additionally occur (an inclusion).

Here are some examples from the HTML DTD:

<!ELEMENT SELECT - - (OPTION+)> specifies that the SELECT element must contain one or more OPTION elements.
<!ELEMENT DL - - (DT|DD)+> specifies that the DL element must contain one or more DT or DD elements in any order.
<!ELEMENT A - - (%heading|%text)* -(A)> specifies that the A (anchor) element may contain any number of the elements named by the %heading and %text entities (see next subsection), and that an anchor cannot be nested inside another anchor.
<!ELEMENT FORM - - %body.content -(FORM) +(INPUT|SELECT|TEXTAREA)> forbids nested forms and, in addition to the content allowed by %body.content, permits three control elements anywhere inside the form: INPUT, SELECT, and TEXTAREA; there are no order constraints and each of these may repeat a number of times.
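
Because the group, choice, sequence, and occurrence operators behave much like regular-expression operators, a simple content model can be checked with a regular expression. The Python sketch below is a minimal illustration under stated assumptions: the names model_to_regex and children_allowed are hypothetical, element names are assumed to be written in a single case, and the "&" connector, parameter entities, inclusions, and exclusions are deliberately not handled.

```python
import re

def model_to_regex(model):
    """Translate a simple SGML content model into a regular expression.

    Handles groups, '|', ',', and the '?', '*', '+' indicators only.
    """
    pattern = []
    for token in re.findall(r"[A-Za-z][A-Za-z0-9.]*|#PCDATA|[()|,?*+]", model):
        if token == ",":
            continue                      # a sequence is plain concatenation
        if token in "()|?*+":
            pattern.append("(?:" if token == "(" else token)
        else:
            # Each child is encoded as its name followed by one space.
            pattern.append("(?:%s )" % re.escape(token))
    return re.compile("".join(pattern))

def children_allowed(model, children):
    """Check a list of child element names against a content model."""
    encoded = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(encoded) is not None
```

With this sketch, children_allowed("(DT|DD)+", ["DT", "DD", "DD"]) holds, whereas an empty list of children is rejected because the model requires at least one element.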

5.2.2 Attribute Definitions

The <!ATTLIST keyword begins the definition of attributes that an element may take (and the > character ends it). It is followed by the name of the element in question and a list of attribute definitions. An attribute definition is a triplet that defines the following:

The name of an attribute.
The type of the attribute's value or an explicit set of possible values. Values defined explicitly by the DTD are case-insensitive. The HTML DTD uses the types of CDATA , NAME (to be distinguished from NAME attributes), NUMBER , ID , and other data types defined by ISO 8879.
Whether the default value of the attribute is implicit (keyword #IMPLIED ), in which case the default value must be supplied by the browser (in some cases via inheritance from parent elements); always required (keyword #REQUIRED ); or fixed to the given value (keyword #FIXED ). Some attributes explicitly specify a default value for the attribute.

The following examples illustrate possible attribute definitions:

rows NUMBER #REQUIRED specifies the number of rows in a TEXTAREA . This attribute requires values of type NUMBER .
HREF CDATA #IMPLIED specifies the HREF attribute of an anchor element. It indicates that the HREF attribute is optional, and if specified, it has a value of type CDATA .
ALIGN (top|middle|bottom) #IMPLIED specifies alignment options. It indicates that the optional ALIGN attribute is constrained to take values from the set {top, middle, bottom}.

Embedding attribute definitions within the ATTLIST definition associates them with the elements. For example, <!ATTLIST A HREF CDATA #IMPLIED> defines the attribute HREF for the A element to be #IMPLIED, namely optional, and to contain CDATA, namely text that may include character entities.
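
The triplet structure of an attribute list lends itself to a small parser. The following Python sketch is an illustration only: parse_attlist is a hypothetical name, and the sketch assumes that parameter entities have already been expanded and that default values contain no blanks.

```python
import re

def parse_attlist(declaration):
    """Return (element_name, [(attr_name, value_type, default), ...])."""
    # Drop "-- ... --" comments, which may appear between definitions.
    body = re.sub(r"--.*?--", " ", declaration, flags=re.DOTALL).strip()
    if not (body.startswith("<!ATTLIST") and body.endswith(">")):
        raise ValueError("not an ATTLIST declaration")
    # A token is either a parenthesized value set or a run of non-blanks.
    tokens = re.findall(r"\([^)]*\)|[^\s()]+", body[len("<!ATTLIST"):-1])
    element, rest = tokens[0], tokens[1:]
    triplets = []
    i = 0
    while i < len(rest):
        name, value_type, default = rest[i], rest[i + 1], rest[i + 2]
        i += 3
        if default == "#FIXED":           # #FIXED is followed by the value
            default += " " + rest[i]
            i += 1
        triplets.append((name, value_type, default))
    return element, triplets
```

Applied to <!ATTLIST A HREF CDATA #IMPLIED>, the sketch yields the element name A and the single triplet (HREF, CDATA, #IMPLIED).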

5.2.3 Entities

The HTML DTD begins with a series of entity definitions [HTML]. An entity definition here is a parameter entity definition (not to be confused with the character entities used in document text): it defines a kind of macro that may be expanded elsewhere in the DTD. When the macro is referred to by name in the DTD, it is expanded into a string.

An entity definition begins with the keyword <!ENTITY % followed by the entity name, the quoted string the entity expands to, and finally a closing >. For example, the following, extracted from RFC 1866 (made obsolete by RFC 2854), defines the string that the %InputType entity expands to, which, in this case, specifies the eight types of input controls supported:

 <!ENTITY % InputType "(TEXT | PASSWORD | CHECKBOX |
                          RADIO | SUBMIT | RESET |
                          IMAGE | HIDDEN )">

The string the entity expands to may contain other entity names. These names are expanded recursively. In the following example, the %block entity is defined to include %list , %preformatted , and %block.forms .

 <!ENTITY % block "P | %list | DL | %preformatted | %block.forms">

The %block entity is used frequently to specify a content model that includes block level elements.
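
Recursive expansion of entity names can be sketched in a few lines of Python. The function name expand is hypothetical, and the replacement strings used for list, preformatted, and block.forms in the usage note are illustrative placeholders rather than quotations from the DTD.

```python
import re

# A parameter entity reference: "%", a name, and an optional ";".
ENTITY_REF = re.compile(r"%([A-Za-z][A-Za-z0-9.]*);?")

def expand(text, entities, _active=()):
    """Recursively replace %name; references using the entities mapping."""
    def substitute(match):
        name = match.group(1)
        if name in _active:
            raise ValueError("recursive entity definition: " + name)
        if name not in entities:
            return match.group(0)      # leave unknown references untouched
        return expand(entities[name], entities, _active + (name,))
    return ENTITY_REF.sub(substitute, text)
```

Given a mapping in which block is "P | %list | DL | %preformatted | %block.forms", list is "UL | OL", preformatted is "PRE", and block.forms is "BLOCKQUOTE | FORM", the call expand("(%block)*", entities) produces "(P | UL | OL | DL | PRE | BLOCKQUOTE | FORM)*".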

5.2.4 Block versus Text Elements

HTML elements are classified into block and text level elements [HTML]. The intuition guiding structural distinction is that block elements create structures larger than text (in-line) elements. Specifically, block level elements may contain in-line elements and other block level elements. In-line elements may generally contain only text and other in-line elements. Block level elements generally begin on new lines whereas text elements generally do not. Further, block level elements end an unterminated paragraph element.

5.2.5 Selected Elements

Complete coverage of all HTML elements is not feasible within the boundaries of this chapter; see the HTML tutorial [HTML-TUTORIAL] for a complete reference. The following subsections summarize the elements that are noteworthy from an architectural point of view, to convey some of the flavor of the HTML model to those not familiar with it. A simplified conceptual structure of an HTML document is depicted in Example 5.1.

Example 5.1 A simplified example HTML document

  <HTML>
  <HEAD>
  <TITLE>A study of ...</TITLE>
  ...  other head elements  ...
  </HEAD>
  <BODY>
  ...  document body  ...
  </BODY>
  </HTML>

5.2.5.1 The `<BODY>` Element

The body of a document contains the document's content [HTML] [HTML-TUTORIAL]. For iTV, the body is a canvas where the content appears: text, images, colors, graphics, and so on. Since style sheets are now the preferred way to specify a document's presentation, the presentational attributes of BODY have been deprecated and should not be used in iTV programs. The following attributes are not deprecated by style sheets:

id : The id attribute assigns a document wide name to the body element; this attribute shares the same namespace as the name attribute.
class : The class attribute specifies a class or set of classes assigned to the body element.
lang : The lang attribute specifies the primary language of an element's text content using the language code as given by RFC 1766.
title : The content of the title attribute may be rendered by iTV browsers as a tool tip or a short message that appears when the element is selected using the remote control.
onload : The value of the onload attribute specifies a script to be executed when the load event occurs. The load event occurs when the browser completes loading a window (or all frames within a FRAMESET - see below for discussion about frames).
onunload : The value of the onunload attribute specifies a script to be executed when the unload event occurs. The onunload event occurs when the browser removes a document from a window (or frame).
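bgcolor : Specifies the background color; see Table 5.1 for commonly used values. Unlike the other attributes listed here, bgcolor is itself a presentational attribute that HTML 4 deprecates in favor of style sheets.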
onclick : The value of the onclick attribute specifies a script to be executed when the click event occurs, for example, when the remote control operation selects this element for highlighting. The click event occurs when the pointing device button is clicked over an element. This attribute can be used with most elements.
ondblclick : The value of the ondblclick attribute specifies a script to be executed when the double-click event occurs. The double-click event occurs when the element is selected for activation, for example, using the remote control selection button. This attribute can be used with most elements.
onmousedown : The value of the onmousedown attribute specifies a script to be executed when the mouse-down event occurs. The mouse-down event occurs when the pointing device button is pressed or a touch-screen is pressed over an element. If the sister event mouse-up occurs immediately thereafter on the same element, then such occurrence often triggers the click event, and sometimes it may trigger the double-click event. This attribute can be used with most elements.
onmouseup : The value of the onmouseup attribute specifies a script to be executed when the mouse-up event occurs. The mouse-up event occurs when the pointing device button is released or a touch-screen is released over an element. If immediately prior to this event the sister event, mouse-down, occurs on the same element, then such occurrence often triggers the click event, and sometimes it may trigger the double-click event. This attribute can be used with most elements.
onmouseover : The onmouseover attribute specifies a script to be executed when the mouse-over event occurs. The mouse-over event occurs when the pointing device is moved onto the element or a touch-screen stroke passes over the element. This attribute can be used with most elements.
onmousemove : The onmousemove attribute is similar to the onmouseover attribute, but triggers an event each time the pointing device moves while it is over the element.
onmouseout : The onmouseout attribute is similar to the onmouseover attribute, but triggers an event when leaving the rendering area of an element.
onkeypress : The onkeypress attribute specifies a script to be executed when the key-pressed event occurs. The key-pressed event occurs when a key is pressed and released while the pointing device is over an element, or while the element is selected or in focus. This attribute can be used with most elements.
onkeydown : The onkeydown attribute is similar to the onkeypress attribute, except that the event occurs as soon as a key is pressed, before it is released.
onkeyup : The onkeyup attribute is similar to the onkeypress attribute, except that the event occurs when a key is released.
style : The style attribute specifies style information for a single element using the default style sheet language; this attribute can be used by other elements as well. Example 5.2 below presents an example that sets color and font size information for the text in the body. The syntax of a CSS declaration is such that each property declaration is a name-value pair, e.g., "name:value", and property declarations are separated by a semicolon.

Example 5.2 Usage of the style attribute.

  <BODY style="font-size: 12pt; color: red">
  ... some text and other content ...
  </BODY>
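
The name-value structure of a style attribute value can be illustrated by splitting it into a dictionary. This Python sketch uses a hypothetical function name, parse_style, and assumes that no semicolon or colon appears inside a quoted string or URL in the value.

```python
def parse_style(style):
    """Split a style attribute value into a {property: value} dictionary."""
    declarations = {}
    for declaration in style.split(";"):
        if ":" not in declaration:
            continue                      # skip empty or malformed pieces
        name, value = declaration.split(":", 1)
        declarations[name.strip().lower()] = value.strip()
    return declarations
```

For the value "font-size: 12pt; color: red", the sketch returns a dictionary that maps font-size to 12pt and color to red.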

Table 5.1. Colors Using sRGB: #RRGGBB as Hex Values

Color Name	Hex Value	Color Name	Hex Value
Black	`#000000`	Red	`#FF0000`
White	`#FFFFFF`	Green	`#008000`
Silver	`#C0C0C0`	Blue	`#0000FF`
Gray	`#808080`	Lime	`#00FF00`
Maroon	`#800000`	Yellow	`#FFFF00`
Fuchsia	`#FF00FF`	Aqua	`#00FFFF`
Purple	`#800080`	Navy	`#000080`
Teal	`#008080`	Olive	`#808000`
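
Each hex value in Table 5.1 packs the red, green, and blue components into two hexadecimal digits apiece. The following Python sketch, with the hypothetical name hex_to_rgb, decodes such a value; it assumes the six-digit #RRGGBB form only.

```python
def hex_to_rgb(color):
    """Convert '#RRGGBB' into a (red, green, blue) tuple of 0-255 integers."""
    if len(color) != 7 or not color.startswith("#"):
        raise ValueError("expected #RRGGBB, got %r" % color)
    # Characters 1-2, 3-4, and 5-6 hold the red, green, and blue digits.
    return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))
```

For instance, Green (#008000) decodes to (0, 128, 0) and Silver (#C0C0C0) to (192, 192, 192).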

5.2.5.2 The `<STYLE>` Element

The style attribute described earlier is appropriate for applying a particular style to an individual element [HTML] [HTML-TUTORIAL]. To allow the style to be reused by several elements, the STYLE element should be used. HTML permits any number of STYLE elements, specifying styling rules in the HEAD section of a document (see Example 5.3 and Example 5.4). For this element, both starting and ending tags are required. A general guideline is to place style rules in style sheet files separate from the HTML document.

The type attribute is a CDATA value that specifies the style sheet language of the element's contents, thus overriding the default style sheet language. The style sheet language is specified as a media type, and commonly refers to text/css. In some iTV implementations, however, other media types could be used, such as those specified in Table 5.2.

Table 5.2. Possible iTV Style Content Types

Type Name	Possible File extension
audio/ac3	.ac3
audio/basic	.au
image/jpeg	.jpg
image/png	.png
text/css	.css
text/ecmascript	.es
video/mng	.mng
video/mpeg	.mpg
video/mpv	.mpg
application/mpeg	.mpg

The media attribute is a CDATA list that specifies the intended destination medium for style information. It may be a single media type or a comma-separated list. Possible media types include the following:

screen : Output is intended for non-paged computer screens. This is the default value.
print : Output is intended for paged, opaque material and for documents on screen viewed in print preview mode.
projection : Output is intended for projectors.
braille : Output is intended for braille tactile feedback devices.
speech : Output is intended for a speech synthesizer.
all : Applies to all devices.

Example 5.3 Places a border around every H1 and centers it on the page.

  <HEAD>
  <STYLE type="text/css">
  H1 {border-width: 1; border: solid; text-align: center}
  </STYLE>
  </HEAD>

Example 5.4 Usage of style classes.

  <HEAD>
  <STYLE type="text/css">
  H1.myclass {border-width:1; border:solid; text-align:center}
  </STYLE>
  </HEAD>
  <BODY>
  <H1 class="myclass"> Style applies to this H1 </H1>
  <H1> Style doesn't apply here ... </H1>
  <H1 class="otherclass"> Style doesn't apply here ... </H1>
  </BODY>
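
The way a class selector such as H1.myclass picks out elements in Example 5.4 can be sketched as a simple comparison. The Python function below is a hypothetical illustration, not a CSS implementation: it handles only selectors of the form ELEMENT, .class, or ELEMENT.class, and it assumes that element names compare case-insensitively while class names compare exactly.

```python
def selector_matches(selector, tag, class_attr=""):
    """Check an 'ELEMENT.class' selector against a tag name and class attribute."""
    element, _, css_class = selector.partition(".")
    if element and element.upper() != tag.upper():
        return False
    # The class attribute holds a space-separated set of class names.
    return not css_class or css_class in class_attr.split()
```

Applied to the three H1 elements of Example 5.4, the sketch accepts only the first one, whose class attribute is "myclass".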

5.2.5.3 The Anchor `<A>` Element

The anchor <A> element may define an anchor, a link, or both [HTML] [HTML-TUTORIAL]. When this link is activated by selecting it using the remote control navigation, for example, the browser (also known as the user agent) will retrieve the data pointed to by this link. Links may point to other HTML pages, images, audio, or video files, and other media types.

iTV browsers generally render links in such a way as to make their text distinguishable from text that is not between the <A> and </A> tags. The link text may be shown in a different color, underlined, or (rarely) flashing. For example, the text "the W3C technologies site" in Example 5.5 will be distinguishable from the text "For details see".

The name attribute could be used to specify an entry point within the document; see Example 5.5. The href attribute must specify a URI, which is an extensible mechanism for identifying a resource, which may be a file, a stream, a virtual channel, a multicast, or another type. The URI framework provides a way to concatenate a base URI with a relative URI to form an absolute URI identifying a resource. Moreover, a URI can be classified as a locator, a name, or both. URL refers to the subset of URIs that identify resources via a representation of their primary access mechanism (e.g., their network "location"), rather than identifying the resource by name or by some other attribute(s) of that resource. The term Uniform Resource Name (URN) refers to the subset of URIs that are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable. Some URIs are both URLs and URNs. Some are neither.

In addition to referencing a resource as a whole, a URI may contain references within resources using the # separator. For example, when the reference points to an HTML file, it is possible to use the # separator to specify the name of the component within the document.

Example 5.5 Usage of the anchor tag.

  For details see <A href="http://www.w3.org/#technologies">the W3C technologies site</A>
  ... where the HTML file retrieved from http://www.w3.org contains:
  <h3 class="navhead"><a name="technologies">W3C A to Z</a></h3>
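
Combining a base URI with a relative reference, and then separating the # fragment from the resource, can be tried out with Python's standard urllib.parse module. The function name resolve and the base URI used in the usage note are assumptions made for illustration.

```python
from urllib.parse import urldefrag, urljoin

def resolve(base_uri, reference):
    """Return (absolute URI of the resource, fragment name) for a reference."""
    absolute = urljoin(base_uri, reference)      # base + relative -> absolute
    resource, fragment = urldefrag(absolute)     # split off the "#name" part
    return resource, fragment
```

Resolving the reference "/#technologies" against the hypothetical base "http://www.w3.org/Consortium/index.html" yields the resource "http://www.w3.org/" and the fragment "technologies", which corresponds to the anchor named in Example 5.5.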

5.2.5.4 The `<TABLE>` Element

One of the most powerful yet simple methods for producing information-rich interfaces is the use of tables [HTML] [HTML-TUTORIAL]. An obvious application is iTV electronic program guides. In HTML, the TABLE element is used for that purpose. The TABLE element contains, as children, all other elements that specify caption, rows, content, and formatting (see Example 5.6). The number of rows in a table is equal to the number of child TR elements. Determining the number of columns is more complicated, but can be done using the following heuristics:

Scan each row in turn to compute the minimum number of columns needed to hold every row (taking column spans into account). If the column count for the table exceeds the number of cells in a given row (including cells spanned from earlier rows), the end of that row is padded with empty cells. The "end" of a row depends on the direction of the table.
Count the number of columns as specified by the COL and COLGROUP elements, which can only occur at the start of the table (after the optional CAPTION).
Use the cols attribute on the TABLE element. This is the weakest method as you don't get any additional information on column widths. This may not matter if you use style sheets to specify widths.
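
The first heuristic, scanning the rows, can be sketched in Python as follows. This is a simplified illustration, not the algorithm of any particular browser: count_columns is a hypothetical name, each row is assumed to be given as a list of (colspan, rowspan) pairs, and overlapping spans are not diagnosed.

```python
def count_columns(rows):
    """Count table columns; each row is a list of (colspan, rowspan) cells."""
    width = 0
    carried = {}                    # column index -> further rows occupied
    for row in rows:
        # Slots still covered by cells that span down from earlier rows.
        occupied = {c for c, left in carried.items() if left > 0}
        carried = {c: left - 1 for c, left in carried.items() if left > 1}
        column = 0
        for colspan, rowspan in row:
            while column in occupied:   # skip slots taken by spanning cells
                column += 1
            for c in range(column, column + colspan):
                occupied.add(c)
                if rowspan > 1:
                    carried[c] = rowspan - 1
            column += colspan
        width = max(width, max(occupied, default=-1) + 1)
    return width
```

A row that ends before the returned column count would then be padded with empty cells, as described above.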

Figure 5.2 The DTD specification of the TABLE element.

 <!ELEMENT TABLE - - (CAPTION?, (COL*|COLGROUP*), THEAD?, TFOOT?, TBODY+)>
 <!ATTLIST TABLE                 -- table element --
   %attrs;                       -- %coreattrs, %i18n, %events --
   align       %TAlign; #IMPLIED -- table position relative to window --
   bgcolor     %Color;  #IMPLIED -- background color for cells --
   width       CDATA    #IMPLIED -- table width relative to window --
   cols        NUMBER   #IMPLIED -- used for immediate display mode --
   border      CDATA    #IMPLIED -- controls frame width around table --
   frame       %TFrame; #IMPLIED -- which parts of table frame to include --
   rules       %TRules; #IMPLIED -- rulings between rows and cols --
   cellspacing CDATA    #IMPLIED -- spacing between cells --
   cellpadding CDATA    #IMPLIED -- spacing within cells --
   >

Example 5.6 Usage of the TABLE element.

  <TABLE cols="2">
  <CAPTION>Table caption goes here ...</CAPTION>
  <COLGROUP align="center">
  <COL width="1*">
  <COL width="3*" align="char" char=":">
  </COLGROUP>
  <THEAD>
  <TR> <TH> HeadA </TH> <TH> HeadB </TH> </TR>
  </THEAD>
  <TFOOT>
  <TR> <TH> FootA </TH> <TH> FootB </TH> </TR>
  </TFOOT>
  <TBODY>
  <TR> <TD> DataA1 </TD> <TD> DataB1 </TD> </TR>
  <TR> <TD> DataA2 </TD> <TD> DataB2 </TD> </TR>
  <TR> <TD> DataA3 </TD> <TD> DataB3 </TD> </TR>
  </TBODY>
  <TBODY>
  <TR> <TD> A </TD> <TD> B </TD> </TR>
  </TBODY>
  </TABLE>

The CAPTION element's text should describe the nature of the table. The CAPTION element must come immediately after the <TABLE> start tag. The THEAD, TFOOT, and TBODY elements can be used to mark the header, footer, and body of the table, respectively. Within the THEAD element, the TH element should be used to mark header text. The COL and COLGROUP elements could also be used to further enhance the rendering of the table.

There is no standard rendering method for tables of various configurations defined with these elements. Each iTV receiver manufacturer has its own look-and-feel. Volumes of related and somewhat more detailed information can be found in books that focus on HTML.

5.2.5.5 The `<OBJECT>` Element

The OBJECT element allows authors to control whether included objects are handled by browsers internally or externally [HTML] [HTML-TUTORIAL]. The PARAM element, which must be a child of the OBJECT element (see Figure 5.3 for the specification), specifies a set of run-time values. In the most general case, the OBJECT element (and the nested PARAM elements) passes three types of information to a plug-in rendering mechanism:

A reference to the rendering code
A reference to the data to be rendered
Additional run-time (configuration) parameters

Figure 5.3 The DTD specification of the `OBJECT` element.

 <!ELEMENT OBJECT - - (PARAM | %block)*>
 <!ATTLIST OBJECT
   %attrs                           -- %coreattrs, %i18n, %events --
   declare (declare)  #IMPLIED -- declare but don't instantiate flag --
   classid %URL #IMPLIED  -- identifies an implementation --
   codebase %URL #IMPLIED  -- some systems need an additional URL --
   data %URL #IMPLIED  -- reference to object's data --
   type %ContentType #IMPLIED  -- Internet content type for data --
   codetype %ContentType #IMPLIED  -- Internet content type for code --
   standby CDATA #IMPLIED  -- message to show while loading --
   align %OAlign #IMPLIED  -- positioning inside document --
   height %Length #IMPLIED  -- suggested height --
   width %Length #IMPLIED  -- suggested width --
   border %Length #IMPLIED  -- suggested link border width --
   hspace %Length #IMPLIED  -- suggested horizontal gutter --
   vspace %Length #IMPLIED  -- suggested vertical gutter --
   usemap %URL #IMPLIED  -- reference to image map --
   shapes (shapes) #IMPLIED  -- object has shaped hypertext links --
   name %URL #IMPLIED  -- submit as part of form --
   tabindex NUMBER #IMPLIED  -- position in tabbing order --
   >

In certain situations, it may not be necessary to specify all of this information. For example, some rendering mechanisms might not require data (e.g., a self-contained applet that performs a small animation). Other rendering mechanisms might not require run-time initialization. Finally, some rendering mechanisms may not require additional implementation information, i.e., the browser itself may already know how to render that type of data (e.g., GIF images).

One significant consequence of the OBJECT element's design is that it offers a mechanism for specifying alternate object renderings; each embedded OBJECT declaration may specify an alternate rendering mechanism. If an iTV browser cannot render the outermost OBJECT, it tries to render the contents, which may be another OBJECT element. In Example 5.7 several OBJECT declarations are nested to illustrate how alternate renderings work. An iTV browser should attempt to render the first OBJECT element it can, in the following order: (a) an applet, (b) an MPEG animation of the Earth, (c) a GIF image of the Earth, (d) alternate text.

Example 5.7 Using nested OBJECT elements for alternate rendering options.

  <OBJECT title="The Earth as seen from space"
          classid="http://www.observer.mars/TheEarth.class">
  <OBJECT data="TheEarth.mpeg" type="application/mpeg">
  <OBJECT data="TheEarth.gif" type="image/gif">
  The <STRONG>Earth</STRONG> as seen from space.
  </OBJECT>
  </OBJECT>
  </OBJECT>
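
The fallback order can be sketched as a walk over the nested declarations. The Python code below is a hypothetical model rather than browser code: each OBJECT is represented as a dictionary, the function name choose_rendering is made up, and the content type "application/java" given to the applet is an assumption, since Example 5.7 does not state one.

```python
def choose_rendering(obj, supported_types):
    """Walk nested OBJECT declarations and return the first renderable one.

    obj is a dict with a 'type' and a 'fallback' that is either another
    dict (a nested OBJECT) or a string (alternate text).
    """
    while isinstance(obj, dict):
        if obj.get("type") in supported_types:
            return obj
        obj = obj.get("fallback", "")   # descend into the element's contents
    return obj                          # nothing matched: use alternate text

# A model of Example 5.7; the applet's type is an assumed value.
earth = {
    "type": "application/java",
    "fallback": {
        "type": "application/mpeg",
        "fallback": {
            "type": "image/gif",
            "fallback": "The Earth as seen from space.",
        },
    },
}
```

A receiver that supports only image/gif would, under this model, skip the applet and the MPEG animation and render the GIF image; a receiver that supports none of the types falls through to the alternate text.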

Sometimes, rather than linking to another document, it is helpful to include the contents of an HTML document in another HTML document (see Example 5.8). It is recommended to use the OBJECT element with the data attribute for this purpose. For instance, the following lines will include the contents of file_to_include.html at the location where the OBJECT definition occurs. Note that the warning text, which is the contents of OBJECT, must only be rendered if the file specified by the data attribute cannot be loaded.

Example 5.8 Using `OBJECT` to include an HTML document within another HTML document.

  ...text before...
  <OBJECT data="file_to_include.html">
  Warning: file_to_include.html could not be included.
  </OBJECT>
  ...text after...

5.2.5.6 The `<IMG>` Element

The IMG element embeds an image in the current document at the location of the element's definition. The height and width attributes of this element cause the rescaling of the image to fit the specified size. The specification is provided in Figure 5.4. As an example, to mark an image family.png accessible from www.foo.com whose title is "A family photo" one could use

 <IMG src="http://www.foo.com/vacation/family.png" alt="A family photo">

A much better alternative to the IMG element is to use the OBJECT element as follows:

 <OBJECT data="http://www.foo.com/vacation/family.png" type="image/png"> A family photo</OBJECT>

Figure 5.4 The DTD specification of the IMG element.

 <!ELEMENT IMG - O EMPTY      --  Embedded image -->
 <!ATTLIST IMG
   %attrs;                          -- %coreattrs, %i18n, %events --
   src %URL #REQUIRED -- URL of image to embed --
   alt CDATA #IMPLIED  -- description for text only browsers --
   align %IAlign #IMPLIED  -- vertical or horizontal alignment --
   height %Pixels #IMPLIED  -- suggested height in pixels --
   width %Pixels #IMPLIED  -- suggested width in pixels --
   border %Pixels #IMPLIED  -- suggested link border width --
   hspace %Pixels #IMPLIED  -- suggested horizontal gutter --
   vspace %Pixels #IMPLIED  -- suggested vertical gutter --
   usemap %URL #IMPLIED  -- use client-side image map --
   ismap (ismap) #IMPLIED  -- use server-side image map --
   >

5.2.5.7 The `<FORM>` Element

The FORM element acts as a container for controls [HTML] [HTML-TUTORIAL]. It specifies the following:

The layout of the form (given by the contents of the element).
The program that will handle the completed and submitted form (the action attribute). The receiving program must be able to parse name-value pairs to make use of them.
The method by which user data will be sent to the server (the method attribute).
A character encoding that must be accepted by the server to handle this form (the accept-charset attribute). Browsers may advise the user of the value of the accept-charset attribute or restrict the user's ability to enter unrecognized characters.

The action attribute is a URI that specifies a program for handling the submitted form. It can be an HTTP URL (to submit the form to a program) or a MAILTO URL (to email the form).

The method attribute may specify either get or post , which indicates which HTTP method will be used to submit name-value pairs to the form handler.

post : Use the HTTP POST method. The POST method includes name-value pairs in the body of the HTTP request and not in the URL specified by the action attribute.
get : Use the HTTP GET method. The GET method appends name-value pairs to the URL specified by action and sends this new URL to the server. This is the default value for backwards compatibility. The value has been deprecated for reasons of internationalization and should be avoided.
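
The difference between the two methods can be made concrete with a short Python sketch built on the standard urllib.parse module. The function name build_submission and the URL used in the usage note are hypothetical, and the sketch assumes the URL-encoded form of name-value pairs rather than multipart/form-data.

```python
from urllib.parse import urlencode

def build_submission(action, method, pairs):
    """Return (url, body) for submitting name-value pairs with GET or POST."""
    encoded = urlencode(pairs)      # name=value pairs joined with "&"
    if method.lower() == "get":
        # GET: the pairs become part of the URL; there is no request body.
        separator = "&" if "?" in action else "?"
        return action + separator + encoded, None
    # POST: the URL is unchanged; the pairs travel in the request body.
    return action, encoded.encode("ascii")
```

For the pairs show=news and rating=5 and the hypothetical action "http://example.com/vote", the get method produces the URL "http://example.com/vote?show=news&rating=5" with no body, whereas the post method leaves the URL unchanged and carries "show=news&rating=5" in the body.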

The enctype attribute contains CDATA that specifies the MIME type [MIME] used to submit the form to the server (when the value of method is post). The value multipart/form-data should be used when submitting files.

The accept-charset attribute contains CDATA that specifies the list of character encodings for input data that must be accepted by the server processing this form. The value is a space- or comma-delimited list of charsets as defined in RFC 2045. The server must be able to accept any single character encoding per entity received.

The accept attribute contains CDATA that specifies a comma-separated list of MIME types that a server processing this form will handle correctly. iTV browsers may use this information to filter out non-conformant files when prompting a viewer to select files to be sent to the server.

Forms are populated by control elements that generally appear as the FORM 's child elements. However, these elements may also appear outside of a FORM element declaration when they are used to build user interfaces.

Input Element

The INPUT element specifies the controls that allow viewers to enter data. The following types of input elements are supported by HTML:

text : The text input type creates a single line text box. The value submitted by a text control is the input text.
password : The password input type is similar to the text input type, but the input text is rendered in such a way as to hide the characters (e.g., a series of asterisks ). This control is used for sensitive input such as passwords. The value submitted by a password control is the input text (not the rendering).
checkbox : The checkbox input type is an on-off switch. When the switch is on, the value of the checkbox is active . When the switch is off, the value is inactive . The checkbox value is only submitted with the form when the switch is on. Several check boxes within the same form may bear the same name. On submission, each on check box with the same name submits a name-value pair with the same name component. This allows users to select more than one value for a given property.
radio : The radio input type is a radio button on-off switch. When the switch is on, the value of the radio button is active . When the switch is off, the value is inactive . The radio button value is only submitted with the form when the switch is on. Several radio buttons within the same form may bear the same name. However, only one of these buttons may be on at any one time. All related buttons are set to off as soon as one is set to on . Thus, for related radio buttons, only one name-value pair is ever submitted.
submit : Creates a submit button. When this button is activated by the user, the form is submitted to the location specified by the action attribute of the enveloping FORM element. A form may contain more than one submit button. Only the name-value pair of the activated submit button is submitted with the form.
image : Creates a graphic submit button. The value of the src attribute specifies the URL of the image that will decorate the button. When a pointing device, e.g., an iTV track ball, is used to click on the image, the form is submitted and the click location is passed to the server as x and y coordinates. The x value is measured in pixels from the left of the image, and the y value in pixels from the top of the image. The submitted data includes name.x=x-value and name.y=y-value, where name is the value of the name attribute, and x-value and y-value are the x and y coordinate values, respectively.
reset : Creates a reset button. When this button is activated by the user, all of the form's controls have their values reset to the initial values specified by their value attributes. The name-value pair of a reset button is not submitted with the form.
button : Creates a push button that has no default behavior. The behavior of the button is defined by associating the button with client-side scripts that are triggered when events affecting the button occur (e.g., clicking the button). The value of the value attribute is the label used for the button. For example, the following declaration causes the function named verify() to be executed when the button is clicked; the script containing the verify() function must be defined by a SCRIPT element.
```
 <INPUT type="button" value="Click Me" onclick="verify()"> 
```
hidden : Creates a hidden element that is not rendered by the browser. However, the element's name and value are submitted with the form. This control type is generally used to store session state information that would otherwise be lost due to the stateless nature of HTTP. Controls of other types that are not rendered because of style settings are submitted in the same way. For example, the following password control, although its style attribute prevents the browser from displaying it, will have its value submitted with the form. Note that the content of the form, including values that are not rendered, may be exposed using the view source option available in most PC-based Web browsers.
```
 <INPUT type="password" style="display:none"
        name="invisible-password" value="mypassword">
```
file : Prompts the user for a file name. When the form is submitted, the contents of the file are submitted to the server as well as other user input. Browsers should encapsulate multiple files in a MIME multipart document [MIME]. This mechanism encapsulates each file in a body-part of a multipart MIME body that is sent as the HTTP entity. Each body part can be labeled with an appropriate content type, including, if necessary, a charset parameter that specifies the character encoding.
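
The submission rules for same-named check boxes, hidden controls, and file controls can be sketched in one small form. This is an illustration only, not a listing from the original text: the action URL and every control name and value are hypothetical. It assumes the usual HTML 4 convention that a form carrying a file control is sent with the post method and the multipart/form-data encoding.

```html
<!-- Hypothetical form: the URL and all control names are illustrative only -->
<FORM action="http://somesite.com/prog/upload" method="post"
      enctype="multipart/form-data">
  <P>
  <!-- Both check boxes share the name "genre"; if both are on,
       two pairs are submitted: genre=news and genre=sports -->
  <INPUT type="checkbox" name="genre" value="news" checked>News
  <INPUT type="checkbox" name="genre" value="sports">Sports
  <!-- Never rendered, but the pair session=abc123 is always submitted -->
  <INPUT type="hidden" name="session" value="abc123">
  <!-- The contents of the chosen file travel as one MIME body part -->
  <INPUT type="file" name="photo">
  <INPUT type="submit" value="Send">
  </P>
</FORM>
```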

Example 5.9 depicts a sample HTML fragment defining a simple form that allows the user to enter a first name, last name, email address, and sex. When the submit button is activated, the form is sent to the program specified by the action attribute. A possible rendering of this form is depicted in Figure 5.5 and the DTD specification is shown in Figure 5.6. To make the form viewer-friendly, the use of tables and other layout tools such as style sheets is recommended.

Example 5.9 A sample `FORM` embedded within a `TABLE`.

  <FORM action="http://somesite.com/prog/adduser" method="post">
    <TABLE>
      <TR>
        <TD>First name:</TD>
        <TD><INPUT type="text" name="firstname"></TD>
      </TR>
      <TR>
        <TD>Last name: </TD>
        <TD><INPUT type="text" name="lastname"></TD>
      </TR>
      <TR>
        <TD>email:     </TD>
        <TD><INPUT type="text" name="email"></TD>
      </TR>
      <TR>
        <TD><INPUT type="radio" name="sex" value="Male">Male</TD>
        <TD><INPUT type="radio" name="sex" value="Female">Female</TD>
      </TR>
      <TR>
        <TD colspan=2 align=center>
          <INPUT type="submit" value="Send"> <INPUT type="reset">
        </TD>
      </TR>
    </TABLE>
  </FORM>

Figure 5.5. A possible rendering of the sample `FORM` in Example 5.9.

Figure 5.6 The DTD specification of the `FORM` element.

 <!ELEMENT FORM - - %block -(FORM)>
 <!ATTLIST FORM
   %attrs;                                -- %coreattrs, %i18n, %events --
   action         %URL         #REQUIRED  -- server-side form handler --
   method         (GET|POST)   GET        -- HTTP method used to submit the form --
   enctype        %ContentType; "application/x-www-form-urlencoded"
   onsubmit       %Script      #IMPLIED   -- the form was submitted --
   onreset        %Script      #IMPLIED   -- the form was reset --
   target         CDATA        #IMPLIED   -- where to render result --
   accept-charset CDATA        #IMPLIED   -- list of supported charsets --
   >

`BUTTON` Element

A BUTTON element with a type of submit is similar to an INPUT element with the same type: They both cause a form to be submitted, but the BUTTON element allows richer presentational possibilities [HTML] [HTML-TUTORIAL].

A BUTTON element with a type of submit and whose content is an image element IMG is very similar to an INPUT element with a type of image. They both cause a form to be submitted, but their presentation is different: An INPUT element is supposed to be rendered as a flat image, whereas a BUTTON is supposed to be rendered as a 3D button with relief and an up and down motion when clicked.
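
The two alternatives compared above can be placed side by side. The fragment is a minimal sketch rather than a listing from the original text: the image file name send.gif and the control name send are placeholders.

```html
<!-- Flat graphic submit button; send.gif is a placeholder file name -->
<INPUT type="image" src="send.gif" name="send" alt="Send">

<!-- Submit button whose content combines an image with text -->
<BUTTON type="submit" name="send" value="send">
  <IMG src="send.gif" alt=""> Send
</BUTTON>
```

Because the BUTTON element has content, it can mix images and marked-up text, which is not possible with the empty INPUT element.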

`SELECT` and `OPTION` Elements

The SELECT element creates a list of choices that may be selected by the user [HTML] [HTML-TUTORIAL]. Each SELECT element must contain at least one choice. Each choice is specified by an instance of the OPTION element.

When the form is submitted, each selected choice will be paired with the name "component-select" and submitted. The submitted value of each OPTION will be its contents, except where overridden by the value attribute (here, in the first two components).
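
The listing that this paragraph describes is not reproduced in this section. A minimal sketch consistent with the description might look as follows; only the name "component-select" comes from the text, while the action URL and the option labels and values are assumptions chosen for illustration.

```html
<!-- Assumed reconstruction: URL, labels, and values are illustrative -->
<FORM action="http://somesite.com/prog/component-select" method="post">
  <P>
  <SELECT multiple size="4" name="component-select">
    <!-- The first two options override their contents with value -->
    <OPTION selected value="Component_1_a">Component_1</OPTION>
    <OPTION selected value="Component_1_b">Component_2</OPTION>
    <!-- Without a value attribute, the contents are submitted -->
    <OPTION>Component_3</OPTION>
    <OPTION>Component_4</OPTION>
  </SELECT>
  <INPUT type="submit" value="Send">
  </P>
</FORM>
```

If the form is submitted with the two preselected options still selected, the pairs component-select=Component_1_a and component-select=Component_1_b are sent; selecting the third option would add component-select=Component_3.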

`TEXTAREA` Element

The TEXTAREA element creates a multiline text input control (as opposed to a single-line input control) [HTML] [HTML-TUTORIAL]. The content of this element provides the initial text presented by the control. Setting the readonly attribute allows authors to display unmodifiable text in a TEXTAREA. This differs from using standard marked-up text in a document because the value of the TEXTAREA is submitted with the form.
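
A read-only TEXTAREA can be sketched as follows. The action URL, the control name, and the text are hypothetical; the rows and cols attributes give the visible size of the control in text lines and characters.

```html
<!-- Hypothetical form: the URL, name, and text are illustrative only -->
<FORM action="http://somesite.com/prog/accept-terms" method="post">
  <P>
  <!-- readonly: the viewer cannot edit the text, yet it is submitted -->
  <TEXTAREA name="terms" rows="4" cols="40" readonly>
The viewer agrees to the terms of service.
  </TEXTAREA>
  <INPUT type="submit" value="Accept">
  </P>
</FORM>
```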

`LABEL` Element

The LABEL element may be used to attach information to other control elements (excluding other LABEL elements) [HTML] [HTML-TUTORIAL]. Labels may be rendered by browsers in a number of ways, for example visually or read aloud by speech synthesizers. When a LABEL element receives focus, it passes the focus on to its associated control.

A label is associated with a control implicitly by placing the control within the contents of the LABEL. In this case, the LABEL may only contain one other control element. The label itself may be positioned before or after the associated control. A label is associated with a control explicitly by setting the for attribute; more than one LABEL may be associated with the same control by creating multiple references using the for attribute.
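
Both forms of association can be sketched briefly. The control names and the id value below are hypothetical; the explicit form relies on the for attribute matching the id of the target control.

```html
<!-- Explicit association: for matches the id of the control -->
<LABEL for="fname">First name:</LABEL>
<INPUT type="text" name="firstname" id="fname">

<!-- Implicit association: the control is the content of the LABEL -->
<LABEL>
  Last name:
  <INPUT type="text" name="lastname">
</LABEL>
```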

`FIELDSET` Element

The FIELDSET element allows form designers to group thematically related controls together [HTML] [HTML-TUTORIAL]. Grouping controls makes it easier for users to understand their purpose while simultaneously facilitating tabbing navigation for visual browsers and speech navigation for speech-oriented browsers. The proper use of this element makes documents more accessible to people with disabilities.

`LEGEND` Element

The LEGEND element allows designers to assign a caption to a FIELDSET [HTML] [HTML-TUTORIAL]. The legend improves accessibility when the FIELDSET is rendered non-visually. When rendered visually, setting the align attribute on the LEGEND element aligns it with respect to the FIELDSET.
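
Grouping and captioning can be sketched by regrouping controls like those in Example 5.9. The grouping itself and the legend texts are assumptions made for illustration.

```html
<!-- Hypothetical grouping; the legend texts are illustrative only -->
<FORM action="http://somesite.com/prog/adduser" method="post">
  <FIELDSET>
    <!-- The caption is aligned at the top of its group -->
    <LEGEND align="top">Personal information</LEGEND>
    <LABEL>First name: <INPUT type="text" name="firstname"></LABEL>
    <LABEL>Last name: <INPUT type="text" name="lastname"></LABEL>
  </FIELDSET>
  <FIELDSET>
    <LEGEND align="top">Contact</LEGEND>
    <LABEL>email: <INPUT type="text" name="email"></LABEL>
  </FIELDSET>
  <P><INPUT type="submit" value="Send"></P>
</FORM>
```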

5.2.5.8 The `<FRAMESET>` and `<FRAME>` elements

The FRAMESET element specifies the layout of the main entry window in terms of rectangular subspaces; they may be nested to any level [HTML] [HTML-TUTORIAL]. Sample HTML is presented in Example 5.10; the resulting layout is depicted in Figure 5.7. Finally, the DTD specification is presented in Figure 5.8.

Figure 5.7. The layout of the sample nested `FRAMESET`.

Setting the rows attribute defines the number of horizontal subspaces. Setting the cols attribute defines the number of vertical subspaces. Both attributes may be set simultaneously to create a grid. If the rows attribute is not set, each column extends the entire length of the page. If the cols attribute is not set, each row extends the entire width of the page. If neither attribute is set, the frame takes up exactly the size of the page.

The rows and cols attributes must have values that are comma-separated lists of lengths. A length may be absolute (given as a number of pixels or a percentage of the screen) or a relative length, indicated by the form i*, where i is an integer. When allotting space to rows and columns, browsers allot absolute lengths first, then divide up remaining space among relative length rows or columns. The value * is equivalent to 1*.
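
The allocation rule can be illustrated with a mix of absolute and relative lengths. The file names are placeholders and the 800-pixel display width used in the explanation is an assumption.

```html
<HTML>
<!-- Three columns: 200 pixels, then the remainder split 2:1 -->
<FRAMESET cols="200,2*,*">
  <FRAME src="menu.html">
  <FRAME src="main.html">
  <FRAME src="side.html">
</FRAMESET>
</HTML>
```

On a display assumed to be 800 pixels wide, the first column receives its 200 pixels first. The remaining 600 pixels are divided into three shares of 200 pixels each, so the second column (2*) is 400 pixels wide and the third (*, equivalent to 1*) is 200 pixels wide.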

The FRAME element defines the contents and appearance of a single view. The src attribute of the FRAME element specifies the initial document the frame will contain; it is not possible for the contents of a frame to be in the same document as the frame's definition. Views are created left to right for columns and top to bottom for rows. When both attributes are specified, views are created left to right in the top row, left to right in the second row, and so on.

Example 5.10 A nested `FRAMESET` with `FRAME` views.

  <HTML>
  <FRAMESET cols="33%,33%,33%">
    <FRAMESET rows="*,200">
      <FRAME src="contents_of_frame1.html">
      <FRAME src="contents_of_frame2.gif">
    </FRAMESET>
    <FRAME src="contents_of_frame3.html">
    <FRAME src="contents_of_frame4.html">
  </FRAMESET>
  </HTML>

Figure 5.8 The DTD specification of the FRAMESET and FRAME elements.

 <!ELEMENT FRAMESET - - ((FRAMESET|FRAME)+ & NOFRAMES?)>
 <!ATTLIST FRAMESET
   -- absolute pixel values, percentages or relative scales. --
   rows         CDATA         #IMPLIED  -- if not given, default is 1 row --
   cols         CDATA         #IMPLIED  -- if not given, default is 1 column --
   onload       %Script       #IMPLIED  -- all the frames have been loaded  --
   onunload     %Script       #IMPLIED  -- all the frames have been removed --
   >
 <!ELEMENT FRAME - O EMPTY>
 <!ATTLIST FRAME
   name         CDATA         #IMPLIED  -- name of frame for targeting --
   src          %URL          #IMPLIED  -- source of frame content --
   frameborder  (1|0)         1         -- request frame borders? --
   marginwidth  %Pixels       #IMPLIED  -- margin widths in pixels --
   marginheight %Pixels       #IMPLIED  -- margin height in pixels --
   noresize     (noresize)    #IMPLIED  -- allow users to resize frames? --
   scrolling    (yes|no|auto) auto      -- scrollbar or none --
   >

5.2.5.9 The `<SCRIPT>` element

The SCRIPT element places a script within a document. This element may appear any number of times in the HEAD or BODY of an HTML document. The DTD specification of the SCRIPT element is given in Figure 5.9. The script may be defined within the contents of the SCRIPT element or in an external file. If the src attribute is not set, browsers must interpret the contents of the element as the script. If the src attribute has a URL value, browsers must ignore the element's contents and retrieve the script via the URL. Scripts are evaluated by script engines that must be known to a browser.

As HTML does not rely on a specific scripting language, document authors must explicitly tell browsers the language of each script. This may be done either through a default declaration or a local declaration. Documents that contain neither a default scripting language declaration nor a local one for a SCRIPT element are incorrect. Browsers might still try to interpret the script but are not required to.

Figure 5.9 The DTD specification of the SCRIPT element.

 <!ELEMENT SCRIPT - - CDATA      -- script statements -->
 <!ATTLIST SCRIPT
   type     CDATA #IMPLIED  -- content type for script language --
   language CDATA #IMPLIED  -- predefined script language name --
   src      %URL  #IMPLIED  -- URL for an external script --
   >

To specify the default scripting language for all scripts in a document, include the following META declaration in the HEAD of the document:

 <META http-equiv="Content-Script-Type" content="type">

where type is an Internet media (also known as MIME [MIME]) type naming the scripting language. Examples of values include text/tcl, text/javascript, and text/vbscript.

It is also possible to specify the scripting language in each SCRIPT element via the type attribute. In the absence of a default scripting language specification, this attribute must be set on each SCRIPT element. When a default scripting language has been specified, the type attribute overrides it.
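
The interplay of the default and local declarations can be sketched in a single document. The script URL and the body of verify() are hypothetical; the first SCRIPT element relies on the default declared by the META element, while the second overrides it through its type attribute and takes its statements from an external file.

```html
<HTML>
<HEAD>
  <TITLE>Script language declarations</TITLE>
  <!-- Default language for every script in this document -->
  <META http-equiv="Content-Script-Type" content="text/javascript">
  <!-- Inline script: no type attribute, so the default applies -->
  <SCRIPT>
    function verify() {
      // Hypothetical check; a real form would inspect its controls here
      return true;
    }
  </SCRIPT>
  <!-- Local declaration overrides the default; the URL is a placeholder.
       Since src is set, any content of this element would be ignored -->
  <SCRIPT type="text/tcl" src="http://somesite.com/scripts/setup.tcl">
  </SCRIPT>
</HEAD>
<BODY>
  <P><INPUT type="button" value="Click Me" onclick="verify()"></P>
</BODY>
</HTML>
```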

Each scripting language has its own conventions for referring to HTML objects from within a script. Typically, however, the DOM is used to make such references [DOM]. According to this model, a script refers to an element by the name assigned to it through the name or id attribute. If both attributes are set, the name attribute takes precedence over the id attribute; otherwise, either one may be used.
