Section 10.3. The DOM HTML API | Learning JavaScript, 2nd Edition

10.3. The DOM HTML API

The core API works with any valid XML, including XHTML; the HTML API is specific to valid XHTML and HTML only. It consists of a set of HTML objects, each associated with a valid HTML element tag; all have properties and methods appropriate to the object.

Though they are separate sets of objects, the two models, core and HTML, overlap, with the HTML API objects incorporating methods and properties from both. As such, HTML API objects inherit the properties and methods of a basic HTML element, as well as those of the core Node object (discussed in the next section).

10.3.1. The HTML Objects and Their Properties

The HTML API is a set of interfaces rather than actual classes. These interfaces can access existing or newly created page objects, and each is associated with a specific type of page object.

I've introduced a new term, interface. For our purposes, an interface is an object representing the specific page element. It differs from a class in that there is no constructor; objects are created through other functions rather than directly.

Most HTML interface objects inherit the properties and methods of the Element and Node objects, both of which are part of the core model and are discussed later in the chapter. Most also inherit from HTMLElement, which has the following properties (based on the set of attributes of the same name allowed for all HTML elements): id, title, lang, dir, and className.
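As a minimal sketch of those shared HTMLElement properties, the following helper reads all five from whatever element object it is handed. The name describeElement is my own invention for illustration and is not part of the DOM; the only assumption is that the argument exposes id, title, lang, dir, and className.

```javascript
// Hypothetical helper (not part of the DOM): summarizes the properties
// that HTML element objects inherit from HTMLElement.
function describeElement(elem) {
   var names = ["id", "title", "lang", "dir", "className"];
   var parts = [];
   for (var i = 0; i < names.length; i++) {
      // a property that was never set is reported as an empty value
      parts.push(names[i] + "=" + (elem[names[i]] || ""));
   }
   return parts.join("; ");
}
```

In a page, something like alert(describeElement(document.getElementById("div1"))) should display the five values, assuming the page contains an element with an id of div1.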

Each interface object takes its name from the HTML formal element name, not necessarily the element tag. As such, HTMLFormElement is the HTML form element's interface object, but HTMLParagraphElement is the object for the paragraph (P) tag. The objects provide access to all valid attributes for the elements, such as align for HTMLDivElement, and src for HTMLImageElement.

Most of these properties are read and write, which means they can be altered as well as accessed from JavaScript. To demonstrate, in Example 10-1, an image is accessed using the document images collection. The image attributes are concatenated to a string which is then output via an alert. Following the message, the image attributes are modified.

Example 10-1. Reading and modifying image element's properties

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Accessing/Modifying HTML Elements</title>
<script type="text/javascript">
//<![CDATA[

function procImage(  ) {
   // the first image in the document's images collection
   var img = document.images[0];

   // get existing image attributes
   var imgAttr = img.align + " " + img.alt + " " + img.src
                 + " " + img.width + " " + img.height;
   alert(imgAttr);

   // modify
   img.src="/books/4/327/1/html/2/upright.gif";
   img.width="100";
   img.height="100";
   img.alt="Alternative";
   img.align="left";
   img.title="Upright";
}

//]]>
</script>
</head>
<body onload="procImage(  );">
<img src="/books/4/327/1/html/2/dotty.gif" alt="Dotty" />
</body>
</html>

Several of the DOM HTML interface objects also provide methods to create, remove, or otherwise modify the associated page elements. The table elements, in particular, have a set of such methods and associated objects. However, the process is somewhat code-intensive, made more so because of the fact (as mentioned in a note earlier) that the API objects have no constructor. To create new objects, you'll need to use one of the factory methods, as demonstrated in Example 10-2.

If you've not been exposed to programming languages that support interfaces, think of them as code wrappers that isolate the mechanics of the underlying objects. When working with an interface, the API provides methods, usually referred to as factory methods, that can create and return the objects they wrap.

In Example 10-2, an image and an empty HTML table are added to the document. When the document loads, a function is called that accesses the table and image using getElementById on the document object.

To add to the table, you call the insertRow method on the table element, passing in a value of -1, which appends the row to the end of the table. This method returns an object that implements the HTMLElement interface. Thanks to JavaScript's loose typing, this object also implements the HTMLTableRowElement interface.

Example 10-2. Outputting image properties to table using DOM HTML interfaces

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Build-o-Table</title>
<script type="text/javascript">
//<![CDATA[

function procImage(  ) {
   // get table and image
   var tbl = document.getElementById('table1');
   tbl.border="5px";
   tbl.cellPadding="5px";
   var img = document.getElementById("img1");
   img.vspace="10";

   // add a table row for the src attribute (-1 appends to the end)
   var row1 = tbl.insertRow(-1);

   // create two table cells
   var cell1 = row1.insertCell(0);
   var cell2 = row1.insertCell(1);

   // create text values
   var txtAttr1 = document.createTextNode("src");
   var txtAttr1Val = document.createTextNode(img.src);

   // append text values to cells
   cell1.appendChild(txtAttr1);
   cell2.appendChild(txtAttr1Val);
}

//]]>
</script>
</head>
<body onload="procImage(  );">
<img id="img1" src="/books/4/327/1/html/2/dotty.gif" alt="Dotty" />
<table id="table1">
</table>
</body>
</html>

There's a method on the HTMLTableRowElement interface, insertCell, which in turn creates another HTMLElement representing a specific table-row cell. Two such cells are created through insertCell: one for each TD (table data) element in the table.

To add text, the createTextNode factory method creates a text object from the string passed to it. The text object is appended to the table cell object using appendChild. (If you want to remove a row, call deleteRow on the table, passing in the row's index.)
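To cut down on the repetition, the row-building steps can be wrapped in a function. What follows is only a sketch: addAttributeRow and removeLastRow are hypothetical names of my own, and both assume they are handed the document object and an HTMLTableElement such as the one in Example 10-2.

```javascript
// Hypothetical helper: appends a two-cell row (label, value) to a table.
// doc is the document object; tbl is an HTMLTableElement.
function addAttributeRow(doc, tbl, label, value) {
   var row = tbl.insertRow(-1);           // -1 appends at the end
   var labelCell = row.insertCell(0);
   var valueCell = row.insertCell(1);
   labelCell.appendChild(doc.createTextNode(label));
   valueCell.appendChild(doc.createTextNode(String(value)));
   return row;
}

// Hypothetical helper: removes the last row of the table, if there is one.
function removeLastRow(tbl) {
   if (tbl.rows.length > 0) {
      tbl.deleteRow(tbl.rows.length - 1);
   }
}
```

With these in place, a call such as addAttributeRow(document, tbl, "width", img.width) adds one row per image attribute, and removeLastRow(tbl) takes the most recent one away again.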

As you can see, adding and removing objects in the web page using the DOM HTML API isn't complicated, but it can be tedious.

There are other DOM HTML interfaces that don't directly represent specific HTML elements. The collections of objects that can be accessed through the document object are represented by the HTMLCollection interface. It has one property, length, and two methods: item, which takes a number index, and namedItem, which takes a string. Both return objects in the collection.
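Here is a short sketch of walking a collection with nothing but length and item. The function name collectionTagNames is hypothetical, and the code assumes each item in the collection is an element with a tagName property.

```javascript
// Hypothetical helper: gathers the tagName of every item in an
// HTMLCollection, using only its length property and item method.
function collectionTagNames(coll) {
   var names = [];
   for (var i = 0; i < coll.length; i++) {
      names.push(coll.item(i).tagName);
   }
   return names;
}
```

Passing in document.images or document.forms should return an array with one tag name per item. To fetch a single member by name instead, a call such as document.forms.namedItem("elem6") returns the matching form, or null if the collection holds no item with that name.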

The HTMLOptionsCollection interface represents the list of options for a select element, itself represented by HTMLSelectElement. Accessing the options property on this latter interface returns the HTMLOptionsCollection object holding the options. As with HTMLCollection, access the individual items with item and namedItem.
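The same pattern applies to a select element's options. This is a sketch under the assumption that the function is handed an HTMLSelectElement; optionValues is a hypothetical name, not a DOM method.

```javascript
// Hypothetical helper: returns the value of each option in a select
// element, read through the element's options collection.
function optionValues(select) {
   var opts = select.options;
   var values = [];
   for (var i = 0; i < opts.length; i++) {
      values.push(opts.item(i).value);
   }
   return values;
}
```

Given a select element with three options, the function should hand back an array of three strings, in document order.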

The last interface object I'll cover is HTMLDocument. It inherits functionality from the core model Document object, and if you explored document in Chapter 9, you won't be surprised at the provided methods and properties. The images, applets, links, forms, and anchors properties each return a collection. Other properties include cookie, title, referrer, domain, URL, and body (for the body object).

The methods HTMLDocument exposes, again, will seem very familiar: open, close, write, and writeln. However, one that hasn't been demonstrated is getElementsByName, and we'll look at that next.

This page (http://www.w3.org/TR/DOM-Level-2-HTML/ecma-script-binding.html) at the W3C provides a look at the ECMAScript binding (JavaScript implementation) of the Level 2 HTML API.

10.3.2. Accessing HTML Objects and Browser Differences

There are different techniques you can use to access the DOM HTML representation of a page element. The first gives it a specific identifier (id) and then uses the document's getElementById method:

<div id="div1">
...
var div1 = document.getElementById("div1");

You can also access the elements using their relationship with one another. For instance, in the following HTML:

<form> <input type="text" /> </form>

Access the form field through the forms collection on the document object:

document.forms[0].elements[0];

We've looked at both approaches in previous examples. A third way to access an individual element is by using the document object's getElementsByName, and then passing in the element's name. This method returns a NodeList containing the nodes that share that name. All browsers support document.getElementsByName, but not all browsers return the same NodeList.

Example 10-3 uses getElementsByName to access all elements with given names within the web page. There are several different types of HTML elements, each given a unique name: a DIV element, a link, an unordered list and one of its items, a form and a form field, and a paragraph. Once the named list is returned, the element's type, found in the tagName property of each node, is concatenated to a string and output via a dialog window at the end of the application.

Example 10-3. Finding elements by name and printing out their associated tag name

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Modifying Named Elements</title>
<script type="text/javascript">
//<![CDATA[

function findName(  ) {
   // accumulate the tag names of all matched elements
   var typeStr = "";

   // get all elements named 'elem' + number
   for (var i = 1; i <= 7; i++) {
      var nmStr = "elem" + i;
      var nmList = document.getElementsByName(nmStr);

      // add each element's type to the string
      for (var j = 0; j < nmList.length; j++) {
         typeStr += nmList[j].tagName + " ";
      }
   }

   // output string
   alert(typeStr);
}

//]]>
</script>
</head>
<body onload="findName(  );">
<div name="elem1">
<ul name="elem2">
<li>option 1</li>
<li name="elem3">option 2</li>
</ul>
</div>
<a href="ch10-02.htm" name="elem4">Example 1</a>
<p name="elem5">Paragraph</p>
<form name="elem6">
<input type="text" name="elem7" />
</form>
</body>
</html>

As expected, this application works in Safari, Firefox, Netscape Navigator, Opera, and Internet Explorer, but the string returned differs.

Firefox, Safari, and Netscape Navigator return a string of:

DIV UL LI A P FORM INPUT

Opera and Internet Explorer return:

A FORM INPUT

Why the discrepancy? Well, in this case, Opera and Internet Explorer have it right. Running the page through the W3C validator, it doesn't validate as transitional XHTML (the current doctype), or when an override to HTML 4.01 is in effect. The reason is that the name attribute is not supported on DIV, UL, LI, and P tags, exactly the ones that IE and Opera did not list.

Another odd thing: valid HTML does not support multiple elements with the same name, though several browsers do. If I had given all the elements the same name, the example would still work with Firefox, Safari, and Navigator. This is a good example of how browser-specific JavaScript may forgive more than it should.

Internet Explorer has received a great deal of criticism in the last few years for its noncompliance with more universal norms. Much of it is deserved, as the industry struggled with cross-browser issues related to an old and outdated Internet Explorer 6.x. Many of the noncompliance issues are still not resolved in Internet Explorer 7+, though there is much improvement.

However, not all acts of noncompliance rest completely on IE. As this section demonstrated, sometimes a loose interpretation of a specification can be just as erroneous as a missing one.

One way around such browser differences is to avoid using the DOM HTML interfaces, code your web pages in compliant XHTML instead of HTML, and then use the Core API as much as possible.
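As one possible illustration of leaning on the Core API, the sketch below looks up elements by their name attribute using only getElementsByTagName and getAttribute, so the result should not depend on how a given browser interprets getElementsByName. The function findByNameAttribute is a hypothetical helper of my own, and it assumes the browser accepts "*" as a wildcard in getElementsByTagName.

```javascript
// Hypothetical helper: finds elements by their name attribute using only
// Core API calls (getElementsByTagName and getAttribute).
function findByNameAttribute(doc, name) {
   var all = doc.getElementsByTagName("*");   // every element in the document
   var matches = [];
   for (var i = 0; i < all.length; i++) {
      if (all[i].getAttribute("name") === name) {
         matches.push(all[i]);
      }
   }
   return matches;
}
```

Calling findByNameAttribute(document, "elem5") against the page in Example 10-3 should return an array holding the paragraph. The trade-off is that every element in the page is visited, which may be slow in a large document.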