8.2. Markup Basics

An HTML or XHTML document is a plain-text document, encoded as ASCII or, more often, Unicode (e.g., UTF-8), that has been marked up with tags that indicate elements, along with other necessary declarations (such as the markup language it is written in). An element is a structural component (such as a paragraph) or a desired behavior (such as a line break). This section introduces the key components and behaviors of HTML documents, including elements, attributes, how elements may be nested, and information in a document that is ignored by browsers.

8.2.1. Elements

Elements are denoted in the text source by the insertion of special bracketed HTML tags. Most elements follow this syntax.

 <element-name>content</element-name> 

The element name appears in the start tag (also called the opening tag) and again in the end (or closing) tag, preceded by a slash (/). The end tag works something like an "off" switch for the element. Nothing within the brackets is displayed by the browser or other user agent. It is important to note that the element includes both the content and its markup (the start and end tags).

In XHTML, all element and attribute names must be lowercase. HTML is not case sensitive.


Consider this example of HTML markup that identifies the content at the beginning of this section as a second-level heading element and a paragraph element:

 <h2>Elements</h2>
 <p>Elements are denoted in the text source by the insertion of special bracketed HTML tags. Most elements follow this syntax.</p>

In HTML 4.01 and earlier, the end tag for some elements is optional, and the browser determines from context where the element ends. This practice is most common with the p (paragraph) element. Most browsers automatically end a paragraph when they encounter a new start tag. In XHTML, end tags are always required.


Some elements do not have content because they are used to provide a simple directive. These elements are said to be empty. The image element (img) is an example of such an element; it tells the browser to call a graphic file from an external location into the current page. Other empty elements include the line break (br), horizontal rule (hr), and elements that provide information about a document and don't affect its displayed content, such as the meta and base elements. Table 8-1 lists all the empty elements in HTML.

In HTML 4.01 and earlier, empty elements simply didn't have a closing tag. In XML, termination is required for all elements. The convention is to use a trailing slash within the tag to signify the element's termination, as in <img/>, <br/>, and <hr/>. For reasons of backward compatibility, it is recommended to add a space before the slash, as shown in Table 8-1. The space is necessary if you are sending your XHTML with the HTTP Content-Type of text/html.

Table 8-1. Empty elements

<area />        <frame />       <link />
<base />        <hr />          <meta />
<basefont />    <img />         <param />
<br />          <input />
<col />         <isindex />


An excellent resource for HTML element information is Index DOT Html (www.blooberry.com/indexdot/html/), which was created and is maintained by Brian Wilson. It provides an alphabetical listing of every HTML element and its attributes, with explanations, standards details, and browser support information.
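The trailing-slash convention for empty elements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name xhtml_empty_tag is hypothetical, the set of element names is copied from Table 8-1, and the file name pixie.gif is a placeholder.

```python
from html import escape

# Empty elements as listed in Table 8-1.
EMPTY_ELEMENTS = {
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
}

def xhtml_empty_tag(name, **attributes):
    """Return an XHTML-style empty tag such as <br /> (hypothetical helper)."""
    if name not in EMPTY_ELEMENTS:
        raise ValueError(f"{name} is not an empty element")
    # Quote every value and escape characters that would break the markup.
    attrs = "".join(
        f' {key}="{escape(value, quote=True)}"'
        for key, value in attributes.items()
    )
    # The space before the slash is the backward-compatible form.
    return f"<{name}{attrs} />"
```

Calling xhtml_empty_tag("br") yields <br />, and xhtml_empty_tag("img", src="pixie.gif", alt="pixie") yields an img tag with both attributes quoted and the element terminated by " />".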

8.2.2. Attributes

An attribute clarifies or modifies an element's actions. Attributes are indicated by attribute name and value pairs added to the start tag of the element (end tags never contain attributes). Attribute names and their accepted values are declared in the DTD; in other words, you cannot make up your own. You can add multiple attributes within a single opening tag. Attributes, if any, go after the tag name, each separated by one or more spaces. Their order of appearance is not important.

The syntax for an element with attributes is as follows:

 <element attribute="value">content</element> 

The following are examples of elements that contain attributes:

 <head profile="http://gmpg.org/xfn/11">...</head>
 <img src="/books/4/439/1/html/2/graphics/pixie.gif" alt="pixie" />
 <table summary="This is a conference schedule.">...</table>

Most browsers cannot handle attribute values more than 1,024 characters in length. Values may be case-sensitive, particularly filenames or URLs.

XHTML requires that all attribute values be enclosed in quotation marks. Single or double quotation marks may be used, as long as each value opens and closes with the same kind of mark. Using one style consistently throughout the document is good practice.

In HTML 4.01 and earlier, some values are permitted to go unquoted: specifically, a value that is a single word containing only letters (a-z or A-Z), digits (0-9), hyphens (-), periods (.), underscores (_), and colons (:). It is best practice to quote all values, regardless of the Recommendation you are following.

Be careful not to leave out the closing quotation mark, or all the content from the opening quotation mark until the browser encounters a subsequent quotation mark will be interpreted as part of the value and won't display in the browser. This is a simple mistake that can cause hours of debugging frustration.
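The quoting rules above can be expressed as a small Python check. This is a rough sketch, not a validator: the names may_go_unquoted and quote_attribute are hypothetical, and the character set is taken directly from the HTML 4.01 rule described in this section.

```python
import re

# Characters that HTML 4.01 allows in an unquoted attribute value:
# letters, digits, hyphens, periods, underscores, and colons.
UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")

def may_go_unquoted(value):
    """Return True if the value could appear without quotes in HTML 4.01."""
    return UNQUOTED_OK.fullmatch(value) is not None

def quote_attribute(name, value):
    """Always emit a double-quoted attribute, which is safe in HTML and XHTML."""
    # Escape ampersands first, then any double quotes inside the value.
    escaped = value.replace("&", "&amp;").replace('"', "&quot;")
    return f'{name}="{escaped}"'
```

A value such as center passes the unquoted test, while pixie image (which contains a space) does not. Because quote_attribute quotes every value, the distinction never has to be made when generating markup.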


8.2.3. Nested Elements

HTML elements may contain other elements. This is called nesting, and to do it properly, the entire element (including its markup) must be within the start and end tags of the containing element (the parent). Proper nesting is one of the criteria of a well-formed document (a requirement for XHTML).

In this example, list items (li) are nested within an unordered list element (ul).

 <ul>
   <li>Example 1</li>
   <li>Example 2</li>
 </ul>

A common mistake made when nesting elements is to close the parent element before the element it contains (its child) has been closed. This results in an incorrect overlapping of elements that would make an XHTML document malformed and may cause rendering problems for HTML documents. In this example, the elements are incorrectly nested because the strong element should have been closed before the a (anchor).

 INCORRECT:  <a href="#">Click <strong>here.</a></strong> 
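One way to detect this kind of overlap is to track open elements on a stack and confirm that each end tag matches the most recently opened element. The sketch below uses Python's standard html.parser module. The class and function names are hypothetical, and the check is deliberately strict in the XHTML sense: it assumes every non-empty element has an explicit end tag, so HTML documents that rely on optional end tags would be reported as improperly nested.

```python
from html.parser import HTMLParser

# Empty elements (Table 8-1) never contain other elements, so they are skipped.
EMPTY = {"area", "base", "basefont", "br", "col", "frame", "hr",
         "img", "input", "isindex", "link", "meta", "param"}

class NestingChecker(HTMLParser):
    """Record end tags that do not close the most recently opened element."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of elements that are still open
        self.errors = []         # end tags that arrived out of order

    def handle_starttag(self, tag, attrs):
        if tag not in EMPTY:
            self.open_elements.append(tag)

    def handle_endtag(self, tag):
        if tag in EMPTY:
            return
        if self.open_elements and self.open_elements[-1] == tag:
            self.open_elements.pop()
        else:
            # The end tag does not match the innermost open element.
            self.errors.append(tag)

def is_properly_nested(markup):
    """Return True if every element closes inside its parent."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return not checker.errors and not checker.open_elements
```

Given the incorrect example above, the checker sees </a> while strong is still the innermost open element and records an error. With the end tags swapped, so that </strong> comes before </a>, the same markup passes.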

8.2.4. Information Browsers Ignore

Some information in an HTML document, including certain markup, is ignored or has little to no impact on presentation when the document is viewed in a browser or other user agent. These include:


Line breaks

Line returns in the HTML document are treated as spaces, which then typically collapse with other spaces (see next point). Text and elements wrap continuously until they encounter a p or br element within the flow of the document text. Line breaks are displayed, however, when text is marked up as a preformatted (pre) element or styled with the white-space: pre property in a style sheet.


Tabs and multiple spaces

When a user agent encounters more than one consecutive space character in an HTML document, it displays them as a single space. So, if the document contains:

 far,            far                away 

the browser displays:

far, far away

Extra spaces can be added within the flow of text by using the non-breaking space character entity (&nbsp;). Multiple spaces are displayed, however, when text is marked up as preformatted text (pre) or with the white-space: pre property in a style sheet. Tabs in the source document are problematic for some browsers and are best avoided.


Empty p elements

Empty paragraph elements (<p>...</p> or <p> alone) with no intervening text are interpreted as redundant by all browsers and displayed as though they were only a single paragraph break. Most browsers display multiple br elements as multiple line breaks.


Unrecognized elements

A browser simply ignores any element it doesn't understand or that was incorrectly specified. Depending on the element and the browser, this can have varied results. Browsers typically ignore the unrecognized tags and display the contents of the element as though they were normal text, although some older browsers may display nothing at all.


Text in comments

Browsers do not display text between the special <!-- and --> delimiters used to denote a comment. Here are two sample comments:

 <!-- This is a comment -->
 <!-- This is a multiple line comment
      that ends here. -->

There must be a space after the initial <!-- and preceding the final -->, but you can put nearly anything inside the comment otherwise. You cannot nest comments. Comments are useful for leaving notes within a long HTML file, for example:

 <!-- navigation table starts here --> 

HTML markup that is contained within comments does not display, so comments are also useful for temporarily hiding content without permanently removing it from the document.
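Two of the behaviors in this list, whitespace collapsing and hidden comments, can be approximated with Python's standard html.parser module. This is a simplified sketch with hypothetical names. It treats all text as normal flow, so it does not model pre elements, the white-space style property, or the breaks introduced by p and br.

```python
import re
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect character data; comment text is kept separately and not shown."""

    def __init__(self):
        super().__init__()
        self.chunks = []    # text that a browser would render
        self.comments = []  # text inside <!-- ... -->, never rendered

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_comment(self, data):
        self.comments.append(data)

def visible_text(markup):
    """Return the text of the markup with runs of whitespace collapsed."""
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    # Line breaks and tabs count as spaces; each run becomes a single space.
    # Non-breaking spaces (&nbsp;) are not matched, so they are preserved.
    return re.sub(r"[ \t\r\n]+", " ", "".join(parser.chunks)).strip(" ")
```

Passing the "far, far away" source text from above returns it with single spaces, and any comment in the input is absent from the result.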




Web Design in a Nutshell
Web Design in a Nutshell: A Desktop Quick Reference (In a Nutshell (OReilly))
ISBN: 0596009879
EAN: 2147483647
Year: 2006
Pages: 325
