Chapter 1: Introduction to HTML and XHTML



Markup is information that is added to a document to convey information about the document's structure or presentation. Markup languages are all around us in everyday computing. While you may not see it, word processing documents are filled with codes indicating the structure and presentation of the document. What you see on your screen just looks like a page of text, but the formatting is done "behind the scenes" by the markup. Hypertext Markup Language (HTML) and its successor, XHTML, are the not-so-behind-the-scenes markup languages that are used to tell Web browsers how to structure and, some may say, display Web pages.

First Look at HTML

In the case of HTML, markup commands applied to your Web-based content relay the structure of the document to the browser software and, though perhaps unfortunate at times, how you want the content to be displayed. For example, if you want to show that a section of text is important, you surround the corresponding text with the markup tags <strong> and </strong>, as shown here:

  <strong>This is important text!</strong>

When a Web browser reads a document that has HTML markup in it, it determines how to render the document onscreen by considering the HTML elements embedded within it (see Figure 1-1). Be aware that browsers don't always render things in the way that you think they will. This is due partially to the design of HTML and partially to the differences in the variety of Web browsers currently in use.

Figure 1-1: Interpretation of Web page with HTML markup

So we see that an HTML document is simply a text file that contains the information you want to publish and the appropriate markup instructions indicating how the browser should structure or present the document. These markup elements are made up of a start tag such as <strong>, and also might include an end tag, which is indicated by a slash within the tag such as </strong>. The tag pair should fully enclose any content to be affected by the element, including text and other HTML markup. However, under traditional HTML (not XHTML) some HTML elements have optional close tags because their closure can be inferred. Other HTML elements, called empty elements, do not enclose any content, and thus need no close tags at all, or in the case of XHTML, use a self-closing form. For example, to insert a line break, use the <br> tag, which represents the empty br element as it doesn't enclose any content and has no corresponding close tag.

  <br>  

An unclosed tag is not allowed in XHTML, however, so we need to close the tag like so:

  <br></br>  

More often, we use a self-closing form like so:

  <br />

The start tag of an HTML element might contain attributes that modify the meaning of the tag. The inclusion of the noshade attribute in the <hr> tag, as shown here,

  <hr noshade>  

indicates that there should be no shading applied to the horizontal rule element. Under XHTML, such existence-style attributes, which are written as a bare name with no value, are not allowed. All attributes must have a value, so instead we use a syntax like the following:

  <hr noshade="noshade" />  

As the last example shows, attributes require values, which are specified with an equal sign; these values should be enclosed within double or single quotes. For example,

  <img src="logo.gif" alt="Demo Company" height="100" width="100" />  

specifies four attributes for the <img> tag that are used to provide more information about the use of the included image. A complete overview of the structure of HTML elements is shown here:

[Figure: overview of the structure of an HTML element]
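The closing and attribute rules just described can be checked mechanically. The following is a minimal sketch, not part of the original text, that uses Python's standard xml.etree.ElementTree module to test whether each form shown so far is well-formed XML. The helper name is_well_formed is our own invention, and note the limits of the check: well-formedness is only the baseline that XHTML requires, so passing it does not mean a fragment is valid against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Hypothetical helper: True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Markup forms taken from the discussion above.
samples = [
    "<br>",                      # traditional HTML empty element, never closed
    "<br></br>",                 # explicitly closed
    "<br />",                    # self-closing form
    "<hr noshade>",              # attribute written without a value
    '<hr noshade="noshade" />',  # attribute with a quoted value
]

results = {sample: is_well_formed(sample) for sample in samples}
for sample, ok in results.items():
    print(f"{sample!r:32} well-formed XML: {ok}")
```

The two traditional HTML forms are rejected by the XML parser, while the closed and fully valued forms are accepted, which is the practical difference between the two syntaxes at this level.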

Given these basic rules for HTML tags, it is best now to simply look at an example document to see how they are used. Our first complete example written in transitional HTML 4 is shown here:

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
  <html>
  <head>
  <title>First HTML Example</title>
  </head>
  <body>
  <h1>Welcome to the World of HTML</h1>
  <hr>
  <p>HTML <b>really</b> isn't so hard!</p>
  <p>You can put in lots of text if you want to. In fact, you could keep on typing and make up more sentences and continue on and on.</p>
  </body>
  </html>

In the case of XHTML, which is a stricter and cleaner version of HTML, we really don't see many changes yet, as shown in the example that follows:

  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
  <head>
  <title>First XHTML Example</title>
  </head>
  <body>
  <h1>Welcome to the World of XHTML</h1>
  <hr />
  <p>XHTML <b>really</b> isn't so hard!</p>
  <p>You can put in lots of text if you want to. In fact, you could keep on typing and make up more sentences and continue on and on.</p>
  </body>
  </html>

The preceding examples use some of the most common elements found in HTML documents:

  • The <!DOCTYPE> statement indicates the particular version of HTML or XHTML being used in the document. In the first example, the transitional 4.01 specification was used, while in the second, the transitional XHTML 1.0 specification was employed.

  • The <html>, <head>, and <body> tag pairs are used to specify the general structure of the document. Notice that under XHTML you need to have a little more information about the language you are using.

  • The <title> and </title> tag pair specifies the title of the document that generally appears in the title bar of the Web browser.

  • The <h1> and </h1> header tag pair creates a headline indicating some important information.

  • The <hr /> tag, which has no end tag and is therefore written in self-closing form under XHTML, inserts a horizontal rule, or bar, across the screen.

  • The <p> and </p> paragraph tag pair indicates a paragraph of text.
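To see the elements in this list the way a program reading the markup sees them, the following sketch feeds a shortened copy of the first HTML example to Python's standard html.parser module and records the document type declaration, each start tag, and the title text. The class name OutlineParser and the constant FIRST_EXAMPLE are illustrative names of our own, and a real browser does far more than this; treat it as a rough picture of the first step of interpretation only.

```python
from html.parser import HTMLParser

# A shortened copy of the first HTML example from the text.
FIRST_EXAMPLE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>First HTML Example</title>
</head>
<body>
<h1>Welcome to the World of HTML</h1>
<hr>
<p>HTML <b>really</b> isn't so hard!</p>
</body>
</html>"""


class OutlineParser(HTMLParser):
    """Hypothetical helper: records the doctype, start tags, and title text."""

    def __init__(self):
        super().__init__()
        self.doctype = None
        self.start_tags = []
        self.title = ""
        self._in_title = False

    def handle_decl(self, decl):
        # Receives the text between "<!" and ">" of the declaration.
        self.doctype = decl

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Collect character data only while inside the title element.
        if self._in_title:
            self.title += data


parser = OutlineParser()
parser.feed(FIRST_EXAMPLE)
parser.close()
print(parser.start_tags)
print(parser.title)
```

Notice that the parser reports the hr element like any other start tag even though no end tag ever follows it, which is exactly the leniency of traditional HTML that XHTML removes.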

If you are using a text editor, you could type in the previous listing and save it with a filename such as "firstexample.htm" or "firstexample.html." For a browser to read your file properly off your local disk, it must end either in the .htm or .html extension. If you don't save your file with the appropriate extension, the browser probably won't attempt to interpret the HTML markup. When this happens, the markup elements may appear in the browser window, as shown in Figure 1-2. However, note that some browsers will let you get away with bad file extensions locally, but on a server you could run into problems.
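The dependence on file extensions can be illustrated outside the browser as well. As a rough analogy, and not a description of how any particular browser or Web server decides, Python's standard mimetypes module maps a filename to a content type based on its extension; Python's own built-in http.server uses this same module when serving files. The filenames below are the ones suggested above, plus a hypothetical firstexample.txt for contrast.

```python
import mimetypes

# Filenames from the text, plus a hypothetical .txt copy for contrast.
filenames = ("firstexample.htm", "firstexample.html", "firstexample.txt")

# guess_type returns a (content_type, encoding) pair; keep only the type.
guessed = {name: mimetypes.guess_type(name)[0] for name in filenames}

for name, content_type in guessed.items():
    print(f"{name:20} -> {content_type}")
```

A file whose extension maps to a plain-text type is one that software is likely to display as raw markup rather than interpret as HTML.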

Figure 1-2: Raw HTML mistakenly displayed in browser window

After you save the example file on your system, open it in your browser by using the Open, Open Page, or Open File command, which should be found in the browser's File menu. After your browser reads the file, it should render a page like the one shown in Figure 1-3.

Figure 1-3: An HTML page displayed in a browser

If your page does not display properly, review your file to make sure that you typed in the markup correctly. If you find a mistake and make a change to the file, save the file, go back to your browser, and click the Reload or Refresh button. Sometimes the browser will still reload the page from its memory cache; if a page does not update correctly on reload, hold down the SHIFT key while clicking the Reload button, and the browser should refetch the page. During this simple test, it's a good idea to keep the browser and text editor open simultaneously to avoid having to constantly reopen one or the other. Once you get the hang of HTML design, you'll see that, at this raw level, it is much like the edit, compile, and run cycle so familiar to programmers. However, you certainly don't want to use this manual process to develop Web pages because it can be tedious, error-prone, and inefficient for page structure and visual design. As a way to learn the language, however, it works fine. Better approaches to HTML document creation are discussed in Chapter 2.

Given the simple example just presented, you might surmise that learning HTML is merely a matter of learning the multitude of markup tags, such as <b>, <i>, <p>, and so on, that specify the format and/or structure of documents to browsers. While this certainly is an important first step, it trivializes the role markup plays on the Web, and would be similar to trying to learn writing and print publishing by understanding only the various commands available in Microsoft Word, while disregarding page layout, document structure, and output formats. Similarly on the Web, in addition to learning the various markup tags, you need to consider document structure, visual design and page layout, client and server-side programming, navigation and interface design, and the method by which Web pages are actually delivered. These topics are discussed only to a limited degree in this book as they intersect with HTML and XHTML. Interested readers are encouraged to consult Web Design: The Complete Reference, Second Edition (Powell, 2002), which presents these topics and many others required for full site creation. For now, let's concern ourselves primarily with understanding basic HTML syntax, as that alone can be a challenge to fully master.

HTML: A Structured Language

HTML has a very well-defined syntax; all HTML documents should follow a formal structure. The World Wide Web Consortium (W3C) is the primary organization that attempts to standardize HTML (as well as many other technologies used on the Web). To provide a standard, the W3C must carefully specify all aspects of the technology. In the case of HTML, this means precisely defining the elements in the language. The W3C has defined HTML as an application of the Standard Generalized Markup Language (SGML). In short, SGML is a language used to define other languages by specifying the allowed document structure in the form of a document type definition (DTD), which indicates the syntax that can be used for the various elements of a language such as HTML. In 1999, the W3C rewrote HTML as an application of XML (Extensible Markup Language) and renamed it XHTML. XML, in this situation, serves the same purpose as SGML: a language in which to write the rules of a language. In fact, XML is in some sense just a limited form of SGML. The implications of this change are significant and will be touched upon throughout the book. I present both languages and compare them in the chapters that follow so that you can better appreciate the changes. While some standards zealots may disagree with the mere presentation of HTML in the traditional or even loose "tag soup" form, it is still by far the most dominant form of markup on the Web, and I would be remiss in not covering it properly.

From the HTML 4.01 DTD, a basic template can be derived for a basic HTML document, as shown here:

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
  <html>
  <head>
  <title>Document Title Goes Here</title>
  ...Head information describing the document and providing
  supplementary information goes here....
  </head>
  <body>
  ...Document content and markup go here....
  </body>
  </html>

The first line of the template is the <!DOCTYPE> indicator, which shows the particular version of HTML being used; in this case, 4.01 transitional. Within the <html> tag, the basic structure of a document reveals two primary sections: the "head" and the "body." The head of the document, as indicated by the head element, contains information and tags describing the document such as its title. The body of the document, as indicated by the body element, contains the document itself with associated markup required for structure or presentation. The structure of an XHTML document is pretty much the same with the exception of a different <!DOCTYPE> indicator and a few extra attributes added to the <html> tag.

  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
  <title>Document Title Goes Here</title>
  ...Head information describing the document and providing
  supplementary information goes here....
  </head>
  <body>
  ...Document content and markup go here....
  </body>
  </html>
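Because the XHTML template is also an XML document, its head-and-body structure can be confirmed with an ordinary XML parser. The sketch below, which is our own illustration rather than anything from the specification, fills the template with a title and some body markup and then inspects the result using Python's standard xml.etree.ElementTree module. The helper build_xhtml and the sample title are hypothetical, and the check covers well-formedness and structure only; the parser does not fetch the DTD or validate against it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The XHTML 1.0 Transitional template from the text, with placeholders.
XHTML_TEMPLATE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="{ns}" xml:lang="en" lang="en">
<head>
<title>{title}</title>
</head>
<body>
{body}
</body>
</html>"""


def build_xhtml(title: str, body_markup: str) -> str:
    """Hypothetical helper: fill the template.

    The title is escaped as text; body_markup is inserted as-is and is
    assumed to already be well-formed XHTML.
    """
    return XHTML_TEMPLATE.format(ns=XHTML_NS, title=escape(title), body=body_markup)


page = build_xhtml("Tips & Tricks", "<p>Document content and markup go here.</p>")

# Parse the page and list the local names of the root's child elements.
root = ET.fromstring(page)
sections = [child.tag.split("}")[1] for child in root]
title_text = root.find(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title").text
print(sections)
print(title_text)
```

The parser reports every element name qualified by the namespace given in the xmlns attribute, which is one concrete effect of the extra attributes XHTML adds to the <html> tag.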

Alternatively, in either HTML or XHTML we might replace the <body> tag with a <frameset> tag, which encloses potentially numerous <frame> tags corresponding to individual portions of the browser window, termed frames. Each frame in turn would reference another HTML/XHTML document containing either a standard document complete with <html>, <head>, and <body>, or perhaps yet another framed document. The <frameset> tag also should include a <noframes> tag that provides a version of the page for browsers that do not support frames. Within this element, a <body> tag often appears to hold the content for non-frame-supporting browsers. An example template for a frameset document is shown here, though we omit the XHTML version, which is nearly identical. Note that the DTD for a framed document is different from that of a normal document.

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
    "http://www.w3.org/TR/html4/frameset.dtd">
  <html>
  <head>
  <title>Document Title Goes Here</title>
  ...Head information describing the frameset and providing
  supplementary information goes here...
  </head>
  <frameset>
  numerous frame elements here
  <noframes>
  <body>
  ...Alternative content for non-frame aware browsers...
  </body>
  </noframes>
  </frameset>
  </html>

Framed documents are discussed in greater depth in Chapter 8. For now, let's concentrate on a typical document template of <!DOCTYPE>, <html>, <head>, and <body> and examine each piece more in depth.





HTML & XHTML
HTML & XHTML: The Complete Reference (Osborne Complete Reference Series)
ISBN: 007222942X
Year: 2003
Pages: 252
Authors: Thomas Powell
