2.1. The Anatomy of a Web Page

Web pages are written in HTML (HyperText Markup Language), the language of the Web. It doesn't matter whether your Web page contains a series of plain text blog entries, a dozen pictures of your pet lemur, or a heavily formatted screenplay: odds are that if you're looking at it in a browser, it's an HTML page.

HTML plays two key roles:

  • HTML tells a Web browser how to format a page. Although there are plenty of computer programs that can format text (take Microsoft Word, for instance), it's almost impossible to find a single standard that's supported on every type of computer, operating system, and Web-enabled device. HTML fills the gap by supplying information that any browser can interpret. These formatting details include specifications about colors, headings, text alignment, and so on.

  • HTML links different documents together. These links can take several forms. You can use hyperlinks (discussed in Chapter 8) to let people surf from one Web page to another. You can also use HTML instructions to call up pictures (Chapter 7) or even other Web pages (Chapter 10) and combine them into a single Web page.

HTML is such an important standard that you'll spend a good portion of this book digging through most of its features, frills, and shortcomings. Every Web page you'll build along the way is a bona fide HTML document.

2.1.1. Cracking Open an HTML Document

On the inside, an HTML page is nothing more than a plain-vanilla text file. That means every Web page consists entirely of letters, numbers, and a few special characters (like spaces, punctuation, and everything else you can spot on your keyboard). This file is quite different from what you'd find if you cracked open a typical binary file on your computer. (A binary file contains genuine computer language: a series of 1s and 0s. If another program is foolish enough to try to convert this binary information into text, you end up with gibberish.)
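To make the text-versus-binary distinction concrete, here is a minimal Python sketch. The function name looks_like_text and both sample values are made up for illustration, and the check itself is only a rough heuristic (it assumes text files are UTF-8 and contain no NUL bytes), not a definitive test.

```python
def looks_like_text(data: bytes) -> bool:
    """Rough heuristic: plain text decodes as UTF-8 and has no NUL bytes."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A hypothetical HTML fragment, stored exactly as you would type it.
html_bytes = "<h1>Popsicles</h1>\n<p>Cold and sweet.</p>\n".encode("utf-8")

# Arbitrary bytes standing in for the start of a binary file.
binary_bytes = bytes([0xD0, 0xCF, 0x11, 0xE0, 0x00, 0x00, 0xFF, 0xFE])

print(looks_like_text(html_bytes))    # the HTML decodes cleanly
print(looks_like_text(binary_bytes))  # the binary data does not
```

A text editor effectively attempts the same decoding step on whatever file you hand it, which is why the HTML file reads fine and the binary file turns into gibberish.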

To understand the difference, take a look at Figure 2-1, which examines a Word document under the microscope. Compare that with what you see in Figure 2-2, which dissects an HTML document containing the same content.

To take a look at an HTML document, all you need is an ordinary text editor, like Notepad, which is included on all Windows computers. To run Notepad, click the Start button and select Programs → Accessories → Notepad. Then choose File → Open and begin hunting around for the HTML file you want. On the Mac, try TextEdit, which you can find at Applications → TextEdit. Choose File → Open and then find the HTML file. If you've downloaded the companion content for this book (all of which you'll find on the "Missing CD" page at www.missingmanuals.com), try opening the popsicles.htm file, shown in Figure 2-2.

Figure 2-1. Word documents are stored as binary information, as are documents in most file formats used by most computer programs.
Top: Even if your document looks relatively simple in the Word window, it doesn't look nearly as pretty when you bypass Word and open the file in an ordinary text editor like Notepad or TextEdit.
Bottom: Depending on the program you use, the string of ones and zeroes in the file is usually converted into a meaningless stream of intimidating gibberish. The actual text is there somewhere, but it's buried in computer gobbledygook.


Unfortunately, most text editors don't let you open a Web page directly from the Internet. In order to do that, they'd need to be able to send a request over the Internet to a Web server, which is a job that's best left to the Web browser. However, most browsers do give you the chance to look at the raw HTML for a Web page. Here's what you need to do:

  1. Open your preferred browser.

  2. Navigate to the Web page you want to examine.

  3. In your browser, look for a menu command that allows you to view the source content of the Web page. In Internet Explorer (or Opera), select View → Source. In Firefox and Netscape, use View → Page Source. In Safari, View → View Source does the trick. Isn't diversity a wonderful thing?

Once you make your selection, a new window appears showing you the HTML used to create the Web page. This window may represent a built-in text viewer that's included with the browser, or it may just be Notepad or TextEdit. Either way, you'll see the raw HTML.
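If you're curious what "view source" amounts to, the following Python sketch does roughly the same job: it requests a page and returns the raw HTML without rendering it. The function name fetch_source and the example.com address are placeholders, and falling back to UTF-8 when the server declares no encoding is an assumption rather than a rule.

```python
from urllib.request import urlopen


def fetch_source(url: str) -> str:
    """Return the raw HTML at `url` as text, without rendering it."""
    with urlopen(url) as response:
        raw = response.read()
        # Assumption: fall back to UTF-8 if no encoding is declared.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")


# Example (requires an Internet connection; the address is a placeholder):
# print(fetch_source("https://example.com/")[:300])
```

What comes back is the same plain text a browser's source viewer shows, angle brackets and all.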

Figure 2-2. HTML documents are stored as ordinary text.
Top: What you see in the Web browser is much easier to understand than what you see in an ordinary text editor.
Bottom: You can easily spot all the text from the original, along with a few extra pieces of information inside angled brackets (< >). These are HTML tags.
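As a rough illustration of the split the caption describes, this Python sketch separates the tags in a small page from the text a reader would see. The sample markup is hypothetical (it is not the actual popsicles.htm file), and the class name TagAndTextSplitter is made up for this example.

```python
from html.parser import HTMLParser


class TagAndTextSplitter(HTMLParser):
    """Collect tag names and visible text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Called once for each opening tag, such as <h1> or <p>.
        self.tags.append(tag)

    def handle_data(self, data):
        # Called for the text between tags; skip whitespace-only runs.
        if data.strip():
            self.text.append(data.strip())


# Hypothetical page content, not the real popsicles.htm.
sample = "<html><body><h1>Popsicles</h1><p>Cold and <b>sweet</b>.</p></body></html>"

splitter = TagAndTextSplitter()
splitter.feed(sample)
splitter.close()
print(splitter.tags)  # the names found inside angle brackets
print(splitter.text)  # the words a reader actually sees
```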



Tip: Firefox has a handy feature that lets you home in on part of the HTML in a complex page. Just select the text you're interested in on the page, right-click it, and then choose View Selection Source.

Most Web pages are considerably more complex than the popsicles.htm example shown in Figure 2-2, so you'll need to wade through many more HTML tags. But once you've acclimated yourself to the jumble of information, you'll have an extremely useful way to peer under the covers of any Web page. In fact, professional Web developers often use this trick to check out the snazziest work of their competitors.

POWER USERS' CLINIC
Going Beyond HTML

The creators of HTML designed it perfectly for putting research papers and other unchanging documents on the Web. They didn't envision a world of Internet auctions, e-commerce shops, and browser-based games. To add all these features to the modern Web browsing experience, crafty people have supplemented HTML with some tricky workarounds. And although it's more than a little confusing to consider all the ways you can extend HTML, doing so is the best way to really understand what's possible on your own Web site.

Here's an overview of the two most common ways to go beyond HTML:

  • Embedded applications. Most modern browsers support Java applets, which are small programs that run inside your Web browser and display information in a window inside a Web page. (To try one out and play some head-scratching Java Checkers against a computer opponent, surf to http://thinks.com/java/checkers/checkers.htm.) Internet Explorer can also host special tools called ActiveX controls. ActiveX is a Microsoft-backed technology for sharing useful widgets between different programs and Web pages. (To see an ActiveX control in use, check out TrendMicro's free virus scanner at http://housecall.trendmicro.com.) Both Java applets and ActiveX controls are miniature programs that can be used in a Web page (if the browser supports them), but neither is written in HTML.

  • Browser plug-ins. Browsers are designed to deal with HTML, and they don't recognize other types of content. For example, browsers don't have the ability to interpret an Adobe PDF document, which is a specialized format used to preserve the formatting of documents. However, depending on how your browser is configured, you may find that when you click a hyperlink that points to a PDF file, a PDF reader launches. The automatic launch happens if you've installed a plug-in from Adobe that runs the Acrobat software (which displays PDF files). (To see for yourself, request the sample chapter www.oreilly.com/catalog/exceltmm/chapter/ch04.pdf from Excel: The Missing Manual.) Another example of a common plug-in is Macromedia Flash, which shows animations on a Web page. If you surf to a page that includes a Flash animation and you don't have the plug-in, you'll be asked if you want to download it. (Check out www.orsinal.com to play some of the best free Flash games around.)

Unfortunately, there's no surefire way to tell what extensions are at work on a particular page. In time, you'll learn to spot many of the telltale signs, because each type of content looks distinctly different.
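One telltale sign does show up in the page source: applets, ActiveX controls, and plug-in content are usually called up with the applet, object, or embed tags. The Python sketch below scans a page for those tags. It is a hint rather than a surefire detector, and the function name spot_embedded, the sample markup, and the file names in it are all hypothetical.

```python
from html.parser import HTMLParser

# Tags commonly used to pull non-HTML content into a page.
EMBED_TAGS = {"applet", "object", "embed"}


class EmbedSpotter(HTMLParser):
    """Record every embedding tag encountered in a page."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in EMBED_TAGS:
            self.found.append(tag)


def spot_embedded(html: str) -> list:
    """Return the embedding tags found in `html`, in document order."""
    spotter = EmbedSpotter()
    spotter.feed(html)
    spotter.close()
    return spotter.found


# Hypothetical markup; the file names are invented for illustration.
sample = (
    "<p>Play a game:</p>"
    '<applet code="Checkers.class" width="300" height="300"></applet>'
    '<embed src="game.swf">'
)
print(spot_embedded(sample))
```

An empty result doesn't prove a page is pure HTML, since extensions can be wired in by other means.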


2.1.2. Creating Your Own HTML Files

Here's one of the best-kept secrets of Web page writing: You don't need a live Web site to start creating your own Web pages. That's because you can easily build and test Web pages using only your own computer. In fact, you don't even need an Internet connection.

The basic approach is simple:

  1. Fire up your favorite text editor.

  2. Start writing HTML content.

    Of course, this part is a little tricky because you haven't explored the HTML standard yet. Hang on; help is on the way in the next section.

  3. When you've finished your Web page, save the document (a simple File → Save usually does it).

    By convention, HTML documents typically have the file extension .htm or .html, as in LimeGreenPyjamas.html. Strictly speaking, these extensions aren't necessary, because browsers are perfectly happy displaying Web pages with any file extension. The only rule is that the file has to contain valid HTML content. However, using the .htm or .html file extension is still a good idea; not only does it save confusion, it also helps your computer recognize that the file contains HTML in other situations. For example, when you double-click a file with the .htm or .html extension, it opens in your Web browser automatically.

  4. To take a look at your work, open the file in a Web browser.

    If you've used the extension .htm or .html, it's usually as easy as double-clicking the file. If not, you may need to type in the full file path in your Web browser's address bar, as shown in Figure 2-3.

    Remember, when you compose your HTML document in a text editor, you won't be able to see what the formatting actually looks like. All you'll see is the plain text and the HTML formatting instructions.


Tip: If you change and save the file after you open it in your Web browser, you can take a look at your recent changes by hitting the Refresh button.
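The write, save, and open steps above can also be scripted. Here is a minimal Python sketch under a few assumptions: the page markup is placeholder HTML (the tags themselves are explained in the next section), and the function names save_page and preview are invented for this example. Nothing here needs a Web site or an Internet connection.

```python
from pathlib import Path
import webbrowser

# Placeholder HTML; any valid page content would do.
PAGE = """<html>
<head><title>Lime Green Pyjamas</title></head>
<body>
<h1>Lime Green Pyjamas</h1>
<p>My first Web page.</p>
</body>
</html>
"""


def save_page(filename: str = "LimeGreenPyjamas.html") -> Path:
    """Write the page as plain text and return its full path."""
    path = Path(filename).resolve()
    path.write_text(PAGE, encoding="utf-8")
    return path


def preview(path: Path) -> bool:
    """Ask the default browser to open the saved file."""
    # as_uri() turns the full file path into a file:// address,
    # the kind you could also type into the address bar.
    return webbrowser.open(path.as_uri())


# Typical use:
# page_path = save_page()
# preview(page_path)
```

After editing PAGE and calling save_page again, hitting Refresh in the browser shows the new version.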


Creating Web Sites: The Missing Manual
ISBN: B0057DA53M
EAN: N/A
Year: 2003
Pages: 135
