17.2. Cleaning Up After Your HTML Editor

Although you can create and edit HTML/XHTML documents with a text editor, such as vi or Notepad, most HTML authors use an application that is designed for creating web pages. Several are free of charge, many offer a free evaluation period, and most are available for download over the Web. Be forewarned, though: in our experience, you will rarely (if ever) be able to create a web document with one of these editors without having to inspect, add to, edit, and sometimes even repair the source HTML that the editor generates. The following sections discuss a few things that you should know about and watch out for.

17.2.1. Where Did My Document Go?

One of the first things you will notice is that many of the HTML editors automatically introduce into your document markup that you did not explicitly select or write. Remember this very simple HTML document that we started with in Chapter 2?

<html>
<head>
<title>My first HTML document</title>
</head>
<body>
<h2>My first HTML document</h2>
Hello, <i>World Wide Web!</i>
 <!-- No "Hello, World" for us -->
<p>
Greetings from<br>
<a href="http://www.ora.com">O'Reilly Media</a>
<p>
Composed with care by:
<cite>(insert your name here)</cite>
<br>&copy;2000 and beyond
</body>
</html>

Here is what the source looks like after you load it into Microsoft Word from Office XP:

<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 10">
<meta name=Originator content="Microsoft Word 10">
<link rel=File-List href="html_files/filelist.xml">
<title>&lt;html&gt;</title>
<!--[if gte mso 9]><xml>
 <w:WordDocument>
  <w:Compatibility>
   <w:BreakWrappedTables/>
   <w:SnapToGridInCell/>
   <w:WrapTextWithPunct/>
   <w:UseAsianBreakRules/>
  </w:Compatibility>
  <w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
 </w:WordDocument>
</xml><![endif]-->
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
    {mso-style-parent:"";
    margin:0in;
    margin-bottom:.0001pt;
    mso-pagination:widow-orphan;
    font-size:12.0pt;
    font-family:"Times New Roman";
    mso-fareast-font-family:"Times New Roman";}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
    {margin:0in;
    margin-bottom:.0001pt;
    mso-pagination:widow-orphan;
    font-size:10.0pt;
    font-family:"Courier New";
    mso-fareast-font-family:"Times New Roman";}
@page Section1
    {size:8.5in 11.0in;
    margin:1.0in 65.95pt 1.0in 65.95pt;
    mso-header-margin:.5in;
    mso-footer-margin:.5in;
    mso-paper-source:0;}
div.Section1
    {page:Section1;}
-->
</style>
<!--[if gte mso 10]>
<style>
 /* Style Definitions */
 table.MsoNormalTable
    {mso-style-name:"Table Normal";
    mso-tstyle-rowband-size:0;
    mso-tstyle-colband-size:0;
    mso-style-noshow:yes;
    mso-style-parent:"";
    mso-padding-alt:0in 5.4pt 0in 5.4pt;
    mso-para-margin:0in;
    mso-para-margin-bottom:.0001pt;
    mso-pagination:widow-orphan;
    font-size:10.0pt;
    font-family:"Times New Roman";}
</style>
<![endif]-->
</head>
<body lang=EN-US style='tab-interval:.5in'>
<div class=Section1>
<p class=MsoPlainText>&lt;html&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;head&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;title&gt;My first HTML document&lt;/title&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;/head&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;body&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;h2&gt;My first HTML document&lt;/h2&gt;<o:p></o:p></p>
<p class=MsoPlainText>Hello, &lt;i&gt;World Wide Web!&lt;/i&gt;<o:p></o:p></p>
<p class=MsoPlainText><span style='mso-spacerun:yes'> </span>&lt;!-- No &quot;Hello, World&quot; for us --&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;p&gt;<o:p></o:p></p>
<p class=MsoPlainText>Greetings from&lt;br&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;a href=&quot;http://www.ora.com&quot;&gt;O'Reilly Media&lt;/a&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;p&gt;<o:p></o:p></p>
<p class=MsoPlainText>Composed with care by: <o:p></o:p></p>
<p class=MsoPlainText>&lt;cite&gt;(insert your name here)&lt;/cite&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;br&gt;&amp;copy;2000 and beyond<o:p></o:p></p>
<p class=MsoPlainText>&lt;/body&gt;<o:p></o:p></p>
<p class=MsoPlainText>&lt;/html&gt;</p>
</div>
</body>
</html>

Yeow! Where did the document go? The excessive markup makes the source almost impossible for a human to read. What infuriates document purists like us, beyond the fact that lots of stuff that we neither wanted nor asked for was added, is that Word automatically treats any text document containing HTML markup as fodder for its mill. You can remove the .html or .htm suffix from the filename or delete <html> and <head> from the document, to no avail: Word will still get you.

Microsoft isn't alone in cluttering the source. Most HTML editors add at least a <meta> tag that contains their product information. Many also go through and "fix" your document to comply with current standards and practices; for example, by adding all those paragraph and list-item end tags that HTML allows you to omit. (From an XHTML standpoint, we admit that this meddling is probably valid.)
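To make that concrete, here is a rough sketch of how the Chapter 2 sample shown earlier might look after a milder editor has been through it. The product string "ExampleEditor 1.0" is invented for illustration, and real editors differ in exactly which tags they add; the point is only the pattern of an added <meta> tag plus explicit paragraph end tags.

```html
<html>
<head>
<!-- Added by the editor; "ExampleEditor 1.0" is a hypothetical product string -->
<meta name="generator" content="ExampleEditor 1.0">
<title>My first HTML document</title>
</head>
<body>
<h2>My first HTML document</h2>
Hello, <i>World Wide Web!</i>
 <!-- No "Hello, World" for us -->
<p>
Greetings from<br>
<a href="http://www.ora.com">O'Reilly Media</a>
</p> <!-- end tag added by the editor; HTML lets you omit it -->
<p>
Composed with care by:
<cite>(insert your name here)</cite>
<br>&copy;2000 and beyond
</p> <!-- end tag added by the editor -->
</body>
</html>
```

Compared with the Word output, this kind of cleanup is easy to live with: the content is still recognizable, and you can delete the <meta> tag by hand if you object to it.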

To its credit, Word runs well, unlike other tools that routinely crashed without warning as we fought with their treatment of the markup. Microsoft even offers a Word plug-in that removes the additional markup so that you can recover a reasonable facsimile of the original document. [*]

[*] You can find this plug-in at http://office.microsoft.com/downloads/2000/Msohtmf2.aspx.

17.2.2. When and Why to Edit the Editor

No matter how good the HTML editor is, you'll inevitably have to edit the (albeit cluttered) source it generates. We've had to do it a lot ourselves, and so have all the web developers we've talked with over the last few years.

Not all HTML editors provide an easy means to add JavaScript to your documents, and many are not up-to-date with the HTML/XHTML and CSS2 standards. Remember, too, that the popular browsers don't always agree on how they render a tag, and even different versions of the same browser may differ. Furthermore, even the best HTML editors don't necessarily support extensions to the language.

So into the source you'll have to go, whether to include some HTML feature not yet supported by the editor (such as a new CSS2 property), to insert an attribute value or keyword, or to modify ones that the editor added.
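As a small illustration of such a hand edit, the sketch below adds a CSS2 property directly in the source view, assuming the editor's own style dialogs offer no way to set it. The class name "intro" and the choice of max-width are ours, picked only for the example.

```html
<html>
<head>
<title>Hand-edited style example</title>
<style type="text/css">
<!--
/* Typed by hand in the source view. The class name "intro" is
   hypothetical; max-width is a CSS2 property chosen for illustration. */
p.intro { max-width: 30em; }
-->
</style>
</head>
<body>
<p class="intro">
In a browser that supports max-width, this paragraph is at most 30 ems wide.
</p>
</body>
</html>
```

After an edit like this, reopen the file in the editor and check the source again to confirm that the rule you typed is still there and unchanged.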

The tip is this: compose first. Try to start with a clean, finished document. Concentrate on content from the outset, and add the special effects later. Use a good HTML editor from the start, or prepare your documents in two steps with two different tools (a good content editor followed by a good HTML editor), particularly if you plan to distribute the document in a format other than HTML.

17.2.3. Use the Best

If you compose web pages, we can't imagine you not using an HTML editor of some sort. The convenience is just too compelling. But choose carefully: some HTML editors are abysmal, and you'll spend more time hunting down misplaced tags and errant attributes than you'll spend actually creating the document. Top tip: you get what you pay for.

It's no surprise that HTML editors vary greatly in their features. Many editors let you switch the display from source text to what may appear when rendered by a browser. Some simply let you add tags and modify attribute values through pull-down menus and hot-key options. Others are WYSIWYG layout tools that make it easy to include graphics and other multimedia content. Other advanced features include embedding and testing applets and scripts.

In general, HTML editors fall into one of two categories: either they are good layout tools, including advanced styling features and tools for dynamic content, or they excel at content creation and management. Obviously, if you are producing flashy, commercial web pages that rely on advanced layout techniques and include lots of different styles and dynamic content, use a good layout tool. If you are producing a content-rich document, use a tool that provides good editorial assistance.

No matter which type you use, there are some common considerations to keep in mind when selecting an HTML editor:



Is it up-to-date?

No HTML editor is yet entirely up-to-date with the current standards, particularly CSS2. Read the product specifications and update often.



Does it include a source editor?

Although you may load an HTML editor-generated document into a different text editor to change the source, it's much more convenient if the editor itself lets you view and edit the HTML source. Also, make sure that your HTML editor doesn't automatically "fix" your source edits.



Is it modifiable?

Ideally, the HTML editor should let you customize its behavior to fit your specifications. For example, at a minimum you should be allowed to choose your own font colors, styles, and backgrounds, if those are automatically included in the editor's boilerplate document.



Is it affordable and reliable?

We can't stress enough that you get what you pay for. If creating web pages is more than just a passing fancy, get the best editor you can find. Find one that is well supported and well reviewed by other HTML authors. Ask around, and perhaps join an HTML author's newsgroup to get the latest scoop on products.



HTML & XHTML(c) The definitive guide
ISBN: 596527322
