22.13. Output Style

Because PHP generates its output dynamically, it is easy to produce messy HTML. While this is not a problem in itself, it reflects poorly on you and your web site, and it makes the HTML source hard to read when you have debugging to do. Help is at hand: the Tidy extension can, among other things, clean up and repair poorly written HTML.

Here's an example HTML document:

     <TITLE>This is bad HTML</title>
     <BODY>
     This would get rejected as XHTML for a number of reasons.
     First, the <FOO> tag doesn't exist.<BR>Second, the tags aren't the same case.
     Third, tags that don't end, like <HR>, aren't allowed.<BR>
     Tidy should fix all this for us!

As you can see, it's quite messy. Let's save it as lame.html and put it through Tidy with no particular options set:

     <?php
         // Parse the file, repair the markup, then print the cleaned document
         $tidy = new tidy("lame.html");
         $tidy->cleanRepair();
         echo $tidy;
     ?>

That will output the following:

     <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
     <html>
     <head>
     <title>This is bad HTML</title>
     </head>
     <body>
     This would get rejected as XHTML for a number of reasons. First,
     the tag doesn't exist.<br>
     Second, the tags aren't the same case. Third, tags that don't end,
     like
     <hr>
     , aren't allowed.<br>
     Tidy should fix all this for us!
     </body>
     </html>

Tidy has made several changes. First, it has added all the right header and footer tags to make the overall content compliant, and normalized the case of the elements. Second, it has taken away the FOO tag because it is invalid. Third, it has wrapped the lines so they aren't too long. Finally, it has added a new line after each tag.
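The same repair also works on markup held in a string, which is the more common case when a script builds its HTML on the fly. The following is a minimal sketch, assuming the Tidy extension is loaded; the `$html` value is a made-up stand-in for generated output, not the lame.html file used above. It also prints the `errorBuffer` property, which holds the warnings Tidy raised while parsing.

```php
<?php
// Assumption: the tidy extension is loaded. $html is placeholder markup
// standing in for whatever your script has generated.
$html = "<TITLE>Bad HTML</title><BODY>Unclosed tags like <HR> and<BR>mixed case";

$tidy = new tidy();
$tidy->parseString($html);   // parse a string rather than a file
$tidy->cleanRepair();

// errorBuffer lists the warnings and errors Tidy found while parsing
echo $tidy->errorBuffer, "\n";

// Casting the object to a string gives the repaired markup
$clean = (string) $tidy;
echo $clean;
```

Reading the error buffer is a quick way to see why Tidy changed something before you decide whether to fix the source markup itself.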

If you would rather do without line wrapping, you can turn it off. Tidy accepts a variety of options, and we'll go over some of the popular ones in a moment. First things first, though: blast line wrapping and make the output actually look tidy!

     $tidyoptions = array("indent" => true,
                          "wrap" => 1000);
     $tidy = new tidy("lame.html", $tidyoptions);
     $tidy->cleanRepair();
     echo $tidy;

This time, we use an array to store the options, enabling indent mode and setting the character-wrap limit to 1000 characters. The output now looks like this:

     <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
     <html>
       <head>
         <title>
           This is bad HTML
         </title>
       </head>
       <body>
         This would get rejected as XHTML for a number of reasons.
             First, the tag doesn't exist.<br>
         Second, the tags aren't the same case. Third, tags that don't end, like
         <hr>
         , aren't allowed.<br>
         Tidy should fix all this for us!
       </body>
     </html>

Much better, but not yet perfect: it's valid HTML 3.2 now, but XHTML is the future, so it is recommended that you try to write conforming code, or let Tidy do it for you, like this:

     $tidyoptions = array("indent" => true,
                          "wrap" => 1000,
                          "output-xhtml" => true);
     $tidy = new tidy("lame.html", $tidyoptions);
     $tidy->cleanRepair();
     echo $tidy;

That extra option makes the world of difference to the output:

     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
     <html xmlns="http://www.w3.org/1999/xhtml">
       <head>
         <title>
           This is bad HTML
         </title>
       </head>
       <body>
         This would get rejected as XHTML for a number of reasons.
             First, the tag doesn't exist.<br />
         Second, the tags aren't the same case. Third, tags that don't end, like
         <hr />
         , aren't allowed.<br />
         Tidy should fix all this for us!
       </body>
     </html>

Now we get the works: a full XHTML doctype, all our tags are indented, and all our tags are closed. This is what we should be aiming for as standard.
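If you want this treatment applied to a whole dynamically generated page rather than a single file, one possible approach is to buffer the page's output and run it through Tidy just before it is sent. The sketch below assumes the Tidy extension is loaded; `tidyBuffer()` is a hypothetical helper name, and the echoed markup is a placeholder. It uses PHP's `ob_start()` callback mechanism together with `tidy_repair_string()`.

```php
<?php
// Assumption: the tidy extension is loaded. tidyBuffer() is a hypothetical
// helper written for this sketch; it is not part of the extension.
function tidyBuffer(string $buffer): string
{
    $options = array("indent" => true,
                     "wrap" => 1000,
                     "output-xhtml" => true);
    $clean = tidy_repair_string($buffer, $options);

    // Fall back to the untouched buffer if Tidy could not repair it
    return $clean === false ? $buffer : $clean;
}

ob_start('tidyBuffer');   // collect everything echoed from here on
echo "<TITLE>Generated page</title><BODY>Some dynamic content<BR>";
ob_end_flush();           // run tidyBuffer() on the collected output and send it
```

Repairing every page on every request costs processing time, so this is probably best treated as a development aid or a safety net instead of a substitute for writing clean markup.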

To let you customize various aspects of how your tidied output should look, there is a wide variety of options that can be passed in. As you saw in the previous script, the way to do this is to create an array where the keys are the option names and the values are the settings for those options, then pass that in as the second parameter when creating a Tidy object.
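To confirm that an options array has taken effect, you can read settings back from the object. This is a small sketch, assuming the Tidy extension is loaded and using placeholder markup in place of a real file; `getOpt()` returns the current value of a named option.

```php
<?php
// Assumption: the tidy extension is loaded; the markup is a placeholder.
$tidyoptions = array("indent" => true,
                     "wrap" => 1000);

$tidy = new tidy();
$tidy->parseString("<TITLE>Options check</title><BODY>Hello<BR>", $tidyoptions);

// Read back one option we set and one we left at its default
$wrap  = $tidy->getOpt("wrap");
$xhtml = $tidy->getOpt("output-xhtml");

var_dump($wrap);
var_dump($xhtml);
```

A misspelled option name is easy to miss, so checking a value with `getOpt()` can save some head-scratching when the output does not change as expected.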

The official list of Tidy options is available online in the Tidy manual (see http://tidy.sourceforge.net/docs/quickref.html), but here are a few to get you started:


logical-emphasis

Set to true to have Tidy change <i> tags to <em>, and <b> to <strong>.


replace-color

Set to true to have Tidy change numeric HTML color values to their named equivalents, wherever possible. That is, #FFFFFF becomes "white".


show-body-only

Set to true to have Tidy output only the contents of the <body> tag: no headers, no titles, not even the body tag itself. This is useful for grabbing the content (and only the content!) of a web page.


word-2000

My favorite. Set to true to have Tidy turn Word 2000's mangled attempt at HTML into proper HTML.


vertical-space

Set to true to have Tidy insert blank lines in the output to make it more readable.


fix-backslash

Set to true if someone in your company likes writing URLs with a \ rather than a /; this corrects it.
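As a rough illustration of how these options combine, the sketch below turns on logical-emphasis and show-body-only for a short string of markup. It assumes the Tidy extension is loaded, and the `$html` content is invented for the example.

```php
<?php
// Assumption: the tidy extension is loaded; $html is made-up sample markup.
$html = "<TITLE>Draft</title><BODY><p>Some <i>emphasised</i> and <b>strong</b> words.</p>";

$tidyoptions = array(
    "logical-emphasis" => true,   // <i> becomes <em>, <b> becomes <strong>
    "show-body-only"   => true    // output only what is inside <body>
);

$tidy = new tidy();
$tidy->parseString($html, $tidyoptions);
$tidy->cleanRepair();

// With show-body-only set, this is a fragment, not a full document
$fragment = (string) $tidy;
echo $fragment;
```

A fragment like this is convenient when you want to drop cleaned content into your own page template.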

22.13.1. Installing Tidy

If you're using Windows, you can enable Tidy support by enabling the extension in your php.ini file. Look for the line ";extension=php_tidy.dll" and remove the semicolon from the beginning.

If you're using Unix, you either have to install Tidy support through your package manager, or you need to compile it from source. Compiling Tidy support into your PHP takes two steps: installing the Tidy development libraries on your machine (do this through your package manager), then recompiling PHP with the --with-tidy switch in your configure line. As long as you have the development version of Tidy installed, this should work fine.
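Whichever route you take, a quick way to check that the extension is actually available is to ask PHP directly. This is a minimal sketch using `extension_loaded()`; the status messages are just illustrative wording.

```php
<?php
// extension_loaded() reports whether the named extension is available
if (extension_loaded("tidy")) {
    // tidy_get_release() returns the release string of the underlying Tidy library
    $status = "Tidy support is available (library release: " . tidy_get_release() . ")";
} else {
    $status = "Tidy support is missing; check php.ini or your configure line";
}

echo $status, "\n";
```

Run it from the command line or through your web server; the two can use different php.ini files, so it may be worth checking both.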



PHP in a Nutshell
ISBN: 596100671