As you test the features of your web applications, you may also want to make sure that the HTML your code produces conforms to the standards set by the World Wide Web Consortium (http://www.w3.org/). Coding to standards makes your site cleaner, easier to maintain, and more accessible from a variety of browsers and clients, especially for users with disabilities.

How do I do that?

The Test::HTML::Tidy module provides a single function, html_tidy_ok(), that checks the completeness and correctness of an HTML document.

Note: You might already be familiar with the tidy command. Test::HTML::Tidy uses HTML::Tidy as a backend, which in turn uses the tidy library.

Save the following code as tidy.t:

    #!perl

    use strict;

    # Three tests: one get_ok() and two html_tidy_ok() calls
    use Test::More tests => 3;
    use Test::WWW::Mechanize;
    use Test::HTML::Tidy;

    my $mech = Test::WWW::Mechanize->new();

    # Fetch the home page and check its HTML
    $mech->get_ok( 'http://search.cpan.org/' );
    html_tidy_ok( $mech->content() );

    # Submit a search and check the HTML of the results page
    $mech->field( 'query', 'otter spotting' );
    $mech->submit();
    html_tidy_ok( $mech->content() );

When running the test file, you may see successes or failures, depending on the current conformity of the CPAN Search Site.

What just happened?

tidy.t uses Test::HTML::Tidy along with Test::WWW::Mechanize to make sure the CPAN Search Site's home page is valid HTML. After get_ok() fetches the page, the first html_tidy_ok() call receives the entire HTML document, $mech->content, and reports success if the page validates. The test then searches the CPAN for "otter spotting" and checks the HTML of the resulting page as well.

What about...

Q: Can I check a smaller portion of HTML instead of an entire document?

A: Use Test::HTML::Lint, which exports an html_ok() function to which you can pass any bit of HTML.

Note: Test::HTML::Lint uses HTML::Lint as a backend.

Save the following listing as part.t:
    #!perl

    use strict;

    use Test::More tests => 1;
    use Test::HTML::Lint;

    html_ok( <<'EOF' );
    <h1>My Favorite Sciuridae</h1>

    <table>
        <trh>
            <td>Grey squirrel</td>
            <td>plump, calm</td>
        </tr>
        <tr>
            <td>Red squirrel</td>
            <td>quick, shifty</td>
        <tr>
            <td>Yellow-bellied Marmot</td>
            <td>aloof</td>
        </tr>
    </table>
    EOF

Note: Yep, those errors are intentional.

Run the test file with prove:

    $ prove -v part.t
    part....1..1
    not ok 1
    #     Failed test (part.t at line 8)
    # Errors:
    #  (5:5) Unknown element <trh>
    #  (8:5) </tr> with no opening <tr>
    #  (16:1) <trh> at (5:5) is never closed
    #  (16:1) <tr> at (9:5) is never closed
    # Looks like you failed 1 tests of 1.
    dubious
            Test returned status 1 (wstat 256, 0x100)
    DIED. FAILED test 1
            Failed 1/1 tests, 0.00% okay
    Failed 1/1 test scripts, 0.00% okay. 1/1 subtests failed, 0.00% okay.
    Failed Test Stat Wstat Total Fail  Failed  List of Failed
    -----------------------------------------------------------------------
    part.t         1   256     1    1 100.00%  1

html_ok() reports the single test as a failure and shows exactly where the document has errors. Each error report takes the form (line number:character position), where the line number refers to a line of the provided HTML, not of the test file. As the output explains, Test::HTML::Lint has no idea what a <trh> tag is. On top of that, neither it nor the <tr> tag that opens the Red squirrel row is ever closed. There's more work to do before putting this table of favorite furry animals online.
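Because Test::HTML::Lint uses HTML::Lint as a backend, you can also inspect the same kind of error reports outside a test file. The following is only a minimal sketch, assuming a reasonably recent HTML::Lint that provides the eof() method; the sample fragment is made up for illustration, and the exact messages may differ between versions.

```perl
#!perl

use strict;
use warnings;

use HTML::Lint;

# A small, deliberately broken fragment (hypothetical sample data)
my $html = <<'EOF';
<table>
    <trh>
        <td>Grey squirrel</td>
    </tr>
</table>
EOF

my $lint = HTML::Lint->new;

# Feed the HTML to the linter, then signal the end of the input
$lint->parse( $html );
$lint->eof;

# Each error object can describe itself, including its position
for my $error ( $lint->errors ) {
    print $error->as_string, "\n";
}
```

Printing each error with as_string() is handy when you want to explore what the linter thinks of a snippet before you commit to an html_ok() assertion in a test file.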