Recipe 20.20 Program: htmlsub

This program makes substitutions in HTML files so that changes happen only in normal text, leaving tags and attribute values untouched. If you had the file scooby.html that contained:

<HTML><HEAD><TITLE>Hi!</TITLE></HEAD>
<BODY><H1>Welcome to Scooby World!</H1>
I have <A HREF="pictures.html">pictures</A> of the crazy dog
himself.  Here's one!<P>
<IMG src="/books/2/106/1/html/2/scooby.jpg" ALT="Good doggy!"><P>
<BLINK>He's my hero!</BLINK>  I would like to meet him some day,
and get my picture taken with him.<P>
P.S. I am deathly ill.  <A HREF="shergold.html">Please send
cards</A>.
</BODY></HTML>

you could use htmlsub to change every occurrence of the word "picture" in the document text to read "photo". It prints the new document on STDOUT:

% htmlsub picture photo scooby.html
<HTML><HEAD><TITLE>Hi!</TITLE></HEAD>
<BODY><H1>Welcome to Scooby World!</H1>
I have <A HREF="pictures.html">photos</A> of the crazy dog
himself.  Here's one!<P>
<IMG src="/books/2/106/1/html/2/scooby.jpg" ALT="Good doggy!"><P>
<BLINK>He's my hero!</BLINK>  I would like to meet him some day,
and get my photo taken with him.<P>
P.S. I am deathly ill.  <A HREF="shergold.html">Please send
cards</A>.
</BODY></HTML>
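
To see why a tag-aware approach matters, compare it with a plain substitution over the whole file. The following is a minimal sketch and not part of htmlsub; the variable names are illustrative. It applies the same replacement to one line of scooby.html without parsing the markup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative input: one line taken from scooby.html.
my $html = 'I have <A HREF="pictures.html">pictures</A> of the crazy dog himself.';

# Naive approach: substitute everywhere, tags and attributes included.
(my $naive = $html) =~ s/picture/photo/g;

# The link text changes as intended, but so does the HREF value.
print "$naive\n";
```

The naive version rewrites HREF="pictures.html" to HREF="photos.html", which would break the link unless the target file were renamed as well. htmlsub avoids this by touching only the text between tags.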

The program is shown in Example 20-12.

Example 20-12. htmlsub
  #!/usr/bin/perl -w
  # htmlsub - make substitutions in normal text of HTML files
  # from Gisle Aas <gisle@aas.no>

  sub usage { die "Usage: $0 <from> <to> <file>...\n" }

  my $from = shift or usage;
  my $to   = shift or usage;
  usage unless @ARGV;

  # Build the HTML::Filter subclass to do the substituting.

  package MyFilter;
  use HTML::Filter;
  @ISA=qw(HTML::Filter);
  use HTML::Entities qw(decode_entities encode_entities);

  sub text
  {
     my $self = shift;
     my $text = decode_entities($_[0]);
     $text =~ s/\Q$from/$to/go;       # most important line
     $self->SUPER::text(encode_entities($text));
  }

  # Now use the class.

  package main;
  foreach (@ARGV) {
      MyFilter->new->parse_file($_);
  }
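
The HTML::Filter documentation marks that class as deprecated in favor of the handler interface of HTML::Parser. The following is a hedged sketch of the same idea using that interface; it is not the recipe's code. The helper name substitute_text is hypothetical, and passing '<>&' to encode_entities is an assumption made here so that quotes and apostrophes in the text pass through unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser ();
use HTML::Entities qw(decode_entities encode_entities);

# Hypothetical helper (not from the recipe): returns $html with $from
# replaced by $to in text segments only.
sub substitute_text {
    my ($html, $from, $to) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Tags, comments, and declarations: copy the source through as-is.
        default_h => [ sub { $out .= $_[0] }, 'text' ],
        # Text segments: decode entities, substitute, then re-encode.
        text_h => [
            sub {
                my $text = decode_entities($_[0]);
                $text =~ s/\Q$from\E/$to/g;
                # Assumption: encode only the markup-significant characters.
                $out .= encode_entities($text, '<>&');
            },
            'text',
        ],
    );

    # Deliver each run of text in one piece so a match is never split.
    $parser->unbroken_text(1);
    $parser->parse($html);
    $parser->eof;
    return $out;
}

print substitute_text(
    'I have <A HREF="pictures.html">pictures</A> &amp; more pictures.',
    'picture', 'photo',
), "\n";
```

As in the original, the substitution sees decoded text, so a pattern containing "&" matches "&amp;" in the source. Like the original, this sketch also treats the contents of SCRIPT and STYLE elements as ordinary text, which may not be what you want for pages that contain them.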


Perl Cookbook, Second Edition
ISBN: 0596003137
Year: 2003
Pages: 501
