Hack 82 Publish IE's Favorites to Your Web Site

You're surfing at a friend's house and think, "What is that URL? I have a link to it in my favorites. I wish I were home." How about making your favorites available no matter where you go?

You can't take them with you: your Internet Explorer bookmarks, I mean. They live on a particular machine, accessible to you only when you're at that machine. Yes, there are some online bookmarking services, but the ones worth using have started making their users ante up or live through pop-up advertising hell. Of course, we Perl hackers don't have to settle for either.

This hack publishes the contents of your IE Favorites to any server that you can access via FTP, setting you up with a nice little navigable menu frame on the left to hold your favorites and a content area on the right to display the sites you click on. Yes, this hack is a bit Windows- and IE-specific, but before you complain too much, it's easily extendible to process any form of bookmark data that's stored in a tree structure, and the output is templated. The template shown here generates just the simple HTML menu system, but templates for PHP, ASP, raw data, or anything you like should be a breeze!

IE's Favorites

Let's start by taking a quick look at IE's Favorites folder. If you use Windows, you probably know that this folder is now used by more than IE, but most people I know, myself included, still use it mainly in the context of web browsing. On Windows NT, 2000, and XP running IE4 or later, the Favorites folder is nothing more than a directory stored within your user profile tree. The easiest and most consistent method for locating the folder is through the USERPROFILE environment variable. You'll note at the top of the script that a configurable global that identifies the root of the Favorites tree uses precisely this environment variable by default.

The structure of the Favorites tree itself is simple. It's a directory tree that contains folders and links. It is possible to put things other than URL links into your Favorites; since we're interested in publishing web bookmarks, we'll ignore everything except directories and links (in this context, links are defined as files with a .url extension). A link document contains a bit of data in addition to the actual URL; fortunately, it's easy to ignore, because the one thing that every link document has is a line that starts with URL= and then specifies the location in question. In our hack, we'll simply extract this one line with a regular expression.
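As a minimal sketch of that extraction, the snippet below pulls the URL= line out of the text of a link document. The sample contents and the extract_url helper are illustrative assumptions, not taken from a real Favorites folder or from the script itself.

```perl
use strict;
use warnings;

# Sample contents of a .url link document (illustrative only).
my $sample = join "\n",
    '[DEFAULT]',
    'BASEURL=http://www.example.com/',
    '[InternetShortcut]',
    'URL=http://www.example.com/',
    'Modified=0000000000000000',
    '';

# Return the target of the first line that starts with "URL=", or undef
# if there is no such line. The /m flag lets ^ match at each line start,
# so the BASEURL= line is not picked up by mistake.
sub extract_url {
    my ($text) = @_;
    my ($url) = $text =~ /^URL=([^\r\n]*)/m;
    return $url;
}

print extract_url($sample), "\n";
```

Everything else in the file is skipped simply because the pattern never looks at it.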

What It Does and How It Works

The script goes through three processes:

  1. Parse the Favorites tree and load the structure.

  2. Generate the output documents.

  3. Upload the documents via FTP.

We'll take a quick look at each and then get right to the code.

Parsing the Favorites tree is handled by walking through the tree recursively using Perl's system-independent opendir, readdir, and closedir routines. We use File::Spec routines for filename handling, to make enhancing and porting to other systems easier. The structure itself is read into a hash of hashes, one of the basic Perl techniques for creating a tree. For each hash in the tree, subdirectories map to another hash and links map to a scalar with the link URL. Reading the entire Favorites tree into an internal data structure isn't strictly necessary, but it simplifies and decouples the later processes, and it also provides a great deal of flexibility for enhancements to the script.
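To make the hash-of-hashes shape concrete, here is a small sketch with a hand-built tree and a recursive walk over it. The folder names, links, and the list_tree helper are hypothetical; only the shape (folders as hash references, links as URL strings) follows the description above.

```perl
use strict;
use warnings;

# Hypothetical Favorites tree: folders map to hashrefs, links map to URLs.
my %favorites = (
    Favorites => {
        'Perl' => {
            'CPAN'     => 'http://www.cpan.org/',
            'Perl.org' => 'http://www.perl.org/',
        },
        'Reference Desk' => 'http://refdesk.com',
    },
);

# Walk the tree depth-first and return one indented line per entry.
sub list_tree {
    my ( $node, $depth ) = @_;
    my @lines;
    foreach my $name ( sort keys %{$node} ) {
        my $value  = $node->{$name};
        my $indent = '  ' x $depth;
        if ( ref($value) eq 'HASH' ) {
            # A hash reference is a folder: list it, then recurse into it.
            push @lines, "$indent$name/";
            push @lines, list_tree( $value, $depth + 1 );
        }
        else {
            # Anything else is a link whose value is the URL.
            push @lines, "$indent$name -> $value";
        }
    }
    return @lines;
}

print "$_\n" for list_tree( \%favorites, 0 );
```

The ref() test is the same trick the script uses later to tell folders from links.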

Generating the output based on the Favorites data is done with a template so that the script doesn't lock its user into any one type of output. When you're using Perl, Text::Template is always an excellent choice, since Perl itself is the templating language, so we use it here. The template in this hack outputs HTML, defining a simple menu based on the folders and links and using HTML anchors to open the link targets in a named frame. It is expected that the entire set of documents, one document per Favorites directory, will be published to a single output directory, so filenames are generated using each directory's relative path from the main Favorites directory, each path component being separated by a period. The documents themselves are generated in a temp directory, which the script attempts to remove upon completion.
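The flat naming scheme is easier to see with an example. The sketch below is a standalone illustration of the idea, assuming whitespace runs and other non-word characters are replaced with underscores and the result is lowercased; flat_doc_name is a hypothetical helper and the folder names are made up.

```perl
use strict;
use warnings;

# Build a flat document name from the folder names leading to a directory.
# Each component is made filename-safe, components are joined with periods,
# the extension is appended, and the whole name is lowercased.
sub flat_doc_name {
    my ( $extension, @path ) = @_;
    my @safe = map { my $part = $_; $part =~ s/\s+|\W/_/g; $part } @path;
    return lc( join( '.', @safe, $extension ) );
}

print flat_doc_name( 'html', 'Favorites' ), "\n";
print flat_doc_name( 'html', 'Favorites', 'Perl Stuff', 'Books & Docs' ), "\n";
```

Because every document lands in one directory, a nested folder such as Favorites\Perl Stuff\Books & Docs ends up as a single file whose name spells out its position in the tree.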

The upload code is straightforward and nonrobust. Upload is via FTP, and the published script requires that the FTP parameters be coded in the configuration globals at the top of the file. If anything other than an individual put fails, the code gives up. If a put itself fails, a warning is issued and we move to the next file.
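The "warn and move on" policy for individual files can be sketched in isolation. In the snippet below, upload_files is a hypothetical helper and FakeFTP is a stand-in object so the example runs without a server; the only assumption about the real connection is that put returns a false value on failure, which is how Net::FTP's put behaves.

```perl
use strict;
use warnings;

# Upload each file with $ftp->put. If a single put fails, warn and carry
# on with the next file. Returns the list of files that were skipped.
sub upload_files {
    my ( $ftp, @files ) = @_;
    my @skipped;
    foreach my $file (@files) {
        next if $ftp->put($file);
        warn "Could not upload '$file'...skipped\n";
        push @skipped, $file;
    }
    return @skipped;
}

# Hypothetical stand-in for a connection, so the sketch runs offline.
# Files named in the constructor are the ones whose put will fail.
package FakeFTP;
sub new { my ( $class, %fail ) = @_; return bless {%fail}, $class }
sub put { my ( $self, $file ) = @_; return $self->{$file} ? undef : $file }

package main;

my $ftp     = FakeFTP->new( 'b.html' => 1 );
my @skipped = upload_files( $ftp, 'a.html', 'b.html', 'c.html' );
print "skipped: @skipped\n";
```

Connection, login, and directory-change failures are a different matter: there is nothing useful to do after them, which is why the script simply dies in those cases.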

The Code

You need three files. PublishFavorites.pl is the Perl code that does the work. The template for our example is favorites.tmpl.html. Finally, a simple index.html, which defines the frameset for our menus, will need to be uploaded manually just once.

First, here's PublishFavorites.pl:

    #!/usr/bin/perl -w
    use strict;
    use File::Spec;
    use File::Temp;
    use Net::FTP;
    use Text::Template;

    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    ## Configurable Globals

    ## $FAV_ROOT = Location of the root of the Favorites folder
    my $FAV_ROOT = File::Spec->join( $ENV{USERPROFILE}, 'Favorites' );

    ## $FAV_NAME = Top level name to use in favorites folder tree
    my $FAV_NAME = 'Favorites';

    ## $FAV_TMPL = Text::Template file; output files will use same extension
    my $FAV_TMPL = 'favorites.tmpl.html';

    ## Host data for publishing favorites via ftp
    my $FAV_HOST = 'myserver.net';
    my $FAV_PATH = 'favorites';
    my $FAV_USER = 'username';
    my $FAV_PASS = 'password';

    ## End of Configurable Globals
    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    # tempdir is called as a plain function, since recent versions of
    # File::Temp refuse to run it as a class method.
    my $_FAV_TEMPDIR =
      File::Temp::tempdir( 'XXXXXXXX', TMPDIR => 1, CLEANUP => 1 );

    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    sub LoadFavorites {

      # Recursively load the structure of an IE
      # Favorites directory tree into a tree of hashes.

      my $FolderIn = shift;      # Folder to process
      my $FavoritesOut = shift;  # Hashref to load with this folder's entries

      # Do a readdir into an array for a
      # quick load of the directory entries.
      opendir( FOLDER, $FolderIn ) ||
        die "Could not open favorites folder '$FolderIn'";
      my @FolderEntries = readdir( FOLDER );
      closedir( FOLDER );

      # Process each entry in the directory.
      foreach my $FolderEntry ( @FolderEntries ) {

        # Skip special names . and ..
        next if $FolderEntry eq '.' || $FolderEntry eq '..';

        # Construct the full path to the current entry.
        my $FileSpec = File::Spec->join( $FolderIn, $FolderEntry );

        # Call LoadFavorites recursively if we're processing a directory.
        if ( -d $FileSpec && !( -l $FileSpec ) ) {
          $FavoritesOut->{$FolderEntry} = {};
          LoadFavorites( $FileSpec, $FavoritesOut->{$FolderEntry} );
        }

        # If it's not a directory, check for a filename that ends with '.url'.
        # When we find a link file, extract the URL and map the favorite to it.
        elsif ( $FolderEntry =~ /^.*\.url$/i ) {
          my ( $FavoriteId ) = $FolderEntry =~ /^(.*)\.url$/i;
          next if !open( FAVORITE, $FileSpec );
          ( $FavoritesOut->{$FavoriteId} ) =
            join( '', <FAVORITE> ) =~ /^URL=([^\n]*)\n/m;
          close( FAVORITE );
        }
      }
    }

    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    sub MakeDocName {

      # Quick hack to generate a safe filename for a favorites entry. Replaces
      # all whitespace and special characters with underscores, concatenates
      # parent spec with the new spec, and postfixes the whole thing with
      # the same file extension as the globally named template document.

      my $FavoriteIn = shift;        # Label of new favorites entry
      my $ParentFilenameIn = shift;  # MakeDocName of the parent level

      my ( $FileType ) = $FAV_TMPL =~ /\.([^\.]+)$/;
      $FavoriteIn =~ s/(\s+|\W)/_/g;
      $ParentFilenameIn =~ s/$FileType$//;
      return lc( $ParentFilenameIn . $FavoriteIn . '.' . $FileType );
    }

    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    sub GenerateFavorites {

      # Recurse through a tree of Favorites entries and generate a document for
      # each level based on the globally named template document.

      my $FavoritesIn = shift;       # Hashref to current tree level
      my $FolderNameIn = shift;      # Name of the current folder
      my $ParentFilenameIn = shift;  # MakeDocName of the parent level

      # Create shortcut identifiers for things that get reused a lot.
      my $Folder = $FavoritesIn->{$FolderNameIn};
      my $FolderFilename = MakeDocName( $FolderNameIn, $ParentFilenameIn );

      # Separate the entries in the current folder into folders and links.
      # Folders can be identified because they are hash references, whereas
      # links are mapped to simple scalars (the URL of the link).
      my ( %Folders, %Links );
      foreach my $Favorite ( keys( %{$Folder} ) ) {
        if ( ref( $Folder->{$Favorite} ) eq 'HASH' ) {
          $Folders{$Favorite} = { label => $Favorite,
            document => MakeDocName( $Favorite, $FolderFilename ) };
        }
        else {
          $Links{$Favorite} =
            { label => $Favorite, href => $Folder->{$Favorite} };
        }
      }

      # Set up Text::Template variables, fill in the template with the folders
      # and links at this level of the favorites tree, and then output the
      # processed document to our temporary folder.
      my $Template = Text::Template->new( TYPE => 'FILE',
        DELIMITERS => [ '<{', '}>' ], SOURCE => $FAV_TMPL );
      my %Vars = (
        FAV_Name => $FAV_NAME,
        FAV_Home => MakeDocName( $FAV_NAME, '' ),  # '' = no parent document
        FAV_Folder => $FolderNameIn,
        FAV_Parent => $ParentFilenameIn,
        FAV_Folders => \%Folders,
        FAV_Links => \%Links
      );
      my $Document = $Template->fill_in( HASH => \%Vars );
      my $DocumentFile = File::Spec->join( $_FAV_TEMPDIR, $FolderFilename );
      if ( open( FAVORITES, ">$DocumentFile" ) ) {
        print( FAVORITES $Document );
        close( FAVORITES );
      }

      # Generate Favorites recursively for each of this folder's subfolders.
      foreach my $Subfolder ( keys( %Folders ) ) {
        GenerateFavorites( $Folder, $Subfolder, $FolderFilename );
      }
    }

    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    sub PublishFavorites {

      # Publish the generated documents via FTP. Pretty
      # much just gives up if something goes wrong.

      my $ftp = Net::FTP->new( $FAV_HOST ) ||
        die( "Cannot connect to '$FAV_HOST'" );
      $ftp->login( $FAV_USER, $FAV_PASS ) ||
        die( "Authorization for user '$FAV_USER' failed" );
      $ftp->cwd( $FAV_PATH ) ||
        die( "Could not CWD to '$FAV_PATH'" );

      opendir( FOLDER, $_FAV_TEMPDIR ) ||
        die( "Cannot open working directory '$_FAV_TEMPDIR'" );
      my @FolderEntries = readdir( FOLDER );
      closedir( FOLDER );

      foreach my $FolderEntry ( @FolderEntries ) {
        next if $FolderEntry eq '.' || $FolderEntry eq '..';
        $ftp->put( File::Spec->join( $_FAV_TEMPDIR, $FolderEntry ) ) ||
          warn( "Could not upload '$FolderEntry'...skipped" );
      }
      $ftp->quit;
    }

    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    sub main {
      my %Favorites;
      $Favorites{$FAV_NAME} = {};
      LoadFavorites( $FAV_ROOT, $Favorites{$FAV_NAME} );
      GenerateFavorites( \%Favorites, $FAV_NAME, '' );
      PublishFavorites();
    }

    main();

Here's our example template, favorites.tmpl.html:

    <html>
    <body>
      <h1><a href="<{$FAV_Home}>"><{$FAV_Name}></a></h1>
      <select onChange="location.replace(this[this.selectedIndex].value)">
        <{
          $OUT .= '<option selected>' . $FAV_Folder . '</option>' . "\n";
          if ( $FAV_Parent ne '' ) {
            $OUT .= '<option value="' . $FAV_Parent . '">..</option>' . "\n";
          }
          foreach my $folder ( sort( keys( %FAV_Folders ) ) ) {
            $OUT .= '<option value="' . $FAV_Folders{$folder}->{document} .
              '">&gt;' . $FAV_Folders{$folder}->{label} . '</option>' . "\n";
          }
        }>
      </select>
      <table>
        <{
          foreach my $link ( sort( keys( %FAV_Links ) ) ) {
            $OUT .= '<tr><td><a target="net" href="' .
              $FAV_Links{$link}->{href} . '">' .
              $FAV_Links{$link}->{label} . '</a></td></tr>' . "\n";
          }
        }>
      </table>
    </body>
    </html>

And, finally, here's the simple index.html:

    <html>
    <head>
      <title>Favorites</title>
    </head>
    <frameset cols="250,*">
      <frame name="nav" scrolling="yes" src="favorites.html" />
      <frame name="net" src="http://refdesk.com"/>
    </frameset>
    </html>

Running the Hack

Before you run the code, you need to take care of a few configuration items.

First, let's make sure that your Favorites directory is where the script thinks it will be. At a command prompt, execute the following:

 dir "%USERPROFILE%"\Favorites 

If you get a directory listing with lots of names that appear to match things in your IE Favorites, then you're good to go. If this directory doesn't exist or if its contents don't appear to be your Favorites, then you'll have to find out where on your disk your Favorites are really stored and then change the $FAV_ROOT variable at the top of the script to match.

Second, you need to define your FTP information through the $FAV_HOST, $FAV_PATH, $FAV_USER, and $FAV_PASS variables at the top of the script.

Third, just once, you need to manually upload the index.html document to the directory on your server where you're going to publish your Favorites. Of course, you are free to rename this document and publish your Favorites to a directory that already contains other files, but we suggest setting aside a separate directory. You are also welcome to change the default page that the index.html file initially shows in the net frame.

Okay, now simply run the script as follows:

 %  perl PublishFavorites.pl  

The script runs quietly unless it encounters a problem. For most problems it might encounter, it just gives up and outputs an error message.

That's it. Suppose you publish to the favorites directory on http://www.myserver.net. Just point your browser to http://www.myserver.net/favorites, and you should have a web-accessible menu of all your IE Favorites! An example is available at http://www.ronpacheco.net/favorites/.

Hacking the Hack

There's a ton of room for enhancement and modification to this hack. Most changes will probably fall into one of the hack's three major processing tasks: loading the bookmark data, generating output, and publishing.

First, you can make it read something other than the IE Favorites tree. Maybe you want to read Mozilla bookmarks, or suck links off a web site, or read your own tree of bookmarks, whatever. If you can read it into the simple tree structure that the script already uses, you'll have a plug-and-play subroutine.
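As one possible shape for such a replacement loader, the sketch below reads Netscape-style bookmark HTML into the same kind of tree: folders become hash references and links become URL strings. It assumes the file keeps one tag per line, as exported bookmarks.html files commonly do; load_bookmark_html and the sample data are hypothetical, and a real loader would want a proper HTML parser.

```perl
use strict;
use warnings;

# Load Netscape-style bookmark HTML into a tree of hashes.
sub load_bookmark_html {
    my ($html) = @_;
    my %root;
    my @stack = ( \%root );    # innermost open folder is last
    my $pending;               # folder named by <H3>, waiting for its <DL>
    foreach my $line ( split /\n/, $html ) {
        if ( $line =~ m{<H3[^>]*>(.*?)</H3>}i ) {
            # A folder heading: create the folder in the current level.
            $pending = $stack[-1]{$1} = {};
        }
        elsif ( $line =~ m{<A\s[^>]*HREF="([^"]*)"[^>]*>(.*?)</A>}i ) {
            # A link: map its label to its URL in the current level.
            $stack[-1]{$2} = $1;
        }
        elsif ( $line =~ m{<DL>}i && $pending ) {
            # The list that follows a heading opens that folder.
            push @stack, $pending;
            undef $pending;
        }
        elsif ( $line =~ m{</DL>}i && @stack > 1 ) {
            # End of the current folder's list.
            pop @stack;
        }
    }
    return \%root;
}

my $sample = <<'HTML';
<DL><p>
    <DT><H3>Perl</H3>
    <DL><p>
        <DT><A HREF="http://www.cpan.org/">CPAN</A>
    </DL><p>
    <DT><A HREF="http://refdesk.com">Reference Desk</A>
</DL><p>
HTML

my $tree = load_bookmark_html($sample);
print join( ', ', sort keys %{$tree} ), "\n";
```

The returned hash reference could be used in place of the hash that LoadFavorites fills in, leaving the generation and publishing steps untouched.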

Second, you can change the output. You can pretty up the existing HTML template, you can write new templates for things beyond simple HTML, or you can completely rip out the output section and replace it with something new. The framework for the code to traverse the bookmark tree is already in place. You can use the templating tools as is, or you can use the framework to build something new.

Finally, you can get more sophisticated about publishing. If someone were to ask me if, in practice, I'd really hardcode my username and password into a script and then use that script to publish stuff via an unsecured FTP session, I'd probably have to say no. I'm fairly comfortable putting the access information in the script, as long as I have good control over the system where it's located (I've been doing it for a couple decades now without any incidents), but I would be reluctant to use cleartext FTP. In fact, I use FTP to my servers all the time, including a variation of this script, but I tunnel all the connections through SSH. For more sophistication, you could add SSH support directly to the script, and you could consider methods of publication other than FTP.
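If you would rather not keep the access information in the script at all, one small step is to read it from the environment. The sketch below is only an illustration: ftp_settings is a hypothetical helper, and the FAV_* environment variable names are assumptions chosen to echo the script's globals.

```perl
use strict;
use warnings;

# Collect the FTP settings from a hash of environment-style values
# instead of from literals in the script. Dies if any setting is missing.
sub ftp_settings {
    my ($env) = @_;
    my %settings;
    foreach my $key (qw( FAV_HOST FAV_PATH FAV_USER FAV_PASS )) {
        die "Missing environment variable $key\n"
            unless defined $env->{$key} && length $env->{$key};
        $settings{$key} = $env->{$key};
    }
    return %settings;
}

# Demonstration with sample values; a real run would pass \%ENV instead.
my %demo = ftp_settings(
    {
        FAV_HOST => 'myserver.net',
        FAV_PATH => 'favorites',
        FAV_USER => 'username',
        FAV_PASS => 'password',
    }
);
print "host: $demo{FAV_HOST}\n";
```

This keeps the password out of the file but does nothing about cleartext FTP, so the SSH tunnel is still worth having.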

Like I said, there's a ton of possibilities, limited only by the imagination of the hacker!

Ron Pacheco



Spidering Hacks
ISBN: 0596005776
Year: 2005
Pages: 157
