Hack 15 Secured Access and Browser Attributes


If you're planning on accessing secured resources, such as your online banking, intranet, or the like, you'll need to send and receive data over a secured LWP connection.

Some sites are purveyors of such important data that simple password authentication doesn't provide the security necessary. A banking site, for instance, will use a username and password system to ensure you are who you say you are, but it will also encrypt all the traffic between your computer and its servers. By doing so, it ensures that a malicious user can't "sniff" the data you're transmitting back and forth: credit card information, account histories, and social security numbers. To make this encryption possible, the server installs an SSL (Secure Sockets Layer) certificate, a contract of sorts between your browser and the web server, agreeing on how to hide the data passed back and forth.

You can tell a secured site by its URL: it will start with https://.
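A spider can make the same check on a URL before fetching it. The following is a minimal sketch, assuming the URI module (which LWP itself depends on) is installed; the helper name is_secure_url is hypothetical and not part of LWP.

```perl
#!/usr/bin/perl -w
use strict;
use URI;

# Hypothetical helper: true if the URL's scheme is https.
sub is_secure_url {
    my ($url) = @_;
    my $scheme = URI->new($url)->scheme;   # undef for relative URLs
    return defined $scheme && $scheme eq 'https';
}

for my $url ('https://www.paypal.com/', 'http://www.example.com/') {
    print "$url is ", (is_secure_url($url) ? "secured" : "not secured"), "\n";
}
```

Comparing the parsed scheme, rather than matching the raw string, means that a URL written as HTTPS://... is treated the same way.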

When you access an HTTPS URL, it'll work for you just like an HTTP URL, but only if your LWP installation has HTTPS support (via an appropriate SSL library). For example:

    #!/usr/bin/perl -w
    use strict;
    use LWP 5.64;

    my $url = 'https://www.paypal.com/';   # Yes, HTTPS!
    my $browser = LWP::UserAgent->new;
    my $response = $browser->get($url);
    die "Error at $url\n ", $response->status_line, "\n Aborting"
        unless $response->is_success;
    print "Whee, it worked!  I got that ",
        $response->content_type, " document!\n";

If your LWP installation doesn't yet have HTTPS support installed, the script's response will be unsuccessful and you'll receive this error message:

    Error at https://www.paypal.com/
     501 Protocol scheme 'https' is not supported

If your LWP installation does have HTTPS support installed, then the response should be successful and you should be able to consult $response just as you would any normal HTTP response [Hack #10].

For information about installing HTTPS support for your LWP installation, see the helpful README.SSL file that comes in the libwww-perl distribution (either in your local installation or at http://search.cpan.org/src/GAAS/libwww-perl-5.69/README.SSL). In most cases, simply installing the Crypt::SSLeay module [Hack #8] will get you up to speed.
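If you'd rather find out whether HTTPS support is present before making any request, LWP::UserAgent provides an is_protocol_supported method. The following is a minimal sketch of that check; the wording of the printed messages is only illustrative.

```perl
#!/usr/bin/perl -w
use strict;
use LWP 5.64;

my $browser = LWP::UserAgent->new;

# is_protocol_supported returns true only if this LWP installation
# can load a protocol handler for the given scheme.
if ($browser->is_protocol_supported('https')) {
    print "HTTPS support is installed.\n";
}
else {
    print "No HTTPS support; see README.SSL before fetching https:// URLs.\n";
}
```

A script that needs HTTPS could run this check at startup and stop with a helpful message, rather than waiting for the first request to come back with a 501 error.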

Other Browser Attributes

LWP::UserAgent objects have many attributes for controlling how they work. Here are a few notable ones (more are available in the full documentation):


$browser->timeout(15)

Sets this browser to give up when requests don't answer within 15 seconds.


$browser->protocols_allowed(['http','gopher'])

Sets this browser object not to speak any protocols other than HTTP and gopher. If the browser tries accessing any other kind of URL (such as an ftp:, mailto:, or news: URL), then it won't actually try connecting, but instead will immediately return an error code 500, with a message like "Access to ftp URIs has been disabled."


$browser->conn_cache(LWP::ConnCache->new( ))

Tells the browser object to try using the HTTP/1.1 Keep-Alive feature, which speeds up requests by reusing the same socket connection for multiple requests to the same server.


$browser->agent('someName/1.23 (more info here)')

Changes how the browser object identifies itself in the default User-Agent line of its HTTP requests. By default, it'll send libwww-perl/versionnumber, such as libwww-perl/5.65. More information is available in [Hack #11].


push @{ $browser->requests_redirectable }, 'POST'

Tells this browser to obey redirection responses to POST requests (like most modern interactive browsers), even though the HTTP RFC says that should not normally be done.
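The attributes above can be combined on a single browser object. The following is a minimal sketch, not a recommended configuration: the agent name ExampleSpider/0.1 and the host ftp.example.com are placeholders, and the allowed protocols are changed to HTTP and HTTPS as an assumption suited to this hack.

```perl
#!/usr/bin/perl -w
use strict;
use LWP 5.64;
use LWP::ConnCache;

my $browser = LWP::UserAgent->new;
$browser->timeout(15);                           # give up after 15 seconds
$browser->protocols_allowed(['http', 'https']);  # refuse every other scheme
$browser->conn_cache(LWP::ConnCache->new);       # reuse connections (Keep-Alive)
$browser->agent('ExampleSpider/0.1');            # placeholder agent name
push @{ $browser->requests_redirectable }, 'POST';

# A disallowed scheme is rejected by LWP itself, before any connection
# is attempted, so this request generates no network traffic.
my $response = $browser->get('ftp://ftp.example.com/');
print $response->status_line, "\n";
```

Because the FTP request is refused locally, the final two lines are a quick way to confirm that protocols_allowed is in effect.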

Sean Burke



Spidering Hacks
ISBN: 0596005776
Year: 2005
Pages: 157
