Hack 26. Iterate and Generate Expensive Data


Hide lists, streams, and expensive data structures behind a simple interface.

Perl's fundamental aggregate data types, hashes and arrays, are wonderfully flexible and often just what you want. That's often, not always. Sometimes you really need to process data that's expensive to calculate, part of a huge list that won't fit into memory, or just never ends.

When that happens, use a function reference as a data structure. Seriously.

The Hack

Imagine that you've just taken a job as a network administrator, replacing someone who completely failed to do any documentation. You know that you have all sorts of devices on the network with static IP addresses and you have a rough idea of the network blocks, but you don't know which addresses are in use.

Rather than finding every device, checking its settings, and reassigning things, you can write a little program to loop through each address and try to contact the device. It's a good first approximation. How do you check every netblock though? Use Net::Netmask to generate a list of IP addresses.

That could get messy, though. Do you really want to loop over a list of potentially millions of addresses? This is a good place to use a generator.

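    use Net::Netmask;

    sub create_generator
    {
        # one Net::Netmask object per block, in the order given
        my @netmasks;

        for my $block (@_)
        {
            push @netmasks, Net::Netmask->new( $block );
        }

        # position within the current block
        my $nth = 1;

        return sub
        {
            # nothing left to hand out
            return unless @netmasks;

            my $next_ip = $netmasks[0]->nth( $nth++ );

            # finished this block; move to the next one
            if ( $next_ip eq $netmasks[0]->last( ) )
            {
                shift @netmasks;
                $nth = 1;
            }

            return $next_ip;
        }
    }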

Running the Hack

Pass create_generator( ) a list of IP network blocks and netmasks and it will return a function reference that, when called, returns either the next address in the series or the undefined value if you've exhausted everything. It does this by closing over two variables, the list of Net::Netmask objects in @netmasks and a counter variable $nth. The latter represents the current position in the list of available addresses for the current Net::Netmask object. Because $nth starts at 1, the generator skips each block's base address, which nth( 0 ) would return.
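If the closure trick seems opaque, it may help to see it without the networking. Here is a minimal sketch of the same pattern over plain integer ranges, using only core Perl. The name make_range_iterator and its list-of-pairs argument format are made up for this illustration, and it assumes each range's start is no greater than its end.

```perl
use strict;
use warnings;

# Hypothetical helper: iterate over several inclusive integer ranges,
# for example make_range_iterator( [ 1, 3 ], [ 10, 11 ] ).
# Assumes start <= end for every range.
sub make_range_iterator
{
    my @ranges = map { [ @$_ ] } @_;    # private copy of the remaining ranges
    my $offset = 0;                     # position within the current range

    return sub
    {
        return unless @ranges;          # exhausted: undef in scalar context

        my ( $start, $end ) = @{ $ranges[0] };
        my $value           = $start + $offset++;

        if ( $value == $end )           # finished this range; move on
        {
            shift @ranges;
            $offset = 0;
        }

        return $value;
    };
}

my $next = make_range_iterator( [ 1, 3 ], [ 10, 11 ] );
my @seen;

while ( defined( my $n = $next->() ) )
{
    push @seen, $n;
}

print "@seen\n";    # prints "1 2 3 10 11"
```

Both @ranges and $offset live on between calls because the anonymous subroutine holds on to them, and each call to make_range_iterator gets its own private pair. The loop tests defined( ) because a range could legitimately produce 0; an iterator that returns IP address strings never hits that problem, since every address is a true value.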

To test an IP address for an active device, just pull a new address from the iterator by executing it:

    my $next_address = create_generator( '192.168.1.0/8', '10.0.0.0/16' );

    while (my $address = $next_address->( ))
    {
        # try to communicate with machine at $address
    }
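What goes inside that loop depends on your network, so one option is to keep the probing logic separate from the iteration. The following is only a sketch: find_active is a hypothetical helper, and the iterator and probe here are stand-ins so that the example runs without touching a network.

```perl
use strict;
use warnings;

# Hypothetical helper: drain an iterator, keeping the addresses for
# which the supplied probe callback returns a true value.
sub find_active
{
    my ( $next_address, $probe ) = @_;
    my @active;

    while ( defined( my $address = $next_address->() ) )
    {
        push @active, $address if $probe->( $address );
    }

    return @active;
}

# Stand-in iterator and probe, so the sketch runs offline.
my @pool    = qw( 10.0.0.1 10.0.0.2 10.0.0.3 );
my $next    = sub { return shift @pool };
my %answers = ( '10.0.0.2' => 1 );

my @active = find_active( $next, sub { $answers{ $_[0] } } );

print "@active\n";    # prints "10.0.0.2"
```

In real use, you would pass the function reference from create_generator( ) as the first argument. The probe might wrap the core Net::Ping module, for instance by creating a pinger with Net::Ping->new( 'tcp', 2 ) and calling its ping( ) method on each address. Whether a TCP or ICMP probe suits your devices is something only you can decide.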

If you have a huge group of addresses to check, this is much more memory- and time-friendly than generating a list of hundreds of thousands of addresses all at once.

Hacking the Hack

With a generator as large as this one and the inevitable delay for network communication, you might want a way to suspend and resume from a certain point. If you turned the generator function reference into an object, you could add a serialize( ) or store( ) method that saves the current state. Then you can resume from almost any point. All you need to save is the base( ) and bits( ) information from each active Net::Netmask object (presumably in the proper order) and the current value of $nth.
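Here is one way the object version might look. Treat it as a sketch under some loud assumptions: the class name ResumableScan and its methods are invented for this example, and to stay free of CPAN dependencies it does the IPv4 arithmetic by hand rather than through Net::Netmask. It also assumes a 64-bit Perl, base addresses that are properly aligned to their prefix, and blocks with at least two addresses. The saved state is exactly what the text describes: the base and bits of each remaining block, in order, plus the current $nth.

```perl
use strict;
use warnings;
use Storable ();

package ResumableScan;    # hypothetical class name

sub new
{
    my ( $class, @blocks ) = @_;

    # Store each block as [ base address as a 32-bit integer, prefix bits ].
    my @parsed = map {
        my ( $base, $bits ) = split m{/}, $_;
        [ unpack( 'N', pack( 'C4', split /\./, $base ) ), $bits ];
    } @blocks;

    return bless { blocks => \@parsed, nth => 1 }, $class;
}

# Same stepping rule as the closure: start at offset 1 and move to the
# next block after handing out the last address of the current one.
sub next_address
{
    my $self  = shift;
    my $block = $self->{blocks}[0] or return;

    my ( $base, $bits ) = @$block;
    my $last = $base + ( 1 << ( 32 - $bits ) ) - 1;
    my $ip   = $base + $self->{nth}++;

    if ( $ip == $last )
    {
        shift @{ $self->{blocks} };
        $self->{nth} = 1;
    }

    return join '.', unpack 'C4', pack 'N', $ip;
}

# Serialize the remaining blocks and the current position.
sub save_state
{
    my $self = shift;
    return Storable::freeze(
        { blocks => $self->{blocks}, nth => $self->{nth} } );
}

# Rebuild an object from a string produced by save_state().
sub from_state
{
    my ( $class, $frozen ) = @_;
    return bless Storable::thaw( $frozen ), $class;
}

package main;

my $scan   = ResumableScan->new( '10.0.0.0/30', '192.168.1.0/30' );
my @first  = ( $scan->next_address, $scan->next_address );
my $frozen = $scan->save_state;

# Later, perhaps in another process:
my $resumed = ResumableScan->from_state( $frozen );
my @rest;

while ( defined( my $address = $resumed->next_address ) )
{
    push @rest, $address;
}

print "@first | @rest\n";
# prints "10.0.0.1 10.0.0.2 | 10.0.0.3 192.168.1.1 192.168.1.2 192.168.1.3"
```

The frozen string can go anywhere a scalar can. If you would rather write straight to disk, Storable's store( ) and retrieve( ) functions do the same job with a filename.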

Of course, in a program that probably has network communication as its most significant bottleneck, you may want to check several addresses in parallel. "Pull Multiple Values from an Iterator" [Hack #27] can help.

Mark Jason Dominus's Higher-Order Perl (Morgan Kaufmann, 2005) shows how to use functional programming techniques in Perl, including iterators and generators. This book is worth studying in detail.



Perl Hacks
Perl Hacks: Tips & Tools for Programming, Debugging, and Surviving
ISBN: 0596526741
Year: 2006
Pages: 141
