Preface

When the Web began, it was a pretty small place. It didn't take much to keep abreast of new sites, and with subject indexes like the fledgling Yahoo! and NCSA's "What's New" page, you could actually give keeping up with newly added pages the old college try.

Now, even the biggest search engines (yes, even Google) admit they don't index the entire Web. It's simply not possible. At the same time, the Web is more compelling than ever. More information is being put online at a faster clip, be it up-to-the-minute data or large collections of old materials finding an online home. The Web is more browsable, more searchable, and more useful than it ever was when it was still small. That said, we, its users, can only go so fast when searching, processing, and taking in information.

Thankfully, spidering allows us to bring a bit of sanity to the wealth of information available. Spidering is the process of automating the grabbing and sifting of information on the Web, saving us the trouble of having to browse it all manually. Spiders range in complexity from the simplest script to grab the latest weather information from a web page, to the armies of complex spiders working in concert with one another, searching, cataloging, and indexing the Web's more than three billion resources for a search engine like Google.

This book teaches you the methodologies and algorithms behind spiders and the variety of ways that spiders can be used. Hopefully, it will inspire you to come up with some useful spiders of your own.



Spidering Hacks
ISBN: 0596005776
Year: 2005
Pages: 157