The Google Crawl




As with most search engines, Google’s work has two parts: searching the Web and building an index. When you enter a search request, Google doesn’t really go onto the Web to find matching sites. Instead, it searches its index for matches. Google stands out at both ends of this process: first in the scope of its Web searching (and therefore the size of its index), and second in the method by which it matches keywords to Web pages stored in the index.
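
To make the idea of searching an index rather than the live Web concrete, here is a minimal sketch in Python. It is a toy illustration, not Google's actual method: the page addresses, the page text, and the function names build_index and search are all made up for this example, and the matching rule (every query word must appear on the page) is an assumption.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query, sorted."""
    words = query.lower().split()
    if not words:
        return []
    # Start with pages matching the first word, then narrow word by word.
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

# Hypothetical pages standing in for content gathered by a crawl.
PAGES = {
    "example.com/a": "fresh coffee beans",
    "example.com/b": "coffee grinder reviews",
    "example.com/c": "tea kettle reviews",
}

# The index is built once, ahead of time; searches only consult the index.
INDEX = build_index(PAGES)

print(search(INDEX, "coffee reviews"))
```

Notice that search never looks at PAGES. Once the index exists, every query is answered from it, which is why a page that has not been crawled and indexed cannot turn up in the results.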

Most search engine indexes start with an automatic, wide-flung search of the Web, conducted by automated software fancifully called a spider or crawler. Google’s crawl is farther-flung than most, resulting in an index that includes between three and four billion Web pages, as of this writing.
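
A crawler's basic job of following links from page to page can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the link graph below is invented, the crawl function is hypothetical, and a real spider fetches pages over the network instead of reading them from a dictionary.

```python
from collections import deque

# Hypothetical link graph: each page lists the pages it links to.
LINKS = {
    "home": ["news", "about"],
    "news": ["story1", "story2", "home"],
    "about": [],
    "story1": ["story2"],
    "story2": [],
}

def crawl(start, links, max_depth=None):
    """Breadth-first crawl from start; returns pages in the order visited.

    max_depth limits how many links away from start the crawl may go;
    None means no limit.
    """
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        # Do not follow links out of pages already at the depth limit.
        if max_depth is not None and depth >= max_depth:
            continue
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order

print(crawl("home", LINKS))               # full crawl of the toy graph
print(crawl("home", LINKS, max_depth=1))  # stops one link away from "home"
```

The seen set keeps the crawler from visiting the same page twice when pages link back to each other, as "news" links back to "home" here.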

Google performs two levels of Web crawl. The main survey, often referred to as Google’s deep crawl, is conducted once a month. Google’s spider takes slightly more than a week to accomplish its profound examination of the Web. Then, as a bonus, Google launches a so-called fresh crawl much more frequently. The fresh crawl is an experimental update to Google’s index that began in mid-2002 and runs almost every day, at the company’s discretion. Naturally, the fresh crawl is shallower than the deep crawl and is designed to pick up new material from sites that change often. Material gleaned from the fresh crawl is added to the main Google index, though the schedule for the incorporation of new pages is a company secret.
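
The difference between the two crawls can be sketched as follows. This is only an illustration: the text above says the fresh crawl targets sites that change often, but how Google picks those sites and when it merges the results are not public, so the changes_often set, the page addresses, and both function names are assumptions.

```python
def deep_crawl(web):
    """Full survey: take a snapshot of every known page."""
    return dict(web)

def fresh_crawl(web, changes_often):
    """Shallow pass: re-fetch only the pages flagged as changing often."""
    return {url: text for url, text in web.items() if url in changes_often}

# Hypothetical pages and a hypothetical "changes often" flag.
web = {
    "news.example/front": "Monday headlines",
    "docs.example/manual": "Version 1 manual",
}
changes_often = {"news.example/front"}

# The deep crawl builds the main index.
index = deep_crawl(web)

# Both pages change on the live Web after the deep crawl.
web["news.example/front"] = "Tuesday headlines"
web["docs.example/manual"] = "Version 1 manual, revised"

# The fresh crawl folds its targeted updates into the main index.
index.update(fresh_crawl(web, changes_often))

print(index["news.example/front"])   # updated by the fresh crawl
print(index["docs.example/manual"])  # unchanged until the next deep crawl
```

In this sketch the news page is refreshed right away, while the revised manual stays stale in the index until the next deep crawl picks it up.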

Remember 

Webmasters can see the fresh crawl in action by searching for their new content in the main Google index. The continual index shifting (sometimes called the Everflux) is all part of the Google dance described in Chapter 14. Eager Webmasters should never forget that the Everflux is unpredictable, and one should never pin one’s hopes on the Google dance. There are no guarantees in the Google index, and that includes any guarantee that a particular site will be visited by the fresh crawl. Hold fast to persistence and patience. The fresh crawl is by no means designed to provide the Google index with a comprehensive daily update of the Web. Its purpose is to freshen the index with targeted updates.






Google for Dummies
Google AdWords For Dummies
ISBN: 0470455772
Year: 2005
Pages: 188
