How Google Searches the Internet

When you search using Google, you're actually searching through an index of web pages. To gather the raw material for the index, Google's web-crawling robot, called Googlebot, sends a request to a web server for a web page. It then downloads the page. Googlebot runs on many computers simultaneously, and constantly requests and receives web pages, making thousands of requests per second. In fact, Googlebot deliberately makes requests more slowly than it is capable of, because if it operated at full throttle, it would overwhelm many web servers, and the servers would not be able to deliver pages quickly enough to users.
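The idea of holding back so that no single web server is overwhelmed can be sketched in a few lines of Python. This is only an illustration, not Googlebot's actual scheduler: the function name schedule_requests, the one-second delay, and the example.com URLs are all assumptions made for the example. It assigns each URL the earliest start time that keeps requests to the same host a minimum interval apart, while different hosts can be contacted at the same moment.

```python
from urllib.parse import urlsplit


def schedule_requests(urls, min_delay_per_host=1.0):
    """Return (start_time_in_seconds, url) pairs, sorted by start time.

    Requests to the same host are spaced at least min_delay_per_host
    seconds apart; requests to different hosts may start together.
    The delay value is an assumed placeholder, not a real crawler setting.
    """
    next_free = {}  # host -> earliest time the next request may start
    schedule = []
    for url in urls:
        host = urlsplit(url).netloc
        start = next_free.get(host, 0.0)
        schedule.append((start, url))
        next_free[host] = start + min_delay_per_host
    return sorted(schedule)


if __name__ == "__main__":
    pages = [
        "http://a.example.com/1",
        "http://a.example.com/2",
        "http://b.example.com/1",
    ]
    for start, url in schedule_requests(pages):
        print(f"t={start:.1f}s  {url}")
```

A real crawler would also adapt the delay to how quickly each server responds; the fixed delay here just keeps the sketch short.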

Webmasters who don't want their sites to be searchable via Google can instruct Google not to index their sites. To do so, they create a text file called robots.txt containing only these two lines and put it in the root directory of the website:

User-agent: *
Disallow: /

That tells all search engines, not just Googlebot, to stay away.
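A well-behaved crawler checks these rules before it requests a page. The following minimal sketch uses Python's standard urllib.robotparser module to apply a two-line robots.txt of this kind (note the colon after each field name). The bot name "ExampleBot" and the www.example.com address are placeholders, and real search engines use their own parsers rather than this module.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt allows user_agent to request url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("Googlebot", "ExampleBot"):  # "ExampleBot" is a made-up name
        allowed = may_fetch(ROBOTS_TXT, bot, "http://www.example.com/page.html")
        print(bot, "may fetch the page:", allowed)
```

Because the User-agent line uses the * wildcard, the answer is the same for every crawler name.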

They can also tell Googlebot or other search engines to not search their site by putting this HTML tag into the <head> section of the HTML for their web page:

<META name="ROBOTS" content="NOINDEX, NOFOLLOW" />
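To show how a crawler might notice this tag, here is a small sketch built on Python's standard html.parser module. The class and function names are invented for the example, and it illustrates the general idea rather than how Googlebot itself reads pages. It collects the comma-separated directives from any robots meta tag, ignoring upper and lower case.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html):
    """Return the set of robots directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META name="ROBOTS" content="NOINDEX, NOFOLLOW" /></head></html>'
    found = robots_directives(page)
    print("index this page:", "noindex" not in found)
    print("follow its links:", "nofollow" not in found)
```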

What Google Knows About You

This illustration shows many of the kinds of information that Google can find out about you. Note that Google does not sell this information to other sites, and does not use it to create personal profiles about you. But it gives you a sense of what the largest and most successful search engine in the world knows about you as you use it.


Chapter 29. How Map Sites Work

Need to find driving directions from Cincinnati to Seattle? Looking for Afghani restaurants in Cambridge, Massachusetts? Want to get a detailed map of Lubbock, Texas?

All that, and more, is available on mapping sites on the Internet. There are a number of mapping websites you can visit, but the two most popular are MapQuest and Google Maps. MapQuest is the older and more established of the two, and the more targeted as well. It serves up maps and directions, but not much else. Google Maps, on the other hand, also lets you find local information on the maps you're looking at, such as nearby restaurants, museums, and more.

Although mapping sites look quite different from one another, and have different interfaces, if you take a look under the hood, they all work in much the same way.

Mapping sites, as a general rule, do not actually create the underlying mapping information themselves. Instead, they get that information from a commercial provider of mapping information. These providers typically sell mapping information not only to map sites, but to private businesses that need mapping information as well.

The providers regularly update the maps they sell to mapping sites in several different ways. Commonly, they hire people to actually drive the streets, and then update their maps to reflect any new construction, changes in streets and landmarks, and so on.

Map providers give more than just raw mapping information to map sites like MapQuest and Google Maps. They also provide a database that calculates the best driving directions from one point to another. The directions are based on a variety of complex algorithms, but generally they try to find the directions that take the least amount of time to drive, rather than the shortest distance between two points.
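The difference between the quickest route and the shortest one can be shown with a small sketch. It uses Dijkstra's algorithm, a standard textbook shortest-path method chosen here for illustration; the providers' actual algorithms are not described and are certainly more elaborate. The road network, with its distances and speeds, is entirely made up, and the function names are invented for the example.

```python
import heapq

# Hypothetical road network: each road is (distance in km, typical speed in km/h).
ROADS = {
    "A": {"B": (10, 30), "C": (15, 100)},  # A-B is a city street, A-C a highway
    "B": {"D": (10, 30)},
    "C": {"D": (15, 100)},
    "D": {},
}


def distance_km(road):
    return float(road[0])


def travel_hours(road):
    return road[0] / road[1]


def best_route(roads, start, goal, cost):
    """Dijkstra's algorithm: return (total_cost, path), or None if unreachable."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        total, place, path = heapq.heappop(queue)
        if place == goal:
            return total, path
        if place in settled:
            continue
        settled.add(place)
        for neighbor, road in roads.get(place, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (total + cost(road), neighbor, path + [neighbor]))
    return None


if __name__ == "__main__":
    print("shortest:", best_route(ROADS, "A", "D", distance_km))
    print("fastest: ", best_route(ROADS, "A", "D", travel_hours))
```

In this invented network the shortest route runs through B (20 km of slow streets), while the fastest runs through C (30 km of highway). Only the cost function changes between the two searches, which is why the same routine can answer both questions.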

Map sites may make use of the same basic mapping databases, or similar databases, but the features they offer to visitors, and the interface they use to deliver those features, are quite different.

MapQuest, for example, uses a simple, basic HTML interface, and concentrates on directions and maps. Google Maps, on the other hand, uses a far more interactive interface that allows visitors to more easily zoom in and zoom out, switch to a satellite view, and navigate through maps by dragging with a mouse. Google Maps does this by using a technique called AJAX. (For more details about AJAX, see the illustration "How AJAX Works" in Chapter 19, "How Markup Languages Work.")

Google also offers a tool more sophisticated than mere maps: Google Earth. Google Earth lets you "fly" to any location on Earth in a virtual tour, using high-resolution photos and animations.