Defining Searches for Nonstandard Uses


Many people assume that they can use Google only to search for current information on Web sites. It's true that most searches return Web sites that include the search phrases you request. However, careful use of Google lets you perform other tasks as well, such as discovering whether a Web site you want to visit contains something potentially damaging, such as a virus. You can also narrow a search by origin. For example, you might be interested in hearing how users in Spain deal with spam. The only way you can perform that kind of search is to filter out hits about spam from other countries. The following sections discuss searches you can create for special reasons, such as reducing the number of hits in a language you don't understand.

Determining Web Site Information

Learning more about Web sites, especially in the virus-filled Internet of today, is essential. The "Learning More about a Site" section of Chapter 1 discussed some of the issues surrounding Web site identification and information checks. Several of the examples in the book will also discuss this very important subject from a programming perspective. However, getting the basic information you need is relatively simple. All you need to do is use the info: query word followed immediately by the Web site URL. For example, to check information on my site, you'd use info:http://www.mwt.net/~jmueller as the search term.
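As a minimal sketch of how a program might assemble this search term, the following Python helper joins the query word and the URL with no space between them. The function name info_query is hypothetical and isn't part of any Google library.

```python
def info_query(url: str) -> str:
    """Return an info: search term for the given URL."""
    # Strip stray whitespace so the query word and the URL
    # are joined with nothing in between.
    return "info:" + url.strip()


# Build the search term used in the example above.
print(info_query("http://www.mwt.net/~jmueller"))
```

Stripping whitespace first guards against the easy mistake of leaving a space after the colon, which would turn the query word into an ordinary search phrase.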

Exploring Cached Data

Google is a great repository of old information. While you can always search for the latest information a Web site provides, you might need the old piece of information you saw last week. The "Getting Old Data from the Cache" section of Chapter 1 discusses the need for this kind of search in detail. To perform this kind of search, use the cache: query word followed immediately by the Web site URL. For example, you can locate the cached version of my Web site by using cache:http://www.mwt.net/~jmueller as the search term.
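If you want to place a search term like this one in a link, the colon and slashes need URL encoding. The following Python sketch shows one way to do that with the standard library. It assumes the term travels as a q parameter on a search URL; the base address, the parameter name, and both function names are illustrative assumptions rather than anything this chapter defines.

```python
from urllib.parse import urlencode


def cache_query(url: str) -> str:
    """Return a cache: search term for the given URL."""
    return "cache:" + url.strip()


def search_url(query: str, base: str = "https://www.google.com/search") -> str:
    """Attach the search term to a base address as a q parameter.

    The base address and the parameter name are assumptions
    made for this illustration.
    """
    # urlencode escapes characters such as ':' and '/' in the term.
    return base + "?" + urlencode({"q": query})


print(search_url(cache_query("http://www.mwt.net/~jmueller")))
```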

Understanding Filtering and Restrictions

Filtering and restricting data using Google Web Services works quite differently from the same process on the Advanced Search page. In fact, the Advanced Search page doesn't provide some of the filtering and restriction options provided by Google Web Services. (It does contain most of the options, so you can use it to test the filtering and restriction features.)

Unlike many of the other search features discussed in this chapter, you don't define a restriction or a filter using a special keyword. Instead of providing these search requirements as part of the query, you include them as separate items in the request. Chapter 4 discusses the information you provide as part of a SOAP request in detail. For now, all you need to know is that the filter information appears as part of the <filter> tag of the request. Using a filter, you can tell Google to return only the first URL when several results contain the same title and snippet information. You can also restrict sites to two return values, so you don't end up with endless links from the same site.

Google supports two kinds of restrictions. The first is a language restriction that appears as part of the <lr> tag of the request. This tag tells Google which language you want to view for the results. The second tag, <restrict>, defines the country from which you want to receive information. You can also use this tag to request a special topic such as Linux.
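To make the idea of separate request items concrete, here is a minimal Python sketch that builds the three tags described above as XML elements. Only the filter, lr, and restrict tag names come from this section. The wrapper element name, the q element, and the sample values lang_es and countryES are assumptions made for illustration, and the fragment isn't a complete SOAP envelope (Chapter 4 covers the full request).

```python
import xml.etree.ElementTree as ET


def build_search_params(query: str, filter_similar: bool = True,
                        language: str = "", restrict: str = "") -> ET.Element:
    """Build a fragment holding the filter and restriction items."""
    # "doGoogleSearch" and "q" are assumed names for this sketch.
    root = ET.Element("doGoogleSearch")
    ET.SubElement(root, "q").text = query
    # XML booleans are lowercase, so convert explicitly.
    ET.SubElement(root, "filter").text = "true" if filter_similar else "false"
    # Language restriction; an empty string means no restriction.
    ET.SubElement(root, "lr").text = language
    # Country (or topic) restriction; an empty string means none.
    ET.SubElement(root, "restrict").text = restrict
    return root


# Sample values are assumptions: Spanish-language results from Spain.
params = build_search_params("spam", language="lang_es", restrict="countryES")
print(ET.tostring(params, encoding="unicode"))
```

Notice that the query text itself stays free of filter keywords. The filter and restrictions travel beside it as their own elements.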

Performing Safe Searches

Safe search techniques keep pornography at bay while you locate the actual information you need. As with other kinds of filtering, you don't define this filter as part of the search term when using Google Web Services. Chapter 4 describes the request technique you use to tell Google how to use this kind of filter. All you need to know now is that you define a safe search by setting the <safeSearch> tag content to true. The "Avoiding Pornographic Material" section of Chapter 1 describes in detail why you would want to use this particular feature.
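One detail is easy to get wrong when setting this tag from code: many languages don't write Boolean values as the lowercase true the tag expects. The Python sketch below shows the conversion, assuming the request is held as an XML element. The request element name and the set_safe_search function are hypothetical.

```python
import xml.etree.ElementTree as ET


def set_safe_search(request: ET.Element, enabled: bool) -> None:
    """Create or update the safeSearch element of a request fragment."""
    node = request.find("safeSearch")
    if node is None:
        node = ET.SubElement(request, "safeSearch")
    # str(True) would yield "True", so map to lowercase explicitly.
    node.text = "true" if enabled else "false"


# "request" is a placeholder element name used only for this sketch.
request = ET.Element("request")
set_safe_search(request, True)
print(ET.tostring(request, encoding="unicode"))
```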




Mining Google Web Services: Building Applications with the Google API
ISBN: 0782143334
Year: 2004
Pages: 157
