Discovering Uses for Google Web Services


Everyone associates Google with searches of various kinds. Many people use Google for simple searches. In fact, you can set browsers such as Internet Explorer to go directly to Google whenever you enter a set of search terms in the address bar. One way to do this is to use a tool such as Tweak UI to create special search entries. You could also install the Google Toolbar (http://toolbar.google.com/), which has the option of making your default search engine Google. However, this book doesn't go into a detailed discussion of ways to manage simple manual Google searches.

This book helps you perform complex searches quickly, more reliably, and with less effort than any manual search can provide. Power users tend to use Google for intense searches, so they usually go directly to the Google Advanced Search page at http://www.google.com/advanced_search/. (Some power users go so far as to memorize all of the special search terms Google uses so they can type everything in the basic search field.) Figure 1.1 shows an example of this page. Notice that you can manually search for a topic using a number of criteria, such as language and file format.

Figure 1.1: Use the Google Advanced Search page to get a feel for the power of the Web service.

The Google Advanced Search page is useful because it helps you understand some of the power of the Google search engine. This page points to the need for some kind of automation in searching for information online. Not only does the page accept a number of inputs, but permutations of those inputs also affect the output you see. Consequently, attempting to perform all searches manually is a time-consuming effort that many people would like to automate.

Now that you have a better idea of why you might not want to perform every search manually, it's time to consider specific ways to use Google Web Services. The following sections provide ideas on how you can improve productivity and make research easier using Google Web Services. The programming chapters of the book expand on many of these ideas by showing you how to implement them using code.

Performing Research

One of the most common uses of Google is performing research. Research searches normally begin with keywords. The problem is that some keywords are ambiguous enough that the resulting data isn't meaningful. Try using Windows as a search term and you'll see this problem at an extreme. No one would want to look through all those hits. Using multiple keywords can help, but still doesn't solve the problem in many cases. For example, a search of the keywords Visual Studio.NET Email turns up 1,110,000 hits at the time of this writing; no one would want to go through that many hits looking for an application programming example.

Using additional search terms can help. For example, let's say that you're proficient in using C# and Visual Basic. Because many products use the term Visual, you could enter just C# and Basic in the "with at least one of the words" field of the Google Advanced Search page. However, you still end up with 276,000 hits, too many for the active developer to search.

Google Web Services can help in this case because you can automate multiple searches to locate specific information. For example, say you want to use a particular class or you have a special need in the application. Performing the search manually could require multiple trips to Google. However, using Web services means you could enter the criteria once and let the application make the multiple searches for you.

Even given the speed of an automatic search, you might wonder whether it's worth the effort of using Google Web Services. However, an application can do something that a manual search can't (at least not without a lot of trouble). Once the application returns from the search, it could store the results. The application would continue with each search scenario until it finished. Then the application could analyze the various returns and create a list of most likely sites based on the results. The 1,110,000 Visual Studio.NET email hits could suddenly become 20 or 25 hits that truly have useful information.
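
The store-then-analyze idea described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the Google Web Services interface: search is a hypothetical function you supply that accepts a query string and returns a list of result URLs, and min_hits is an assumed threshold for deciding that a site is a likely candidate.

```python
from collections import Counter
from typing import Callable, Iterable


def shortlist(queries: Iterable[str],
              search: Callable[[str], list[str]],
              min_hits: int = 2) -> list[str]:
    """Run every query, store the results, and keep the URLs that recur."""
    # Store one result list per query (search is a hypothetical, caller-supplied function).
    stored = {query: search(query) for query in queries}
    counts: Counter = Counter()
    for urls in stored.values():
        counts.update(set(urls))  # count each URL at most once per query
    # Keep URLs returned by at least min_hits different queries, most frequent first.
    return [url for url, hits in counts.most_common() if hits >= min_hits]
```

Raising min_hits shortens the list, which is one way a very large result set could shrink to a couple dozen sites worth reading.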

Conducting an Expansion Search

Expansion searches help you locate all available information on a topic by playing to the features that Google provides. For example, the order of search terms is important in the way that Google interprets a search. In addition, if you work in an acronym-laden field, expanding the acronyms is important to locate all sources of information on a given topic. Consider the following permutations of a search using the keywords Visual Basic serial port.

Visual Basic Serial Port: This combination returns 132,000 hits with a first site of http://www.distiworld.com/cd-burner-to-download.htm/.

Serial Port Visual Basic: Just changing the two groups of words around reduces the number of hits to 130,000 with a first site of http://www.lvr.com/spc.htm.

Serial Port VB: Using the VB acronym reduces the number of hits further to 58,200 with a first site of http://www.control.com/1026175817/index_html.

VB Serial Port: You'd think that this number would be higher than the Serial Port VB search because of previous results. However, the number of hits is only 57,300 with a first site of http://forums.basicmicro.net/ShowPost.aspx?PostID=7638.

Four sets of keywords (and you could easily do more), four completely different results: it's not hard to understand why an expansion search could help you obtain the maximum benefit from Google. Manual expansion searches become cumbersome for a number of reasons. Repetition is one of the main causes, but there are others such as entry errors and result interpretation. You have to provide enough keywords to make a search specific, but each keyword adds an order of complexity to the expansion search.

Google Web Services steps in by letting you perform an expansion search automatically using code. You supply the four keywords; the code does the rest. By comparing the results of each expansion search, you can come up with an optimal group of sites. For example, you could verify that the site appears in every expansion search return, which tends to reduce the false positives. You can also rate the sites based on the number of times they appear and their position in the list. Although it's possible to perform this kind of data manipulation using a manual search, no one would want to do it.
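
The permutation and rating steps of an expansion search might look something like the following Python sketch. Treat it as an illustration only: search is a hypothetical caller-supplied function that returns result URLs in ranked order, and scoring a site by the reciprocal of its position is an assumed weighting, not something Google prescribes.

```python
from itertools import permutations
from typing import Callable


def expansion_queries(term_groups: list[list[str]]) -> list[str]:
    """Build every ordering of the groups, trying each alternative form of a group."""
    queries = []
    for ordering in permutations(term_groups):
        partial = [""]
        for alternatives in ordering:
            # Extend every partial query with every alternative (for example, an acronym).
            partial = [f"{head} {alt}".strip() for head in partial for alt in alternatives]
        queries.extend(partial)
    return queries


def rank_sites(queries: list[str],
               search: Callable[[str], list[str]]) -> list[tuple[str, float]]:
    """Score each URL by how often and how high it appears across all queries."""
    scores: dict[str, float] = {}
    for query in queries:
        for position, url in enumerate(search(query), start=1):
            # Assumed weighting: earlier positions contribute more to the score.
            scores[url] = scores.get(url, 0.0) + 1.0 / position
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling expansion_queries([["Visual Basic", "VB"], ["Serial Port"]]) produces the same four keyword sets discussed earlier in this section, so the only manual step left is typing the keyword groups once.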

Searching a Specific Site

Some Web sites don't provide a search engine. The site might be too small to support a search feature or hosted so the developer doesn't have access to the server's search feature. In other cases, a site does provide a search engine, but the search engine doesn't work nearly as well as Google's. You may find that the search engine fails to produce the desired results, even when you know the information exists. In both cases, you can create a site-specific search using Google Web Services.

You can perform this kind of search manually. In fact, it's not even all that time consuming. However, remembering the information you have to provide in a URL or going to Google's advanced search site every time you want to perform the search is a headache. Using Web services lets you store all of the static settings (the ones that won't change) so that all you need to know is what keywords you want to enter for that site. A site-specific search is all about convenience. Using this technique makes it easier to get the information you need without a lot of effort.

One way to use this technique is to create a search setup for your personal Web site. Many Web sites owned by individuals or the self-employed appear on hosted sites, making it impossible to add search capability with any ease. A Google Web Services application can make it easy to add a professional search service to your site, making it a lot more attractive to anyone who visits.

Another way to use this technique is to create custom search Web pages. I built one for my personal use that includes links to all my favorite coding sites. All I do now is select the site I want to search, add a few keywords, and Google Web Services takes care of all the hard work for me. Not only am I more productive, but I can stay focused on the task at hand: finding sample code. I can even make searches of multiple sites with a single click. Even though multiple searches take place in the background (a minimum of one search for each site), I only click the search button once.
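
Storing the static settings for a site-specific search could be as simple as the following Python sketch. It relies on Google's site: query operator to restrict results to one domain. Everything else is an assumption made for illustration: the SiteSearch class, its field names, and the caller-supplied search function are hypothetical and are not part of Google Web Services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SiteSearch:
    """Static settings for one site; only the keywords change between searches."""
    domain: str
    extra_terms: str = ""  # optional fixed terms, such as a language name

    def query(self, keywords: str) -> str:
        # Combine the changing keywords with the stored settings and the site: operator.
        parts = [keywords.strip(), self.extra_terms.strip(), f"site:{self.domain}"]
        return " ".join(part for part in parts if part)


def search_sites(sites: list[SiteSearch], keywords: str,
                 search: Callable[[str], list[str]]) -> dict[str, list[str]]:
    """Issue one search per stored site and group the results by domain."""
    return {site.domain: search(site.query(keywords)) for site in sites}
```

A single call to search_sites corresponds to the single click described above: one search per site runs in the background, and the results come back grouped by site.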

Learning More about a Site

What do you really know about a Web site before you visit it? This question takes many people by surprise because they have to admit that they really don't know anything about the site. However, visiting a site implies that you're willing to open yourself to anything the site can provide within the limitations of your browser. A site that contains pornographic material or a virus when you're conducting legitimate research on parts of the human anatomy is an unwelcome surprise that you could avoid.

Google provides a number of searches you can use to verify the usefulness of a Web site before you visit it. For example, you can begin by looking for keywords in the snippet and site summary that Google provides. An examination of the links for the site, along with the Web information it provides, is revealing. Figure 1.2 shows the results of an informational search on my Web site.

Figure 1.2: Informational searches help you learn more about a Web site before you visit.

You can also conduct a related links search to see how the site connects to the rest of the Internet. (Chapter 2 discusses search types in detail, so don't worry if these specialized searches are unfamiliar.) If you're truly uncertain about the usefulness of the site, you can view a cached version of the page. The cached version does contain old data, but it can help you check for objectionable terms or content without exposing yourself to as much risk. The point is that with the security problems that users face today, they need a better way to assess the risk of visiting a particular site online. A fact-finding search is very useful in keeping some types of Internet risks at bay.

Unfortunately, few users are going to take the time to perform such fact-finding before visiting a site. It's simply easier to click on the URL and go there. However, you could build a Google Web Services application that would display the search results and assess the potential of a Web site before the user visits, while maintaining single-click efficiency. When the user clicks a link, your application can check the site in the background and verify that it's reasonably safe. What the user sees is the normal sequence of events that take place when they click the link.
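
One small piece of such a background check, scanning the snippet and site summary for terms you'd rather avoid, might be sketched in Python as follows. This is a deliberately simple illustration: the function names are hypothetical, the blocklist is something you would supply yourself, and a word match says nothing definitive about whether a site is actually safe.

```python
import re


def flagged_terms(text: str, blocklist: set[str]) -> set[str]:
    """Return the blocklisted words that appear in the text, ignoring case."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return words & {term.lower() for term in blocklist}


def looks_safe(snippet: str, summary: str, blocklist: set[str]) -> bool:
    """Treat a site as reasonably safe when neither text contains a blocklisted word."""
    return not flagged_terms(f"{snippet} {summary}", blocklist)
```

An application could run this check on the search result for a link before opening it, and show a warning instead of the page when looks_safe returns False.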

Getting Old Data from the Cache

The Internet is constantly changing. In fact, it changes so fast sometimes that it's hard to keep all of the links updated. Anyone who spends any amount of time researching information online knows that even the links Google provides get outdated. However, seeing a "page not found" error message when you click that link isn't the end of the road. You can request cached data from Google. The cached information is old in many cases, but at least it's available and you can use it for whatever you need. Figure 1.3 shows a typical example of a cached data page.

Figure 1.3: Cached data searches can be very helpful, especially during research.

As with many other kinds of Google searches, you can perform a cache search manually. However, the search takes multiple keystrokes, assuming you remember to do it. Many people simply move on to the next site without thinking when they reach an error message.

A Google Web Services application can reduce the problems of the dead link. It could begin by searching for the site. If the site isn't available, the application can move on to the cached page. When Google doesn't provide a cached page (a rarity), the application can move on to related links. Even if these techniques fail, the application could use some kind of regression search. A regression search is one in which you begin with the result data and look for the information used to create the results. The point is the user wouldn't see an error message; a page of some kind would display and the user would then make the decision on the value of that page.
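
The fallback sequence described above (live page, then cached page, then related links) can be sketched as a simple chain in Python. The sketch makes no claim about how Google Web Services returns data: each fetcher is a hypothetical function you supply that returns page content as a string, or None when that source has nothing to offer.

```python
from typing import Callable, Optional

# A fetcher returns page content, or None when nothing is available from that source.
Fetcher = Callable[[str], Optional[str]]


def first_available(url: str, sources: list[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each source in order and return (source name, content) for the first that answers."""
    for name, fetch in sources:
        content = fetch(url)
        if content is not None:
            return name, content
    # Last resort: still give the user a page of some kind instead of a raw error.
    return "none", f"No live, cached, or related page was found for {url}."
```

Because the sources are just an ordered list, adding a regression search as a fourth step means appending one more (name, fetcher) pair.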

Performing Spell Checking

Interestingly, the spelling check is one of the few Google Web Services tasks that you can't perform manually. To use this feature, you send a string (up to 2,048 characters long) to Google Web Services. The Web service checks the string for spelling errors and sends the corrected string back to you.

At first, you might wonder how you would use this service. After all, it's relatively easy to find a local spell checker that won't use one or more of the 1,000 calls that Google allots to each developer per day. The answer is that you wouldn't use this service personally in most cases. However, if you're running a Web site that requests text input from users, you can use the spell checker to validate their work.

Because the data you receive as input from the user contains fewer errors, you'll also end up doing less work. For example, any database you use to maintain the user input will have fewer errors, so you'll spend less time looking for errant records.
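
A Web site that validates user input this way has to respect the two limits mentioned above: the 2,048-character string length and the 1,000 calls per day. The following Python sketch shows one way to guard both. The SpellGate class and the suggest function are hypothetical stand-ins (suggest is assumed to return a corrected string, or None when it has no correction), and the sketch leaves it to the caller to reset the counter each day.

```python
from typing import Callable, Optional

MAX_CHARS = 2048     # string length limit quoted in the text
DAILY_CALLS = 1000   # per-developer daily allotment quoted in the text


class SpellGate:
    """Wrap a spelling service, enforcing the length limit and counting calls."""

    def __init__(self, suggest: Callable[[str], Optional[str]]):
        self._suggest = suggest  # hypothetical: corrected string, or None for no change
        self.calls_used = 0      # the caller is expected to reset this once per day

    def check(self, text: str) -> str:
        if len(text) > MAX_CHARS:
            raise ValueError(f"input exceeds {MAX_CHARS} characters")
        if self.calls_used >= DAILY_CALLS:
            return text          # quota spent: accept the input unchecked
        self.calls_used += 1
        return self._suggest(text) or text
```

Passing input through unchecked once the quota runs out is a design choice made for this sketch; a site with stricter requirements might queue the text for a later check instead.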

Avoiding Pornographic Material

The Internet contains all kinds of pornographic material. No matter what your personal preferences are, this material becomes annoying at some point because it tends to get in the way of legitimate research. In addition, you don't want children to see this kind of material, and it can cause problems in the workplace. Fortunately, Google does provide a means of searching the Internet without running into too much pornographic material. In fact, you can theoretically eliminate all of it through wise keyword search choices.

Google provides an actual search feature that blocks pornographic sites based on your choice of keywords. The feature does work for the most part, unless your selection of keywords is less than perfect. For example, using breast as one of the keywords in a safe search produces a number of sites for cancer research and many forms of help or assistance. Using the standard search produces the expected results (13,000,000 of them). Unfortunately, figuring out which keywords to avoid isn't always easy.

As with many of the other tasks discussed in this section, you could perform this task manually and might even get good at it given enough time. However, Google Web Services can make the search process a lot more efficient. For example, you can create an application to perform a keyword translation to help you avoid the terms that produce pornographic results. When you couple this application with the safe search feature, all you'll receive are sites that contain the kind of information you need.
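
A keyword translation step of this kind might be sketched in Python as follows. The replacement table is something you would build yourself from experience, and the request layout is purely illustrative: build_safe_request, the query field, and the safe_search field are hypothetical names standing in for whatever your application actually sends to Google Web Services.

```python
def build_safe_request(query: str, replacements: dict[str, str]) -> dict[str, object]:
    """Translate risky keywords, then pair the result with a safe search flag."""
    # Swap each listed keyword for its replacement; leave all other words alone.
    translated = " ".join(replacements.get(word.lower(), word) for word in query.split())
    # "safe_search" is a hypothetical field name standing in for the safe search feature.
    return {"query": translated, "safe_search": True}
```

The lookup ignores case on the incoming keywords, so the keys in the replacement table should be lowercase.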




Mining Google Web Services: Building Applications with the Google API
ISBN: 0782143334
Year: 2004
Pages: 157
