Conducting a More Refined Search


Most users enter a keyword or two into Google's search box, click the Search button, and are satisfied with the results. This is a rather brute force method of searching, however, and typically generates a ton of (mostly unwanted) results.

There is a better way to search, one that generates a smaller, more targeted list of results. To generate fewer, better results, you have to refine your query using a defined series of search operators.

Don't Worry About Capitalization...

First, know that Google's searches are not case-sensitive. It doesn't matter whether you search for California or california; the results will be the same, so don't worry about applying proper capitalization. (See Figures 2.6 and 2.7 to see for yourself.)

Figure 2.6. The results of a search for the capitalized word California...


Figure 2.7. ...are identical to the results for the all-lowercase california.


Note

An operator is a symbol or word that causes a search engine to do something special with the word directly following the symbol.


Note

To be fair, in many cases the very top results will be the same no matter what the word order. The difference tends to come as you move deeper into the result listings.


...But Do Worry About Word Order

In a Google query, the order of your keywords matters. Google weights the importance of your keywords in order of appearance, so that the first keyword is considered most important, the second keyword the second most important, and so on. You'll get slightly different results for hdtv retailers chicago (shown in Figure 2.8) than you will for chicago retailers hdtv (shown in Figure 2.9).

Figure 2.8. The results of a search for hdtv retailers chicago...


Figure 2.9. ...compared to a search for chicago retailers hdtv: the first result is the same, but all the others are different.


"And" Is Assumed

Next, know that Google automatically assumes the word "and" between all the words in your query. That is, if you enter two words, it assumes you're looking for pages that include both those words: word one and word two. It doesn't return pages that include only one or the other of the words.

This is different from assuming the word "or" between the words in your query. As an example, compare the query bob AND ted with bob OR ted. In the first query, the results would include pages that mentioned both Bob and Ted. In the second query, the results would include pages that mentioned Bob alone, as well as pages that mentioned Ted alone, as well as pages that mentioned both Bob and Ted. It's a subtle difference, but an important one.
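
The difference between the two kinds of matching can be sketched in a few lines of Python. This is only an illustration over a made-up set of three pages; the names PAGES, match_all, and match_any are hypothetical and say nothing about how Google's index actually works.

```python
# Toy "index": page name -> set of lowercase words on that page (made-up data).
PAGES = {
    "page1": {"bob", "likes", "fishing"},
    "page2": {"ted", "likes", "golf"},
    "page3": {"bob", "and", "ted", "play", "chess"},
}


def match_all(keywords, pages=PAGES):
    """Implicit AND: keep only pages that contain every keyword."""
    wanted = {k.lower() for k in keywords}
    return sorted(name for name, words in pages.items() if wanted <= words)


def match_any(keywords, pages=PAGES):
    """OR: keep pages that contain at least one of the keywords."""
    wanted = {k.lower() for k in keywords}
    return sorted(name for name, words in pages.items() if wanted & words)


print(match_all(["bob", "ted"]))  # ['page3']
print(match_any(["bob", "ted"]))  # ['page1', 'page2', 'page3']
```

The "and" search returns only the page that mentions both names, while the "or" search returns all three pages.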

The upshot is that you don't have to enter the word "and" in your query. If you're searching for Bob and Ted, all you have to enter is bob ted, as shown in Figure 2.10. Google assumes the "and," and automatically includes it in its internal index search.

Figure 2.10. When you enter two keywords, an "and" is assumed between the two.


Search for One or Another Word

Similarly, if you want to conduct an "or" search (to search for pages that include one word or another word, but not necessarily both), you can use the OR operator. For example, to search for pages that talk about either Bob or Ted (but not necessarily Bob and Ted together), use the query bob OR ted, as shown in Figure 2.11. And when you use the OR operator, make sure to insert it in all uppercase, or Google will ignore it as a stop word, which we'll discuss next.

Figure 2.11. To search for pages that contain either one word or another, enter an OR between the two keywords.


Common Words Are Automatically Excluded

Speaking of the words "and" and "or," Google automatically ignores these and other small, common words in your queries. These are called stop words, and include "and," "the," "where," "how," "what," "or" (in all lowercase), and other similar words, along with certain single digits and single letters (such as "a").

Including a stop word in a search normally does nothing but slow the search down, which is why Google excises them. As an example, Google takes the query how a toaster works, removes the words "how" and "a," and creates the new, shorter query toaster works, as shown in Figure 2.12.

Figure 2.12. Take a look at the statistics bar: Google simplified the query how a toaster works to just toaster works.
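
To make the idea concrete, here is a minimal sketch of stop-word removal in Python. The STOP_WORDS set contains only the words named in this section, so treat it as an assumption rather than Google's actual list, and simplify_query is a hypothetical helper name.

```python
# Illustrative stop-word list built from the words named in the text.
# Google's real list is longer; this set is an assumption.
STOP_WORDS = {"and", "the", "where", "how", "what", "or", "a"}


def simplify_query(query):
    """Drop stop words from a query, but keep an uppercase OR operator."""
    kept = []
    for word in query.split():
        if word == "OR":
            kept.append(word)          # uppercase OR is an operator, not a stop word
        elif word.lower() in STOP_WORDS:
            continue                   # common word: silently removed
        else:
            kept.append(word)
    return " ".join(kept)


print(simplify_query("how a toaster works"))  # toaster works
print(simplify_query("bob OR ted"))           # bob OR ted
print(simplify_query("bob or ted"))           # bob ted
```

Note how the lowercase "or" disappears while the uppercase OR survives, which is why the capitalization of that one operator matters.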


If you want these common words included in your query, you have two options. You can automatically include them by using the + operator (discussed next), or you can include the common words within a phrase by enclosing the entire phrase within quotation marks (discussed a little later).

Note

The OR operator is the only Boolean operator accepted by the Google search engine. (Boolean operators come from Boolean logic and mathematics.) The Boolean AND operator is assumed in all Google searches; the Boolean NOT operator is replaced by the Google - operator, discussed a little later in this chapter.


Always Include Stop Words

You can override the stop word exclusion by telling Google that it must include specific words in the query. You do this with the + operator, in front of the otherwise excluded word. For example, to include the word "how" in your query, you'd enter +how, as shown in Figure 2.13. Be sure to include a space before the + sign, but not after it.

Figure 2.13. Enter the + operator before any stop word you want to include in your query.


Exclude Words from Results

Sometimes you want to refine your results by excluding pages that include a specific word. You can exclude words from your search by using the - operator; any word in your query preceded by the - sign is automatically excluded from the search results. Remember to always include a space before the - sign, and none after.

For example, if you search for bass, you could get pages about the type of male singer or about the type of fish. If you want to search for the type of singer only, enter a query that looks like this: bass -fish (as shown in Figure 2.14).

Figure 2.14. Use the - operator to exclude a word from your search results.
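
The way the + and - operators change a query can be sketched as a tiny parser. The function names parse_terms and page_matches are hypothetical, and the matching is deliberately simple (whole words, no punctuation handling); it only illustrates the rule that every plain or +word must be present and no -word may be.

```python
def parse_terms(query):
    """Split a query into plain, forced (+word), and excluded (-word) terms."""
    plain, forced, excluded = [], [], []
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            forced.append(token[1:].lower())
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            plain.append(token.lower())
    return plain, forced, excluded


def page_matches(page_text, query):
    """True if the page has every required word and none of the excluded ones."""
    words = set(page_text.lower().split())
    plain, forced, excluded = parse_terms(query)
    has_required = all(w in words for w in plain + forced)
    has_excluded = any(w in words for w in excluded)
    return has_required and not has_excluded


print(page_matches("the bass singer hit a low note", "bass -fish"))  # True
print(page_matches("bass fish caught in the lake", "bass -fish"))    # False
```

Because the operator must touch the word it modifies, the parser only treats a + or - as an operator when it is the first character of a token.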


Take Advantage of Automatic Word Stemming

Unlike some other search engines, Google doesn't let you use wildcards to indicate the variable ends of words. Wildcards, as used elsewhere, let you search for all words that include the first part of a keyword; for example, a search for book* (with the * wildcard) would typically return results for "books," "bookstore," "bookkeeper," and so on.

Instead, Google incorporates automatic word stemming, which is a fancy way of saying that Google automatically searches for all possible word variations. This is a great way to search for both singular and plural forms of a word, as well as different tenses and forms.

For example, a search for the word monster will return both "monster" (singular) and "monsters" (plural). A search for rain will return "rain" (current tense), "rained" (past tense), and "rains" (active form). And the word stemming works in the opposite direction, too; a search for rains will return both the words "rains" and "rain."
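
As a rough illustration of what stemming means, the sketch below strips a few common English suffixes so that different forms of a word compare as equal. This is a deliberately naive approach chosen for brevity; it is not Google's algorithm, and the names SUFFIXES, naive_stem, and same_stem are hypothetical.

```python
# A few common suffixes, longest first. Real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    """Strip one known suffix, as long as at least three letters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def same_stem(a, b):
    """True if two words reduce to the same stem."""
    return naive_stem(a) == naive_stem(b)


print(same_stem("rain", "rained"))      # True
print(same_stem("rains", "rain"))       # True
print(same_stem("monsters", "monster")) # True
print(same_stem("book", "bookstore"))   # False
```

The last line shows how stemming differs from a wildcard: "bookstore" is a different word, not a variation of "book," so it does not match.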

Search for Similar Words

Not sure you're thinking of the right word for a query? Do you worry that some web pages might use alternative words to describe what you're thinking of? Fortunately, Google lets you search for similar words by using the ~ operator. Just include the ~ character before the word in question, and Google will search for all pages that include that word and all appropriate synonyms.

For example, to search for words that are like the word "elderly," enter the query ~elderly, as shown in Figure 2.15. This will find pages that include not just the word "elderly," but also the words "senior," "older," "aged," and so on.

Figure 2.15. Use the ~ operator to include synonyms in your search.


Tip

To list only synonyms, without returning a ton of matches for the original word, combine the ~ operator with the - operator, like this: ~keyword -keyword. This excludes the original word from the synonymous results. Using the previous example, to list only synonyms for the word "elderly," enter ~elderly -elderly.


Search for an Exact Phrase

When you're searching for an exact phrase, you won't get the best results simply by entering all the words in the phrase as your query. Google might return results including the phrase, but it will also return results that include all those words, but not necessarily in that exact order.

When you want to search for an exact phrase, you should enclose the entire phrase in quotation marks. This tells Google to search for the precise keywords in the prescribed order.

For example, if you're searching for Monty Python, you could enter monty python as your query, and you'd get acceptable results; the results will include pages that contain both the words "monty" and "python." But these results will include not only pages about the British comedy troupe, but also pages about snakes named Monty, and guys named Monty who have snakes for pets, and any other pages where the words "monty" and "python" occur anywhere in the page, even if they don't appear adjacent to one another. To limit the results just to pages about the Monty Python troupe, you want to search for pages that include the two words in that precise order as a phrase. So you should enter the query "monty python", making sure to surround the phrase with the quotation marks, as shown in Figure 2.16. This way, if the word "monty" occurs at the top of a page and "python" occurs at the bottom, the page won't be listed in the search results.

Figure 2.16. To search for an exact phrase, include it in quotation marks.
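
The difference between matching all the words and matching the exact phrase can be sketched as follows. Both functions are hypothetical helpers that work on whitespace-separated words and ignore punctuation, which is a simplification.

```python
def contains_all_words(text, query):
    """Unquoted query: every word must appear somewhere in the text."""
    words = text.lower().split()
    return all(w in words for w in query.lower().split())


def contains_phrase(text, phrase):
    """Quoted query: the words must appear side by side, in the same order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


troupe = "monty python was a british comedy troupe"
snake = "my friend monty keeps a pet python"

print(contains_all_words(snake, "monty python"))  # True
print(contains_phrase(snake, "monty python"))     # False
print(contains_phrase(troupe, "monty python"))    # True
```

The page about the pet snake satisfies the plain two-word query but fails the phrase test, which is exactly the filtering the quotation marks provide.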


Use Wildcards to Search for Missing Words in an Exact Phrase

I noted previously that Google doesn't use wildcards to complete missing letters in keywords. However, Google does let you use whole-word wildcards within a phrase search. That is, you can search for a complete phrase even if you're not sure of all the words in the phrase. You let the * wildcard character stand in for those words you don't know.

Here's an example. Let's say you want to search for pages that discuss Martin Luther King's famous "I have a dream" speech, but you're not sure whether he has, had, or have that dream. So you use the * wildcard to stand in for the word in question, and enter the following query: "i * a dream" (as shown in Figure 2.17).

Figure 2.17. Use the * wildcard to search for missing words in a phrase.


You can even use multiple wildcards within a single phrase, within reason. While "* * a dream" might return acceptable results, "* * * dream" is a fairly useless query.
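
One way to picture a whole-word wildcard is as a regular expression in which each * matches exactly one word. The sketch below makes that assumption explicit; phrase_to_regex is a hypothetical helper, and Google's own matching may be more lenient about how many words a wildcard covers.

```python
import re


def phrase_to_regex(phrase):
    """Turn a quoted phrase with * wildcards into a compiled regex.

    Assumption: each * stands for exactly one word.
    """
    parts = []
    for token in phrase.lower().split():
        if token == "*":
            parts.append(r"\w+")            # any single word
        else:
            parts.append(re.escape(token))  # literal word
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b")


pattern = phrase_to_regex("i * a dream")
print(bool(pattern.search("i have a dream that one day")))  # True
print(bool(pattern.search("i had a dream last night")))     # True
print(bool(pattern.search("i a dream")))                    # False
```

The last test fails because the wildcard has to stand in for a word that is actually there.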

Search for Words That Don't Appear Together

Here's another usage of the * whole-word wildcard. If you want to search for documents where two words don't appear side-by-side, insert the * operator between the two keywords in your query, while still surrounding both keywords with quotation marks. This searches for instances where the two keywords are separated by one or more words.

For example, to search for pages where the words "happy" and "holidays" aren't adjacent, enter this query: "happy * holidays" (as shown in Figure 2.18).

Figure 2.18. You can also use the * wildcard to search for words that aren't adjacent.


Narrow Your Search to Specific File Types

Google can search for information contained in all sorts of documents, not just HTML web pages. In particular, Google searches for the following file types and extensions, in addition to normal web pages:

  • Adobe Portable Document Format (PDF)

  • Adobe PostScript (PS)

  • Lotus 1-2-3 (WK1, WK2, WK3, WK4, WK5, WKI, WKS, WKU)

  • Lotus WordPro (LWP)

  • MacWrite (MW)

  • Microsoft Excel (XLS)

  • Microsoft PowerPoint (PPT)

  • Microsoft Word (DOC)

  • Microsoft Works (WDB, WKS, WPS)

  • Microsoft Write (WRI)

  • Rich Text Format (RTF)

  • Shockwave Flash (SWF)

  • Text (ANS, TXT)

If you want to restrict your results to a specific file type, use the filetype: operator followed by the file extension, with no space in between, in this format: filetype:filetype. For example, if you want to search only for Microsoft Word documents, enter filetype:doc along with the rest of your query (as shown in Figure 2.19).

Figure 2.19. Use the filetype: operator to limit your search to specific types of documents.


To eliminate a particular file type from your search results, use the filetype: operator preceded by the - operator and followed by the file extension, like this: -filetype:filetype. For example, if you want to eliminate PDF files from your results, enter -filetype:pdf.
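
If you build queries in a script, the two forms of the filetype: operator can be assembled with a small helper. The function with_filetype below is a hypothetical example that only produces the query text; it assumes extensions are written without a leading dot.

```python
def with_filetype(query, include=None, exclude=()):
    """Append filetype: and -filetype: operators to a query string."""
    parts = [query.strip()]
    if include:
        # Restrict results to one file type, e.g. filetype:doc
        parts.append("filetype:" + include.lower().lstrip("."))
    for ext in exclude:
        # Remove a file type from the results, e.g. -filetype:pdf
        parts.append("-filetype:" + ext.lower().lstrip("."))
    return " ".join(parts)


print(with_filetype("annual report", include="doc"))    # annual report filetype:doc
print(with_filetype("annual report", exclude=["PDF"]))  # annual report -filetype:pdf
```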

By the way, when you view a non-HTML document (something other than a web page, such as an Acrobat PDF or Word DOC file), Google displays a View As HTML link in the page listing, as shown in Figure 2.20. Clicking this link translates the original document into web page format, which often displays faster in your browser.

Figure 2.20. Click the View As HTML link to translate the document file to web page format.


Narrow Your Search to a Specific Domain or Website

Maybe you want to search only those sites within a specific top-level web domain, such as .com or .org or .edu, or, perhaps, within a specific country's domain, such as .uk (United Kingdom) or .ca (Canada). Google lets you do this by using the site: operator. Just enter the operator followed by the domain name, with no space in between, like this: site:domain.

For example, to search only those sites within the .edu domain, you'd enter site:.edu along with the rest of your query (as shown in Figure 2.21). To search only Canadian sites, enter site:.ca. Remember to put the "dot" before the domain.

Figure 2.21. Use the site: operator to limit your search to a specific top-level domain.


The site: operator can also be used to restrict your search to a specific website. In this instance, you enter the entire top-level URL, like this: site:www.website.domain. For example, to search only within my personal Molehill Group website (www.molehillgroup.com), enter site:www.molehillgroup.com (as shown in Figure 2.22). Your results will include only pages listed within the specified website.

Figure 2.22. You can also use the site: operator to restrict your search to pages on a specific website.
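
A query that uses the site: operator can also be turned into a search link from a script. The sketch below assumes the familiar https://www.google.com/search address with the query passed in a q parameter; the helper names site_query and search_url are hypothetical.

```python
from urllib.parse import urlencode

# Assumed search address; the query text travels in the "q" parameter.
SEARCH_URL = "https://www.google.com/search"


def site_query(keywords, site):
    """Combine keywords with a site: restriction (no space after the colon)."""
    return f"{keywords.strip()} site:{site.strip()}"


def search_url(query):
    """Percent-encode the query and attach it to the search address."""
    return SEARCH_URL + "?" + urlencode({"q": query})


print(site_query("hdtv", ".edu"))
# hdtv site:.edu
print(search_url(site_query("hdtv", ".edu")))
# https://www.google.com/search?q=hdtv+site%3A.edu
```

The colon is encoded as %3A and the space as + in the finished link; the search engine decodes both back into the query you typed.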


Narrow Your Search to Words in the Page's Title

Google offers two methods for restricting your search to the titles of web pages, ignoring the pages' body text. If your query contains a single word, use the intitle: operator. If your query contains multiple words, use the allintitle: operator.

For example, if you want to look for pages with the word "Honda" in the title, use the intitle: operator and enter this query: intitle:honda (as shown in Figure 2.23). Make sure not to leave a space between the intitle: operator and the keyword.

Figure 2.23. Use the intitle: operator to restrict your search to web page titles only.


If you want to look for pages with both the words "Honda" and "Element" in the title, use the allintitle: operator and enter this query: allintitle:honda element. Notice that when you use the allintitle: operator, all the keywords after the operator are searched for; you separate the keywords with spaces.

Caution

If you enter intitle:honda element, Google will only search for the word "Honda" in the page titles; it will conduct a normal full-page search for the word "Element." This is why you want to use the allintitle: operator if you have multiple keywords in your query.
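
The scope of the two operators can be sketched as a small function that sorts the keywords into "must be in the title" and "may be anywhere on the page." The name split_title_scope is hypothetical, and the sketch assumes allintitle: appears at the start of the query.

```python
def split_title_scope(query):
    """Return (title_words, anywhere_words) for an intitle:/allintitle: query."""
    tokens = query.split()
    title_words, anywhere_words = [], []

    # allintitle: applies to every keyword that follows it.
    if tokens and tokens[0].lower().startswith("allintitle:"):
        first = tokens[0][len("allintitle:"):]
        title_words = [t for t in [first] + tokens[1:] if t]
        return title_words, anywhere_words

    # intitle: applies only to the single word attached to it.
    for token in tokens:
        if token.lower().startswith("intitle:"):
            word = token[len("intitle:"):]
            if word:
                title_words.append(word)
        else:
            anywhere_words.append(token)
    return title_words, anywhere_words


print(split_title_scope("intitle:honda element"))     # (['honda'], ['element'])
print(split_title_scope("allintitle:honda element"))  # (['honda', 'element'], [])
```

With intitle:, the second keyword falls through to the normal full-page search; with allintitle:, both keywords are confined to the title.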


Narrow Your Search to Words in the Page's URL

Similar to the intitle: and allintitle: operators are the inurl: and allinurl: operators. These operators let you restrict your search to words that appear in web page addresses (URLs). You use these operators in the same fashion, with inurl: to search for single words and allinurl: to search for multiple words.

For example, to search for sites that have the word "molehill" in their URLs, enter this query: inurl:molehill (as shown in Figure 2.24). Make sure not to leave a space between the inurl: operator and the keyword.

Figure 2.24. Use the inurl: operator to restrict your search to web page addresses.


To search for sites that have both the words "molehill" and "group," enter this query: allinurl:molehill group. As with the allintitle: operator, all the keywords you enter after the allinurl: operator are searched for; you separate the keywords with spaces.

Narrow Your Search to Words in the Page's Body Text

For all this fuss about searching titles and URLs, it's more likely that you'll want to search the body text of web pages. You can restrict your search to body text only (excluding the page title, URL, and link text), by using the intext: and allintext: operators. The syntax is the same as the previous operators; use intext: to search for single words and allintext: to search for multiple words.

For example, to search for pages that include the word "Google" in their body text, enter the query intext:google (as shown in Figure 2.25). Make sure not to leave a space between the intext: operator and the keyword.

Figure 2.25. Use the intext: operator to restrict your search to just the body of web pages.


To search for pages that include both the words "Google" and "search" in the body text, enter the query allintext:google search.

Narrow Your Search to Words in the Page's Link Text

There are two more operators similar to the previous batch. The inanchor: operator lets you restrict your search to words in the link, or anchor, text on a web page. (This is the text that accompanies a hypertext link, the underlined text on the page.) The allinanchor: variation lets you search for multiple words in the anchor text.

For example, to search for links that reference the word "goose," you'd enter inanchor:goose (as shown in Figure 2.26). Make sure not to leave a space between the inanchor: operator and the keyword.

Figure 2.26. Use the inanchor: operator to search for links that reference the given keyword.


To search for links that reference the words "goose" and "duck," enter the query allinanchor:goose duck.

Search for a Range of Numbers

What if you want to search for pages that contain items for sale within a certain price range? Or selected back issues of a magazine?

For these tasks, use Google's .. operator (two periods, with no spaces). All you have to do is enter the lower number in the range, followed by the .. operator, followed by the higher number in the range. For example, when you enter 100..150 (as shown in Figure 2.27), you search for pages that include the numbers 100, 101, 102, and so forth on up to 150.

Figure 2.27. Searching for a range of numbers with the .. operator.
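
What a number-range query means can be sketched in a few lines. The helpers parse_range and in_range are hypothetical; the parser accepts any run of two or more periods between the two numbers, which is an assumption made so the sketch tolerates slightly different ways of typing the operator.

```python
import re


def parse_range(query):
    """Split a range query such as lower..upper into two integers."""
    low, high = re.split(r"\.{2,}", query.strip(), maxsplit=1)
    return int(low), int(high)


def in_range(number, query):
    """True if the number falls inside the range, end points included."""
    low, high = parse_range(query)
    return low <= number <= high


print(parse_range("100..150"))    # (100, 150)
print(in_range(125, "100..150"))  # True
print(in_range(151, "100..150"))  # False
```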


Travel Way Back in Time for Your Search

When you want to search for pages created between two specific dates, you can use Google's daterange: operator, assuming you can live with its quirks.

When you use the daterange: operator, Google restricts its search to web pages that match the dates you enter. Know, however, that Google dates the pages in its index based on when it indexed them, not when the pages were actually created. So if a page was created sometime back in 2001 but Google didn't get around to indexing it until May 12, 2002, it will be dated May 12, 2002. It's an imperfect way to approach this issue, but it's the only one that Google offers.

There's another catch to using the daterange: operator, and this is the killer for most users. To use the daterange: operator, you have to express the date as a Julian date, not our standard month-day-year format. Julian chronology is a continuous count of dates since January 1, 4713 BC, so that July 8, 2002 is Julian date 2452463.5. There isn't really a good reason for the enforcement of Julian dates, other than that it's the simplest brute force method from a mathematics perspective. It's obviously much easier to calculate a single number than a series of three numbers (day/month/year).

If you insist on using the daterange: operator, your query syntax should look like this: daterange:startdate-enddate. I won't bother with an example.
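
If you'd rather calculate a Julian date yourself, the arithmetic is short. The sketch below relies on Python's date.toordinal(), which counts days from January 1 of year 1; adding a fixed offset gives the Julian date at midnight at the start of that day. The names JD_OFFSET and julian_date are hypothetical, and the text doesn't say whether Google wants the fractional .5 or a whole number, so check that before pasting the result into a daterange: query.

```python
from datetime import date

# Offset between Python's day count (0001-01-01 is day 1) and the
# Julian date at midnight at the start of the same calendar day.
JD_OFFSET = 1721424.5


def julian_date(d):
    """Return the Julian date for midnight at the start of the given day."""
    return d.toordinal() + JD_OFFSET


print(julian_date(date(2002, 7, 8)))  # 2452463.5
```

The result for July 8, 2002 agrees with the figure quoted above.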

Tip

To calculate Julian dates, go to the Julian Date Calculator at the U.S. Naval Observatory website (aa.usno.navy.mil/data/docs/JulianDate.html).


List Pages That Link to a Specific Page

Want to know which other web pages are linking to a specific page? Because Google works by tracking page links, this is easy to find out. All you have to do is use the link: operator, like this: link:URL. For example, to see the thousands of pages that link to Microsoft's website, enter link:www.microsoft.com (as shown in Figure 2.28).

Figure 2.28. Use the link: operator to find all pages that link to a specific page.


List Similar Pages

Have you ever found a web page you really like, and then wondered if there were any more like it? Wonder no more; you can use Google's related: operator to display pages that are in some way similar to the specified page. For example, if you really like the news stories on CNET's News.com website (www.news.com), you can find similar pages by entering related:www.news.com (as shown in Figure 2.29).

Figure 2.29. Use the related: operator to search for pages that are somehow like a given page.


Find Out More About a Specific Page

Google collects a variety of information about the web pages it indexes. In particular, Google can tell you which pages link to that page (see the link: operator, discussed previously), which pages that page links to, which pages are similar to that page (the related: operator), and which pages contain that page's URL. To get links to all this information on a single page (plus a link to Google's cached version of that page), use Google's info: operator. This displays a set of links, like those shown in Figure 2.30, that you can click to obtain the desired page info.

Figure 2.30. The results of applying the info: operator to the author's www.molehillgroup.com website.


Highlight Keywords

If you want to highlight all the instances of the keywords you searched for in a document, use the cache: operator, followed by the site's URL. This displays the cached version of the web page, with the keywords in your query highlighted in yellow, as shown in Figure 2.31.

Figure 2.31. Highlighting search keywords with the cache: operator. (Note the summary at the top of the cached page.)


For example, to highlight all instances of the keyword "windows" on the www.microsoft.com website, enter this query: cache:www.microsoft.com windows.




Googlepedia: The Ultimate Google Resource
ISBN: 078973639X
EAN: 2147483647
Year: 2004
Pages: 370
