As you learned in Chapter 10, "Get Your Site Indexed," it can be a challenge just getting your page into the search index. When you do, you face your next challenge: what the search engine thinks of your page. An organic search engine categorizes your page according to dozens of criteria, some driven by explicit tagging, but others based on judgments the search engine makes from an analysis of your page. We investigate the most important criteria search engines look for, which they use to make two different kinds of decisions: which pages to filter out of the results, and how to rank the pages that remain.
What you put on your page is your best chance to influence the search engine's decisions as it filters and ranks the search results for each query. We check out filtering now.
Searchers use search filters to set their search's scope. Pages that are not included by the filters for a query do not appear in the results. For example, a searcher using the Yahoo! Australia and New Zealand site can choose to search the entire Web, or just Web pages from Australia or those from New Zealand. If the searcher's query for "digital cameras" limits results to pages from Australia, no pages from outside Australia will be shown, regardless of how closely they relate to the "digital cameras" search, because they are excluded by the Australian country filter.
The two most important search filters are for language and country, but we look at others, too.
Figure 12-1. Language filtering in Google. Searchers can choose to limit results to their local language.
In some countries, such as Japan and China, the vast majority of searchers want their results limited to their native languages; in other places, however, such as Sweden, searches can be conducted in Swedish or English. Searchers in different countries have different preferences.
For the search marketer, what is important is that the search engines know the language of your page. If your page is not correctly identified, it will be missing from searches that should include it, lowering your referrals.
So how do search engines decide the language of your page? There are several different methods:
For the most part, search engines will correctly detect the language of your pages without any action on your part. For pages with very few words, it is important that the language and character set metatags on your page be encoded correctly to ensure that your pages are identified in the proper language.
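To check what a page declares about itself, a minimal Python sketch like the one below can read the language and character set tags from the HTML. It is an illustration only: the tags it reads (the `lang` attribute on `html`, the `Content-Language` metatag, and the `charset` metatag) are common conventions, but which of them any particular search engine consults is an assumption, and the names `DeclaredLanguageParser`, `declared_language`, and `SAMPLE_PAGE` are hypothetical.

```python
from html.parser import HTMLParser


class DeclaredLanguageParser(HTMLParser):
    """Collects the language and character set that a page declares in its tags."""

    def __init__(self):
        super().__init__()
        self.language = None
        self.charset = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html" and attributes.get("lang"):
            # <html lang="de">
            self.language = attributes["lang"]
        elif tag == "meta":
            if attributes.get("charset"):
                # <meta charset="utf-8">
                self.charset = attributes["charset"]
            elif (attributes.get("http-equiv") or "").lower() == "content-language":
                # <meta http-equiv="Content-Language" content="de">
                self.language = self.language or attributes.get("content")


def declared_language(html_text):
    """Return (language, charset) as declared by the page, or None where absent."""
    parser = DeclaredLanguageParser()
    parser.feed(html_text)
    return parser.language, parser.charset


# Hypothetical page with very few words, where the declarations matter most.
SAMPLE_PAGE = (
    '<html lang="de"><head><meta charset="utf-8">'
    "<title>Kameras</title></head><body>Digitalkameras</body></html>"
)
print(declared_language(SAMPLE_PAGE))
```

A page that returns `None` for either value is relying entirely on the search engine's own analysis of its text, which is riskiest for pages with very few words.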
Country and Region Filters
Limiting results by language does not limit them to a single country. German pages exist in Germany, Austria, and Switzerland, French content in France and Canada, Spanish pages in Latin America as well as Spain (you get the idea), and English content all over the place. So, most search engines apply filters by country or by region. Searchers can always use the Advanced Search interface to specify these filters, but relatively few do. Instead, most local searches have a default filter, or allow selection between two or three filters on the search page, specifying a particular country, region, or language.
So how do search engines know which country your Web pages are from? They look at where the page is hosted and they examine the URL itself. Every Web page has a URL, and the domain (www.company.com) is resolved to an IP address, a unique number that points to a server somewhere on the Internet. Search engines can use that IP address to determine the country in which that server resides. Web pages hosted on servers within each country are part of that country's filter.
For the global search marketer in a multinational company, however, country (and region) filters can prove problematic. You want the country content on your site to be included by search engines when they filter by country, but it is not always easy to do, because of the way that the search engines' country filters work. Your multinational site is probably hosted centrally and uses com as its top-level domain, no matter what country the content is for.
Let's look at an example. Microsoft, like most multinational companies, maintains country "sites" that are part of its com top-level domain. Entering the URL www.microsoft.de redirects you to www.microsoft.com/germany, which does not qualify as a page from Germany. You see, according to the search engine's country filter rules, all pages considered to be from Germany must be hosted locally or have a top-level domain of de. Microsoft's page is hosted centrally and has no content under its de name, so the search engine indexes the content from its com domain (the www.microsoft.com/germany page). Later, when searchers limit results to pages from Germany, this page will not be found (because it is a com and not a de and is not hosted in Germany); when they limit to German language pages, however, it will be found (because it is rightly analyzed to be written in German).
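The country filter rule described here can be sketched in a few lines of Python. This is a simplified illustration, not any search engine's actual logic: the function names are hypothetical, the hosting country is passed in directly rather than looked up from an IP address, and the "us" hosting value is an assumed stand-in for "hosted centrally, outside Germany." It also treats the country code and the top-level domain as the same string, which does not hold for every country.

```python
from urllib.parse import urlparse


def top_level_domain(url):
    """Return the last label of the URL's host name, such as 'com' or 'de'."""
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def passes_country_filter(url, hosting_country, country_code):
    """Apply the rule from the text: a page counts as being from a country
    if it uses that country's top-level domain OR is hosted in that country."""
    wanted = country_code.lower()
    return top_level_domain(url) == wanted or hosting_country.lower() == wanted


# Centrally hosted com page with German content: excluded from the Germany filter.
print(passes_country_filter("http://www.microsoft.com/germany", "us", "de"))

# Hypothetical alternatives: a de domain, or a com page hosted in Germany.
print(passes_country_filter("http://www.example.de/", "us", "de"))
print(passes_country_filter("http://www.example.com/de/", "de", "de"))
```

Note that nothing in the URL path (the `/germany` part) enters the decision, which is why a country directory under a com domain does not help with this kind of filter.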
What is the global search marketer to do? First, don't panic. Many searchers understand this problem and regularly toggle between language and country filters to get what they want. If your target customers are not terribly sophisticated Web searchers, however, you might want to approach your Webmaster about changing the way your site is organized so that your country pages do use the top-level domains for each country. Or you can ask that your country pages be hosted at IP addresses within each country. Your Webmaster is unlikely to relish these suggestions, because they make your Web site harder to manage, but in the short term you might have no alternative.
In the long run, you should expect the search engines to address this issue. They are painfully aware of this problem and are taking some steps to ameliorate its impact. Some of the larger search engines already use your site's IP address to see whether your pages are hosted within the correct country, so that can help some of you. If you cannot adapt your site to use the proper top-level domains, we suggest ensuring that your URLs and content strongly reflect the country of the page. Microsoft's approach of adding germany (or de) to the URL while also placing the words Microsoft Deutschland at the top of the page (itself written in German) might someday provide enough clues for search engines to accurately discern the page's proper country. If you cannot satisfy what the search engines are currently looking for with their country filters, at least prepare your content to be as ready as possible for what they might be looking for someday. In addition, the better your content, the more likely it will draw links from sites that are included in the country filter; for some search engines, enough high-quality links from country pages can get your pages recognized as country pages, too.
Search engines offer searchers other filters that might sometimes be important to search marketers, but each search engine provides a different set.
Most search engines enable searchers to filter by type of content. Most have some kind of "picture" or "image" search, some can filter news stories, and most have an Advanced Search interface that filters by document type (such as Adobe PDF files or HTML files).
You might think it is valuable to be the #1 PDF file in Advanced Search, but it really is not, because so few searchers will take the extra time to use Advanced Search. In general, the more clicks required to execute a search, the fewer searchers will do so.
So before you get excited about these specialized searches, think like a searcher. If you work for a news organization, searchers for your site might make that extra click on the News tab in Google, so having high rankings for news stories could help you meet your search marketing goals. Image searches might be important to a seller of fine art prints. For the most part, however, none of these specialized searches are of much interest to the search marketer. (Some search engines offer a tab for shopping search, which might be important to you; we cover that in Chapter 14, "Optimize Your Paid Search Program.")
Most search engines enable searchers to set preferences that control how all of their searches work. Most preferences are unimportant to search marketers, such as the number of results on a page or whether search results open a new browser window, but one can be very important, because this preference is a filter.
The so-called Adult Content filter suppresses pornographic or otherwise sensitive material from the results. The issue for search marketers, as you might expect, is how accurate the filter is. In the past, news reports have trumpeted breast cancer sites (for example) suppressed by such filters, but modern search engines generally do a good job on these filters because of their strong text-analysis capabilities. Most search engines, by default, filter out just the most egregious scatological and sexual content, while leaving in explicit scientific and informational content. Searchers can choose a strict setting as their preference; setting it just once applies the strict filter to every later search. Search marketers whose sites might contain sensitive material might want to monitor their page rankings so that they are not unfairly filtered, and might want to police word usage on their sites to avoid being filtered. Pay special attention to message boards on your site frequented by your visitors: if your visitors use inappropriate language in their posts, your site might be snagged by this filter.
You have finished your grand tour of search filters. Depending on the nature of your site, search filters might be critically important to your efforts, or you might not have to think about them much anymore. But now it's time to pay attention, as you learn what is behind a search engine's ranking algorithm.
Search Ranking Factors
A ranking algorithm is the mathematical formula a search engine uses to score pages against the query to see which pages are the closest matches. But what goes into that formula? How can your pages get consistently high scores for your targeted keywords?
It's time to answer those questions. Chapter 2, "How Search Engines Work," explored the basic concepts behind search ranking, but in this chapter we go deeper, explaining more of what search engines are looking for and, later in this chapter, showing you practical ways to help your pages score high. As you read this, keep in mind that the highest rankings mean nothing if your pages are excluded by the filters listed above: strike out on a filter and your page is out of the results list no matter how closely it matches the query.
If your page is included by the filters for a particular query, the ranking algorithm takes over, looking at every page containing those words and deciding how your page stacks up against the others for that query. There are no right or wrong answers from the search engine; the engine tries to find the highest-quality pages matching the query. The ranking algorithm contains many factors, components that are scored for each page. If your page scores the most points, according to the ranking algorithm's factors, it will get the #1 slot in the results.
In Chapter 2, we discussed how complicated a search engine's ranking algorithm is. Because a ranking algorithm is such a closely guarded secret, no one can publicly state how many different factors a ranking algorithm weighs, but some say there are more than 100. Clearly, not all 100 factors are equally important, so we concentrate on the more important factors here.
Ranking factors come in two main varieties: page factors and query factors.
You can optimize your content for both page factors and query factors; neglecting either one will derail your search marketing program. We investigate page factors now.
Page Ranking Factors
The moniker "page factors" is a bit of a misnomer (Andrei Broder, a Distinguished Engineer at IBM Research, prefers to call them query-independent factors) because what defines them is less that they are about the page than that they are not about the query. So-called page factors can take into account anything the search engine knows about the page itself, the pages that link to that page, the site that contains the page, and many other components. What this means is that any particular page's page factor score is exactly the same for every query: a page with strong page factors starts out with a high score for every word that is on that page.
Every Web search engine uses page factors as a critical component in its ranking algorithm; Google's PageRank is the most famous example. As explained in Chapter 2, when a searcher enters a broad query, such as the word "camera," the search engine needs a way to decide which few pages, among the millions that contain the word camera, are the ones to rank at the top of the list. The pages with the most occurrences of camera are probably not what searchers want; they want the most definitive pages. Only page factors can make that determination. Let's look at the most important page factors in a search ranking algorithm:
As mentioned previously, although it is easy for us to think about these factors as relevant to just one page, search engines are more sophisticated than that. They look at links to your whole site, not just one page. They check for profanity on your whole site, even if most pages are "clean." If most of your pages are updated frequently, do not obsess about changing your "History of the Company" page every two months.
As you read the list, you might notice that you do not have a lot of control over some of these factors, and it is true, in general, that page factors are harder to influence than query factors. But you are not helpless. Although you cannot directly affect your site's popularity, for example, you can indirectly affect it in many ways: through search engine marketing, by attracting more links (as explained in Chapter 13), and many other ways of getting attention.
Query Ranking Factors
For you control freaks out there, start salivating. The query-dependent ranking factors, which we call query factors for short, are what you will spend most of your time on, as you lovingly craft each search landing page to best appeal to search engines, and (do not forget) the searchers themselves.
Page factors are constant across every query. A page with high-scoring page factors takes that score with it for every query. And although page factors are important, there must be something going on that is query related, or else the same pages would be at the top of the search results for every query. There is something going on: the query factors.
But before talking about query factors for ranking, there is one filter that we did not address back in the filtering section, because it makes more sense to discuss it now. A very powerful filter is used on every query: at least one (and typically all) of the words in the query are expected to be found on your page. If none of the words are on your page, that page is filtered out of the results list, no matter how wonderful its page factors are.
Now that sounds simple and obvious, doesn't it? Except that it is not precisely true. One exception applies to that filtering rule: your page can still appear if enough pages link to it using the query words in the anchor text of their links.
Let's look at an example. Figure 12-2 shows the search results for the word "laptop" in Google. Apple has the #3 result, but the word laptop does not appear anywhere on the page. It must be that there are so many other pages linking to this one (containing the word laptop in the link) that Google is convinced (correctly) that this is a good result for that query, even without any occurrences of the word on the page itself. If Apple placed laptop on this page, maybe it would rank #1.
Figure 12-2. The power of page factors. Once in a while, the links to a page drown out the ranking factors from the content on the page itself.
That does not happen very often, however. The vast majority of the time, your page must contain the keyword to rank highly for that query. When your page gets past that filter (and the other filters we discussed earlier), it is in the results list and ranking takes over. Each page in that list comes with its predetermined page score, such as Google's PageRank, the score associated with that page based on its cumulative page factors. Pages with high page factors get a head start in the scoring. But then the query factors take over to decide the winner.
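The keyword filter and its anchor-text exception can be sketched as follows. This is a toy model under stated assumptions: the function names are hypothetical, the `min_anchor_links` threshold is invented for illustration (the text says only that "enough" pages must link), and real search engines analyze words far more carefully than this simple split on word characters.

```python
import re


def word_set(text):
    """Lowercase the text and return its distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def passes_keyword_filter(query, page_text, inbound_anchor_texts, min_anchor_links=3):
    """Keep a page if at least one query word is on the page, or if enough
    inbound links use a query word in their anchor text (the exception)."""
    query_words = word_set(query)
    if query_words & word_set(page_text):
        return True
    supporting_links = sum(
        1 for anchor in inbound_anchor_texts if query_words & word_set(anchor)
    )
    return supporting_links >= min_anchor_links


# Hypothetical page that never uses the word "laptop" in its own text.
page_text = "Portable notebook computers for work and travel"
many_links = ["best laptop", "laptop computers", "buy a laptop"]

print(passes_keyword_filter("laptop", page_text, many_links))   # kept by the exception
print(passes_keyword_filter("laptop", page_text, ["notebooks"]))  # filtered out
```

The ordering matters: the on-page check comes first because, as the text notes, the anchor-text exception applies only rarely.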
Some query factors can apply to any query, whereas others kick in for multiple-word queries only. Here are the universal query factors:
You can see that every query undergoes complex analysis, and your pages do as well. But queries containing more than one word are evaluated in an even more complicated way, because of the interplay between the words. Take a look at the factors that apply only to multiple-word queries:
Remember that search experts spend their whole careers crafting and polishing these formulas, so no short explanation will give you a complete understanding, and you do not need one. You do need to understand the basics of what search engines are looking for, however, and now you do.
Although we have dealt with page factors and query factors separately, to make them simpler to explain, on every query the search engine mixes them together to derive the best ranking for the results list, and different queries emphasize one set over the other. For example, imagine a query such as "digital camera." There are millions of occurrences of those words, and keyword density does not help much in determining the best pages. So page factors become critically important in deciding the top ten. But for the query "maytag jetclean dishwasher manual," relatively few pages contain all of those words, especially in proximity to each other, so query factors probably drive the top results more than page factors. Obviously any page that excels in both, for a particular query, could be the #1 result. But every search engine differs, and they handle different types of queries in different ways; that's one reason they have different results for the same query.
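To make the interplay concrete, here is a deliberately tiny Python model of mixing the two kinds of factors. Everything in it is an assumption made for illustration: real ranking algorithms are secret and weigh many more factors, the simple addition of the two scores is invented, the page names and page scores are made up, and the "fraction of query words found on the page" is only a stand-in for real query factors such as proximity.

```python
import re


def word_set(text):
    """Lowercase the text and return its distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def query_score(query, page_text):
    """Stand-in query factor: the fraction of query words found on the page."""
    query_words = word_set(query)
    if not query_words:
        return 0.0
    return len(query_words & word_set(page_text)) / len(query_words)


def rank(query, pages):
    """pages maps a page name to (page_score, text). The page score is the same
    for every query; the query score changes with each query."""
    scored = []
    for name, (page_score, text) in pages.items():
        match = query_score(query, text)
        if match == 0:
            continue  # keyword filter: no query words on the page
        scored.append((page_score + match, name))
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical pages with invented page scores between 0 and 1.
pages = {
    "big_retailer": (0.9, "digital camera reviews and prices"),
    "camera_blog": (0.4, "my digital camera diary"),
    "appliance_portal": (0.9, "dishwasher reviews"),
    "small_manual_site": (0.2, "maytag jetclean dishwasher manual download"),
}

print(rank("digital camera", pages))
print(rank("maytag jetclean dishwasher manual", pages))
```

For the broad query, both matching pages contain every query word, so the page score alone separates them. For the narrow query, the small site contains all four words while the high-scoring portal contains only one, so the query score decides the order.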
As you continue your education in search marketing, you will discover many articles that answer the basic question, "Just what are search engines looking for?" Each article has slightly different answers. Sometimes the articles contradict each other. Do not be concerned about that.
Search engines are fiercely complex, and they change all the time. In addition, "what search engines are looking for" strikes to the heart of a search engine's trade secrets: its ranking algorithm. So, maybe the article's writer observed a few situations and concluded something that wasn't quite true; the search engines will never publicly divulge the truth. Or maybe what one writer wrote might have been true when it was written, but is not true anymore. Perhaps two writers performed tests with two different search engines, and what is true for one is not true for the other.
Unfortunately, divining how search works through observation is hopelessly subjective. You will read conflicting and erroneous information about what search engines are looking for, so you need to take everything you read with a grain of salt, including what you read here. We do not have any inside information. We have lots of experience, just like most of those other writers, but what we write might not be any more accurate than anyone else's story. And by the time you read this, the search engines might have added a new wrinkle. If you believe you must keep up with the ever-changing algorithms, consult the resources in Chapter 16, "What's Next?"
But maybe you can take a different approach. What is more important than the details of how any particular search engine works is your philosophy for feeding them tasty spider food. How you think about what search engines want is more important than what they actually want at any particular moment in time. If your philosophy is to outsmart them at every turn, constantly tuning your pages to fit the latest ranking algorithm, that's one philosophy, and one we do not recommend. Wouldn't you rather use an approach that does not need to be changed every week, one that people without any special training can learn and stick to? We advise a philosophy of writing for your visitor first; you will learn what that means next.