"Content is king," or at least it reigned during the Internet boom years of the late 1990s. But content is not the real change ushered in by the Web. The Web content itself is not remarkably different from what you can read in books, magazines, or newspapers. What does differ is the ease with which surfers can move from one piece of content to the next: that's the power of links. Links changed the way people consume content; they can now navigate easily across different sources that once required meandering amid library stacks (after schlepping to the library in the first place).

Just as links transformed readers' content consumption, links changed the game for search engines, too. With the advent of Google in 1998, search engines began to use links to judge the quality of every page on the Web. Although Google was the first to apply this insight to Web searches, precursors to hypertext links have historically been employed to judge the value of information.

Scientific papers have long relied on citations: references to previous papers that attest to the correctness of a basic concept. Scientists vie for the honor of having the most citations to their papers, because when later papers cite a scientist's original work, it provides a rough estimation of that original paper's value. Similarly, new patents regularly refer to "prior art" in old patents, so they can build their ideas on top of the solid ground of previous ideas. These precursors to Web links served the same purpose: they created a kind of "information economy" in which the best ideas are discovered because the most experts refer to them.

This information economy, not the content itself, is the most striking feature of the Web. Content on any subject can be created easily by anyone. Higher-quality information tends to attract more links than mediocre or poor content. When thousands or millions of pages are voluntarily linked to an article, it is a strong recommendation for its quality. You can think of the authors of each of those pages recommending that article, much the same way you might recommend a good plumber or a capable auto mechanic. When you recommend a person, you are providing someone access into your network of trusted associates. When your page links to another page, you are providing access to your network of trusted information. It is that trust, built up by the recommendations of thousands or millions of people, that allows search engines to conclude that the article in question is valuable, trustworthy information. In this way, we could say that it is not what your page knows, but who it knows: links to your page cause it to be treated with respect regardless of its actual content.

However, it is not that simple, of course. Just as that plumber must have knowledge (what he knows) to continue to attract recommendations from whom he knows, so does your page require strong information to attract links. But the links themselves are what we are interested in here. In Chapter 12, we looked at how search engines assess your content, but now let's see how they value links to and from your page.

How Web Sites Link

Links between Web pages use an HTML tag, just like all other content, as shown in Figure 13-1. The figure shows an internal link, that is, a link from one page to another within the same domain (Web site). Of more interest to search engines are external links, which connect one Web site to another, because those links indicate more impartial recommendations. Search engines use these endorsements to determine which sites have the most linked-to pages (the pages with the most inbound links). Inbound links act as a surrogate for the quality and trustworthiness of the content, which search engines cannot discern from merely looking at the words on the page.

Figure 13-1. How links are coded in HTML. Web page links use the HTML anchor tag to show the (usually) underlined text on the screen for that link.
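
To make the internal/external distinction concrete, here is a minimal Python sketch that pulls anchor tags out of a page and labels each link. The URLs and sample HTML are made up for illustration, and comparing host names exactly is an assumption: it treats www.example.com and example.com as different sites, which a real search engine may well not do.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects (href, anchor text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, anchor_text) pairs
        self._href = None    # href of the anchor currently open, if any
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def classify_links(page_url, html):
    """Label each link on the page 'internal' or 'external' by host name."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc.lower()
    results = []
    for href, text in collector.links:
        target = urljoin(page_url, href)  # resolve relative links
        same_host = urlparse(target).netloc.lower() == page_host
        results.append(("internal" if same_host else "external", target, text))
    return results


# Hypothetical page with one internal and one external link.
sample_html = (
    '<p><a href="/cameras/snapshot.html">SnapShot digital cameras</a> and '
    '<a href="http://www.example.org/reviews">camera reviews</a></p>'
)
for row in classify_links("http://www.example.com/index.html", sample_html):
    print(row)
```

The anchor text is collected alongside each link because, as discussed later in this chapter, search engines weigh it heavily.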

AltaVista, Compaq, and IBM have advanced the bow-tie theory, which holds that the Web is actually composed of four kinds of pages, each with its own peculiar linking patterns:

  • Core pages. Comprising 30 percent of the Web, these pages are the most linked-to and linked-from on the Web. The most popular Web sites tend to have many pages in this group.

  • Origination pages. Approximately 24 percent of all pages have numerous links into the core but relatively few from the core. These pages might be new or not terribly high quality, so they have not attracted the links back to them that would mark them as part of the core.

  • Destination pages. Another 24 percent of the Web consists of pages that are commonly linked from the core, but do not themselves link back into the core. These pages are typically high-quality pages, but they might be corporate Web sites that tend to link internally more than externally.

  • Disconnected pages. The remaining pages (22 percent) are not directly connected to the core; they might have links to or from origination and destination sites, or they might be linked only to other disconnected pages.

As you look at Figure 13-2, understand that the pages are categorized comparatively. Origination pages tend to have far more links into the core than from the core, with destination pages just the opposite tendency. Getting one directory listing for an origination page does not change it to a core page, but getting a number of inbound links from core pages would.

Figure 13-2. The bow-tie theory of the Web. Research shows a core of Web pages have the most links, with other pages feeding or being fed by the core.
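
The four categories can be sketched on a toy link graph. The Python below is an illustration only: it uses strict reachability (can a page reach the core, and can the core reach it) rather than the comparative link counts described above, and the page names and the choice of a seed page known to be in the core are hypothetical.

```python
from collections import deque


def reachable(start, edges):
    """Return every page reachable from start by following edges (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(links, core_seed):
    """Classify pages relative to the core that contains core_seed."""
    reverse = {}          # same graph with every link turned around
    pages = set(links)
    for src, targets in links.items():
        for dst in targets:
            pages.add(dst)
            reverse.setdefault(dst, []).append(src)

    downstream = reachable(core_seed, links)    # pages the core can reach
    upstream = reachable(core_seed, reverse)    # pages that can reach the core

    labels = {}
    for page in pages:
        if page in downstream and page in upstream:
            labels[page] = "core"
        elif page in upstream:
            labels[page] = "origination"
        elif page in downstream:
            labels[page] = "destination"
        else:
            labels[page] = "disconnected"
    return labels


# Hypothetical pages: each key links to the pages in its list.
links = {
    "portal": ["news"],
    "news": ["portal", "corp"],
    "newblog": ["portal"],
    "corp": ["corp-products"],
    "island": ["island2"],
}
labels = bow_tie(links, core_seed="portal")
for page in sorted(labels):
    print(page, labels[page])
```

In this toy graph, "newblog" links into the core but nothing links back, so it lands in the origination group. A single link from "portal" or "news" to "newblog" would move it into the core.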

For the search marketer, the importance of the bow-tie theory is that core and destination pages have the highest link popularity. Those are the best pages to get links from. As you can imagine, destination pages tend not to link to many other external pages, leaving just the core pages as the best targets to get links from. Garnering links from origination pages and disconnected pages will not bring you as many visitors and will not carry the same weight with search engines.

How Link Popularity Works

We introduced link popularity in Chapter 2, "How Search Engines Work," and in Chapter 12, "Optimize Your Content," we explained that link popularity is a critical (for some search engines, the most critical) page ranking factor. Many search experts believe that as few as 25 high-quality links to your site can significantly increase your search rankings, but attracting 25 such links might not come easily for all sites. Remember, links from 25 mediocre sites (or worse, from 25 poor sites) will not help your page to rank any higher.

Figure 13-3 shows an exaggerated example of how important link popularity can be. If you search for the phrase "miserable failure" in Google, you will find President George W. Bush's official biography as the #1 result. Why? Because enough disgruntled Democrats have linked to that page with the words "miserable failure" as the anchor text in their links. (Not to be outdone, similarly vexed Republicans have made former Democratic president Jimmy Carter the #2 result for the same query.) As you can see from the figure, the words miserable failure do not appear anywhere on the page, but Google places such great weight on anchor text from links that it makes this page #1 anyway.

Figure 13-3. Link popularity run amuck. Google places such weight on link popularity that it overrides the page content, sometimes leading to comical results.

Not all search engines emphasize link popularity to the extent that Google does, but they all give it strong weight in their ranking algorithms. Search engines evaluate a page's link popularity in four basic ways:

  • Link quantity. In general, pages receiving more links to them rank higher than pages with few links, but all links are not created equal, as discussed next.

  • Link quality. Everyone has an opinion, but some are worth more than others. Opinions expressed by those creating links are no different. A link to a page is an endorsement of that page, but endorsements have more value from well-respected and authoritative sources than from others. Search engines determine authority by examining the link popularity of the site linked from. So, if a high-authority site (one that itself has many other high-authority sites linking to it) links to your page, it is conferring some of its authority on your page. Search engines attribute the highest page ranking factors to pages with many links from high-quality sites. But it is even more complicated than that. Search engines mathematically split the authority conveyed to each linked page based on the number of links: a high-authority page with 3 links conveys more authority to each linked page than an equally high-authority page containing 50 links conveys. The theory is that there is only so much authority to go around and the more links there are, the less of a recommendation each linked page is getting.

  • Anchor text. The text that the visitor clicks to follow the link is very important to search engines because it provides the context of the recommendation. Consider two different links to the personal Web site of Pat Lee, one that uses the name Pat Lee in the anchor text and another that uses the words tax expert Pat Lee as the anchor text. Both clearly indicate that searchers for "Pat Lee" might want to consider this page, but only the second conveys an endorsement of Pat Lee as a tax expert. Anchor text is a key query ranking factor in the search engine algorithms: queries tend to return the linked pages that have variants of the query terms in anchor text. For example, if our fictitious firm Snap Electronics began to attract many links to one of its pages with the anchor text SnapShot digital cameras, queries that contain those words would rank Snap's page higher than before.

  • Link relevancy. Links from contextually relevant sites are also a key query ranking factor. When we say "contextually relevant," we mean that the information is on a certain theme (or topic or subject); it is not enough for the anchor text to use similar words, because words can have multiple meanings on unrelated subjects. Beyond the anchor text, search engines look at the words around the anchor text, words on the entire page and even the entire site being linked from. Why? Because sites that are relevant to the topic of the query provide more relevant links than others. Continuing the Snap example, links from digital camera review sites, from camera retailers, and from Snap affiliates are more contextually relevant than those from a teen fashion magazine. Random links from popular sites do not convey the same authority as links from sites that are popular and thematically related.
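
The way authority is split among a page's links can be shown with a little arithmetic. The sketch below is an assumption-laden illustration, not any search engine's actual formula: authority is in arbitrary units, the page names are hypothetical, and each page is assumed to divide its authority evenly among the pages it links to.

```python
def pass_authority(authority, links):
    """One pass: each page splits its authority evenly among its outbound links."""
    received = {}
    for src, targets in links.items():
        if not targets:
            continue  # a page with no links passes nothing along
        share = authority[src] / len(targets)
        for dst in targets:
            received[dst] = received.get(dst, 0.0) + share
    return received


# Two equally authoritative pages (arbitrary units), one with 3 links, one with 50.
authority = {"short-list": 150.0, "long-list": 150.0}
links = {
    "short-list": ["page-a", "x1", "x2"],
    "long-list": ["page-b"] + [f"y{i}" for i in range(49)],
}
received = pass_authority(authority, links)
print(received["page-a"])  # 50.0 (one of 3 links)
print(received["page-b"])  # 3.0  (one of 50 links)
```

A fuller calculation would feed the received authority back in and repeat the pass until the scores settle; that loop is left out here to keep the splitting step visible.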

If these factors seem hard to calculate, you are getting the idea. It is rather amazing that search engines can consider so many complicated factors in the second between the searcher's query and the results page, but they do. Search engines attribute high page ranking factors to pages with many high-quality links to those pages. Search engines attribute high query ranking factors to pages with inbound links that are both from contextually relevant sites and contain the query words in their anchor text. Those factors are mixed together, along with many other factors discussed in Chapter 12, to determine which pages rank first for a query.
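
As a rough illustration of how anchor text and link relevancy might combine into a query ranking factor, consider the Python sketch below. Everything in it is hypothetical: the scoring rule (the share of query words found in the anchor text, multiplied by a relevance weight between 0 and 1 for the linking site) is invented for this example, and real engines use far more elaborate, unpublished formulas.

```python
def anchor_text_score(query, inbound_links):
    """Sum, over inbound links, of (share of query words in anchor text) x (site relevance)."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    score = 0.0
    for anchor_text, relevance in inbound_links:
        anchor_words = set(anchor_text.lower().split())
        overlap = len(query_words & anchor_words) / len(query_words)
        score += overlap * relevance
    return score


# Hypothetical inbound links to a Snap Electronics page: (anchor text, relevance weight).
snap_links = [
    ("SnapShot digital cameras", 1.0),  # from a digital camera review site
    ("SnapShot digital cameras", 0.2),  # from a teen fashion magazine
    ("click here", 1.0),                # relevant site, but uninformative anchor text
]
print(anchor_text_score("digital cameras", snap_links))
print(anchor_text_score("tax expert", snap_links))
```

The "click here" link adds nothing for either query, which is the point made above: a link helps a page rank for a query mainly when its anchor text carries the query words.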

So how do search engines calculate the value of links to a page (the biggest part of the page ranking factor)? As you might expect, each uses different detailed techniques, but they have some basics in common, built around the theory of hub and authority pages:

  • Hub pages. Web pages that link to several or many other pages on a similar subject. We have spoken of the hub page for Snap's digital cameras: it is the page with links to the rest of the pages on Snap's site on that subject. But other pages are hubs, too. If you create a page on your site that links to the best pages on the Web on a particular subject, that makes your page a hub for that subject. One reason that directory links are so powerful is that the subject (or topic or theme) of the directory is a powerful clue as to the subject of the pages linked to; these directory pages are also hubs.

  • Authority pages. Web pages that are linked to by many other pages on a particular subject. Authority pages are the ones that search engines usually ascribe the highest page ranking factors to. And it makes sense. Pages that are most closely related to the searcher's keywords are probably authority pages.

We have expressed the concepts in terms of pages, but some search engines view hubs and authorities as clusters of pages within a site or even as sites themselves. Those engines would give more value to authorities that came from sites that had several pages that were also authorities on the same subject, for example.

Search engines try to rank authority pages higher, but the search engines use hub pages to do a better job of identifying authority pages. Because words can have multiple meanings (windows, for example), search engines that look only at anchor text might display pages about the wrong subject, even though those pages contain the words searched for. Hub pages can help search engines zero in on the particular meaning for a word: if an authority page is also linked by hub pages on a particular subject, search engines develop more confidence that the page is also on the right subject. Pella Windows will receive different hub links than Microsoft Windows, for example.
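
The mutual reinforcement between hubs and authorities can be sketched in a few lines of Python. This follows the general shape of the published hubs-and-authorities (HITS) calculation: a page's authority score is the sum of the hub scores of pages linking to it, and its hub score is the sum of the authority scores of the pages it links to. The page names and the number of rounds are assumptions made for illustration, and no claim is made that any particular search engine computes scores this way.

```python
import math


def hubs_and_authorities(links, rounds=20):
    """Iteratively score pages as hubs and as authorities on a small link graph."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(rounds):
        # Authority: sum of hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub: sum of authority scores of the pages linked to.
        hub = {p: sum(auth[dst] for dst in links.get(p, ())) for p in pages}
        # Normalize so the scores stay bounded from round to round.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical graph: home-improvement hubs versus a PC software hub.
links = {
    "window-directory": ["pella-windows", "other-window-maker"],
    "home-improvement-guide": ["pella-windows", "other-window-maker"],
    "remodeling-blog": ["pella-windows"],
    "pc-directory": ["microsoft-windows"],
}
hub, auth = hubs_and_authorities(links)
print(max(auth, key=auth.get))  # strongest authority in this toy graph
print(max(hub, key=hub.get))    # one of the strongest hubs
```

In this toy graph the home-improvement hubs reinforce one another through the pages they share, so the Pella page ends up as the strongest authority in that cluster, while the software page, linked only from an unrelated hub, scores low. Running the same calculation on a graph built around PC software hubs would favor the Microsoft page instead.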

As discussed previously (in Chapters 2 and 12), the more pages that exist on the Web containing a keyword, the more important page factors become in the search result ranking. It makes sense, if you think about it. If your search marketing campaign is targeting a search term that very few sites use (such as a new brand name for your product line), optimizing your content and getting your pages indexed might be enough for high rankings. On the other hand, if you are trying to break through for digital cameras (a very popular keyword found on millions of pages), you must concentrate on factors in addition to content optimization. If it sounds confusing, it might be that you need a crash course in how to "think links"; that's next.

