11.4. Directories and Search Engines
Now that you're well on your way to perfecting and popularizing your site, it's time to start looking at the second level of Internet promotion: search engines. Getting your Web site into the most important search engine catalogs is a key step in publicizing your Web site. Working your way up the rankings so Web searchers are likely to find you takes more work, and monopolizes the late-night hours of many a Webmaster.
Directories are searchable site listings, with a difference: They're created by humans. That means a small army of computer workers painstakingly puts together a collection of sites, neatly sorted into categories. The advantage of directories is that they're well-organized. A couple of clicks can get you a complete list of California regional newspapers. The unquestioned disadvantage is that they're dramatically smaller than full-text search catalogs. That means directories aren't very useful for those in search of a piece of elusive information that doesn't easily fall into a category, like the most commonly misspelled words. Over the years, as the Web's ballooned in size, directories have become increasingly specialized, and full-text searching tools have become the most common way to hunt for information.
So, given that directories are just the unattractive cousins of full-text search engines, why do you need to worry about them? Two reasons. First, many Web surfers still use directories, even if they don't use them as often as full-text search engines. Second, some search engines (including Google) pay attention to directory listings, and tend to rank sites higher if they turn up in certain directories. Getting into the right directories can help you start to move up the list in a full-text search. And just like college, getting into a directory requires that you submit an application, which you'll learn about next.
11.4.1.1. The Open Directory Project
The most important directory to submit your site to is the Open Directory Project (ODP) at http://dmoz.org. The ODP is a huge, long-standing Web site directory that's staffed entirely by thousands of volunteer editors, who review submissions in countless categories. The ODP isn't the most popular Web directory (that honor currently goes to the Yahoo directory), but it is used behind the scenes by other search engines, including Google, Yahoo, AOL, HotBot, and Lycos. In fact, Google's own directory service (http://directory.google.com) is based on the ODP.
Before submitting to the ODP, take the time to make sure you do it right. An incorrect submission could result in your Web site not getting listed at all. You can find a complete description of the rules at http://dmoz.org/add.html, but here are the key requirements:
The next step is to spend some time at the http://dmoz.org site until you've found the single best category for your site (see Figure 11-5).
Once you've found the right section, click the "suggest URL" link at the top of the page and fill out the submission form (see Figure 11-6). The form includes your URL, the title of your site, a brief description, and your email address.
Tip: If you have some free time on your hands, you can offer to help edit a site category; just click the "become an editor" link. And even if you don't have editorial aspirations, why not check out the editor guidelines at http://dmoz.org/guidelines to get a better idea of what's going on in the mind of an ODP editor, and how he'll evaluate your Web site submission?
Once you've submitted your site, there's nothing to do but wait (and submit your site to the other directories and search engines discussed in this chapter). If two or three weeks pass without your site appearing in the listing, and you haven't received an email describing any problems with your site, try submitting again. If that still doesn't work, it's time to contact the editor of the category where you submitted your site. Write a polite email asking why your site wasn't added to the listing, and include the date of your submission(s) and the name, URL, and description of your site. You can find the email address for the category editor at the very bottom of the category page (see Figure 11-7).
11.4.1.2. Other directories
ODP is a great starting point, but it isn't the only directory on the block. The other two heavyweights are Yahoo and Looksmart. Unfortunately, getting your site on these directories takes considerably more work. If you've created a commercial site, you'll almost certainly need to pay a fee. If you've created a non-commercial site, you can probably get in free, but it may take persistence, emails, multiple submissions, and a bit of luck.
Here are some links to get you started:
Once you're done with directories (or just ready to move on), it's time to take a look at full-text search engines.
11.4.2. Search Engines
For most people, search engines are the one and only tool for finding information on the Web. In order for the average person to find your Web site, you need to make sure your site is in the most popular search engine catalogs, and turns up as a result for the right searches. This task is harder than it seems, because the Web is full of millions of sites jockeying for position. In order to get noticed, you need to spend time developing your site and enhancing its visibility. You also need to understand how search engines rank pages (see the box below for an example).
The undisputed king of Web search engines is Google (www.google.com). Not only is Google far and away the most popular search engine, it also powers other search engines (usually without being credited). Google performs an amazing amount of work: every day it chews through hundreds of millions of search requests.
Tip: For more information about search engines, including who's on top and who owns whom, check out www.searchenginewatch.com.
It's not too difficult to get noticed by Google. By the time your site's about a month old, Google will probably have stumbled across it at least once, usually by following a link from another site or the ODP. A link to your site is the best way to introduce yourself to Google. As described in the box above, Google takes outside links into consideration when sizing up a site, so the more sites that link to you the more likely you are to turn up in someone's search results.
If you're impatient or you think Google's passing you by, you can introduce yourself directly using the submission form at www.google.com/addurl.html (see Figure 11-8). Most popular search engines include a submission form like this. Just make sure you keep track of where you've submitted, so you don't inadvertently submit to the same search engine more than once.
11.4.2.1. Rising up in the rankings
You'll soon discover that it's not difficult to get into Google's catalog. However, you might find that it's exceedingly hard to get noticed. For example, suppose you've submitted the site www.SamMenzesHomemadePasta.com. To check if you're in Google, try an extremely specific search that targets just your site, like "Sam Menzes Homemade Pasta." This should definitely lead to your doorstep. Now, try searching for just "Homemade Pasta." Odds are, you won't turn up in the top 10, or even the top 100.
So how do you create a site that the casual searcher's likely to find? There's no easy answer. Just remember that the secret to getting a good search ranking is having a good PageRank, and getting a good PageRank is all about connections. In order to stand out, your Web site needs to share links with the other leading sites in your category area.
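To get a feel for why connections matter so much, here's a rough sketch of the link-counting idea behind PageRank, written in Python. It's a simplified textbook version, not Google's actual ranking code, and the site names and the damping value of 0.85 are assumptions chosen purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page from who links to whom.

    links maps each page name to the list of pages it links to.
    """
    # Collect every page, including ones that are only linked to.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical sites: three pages link to "sam", nobody links to "newcomer".
example_links = {
    "pasta-portal": ["sam"],
    "recipes": ["sam", "pasta-portal"],
    "sam": ["pasta-portal"],
    "newcomer": ["sam"],
}
ranks = pagerank(example_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy example, the site that collects the most incoming links ends up near the top, while the page that nobody links to stays at the bottom no matter how many links it hands out.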
If you want to delve into the nitty-gritty of search engine optimization (known to Webmasters as SEO), consider becoming a regular reader of www.webmasterworld.com and www.searchenginewatch.com. You'll find articles and forums where Webmasters discuss the good, bad, and downright seedy tricks to try and get noticed.
Tip: It's possible to get too obsessed about search engine rankings. Here's a good rule of thumb: don't spend more time trying to improve your search engine ranking than you spend improving your Web site. In the long term, the only way to gain real popularity is to become one of the best sites on the block.
11.4.2.2. Google AdWords
As a Web surfer, you've no doubt seen several lifetimes' worth of flashing messages, gaudy banners, and invasive pop-ups, all trying to sell you some hideously awful products. It probably comes as no surprise to learn that these types of ads aren't the way to promote your site; they're more likely to alienate people than entice them.
However, there are respectable paid placements that can get your site in front of the right readers, at the right time, and with the right amount of tact. One of the best choices is AdWords (https://adwords.google.com), Google's insanely flexible advertising system.
The idea behind AdWords is that you create text ads that Google will show alongside its regular search results. The neat part is that the ads aren't shown indiscriminately. Instead, you choose the search keywords that you want your ad to be associated with (see Figure 11-9).
The neat (and slightly confusing) part about AdWords is that you bid for the keywords you want to use. For example, you might tell Google you're willing to pay 25 cents for the keyword "food." Google takes this into consideration with everyone else's bids, and shows the higher bidders more often. (Google will tell you the highest bid, in case you just want to beat that by a penny.) However, you don't get charged anything for appearing on Google's page. You owe money only when someone clicks on your ad to get to your site.
By this point, you might be getting a little nervous. Given the fact that Google handles hundreds of millions of searches a day, isn't it possible for a measly one-cent bid to quickly put you and your site into bankruptcy? Fortunately, Google's got the solution for this, too. You just tell Google how much you're willing to pay per day. Once you hit your limit, Google stops showing your ad.
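The arithmetic behind a daily limit is simple enough to sketch in a few lines of Python. The function name and the dollar figures below are made up for illustration, and the sketch assumes every click costs the same amount, which won't always be true in practice.

```python
def max_clicks_per_day(daily_budget_cents: int, cost_per_click_cents: int) -> int:
    """Return how many paid clicks fit inside the daily budget."""
    if cost_per_click_cents <= 0:
        raise ValueError("cost per click must be positive")
    # Whole clicks only: a partial click can't be bought.
    return daily_budget_cents // cost_per_click_cents


# Hypothetical numbers: a $5.00 daily limit and a 25-cent cost per click.
clicks = max_clicks_per_day(500, 25)
print(clicks)  # 20
```

Working in whole cents rather than fractional dollars keeps the division exact.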
Interestingly, the bid amount isn't the only factor that determines how often an ad appears. Popularity is also important. If Google shows your ad over and over again and it never gets a click, Google realizes that your ad just isn't working, and lets you know with an automatic email message. It may then start showing your ad significantly less, or refrain from showing it altogether until you can improve it.
AdWords can be competitive. In order to have a chance against all the AdWords sharks, you need to know how much a click is worth to your site. For example, if you're selling monogrammed socks, you need to know what percentage of visitors actually buy something (the conversion rate) and how much they're likely to spend. A typical cost per click hovers around 35 cents, but there's a wide range. At last measure, the word free topped the charts at $1.33, while capitalism could be had for a song: a mere 10 cents. (And in recent history, law firms have bid "mesothelioma," an asbestos-related cancer that could have a class action lawsuit in the making, up close to $100.) Before you sign up with AdWords, it's a good idea to conduct some serious research.
Note: You can learn more about AdWords from Google: The Missing Manual, which includes a whole chapter about it. You can also get an online introduction at http://searchenginewatch.com/sereport/article.php/2164591. Finally, for a change of pace, surf to www.iterature.com/adwords for a story about an artist's attempt to use AdWords to distribute AdWords, and why it failed.
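One way to start that research is to work out your break-even point: the most you can pay per click without losing money on an average visitor. Here's a minimal Python sketch. The 2 percent conversion rate and the $15 profit per order are invented numbers, so plug in figures from your own site.

```python
def break_even_cost_per_click(conversion_rate: float, profit_per_sale: float) -> float:
    """Return the highest cost per click that doesn't lose money on average."""
    if not 0.0 <= conversion_rate <= 1.0:
        raise ValueError("conversion rate must be between 0 and 1")
    # On average, each click earns (chance of a sale) x (profit from a sale).
    return conversion_rate * profit_per_sale


# Hypothetical sock shop: 2% of visitors buy, with $15 profit per order.
limit = break_even_cost_per_click(0.02, 15.00)
print(f"Break-even cost per click: ${limit:.2f}")  # Break-even cost per click: $0.30

# Compare against the typical 35-cent cost per click mentioned above.
typical_cost = 0.35
print("Worth bidding" if limit >= typical_cost else "Too expensive at typical prices")
```

With these made-up numbers, a click is worth about 30 cents, so paying the typical 35 cents would lose a nickel on every visitor.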
11.4.2.3. Hiding from search engines
In rare situations, you might create a page that you don't want to turn up on a search engine. The most common reason is that you've posted some information that you want to share with only a few friends, like the latest Amazon e-coupons. If Google indexes your site, thousands of visitors could flood your way, sucking up your bandwidth for the rest of the month. Another reason may be that you're posting something semi-private that you don't want other people to stumble across, like a story about how you stole a dozen staplers from your boss. If you fall into the latter category, be very cautious. Keeping search engines away is the least of your problems: once a site's on the Web, it will be discovered. And once it's discovered, it won't ever go away (see the box below).
But there is at least one thing you can do to minimize your site's visibility or, possibly, keep it off search engines altogether. To understand how this procedure works, recall that search engines do their work in several stages (Section 11.4.2). In the first stage, a robot program crawls across the Web downloading sites. You can tell this robot not to index your site, or a portion of it, in several ways. (Not all search engines respect these rules, but most, including Google, do.)
To keep a robot away from a single page, add the robots meta tag to your page. Use the content value noindex, as shown here:
<meta name="robots" content="noindex">
Remember, this tag, like all meta tags, goes in the <head> section of your HTML document.
Alternatively, you can use the content value nofollow to tell the robot to index the current page, but not to follow any of its links:
<meta name="robots" content="nofollow">
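If you'd like to double-check which robot instructions a page actually carries, a short script can read them back. The following Python sketch uses the standard html.parser module. The class and function names are invented for this example, and it only looks at meta tags named robots.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for item in content.split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
print(robots_directives(page))  # {'noindex'}
```

An empty result means the page has no robots meta tag, so a robot is free to index it and follow its links.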
If you have larger portions of your site that you want to block off, you're better off creating a specialized file called robots.txt and placing it in the top-level folder of your site. The robot will check this file before it goes any further. The content inside the robots.txt file sets the rules.
If you want to stop a robot from indexing any part of your site, add this to the robots.txt file:
User-Agent: *
Disallow: /
The User-agent part identifies the type of robot you're addressing. (An asterisk represents all robots.) The Disallow part indicates what part of the Web site is off limits. (A single forward slash represents the whole site.)
To rope off just the Photos subfolder in your site, use this:
User-Agent: *
Disallow: /Photos
To stop a robot from indexing certain types of content (like images), use this:
User-Agent: *
Disallow: /*.gif
Disallow: /*.jpeg
As this example shows, you can put as many Disallow rules as you want in the robots.txt file, one after the other. Keep in mind that the asterisk wildcard in these paths is an extension that Google understands but that some other robots may ignore.
Remember, the robots.txt file is just a set of guidelines for search engine robots. It's not a form of access control. In other words, it's similar to posting a "No Flyers" sign on your mailbox: it works only as long as advertisers choose to heed it.
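To see how a well-behaved robot might consult these rules, here's a small Python sketch that uses the standard urllib.robotparser module. The robot name ExampleBot, the example.com addresses, and the is_allowed helper are placeholders invented for this illustration. The sketch sticks to the simple Photos rule, because plain folder rules are the most widely understood.

```python
from urllib.robotparser import RobotFileParser

# The same rules you'd place in robots.txt to rope off the Photos folder.
RULES = """\
User-agent: *
Disallow: /Photos
"""


def is_allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given robot may fetch the URL under these rules."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(is_allowed(RULES, "http://www.example.com/Photos/beach.jpg"))  # False
print(is_allowed(RULES, "http://www.example.com/index.html"))        # True
```

Notice that the check happens entirely on the robot's side. Nothing on your Web server stops a robot that decides to skip this step.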
Tip: You can learn much more about robots, including how to tell when they visit your site, and how to restrict the robots coming from specific search engines, at www.robotstxt.org.