Getting into Google




You can get your site into the Google index in two ways:

  • Submit the site manually

  • Let the crawl find it

start sidebar
Checking your status

How do you know whether your site is in the Google index? Don’t try searching for it with general keywords — that method is hit-and-miss. You could search for an exact phrase located in your site’s text, but if it’s not a unique phrase you could get tons of other matches.

The best bet is to simply search for the URL. Make it exact, and include the www prefix. If you’re searching for an inner page of the site, precision is likewise necessary, and remember to include the htm or html file extension if it exists.

The link operator (see Chapter 2) is invaluable for checking the status of your incoming links, and by extension the health of your PageRank. Use the operator followed immediately by the URL, like this: link:www.bradhill.com. The search results show every page containing a link to your URL. When you try this operator with an inner page of your site, remember that you most likely link to your own pages with menus or navigation bars, and Google regards those links as incoming links, artificially inflating your incoming link count. Incoming links within a domain do not contribute to PageRank. You need to get other sites linking to you.
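If you check incoming links often, typing the operator by hand gets old. Here is a minimal sketch that builds the link: query text and the matching search address. The function names link_query and search_url are made up for illustration, and the https://www.google.com/search?q= address form is an assumption about how you reach the search page, not something this chapter specifies.

```python
from urllib.parse import urlencode, urlsplit


def link_query(url):
    """Build the text of a link: search for url, with no space after the colon."""
    # Accept addresses with or without a leading http:// and drop the scheme.
    parts = urlsplit(url if "//" in url else "//" + url)
    return "link:" + parts.netloc + parts.path


def search_url(query):
    """Encode a query into a search address (assumed form; q carries the query)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


print(link_query("www.bradhill.com"))
print(search_url(link_query("www.bradhill.com")))
```

Pass the address of an inner page, file extension and all, to check that page's incoming links instead of the home page's.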

end sidebar

Both these methods lead to unpredictable results. Google offers no assurance that submitted sites will be added to the index. Google does not respond to submissions, and it does not promise to add or reject a site within a certain time frame. You may submit and then wait for the crawl, or you may simply wait for the crawl. Submitting neither directs the crawl toward you nor deflects it. Google is impassive and promises nothing. But Google does sometimes add submitted sites that are not linked from other pages and would probably not be found by the crawl.

Remember 

If you have added a new page to a site already in the Google index, you do not need to submit the new page. Under most circumstances, Google will find it the next time your site is crawled. But you might as well submit an entirely new site, even if it consists of a single page. Do so at this URL:
www.google.com/addurl.html

The submission form could hardly be simpler. Enter your URL address, and make whatever descriptive comments you feel might help your cause. Then click the Add URL button — which is a bit misleading. Submitting a site is not the same as adding it to the index! Only the Google crawler or a human Google staffer can make additions to the index.

Luring the spider

The key to attracting Google’s spider is getting linked on other sites. Google finds your content by following links to your pages. With no incoming links, you’re an unreachable island as far as the Google crawl is concerned. Of course, anybody can reach you directly by entering the URL, but you won’t pluck the spider’s web until you get other sites to link to you.

In theory, any single page currently crawled by Google (that is, in the index) that links to your page or site is enough to send Google’s spider crawling toward you. In practice, you want as many incoming links as possible, both to increase your chance of being crawled (sounds a little uncomfortable, doesn’t it?) and to improve your PageRank after your site is in the index.

Remember 

Keep your pipes clean. That is to say, don’t make life difficult for Google’s spider. In other words (how many different ways can I say this before I finally make myself clear?), host your site with a reliable Web host, and keep your pages in good working order. The Google crawl attempts to break through connection problems, but it doesn’t keep trying forever. If it can’t get through in the monthly deep crawl and your site isn’t included in the fresh crawl, you could suffer a longish, unnecessary delay before getting into the index.
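One way to keep an eye on your pipes is to request your own pages from time to time and note which ones fail. The following is a rough sketch, not a description of how Google's spider behaves: the function names, the three-attempt limit, and the ten-second timeout are all assumptions chosen for illustration. Like the crawl, it retries a connection problem a few times and then gives up.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 404 or 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # No answer at all: DNS failure, refused connection, timeout.
        return None


def find_problem_pages(urls, fetch=fetch_status, max_attempts=3):
    """Report every URL that does not end up returning status 200.

    Connection failures are retried up to max_attempts times (an assumed
    limit); an HTTP error answer is recorded without retrying.
    """
    problems = {}
    for url in urls:
        status = None
        for _ in range(max_attempts):
            status = fetch(url)
            if status is not None:
                break  # the server answered, so stop retrying
        if status != 200:
            problems[url] = status
    return problems
```

Call find_problem_pages with a list of your page addresses. The result maps each troubled address to its status code, or to None if the host never answered.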

Tip 

Don’t expect instant recognition in Google when you add a page to your site. If your site is part of the fresh crawl, new page(s) show up fairly quickly in search results, but there’s no firm formula for the frequency of the fresh crawl or the implementation of its results. If the spider hits your site during the deep crawl, the wait for fresh pages to appear in the index is considerably longer. The same factors apply if you move your site from one URL address to another. (Although not if you merely change hosts, keeping the same URL.) Complicating that situation is that your site at the old address might remain cached (stored) in Google’s index, even while search results are matching keywords to your site at the new address. This confusion is one reason some Webmasters don’t like the Google cache — when they make a change to a site or its address, they don’t want the old information living on in the world’s most popular search index.

start sidebar
Index or directory?

Most of this chapter is devoted to getting a foothold in Google’s Web search index, which should not be confused with the Google Directory. Although the search index is largely automated, the Directory consists of hand-picked sites selected by a volunteer staff numbering in the hundreds. Chapter 3 describes the Open Directory Project, which Google uses and upon which Google imposes its PageRank formula.

Getting into the Directory is more direct than getting into the search index, but you must go to the Open Directory Project, not to Google. Go to this URL

www.dmoz.org/add.html

and follow the instructions there. See also the “Submitting a Web Page to the Directory” section in Chapter 3.

end sidebar

start sidebar
On your own

Creating the Google index is an automated procedure. The Google spider crawls through more than three billion pages in its surveys of the Web. Some sites (small ones in particular) might be tossed around by the Google dance, even to the extent of dropping out of the index for a month at a time and then reappearing. PageRank can fluctuate, influencing a site’s position in search results. Some sites have trouble breaking into the index in the first place.

Although Google receives and attends to URL submissions, as described in this chapter, the company does not provide customer service in the traditional sense. There is no customer contact for indexing issues. The positive aspect of this corporate distance is that the index is pure — nobody, regardless of corporate size or online clout, can obtain favorable tweaking in the index. The downside is that you’re on your own when navigating the surging tides of this massive index. Patience and diligent networking are your best allies.

end sidebar

Spider-friendly tips

Getting into the Google index is largely a waiting game, in which preparation, persistence, and patience are the tools of success. However, a number of techniques incline Google’s spider to look on you more favorably:

  • Place important content outside dynamically generated pages: A dynamic page is one created on-the-fly based on choices made by the site visitor. This method of page generation works fine when the visitor is a thinking human. (Or even a relatively thoughtless human.) But when an index robot hits such a site, it can generate huge numbers of pages unintentionally (assuming robots ever have intentions), sometimes crashing the site or its server. The Google spider picks up some dynamically generated pages, but generally backs off when it encounters dynamic content. Weblog pages do not fall into this category — they are dynamically generated by you, the Webmaster, but not by your visitors.

  • Don’t use splash pages: Splash pages (which Google calls doorway pages) are content-empty entry pages to Web sites. You’ve probably seen them. Some splash pages employ cool multimedia introductions to the content within. Others are mere static welcome mats that force users to click again before getting into the site. Google does not like pointing its searchers to splash pages. In fact, these tedious welcome mats are bad site design by any standard, even if you don’t care about Google indexing, and I recommend getting rid of them. Give your visitors, and Google, meaningful content from the first click, and you’ll be rewarded with happier visitors and better placement in Google’s index.

  • Use frames sparingly: Frames have been generally loathed since their introduction into the HTML specification early in the Web’s history. They wreak havoc with the Back button, and they confuse the fundamental format of Web addresses (one page per address) by including independent page functions within one Web page. However, frames do have legitimate uses. Google itself uses frames to display threads in Google Groups (see Chapter 4). But the Google crawler turns up its nose when it encounters frames. That’s not to say that framed pages necessarily remain out of the Web index. But errors can ensue, hurting both the index and your visitors — either your framed pages won’t be included, or searchers are sent to the wrong page because of addressing confusion. If you do use frames, make your site Google friendly (and human friendly) by providing links to unframed versions of the same content. These links give Google’s diligent spider another route to your valuable content, and give us (Google’s users) better addresses with which to find your stuff. And your visitors get a choice of viewing modes — everybody wins.

  • Divide content topically: How long should a Web page be? The answer differs depending on the nature of the page, the type of visitor it attracts, how heavy (with graphics and other modem-choking material) it is, and how on-topic the entire page is. Long pages are sometimes the result of lazy site building, because it takes effort to spin off a new page, address it, link to it, and integrate it into the overall site design. From Google’s perspective, and in the context of securing better representation in the index, breaking up content is good, as long as it makes topical sense. If you operate a fan page for a local music group, and the site contains bios, music clips, concert schedules, and lyrics, Google could make more sense of it all if you devote a separate page to each of those content groups. Google also likes to see page titles relating closely to page content. Keeping your information bites mouth-sized helps Google index your stuff better.

  • Keep your link structure tidy: Google’s spider is efficient, but it’s not a mind-reader. Nor does it make up URL variations, hoping to find hidden content. The Google crawler is a slave to the link. If you want all your pages represented in the index, make sure each one has a link leading to it from within your site. Many site-building programs contain link-checking routines and administrative checks to diagnose linkage problems. Simple sites might not warrant such firepower; in that case, check your navigation sidebars and section headers to make sure you’re not leaving out anything.
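If your site-building program lacks a link checker, the last tip can be tested by hand. Here is a minimal sketch, assuming you have the HTML of each page available as a string: start at the home page, follow internal links the way a crawler would, and list any page that is never reached. The function name find_unlinked_pages and the example.com addresses are hypothetical, and the sketch reads only plain <a href> links, not frames or script-generated menus.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_unlinked_pages(pages, start_url):
    """Return the URLs in pages that no chain of links from start_url reaches.

    pages maps each page URL to its HTML source (an in-memory copy of the site).
    """
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        parser = LinkCollector()
        parser.feed(pages.get(url, ""))
        for href in parser.hrefs:
            # Resolve relative links against the current page; drop any #fragment.
            target, _ = urldefrag(urljoin(url, href))
            if target in pages and target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(pages) - seen)


# A made-up fan site: the schedule page exists, but nothing links to it.
site = {
    "http://example.com/index.html": '<a href="bios.html">Bios</a> <a href="/lyrics.html">Lyrics</a>',
    "http://example.com/bios.html": '<a href="index.html">Home</a>',
    "http://example.com/lyrics.html": '<a href="index.html">Home</a>',
    "http://example.com/schedule.html": '<a href="index.html">Home</a>',
}
orphans = find_unlinked_pages(site, "http://example.com/index.html")
print(orphans)
```

Any address the function returns is a page that a link-following crawler starting from your home page would have no route to, so add it to a navigation bar or section header.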






Google for Dummies
Google AdWords For Dummies
ISBN: 0470455772
EAN: 2147483647
Year: 2005
Pages: 188
