Types of Spam


If a search engine representative discovers that a web site owner is deliberately trying to trick the search engines into giving a page higher relevance than it deserves, the individual page, or even the entire site, can be penalized. Most of the time, a search engine engineer modifies the algorithm so that the spam pages no longer appear at the top of search results. In more extreme cases, the page itself, or the whole site, is removed from the search engine index. Therefore, avoid the types of spam discussed in the following sections.

Promoting Keywords That Are Not Related to Your Web Site

To gain top search engine visibility, many spammers place words on their web pages that are not related to the pages' actual content. For example, many people take the most popularly searched-for words (such as "sex" or a celebrity name) and place those words inside a meta tag. This is done only because the word is popular, not necessarily because the word relates to the content of a web page.

As a general rule, if a keyword or keyword phrase does not appear in the visible content of your pages, do not place it in your meta tags, titles, alternative text, or CSS layers.

Keyword Stacking

Keyword stacking is the repeated use of a keyword or keyword phrase to artificially boost a web page's relevancy in the search engines. Keyword stacking, at its simplest level, looks like the following:

 organic tea organic tea organic tea organic tea organic tea organic tea organic tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea tea

This type of text can be placed in any HTML tag, including the title tag, meta tags, alternative text, and so forth. It can also be placed inside invisible layers and the <noframes>, <noscript>, and <input type="hidden"> tags. It can even be placed at the very bottom of a web page where people are not likely to view the text.

No matter where this type of text is placed, it is still considered spam because the words are gibberish and are clearly written to boost relevancy, not to benefit a site's end users.

Keyword Stuffing

Keyword stacking and keyword stuffing are often used interchangeably, though some search engine marketers differentiate the two forms of spam. Keyword stacking usually refers to writing the gibberish "sentences" themselves; keyword stuffing usually refers to placing those "sentences" inside graphic images or layers.

For example, many unethical search engine marketers create a small, transparent image called blank.gif, clear.gif, spacer.gif, or shim.gif, typically between 1x1 and 25x25 pixels in size. Then they place a series of keywords in the alternative text of the graphic image. The HTML code for keyword stuffing generally looks like the following:

 <img src="images/blank.gif" alt="organic tea organic teas Organic Tea Organic Teas ORGANIC TEA ORGANIC TEAS">

If such text does not describe the graphic image or the page on which it is placed, the search engines consider the text to be spam.

Hidden Text

As stated, for web pages to appear at the top of search results, the words that your target audience types into a search query must be used on your web pages. One way to place keywords on your web pages without changing the look and feel of your site design is to hide the text or make it invisible. Hidden keywords and keyword phrases are supposed to be visible to search engine spiders, but not to site visitors.

Unethical search engine marketers can make text invisible in multiple ways:

  • Using colored text on a same-colored background, created with <font> tags, graphic images, or Cascading Style Sheets.

  • Placing text in the hidden form field tag <input type="hidden"> even though the web page does not contain a form.

  • Placing keywords inside a <noframes> tag even though the page does not contain a frameset.

  • Placing keywords between the <noscript></noscript> tags when there are no scripts on the web page. The <noscript> tag holds text that is meant to display in the event that a script is not executed; it is not meant as a "secret hiding place" for keyword stacking.

  • Using hidden layers in style sheets, either by placing layers on top of each other or by placing the layers outside the browser screen.
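As a hypothetical illustration only (reusing the "organic tea" phrase from the earlier examples), these hidden-text techniques might look like the following in HTML:

```html
<!-- White text on a white background -->
<body bgcolor="#ffffff">
<font color="#ffffff">organic tea organic teas</font>

<!-- Hidden form field on a page that has no form -->
<input type="hidden" value="organic tea organic teas">

<!-- CSS layer positioned outside the browser screen -->
<div style="position:absolute; left:-5000px;">organic tea organic teas</div>
</body>
```

None of this markup belongs on a legitimate page; it is shown only so the techniques are recognizable.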

As a general rule, anything on a web page that is not meant to be seen or detected at any time by your target audience is considered spam. Because most browsers support frames and JavaScript, there is little or no need to use <noframes> and <noscript> tags. Furthermore, the search engines have been aware of keyword stuffing in these tags for years now, and they place little or no relevance on this text.

Tiny Text

Many spammers understand that hidden text can get their pages penalized, so they often place keywords at the bottom of a web page. They format this text with an extremely small typeface, usually in a color that is not exactly the same as the background but light enough that the text is difficult to read.

Even though tiny text is visible, it is often illegible. If the text on your web pages is too tiny for your site's visitors to read, the tiny text is also considered spam.
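A hypothetical sketch of tiny text, again using the "organic tea" phrase from earlier examples, might look like this:

```html
<!-- 1-pixel text in a near-background color at the bottom of a page -->
<p style="font-size:1px; color:#f7f7f7;">organic tea organic teas tea blends</p>
```

Even though this text technically renders, no visitor can read it, which is why it counts as spam.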

Hidden Links

The true purpose of creating hypertext links and navigation buttons (graphic images) is to have people click them. Unethical web marketers who create links that end users cannot easily detect are, in essence, deceiving end users. Therefore, hidden links are generally considered spam.

Ways of hiding links include the following:

  • Using the same font attributes for a hypertext link as your regular text

  • Hiding hypertext links in punctuation marks

  • Hiding hyperlinks in transparent images

  • Hiding hyperlinks in invisible layers

  • Hiding hundreds or thousands of links inside a small graphic image
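For illustration (the file names here are hypothetical), two of the techniques above might look like this in HTML:

```html
<!-- Link hidden in a punctuation mark -->
Welcome to our site<a href="doorway.html">.</a>

<!-- Link hidden in a transparent 1x1 image -->
<a href="doorway.html"><img src="images/blank.gif" width="1" height="1" border="0" alt=""></a>
```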

Artificial Link Farms and Web Rings

In an attempt to boost the link component of a search engine algorithm, many spammers try to create multiple web sites whose sole purpose is to link to each other. Free-for-all (FFA) web sites are an example of artificial link farms.

All the search engines make it very clear that linking to "bad neighborhoods" can get a site into trouble. Though no one can control which web sites link to your site, you have total control over which sites you link to. Therefore, if your site links to another site that is considered a "bad neighborhood," your site can be penalized.

Creating web rings for the sole purpose of increasing popularity can also be considered spam, especially if the sites linking to each other are not related. For example, a dating service web site has nothing to do with an auto parts web site.

As a general rule, if a hyperlink is not created to be read or followed by humans, it is probably spam.

Page Swapping, Page Jacking, and Bait-and-Switch

Page swapping, also called bait-and-switch spam, occurs when an optimized web page is submitted to the search engines, and then the page is "swapped" for a different one after a top search engine position has been attained. Search engines end up viewing an optimized page; end users wind up viewing a different page altogether.

The reasoning behind this type of spam is to prevent others from stealing a page's search engine "secrets." Unfortunately, page swapping often happens with stolen content. Unethical search engine marketers find a high-ranking web page and copy its content. Sometimes, they change the company name but nothing else. The page with the stolen content is submitted to the search engines. After the stolen page gets a top search engine position, a different web page is placed on the server. The practice of stealing content from other web sites is called page jacking.

Bait-and-switch spam occurs in directories as well. To artificially boost link popularity, unethical web marketers submit a "fake" web site to the directories. After the site is accepted, another web site is put up in its place.

Page swapping is a difficult practice to maintain. No one knows when or how often a search engine spider visits a site. Thus, a swapped page's position is always temporary. Furthermore, if content is stolen from a web page and the original author discovers the online thievery, the spammer could be faced with a copyright and/or trademark lawsuit.

Redirects

Another way that spammers switch web pages is by using a redirect. A redirect is HTML coding, programming, or scripting that is placed on a web page so that page visitors are sent to a different URL after a specified period of time, often zero seconds. One of the most common ways to redirect is a meta-refresh tag, which looks like the following:

 <META HTTP-EQUIV="refresh" content="0; URL=http://www.domainname.com/differentpage.html">

With redirection, spammers create an optimized page for a particular keyword phrase. The optimized page, with the redirect, is submitted to the search engines. If the optimized page gets a top search engine position, anyone clicking the link to this page is automatically sent to a different page, called a destination page. The destination page does not contain the same content as the optimized page. In fact, the destination page often does not even contain the same keywords as the optimized page.

To combat this type of spam, many search engines do not accept pages with any type of redirection other than the HTTP 301 (permanent) redirect. Most of the time, the search engines list the destination page, not the page that contains the redirect.

If you find yourself in a situation where your design calls for a redirect, set the redirect delay long enough for your site's visitors to read the content. Most of the time, a delay of 15 seconds or more does not cause a spam penalty. Another solution is to apply the Robots Exclusion Protocol to any page that contains a redirect.
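One way to apply the Robots Exclusion Protocol to an individual page is the robots meta tag, placed in the page's <head> section:

```html
<meta name="robots" content="noindex, nofollow">
```

This tells search engine spiders not to index the redirect page or follow its links, so the page cannot be mistaken for bait-and-switch spam in the index.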

Mirror or Duplicate Pages

Neither search engines nor their end users want the same web sites dominating search results. For this reason, most search engines now cluster their search results. Clustering generally enables only one or two pages per web site to be displayed in the top search results.

All too often, spammers duplicate or slightly modify a web page. Then, the spammers submit to a search engine hundreds or thousands of pages with only tiny modifications. For example, two pages might have identical content and different title tags, and two other pages might have identical content and different meta tags. If any of the submitted pages rank well for a specific keyword phrase, all the pages with slight modifications can dominate a search engine's top search results.

Think about any search you performed that showed the same company over and over again. If you did not like the information the first time you visited the company's site, you would not like it the second, third, fourth, or fifth time either.

Duplicate content is a common occurrence with affiliate and reseller sites. Affiliates and resellers are essentially providing the same information on their web sites as the original corporate site. Again, if you, the end user, did not like the information presented on the corporate site, in all likelihood, you would not like the same information on an affiliate site.

Thus, search engines tend to reject affiliate and reseller sites due to duplicate content. If they find duplicate web pages or web sites, they try to eliminate at least one of them.

Doorway Pages, Gateway Pages, and Hallway Pages

Doorway pages, gateway pages, and hallway pages are not created for the benefit of a site's visitors. They are created specifically for obtaining high search engine positions. That is the main reason the search engines consider doorway pages to be spam.

Doorway page companies generally create thousands of pages for a single keyword or keyword phrase. All these pages are fed to the search engines, often through the free submit pages. Because doorway pages are built specifically to rank, the search engines' indices become polluted with web pages containing unnecessary information. Doorway pages are not very pleasant to look at, and they often contain so much gibberish that they have to be cloaked.

Cloaking

Cloaking is the technique of feeding search engine spiders one web page and feeding all other end users a different web page. All the major search engines consider cloaking to be spam. The only time that search engines accept cloaking is through a trusted feed program.

Domain Spam and Mirror Sites

Domain spam is the practice of purchasing multiple domain names and building sites with identical or nearly identical content. The purpose of domain spam is to get multiple listings in directories, thereby achieving greater link popularity and more traffic. The resulting link popularity in the directories helps boost search engine visibility.

A warning sign that you might be dealing with domain spam is if you hear the terms "micro-site" or "mini-site." This type of spam can also be done with subdomains.

Search engines and directories respond to domain spam by removing all the mirror sites. If the domain spam is particularly egregious, they permanently remove all the sites from their indices.

Typo Spam and Cybersquatting

Web site owners who purchase domain names for the sole purpose of tricking end users into visiting their sites can be considered spammers. For example, a cybersquatter might purchase the domain name Yahhoo.com (note the extra "h") to try to steal traffic from Yahoo!.

Google is particularly picky about this form of spam and penalizes sites that cybersquat. In fact, cybersquatting is illegal in many countries.



Search Engine Visibility (2nd Edition)
ISBN: 0321503244
Year: 2003
Pages: 111
Authors: Shari Thurow