SEARCH ENGINES


Search engines such as AltaVista, Excite, Google, HotBot, Infoseek, Northernlight, Yahoo, and numerous others offer a wide range of web searching facilities. These search engines are sophisticated, but less so than one might expect. Their results can easily fall victim to clever and often deceptive web page designers. Depending on the particular search engine, a web site can be indexed, scored, and ranked using many different methods (Searchengine.com, 2002). Ranking algorithms are often based on the position and frequency of keywords: the web pages with the most instances of a keyword, and with the keyword in a prominent position on the page, tend to receive the higher document ranking (see Jensen, 2002; Searchengine.com, 2002; Eyeballz, 2002). Search engines usually present users with the 10 to 20 hits judged most relevant.
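The idea of ranking by keyword frequency and position can be illustrated with a minimal sketch. It is not the algorithm of any engine named above: the function names (`keyword_score`, `top_hits`), the `position_weight` parameter, and the way the two signals are added together are assumptions made purely for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(text, keyword, position_weight=1.0):
    """Score a page by keyword frequency plus a bonus for an early first hit.

    The additive formula and position_weight are hypothetical choices,
    not taken from any real search engine.
    """
    tokens = tokenize(text)
    keyword = keyword.lower()
    positions = [i for i, tok in enumerate(tokens) if tok == keyword]
    if not positions:
        return 0.0
    frequency = len(positions)
    # The earlier the first occurrence, the closer the bonus is to position_weight.
    position_bonus = position_weight * (1.0 - positions[0] / len(tokens))
    return frequency + position_bonus


def top_hits(pages, keyword, k=10):
    """Return the URLs of the k best-scoring pages that contain the keyword."""
    scored = [(keyword_score(text, keyword), url) for url, text in pages.items()]
    scored = [(score, url) for score, url in scored if score > 0]
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored[:k]]


pages = {
    "honest": "A guide to hiking trails and hiking gear",
    "stuffed": "hiking hiking hiking hiking cheap pills",
    "other": "cooking recipes",
}
print(top_hits(pages, "hiking"))
```

With these made-up pages, the page that merely repeats the keyword outranks the page that is actually about hiking, which is the kind of manipulation by page designers described above.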

There is limited information on the specific details of the algorithms that search engines employ to achieve their particular results. This is understandable, since these details can make or break a search engine's popularity as well as its competitive edge. There is only generalized information on many of the items that search engines employ, such as keywords, the reading of tags, and indexes. For example, AltaVista ranks documents, highest to lowest, based on criteria such as the number of times the search terms appear, the proximity of the terms to each other, the proximity of the terms to the beginning of the document, and the existence of all the search terms in the document. AltaVista scores the retrieved information and returns the results. The way that search engines score web pages may cause very unexpected results (Jensen, 2002).
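The four criteria listed for AltaVista can be combined into a single score in many ways. The sketch below shows one hypothetical combination. Since the actual formula is not published, the function name `score_document`, the default `weights`, and the way each criterion is measured are assumptions and should not be read as AltaVista's method.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_document(text, query, weights=(1.0, 1.0, 1.0, 2.0)):
    """Combine four ranking criteria into one score (hypothetical weighting).

    weights = (frequency, proximity of terms to each other,
               proximity to the start of the document, all terms present).
    """
    w_freq, w_prox, w_early, w_all = weights
    tokens = tokenize(text)
    terms = list(dict.fromkeys(tokenize(query)))  # unique terms, order kept
    if not tokens or not terms:
        return 0.0

    positions = {t: [i for i, tok in enumerate(tokens) if tok == t] for t in terms}
    found = [t for t in terms if positions[t]]
    if not found:
        return 0.0

    # Criterion 1: how many times the search terms appear.
    frequency = sum(len(positions[t]) for t in found)

    # Criterion 2: how close two different terms come to each other.
    proximity = 0.0
    if len(found) > 1:
        tagged = sorted((i, t) for t in found for i in positions[t])
        gaps = [b[0] - a[0] for a, b in zip(tagged, tagged[1:]) if a[1] != b[1]]
        proximity = 1.0 / min(gaps)

    # Criterion 3: how close the first matching term is to the start.
    first = min(positions[t][0] for t in found)
    earliness = 1.0 - first / len(tokens)

    # Criterion 4: whether every search term occurs in the document.
    all_present = 1.0 if len(found) == len(terms) else 0.0

    return (w_freq * frequency + w_prox * proximity
            + w_early * earliness + w_all * all_present)


def rank_documents(docs, query):
    """Return document names ordered from highest to lowest score."""
    return sorted(docs, key=lambda name: -score_document(docs[name], query))
```

Changing the assumed weights changes the ordering that `rank_documents` returns, which gives a sense of how two engines applying the same criteria could still produce different, and sometimes unexpected, rankings.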

It is interesting to note that the results obtained from search engines may be biased toward certain sites, and may rank a site low even though it offers just as much value as the sites that appear at the top of the results (Lucas & Nissenbaum, 2000). Questions in this area have often been raised without receiving substantial answers.

Like search engines on the Web, online databases on the WWW have problems with information extraction and filtering. These problems will continue to grow as the databases themselves grow in size (Hines, 2002). Database designers and web page designers can devise ways either to promote their stored information or, at least, to make something that sounds like the information the user might want rise to the top of the search engine result listing. This only adds to the difficulty of locating and filtering relevant information from online databases via the WWW.




(ed.) Intelligent Agents for Data Mining and Information Retrieval
Year: 2004
Pages: 171
