Blooper 36: Needle in a Haystack: Piles of Irrelevant Hits


Just as you don't want a search facility to miss relevant items, you don't want it to return a lot of items that aren't really relevant to your search terms. Search facilities are measured not only by their recall (the ability to find all relevant items) but also by their precision (the ability to exclude irrelevant items). On the Web, search facilities are also measured by their ability to sort results by relevance, so the most relevant items are listed first.
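To make the two measures concrete, here is a minimal sketch in Python. The function name `precision_recall` and the item labels are invented for illustration and do not come from any of the sites discussed in this section.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one search.

    returned: items the search facility listed
    relevant: items that really are relevant to the query
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant  # relevant items that were actually returned
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 10 items returned, 4 truly relevant items exist,
# and 3 of those 4 appear among the 10 returned.
returned = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
relevant = ["a", "b", "c", "z"]
precision, recall = precision_recall(returned, relevant)
print(precision, recall)  # 0.3 0.75
```

In this made-up case the search has decent recall (it found three of the four relevant items) but poor precision (seven of the ten items listed are irrelevant), which is the situation this blooper describes.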

Unfortunately, search facilities that bury relevant items in irrelevant and barely relevant ones are even more common than search facilities that overlook relevant items. Irrelevant items are an annoying distraction even when relatively few in number, especially if presented as if they were relevant. Conversely, a large number of low-relevance items isn't too harmful if all the relevant items are listed before them. What's worst is when a search facility returns a large number of items and fails to order them by actual relevance to the user's search terms.

Spurious Matches as Distractions

United Airlines' website provides a good example of how spurious search results can distract and delay users (Figure 5.28). If you search for a flight to Minneapolis, instead of a list of flights, you first get a page that reads:

Figure 5.28: www.United.com (Feb. 2002). (A) A search for a flight to Minneapolis turns up (B) seven matches. One is Minneapolis, but the other six are irrelevant.

Uncertain City/Airport Name: More than one city were found matching with your destination entry of "minneapolis" in our databases. Please select an airport in or nearby the city of your choice from the following list. Otherwise go back and specify a different entry.

For now we'll ignore the poor English ("... one city were found ...") and focus on the fact that only one of the airports listed has anything to do with "Minneapolis: St. Paul International." Multiple choices might make sense if Minneapolis had more than one airport, but it doesn't. The other six airports listed are not only not in Minneapolis; they are in states nowhere near Minnesota! Furthermore, their names aren't even similar to "Minneapolis." Why these match the given destination is unclear. But they do, so anyone who uses United.com to book a flight to "Minneapolis" is forced to make this entirely unnecessary choice.
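One way to avoid forcing an unnecessary choice is to ask the user to pick an airport only when more than one airport genuinely matches the entry. The sketch below assumes a simple matching rule (exact airport code, or city name beginning with the entry); how United.com actually matched destinations is not known. The three-airport table and the function names are hypothetical.

```python
# Hypothetical, tiny airport table: (airport code, city name).
AIRPORTS = [
    ("MSP", "Minneapolis"),
    ("MIA", "Miami"),
    ("MKE", "Milwaukee"),
]


def match_airports(entry, airports=AIRPORTS):
    """Return airports whose code equals, or whose city starts with, the entry."""
    text = entry.strip().lower()
    if not text:
        return []
    return [
        (code, city)
        for code, city in airports
        if code.lower() == text or city.lower().startswith(text)
    ]


def needs_disambiguation(entry):
    """Ask the user to choose only when more than one airport really matches."""
    return len(match_airports(entry)) > 1


print(match_airports("minneapolis"))        # [('MSP', 'Minneapolis')]
print(needs_disambiguation("minneapolis"))  # False
print(needs_disambiguation("mi"))           # True
```

Under this rule, a full city name that matches one airport goes straight to the flight list, and the selection page appears only for truly ambiguous entries.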

Poor Results Order

A good example of search results being poorly ordered comes from online bookstore BarnesAndNoble.com. I searched for book author "Ellen Isaacs." The search facility found 21 books (Figure 5.29). As the results page indicates, the books are sorted not by relevance to the given search terms, but by "bestselling order." In this order, books by Ellen Quigley (editor) and Isaac Bickerstaff (illustrator) are at the top of the list, and Ellen Isaacs' book doesn't appear until item 20. In other words, the book that best matches the search terms is 20th in a list of 21. Doesn't that seem odd?

Figure 5.29: www.BarnesAndNoble.com (Jan. 2002). (A) A look at the top of the search results shows results sorted in "bestselling order," not by relevance. (B) At the bottom of the results, the best matching book is listed as item 20 of 21 items.

The results page provides buttons that re-sort the list alphabetically by title or by publication date, but not by relevance to the search terms. Imagine if the search facility had found 50 books, or 100, sorted by bestselling order.

VitaminShoppe.com provides a similar example in a different product domain. The user searched for "glucoseamine sulfate" (Figure 5.30). Everything it found matched at least one of those words, which is good. What isn't good is that products matching both words did not appear in the results list until item eight. Quoting the search terms didn't help. Clearly, relevance to a user's search terms is not the default order of VitaminShoppe.com's search results.

Figure 5.30: www.VitaminShoppe.com (Jan. 2002). A search for "glucoseamine sulfate" gave the first seven items matching only "sulfate," and the actual hits start at item 8.
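A simple default order that would avoid this problem is to list items matching more of the user's search terms ahead of items matching fewer. The following is a minimal sketch under that assumption; the function name and the product titles are invented for illustration and are not VitaminShoppe.com's actual catalog or search logic.

```python
def rank_by_terms_matched(query, titles):
    """Sort titles so those matching more distinct query terms come first."""
    terms = set(query.lower().split())

    def matched(title):
        # Count how many distinct query terms appear as words in the title.
        words = set(title.lower().split())
        return len(terms & words)

    # Drop titles that match no term at all, then sort by match count.
    # sorted() is stable, so titles with equal counts keep their original order.
    hits = [t for t in titles if matched(t) > 0]
    return sorted(hits, key=matched, reverse=True)


# Hypothetical product titles, listed in some unrelated default order.
titles = [
    "Chondroitin Sulfate 400 mg",
    "Zinc Sulfate",
    "Glucoseamine Sulfate 500 mg",
    "Vitamin C",
]
print(rank_by_terms_matched("glucoseamine sulfate", titles))
# ['Glucoseamine Sulfate 500 mg', 'Chondroitin Sulfate 400 mg', 'Zinc Sulfate']
```

The item that matches both words moves to the top, and items that match only "sulfate" follow it instead of preceding it.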

Searching for a Job at Dice.com

The search facility at Dice.com returns many irrelevant job descriptions and orders them poorly. Someone I know searched for "editor." The search found 133 job listings supposedly matching that word. The first thirty were as follows:

  1. Strong Project Editor

  2. Freelance Video Editor

  3. Resource Kit Technical Editor

  4. Medical Managing Editor

  5. Web Copy Editor

  6. Freelance Editing/Final Cut Pro/Avid/Photoshop

  7. Technical Documentation Editor/analyst

  8. Senior Circuit Design Engineer

  9. Technical Writer

  10. Senior Circuit Design Engineer

  11. Clarify Technical Lead/Senior Programmer

  12. Photo Editor

  13. IC Mask Layout Designer

  14. Medical Writer/Editor

  15. Filmbox Editor

  16. IC Layout Mask Design Contractor

  17. Senior Circuit Design Engineer SRAM, PLL, I/O

  18. Backend Design Engineer

  19. Clarify Tech Lead

  20. Senior Technical Writer-HTML Editor

  21. Oracle Application Serv Admin

  22. Jr. UNIX Database Administrator-Shell Scripting, SQL, Unix Ad

  23. Sr. Analog and RFIC Layout Designer

  24. Product Developer

  25. Unix Operator Consultant

  26. Senior Analog Layout Designer

  27. Senior SQL Server DBA/Biztalk

  28. Sr. Mask Layout Engineer

  29. Senior IC Mask Designer-Microprocessor products

  30. Web Developer w/BizTalk and .Net

The first seven job titles contain the word "editor" or, in the case of item 6, "editing," which is good, but after that, the results go downhill fast. The titles of jobs 12, 14, 15, and 20 contain the word "editor," but for some reason they were placed after many items that don't contain "editor" and seem irrelevant. Finally, none of the jobs in items 21 through 133 seem even remotely relevant to the term "editor."
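One plausible explanation, which the listing itself does not confirm, is that the lower-ranked jobs matched "editor" somewhere in their description text and that title matches were given no extra weight. The sketch below shows a relevance score that counts a title match far more heavily than a description match. The weights, function names, and job descriptions are all hypothetical.

```python
import re


def score(term, job, title_weight=10, body_weight=1):
    """Score a job listing; a title match counts far more than a body match."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    in_title = len(pattern.findall(job["title"]))
    in_body = len(pattern.findall(job["description"]))
    return title_weight * in_title + body_weight * in_body


def search_jobs(term, jobs):
    """Return titles of matching jobs, highest score first."""
    scored = [(score(term, job), job) for job in jobs]
    matches = [(s, job) for s, job in scored if s > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [job["title"] for _, job in matches]


# Titles are taken from the list above; the descriptions are made up.
jobs = [
    {"title": "Senior Circuit Design Engineer",
     "description": "Experience with a schematic editor required."},
    {"title": "Photo Editor",
     "description": "Select and retouch images."},
    {"title": "Unix Operator Consultant",
     "description": "Monitor batch jobs overnight."},
]
print(search_jobs("editor", jobs))
# ['Photo Editor', 'Senior Circuit Design Engineer']
```

With this weighting, a job whose title contains the search term always outranks one that merely mentions it in passing, and a job that never mentions it is not listed at all.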

Avoiding the Blooper

Extraneous irrelevant hits usually occur for the same reasons as missed items: poor indexing and weak search methods. Poor ordering of results is usually due to faulty metrics for rating the relevance of items.

Again, this is a back-end implementation problem that strongly affects the usability of a website. Therefore, the best remedy is a back-end design process that is just as focused on users and their tasks as the front-end design process is. Back-end developers may squirm at this, but it is crucial: You cannot slap a user-friendly front end on a back end that was designed with no thought to usability and usefulness for actual user tasks.

Incorrect keywords on data items can totally destroy the accuracy of an otherwise good search engine. Erroneous keywords sometimes get attached to data items when new items are copied from old ones haphazardly. Sometimes it happens because the people hired to add content to the site don't really understand the site's central topic.

As with Blooper 35 (Search Myopia: Missing Relevant Items), the obvious remedy to this blooper is better procedures and oversight for adding, indexing, and maintaining content. Further, a lexicon of allowed keywords can help reduce randomness in assigning keywords to content items (Rosenfeld and Morville, 2002). The goal is to ensure that keywords on items are accurate and useful.
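A lexicon of allowed keywords can be enforced at the moment content is added. The following minimal sketch checks proposed keywords against a fixed vocabulary and reports any that are not in it. The lexicon entries and the function name `check_keywords` are invented for illustration.

```python
# Hypothetical lexicon of allowed keywords for a job site.
ALLOWED_KEYWORDS = {
    "editing",
    "technical writing",
    "circuit design",
    "database administration",
}


def check_keywords(keywords, allowed=ALLOWED_KEYWORDS):
    """Split proposed keywords into (accepted, rejected) against the lexicon."""
    # Normalize case and surrounding whitespace before comparing.
    normalized = [k.strip().lower() for k in keywords]
    accepted = [k for k in normalized if k in allowed]
    rejected = [k for k in normalized if k not in allowed]
    return accepted, rejected


accepted, rejected = check_keywords(["Editing", " circuit design ", "editor-ish"])
print(accepted)  # ['editing', 'circuit design']
print(rejected)  # ['editor-ish']
```

Rejected keywords can be sent back to the person adding the content rather than silently stored, so that stray or copied-over keywords are caught before they pollute search results.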





Web Bloopers: 60 Common Web Design Mistakes, and How to Avoid Them (Interactive Technologies)
ISBN: 1558608400
Year: 2002
Pages: 128
Authors: Jeff Johnson
