Ranking Results

                 

 
Special Edition Using Microsoft SharePoint Portal Server
By Robert Ferguson

Chapter 5. Overview of Indexing and Searching Content


In response to a user query, it is important to return the most relevant information first. This is a challenge because user queries tend to be short: typically only a few terms are specified, which gives the search engine little to work with. Here the work of Microsoft Research helps. Researchers there have worked on an algorithm known as Okapi ranking, or probabilistic ranking, that weights the frequency of a word in a document relative to the document's length and to the word's overall occurrence in the system. For example, when searching for the words "Windows" and "Paint" on the Microsoft Developer Network, "Paint" is more relevant than "Windows" because "Windows" appears in so many of the documents there, while on a construction company site, "Windows" may be the more relevant term. Thus documents that frequently contain the word "Windows" will be ranked higher in a SharePoint Portal Server implementation for a construction department than in one for a software engineering department.
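
The weighting idea can be sketched in a few lines of Python. This is only an illustration of the widely published Okapi BM25 formula, not the code SharePoint Portal Server actually runs; the function names, the sample documents, and the parameter values k1=1.2 and b=0.75 are assumptions chosen for the example.

```python
import math
from collections import Counter

def idf(term, docs):
    """Weight of a term: high when few documents contain it, low when most do."""
    n_docs = len(docs)
    df = sum(1 for d in docs if term in d)  # number of documents containing the term
    return math.log(1 + (n_docs - df + 0.5) / (df + 0.5))

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score every tokenized document against the query (k1 and b are assumed defaults)."""
    avg_len = sum(len(d) for d in docs) / len(docs)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Term frequency, dampened and normalized by document length.
            length_norm = 1 - b + b * len(d) / avg_len
            norm_tf = tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
            score += idf(term, docs) * norm_tf
        scores.append(score)
    return scores

# Hypothetical mini-corpora: "windows" is everywhere on the developer site,
# while "paint" is everywhere on the construction site.
developer_docs = [
    "windows api reference for windows developers".split(),
    "windows paint sample application".split(),
    "windows driver kit overview".split(),
    "debugging windows services".split(),
]
construction_docs = [
    "exterior paint schedule".split(),
    "paint and primer order".split(),
    "interior paint colors".split(),
    "windows installation and paint touch up".split(),
]

query = ["windows", "paint"]
developer_scores = bm25_scores(query, developer_docs)
construction_scores = bm25_scores(query, construction_docs)
```

In the developer corpus the weight of "paint" exceeds that of "windows", and in the construction corpus the relationship is reversed, which mirrors the example in the text.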

Another element in relevance ranking is that a complete phrase match is a much better result than the mere appearance of all, or only some, of the words. To specify a phrase, enclose it in double quotes. Without the quotes, the search still finds all occurrences of the phrase, but it also finds documents that contain only the individual words. Because a phrase match always has higher relevance, the top results will look much the same either way; the main difference is that far more results are returned without quotes than with them. For example, if you are looking for the term "Computer Industry", enclosing the phrase in quotes ensures that you find only documents that contain that exact phrase. Without the quotes, you would also find a document that discusses "How a Computer increases the productivity in any industry," but that document would appear later in the results because of its lower rank.
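
The difference between quoted and unquoted queries can be sketched as follows. This is a simplified Python illustration, not the SharePoint Portal Server query engine; the helper names, the sample documents, and the fixed phrase bonus of 10 are hypothetical and stand in for whatever the real ranking does.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def contains_phrase(tokens, phrase):
    """True if the words of the phrase appear next to each other, in order."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))

def search(query, docs):
    """Return matching documents, best first."""
    quoted = len(query) >= 2 and query.startswith('"') and query.endswith('"')
    terms = tokenize(query)
    results = []
    for doc in docs:
        tokens = tokenize(doc)
        phrase_hit = contains_phrase(tokens, terms)
        matched = sum(1 for t in set(terms) if t in tokens)
        if matched == 0:
            continue  # none of the query words occur in this document
        if quoted and not phrase_hit:
            continue  # a quoted query accepts exact phrase matches only
        # Assumed scoring: one point per matched word, plus a large
        # bonus so that phrase matches always sort ahead of word matches.
        score = matched + (10 if phrase_hit else 0)
        results.append((score, doc))
    results.sort(key=lambda item: item[0], reverse=True)
    return [doc for _, doc in results]

docs = [
    "How a Computer increases the productivity in any industry",
    "Trends in the computer industry",
    "Gardening tips for beginners",
]

with_quotes = search('"Computer Industry"', docs)
without_quotes = search("Computer Industry", docs)
```

With these sample documents, the quoted query returns only the document containing the exact phrase, while the unquoted query returns that document first and the productivity document second.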

NOTE

If you are interested in background on the advanced research work at Microsoft that has led to such valuable extensions as the probabilistic ranking algorithm, take a look at http://research.microsoft.com/. More research information on the Okapi algorithm can be found at http://research.microsoft.com/users/robertson/papers/trec_papers.htm.



                 


Special Edition Using Microsoft SharePoint Portal Server
ISBN: 0789725703
Year: 2002
Pages: 286
