Searching for Data

                 

 
Special Edition Using Microsoft SharePoint Portal Server
By Robert Ferguson

Chapter 5. Overview of Indexing and Searching Content


At this point we have a decent understanding of how the index is built and which properties are available to support searching for data. It is now time to look more closely at the search architecture from a query perspective, because searching for data is the reason for building the index in the first place.

As a knowledge discovery tool, SharePoint Portal Server has been designed around the idea of a Web portal. Not only the browser but also Web Folders and Office XP access the server over HTTP and extensions such as WebDAV, which is layered on top of it.

NOTE

The Web was originally designed for browsing information, not for authoring it or for searching it in a rich manner. For this reason, Microsoft has actively worked with the Internet Engineering Task Force to extend the HTTP protocol, resulting in the WebDAV (Web Distributed Authoring and Versioning) protocol, documented in RFC 2518. WebDAV includes a standardized set of operations that are optimized for authoring documents on the Web, such as copying and moving files, creating directories, and working with document properties. WebDAV makes the Web aware of properties, but unfortunately it does not provide for structured queries on these properties.
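To make the property awareness of WebDAV concrete, the following is a minimal sketch of how a client might assemble a PROPFIND request, the RFC 2518 method for reading properties. The function names and the document path are hypothetical and exist only for illustration; getlastmodified and getcontentlength are standard DAV: properties. Nothing is sent over the network here.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # XML namespace that RFC 2518 defines for WebDAV elements


def build_propfind_body(property_names):
    """Return the XML body of a PROPFIND request for the named DAV: properties."""
    ET.register_namespace("D", DAV_NS)
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    for name in property_names:
        ET.SubElement(prop, f"{{{DAV_NS}}}{name}")
    return ET.tostring(root, encoding="unicode")


def build_propfind_request(path, property_names, depth="0"):
    """Assemble method, path, headers, and body (hypothetical helper)."""
    headers = {
        "Depth": depth,  # "0" asks about the resource itself only
        "Content-Type": "text/xml; charset=utf-8",
    }
    return "PROPFIND", path, headers, build_propfind_body(property_names)


# Hypothetical document path, for illustration only.
method, path, headers, body = build_propfind_request(
    "/workspace/Documents/report.doc",
    ["getlastmodified", "getcontentlength"],
)
print(method, path)
print(body)
```

A request like this retrieves properties of a resource that the client already knows by URL. It cannot express "find all documents whose property equals X", which is the gap in structured querying that the note describes.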


The SharePoint Portal Server-specific WebDAV protocol is implemented in a dedicated ISAPI extension; ISAPI is the extension mechanism used by Microsoft's Internet Information Services (IIS). When a search request is issued, it is passed to the search engine. The query is analyzed, and if it includes full-text operations, word breakers split the text into words, noise words are filtered out, word stemming is applied, and finally the thesaurus is consulted before the index is queried. This process is illustrated in Figure 5.11.
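The order of the query-time steps can be sketched in a few lines of Python. This is a toy illustration only, not the actual SharePoint Portal Server implementation: the noise-word list, the suffix-stripping stemmer, the thesaurus entries, and all function names are assumptions made up for the example.

```python
import re

# Hypothetical sample data; the real product ships per-language files.
NOISE_WORDS = {"a", "an", "and", "for", "of", "the", "these", "to"}
THESAURUS = {"car": {"automobile"}}


def break_words(query):
    """Word breaker: split the query text into lowercase tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def remove_noise(words):
    """Drop noise words, which carry no value for searching."""
    return [w for w in words if w not in NOISE_WORDS]


def stem(word):
    """Very naive stemmer: strip one common English suffix."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand(term):
    """Thesaurus lookup: return the term plus any listed synonyms."""
    return {term} | THESAURUS.get(term, set())


def prepare_query(query):
    """Apply the steps in the order described in the text."""
    terms = [stem(w) for w in remove_noise(break_words(query))]
    return [expand(t) for t in terms]


print(prepare_query("Searching for the cars"))
```

Each element of the result is the set of alternatives for one query term; only after this preparation would the index be consulted.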

TIP

Language dependencies apply at both index time and query time. If, for example, a German document is indexed, the word "these" is not filtered out, because it is not a German noise word. But if you query from an English browser, you will never find the word "these", simply because it is removed from the query as an English noise word.
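The mismatch described in this tip can be reproduced with a small sketch. The noise-word lists below are tiny, made-up samples, and the function name is hypothetical; the point is only that the same word survives one language's filter and is dropped by another's.

```python
# Hypothetical, heavily shortened noise-word lists per language.
NOISE_WORDS_BY_LANGUAGE = {
    "en": {"the", "these", "of", "and"},
    "de": {"der", "die", "das", "und"},
}


def filter_terms(words, language):
    """Remove the noise words that belong to the given language."""
    noise = NOISE_WORDS_BY_LANGUAGE[language]
    return [w for w in words if w.lower() not in noise]


# Index time: the document is German, so "These" is kept.
indexed = filter_terms(["die", "These", "und", "Antithese"], "de")

# Query time: the browser language is English, so "these" is removed.
query = filter_terms(["these"], "en")

print(indexed)  # terms that reach the index
print(query)    # terms left to search for
```

Because the query ends up empty, the term stored at index time can never be matched from an English-language query.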



                 
ISBN: 0789725703
Year: 2002
Pages: 286
