How Microsoft SPS Crawls | Special Edition Using Microsoft SharePoint Portal Server

	Special Edition Using Microsoft SharePoint Portal Server By Robert Ferguson

	Chapter 18. Configuring SPS to Crawl Other Content Sources

Different portal products emphasize different features, such as document management, personalization, and threaded discussions. But the most pervasive feature in any portal product is the ability to facilitate searching by crawling, indexing, analyzing, and sorting large quantities of data. In this way, the portal becomes a capable machine for turning raw data into real information, whether the data resides on a corporate intranet or across the globe.

To crawl simply means to search through content for inclusion in an index. Once indexed, the data may then be sorted into various categories or classifications according to a set of rules: the taxonomy. The taxonomy organizes the data around business-oriented, customer-driven concepts that are meaningful to the end-user community. At this point, the raw data becomes useful, as it is made available to users through two primary vehicles:

Search capabilities
Customizable "views" of the data
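
The crawl, index, and categorize sequence described above can be illustrated with a minimal sketch. This is not how SPS is implemented internally; it only shows the general idea under simple assumptions. The document paths, their text, the taxonomy rules, and the function names `tokenize`, `crawl`, and `categorize` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "content source": path -> document text.
DOCUMENTS = {
    "intranet/hr/leave-policy.txt": "Vacation and sick leave policy for employees",
    "intranet/sales/q3-forecast.txt": "Sales forecast and revenue pipeline for Q3",
    "intranet/it/vpn-setup.txt": "How to set up the VPN client on a laptop",
}

# Hypothetical taxonomy rules: category -> keywords that place a document in it.
TAXONOMY_RULES = {
    "Human Resources": {"leave", "vacation", "employees"},
    "Sales": {"sales", "revenue", "forecast"},
    "IT Support": {"vpn", "laptop"},
}


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crawl(documents):
    """Read every document and build an inverted index: term -> set of paths."""
    index = defaultdict(set)
    for path, text in documents.items():
        for term in tokenize(text):
            index[term].add(path)
    return index


def categorize(index, rules):
    """Place a document in a category when any of the category's keywords was indexed for it."""
    categories = defaultdict(set)
    for category, keywords in rules.items():
        for keyword in keywords:
            categories[category] |= index.get(keyword, set())
    return categories


index = crawl(DOCUMENTS)
categories = categorize(index, TAXONOMY_RULES)

for category in sorted(categories):
    print(category, sorted(categories[category]))
```

The index serves search, while the category mapping is the kind of structure a customizable view of the data could be built on.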

Microsoft's search engine, which runs as a service, is what makes these crawling capabilities possible. SharePoint uses a highly refined search engine resulting from years of research and development. This search engine (also known as MSSearch) employs crawling and search algorithms that use probability rankings, inverse queries, and vector machine categorization. In simplest terms, though, the search engine compares the query (the search criteria) to the documents that have been indexed, and then chooses the documents that are most relevant to the query.
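
The "simplest terms" description, comparing a query to indexed documents and choosing the most relevant ones, can be sketched as follows. This example uses plain cosine similarity over term counts, which is a stand-in chosen for illustration and not the ranking method MSSearch actually uses. The document names, their text, and the functions `tokenize`, `cosine`, and `search` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, limit=3):
    """Return up to `limit` document names, most similar to the query first."""
    query_vector = Counter(tokenize(query))
    scored = []
    for name, text in documents.items():
        score = cosine(query_vector, Counter(tokenize(text)))
        if score > 0:  # ignore documents that share no terms with the query
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


# Hypothetical already-indexed documents.
INDEXED = {
    "vpn-setup.txt": "set up the vpn client on a laptop",
    "vpn-policy.txt": "vpn access policy for remote staff",
    "forecast.txt": "sales forecast for the third quarter",
}

results = search("vpn laptop", INDEXED)
print(results)
```

Here the setup document ranks first because it matches both query terms, the policy document ranks second because it matches only one, and the forecast document is omitted because it shares no terms with the query.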
