Maintaining full-text indexes presents some new challenges to the database administrator. One of the most important choices that must be made with regard to full-text indexes is how to repopulate them.
Choosing the Best Refresh Schedule
Updating a full-text index is an expensive operation. If possible, restrict the refresh periods to an off-peak time window. Complete refreshes are the simplest method for maintaining your indexes, but large data sets (more than a million rows) can be unwieldy. The correct choice for repopulation schedules must be made with application design goals in mind. Do you require real-time indexing? Can you live with one hour , one day, or one week of lag time between updating the index?
You have several options for refreshing full-text data:
Backing Up Full-Text Indexes
The ability to back up the full-text indexes depends heavily on the update method you have chosen . In a worst-case scenario, you can restore your database backups (which are kept consistent by a transaction log), rebuild the catalog, and then perform a full repopulation. However, with large full-text indexes, this could be a lengthy process.
If you are using change tracking, it is more difficult to implement a solid backup policy because there are constant changes occurring to the full-text data. You could perform a backup of the files once a day, and restore these partial indexes in the event of a failure. This would provide limited functionality until you could take the system down for a full repopulation.
If you are using incremental updates, you can back up the full-text indexes that are in the file system separately by stopping the MSSearch service, backing up all the files in the catalog, backing up the SQL Server database, and restarting the MSSearch service. In the event of an index corruption or file system failure, stop the search service. Restore these files, restart the search server, and immediately perform an incremental full-text update. Search Server will receive all the changes to the table based on the timestamp from the last update.
Effects of Inconsistent Data
Maintenance of full-text indexes is an asynchronous process, meaning that your indexes can easily be out of sync with your data. SQL Server uses transactions and locking to prevent inconsistency in its data, but the full-text search engine has neither of these features.
This section describes the effects of inconsistent data with Search Server. Following are descriptions of what would happen if a row is inserted, updated, or deleted in the SQL Server table, but the full-text index has not yet been updated to reflect the change.
A row that is inserted to the table, but isn't in the index, will not be returned by any search that would normally find it. Because the unique key associated with the word did not exist at the time of the index creation, the new row will not be reachable by Search Server. Of course, you can still get to the row through conventional Transact -SQL commands.
A deleted row will work correctly, although extra processing will be involved. When you write a query to find certain words, SQL Server asks the Search Server for the key values of rows that satisfy your request. When these results are returned, SQL Server attempts to retrieve the row using this key. If the row has been deleted, no row will be returned.
If the text for an indexed column is modified such that it no longer satisfies the full-text search criteria, the index will still find the old row, but will return the new information. Because the key value is unchanged, and the Search Server only returns the key values that satisfy your query, SQL Server has no way to know that the results are incorrect. This can be confusing to a user .