Understanding Evidence Operators


Verity's evidence operators control whether Verity steps in and searches for words that are slightly different from the search words you actually specify.

Remember that, unlike other operators, evidence operators cannot be used with prefix notation. Instead, you must specify them with infix notation; that is, insert them between each word of a set. Evidence operators include STEM, WILDCARD, WORD, SOUNDEX, THESAURUS, and TYPO/n.

The STEM operator tells Verity to expand the search to include grammatical variations of the search words you specify. You can specify something other than the root word; Verity finds the root of each word and then searches for all the common variations of that root. If you used permitting as the search criterion, Verity would also search for permit and permitted. Here are some examples:

 CRITERIA="<STEM> permitting"
 CRITERIA="AND (<STEM> permitting, <STEM> smoke)"

NOTE

The STEM operator is implied in simple Verity searches. To prevent this behavior, specify an explicit search with TYPE="Explicit" in the <CFSEARCH> tag.


The WILDCARD operator tells Verity that the search words contain wildcards it should consider during the search. Note that Verity treats two of the wildcard characters, the question mark (?) and the asterisk (*), as wildcards even if you don't specify the WILDCARD operator. The other wildcard characters behave as wildcards only if you use the WILDCARD operator. The following statements are examples:

 CRITERIA="smok*"
 CRITERIA="smok?"
 CRITERIA="<WILDCARD>smok*"
 CRITERIA="<WILDCARD>'smok{ed,ing}'"

Table E.1 summarizes the possible operators for a wildcard value.

Table E.1. Verity Wildcards

WILDCARD   PURPOSE

*          Like the percent (%) wildcard in SQL, * stands in for any number of characters (including 0). A search for Fu* would find Fusion, Fugazi, and Fuchsia.

?          Like the underscore (_) wildcard in SQL, ? stands in for any single character. It's more precise, and thus generally less helpful, than the * wildcard. A search for ?ar?et would find both carpet and target, but not Learjet.

{ }        The curly brackets enable you to specify a number of possible word fragments, separated by commas. A search for {gr,fragr,deodor}ant would find documents that contained grant, fragrant, or deodorant.

[ ]        The square brackets work like { }, except that they stand in for only one character at a time. A search for f[eao]ster would find documents that contained fester, faster, or foster.

-          The minus sign allows you to place a range of characters within square brackets. Searching for A[C-H]50993 is the same as searching for A[CDEFGH]50993.
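One way to get a feel for these wildcards is to translate a pattern into a regular expression and try it against a few words. The following Python sketch does that for the five wildcards in Table E.1. It is only an illustration of the matching rules described in the table, not Verity's implementation; the function names verity_wildcard_to_regex and matches are hypothetical, and case handling is not modeled.

```python
import re


def verity_wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a Table E.1 wildcard pattern into a Python regex.

    Illustration only: this mirrors the rules in the table and is not
    how Verity itself evaluates wildcards.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            parts.append(".*")          # any number of characters, including 0
        elif ch == "?":
            parts.append(".")           # exactly one character
        elif ch == "{":
            end = pattern.index("}", i)  # raises ValueError if unbalanced
            fragments = pattern[i + 1:end].split(",")
            parts.append("(?:" + "|".join(re.escape(f) for f in fragments) + ")")
            i = end
        elif ch == "[":
            end = pattern.index("]", i)
            # A character set; ranges such as C-H mean the same in a regex class.
            parts.append("[" + pattern[i + 1:end] + "]")
            i = end
        else:
            parts.append(re.escape(ch))  # ordinary character
        i += 1
    return re.compile("".join(parts))


def matches(pattern: str, word: str) -> bool:
    """Return True if the whole word fits the wildcard pattern."""
    return verity_wildcard_to_regex(pattern).fullmatch(word) is not None
```

Calling matches("?ar?et", "carpet") or matches("smok{ed,ing}", "smoking") lets you check a pattern against sample words before using it in a CRITERIA value.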


If you use any wildcard other than ? or *, you must enclose the wildcard pattern itself in either single or double quotation marks. I recommend single quotation marks, because the CRITERIA value as a whole should be enclosed in double quotation marks.
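The quoting rule is easy to forget when criteria strings are assembled by code. As a minimal sketch, the hypothetical Python helper below (wildcard_criteria is not part of ColdFusion or Verity) builds the text that goes inside the CRITERIA attribute, adding single quotation marks only when the pattern uses curly or square brackets.

```python
def wildcard_criteria(pattern: str) -> str:
    """Build a <WILDCARD> criteria string for a wildcard pattern.

    Hypothetical helper: patterns that use { } or [ ] are wrapped in
    single quotation marks; patterns that use only ? or * are left bare.
    """
    needs_quotes = any(ch in pattern for ch in "{}[]")
    body = f"'{pattern}'" if needs_quotes else pattern
    return f"<WILDCARD>{body}"
```

The result is meant to be placed inside the double quotation marks of the CRITERIA attribute, which is why the helper uses single quotation marks for the pattern.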

The WORD operator tells Verity to perform a simple word search, without any use of wildcards or the STEM operator. Including a WORD operator is a good way to suppress Verity's default use of the STEM operator; it is also effective if you don't want the ? in a search for Hello? to be treated as a wildcard character. Here are some examples:

 CRITERIA="<WORD>smoke"
 CRITERIA="<WORD>Hello?"

The SOUNDEX operator enables you to search for documents containing words that sound like or have a similar spelling to the word in your search criteria; for example:

 CRITERIA="<SOUNDEX>hire" 

This would find documents containing higher and hear. ColdFusion MX does not support the <SOUNDEX> operator as installed, but with a little coaxing you can add support for Soundex Verity searches. Adding this functionality involves editing two simple text files that Verity uses for configuration information. Both files are named style.prm, but they are located in different directories. One affects searches against collections built from files or paths; the other affects searches against custom collections (in other words, collections built on query result sets).

To enable the use of <SOUNDEX> in searches against collections of type file or path, edit the style.prm file in the folder:

 c:\CFusionMX\lib\common\style\file 

To enable the use of <SOUNDEX> in searches against collections of type custom, edit the style.prm file in the folder:

 c:\CFusionMX\lib\common\style\custom 

NOTE

If you did not install ColdFusion in the default location (c:\CFusionMX\), replace c:\CFusionMX\ in the paths above with the directory in which you installed ColdFusion.


In each file, look for this text:

 $define     WORD-IDXOPTS    "Stemdex Casedex" 

Replace it with this text:

 $define     WORD-IDXOPTS    "Stemdex Casedex Soundex" 

You are not giving anything up by defining this value differently. Once you've made this change, you'll be able to use the <SOUNDEX> operator in your search criteria.
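If you would rather script the change than edit both files by hand, the following Python sketch shows one way to do it. It assumes the WORD-IDXOPTS line looks like the one shown above; the function names enable_soundex and patch_style_file are hypothetical, and the backup step is a precaution added here rather than something Verity requires.

```python
import re
from pathlib import Path

# Matches a line such as:  $define     WORD-IDXOPTS    "Stemdex Casedex"
_IDXOPTS = re.compile(r'^([ \t]*\$define[ \t]+WORD-IDXOPTS[ \t]+")([^"]*)(")',
                      re.MULTILINE)


def enable_soundex(style_text: str) -> str:
    """Return style.prm text with Soundex added to WORD-IDXOPTS if missing."""
    def add_option(match: re.Match) -> str:
        options = match.group(2).split()
        if "Soundex" not in options:
            options.append("Soundex")   # keep the existing options in place
        return match.group(1) + " ".join(options) + match.group(3)

    return _IDXOPTS.sub(add_option, style_text)


def patch_style_file(path: str) -> None:
    """Rewrite one style.prm file in place, keeping a .bak copy first."""
    target = Path(path)
    original = target.read_text()
    target.with_name(target.name + ".bak").write_text(original)
    target.write_text(enable_soundex(original))
```

You would call patch_style_file once for the style.prm file in each of the two folders named earlier. Running it a second time leaves the file unchanged, because Soundex is added only when it is not already listed.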

The THESAURUS operator tells Verity to search for the word specified in your criteria and any synonyms. Here is an example:

 CRITERIA="<THESAURUS>weak" 

This could be used to find documents containing frail, feeble, and so forth.

The THESAURUS operator is supported in ColdFusion MX 7, but only for the following languages:

  • Danish

  • Dutch

  • English

  • Finnish

  • French

  • German

  • Italian

  • Norwegian

  • Norwegian (Bokmål)

  • Norwegian (Nynorsk)

  • Portuguese

  • Spanish

  • Swedish

The TYPO/n operator lets you search for words similar in spelling to, but not the same as, the search criteria. The n represents the number of letters that can be different and still result in a match, as in this search:

 CRITERIA="<TYPO/1>receipt" 

This would find documents containing recieve but not receive.

The default value of n in ColdFusion is 2.

Here are some notes about using these advanced operators:

  • <SOUNDEX> cannot be used together with <THESAURUS>.

  • It is recommended that you not use <TYPO/n> on collections containing more than 100,000 documents.

  • You must use TYPE="Explicit" in your <CFSEARCH> tag when using the <THESAURUS>, <SOUNDEX>, and <TYPO/n> operators in your search criteria. You won't get an error if you don't, but you may not get the expected results.



Macromedia Coldfusion MX 7 Web Application Construction Kit
ISBN: 321223675
EAN: N/A
Year: 2006
Pages: 282
