Verity Search Query Language


When you use <CFSEARCH>, you can specify the type of search to be performed by using the TYPE attribute. This attribute can be set to either Simple, which is the default, or Explicit.

A simple query expression is typically a word or a set of words, and it applies certain operators by default. An explicit query expression uses operators and modifiers to refine the search, but every aspect of the search must be invoked explicitly.

The Verity query language provides a number of operators and modifiers to compose queries. You can use these search techniques to search a Verity collection:

  • Word searches

  • Proximity searches

  • Concept-based searches

  • Field searches, in which documents are matched on the values of predefined custom attributes

  • Scoring operators

Simple Query Expressions

These queries allow end users to enter simple, comma-delimited strings and use wildcard characters. A simple query searches for words, not strings. For example, entering the word all will find documents containing the word "all," but not "allegorical." You can use wildcards to broaden the scope of the search. For example, all* will return documents containing either "all" or "alliterate."

You can also enter multiple words separated by commas. In a simple query expression, the comma is just like a logical OR. If commas are omitted, the query expression is treated as a phrase.
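As a rough illustration of the comma rule, the following Python sketch splits a simple query expression into its OR alternatives. The function name split_simple_query is hypothetical. The sketch models only the behavior described in this section (commas act as OR, words without commas form a phrase) and is not Verity's actual parser; it ignores operators, quotation marks, and wildcards.

```python
def split_simple_query(expression):
    """Split a simple query into alternatives; commas act as a logical OR.

    Each alternative is a list of words: one word for a word search,
    several words for a phrase (no commas between them).
    """
    alternatives = []
    for part in expression.split(","):
        words = part.split()
        if words:  # skip empty items, such as the one after a trailing comma
            alternatives.append(words)
    return alternatives


print(split_simple_query("honey, milk"))  # two alternatives joined by OR
print(split_simple_query("honey milk"))   # a single two-word phrase
```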

Usually, operators are employed in explicit query expressions. They're normally surrounded by angle brackets (< >). You can also use AND, OR, and NOT in a simple query without using angle brackets. To search for an operator name as an ordinary word, surround it with double quotation marks.

By default, a simple query employs the STEM operator and the MANY modifier. The STEM operator searches for words that can be derived from those entered in the query expression. As a result, entering shift will return documents that contain "shift," "shifting," "shifts," and so on. The MANY modifier counts the number of times a particular search term is encountered when a record is being searched.

When you use the simple syntax, the search engine implicitly interprets single words as though they were modified by the STEM operator and the MANY modifier. Because MANY is applied implicitly, the search engine calculates each document's score from the density of the search term in that document: the more often a word occurs in a document, the higher the document's score. The ranking takes into account the word you specify as well as words that have the same stem.

Explicit Query Expressions

These queries can be constructed using a variety of operators, such as evidence, proximity, relational, concept, and score. Most operators in an explicit query expression are surrounded by angle brackets (< >). The AND, OR, and NOT operators can be used without angle brackets.

When you use explicit syntax, the search engine interprets the search terms you enter as literals. For example, when you enter the word "find" (including quotation marks) using explicit syntax, the stemmed versions of the word such as "finds" and "finding" are ignored.

Using Operators

An operator applies logic and rules to a search element. Logic defines the qualifications a document must meet to be retrieved. Operators are used with a form-based search or when they're hard-coded into the application. All operators except AND, OR, and NOT need to be enclosed in angle brackets to keep the Verity engine from treating them as literal search terms. The syntax for using operators is

 "<operator>search_string" 

These are the various types of operators:

  • Wildcards

  • Evidence operators

  • Proximity operators

  • Relational operators

  • Concept operators

  • Score operators

  • Modifiers
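As a small illustration of the "<operator>search_string" syntax and the angle-bracket rule described above, the following Python sketch builds a query string from an operator name and a search string. The helper names operator_token and prefix_query are hypothetical. The sketch only formats text; whether the search engine accepts a particular combination is not checked.

```python
BARE_OPERATORS = {"AND", "OR", "NOT"}  # usable without angle brackets


def operator_token(name):
    """Return the operator as it would be written in a query."""
    name = name.upper()
    return name if name in BARE_OPERATORS else f"<{name}>"


def prefix_query(operator, search_string):
    """Place an operator in front of a search string."""
    token = operator_token(operator)
    # Bare operators are ordinary words, so they need a separating space.
    separator = " " if token in BARE_OPERATORS else ""
    return f"{token}{separator}{search_string}"


print(prefix_query("stem", "find"))
print(prefix_query("not", "milk"))
```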

Wildcards

These operators return all records that match the wildcard character used in the search criteria. Wildcard operators use back quotes (`) to enclose the search string. These are the various wildcard characters used in Verity:

  • *. The asterisk specifies zero or more alphanumeric characters in a particular search string. * is ignored inside [] and {} wildcard searches.

  • ?. A question mark represents a single alphanumeric character.

  • []. Square brackets represent any one of the characters appearing in a set. Searching for "<WILDCARD>`m[e,a]n`" returns records containing "men" or "man."

  • {}. Curly brackets represent any one of a set of comma-separated patterns. Searching for "<WILDCARD>`learn{ing,ed}`" returns any records containing "learning" or "learned."

  • ^. A caret is used with [] to match any one character that isn't in the set. A search for "<WILDCARD>`m[^e]n`" would match "man," but not "men."

  • -. A hyphen is used in conjunction with []. It specifies a range of characters. For example, a search for "<WILDCARD>`b[a-e]m`" would return "bam," but not "bum."

To search for a wildcard character as a literal, you need to escape it by placing a backslash before it. For example, when you search for "Why?", the question mark needs to be escaped as "Why\?".
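To make the wildcard rules concrete, the following Python sketch translates a wildcard pattern into a regular expression and tests single words against it. This is an emulation for illustration only, not the way Verity evaluates wildcards. The names wildcard_to_regex and matches are hypothetical, and the sketch assumes that commas inside [] merely separate the members of the set, as in the m[e,a]n example. The ^ and - characters are passed through because they carry the same meaning inside a Python character class.

```python
import re

ALNUM = "[A-Za-z0-9]"  # "alphanumeric character" as used in the text


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern such as 'm[e,a]n' into a regex string."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash makes the next character a literal.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "*":
            parts.append(ALNUM + "*")  # zero or more alphanumerics
            i += 1
        elif ch == "?":
            parts.append(ALNUM)  # exactly one alphanumeric
            i += 1
        elif ch == "[":
            end = pattern.index("]", i)  # raises ValueError if unclosed
            # Assumption: commas inside [] only separate the members.
            members = pattern[i + 1:end].replace(",", "")
            parts.append("[" + members + "]")
            i = end + 1
        elif ch == "{":
            end = pattern.index("}", i)  # raises ValueError if unclosed
            options = [re.escape(opt.strip())
                       for opt in pattern[i + 1:end].split(",")]
            parts.append("(?:" + "|".join(options) + ")")
            i = end + 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


def matches(pattern, word):
    """Return True if the whole word matches the wildcard pattern."""
    return re.fullmatch(wildcard_to_regex(pattern), word) is not None


for pattern, word in [("m[e,a]n", "man"), ("m[^e]n", "men"),
                      ("b[a-e]m", "bum"), ("Why\\?", "Why?")]:
    print(pattern, word, matches(pattern, word))
```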

Evidence Operators

These operators find words that are similar to a particular word. A basic word search matches only that particular word. An evidence search also looks for additional words that are related to the basic search term.

These are the various types of evidence operators:

  • SOUNDEX. Finds words that sound similar to the specified word or have a similar structure. It uses the standard AT&T soundex algorithm. For example, "<SOUNDEX>there" also returns records containing "their."

  • STEM. Locates words that can be derived from the search term. A search for "<STEM>find" returns "finding," "finds," and so on.

  • THESAURUS. Allows a search for synonyms.

  • TYPO/N. Searches for words that are spelled similarly. The /N is optional. It specifies the maximum number of spelling errors that are allowed between the search word and any matches.

  • WORD. Matches a specific word without using wildcards or STEM. For example, let's say you want to search for "what?", including the question mark. Instead of escaping the question mark, enter the criteria as "<WORD>what?".
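To show why a SOUNDEX search for "there" can also find "their," the following Python sketch implements the commonly published American Soundex coding, in which words with the same four-character code are treated as sounding alike. Whether Verity's implementation agrees with this variant in every detail is an assumption; the function name soundex is chosen for illustration.

```python
# Consonant groups and their Soundex digits.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                   # the first letter is kept as is
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue                      # H and W do not break a run of equal codes
        code = CODES.get(letter, "")      # vowels (and Y) have no code
        if code and code != previous:
            result += code
        previous = code
    return (result + "000")[:4]           # pad with zeros, keep four characters


print(soundex("there"), soundex("their"))
```

Both words produce the same code, so a Soundex-based comparison treats them as a match.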

Proximity Operators

These operators specify how close together the search terms must occur within a record. For example, they can search for records that contain the search terms within the same phrase or sentence. Records returned by these searches are ranked. These are some proximity operators:

  • IN. Finds the documents that contain the search term in a specified document zone. Document zones are areas within a document such as the title or the body, as defined by Verity.

  • NEAR. Retrieves records containing the specified search terms. Records are scored on the basis of proximity: the closer the search terms are to one another, the higher the score.

  • PARAGRAPH. Selects records that contain all the specified search terms within the same paragraph.

  • PHRASE. Locates records that contain the phrase specified in the search criteria. A phrase consists of two or more words that occur in a specified order.

  • SENTENCE. Selects records that contain all the words specified within the same sentence.

  • NEAR/N. Finds documents that contain two or more search terms within N words of each other. N is an integer between 1 and 1,024.
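The NEAR/N test can be pictured with the following Python sketch, which checks whether two terms occur within N words of each other. It is an emulation for illustration, not Verity's algorithm. The function name within_n_words is hypothetical, the sketch handles only two terms, and it assumes that distance is the difference between word positions, so that N = 1 means the words are adjacent.

```python
import re


def within_n_words(text, term_a, term_b, n):
    """Return True if term_a and term_b occur at most n word positions apart."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(a - b) <= n for a in positions_a for b in positions_b)


text = "Honey and warm milk"
print(within_n_words(text, "honey", "milk", 3))
print(within_n_words(text, "honey", "milk", 2))
```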

Relational Operators

These operators search specified document fields within collections. There are five Verity document fields: TITLE, KEY, URL, CUSTOM1, and CUSTOM2. In ColdFusion, these fields are known as CF_TITLE, CF_KEY, CF_URL, CF_CUSTOM1, and CF_CUSTOM2. The MANY modifier cannot be used with relational operators.

These relational operators are used for text comparisons:

  • CONTAINS. Finds records by matching a word or phrase within a specified document field. To find records that contain the word "ham" anywhere in the TITLE field, use "CF_TITLE<CONTAINS>ham".

  • STARTS. Finds records in which the value of the specified document field begins with the characters given in the search criteria.

  • ENDS. Finds records in which the value of the specified document field ends with the characters given in the search criteria.

  • SUBSTRING. Finds records in which the search criteria occur anywhere in the specified document field, including as part of a word or phrase.
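The difference between the four text comparison operators is easiest to see side by side. The following Python sketch emulates them on a single field value. It is only an illustration of the descriptions in this section: the function name field_matches is hypothetical, and the case-insensitive comparison is an assumption rather than documented Verity behavior.

```python
import re


def field_matches(operator, field_value, criteria):
    """Emulate a text comparison operator on one document field value."""
    value = field_value.lower()    # assumption: comparison ignores case
    needle = criteria.lower()
    if operator == "CONTAINS":
        # The criteria must appear as a whole word or phrase.
        return re.search(r"\b" + re.escape(needle) + r"\b", value) is not None
    if operator == "STARTS":
        return value.startswith(needle)
    if operator == "ENDS":
        return value.endswith(needle)
    if operator == "SUBSTRING":
        # The criteria may appear anywhere, even inside a word.
        return needle in value
    raise ValueError(f"unsupported operator: {operator}")


title = "Hamster wheel"
for op in ("CONTAINS", "STARTS", "ENDS", "SUBSTRING"):
    print(op, field_matches(op, title, "ham"))
```

In this emulation, "ham" satisfies STARTS and SUBSTRING for the title "Hamster wheel" but not CONTAINS, because "ham" does not occur there as a whole word.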

These relational operators are used for data comparisons:

  • = (Equals)

  • > (Greater than)

  • >= (Greater than or equal to)

  • < (Less than)

  • <= (Less than or equal to)

Concept Operators

These operators identify a concept in a document by linking a group of search terms using the criteria specified by the operator. Retrieved records are ranked on the basis of the density of the search criteria.

Here are some concept operators:

  • OR. Returns records in which at least one of the search terms is found. For example, a search with "honey OR milk" will return all the records that have either "honey" or "milk."

  • ACCRUE. Returns records that contain at least one of the search terms specified. Although it's similar to OR, ACCRUE also ranks the records: the more of the search terms a record contains, the higher it ranks.

  • AND. Returns the records when all the search terms are found. For example, searching for "honey AND milk" would return only those records that contain both "honey" and "milk."

  • ALL. Returns records when all the search terms are found. It's similar to AND.

  • ANY. Returns all records when any of the search terms are found. It's similar to OR.

Score Operators

These operators determine how the scores of records that match the search criteria are calculated. Each document is given a score, expressed as a decimal between 0 and 1, based on the operators applied to the search criteria.

Here are the score operators:

  • COMPLEMENT. Returns the complement value for the score of a matching record. For example, if a search results in assigning a score of 0.3 to the matching document, a search using COMPLEMENT would get a score of 0.7.

  • PRODUCT. Multiplies together the scores of the search terms found in a particular record. Records with higher-scoring matches score higher than records with lower-scoring matches.

  • SUM. Adds scores for records matching the search criteria.

  • YESNO. Forces the score of the search to 1 if the calculated score of the term is not zero.
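The arithmetic behind the score operators can be sketched in a few lines of Python. The sketch assumes that scores lie between 0 and 1, as in the COMPLEMENT example above. The function names are hypothetical, and capping the SUM result at 1 is an assumption made to keep the result in that range.

```python
import math


def complement(score):
    """COMPLEMENT: 1 minus the original score."""
    return 1.0 - score


def product(scores):
    """PRODUCT: multiply the scores of the individual terms."""
    return math.prod(scores)


def score_sum(scores):
    """SUM: add the scores (assumption: the total is capped at 1)."""
    return min(1.0, sum(scores))


def yesno(score):
    """YESNO: any nonzero score becomes 1."""
    return 1.0 if score != 0 else 0.0


print(round(complement(0.3), 4))
print(round(product([0.5, 0.4]), 4))
print(score_sum([0.5, 0.8]))
print(yesno(0.25))
```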

Modifier Operators

These operators change the behavior of standard operators in a predetermined way. Here are the modifier operators:

  • CASE. Specifies a case-sensitive search. "<CASE>J[APAN, apan]" searches for "JAPAN" and "Japan." If a search contains a mixed-case string, the search request will be case-sensitive.

  • MANY. Counts the density of words, stemmed variations, or phrases in a document and produces a relevance-ranked score for retrieved documents. It can be used with these operators: WORD, WILDCARD, STEM, PHRASE, SENTENCE, and PARAGRAPH. The MANY modifier cannot be used with AND, OR, ACCRUE, or the relational operators.

  • NOT. Excludes documents that contain the specified word or phrase. Used only with the AND and OR operators.

  • ORDER. Specifies that search elements must occur in the same order in which they were specified in the query. It can be used with the operators PARAGRAPH and SENTENCE.

Special Characters

The Verity search engine handles a number of characters in a special way:

  • , () [. These characters end a text token.

  • = > < !. These characters also end a text token. They're terminated by an associated end character.

  • ' ` < { [ !. These characters signify the start of a delimited token. They're terminated by an associated end character.

A backslash (\) is used for escaping and removes the special meaning of the character that follows it. To enter a literal backslash in a query, use two in succession:

 <FREETEXT>("\"There is nothing\", said Marshall.")
 "backslash (\\)"
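When search text comes from user input, the escaping rule above can be applied mechanically. The following Python sketch doubles each backslash and places a backslash before each double quotation mark. The function name escape_for_query is hypothetical, and the sketch covers only these two characters; other special characters would need the same treatment.

```python
def escape_for_query(text):
    """Escape backslashes and double quotation marks for use in a query."""
    escaped = text.replace("\\", "\\\\")  # backslashes first, so they aren't doubled twice
    return escaped.replace('"', '\\"')


print(escape_for_query('"There is nothing", said Marshall.'))
print(escape_for_query("backslash (\\)"))
```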




Macromedia ColdFusion MX. Professional Projects
ISBN: 1592000126
Year: 2002
Pages: 200