Populating and Indexing a Collection


After creating a Verity collection, you need to populate it with indexed data. You can use either ColdFusion Administrator or the <CFINDEX> tag to populate the collection.

Indexing Using ColdFusion Administrator

Follow these steps to use ColdFusion Administrator to create indexed data:

  1. Open the ColdFusion Administrator Verity Collections page.

  2. Select the collection to be populated and click Index.

  3. Enter the list of file extensions for the Verity engine to index.

  4. Enter the directory path to be used for indexing.

  5. Select the Recursive indexing option if directories below the parent directory are to be indexed. The default is the parent directory only.

  6. Enter a return URL to be prepended to the path of every indexed file.

  7. Specify an alternative language, if desired.

  8. Click Update.

Indexing Using the <CFINDEX> Tag

You can use the <CFINDEX> tag to index files programmatically. This provides better control over the Verity collections. The following is the syntax for <CFINDEX>:

 <CFINDEX ACTION="Action"
          COLLECTION="collection_name"
          KEY="ID"
          TITLE="title"
          TYPE="type"
          QUERY="query_name"
          BODY="body"
          CUSTOM1="custom_value"
          CUSTOM2="custom_value"
          LANGUAGE="language">

Here are the attributes of <CFINDEX>:

  • ACTION. Specifies the action to be performed: Update, Delete, Purge, or Refresh.

  • COLLECTION. Specifies the name of the collection to populate, created either in ColdFusion Administrator or with the <CFCOLLECTION> tag.

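  • KEY. Specifies the unique identifier of each indexed item. For a Custom index built from a query, this is the query column that holds the primary key. For a File or Path index, it is the file or directory path to index.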

  • TITLE. Specifies the title of the document: either literal text or, when QUERY is supplied, the name of the query column that holds the title. Supply it when you index a database so that each search result carries a title.

  • TYPE. Specifies the type of index being created: File, Path, or Custom. The default is Custom.

  • QUERY. Specifies the name of the query whose results are indexed.

  • BODY. Specifies the text that Verity indexes and searches. For a query, give a column name or a comma-delimited list of column names.

  • CUSTOM1 and CUSTOM2. Serve as the optional attributes that associate information with each record returned by the search.
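The ACTION values differ mainly in what happens to data already in the collection: Update adds or replaces records, Refresh clears the collection and repopulates it, Delete removes individual keys, and Purge empties the collection. As a minimal sketch of the two removal actions, assuming a collection named TestCollection and a hypothetical key value of 1042:

```cfml
<!--- Remove a single record from the collection.
      "1042" is a hypothetical key value used only for illustration. --->
<CFINDEX COLLECTION="TestCollection"
         ACTION="delete"
         TYPE="custom"
         KEY="1042">

<!--- Remove every record but keep the (now empty) collection --->
<CFINDEX COLLECTION="TestCollection"
         ACTION="purge">
```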

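The following sample code indexes the HTML pages in a directory and its subdirectories. With TYPE="path", KEY names the directory to index (DirIndex is a placeholder here), and URLPATH is prepended to each file name in the search results:

 <CFINDEX COLLECTION="TestWebSite"
          KEY="DirIndex"
          ACTION="refresh"
          TYPE="path"
          URLPATH="http://111.0.0.1"
          EXTENSIONS=".htm, .html"
          RECURSE="Yes">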

You can also use <CFINDEX> to index query results. Recall that <CFQUERY> queries a database, where search and retrieval rely on keys and exact matches on field contents. Verity indexes are better suited to text searches, so you index the query results first. Verity then finds the matching records, and you use the primary key stored with each record to query the database for the full row.

For example, the following code indexes the output of a query:

 <CFINDEX COLLECTION="TestCollection"
          ACTION="update"
          TYPE="custom"
          BODY="IndexField"
          KEY="DataId"
          TITLE="ColumnName"
          QUERY="Qname">

In the preceding example, Qname is the name of the query, IndexField is the query column to be indexed, DataId is the primary key column, and ColumnName is the column that supplies the title. RECURSE is omitted because it applies only to Path indexes.
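The two-step lookup described above (Verity finds the records, then the database is queried by primary key) might look like the following sketch. It assumes TestCollection was populated as shown; the data source name SampleData, the table name DataTable, and the form field Form.Criteria are hypothetical, and DataId is assumed to be numeric.

```cfml
<!--- Hypothetical names: SampleData (data source), DataTable (table),
      Form.Criteria (search box). DataId is assumed to be an integer key. --->
<CFPARAM NAME="Form.Criteria" DEFAULT="">

<CFIF Len(Trim(Form.Criteria))>
  <!--- Step 1: let Verity find the matching records --->
  <CFSEARCH COLLECTION="TestCollection"
            NAME="Hits"
            TYPE="simple"
            CRITERIA="#Form.Criteria#">

  <CFIF Hits.RecordCount GT 0>
    <!--- Step 2: fetch the full rows using the keys Verity returned --->
    <CFQUERY NAME="Details" DATASOURCE="SampleData">
      SELECT *
      FROM DataTable
      WHERE DataId IN (<CFQUERYPARAM VALUE="#ValueList(Hits.Key)#"
                                     CFSQLTYPE="CF_SQL_INTEGER"
                                     LIST="Yes">)
    </CFQUERY>
  </CFIF>
</CFIF>
```

Passing the key list through <CFQUERYPARAM> with LIST="Yes" binds each key as a parameter instead of splicing the list into the SQL text.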

Indexing Query Results

You can index the results of database, LDAP, and POP queries. For text searches, a search against a Verity collection is generally faster than running the equivalent search through the <CFQUERY> tag.

Verity collection search is preferred under the following conditions:

  • Indexing text data. Once textual data has been indexed into a Verity collection with <CFINDEX>, you can search it more efficiently than you can with <CFQUERY>.

  • Requirement for queries only. Many applications provide end users with a query facility only. The end user isn't expected to update the database at all. In such cases, the use of Verity collection-based search queries is preferred. You can also use Verity collections if you don't want the data source to be exposed to the end user directly.

An extra step is involved when you index the result set from a ColdFusion query. First, you write the query and, optionally, output its results. Then, you use <CFINDEX> on the result set from the ColdFusion query. This step isn't required when you index documents.

To index a ColdFusion query:

  1. Create a collection in ColdFusion Administrator.

  2. Execute a query and display the query data.

  3. Populate the collection using <CFINDEX>.

You need to specify the KEY attribute to populate a collection from <CFQUERY>, as seen in the previous section. This attribute corresponds to the primary key of the data source table. The BODY attribute corresponds to the column or columns that you want to index and search.

The following example shows the use of <CFQUERY> and <CFINDEX>:

 <!--- Select the table --->
 <CFQUERY NAME="Names" DATASOURCE="SampleData">
   SELECT * FROM Names
 </CFQUERY>
 <!--- Output the query result --->
 <CFOUTPUT QUERY="Names">
   #Name_Id#, #Name#, #Description#
 </CFOUTPUT>
 <!--- Use CFINDEX to index the result set --->
 <CFINDEX COLLECTION="NameIndex"
          ACTION="Update"
          TYPE="Custom"
          BODY="Name"
          KEY="Name_Id"
          TITLE="Description"
          QUERY="Names">

In the preceding example, Name_Id is the primary key. The TITLE attribute names the column (Description) whose value is returned as the title of each search result.
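Once the collection is populated, it can be searched with <CFSEARCH>. The following is a minimal sketch, assuming the NameIndex collection was populated as shown above; the search term "smith" and the result name NameHits are hypothetical. Each result row exposes the indexed key and title, along with a relevance score.

```cfml
<!--- Assumption: NameIndex was populated as in the preceding example.
      "smith" is a hypothetical search term. --->
<CFSEARCH COLLECTION="NameIndex"
          NAME="NameHits"
          TYPE="simple"
          CRITERIA="smith">

<CFOUTPUT>#NameHits.RecordCount# record(s) found<BR></CFOUTPUT>

<!--- Key holds the Name_Id value; Title holds the Description value --->
<CFOUTPUT QUERY="NameHits">
  #NameHits.Key#: #NameHits.Title# (score #NameHits.Score#)<BR>
</CFOUTPUT>
```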

You can also index a <CFLDAP> query result. LDAP is widely used to develop directory structures that are amenable to searches. You can index data from an LDAP query, and end users can then search this information.

Caution

Keep these points in mind when creating an index from an LDAP query:

LDAP structures vary extensively. You must know the server's directory schema and the exact name of every LDAP attribute you intend to use in a query.

The records on an LDAP server change frequently. You need to index the collection again before processing a search request.

In the following example, <CFLDAP> is used in conjunction with <CFINDEX> to populate the ldap_sample collection from the LDAP server ldap.sample.com. The LDAP query searches for the name Marshall.

Note

Since LDAP servers use the Distinguished Name (dn) attribute as the unique identifier for each record, that attribute is used as the key value for the index.

 <!--- Run the LDAP query: search the subtree under c=US for cn=marshall --->
 <CFLDAP SERVER="ldap.sample.com"
         ACTION="query"
         NAME="SampleQuery"
         START="c=US"
         SCOPE="subtree"
         FILTER="(cn=marshall)"
         ATTRIBUTES="dn,cn,o,mail">
 <!--- Populate the ldap_sample collection with the query result set --->
 <CFINDEX ACTION="refresh"
          COLLECTION="ldap_sample"
          KEY="dn"
          TYPE="Custom"
          BODY="cn,o,mail"
          QUERY="SampleQuery">

<CFINDEX> can also be used to index the contents of a POP mailbox. This is useful when you want to make the mailbox of a mailing list or a similar application searchable.

Caution

The contents of mail servers change frequently. The message number is reset when messages are added and deleted. You should index the collection again before processing a search.

You need to provide a unique value for the KEY attribute and enter the data fields to index in the BODY attribute.

The following example updates the pop_sample collection with the current mail for sampleuser. It then searches the collection for all messages containing the word "administration". Each result carries the message number as its key and the subject line as its title:

 <!--- Query the POP server --->
 <CFPOP ACTION="getall"
        NAME="GetMessages"
        SERVER="111.0.0.1"
        USERNAME="sampleuser"
        PASSWORD="pwd12">
 <!--- Populate the pop_sample collection with the results of the CFPOP query --->
 <CFINDEX ACTION="update"
          COLLECTION="pop_sample"
          KEY="messagenumber"
          TYPE="custom"
          TITLE="subject"
          QUERY="GetMessages"
          BODY="body">
 <!--- Search messages for the word "administration" --->
 <CFSEARCH COLLECTION="pop_sample"
           NAME="sample_messages"
           CRITERIA="administration">
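To display what the search found, you can loop over the <CFSEARCH> result. The following is a sketch only, assuming the pop_sample collection has been populated with the message number as the key and the subject as the title, as described above.

```cfml
<!--- Assumption: pop_sample was populated as in the preceding example --->
<CFSEARCH COLLECTION="pop_sample"
          NAME="sample_messages"
          TYPE="simple"
          CRITERIA="administration">

<CFIF sample_messages.RecordCount EQ 0>
  No messages matched.
<CFELSE>
  <!--- Key holds the message number; Title holds the subject line --->
  <CFOUTPUT QUERY="sample_messages">
    Message #sample_messages.Key#: #HTMLEditFormat(sample_messages.Title)#<BR>
  </CFOUTPUT>
</CFIF>
```

HTMLEditFormat escapes any markup in the subject line before it is written to the page.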

Selecting the Indexing Method

Before indexing, decide whether to use ColdFusion Administrator or <CFINDEX>. The following guidelines can help you choose. ColdFusion Administrator is appropriate under the following conditions:

  • File type. You want to index document files.

  • Updating frequency. The collection isn't likely to be updated frequently, or it's a one-time collection.

  • Level of customization. You want to generate the collection without using any CFML code.

<CFINDEX> is more appropriate under these conditions:

  • File type. You want to index query results.

  • Updating frequency. The collection is likely to be updated frequently.

  • Level of customization. You need to populate or update a collection dynamically from a ColdFusion template.




Macromedia ColdFusion MX. Professional Projects
ColdFusion MX Professional Projects
ISBN: 1592000126
EAN: 2147483647
Year: 2002
Pages: 200
