Implementing a Taxonomy


Now that we have shared more than you ever planned to learn about taxonomies, semantics, and artificial intelligence, it's time to tackle the implementation of your portal taxonomy. There are three general approaches to creating a taxonomy:

  • Automatic taxonomy creation and document categorization

  • Human taxonomy creation

  • Assisted taxonomy creation and document categorization

You may find that two or even all three approaches have value for your project. Bear in mind that taxonomies are never really complete as long as new content is being added. They grow and evolve over time in response to the changing demands of users.

The taxonomy industry has continued to grow and now offers a wide range of technology and products at various price points. Small and mid-sized companies can now offer their employees and customers the same benefits formerly available only to corporate behemoths. In addition to public Internet and corporate portals, taxonomies are also finding their way into vertical portals, customer and partner extranet sites, and even very specialized knowledge worker document repositories.

Automatic Taxonomy Creation and Document Categorization

Microsoft does not currently offer a product that automatically categorizes documents or creates a taxonomy. Several other vendors have taken this approach, however, and you can use these third-party products to search and categorize your web pages, documents, and other content sources.

A number of algorithms have been developed to enable categorization of data repositories. In its taxonomy and content categorization study, the Delphi Group identified several basic algorithms, including:

  • Linguistic analysis, which identifies the subject, verbs, and objects of a sentence and then analyzes them to extract meaning.

  • Statistical text analysis and clustering, which measure word frequency, placement, and grouping and the distance between words in a document.

  • Rule-based taxonomies, which classify documents based on specific rules created and maintained by experts using if-then statements that measure how well a document fits into a category. [1]

    [1] A Delphi Group White Paper, "Taxonomy and Content Classification: Market Milestone Report," April 11, 2002, p. 16.
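To make the rule-based approach more concrete, here is a minimal Python sketch of if-then rules that score how well a document fits each category. It is an illustration only, not how any of the commercial products works: the categories, keywords, weights, and the threshold value are all hypothetical, and real rule sets maintained by experts would be far richer.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """One if-then rule: if `keyword` appears, add `weight` to `category`."""
    category: str
    keyword: str
    weight: float


# Hypothetical rules; in practice experts would create and maintain these.
RULES = [
    Rule("Human Resources", "benefits", 2.0),
    Rule("Human Resources", "vacation", 1.0),
    Rule("Finance", "invoice", 2.0),
    Rule("Finance", "budget", 1.5),
]


def score_document(text, rules=RULES):
    """Return a dict mapping each matched category to its total rule weight."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {}
    for rule in rules:
        if rule.keyword in words:  # the "if" part of the rule
            # the "then" part: raise the document's fit for this category
            scores[rule.category] = scores.get(rule.category, 0.0) + rule.weight
    return scores


def categorize(text, rules=RULES, threshold=2.0):
    """Return the categories whose score meets the (assumed) threshold."""
    scores = score_document(text, rules)
    return sorted(cat for cat, score in scores.items() if score >= threshold)


if __name__ == "__main__":
    print(categorize("Please submit the invoice against this year's budget."))
```

Because each rule is a separate record, an expert can add, remove, or reweight rules without touching the scoring logic, which is the main maintenance advantage of this style.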

Even vendors of automated taxonomy tools (Table 10.3) concede that human judgment is essential to a finished taxonomy. Their tools can save time and money, however, and find patterns in data that would not be obvious to the analyst.

Table 10.3. Taxonomy and Categorization Tools

| Product | Vendor | Notes | URL |
|---|---|---|---|
| BrainEKP (Enterprise Knowledge Platform) | The Brain Technologies | Despite the name, this is software; innovative visualization of taxonomy | www.thebrain.com |
| IDOL Server | Autonomy Corp |  | www.autonomy.com |
| Inxight Categorizer | Inxight | Uses linguistic and statistical analysis; includes visualization | www.inxight.com |
| LexisNexis Content Organizer | Verity Inc. | Prebuilt taxonomies from those used by LexisNexis; may be combined with custom taxonomy | www.verity.com |
| SemioTagger | Entrieva | Uses linguistic and statistical clustering techniques | www.entrieva.com |
| SemioTaxonomy | Entrieva | Collection of 27 prebuilt taxonomies | www.entrieva.com |
| Stratify Classification Server | Stratify Inc. | Linguistic and statistical analysis, statistical clustering techniques | www.stratify.com |

Human Taxonomy Creation

The second approach is to unleash a specialist in taxonomy development to master the domain of your portal and formulate a taxonomy. This approach might be expensive and time-consuming, but the taxonomy would benefit from the expertise and experience of the analyst. Many portal projects have taken this approach. One risk is that taxonomy development might expand to soak up too many project resources, making it harder for the project to stay on schedule.

Assisted Taxonomy Creation and Document Categorization

This third approach is a hybrid of the first two. It involves human analysts working in conjunction with automated taxonomy and search tools. There are many sources of data that can help with taxonomy development. Search query logs, analysis of library reference requests, focus group results, findings from in-person interviews of individual knowledge workers, and survey results are all indicators of what content each segment of employees needs, and on what schedule. These sources also tell you about the knowledge workers' information-seeking behavior, which in turn lets you know which access methods (such as searching and browsing) and access points (such as metadata elements) you need to use in schemas and in descriptive and navigational taxonomies.
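As one small example of putting search query logs to work, the following Python sketch counts the most frequent terms in a list of queries so that an analyst can review them as candidate taxonomy terms. It assumes the queries have already been extracted from the log as plain strings; the function name, the stop-word list, and the sample queries are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop words; a real list would be tuned to your content.
STOP_WORDS = {"a", "an", "the", "of", "for", "to", "in", "and", "how"}


def candidate_terms(queries, top_n=5):
    """Return the top_n most frequent non-stop-word terms as (term, count) pairs."""
    counts = Counter()
    for query in queries:
        for word in re.findall(r"[a-z0-9]+", query.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample_queries = [
        "vacation policy",
        "Vacation request form",
        "expense report",
        "expense policy",
    ]
    for term, count in candidate_terms(sample_queries, top_n=3):
        print(term, count)
```

The output is only a starting point: the counts suggest what people look for, and the analyst still decides which terms belong in the taxonomy and where.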



Building Portals, Intranets, and Corporate Web Sites Using Microsoft Servers
ISBN: 0321159632
Year: 2004
Pages: 164
