CONCLUSIONS

Chapter VIII - Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps
Data Mining: Opportunities and Challenges
by John Wang (ed) 
Idea Group Publishing 2003

In this chapter, we presented a method to automatically generate category hierarchies and identify category themes. The documents were first transformed into a set of feature vectors, which were used as input to train the self-organizing map. Two maps, the word cluster map and the document cluster map, were obtained by labeling the neurons in the map with words and documents, respectively. An automatic category generation process was then applied to the document cluster map to find dominating neurons, which are the centroids of super-clusters. The category terms of the super-clusters were also determined. The same processes were applied recursively to each super-cluster to reveal the structure of the categories. Our method used neither human-provided terms nor a predefined category structure. Text categorization can also be achieved easily with our method.
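
The first stages of the pipeline summarized above (feature vectors, map training, and neuron labeling) can be illustrated with a minimal sketch. It is not the chapter's implementation: the binary bag-of-words encoding, the Gaussian neighborhood, the linear decay schedules, the labeling rules, and all function names (`build_vectors`, `train_som`, `label_maps`) are assumptions chosen for illustration, and the super-cluster detection and recursive steps are omitted.

```python
import math
import random


def build_vectors(docs):
    """Turn each document into a binary bag-of-words vector (assumed encoding)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w in d.lower().split():
            v[index[w]] = 1.0
        vectors.append(v)
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def winner(weights, v):
    """Index of the neuron whose weight vector is closest to v."""
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j], v))


def train_som(vectors, rows, cols, epochs=50, seed=0):
    """Train a rows x cols map; neurons are stored row-major in a flat list."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs              # linear decay (assumed schedule)
        rate = 0.5 * frac                        # learning rate
        radius = max(rows, cols) / 2.0 * frac + 0.5
        for v in vectors:
            br, bc = divmod(winner(weights, v), cols)
            for j, w in enumerate(weights):
                r, c = divmod(j, cols)
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += rate * h * (v[k] - w[k])
    return weights


def label_maps(weights, vocab, vectors):
    """Label neurons with documents and words (assumed labeling rules)."""
    doc_map = {}   # neuron index -> indices of documents it wins
    for i, v in enumerate(vectors):
        doc_map.setdefault(winner(weights, v), []).append(i)
    word_map = {}  # neuron index -> words whose weight component peaks there
    for k, word in enumerate(vocab):
        j = max(range(len(weights)), key=lambda n: weights[n][k])
        word_map.setdefault(j, []).append(word)
    return doc_map, word_map


# Toy corpus, invented purely for illustration.
docs = [
    "stock market price rises",
    "market price falls",
    "football team wins match",
    "team loses football match",
]
vocab, vectors = build_vectors(docs)
weights = train_som(vectors, rows=2, cols=2)
doc_map, word_map = label_maps(weights, vocab, vectors)
print("document cluster map:", doc_map)
print("word cluster map:", word_map)
```

In this sketch each document labels exactly one neuron and each word labels the neuron where its weight component is largest. A threshold-based rule, which would let one word label several neurons, is an equally plausible choice.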

ISBN: 1591400511
Year: 2003
Pages: 194
Authors: John Wang
