Introduction

Knowledge discovery is the elicitation of interesting knowledge from potentially very large data repositories. This new field of research draws on the related disciplines of statistics, artificial intelligence, databases and visualisation. Although computer based, the knowledge discovery process is human-centric because it relies upon the user's involvement across the stages of data selection, preparation, analysis, presentation and interpretation of the analysis results. With current techniques, the user is involved in every stage of the discovery process except the analysis stage, which remains a "black box." This analysis stage uses data mining algorithms to explore a data set and discover patterns or structures, and the patterns found are influenced by user-specified constraints and objective measures of interestingness.

To date, data mining research has focused mainly upon heuristic correctness and efficiency. Interactive data mining extends the process by investigating ways in which the user can become an integral part of mining. The need for user inclusion rests on the premise that interestingness is subjective rather than objective and therefore cannot be defined in heuristic terms. This suggests that extending data mining algorithms to incorporate subjective measures of interestingness (through user participation) will produce more useful results. The collaboration between computer and human will result in a symbiosis: the computer will provide processing power and storage facilities, and the user will contribute such capabilities as understanding and perception.

This chapter investigates interaction techniques that allow the user to actively guide the knowledge discovery process, in effect overcoming the computer's inability to incorporate knowledge about intangible subjective measures such as domain knowledge and data semantics. In addition to producing more interesting results, guidance of the mining process implies that the algorithm can be dynamically constrained during processing (to reduce the breadth or depth of analysis), hence reducing both mining time and result set size.
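As a rough illustration of dynamically constraining a mining algorithm, the following Python sketch runs a level-wise frequent itemset search and consults a user-supplied hook before extending each itemset. The function `mine_itemsets`, the `keep` hook and the toy transactions are assumptions made for illustration and are not taken from any tool discussed in this chapter. In an interactive system the hook would be driven by the user's actions rather than by a fixed function.

```python
from itertools import combinations


def mine_itemsets(transactions, min_support, keep=lambda itemset, support: True):
    """Level-wise frequent itemset search with a user-guidance hook.

    `keep` is a hypothetical stand-in for user interaction: returning False
    for an itemset discards it and stops the search from extending it.
    """
    transactions = [frozenset(t) for t in transactions]
    total = len(transactions)
    # Level 1 candidates: every individual item seen in the data.
    level = {frozenset([item]) for t in transactions for item in t}
    results = {}
    size = 1
    while level:
        survivors = []
        for candidate in level:
            support = sum(candidate <= t for t in transactions) / total
            # Objective constraint (support) and subjective constraint (keep).
            if support >= min_support and keep(candidate, support):
                results[candidate] = support
                survivors.append(candidate)
        # Next level: unions of two survivors that are exactly one item larger.
        size += 1
        level = {a | b for a, b in combinations(survivors, 2) if len(a | b) == size}
    return results


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]

# Unguided run: only the objective support threshold applies.
unguided = mine_itemsets(baskets, min_support=0.5)

# Guided run: the "user" declares itemsets containing butter uninteresting.
guided = mine_itemsets(
    baskets, min_support=0.5, keep=lambda itemset, support: "butter" not in itemset
)
```

On this toy data the unguided run reports six itemsets and the guided run three, because every branch containing the rejected item is abandoned early. This is the sense in which guidance can reduce both mining time and result set size.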

There are two classes of data mining tasks: directed and undirected. Directed mining, also known as supervised learning or predictive analysis, refers to a group of methods that build a model from a set of data and make predictions about new items based on this model. Undirected mining, also known as unsupervised learning or explorative analysis, employs techniques that discover patterns which are either unknown to the user or only theorised by the user. Interactive mining can only be applied to explorative tasks, such as clustering and association mining, because directed mining tasks such as classification and characterisation are guided through training sets of data.
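The contrast between the two classes can be sketched with a pair of small Python functions. This is a minimal illustration under stated assumptions: `predict_nearest` (a nearest-neighbour classifier) and `cluster_1d` (a basic k-means loop on one-dimensional values) are hypothetical names chosen for this example, and the data are invented. The directed function needs labelled training data, whereas the undirected one receives no labels at all.

```python
def predict_nearest(train, x):
    """Directed: label a new value using the closest labelled training example."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label


def cluster_1d(values, k=2, iterations=10):
    """Undirected: group unlabelled values with a basic k-means loop."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres across the sorted data.
    step = max(k - 1, 1)
    centres = [ordered[i * (len(ordered) - 1) // step] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups


# Directed: the training set carries the labels that guide the model.
training = [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]
label = predict_nearest(training, 8.5)

# Undirected: the same kind of values, but with no labels supplied.
groups = cluster_1d([1.0, 2.0, 9.0, 10.0, 1.5, 9.5], k=2)
```

The training labels leave little room for the user to steer the first function, while the grouping produced by the second depends on choices such as `k` that a user could reasonably adjust while exploring.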

This chapter explores the techniques available for visualisation of, and interaction with, knowledge discovery systems. "User Participation" builds towards a discussion of interactive data mining by outlining the need for human participation within the knowledge discovery process. This is followed by a discussion of presentation paradigms in "Presentation," which highlights the strengths and weaknesses of both textual and graphical methods. "Presentation of Mining Results" contains a comprehensive taxonomy of undirected mining presentations, including the presentation of hierarchical, temporal and spatial semantics. "Human Computer Interaction" discusses interaction and, more specifically, direct manipulation techniques and the creation of interaction mappings. That section also discusses interactive views and distortion, two common interaction-based methods used to alleviate some of the problems that arise when large, complex data sets are presented in coordinate space. Finally, "Interactive Data Mining" discusses the current state of interactive data mining and the few relevant tools that are available, and "Conclusion" provides a brief look at the future directions of interactive data mining research.



Managing Data Mining Technologies in Organizations: Techniques and Applications
ISBN: 1591400570
Year: 2003
Pages: 174
