Data mining is a tool designed to analyze massive data sets, to draw inferences from data, and to describe relationships between variables in order to predict outcomes, quantify effects, or suggest causal paths. Data mining and data warehousing are highly complementary. Salient issues such as privacy and data mining performance have emerged. Data mining is still in its infancy, but several important trends can already be identified. First, both the number of data sets and their volume will continue to grow exponentially. Second, data mining tools will continue to grow in power, analytical sophistication, and ease of use (Peacock, 1998). Third, data mining, like data warehousing, is and will continue to be driven by applications, most of which are aimed at understanding behavior. Data mining systems have quickly progressed from single-component tools to multicomponent tool kits with loose connections to database management systems. Next-generation systems will be tightly integrated with the database management system and capable of mining data in large databases.
Two forces appear to account for the popularity of data mining: the need for it and the means to implement it. The need arises from customers' expectations. The means come from technical advances in artificial intelligence, machine learning research, database technology, and visualization technology. There is concern, however, about how knowledge of methodology and techniques accumulates. Research on data mining is hampered by the confidential nature of the work: business organizations are unlikely to share their experiences from data mining exercises with others. It is therefore difficult to know exactly which methodology and which mix of data mining techniques organizations choose (Ballou & Tayi, 1999).
Further empirical research is needed on the relationship between data mining and data warehousing. How can data quality be enhanced in data warehouse environments? How can the performance of data mining be enhanced in large-scale data warehouses? What factors affect the performance of data mining? Further empirical research is likewise needed on the handling of personal information in data mining. How can we protect personal privacy and make use of data mining at the same time? What guidelines should govern privacy in the data mining process?