Chapter IV - Feature Selection in Data Mining
Data Mining: Opportunities and Challenges
by John Wang (ed) 
Idea Group Publishing 2003

  1. Continuous objective functions are discretized.

  2. This is one of the main tasks in the 2000 CoIL challenge (Kim & Street, 2000). For more information about the CoIL challenges and the data sets, please refer to http://www.dcs.napier.ac.uk/coil/challenge/.

  3. If other objective values are equal, we prefer to choose a solution with small variance.

  4. This is reasonable because as we select more prospects, the expected accuracy gain will go down. If the marginal revenue from an additional prospect is much greater than the marginal cost, however, we could sacrifice the expected accuracy gain. Information on mailing cost and customer value was not available in this study.

  5. The other four features selected by the ELSA/logit model are: contribution to bicycle and fire policies, and number of trailer and lorry policies.

  6. The cases of zero or one cluster are meaningless; therefore, we count the number of clusters as K = κ + 2, where κ is the number of ones and K_min = 2 ≤ K ≤ K_max.

  7. For K = 2, we use F_complexity = 0.76, which is the closest value to 0.69 represented in the front.

  8. In our experiments, standard error is computed as standard deviation / iter^0.5, where iter = 5.
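
The arithmetic in notes 6 and 8 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the `cluster_bits` list are hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the note does not say which form was used.

```python
import math
import statistics


def count_clusters(cluster_bits):
    """Note 6: K = kappa + 2, where kappa is the number of ones.

    `cluster_bits` is a hypothetical list of 0/1 values. Adding 2 rules
    out the meaningless cases of zero or one cluster, so K is at least 2.
    """
    kappa = sum(cluster_bits)
    return kappa + 2


def standard_error(values):
    """Note 8: standard deviation / iter^0.5, with iter = len(values).

    Assumption: the sample standard deviation is used here.
    """
    return statistics.stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Three ones among the bits give K = 3 + 2.
    print(count_clusters([1, 0, 1, 1]))
    # Five made-up accuracy scores, one per run (iter = 5).
    print(standard_error([0.71, 0.74, 0.69, 0.73, 0.72]))
```

Under this reading, a string of all zeros yields K = 2, which matches the lower bound K_min = 2 stated in note 6.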


ISBN: 1591400511
Year: 2003
Pages: 194
Authors: John Wang
