data mining: opportunities and challenges
Chapter IV - Feature Selection in Data Mining
Data Mining: Opportunities and Challenges
by John Wang (ed) 
Idea Group Publishing 2003
Brought to you by Team-Fly

Agents, Mutation, and Selection

ELSA springs from artificial life models of adaptive agents in ecological environments (Menczer & Belew, 1996). In ELSA, an agent may die, reproduce, or neither, based on an endogenous energy level that fluctuates via interactions with the environment. Figure 1 outlines the ELSA algorithm at a high level of abstraction.

click to expand
Figure 1: ELSA pseudo-code.

The representation of an agent consists of the D bits, and each of the D bits is an indicator as to whether the corresponding feature is selected or not (1 if a feature is selected, 0 otherwise). Each agent is first initialized with some random solution and an initial reservoir of energy, and competes for a scarce resource, energy, based on multidimensional fitness and the proximity of other agents in solution space. The mutation operator randomly selects one bit of the agent and flips it. Our commonality-based crossover operator (Chen, Guerra-Salcedo, & Smith, 1999) makes the offspring inherit all the common features of the parents.

In the selection part of the algorithm, each agent compares its current energy level with a constant reproduction threshold θ. If its energy is higher than θ, the agent reproduces: the agent and its mutated clone that was just evaluated become part of the new population, each with half of the parent's energy. If the energy level of an agent is positive but lower than θ, only the agent itself joins the new population. If an agent runs out of energy, it is killed. The population size is maintained dynamically over iterations and is determined by the carrying capacity of the environment, depending on the costs incurred by any action, and the replenishment of resources (Menczer, Street, & Degeratu, 2000).

Energy Allocation and Replenishment

In each iteration of the algorithm, an agent explores a candidate solution similar to itself. The agent collects ΔE from the environment and is taxed with Ecost for this action. The net energy intake of an agent is determined by its offspring's fitness and the state of the environment that corresponds to the set of possible values for each of the criteria being optimized.[1] We have an energy source for each criterion, divided into bins corresponding to its values. So, for criterion fitness FC and bin value v, the environment keeps track of the energy Eenvtc(v) corresponding to the value Fc = v. Further, the environment keeps a count of the number of agents Pc(v) having Fc = v. The energy corresponding to an action (alternative solution) a for criterion Fc is given by

Agents receive energy only inasmuch as the environment has sufficient resources; if these are depleted, no benefits are available until the environmental resources are replenished. Thus, an agent is rewarded with energy for its high fitness values, but also has an interest in finding unpopulated niches in objective space, where more energy is available. Ecost for any action is a constant (Ecost < θ). When the environment is replenished with energy, each criterion c is allocated an equal share of energy as follows:

where C is the number of criteria considered. This energy is apportioned in linear proportion to the values of each fitness criterion, so as to bias the population toward more promising areas in objective space.

Advantages and Disadvantages

One of the major advantages of ELSA is its minimal centralized control over agents. By relying on local selection, ELSA minimizes the communication among agents, which makes the algorithm efficient in terms of computational time and scalability (Menczer, Degaratu, & Street, 2000). Further, the local selection naturally enforces the diversity of the population by evaluating agents based on both their quality measurements and the number of similar individuals in the neighborhood in objective space. Note also that ELSA can be easily combined with any predictive and clustering models.

In particular, there is no upper limit of number of objective functions that ELSA can accommodate. Noting that no single criterion is best for every application, we consider all (or at least some) of them simultaneously in order to provide a clear picture of the (possibly nonlinear) trade-offs among the various objectives. The decision-maker can select a final model after determining her relative weights of criteria for application.

ELSA can be useful for various tasks in which the maintenance of diversity within the population is more important than a speedy convergence to the optimum. Feature selection is one such promising application. Based on the well-covered range of feature vector complexities, ELSA is able to locate most of the Pareto front (Menczer, Degeratu, & Street, 2000). However, for problems requiring effective selection pressure, local selection may not be ideal because of its weak selection scheme.

[1]Continuous objective functions are discretized.

Brought to you by Team-Fly

Data Mining(c) Opportunities and Challenges
Data Mining: Opportunities and Challenges
ISBN: 1591400511
EAN: 2147483647
Year: 2003
Pages: 194
Authors: John Wang

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net