Application

A modular DT with well-defined interfaces and a satisfactory implementation can be applied to weapon selection. Most of the remaining work lies in building the scaffolding that supports it by providing the right inputs and outputs.

The exploitation of the DT is relatively straightforward, comparable to replacing the voting system with another component. Listing 27.2 shows how the senses are used to interpret the features, which are used by the classification process to predict the fitness.

Listing 27.2 A Function for Selecting the Weapon, Called at Regular Intervals When It's Necessary to Change

 function select_weapon()
      # use senses to determine health, enemy distance...
      env = interpret_features()
      # find best weapon among the ones available
      max = 0
      for each (weapon,ammo) in inventory
           # compute the fitness based on concatenated features
           fitness = dt.predict(env + weapon + ammo)
           # only remember if it's better
           if fitness > max then
                max = fitness
                best = weapon
           end if
      end for
      # weapon can be selected
      return best
 end function

Inducing the DT and computing the samples for learning are slightly more complex. The process involves three main phases:

Interpreting the features of the environment from the senses
Monitoring episodes of the fight when weapons are being used
Computing the desired fitness for each episode

The implementation is built upon the same techniques used to monitor the effectiveness of rockets. The following sections explain these stages in greater detail.

Interpreting Environment Features

The features of the environment are collected using the senses from the interfaces discussed in Chapter 24, "Formalizing Weapon Choice," and other specifications (for instance, vision, inventory, and physics). The result is a set of predictor variables, with the representation shown in Table 27.1.

Table 27.1. Variables Used by the DT to Estimate the Fitness of a Particular Weapon
Variable	Range
Distance	Near, medium, far
Health	Low, high
Ammo	Low, medium, high
Traveling	Forward, backward
Constriction	Low, medium, high
Fitness	[0...100]

These variables are the most important features to incorporate into the model, although we could easily add and remove some as necessary to find the ideal balance. These predictor variables are used by the weapon selection, but the response variable is needed for learning. The response is evaluated by monitoring the fight.
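As a minimal sketch of how raw sensor readings might be mapped onto the symbolic ranges of Table 27.1, the following Python fragment discretizes each value with a small helper. The parameter names, units, and every threshold are assumptions chosen for illustration; the text does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Features:
    distance: str      # "near", "medium", "far"
    health: str        # "low", "high"
    ammo: str          # "low", "medium", "high"
    traveling: str     # "forward", "backward"
    constriction: str  # "low", "medium", "high"


def bucket(value, thresholds, labels):
    """Return the label of the first threshold that value falls below."""
    for limit, label in zip(thresholds, labels):
        if value < limit:
            return label
    # Value is at or above every threshold: use the last label.
    return labels[-1]


def interpret_features(enemy_distance, health, ammo_fraction,
                       forward_speed, free_space):
    # All thresholds are hypothetical and would need tuning per game.
    return Features(
        distance=bucket(enemy_distance, (200.0, 600.0),
                        ("near", "medium", "far")),
        health=bucket(health, (50.0,), ("low", "high")),
        ammo=bucket(ammo_fraction, (0.25, 0.75),
                    ("low", "medium", "high")),
        traveling="forward" if forward_speed >= 0.0 else "backward",
        # Less free space around the animat means higher constriction.
        constriction=bucket(free_space, (100.0, 300.0),
                            ("high", "medium", "low")),
    )
```

Keeping the thresholds in one place makes it easy to add or remove predictor variables while searching for a good balance.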

Monitoring Fight Episodes

The AI gathers four different types of information from the game, all relevant to the applicability of weapons. Like the animat learning target selection, an event-driven mechanism is used to identify hits (pain signals) and potentially misses (explosion only):

Self-damage is identified by any pain broadcast by the body, usually shortly after a projectile is launched.
Hit probability measures the number of enemy hits, compared to the number of bullets fired (ammo query).
Maximal damage keeps track of the most pain the enemy has suffered while a particular weapon was used.
Potential damage per second computes the average damage over the total time the weapon was used.

Identifying the cause of damage can be the most difficult task, but it can be solved by comparing the location of the pain event with the aiming direction. Alternatively, this information could be provided by the data structures used to store the messages.
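The four measurements can be accumulated by a small event-driven record, sketched below in Python. The class name `EpisodeStats` and its callback names are hypothetical, and treating maximal damage as the largest single pain event is one possible reading of the description above.

```python
from dataclasses import dataclass


@dataclass
class EpisodeStats:
    shots_fired: int = 0
    enemy_hits: int = 0
    enemy_damage: float = 0.0
    max_damage: float = 0.0
    self_damage: float = 0.0
    time_used: float = 0.0  # seconds the weapon has been in use

    def on_fire(self, bullets=1):
        # Ammo query: count the bullets spent by this shot.
        self.shots_fired += bullets

    def on_enemy_pain(self, damage):
        # Pain signal from the enemy counts as a hit.
        self.enemy_hits += 1
        self.enemy_damage += damage
        self.max_damage = max(self.max_damage, damage)

    def on_self_pain(self, damage):
        # Pain broadcast by the animat's own body.
        self.self_damage += damage

    def on_tick(self, dt):
        self.time_used += dt

    @property
    def hit_probability(self):
        # Enemy hits compared to bullets fired; 0 if nothing was fired.
        return self.enemy_hits / self.shots_fired if self.shots_fired else 0.0

    @property
    def damage_per_second(self):
        # Average enemy damage over the total time the weapon was used.
        return self.enemy_damage / self.time_used if self.time_used else 0.0
```

One such record per weapon and episode keeps the statistics of different weapons from mixing.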

Computing the Fitness

The principle at the base of the voting system is that the fitness of a weapon depends on the situation. This also means the criteria used to evaluate the weapons change depending on the conditions.

It's somewhat difficult to go into this subject without considering high-level tactics (covered in Part VII), so we'll make a few assumptions. Looking at weapon selection alone, we want to take into account the following oversimplified criteria:

Low personal health requires precautions to minimize self-damage.
Low enemy health should incite the player to increase the hit probability.
When the enemy is facing away, the animat should attempt to maximize the potential damage. Otherwise, a good policy is to try to maximize potential damage per second.

Because the overall fitness will represent these different criteria in different situations, we need to make sure that they're roughly on the same scale. To do this, we'll rescale the values so that they fall into the range [0...100] as closely as possible, as summarized in Listing 27.3.

Listing 27.3 This Function Learns the Desired Fitness of Weapons Based on the Features of the Environment

 function learn_weapon(weapon,episode)
      # gather the information from the senses
      env = interpret_features()
      # compute the fitness in terms of the monitored information
      if episode.self_health < 25 then
           fitness = -episode.self_damage
      else if episode.enemy_health < 40 then
           fitness = episode.accuracy
      else if episode.enemy_position.y > 0 then
           # enemy is facing away
           fitness = episode.max_potential
      else
           fitness = episode.enemy_damage_per_second
      end if
      # incrementally induce the fitness from concatenated features
      dt.increment(env + weapon, fitness)
 end function
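Listing 27.3 leaves the rescaling step implicit. The Python sketch below shows one way each criterion could be mapped onto [0...100] before it is handed to the DT. The function names and the three upper bounds are assumptions for illustration only, and self-damage is inverted rather than negated so that it stays inside the range.

```python
def rescale(value, low, high):
    """Map value from [low, high] onto [0, 100], clamping out-of-range input."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clamped = min(max(value, low), high)
    return 100.0 * (clamped - low) / (high - low)


# Hypothetical upper bounds for the raw measurements; tune per game.
MAX_SELF_DAMAGE = 100.0
MAX_SINGLE_HIT = 100.0
MAX_DPS = 50.0


def desired_fitness(self_health, enemy_health, enemy_facing_away,
                    self_damage, accuracy, max_damage, damage_per_second):
    if self_health < 25:
        # Less self-damage is better, so invert the scale.
        return 100.0 - rescale(self_damage, 0.0, MAX_SELF_DAMAGE)
    if enemy_health < 40:
        # Accuracy is a hit probability in [0, 1].
        return rescale(accuracy, 0.0, 1.0)
    if enemy_facing_away:
        return rescale(max_damage, 0.0, MAX_SINGLE_HIT)
    return rescale(damage_per_second, 0.0, MAX_DPS)
```

The thresholds 25 and 40 are taken from the listing; everything else would need to be calibrated against the damage values the game actually produces.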

Practical Demo

An animat demonstrates the ideas described here. It can be found under the name of Selector on the web site at http://AiGameDev.com/. There's a step-by-step guide to get the animat up and running. Selector uses a DT to evaluate the benefit of each of the weapons based on the current situation. This is interpreted, with a few restrictions, to choose the best weapon.

Biologically Plausible Errors

By analyzing the data with statistics, it's possible to see how the problem is shaping up. Using a histogram of the potential damage per second, the main feature (distance of the enemy) offers visible trends. For example, the super shotgun is extremely efficient up close, but tails off as the enemy gets farther away. This is understandably caused by the spread of the fire.

On the other hand, some trends are quite surprising. The railgun performs well at a distance, as expected. But up close, the performance is higher than expected. Together with this, traveling backward imposes no additional difficulties on the aiming, so weapons are just as efficient regardless of the direction of travel. Quite literally, the animats are like the mobile turret of a tank, and just as efficient.

The constant aiming errors were sufficient in the previous part to produce realistic aiming, but we need a more plausible error model for higher-level behaviors, such as weapon selection, to be more humanlike. The weapon selection is already very realistic, but believability could be taken a step further by increasing the variability of the accuracy.

To achieve this, we'll improve the aiming error model to take into account movement and relative direction of travel. The more the animat moves, the less accurately it will turn; also, moving forward is more accurate than running backward.
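A minimal sketch of such an error model follows, in Python. It assumes zero-mean Gaussian noise on the yaw angle whose spread grows linearly with speed and is multiplied by a penalty when running backward; the function names, the noise distribution, and all constants are illustrative assumptions rather than the model used by the animat.

```python
import random


def aim_error_stddev(speed, moving_backward, base_error=1.0,
                     speed_gain=0.01, backward_penalty=1.5):
    """Standard deviation (in degrees) of the aiming error for the current motion."""
    # The faster the animat moves, the larger the error.
    error = base_error * (1.0 + speed_gain * abs(speed))
    # Running backward is less accurate than moving forward.
    if moving_backward:
        error *= backward_penalty
    return error


def perturb_aim(yaw, speed, moving_backward, rng=random):
    """Return the desired yaw with zero-mean Gaussian noise added."""
    return yaw + rng.gauss(0.0, aim_error_stddev(speed, moving_backward))
```

Passing a seeded `random.Random` instance as `rng` makes the perturbation reproducible when comparing weapon statistics between runs.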