Application

Now it's possible to apply this perceptron to target selection. Our primary objective is to analyze the effectiveness of each rocket based on the situation. Then, theoretically, it should be possible to learn to predict the effectiveness of any shot before it happens.

The AI code combines many of the techniques discussed previously. For navigation, a solution that provides good coverage of the terrain is essential. Tracking the enemy's movement and estimating its velocity are also necessary. Then, the AI can launch a rocket at a target near that estimate and see what happens.

Algorithm

The algorithm that selects the target is an iterative process. This is not done all at once; the decision can easily be split over multiple frames of the AI processing.

Essentially, a generation mechanism creates new random points around the estimated position of the enemy. The animat checks for the first obstacle along the line of fire toward that point (floor, wall), and uses that as the target. Then, the perceptron is called to evaluate the potential for damage at that point. A target is returned only when the predicted damage is good enough. Listing 20.1 shows this. The randvec() function returns a random offset, and the perceptron is called by the estimate_damage() function to assess the value of a target.

Listing 20.1 The Iteration That Selects the Best Nearby Target by Predicting the Damage and Comparing the Probability to a Threshold Set by the Designer
function select_target
     repeat
          # pick a candidate target point near the enemy
          target = position + randvec()
          # use the perceptron to predict the hit probability
          value = estimate_damage( target )
     # stop if the probability is high enough
     until value > threshold
     return target
end function
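
The pseudocode loops until a good target turns up, which could stall a frame if the perceptron keeps rejecting candidates. Below is a minimal runnable sketch in Python that bounds the work per frame, matching the idea of splitting the decision over multiple frames of AI processing. The spread radius, per-frame budget, and threshold are illustrative assumptions, not values from the book, and the obstacle trace along the line of fire is omitted for brevity.

import random

MAX_ATTEMPTS_PER_FRAME = 8   # assumed per-frame budget so the AI never stalls
SPREAD = 96.0                # assumed radius (game units) for candidate offsets
THRESHOLD = 0.5              # designer-set damage threshold, as in the listing

def randvec(spread=SPREAD):
    # random 3D offset around the estimated enemy position (hypothetical helper)
    return tuple(random.uniform(-spread, spread) for _ in range(3))

def select_target(position, estimate_damage):
    # try a bounded number of candidates; None means "resume next frame"
    for _ in range(MAX_ATTEMPTS_PER_FRAME):
        target = tuple(p + o for p, o in zip(position, randvec()))
        # the perceptron evaluates the potential for damage at that point
        if estimate_damage(target) > THRESHOLD:
            return target
    return None

A caller would simply invoke select_target() on each AI frame until it returns a point, and only then fire the rocket.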

This reveals how the neural network is put to use. However, some programming is required to teach it.

Tracking Rockets

Keeping track of the situation proves the most problematic. Most of the programming involves gathering and interpreting the information about the shot. The AI must be capable of answering the following questions:

  • Did a projectile explode nearby?

  • Was it someone else's projectile?

  • Was there any damage attributed to the projectile?

  • What were the conditions when the shot was fired?

To prevent any misinterpretations, each animat tries to have only one rocket in the air at any one time. It's then easier to attribute damage to that rocket. Embodiment actually helps this process, because all the rocket explosions that are invisible to an animat are unlikely to belong to it.

There are four different stages for gathering this information. The major part of this problem is solved using events and callbacks (steps 2 and 3):

  1. When a rocket is fired, start tracking the situation and remember the initial conditions.

  2. Upon perceiving a sound of pain (or visual cue, such as blood), determine whether it was caused by the rocket by checking the origin of the sound. Take note of it.

  3. If a rocket explosion is detected, set the target to the point of collision, and set the expected time of collision to the current time.

  4. When the expected time of collision is past (given a small margin), use the information gathered to train the neural network.

All four stages follow the life cycle of a rocket. After it has exploded, the AI has a unique data sample to train the perceptron.
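
As a concrete illustration, here is a small Python sketch of that life cycle, assuming an event-driven interface. The class, method names, blast radius, and timing margin are all assumptions for the sketch; FEAR's actual callbacks differ.

import time

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class RocketTracker:
    MARGIN = 0.5          # assumed grace period (seconds) after expected impact
    BLAST_RADIUS = 120.0  # assumed range for attributing pain to our rocket

    def __init__(self):
        self.active = False

    def on_rocket_fired(self, conditions):
        # stage 1: remember the initial conditions of the shot
        self.active = True
        self.conditions = conditions
        self.injured = False
        self.impact_point = None
        self.impact_time = None

    def on_pain(self, origin):
        # stage 2: credit the pain cue to our rocket only if its origin
        # lies near the recorded impact point
        if self.active and self.impact_point is not None:
            if dist(origin, self.impact_point) < self.BLAST_RADIUS:
                self.injured = True

    def on_explosion(self, point):
        # stage 3: record where and when our rocket hit
        if self.active:
            self.impact_point = point
            self.impact_time = time.time()

    def update(self, train):
        # stage 4: once the impact time is safely past, train the perceptron
        if self.active and self.impact_time is not None:
            if time.time() > self.impact_time + self.MARGIN:
                train(self.conditions, self.injured)
                self.active = False   # now free to fire the next rocket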

Dealing with Noise

Despite all these precautions, there is a lot of noise in the data. This means that two similar situations may have different outcomes. This is due to many things:

  • The representation of each situation is simplified, so there are many unknown parameters that vary.

  • The features gathered from each situation are even more limited (by design).

  • The environment is in itself unpredictable, as many autonomously behaving animats cohabit in it.

There are large variations in the training samples, making it very difficult to learn a good prediction of the outcome. The task is even tougher online, because the noise really takes its toll when learning incrementally. Working with entire data sets instead really helps, because the noise averages out to suitable values.
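A minimal sketch of what such batch-style training might look like, assuming a single-layer perceptron trained with the delta rule. The learning rate, epoch count, and weight initialization are illustrative assumptions, not the book's settings; accumulating the error over the whole set before each weight update is what lets the noise average out.

import random

def train_batch(samples, inputs_dim, epochs=100, rate=0.1):
    # samples: list of (features, label) pairs, label is 0.0 or 1.0
    weights = [random.uniform(-0.5, 0.5) for _ in range(inputs_dim + 1)]  # +1 bias
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in samples:
            x = list(features) + [1.0]            # append the bias input
            output = sum(w * xi for w, xi in zip(weights, x))
            error = label - output
            for i, xi in enumerate(x):
                gradient[i] += error * xi         # accumulate over the whole set
        n = len(samples)
        weights = [w + rate * g / n for w, g in zip(weights, gradient)]
    return weights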

To gather the data, a number of animats (around five) store all their experience of different shots in logs. We can therefore acquire more information and process it manually to see whether the data indicates trends, thus confirming the design decisions. In Figure 20.2, for example, analyzing the hit ratio in terms of the distance between the player and the target reveals an interesting trend: Targets close to the estimate are likelier to cause more damage (this is accidental damage, because the animats do not voluntarily shoot at close targets), but the damage is maximal at around 15 steps away (between distance 180 and 324). This curve reveals the strength of the prediction skills that the perceptron for target selection will learn. (Manual design would no doubt ignore such factors.)

Figure 20.2. The likelihood of a rocket damaging the enemy based on the distance from the player.

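The kind of offline analysis behind Figure 20.2 is easy to reproduce from the logs. Here is a sketch, assuming each log entry is a (distance, injured) pair; both that format and the bin width are hypothetical, since the book does not specify them.

def hit_ratio_by_distance(log, bin_size=72.0):
    # log: iterable of (distance, injured) pairs pulled from the shot logs
    bins = {}
    for distance, injured in log:
        key = int(distance // bin_size)
        hits, total = bins.get(key, (0, 0))
        bins[key] = (hits + bool(injured), total + 1)
    for key in sorted(bins):
        hits, total = bins[key]
        low, high = key * bin_size, (key + 1) * bin_size
        print(f"{low:5.0f}-{high:5.0f}: {hits / total:.2f} over {total} shots")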

Inputs and Outputs

The inputs are chosen from among distances and dot products using combinations of the four points in space (that is, player origin, enemy position, estimated position, and chosen target):

  • Dot products are plugged straight into the perceptron, as they have a suitable range (always [-1, 1]). Their absolute value can be taken if the sign is not an important feature.

  • Distances are rescaled slightly to be more convenient. This is not necessary, but rescaling works best with the initial random weight values (chosen from a uniform random distribution). The input distance is divided by the maximum distance if it is known (for instance, between estimate and target), or by a suitable guess.

Three primary distances between the four 3D points of the problem are the most useful as inputs (see Figure 20.3). There are trends in the dot products, too, but these are not as obvious. We could use further preprocessing to extract the relevant information from the situation in a more generic fashion. For example, the dot product between the projectile and enemy velocities is 0 when the enemy is still; this situation would look the same to the perceptron as a side-on shot. Returning a value of 1, as for a full-on shot, would prove more consistent.

Figure 20.3. Graphical representation of the inputs to the perceptron. The three primary features are the distances between the origin, enemy, and target.


The unique output corresponds to the damage inflicted. Because the damage is very often zero, and we want to emphasize any damage at all, we use a Boolean output that indicates injury.
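
Putting the section together, here is a sketch of how one shot could be encoded as a training sample. The scaling constants and function names are assumptions: the three distances follow Figure 20.3 (the estimate-to-target distance could be added the same way), the velocity dot product applies the consistency fix described above, and the output is the Boolean injury flag.

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def length(v):
    return dot(v, v) ** 0.5

def normalize(v):
    n = length(v)
    return tuple(x / n for x in v) if n > 1e-6 else None

def make_sample(origin, enemy, target, projectile_vel, enemy_vel, damage,
                max_dist=1024.0):   # assumed maximum for rescaling distances
    # three primary distances (see Figure 20.3), rescaled into roughly [0, 1]
    d_origin_enemy = length(sub(enemy, origin)) / max_dist
    d_origin_target = length(sub(target, origin)) / max_dist
    d_enemy_target = length(sub(target, enemy)) / max_dist

    # dot product of the velocity directions is already in [-1, 1];
    # when the enemy stands still, report 1.0 (as for a full-on shot)
    # so the perceptron sees a consistent feature
    p_dir, e_dir = normalize(projectile_vel), normalize(enemy_vel)
    v_dot = dot(p_dir, e_dir) if (p_dir and e_dir) else 1.0

    inputs = (d_origin_enemy, d_origin_target, d_enemy_target, v_dot)
    injured = damage > 0    # Boolean output emphasizes any damage at all
    return inputs, injured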

Training Procedure

During the training, the bots are not given unlimited rockets; they have to pick up the rocket launcher along with ammunition. In the test level, the rockets are easily available, and even the simplest movement AI has no trouble picking them up accidentally. Animats without ammunition actually serve as good target practice for others with rocket launchers (without causing chaos).

To reduce the time the simulation takes to gather the same amount of data, the game can be accelerated. For example, this can be done by running the Quake 2 game in server mode and disabling the graphics. By default, these servers use only the processing power necessary to keep the simulation running in real time. FEAR has an option to enable acceleration so the simulation runs at full speed, taking up as much processor time as possible. This is a great feature and saves a lot of development time.

Practical Demo

The animat used to illustrate this chapter is called Splash and can be found on the web site at http://AiGameDev.com/. Both the source code and the binaries are available and ready to run. Splash predicts the enemy position, and then generates a random target around it, using a perceptron to evaluate its likelihood of success. If the chances of success are acceptable, a rocket is fired. Training data is gathered when the expected time of collision is past.



