Let's now look at the source code, which implements both the backpropagation algorithm for a configurable network topology as well as training and testing for the neurocontroller example.
The global variables are shown in Listing 5.1.
#define INPUT_NEURONS   4
#define HIDDEN_NEURONS  3
#define OUTPUT_NEURONS  4

/* Input to Hidden Weights (with Biases) */
double wih[INPUT_NEURONS+1][HIDDEN_NEURONS];

/* Hidden to Output Weights (with Biases) */
double who[HIDDEN_NEURONS+1][OUTPUT_NEURONS];

/* Activations */
double inputs[INPUT_NEURONS];
double hidden[HIDDEN_NEURONS];
double target[OUTPUT_NEURONS];
double actual[OUTPUT_NEURONS];

/* Unit Errors */
double erro[OUTPUT_NEURONS];
double errh[HIDDEN_NEURONS];
The weights are defined as input to hidden layer (wih) and hidden to output layer (who). The weight for the connection between u5 and u1 (from Figure 5.10) is an input-to-hidden-layer weight represented by wih[0][0] (since u1 is the first input cell and u5 is the first hidden cell, with zero-based indexing). This weight is referred to as w5,1. The weight w11,7 (the connection from u11 in the output layer to u7 in the hidden layer) is who[2][3]. Bias weights occupy the last row of each table, as identified by the "+1" in the sizes of the wih and who arrays.
The activations are provided by four arrays. The inputs array holds the values of the input cells, the hidden array holds the outputs of the hidden cells, the target array represents the desired network output for the given inputs, and the actual array holds what the network actually produced.
The errors of the network are provided in two arrays. The erro array represents the error for each output cell. The errh array is the hidden cell errors.
To initialize the network with random weights, and to support the algorithm generally, a number of macros are defined, shown in Listing 5.2.
#define LEARN_RATE    0.2   /* Rho */

#define RAND_WEIGHT   ( ((float)rand() / (float)RAND_MAX) - 0.5 )

#define getSRand()    ((float)rand() / (float)RAND_MAX)
#define getRand(x)    (int)((x) * getSRand())

#define sqr(x)        ((x) * (x))
Weights are randomly selected in the range [-0.5, 0.5]. The learning rate (rho) is defined as 0.2. Both the weight range and the learning rate can be adjusted depending upon the problem and the accuracy required.
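As a quick sanity check on the weight range, the RAND_WEIGHT macro can be restated as a small helper function (randWeight is an illustrative name, not part of the listings):

```c
#include <stdlib.h>

/* Illustrative restatement of the RAND_WEIGHT macro: rand()/RAND_MAX
   yields a value in [0, 1], so subtracting 0.5 shifts the result into
   the range [-0.5, 0.5]. */
double randWeight( void )
{
  return ((double)rand() / (double)RAND_MAX) - 0.5;
}
```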
Three support functions exist to assign random weights to the network and for algorithm support. These are shown in Listing 5.3.
void assignRandomWeights( void )
{
  int hid, inp, out;

  for (inp = 0 ; inp < INPUT_NEURONS+1 ; inp++) {
    for (hid = 0 ; hid < HIDDEN_NEURONS ; hid++) {
      wih[inp][hid] = RAND_WEIGHT;
    }
  }

  for (hid = 0 ; hid < HIDDEN_NEURONS+1 ; hid++) {
    for (out = 0 ; out < OUTPUT_NEURONS ; out++) {
      who[hid][out] = RAND_WEIGHT;
    }
  }
}

double sigmoid( double val )
{
  return (1.0 / (1.0 + exp(-val)));
}

double sigmoidDerivative( double val )
{
  return ( val * (1.0 - val) );
}
Function assignRandomWeights randomly assigns a weight to each of the connections within the network (including all biases). The sigmoid function implements the squashing function used in the feed-forward phase (Equation 5.5). The sigmoidDerivative function implements the derivative of the sigmoid function and is used during error backpropagation. Note that sigmoidDerivative is passed the sigmoid's output value rather than its input, since the derivative can be expressed directly in terms of that output.
The next function implements the feed-forward phase of the algorithm (see Listing 5.4).
void feedForward( )
{
  int inp, hid, out;
  double sum;

  /* Calculate input to hidden layer */
  for (hid = 0 ; hid < HIDDEN_NEURONS ; hid++) {

    sum = 0.0;
    for (inp = 0 ; inp < INPUT_NEURONS ; inp++) {
      sum += inputs[inp] * wih[inp][hid];
    }

    /* Add in Bias */
    sum += wih[INPUT_NEURONS][hid];

    hidden[hid] = sigmoid( sum );

  }

  /* Calculate the hidden to output layer */
  for (out = 0 ; out < OUTPUT_NEURONS ; out++) {

    sum = 0.0;
    for (hid = 0 ; hid < HIDDEN_NEURONS ; hid++) {
      sum += hidden[hid] * who[hid][out];
    }

    /* Add in Bias */
    sum += who[HIDDEN_NEURONS][out];

    actual[out] = sigmoid( sum );

  }
}
As illustrated in Listing 5.4, the feed-forward algorithm starts by calculating the activations of the hidden layer from the values of the input layer. The bias is added to each cell's sum before the sigmoid function is applied. The output layer is then calculated in the same manner. Note that the network may have one or more output cells, so the algorithm loops over the output cells to compute all of the output activations.
The actual backpropagation algorithm is shown in Listing 5.5.
void backPropagate( void )
{
  int inp, hid, out;

  /* Calculate the output layer error (step 3 for output cell) */
  for (out = 0 ; out < OUTPUT_NEURONS ; out++) {
    erro[out] = (target[out] - actual[out]) *
                  sigmoidDerivative( actual[out] );
  }

  /* Calculate the hidden layer error (step 3 for hidden cell) */
  for (hid = 0 ; hid < HIDDEN_NEURONS ; hid++) {

    errh[hid] = 0.0;
    for (out = 0 ; out < OUTPUT_NEURONS ; out++) {
      errh[hid] += erro[out] * who[hid][out];
    }

    errh[hid] *= sigmoidDerivative( hidden[hid] );

  }

  /* Update the weights for the output layer (step 4) */
  for (out = 0 ; out < OUTPUT_NEURONS ; out++) {

    for (hid = 0 ; hid < HIDDEN_NEURONS ; hid++) {
      who[hid][out] += (LEARN_RATE * erro[out] * hidden[hid]);
    }

    /* Update the Bias */
    who[HIDDEN_NEURONS][out] += (LEARN_RATE * erro[out]);

  }

  /* Update the weights for the hidden layer (step 4) */
  for (hid = 0 ; hid < HIDDEN_NEURONS ; hid++) {

    for (inp = 0 ; inp < INPUT_NEURONS ; inp++) {
      wih[inp][hid] += (LEARN_RATE * errh[hid] * inputs[inp]);
    }

    /* Update the Bias */
    wih[INPUT_NEURONS][hid] += (LEARN_RATE * errh[hid]);

  }
}
This function follows the algorithm outlined in the "Backpropagation Example" section. The error for the output cell (or cells) is calculated first using the actual output and the desired output. Next, the errors for the hidden cells are calculated. Finally, the weights for all connections are updated according to whether they're in the input-hidden layer or the hidden-output layer. An important note: the bias can be viewed as a weight on an input that is permanently fixed at 1. During the feed-forward phase we therefore simply add the bias weight into the sum, and during the update we apply the learning rate and error to the bias weight without multiplying by an input activation.
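The bias convention can be made concrete with a minimal sketch (weightedSumWithBias is an illustrative name, not from the listings): the bias entry behaves exactly like a weight on a phantom input clamped to 1.0.

```c
/* The bias entry acts like a weight on a phantom input fixed at 1.0,
   so adding it directly into the sum is the same as computing
   sum += 1.0 * bias. */
double weightedSumWithBias( const double *in, const double *w,
                            int n, double bias )
{
  double sum = bias;   /* equivalent to sum += 1.0 * bias */
  int i;

  for (i = 0 ; i < n ; i++) {
    sum += in[i] * w[i];
  }

  return sum;
}
```

This is also why the bias update in backPropagate multiplies the error only by the learning rate: the "input" feeding the bias is always 1.0.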
That's the algorithm for computing the output of a multi-layer, feed-forward neural network and adjusting the network using the backpropagation algorithm. Let's now look at the example code that drives it for the neurocontroller example.
In Listing 5.6, a structure is defined to represent the examples for our training set. The structure defines the inputs (health, knife, gun, enemy) and an array for the desired output (the out array). Listing 5.6 also contains the initialization of the training set.
typedef struct {
  double health;
  double knife;
  double gun;
  double enemy;
  double out[OUTPUT_NEURONS];
} ELEMENT;

#define MAX_SAMPLES 18

/*                 H    K    G    E      A    R    W    H   */
ELEMENT samples[MAX_SAMPLES] = {
  { 2.0, 0.0, 0.0, 0.0, {0.0, 0.0, 1.0, 0.0} },
  { 2.0, 0.0, 0.0, 1.0, {0.0, 0.0, 1.0, 0.0} },
  { 2.0, 0.0, 1.0, 1.0, {1.0, 0.0, 0.0, 0.0} },
  { 2.0, 0.0, 1.0, 2.0, {1.0, 0.0, 0.0, 0.0} },
  { 2.0, 1.0, 0.0, 2.0, {0.0, 0.0, 0.0, 1.0} },
  { 2.0, 1.0, 0.0, 1.0, {1.0, 0.0, 0.0, 0.0} },
  { 1.0, 0.0, 0.0, 0.0, {0.0, 0.0, 1.0, 0.0} },
  { 1.0, 0.0, 0.0, 1.0, {0.0, 0.0, 0.0, 1.0} },
  { 1.0, 0.0, 1.0, 1.0, {1.0, 0.0, 0.0, 0.0} },
  { 1.0, 0.0, 1.0, 2.0, {0.0, 0.0, 0.0, 1.0} },
  { 1.0, 1.0, 0.0, 2.0, {0.0, 0.0, 0.0, 1.0} },
  { 1.0, 1.0, 0.0, 1.0, {0.0, 0.0, 0.0, 1.0} },
  { 0.0, 0.0, 0.0, 0.0, {0.0, 0.0, 1.0, 0.0} },
  { 0.0, 0.0, 0.0, 1.0, {0.0, 0.0, 0.0, 1.0} },
  { 0.0, 0.0, 1.0, 1.0, {0.0, 0.0, 0.0, 1.0} },
  { 0.0, 0.0, 1.0, 2.0, {0.0, 1.0, 0.0, 0.0} },
  { 0.0, 1.0, 0.0, 2.0, {0.0, 1.0, 0.0, 0.0} },
  { 0.0, 1.0, 0.0, 1.0, {0.0, 0.0, 0.0, 1.0} }
};
Recall that the health input represents three values (2 for healthy, 1 for moderately healthy, and 0 for not healthy ). Knife and gun inputs are booleans (1 if item is held, 0 if not) and enemy is the number of enemies that are seen. Actions are booleans as well; a non-zero value represents the desired action to take.
Since we're implementing a winner-take-all network, we'll code a simple function to determine the output cell with the largest activation. This function searches the vector for the highest output and returns its index (see Listing 5.7). The returned index can then be used as an offset into the strings array to emit the text response behavior.
char *strings[4]={"Attack", "Run", "Wander", "Hide"};

int action( double *vector )
{
  int index, sel;
  double max;

  sel = 0;
  max = vector[sel];

  for (index = 1 ; index < OUTPUT_NEURONS ; index++) {
    if (vector[index] > max) {
      max = vector[index];
      sel = index;
    }
  }

  return( sel );
}
Finally, Listing 5.8 provides the main function for performing the training and testing of the neurocontroller.
int main()
{
  double err;
  int i, sample=0, iterations=0;
  int sum = 0;
  FILE *out;

  out = fopen("stats.txt", "w");

  /* Seed the random number generator */
  srand( time(NULL) );

  assignRandomWeights();

  /* Train the network */
  while (1) {

    if (++sample == MAX_SAMPLES) sample = 0;

    inputs[0] = samples[sample].health;
    inputs[1] = samples[sample].knife;
    inputs[2] = samples[sample].gun;
    inputs[3] = samples[sample].enemy;

    target[0] = samples[sample].out[0];
    target[1] = samples[sample].out[1];
    target[2] = samples[sample].out[2];
    target[3] = samples[sample].out[3];

    feedForward();

    err = 0.0;
    for (i = 0 ; i < OUTPUT_NEURONS ; i++) {
      err += sqr( (samples[sample].out[i] - actual[i]) );
    }
    err = 0.5 * err;

    fprintf(out, "%g\n", err);
    printf("mse = %g\n", err);

    if (iterations++ > 100000) break;

    backPropagate();

  }

  /* Test the network */
  for (i = 0 ; i < MAX_SAMPLES ; i++) {

    inputs[0] = samples[i].health;
    inputs[1] = samples[i].knife;
    inputs[2] = samples[i].gun;
    inputs[3] = samples[i].enemy;

    target[0] = samples[i].out[0];
    target[1] = samples[i].out[1];
    target[2] = samples[i].out[2];
    target[3] = samples[i].out[3];

    feedForward();

    if (action(actual) != action(target)) {
      printf("%2.1g:%2.1g:%2.1g:%2.1g %s (%s)\n",
             inputs[0], inputs[1], inputs[2], inputs[3],
             strings[action(actual)], strings[action(target)]);
    } else {
      sum++;
    }

  }

  printf("Network is %g%% correct\n",
         ((float)sum / (float)MAX_SAMPLES) * 100.0);

  /* Run some tests */

  /* Health / Knife / Gun / Enemy */
  inputs[0] = 2.0; inputs[1] = 1.0; inputs[2] = 1.0; inputs[3] = 1.0;
  feedForward();
  printf("2111 Action %s\n", strings[action(actual)]);

  inputs[0] = 1.0; inputs[1] = 1.0; inputs[2] = 1.0; inputs[3] = 2.0;
  feedForward();
  printf("1112 Action %s\n", strings[action(actual)]);

  inputs[0] = 0.0; inputs[1] = 0.0; inputs[2] = 0.0; inputs[3] = 0.0;
  feedForward();
  printf("0000 Action %s\n", strings[action(actual)]);

  inputs[0] = 0.0; inputs[1] = 1.0; inputs[2] = 1.0; inputs[3] = 1.0;
  feedForward();
  printf("0111 Action %s\n", strings[action(actual)]);

  inputs[0] = 2.0; inputs[1] = 0.0; inputs[2] = 1.0; inputs[3] = 3.0;
  feedForward();
  printf("2013 Action %s\n", strings[action(actual)]);

  inputs[0] = 2.0; inputs[1] = 1.0; inputs[2] = 0.0; inputs[3] = 3.0;
  feedForward();
  printf("2103 Action %s\n", strings[action(actual)]);

  inputs[0] = 0.0; inputs[1] = 1.0; inputs[2] = 0.0; inputs[3] = 3.0;
  feedForward();
  printf("0103 Action %s\n", strings[action(actual)]);

  fclose(out);

  return 0;
}
After seeding the random number generator with srand, the connection weights of the network are randomly generated. A while loop then trains the network. Each of the samples is taken in order, rather than selected randomly. The feed-forward algorithm is performed, followed by a check of the mean-squared error. Finally, the backpropagation algorithm runs to adjust the weights in the network. After a fixed number of iterations, the loop exits and the network is tested with the training set to verify its accuracy. After that, a set of examples that were not part of the original training set is checked.
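The per-sample error computed inside the training loop can be isolated into a helper for clarity (sampleError is an illustrative name, not from the listing): it is half the sum of squared differences between the target and actual outputs.

```c
/* Half the sum of squared differences between target and actual
   outputs -- the quantity the training loop prints as "mse". */
double sampleError( const double *target, const double *actual, int n )
{
  double err = 0.0;
  int i;

  for (i = 0 ; i < n ; i++) {
    double diff = target[i] - actual[i];
    err += diff * diff;
  }

  return 0.5 * err;
}
```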
If the neural network were to be used within a game environment, the weights would be emitted to a file for later inclusion within source. The network could also be optimized at this stage to minimize the computing resources needed for its use within game environments.
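As a sketch of what that export step might look like (emitWeights is hypothetical and not part of the book's listings), the trained tables could be written out as C array initializers ready for compilation into game source:

```c
#include <stdio.h>

/* Hypothetical sketch: write a rows-by-cols weight table as a C array
   initializer.  The table is passed as a flat pointer in row-major
   layout so the same function serves both wih and who. */
void emitWeights( FILE *fp, const char *name,
                  const double *w, int rows, int cols )
{
  int r, c;

  fprintf(fp, "double %s[%d][%d] = {\n", name, rows, cols);
  for (r = 0 ; r < rows ; r++) {
    fprintf(fp, "  {");
    for (c = 0 ; c < cols ; c++) {
      fprintf(fp, " %g%s", w[r * cols + c], (c < cols-1) ? "," : "");
    }
    fprintf(fp, " }%s\n", (r < rows-1) ? "," : "");
  }
  fprintf(fp, "};\n");
}
```

For this network it would be called once per table, for example emitWeights(fp, "wih", &wih[0][0], INPUT_NEURONS+1, HIDDEN_NEURONS).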