Feedback Dividends | Puzzles for Programmers and Pros

Have you ever admired your own skill at navigating a narrow road at high speed? If not, imagine the following alternative method of travel: Study a detailed map of the same road, figure out how much the wheel should turn and the accelerator should be pressed at every time point, and then drive down the road blindfolded. Even without obstacles, this is beyond the memory and trigonometric capacity of most of us.

In fact, we’re hardly conscious of the intellectual effort of driving. Perhaps the reason is that the act of driving consists of very short-term plans (few seconds at most) followed by adaptation based on eyesight. The driver has an overall goal - get to some destination - but the plan is incremental and adaptive. This requires less brainpower and is far more robust to changes in the environment.

Any person on the street understands this argument, but we require quantification. So to make this concrete, examine the following game. Consider a standard checkerboard having 8 rows and 8 columns as in Figure 1-9.

image from book
Figure 1-9: The goal is to go from the S to the E, always staying on black squares. Don’t fall off the board.

You want to go from row 1, column 4 (the black square above the S) to row 8, column 5 (the black square below the E). Each move goes from black square to black square and proceeds up a row and either to the left or right diagonally adjacent square. If you fall off the checkerboard or reach the top row without reaching the correct square, you lose.

At each move, you as player get to aim to go either right or left. You will achieve that step’s aim with probability Pgood, whose values we will discuss in a minute. There are two kinds of strategies: FeedYes and FeedNo.

A FeedYes strategy can decide where to aim on the i^th move after seeing the results of the first i-1 moves. A FeedNo strategy must decide where to aim at step i from the very beginning.

Here is an example to show you the difference. Suppose you want to go from row 1, column 4 to row 3, column 4. Suppose that Pgood is 0.9. Then in the FeedYes strategy you might aim right the first move. If you in fact go right (probability 0.9), then you would aim left the second move. But if you go left on the first move (probability 0.1), you will aim right the second move. The net result is that you have a probability of 0.9 to hit your destination. In the FeedNo strategy, you might say something like “aim right the first move and aim left the second.” There are two cases in which you would win with that strategy: You in fact move right in move 1 and left in move 2 (probability 0.9 × 0.9 = 0.81) or you move left in move 1 and right in move 2 (probability 0.1 × 0.1 = 0.01). So FeedNo has a probability of 0.82 of hitting the destination. Call the feedback dividend the probability of hitting the destination with the optimal FeedYes strategy divided by the probability of hitting it with the optimal FeedNo strategy. (Optimal means that you do as well as you can based on the probabilities.) In the example here, the feedback dividend is 0.9/0.81.

Warm-Up

Are there any values of Pgood for which the feedback dividend is 1 regardless of source and destination?

Solution to Warm-Up

If Pgood were 0.5 or 1, the feedback dividend would be only one. In the first case, it doesn’t matter where you aim. In the second, you don’t need feedback. For all other Pgood values, the dividend will exceed one.

Now here is the full problem. You start at row 1, column 4 and you want to hit row 8, column 5.

If Pgood is 0.9, what is the probability of hitting row 8, column 5 using the FeedYes strategy and using the FeedNo strategy?
For which value of Pgood does the feedback dividend reach its greatest value? What is the feedback dividend in that case?

You may be surprised by the result of this second question. It’s not intuitive at all.

If we cut off the three rightmost columns and the two leftmost columns, then which value of Pgood would give the highest feedback dividend? Assume that falling off the board gives a certain loss.