Overloaded Scheduling and Freezing Crystals | Puzzles for Programmers and Pros

When first faced with a problem, try pure greed (look for moves that lower the cost). If that doesn’t work or if it reaches a dead end, then you can try case analysis as we did for Sudoko. If there are too many cases, then try dynamic programming. If all these fail, it is time for heuristic search techniques along with some bound on the optimum.

A heuristic technique is one that is not guaranteed to work perfectly, but often (you hope) works well. Greed is, of course, a great heuristic technique but can be brittle, as we saw in the solution to the “Finding a Schedule that Works” puzzle.

The principle of exploratory heuristic techniques is that they take anti-greedy moves sometimes. They do this in order to explore the search space. If each problem configuration could be encoded as some scalar value (horizontal axis of Figure 2-23), then the squiggly line in Figure 2-23 represents the cost of each configuration.

image from book
Figure 2-23: The horizontal axis represents the search space. The vertical represents the cost. A greedy approach, given a location in the search space, will perform moves that decrease cost. So the arrows always start from a higher cost and go to a lower one.

A greedy approach will make moves that decrease cost only. An exploratory heuristic technique will sometimes make moves to more costly states in the hopes of leaving a local minimum (see Figure 2-24). If this reminds you of high school chemistry notions of activation energy, you have the right intuition.

image from book
Figure 2-24: In simulated annealing and other heuristic search techniques, some moves increase the cost. You can see that with the rightmost arrow which goes from a lower cost valley to a higher cost hill. The idea is to escape local cost minima.

But we don’t want to go any which way all the time. Let’s first look at animal analogies - flies vs. bees. If trying to leave the interior of a house, a fly, starting from a closed window, may bump against the window a few times, but will eventually fly away from that window and find an open one. Bees keep bumping up against the glass. Flies are using anti-greedy moves to escape the local optimum of the windowpane. Bees insist on trying to push through the pane.^[2] On the other hand, flies never build anything as complex as a beehive. To me, this suggests that when you are faced with a technical problem, you must alternate between fly behavior and bee behavior. At some point you must stop exploring and start doing, but if you hit a roadblock, then you must start to explore some more.

The main exploratory search techniques are genetic (also known as evolutionary) algorithms, tabu searching, integer programming, and simulated annealing. A book that explains these techniques both engagingly and academically is How to Solve It: Modern Heuristics by Zvigniew Michalewicz and David B. Fogel (Springer, 2004).

Everybody seems to have a favorite. Mine is simulated annealing, because it requires fewer parameters than genetic algorithms and has a wider range of applicability than integer programming. I should note, however, that the other techniques work really well sometimes. If a problem is heavily numeric, for example, linear programming approximations to the integer programming problem can sometimes work well and avoid numerical inaccuracies. Some people make great use of genetic algorithms. I consider it a character flaw that I can’t ever tune the parameters correctly.

Simulated annealing conforms to my intuition about the way the world works. I first heard about the technique in a lecture by its main inventor Kirkpatrick in 1983. He described how he had thought of the idea when he had to solve a traveling salesman-like problem - how to minimize the lengths of wires (I think power wires) around a circuit board. The problem was NP-complete but the company (IBM) needed a solution. Kirkpatrick was a solid-state physicist, so he suggested a technique inspired by the cooling of crystals.

If you take liquid silicon and freeze it quickly it will not form a crystal, which is the lowest energy state, but rather an amorphous (higher energy) blob. If you freeze it slowly (“anneal it”), then it will form a nice crystal. The slow cooling allows it to find an optimal state. At high temperatures atoms move rapidly, allowing them to escape local energy minima to fly off to far places. At lower temperatures, the atoms rearrange themselves only slightly.

Kirkpatrick reasoned that the same thing could be done with algorithms. Given a wire route at some stage of the algorithm, a move would replace, say, a random quadruple in the path, say A to B to C to D, by A to C to B to D in one of two cases:

The change resulted in a shorter path.
The change resulted in a longer path, but when you threw some pseudo-random dice, they came out in a certain improbable configuration.

Thus, step 2 is anti-greedy but occurs only with a certain probability. The key insight in Kirkpatrick’s proposal was that this probability of taking an anti-greedy move would depend on both the amount by which the path became longer, and on the amount of time the algorithm had run. The probability would start out quite high (potentially near 1) but then gradually become lower as the algorithm proceeded. Towards the end, the algorithm almost never does anti-greedy moves. In annealing terms, the temperature decreases over the course of the algorithm as does the probability of taking anti-greedy moves. Here is the pseudo-code.

 take an initial route r1  initialize temperature T to a high value  loop until T is low enough      or you have found an answer close to optimal    consider a randomly chosen possible change            yielding a route r2    if r2 is lower cost than r1      then replace r1 by r2      else replace r1 by r2 with probability         e^{((cost(r1)-cost(r2))/T)}    end if    decrease the temperature T  end loop

Note that the exponent is always negative for an anti-greedy move, so the probability is between 0 and 1. The larger the temperature T, the closer the exponent is to 0, so the higher the probability. Also, the larger the increase in cost the more negative, so the lower the probability. Thus, anti-greed tends to happen earlier and for small increases in cost.

Now, Kirkpatrick observed that the technique was in no way dependent on the specifics of wire layout, as I hope you can see from the pseudo-code. Further, the only parameters are the initial temperature, the speed at which the temperature decreases, and an encoding of the cost.

The elegance of this approach immediately appealed to me, but the feeling was not universal. One professor asked pointedly, “What have you proved?” “Nothing,” Kirkpatrick replied with that impatience that physicists manifest when faced with questions from pesky mathematicians. “Under certain conditions one can show that if you freeze infinitely slowly you will get a crystal, but what this means in terms of an algorithm is very uncertain.”

Their exchange illustrates a cultural divide between the heurist and the algorithmist. The heurist has an NP-hard problem to solve, so uses a heuristic that seems at least plausible. The algorithmist wants a time guarantee. Both have their place, but since NP-completeness has been around since the 1970s and no algorithm seems to be coming over the horizon, it’s possible that we’ll be stuck with heuristics for quite some time for many problems.

Heuristic search techniques offer no guarantee, however. The search space could have just one sweet spot protected by a high-cost slope as in Figure 2-25. This would make it hard even for simulated annealing to find the optimum.

image from book
Figure 2-25: A search space for which simulated annealing might have a difficult time finding a solution. The optimum is at a very specific part of the search space, so a move is unlikely to find it.

I invite you now to apply simulated annealing to the overloaded scheduling problem. Here is how it goes. You have a bunch of tasks, each of which has a deadline, an amount of work, and a positive value. You receive all the value of the task if you finish by the deadline and nothing if you finish after the deadline. You have more work to do than you could possibly finish.

Task T1 takes 3 days with deadline on day 19 and value 17
Task T2 takes 4 days with deadline on day 23 and value 14
Task T3 takes 6 days with deadline on day 51 and value 10
Task T4 takes 3 days with deadline on day 30 and value 7
Task T5 takes 7 days with deadline on day 38 and value 13
Task T6 takes 6 days with deadline on day 36 and value 11
Task T7 takes 7 days with deadline on day 45 and value 18
Task T8 takes 3 days with deadline on day 16 and value 10
Task T9 takes 5 days with deadline on day 22 and value 13
Task T10 takes 2 days with deadline on day 13 and value 16
Task T11 takes 8 days with deadline on day 12 and value 6
Task T12 takes 1 day with deadline on day 31 and value 15
Task T13 takes 5 days with deadline on day 17 and value 13
Task T14 takes 2 days with deadline on day 2 and value 13
Task T15 takes 7 days with deadline on day 30 and value 11
Task T16 takes 5 days with deadline on day 11 and value 18
Task T17 takes 4 days with deadline on day 4 and value 10
Task T18 takes 5 days with deadline on day 27 and value 15
Task T19 takes 4 days with deadline on day 6 and value 15

How much value can you obtain in total from these tasks?

Now, before attempting to answer this using simulated annealing (or any technique of your choice), first try to establish an upper bound on the value. A simple upper bound is the sum of all values. But such an upper bound might be excessively high. So, let’s consider another method. Define the value density of a task to be the value of the task divided by the number of days it takes. For example, T18 has a value density of 15/5 = 3. Now, take the highest deadline (51 in this case). Order the tasks by descending order of their value densities.

 T12: 15 T10: 8 T14: 6.5 T1: 5.666667 T19: 3.75 T16: 3.6 T2: 3.5 T8: 3.333333 T18: 3 T9: 2.6 T13: 2.6 T7: 2.571429 T17: 2.5 T4: 2.333333 T5: 1.857143 T6: 1.833333 T3: 1.666667 T15: 1.571429 T11: 0.75

To compute the upper bound, note their computation time and see how many of these tasks starting from the one with highest value density can complete by day 51. If there is any time left over, then use the next task’s value density and multiply it by the number of days left.

In this case, we can complete all tasks down to task T17 in 50 days using the value density ordering. The upper bound, therefore, is the total value for tasks T12, T10, T14, T1, T19, T16, T2, T8, T18, T9, T13, T7, T17, which is 187 plus the value density of T4 times the one day remaining (2.3). The total, therefore, is 189.3. Now this upper bound is generous in that we have loosely assumed that all these tasks have deadline 51. But it may still be useful.

Also, the value density heuristic may yield a good starting configuration for simulated annealing. Okay, enough hints. Try it for yourself.

Solution to Overloaded Scheduling and Freezing Crystals

How much value can you obtain in total from these tasks?

We’ll follow the heuristic of preferring tasks with the greatest value density (value/computation time), suggesting a schedule of T14 T19 T16 T10 T8 T1 T2 T15 T12 T5 T7 T3. These give values of 13, 15, 18, 16, 10, 17, 14, 11, 15, 13, 18, and 10 respectively for a total of 170. That is quite close to the upper bound of 183. See if you can anneal to something better.

^[2]One can speculate that the reason people who make the most radical discoveries in science are young is that they act like flies. They are less committed to the methods that have worked in the past. Older people, like bees, make greedy moves hoping that experience will lead to an optimum result. To keep the edge, older folk should look around more.