Real-Time Software

Video games are software applications. Specifically, they belong to a class called real-time software applications. This is an important classification because it will help us understand how they behave and why many coding decisions are made. For those unfamiliar with these concepts, I will stop and briefly explain their characteristics.

In a formal definition, real-time software means computer applications that have a time-critical nature or, more generally, applications in which data acquisition and response must be performed under time-constrained conditions. Consider, for example, a computer program that displays information about arrivals on a large screen in an airport terminal; several lines of text display information about flight numbers, status, time of landing, and so on. Clearly, the software responds to timely events: an airplane lands, a delay is announced, and so on. The arrival of these bits of information is highly unpredictable, and the application must process and respond to them accordingly. Moreover, this time-dependent information must then be displayed on a screen, providing a visual presentation of data that changes over time. This is what real-time software is all about.

Now, consider a slightly more involved example: a software application designed to aid in air traffic control (see Figure 2.1). The application senses the airspace with a radar, displays information about planes and their trajectories on a screen, and enables ground personnel to assist pilots in reaching their destination in a timely and safe manner by sending messages to them. Looking at the internals of the system, you will see that it consists of:

  • A data acquisition module, in this case coupled with a physical radar

  • A display/computation module, which helps ground personnel visualize data

  • An interaction module to send signals to planes so they know what to do

Figure 2.1. Air traffic controller.


Here we have moved one step beyond our previous example. We are looking at a real-time, interactive application: one that responds to events arriving at any point in time, displays information related to those events, and allows the operator to interact with them. Games are not very different from this architecture. Imagine that we eliminate the radar, generate "virtual" air traffic using a software simulator, and tell the user he must make the planes land safely. Add a scoreboard and a game over screen, and it begins to sound familiar.

All games are indeed interactive, real-time applications. The operator (henceforth called the player) can communicate with the game world, which itself simulates real-time activity using software components. An enemy chasing us, elevators going up and down, and enemies returning fire are all examples of the kind of "virtual" real time found in games. But there is more to time in games than you might think. Games are also time constrained; they must display information at a set pace (usually above 25 frames per second) for the interaction to feel seamless. This certainly limits the scope of both the real-time simulators and the presentation layer found in games. We cannot do more than what the hardware allows in the given time slice. However, games are a bit like magic. The trick is to make the impossible seem possible, crafting worlds that seem larger than what the hardware allows, with multimedia presentations well beyond the player's expectations.

In summary, games are time-dependent interactive applications, consisting of a virtual world simulator that feeds real-time data, a presentation module that displays it, and control mechanisms that allow the player to interact with that world.

Because the interaction rate is fast, there is a limit to what can be simulated. But game programming is about trying to defy that limit and creating something beyond the platform's capabilities both in terms of presentation and simulation. This is the key to game programming and is the subject of this book.

Part I, "Gameplay Programming," deals with the coding of the real-time simulator that implements the game world. Part II, "Engine Programming," covers the presentation layer.

Real-Time Loops

As mentioned earlier, all real-time interactive applications consist of three tasks running concurrently. First, the state of the world must be constantly recomputed (the radar senses the airspace in the air traffic example; the virtual world simulator is updated in a game). Second, the operator must be allowed to interact with it. Third, the resulting state must be presented to the player, using onscreen data, audio, and any other output device available. In a game, both the world simulation and the player input can be considered tasks belonging to the same global behavior, which is "updating" the world. In the end, the player is nothing but a special-case game world entity. For the sake of simplicity, I will follow this rule and will thus refer to games as applications consisting of two portions: an update routine and a render routine.

As soon as we try to lay down these two routines in actual game code, problems begin to appear. How can we ensure that both run simultaneously, giving the illusion of peeking into a real world through a window? In an ideal world, both the update and render routines would run on an infinitely powerful device with many parallel processors, so both routines would have unlimited access to the hardware's resources. But real-world technology imposes many limitations: Most computers consist of only one processor with limited memory and speed. Clearly, the processor can only be running one of the two tasks at any given time, so some clever planning is needed.

A first approach would be to implement both routines in a loop (as shown in Figure 2.2), so each update is followed by a render call, and so forth. This ensures that both routines are given equal importance. Logic and presentation are fully coupled with this approach. But what happens if the frames-per-second rate varies due to a subtle change in scene complexity? Imagine a 10 percent variation in scene complexity that causes the engine to slow down a bit. Obviously, the number of logic cycles per second would vary accordingly. Even worse, what happens in a PC game where faster machines can outperform older machines by a factor of five? Will the AI run five times slower on the less powerful machines? Clearly, using a coupled approach raises some interesting questions about how the game will be affected by performance variations.

Figure 2.2. Coupled approach.

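To make the coupled approach concrete, here is a minimal sketch of such a loop; update_world() and render_frame() are hypothetical placeholders for the game logic and presentation code:

 while (!end)
    {
    // logic and presentation are fully coupled: one update per render call,
    // so the pace of the game logic rises and falls with the frame rate
    update_world();
    render_frame();
    }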

To solve these problems, we must analyze the nature of each of the two code components. Generally speaking, the render part must be executed as often as the hardware platform allows; a newer, faster computer should provide smoother animation, better frame rates, and so on. But the pacing of the world should not be affected by this speed boost. Characters must still walk at the speed the game was designed for, or the gameplay will be destroyed. Imagine that you purchase a football game, and the action is either too fast or too slow due to the hardware speed. Clearly, keeping the render and update sections in lockstep makes coding complex, because one of them (update) has an inherent fixed frequency and the other does not.

One solution to this problem would be to keep update and render in sync but vary the granularity of the update routine according to the elapsed time between successive calls. We would compute the elapsed time (in real-time units) so that the update portion can use that information to scale the pacing of events, ensuring they take place at the right speed regardless of the hardware. Update and render would still be in a loop, but the granularity of the update portion would depend on the hardware speed: the faster the hardware, the finer the computation within each update call. Although this can be a valid solution in some specific cases, it is generally worthless. As speed and frame rates increase, it makes no sense to increase the rate at which the world is updated. Does the character AI really need to think 50 times per second? Decision making is a complex process, and executing it more often than strictly needed is throwing away precious clock cycles.
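As a minimal sketch of this time-scaled variant (update_world() and presentation() are hypothetical placeholders, and the loop uses the Win32 timeGetTime() call discussed later in this section), the update receives the elapsed time and scales every change by it:

 long timelastcall=timeGetTime();
 while (!end)
    {
    long now=timeGetTime();
    // elapsed time since the previous iteration, converted to seconds
    float elapsed=(now-timelastcall)/1000.0f;
    timelastcall=now;
    // the update scales all pacing by the elapsed time,
    // for example: position += speed*elapsed;
    update_world(elapsed);
    presentation();
    }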

A different solution to the synchronization problem would be to use a twin-threaded approach (depicted in Figure 2.3) so one thread executes the rendering portion while the other takes care of the world updating. By controlling the frequency at which each routine is called, we can ensure that the rendering portion gets as many calls as possible while keeping a constant, hardware-independent resolution in the world update. Executing the AI between 10 and 25 times per second is more than enough for most games.

Figure 2.3. Twin-threaded approach.


Imagine an action game running at 60 fps, with the AI running in a secondary thread at 15 fps. Clearly, only one of every four frames will carry out an AI update. Although this is good practice to ensure fixed-speed logic, it has some downsides that must be carefully addressed. For example, how do we ensure that the four frames that share the same AI cycle are effectively different, showing smoother animation and graphics?

A higher frame rate means nothing if all the frames within an AI cycle look exactly the same; animation will effectively run at 15 fps. To solve this problem, AIs are broken down into two sections. The real AI code is executed using a fixed time step, whereas simpler routines such as animation interpolators and trajectory updates are handled on a per-frame basis. This way, those extra frames per second really make a difference in the player's experience.
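To illustrate the twin-threaded approach, here is a minimal sketch using the Win32 threading API; GameLogicTick() and InterpolateAndRender() are hypothetical placeholders, and the fixed logic rate of 15 ticks per second matches the example above:

 #include <windows.h>

 volatile BOOL bGameDone=FALSE;

 void GameLogicTick(void)        { /* fixed-step AI and world update goes here */ }
 void InterpolateAndRender(void) { /* per-frame interpolation and drawing goes here */ }

 // Secondary thread: runs the world update at a fixed,
 // hardware-independent rate of roughly 15 ticks per second.
 DWORD WINAPI UpdateThread(LPVOID param)
    {
    while (!bGameDone)
       {
       GameLogicTick();
       Sleep(1000/15);
       }
    return 0;
    }

 // Main thread: renders as often as the hardware allows, interpolating
 // animation between the less frequent logic ticks.
 int main()
    {
    HANDLE hUpdate=CreateThread(NULL,0,UpdateThread,NULL,0,NULL);
    while (!bGameDone)
       InterpolateAndRender();
    WaitForSingleObject(hUpdate,INFINITE);
    CloseHandle(hUpdate);
    return 0;
    }

Note that Sleep() has a coarse, platform-dependent resolution, which is precisely the kind of timing imprecision discussed next.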

But the threaded approach has some more serious issues to deal with. The idea is very good, but it does not map well onto some hardware platforms. Some single-CPU machines are not very good at handling threads, especially when precise timing is required. Variations in frequency occur, and the player experience is degraded. The problem lies not so much in the function call overhead incurred when creating the threads, but in the operating system's timing functions, which are not very precise. Thus, we must find a workaround that allows us to simulate threads on single-CPU machines.

The most popular alternative for those platforms that do not support a solid concurrency mechanism is to implement threads using regular software loops and timers in a single-threaded program. The key idea is to execute update and render calls sequentially, skipping update calls to keep a fixed call rate. We decouple the render from the update routine. Render is called as often as possible, whereas update is synchronized with time.

To achieve this result, we must begin by storing a time stamp each time we perform an update call. Then, in subsequent loop iterations, we compute the elapsed time since the last call (using the time stamp) and compare it with the inverse of the desired frequency. By doing so, we are testing whether we need to make an update call to maintain the right call frequency. For example, if you want to run the AI 20 times per second, you must call the update routine every 50 milliseconds. Then, all you have to do is store the time at which you perform each call to update, and only execute it if 50 milliseconds have elapsed since then. This is a very popular mechanism because it often offers better control than threads and simpler programming: You don't have to worry about shared memory, synchronization, and so on. In practical terms, it's a poor man's thread approach, as shown in Figure 2.4.

Figure 2.4. Single-thread fully decoupled.


Here is the source code in C for such an approach:

 long timelastcall=timeGetTime();
 while (!end)
    {
    // run the game logic only if 1/frequency seconds have elapsed
    if ((timeGetTime()-timelastcall)>1000/frequency)
       {
       game_logic();
       timelastcall=timeGetTime();
       }
    // render as often as possible
    presentation();
    }

Notice how we are using the timeGetTime() call from the Win32 API as our timer. This call returns the time (in milliseconds) elapsed since Windows was last booted. Thus, by subtracting the results of two timeGetTime() calls, we can measure the time elapsed between them with one-millisecond accuracy.

Now, the above code partially addresses our concerns and is a good starting point. Still, it is some way from a professional game loop: We are assuming the logic tick takes zero time to complete, we are not handling Alt-Tab scenarios, and so on. For completeness, I will now supply a professional-grade game loop. The ideas are basically the same, taken one step further to offer better, finer control. Here is the source code:

 time0 = getTickCount();
 while (!bGameDone)
    {
    time1 = getTickCount();
    frameTime = 0;
    int numLoops = 0;
    while ((time1 - time0) > TICK_TIME && numLoops < MAX_LOOPS)
       {
       GameTickRun();
       time0 += TICK_TIME;
       frameTime += TICK_TIME;
       numLoops++;
       }
    IndependentTickRun(frameTime);
    // If playing solo and game logic takes way too long, discard
    // pending time.
    if (!bNetworkGame && (time1 - time0) > TICK_TIME)
       time0 = time1 - TICK_TIME;
    if (canRender)
       {
       // Account for numLoops overflow causing percent > 1.
       float percentWithinTick = Min(1.f, float(time1 - time0)/TICK_TIME);
       GameDrawWithInterpolation(percentWithinTick);
       }
    }

Now, let's go step by step. The loop has two components: The first (the while loop controlling access to GameTickRun) takes care of game logic, while the second (the if controlling access to GameDrawWithInterpolation) is the render portion.

In the game logic portion, we check whether the elapsed time since the last logic call has surpassed TICK_TIME (expressed in milliseconds). If you want your AI code to be computed 20 times per second, TICK_TIME is 50. We put this test inside a while loop because we might need to run several game logic ticks in a row, for example if a pause or disk swapping slowed the application down. Notice that we advance time0 by TICK_TIME for each executed tick, so the logic stays locked to the ideal tick schedule instead of drifting. Then, IndependentTickRun is called to handle player input, perform general housekeeping, and so on. These routines are rarely time-critical, so they don't need to use precise timing functions.

Finally, we reach the render stage. Notice that we begin by computing the fraction of the current tick that has already elapsed, which is stored in percentWithinTick. This factor is then passed to the render call, which uses it to perform fine interpolation so that not all the frames within a given tick look alike.
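As a rough sketch of what that interpolation might look like (the Entity structure, the entities array, and DrawEntityAt() are hypothetical; only percentWithinTick and the routine name come from the listing above), each entity can be blended between its last two simulated positions:

 typedef struct
    {
    float prevX,prevY;     // position at the previous logic tick
    float currX,currY;     // position at the most recent logic tick
    } Entity;

 extern Entity entities[];
 extern int numEntities;
 void DrawEntityAt(Entity *e,float x,float y);   // hypothetical draw call

 void GameDrawWithInterpolation(float percentWithinTick)
    {
    int i;
    for (i=0;i<numEntities;i++)
       {
       // linear interpolation between the two most recent logic states
       float x=entities[i].prevX+(entities[i].currX-entities[i].prevX)*percentWithinTick;
       float y=entities[i].prevY+(entities[i].currY-entities[i].prevY)*percentWithinTick;
       DrawEntityAt(&entities[i],x,y);
       }
    }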

Let's now move one layer below our initial macro-level analysis and explore the components of both the game logic and presentation sections. By doing so, I will present a global game framework that will help you understand which software pieces are needed to make a game, along with their internal relationships.


