Let's leave the people behind for a minute and look at the software. Software can fail in a variety of ways. It is useful to classify defects into categories that reveal how the defect was introduced and how it can be found or, even better, avoided in the future. The Orthogonal Defect Classification (ODC) system, developed by IBM, was created for this purpose. This system defines multiple categories of classification, depending on the development activity that is taking place. This chapter explores the eight defect type classifications and examines their relevance to game defects. The defect type classifies the way the defect was introduced into the code. As we go along, keep in mind that each defect can be the result of either incorrect implementation or code that is simply missing. The defect types listed next summarize the different categories of software elements that go into producing the game code:
Function
Assignment
Checking
Timing
Build/Package/Merge
Algorithm
Documentation
Interface
Note: If you have trouble remembering this list, try remembering the acronym "FACT BADI."
Defect examples in this section are taken from the Dark Age of Camelot (DAOC) game Version 1.70i Release Notes, posted on July 1, 2004. Dark Age of Camelot is a Massively Multiplayer Online Role-Playing Game (MMORPG) that is, by design, continually modified to expand and enhance the players' game experience. As a result, it is patched frequently with the dual purpose of fixing bugs and adding or modifying capabilities. This gives us the opportunity to examine it as it is being developed, as opposed to a game that has a single point of release to the public.
The defect description by itself doesn't tell us how the defect was introduced in the code, which is what the defect type classification describes. Since I don't have access to the development team's defect tracking system to know exactly how this bug occurred, let's take one specific bug and look at how it could have been caused by any of the defect types.
Here is a fix released in a patch for Dark Age of Camelot that will be referenced throughout the examples in this chapter:
"The Vanish realm ability now reports how many seconds of super-stealth you have when used."
If that's how it's supposed to work, then you can imagine the bug was logged with a description that went something like this:
"The Vanish realm ability fails to report how many seconds of super-stealth you have when it's used."
See the "Vanish" sidebar for additional details of this ability.
A Function error is one that affects a game capability or how the user experiences the game. The code providing this function is missing or incorrect in some or all instances where it is required.
Description:
Provides the stealther with super stealth, which cannot be broken. Also will purge DoTs and Bleeds and provides immunity to crowd control. This ability lasts for 1 to 5 seconds depending on level of Vanish. The stealther also receives an increase in movement speed as listed. A stealther cannot attack for 30 seconds after using this ability.
Effect:
L1 - Normal Speed, 1 sec immunity
L2 - Speed 1, 2 sec immunity
L3 - Speed 5, 5 second immunity
Type: Active
Re-use: 10 min.
Level 1: 5
Level 2: 10
Level 3: 15
Classes for ability Vanish:
Infiltrator, Nightshade, Shadowblade
from Allakhazam's Magical Realm at http://camelot.allakhazam.com/ability.html?cabil=73
Here's an imaginary code snippet that could be used to set up and initiate the Vanish ability. The player's Vanish ability level is passed to a handler routine specific to the Vanish ability. This routine is required to make all of the function calls necessary to activate this ability. The g_vanishSpeed and g_vanishTime arrays store values for each of the three levels of this ability, plus a value of 0 for level 0. These arrays are named with the "g_" prefix to indicate they are global, since the same results apply for all characters that have this ability. Names appearing in all uppercase letters are constants.

    void HandleVanish(int level)
    {
        if (level == 0)
            return;  // player does not have this ability so leave
        PurgeEffects(damageOverTime);
        IncreaseSpeed(g_vanishSpeed[level]);
        SetAttack(SUSPEND, THIRTY_SECONDS);
        StartTimer(g_vanishTime[level]);
        // oops! Did not report seconds remaining to user - hope they don't notice
        return;
    }

Missing a call to a routine that displays the time of the effect is an example of a Function type defect for this code. Maybe this block of code was copied from some other ability and the "vanish" globals were added but without the accompanying display code. Alternatively, there could have been a miscommunication about how this ability works and the programmer didn't know that the timer should be displayed.
Alternatively, the function to show the duration to the user could have been included, but called with one or more incorrect values:
ShowDuration(FALSE, g_vanishTime[level]);
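For contrast, here is a minimal sketch of what the handler might look like with the reporting call in place. The stub routines, the array contents (taken from the sidebar, with "Normal Speed" assumed to be 0), the ATTACK_DELAY_SECONDS constant, and the g_lastShownDuration recorder are assumptions added only so the example compiles; none of them come from the actual game code.

```c
#include <stdio.h>

#define TRUE  1
#define FALSE 0
#define SUSPEND 0                 /* hypothetical attack mode */
#define DAMAGE_OVER_TIME 1        /* hypothetical effect type */
#define MAX_VANISH_LEVEL 3
#define ATTACK_DELAY_SECONDS 30

/* Index 0 means "ability not owned"; levels 1..3 follow the sidebar. */
const int g_vanishSpeed[MAX_VANISH_LEVEL + 1] = {0, 0, 1, 5};
const int g_vanishTime[MAX_VANISH_LEVEL + 1]  = {0, 1, 2, 5};

/* Records what the player was last shown; -1 means nothing was shown. */
int g_lastShownDuration = -1;

/* Hypothetical stubs standing in for the real engine calls. */
void PurgeEffects(int effectType)      { (void)effectType; }
void IncreaseSpeed(int speed)          { (void)speed; }
void SetAttack(int mode, int seconds)  { (void)mode; (void)seconds; }
void StartTimer(int seconds)           { (void)seconds; }

/* Shows the duration only when asked to and only when it is positive. */
void ShowDuration(int bShow, int duration)
{
    if (bShow && duration > 0) {
        g_lastShownDuration = duration;
        printf("Super-stealth: %d second(s)\n", duration);
    }
}

void HandleVanish(int level)
{
    if (level <= 0 || level > MAX_VANISH_LEVEL)
        return;  /* player does not have this ability so leave */

    PurgeEffects(DAMAGE_OVER_TIME);
    IncreaseSpeed(g_vanishSpeed[level]);
    SetAttack(SUSPEND, ATTACK_DELAY_SECONDS);
    StartTimer(g_vanishTime[level]);
    ShowDuration(TRUE, g_vanishTime[level]);  /* the call the defective version lacks */
}
```

Calling HandleVanish(3) in this sketch reports five seconds, while HandleVanish(0) reports nothing. Changing the first argument of ShowDuration to FALSE reproduces the "incorrect value" variant of the Function defect.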
A defect is classified as an Assignment type when it is the result of incorrectly setting or initializing a value used by the program or when a required value assignment is missing. Many of the assignments take place at the start of a game, a new level, or a game mode. Here are some examples for various game genres:
Sports
Team schedule
Initialize score for each game
Initial team lineups
Court, field, rink, etc. where game is being played
Weather conditions and time of day
RPG, Adventure
Starting location on map
Starting attributes, skills, items, and abilities
Initialize data for current map
Initialize journal
Racing
Initialize track/circuit data
Initial amount of fuel or energy at start of race
Placement of powerups and obstacles
Weather conditions and time of day
Casino Games, Collectible Card Games, Board Games
Initial amount of points or money to start with
Initial deal of cards or placement of pieces
Initial ranking/seeding in tournaments
Position at the game table and turn order
Fighting
Initial health, energy
Initial position in ring or arena
Initial ranking/seeding in tournaments
Ring, arena, etc. where fight is taking place
Strategy
Initial allocation of units
Initial allocation of resources
Starting location and placement of units and resources
Goals for current scenario
First Person Shooters (FPS)
Initial health, energy
Starting equipment and ammunition
Starting location of players
Number and strength of CPU opponents
Puzzle Games
Starting configuration of puzzle
Time allocated and criteria to complete puzzle
Puzzle piece or goal point values
Speed at which puzzle proceeds
You can see from these lists that any changes could tilt the outcome in favor of the player or the CPU. Game programmers pay a lot of attention to balancing all of the elements of the game. Initial value assignments are important to providing that game balance.
Even the Vanish defect could have been the result of an Assignment problem. In the imaginary implementation that follows, the Vanish ability is activated by setting up a data structure and passing it to a generic ability handling routine.

    ABILITY_STRUCT realmAbility;
    realmAbility.ability = VANISH_ABILITY;
    realmAbility.purge = DAMAGE_OVER_TIME_PURGE;
    realmAbility.level = g_currentCharacterLevel[VANISH_ABILITY];
    realmAbility.speed = g_vanishSpeed[realmAbility.level];
    realmAbility.attackDelay = THIRTY_SECONDS;
    realmAbility.duration = g_vanishTime[realmAbility.level];
    realmAbility.displayDuration = FALSE;  // wrong flag value
    HandleAbility(realmAbility);

Alternatively, the assignment of the displayDuration flag could be missing altogether. Again, cut and paste could be how the fault was introduced, or the value could have been set wrong or left out through a programmer's mistake or a misunderstanding of the requirements.
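One way a team might reduce the chance of a missing assignment is to build the structure in a single routine that sets every field. The sketch below is only an illustration: the field list, the VANISH_ABILITY value, and the MakeVanishAbility helper are assumptions, not part of the game code.

```c
#include <assert.h>

#define TRUE  1
#define FALSE 0
#define VANISH_ABILITY 73          /* hypothetical ability id */
#define ATTACK_DELAY_SECONDS 30

typedef struct {
    int ability;
    int level;
    int speed;
    int attackDelay;
    int duration;
    int displayDuration;  /* TRUE: report the duration to the player */
} ABILITY_STRUCT;

/* Index 0 means "ability not owned"; levels 1..3 follow the sidebar. */
const int g_vanishSpeed[4] = {0, 0, 1, 5};
const int g_vanishTime[4]  = {0, 1, 2, 5};

/* Builds a fully initialized Vanish request for levels 1..3. */
ABILITY_STRUCT MakeVanishAbility(int level)
{
    /* "= {0}" zeroes every field, so a forgotten assignment yields a
       predictable 0/FALSE rather than leftover stack contents. */
    ABILITY_STRUCT a = {0};

    assert(level >= 1 && level <= 3);
    a.ability         = VANISH_ABILITY;
    a.level           = level;
    a.speed           = g_vanishSpeed[level];
    a.attackDelay     = ATTACK_DELAY_SECONDS;
    a.duration        = g_vanishTime[level];
    a.displayDuration = TRUE;  /* the assignment the defect got wrong or left out */
    return a;
}
```

Zero-initializing does not remove the defect if displayDuration is forgotten, since the flag would still be FALSE, but it makes the failure repeatable, which is easier for a tester to reproduce and report.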
A Checking defect type occurs when the code fails to properly validate data before it is used. The check for a condition may be missing, or it may be improperly defined. Some examples of improper checks in C code would be the following:
"=" instead of "==" used for comparison of two values
Incorrect assumptions about operator precedence when a series of comparisons are not parenthesized
"Off by one" comparisons, such as using "<=" instead of "<"
A value (*pointer) compared to NULL instead of an address (pointer), either directly from a stored variable or as a returned value from a function call
Ignored (not checked) values returned by C library function calls such as strcpy
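Two of the improper checks listed above can be sketched in a few lines of C. The function names and the four-element array size (levels 0 through 3) are assumptions chosen for illustration.

```c
#include <stddef.h>

#define NUM_VANISH_LEVELS 4  /* valid indexes are 0..3 */

/* Defective: "<=" admits index 4, one past the end of a 4-element array. */
int IsValidLevelBuggy(int level)
{
    return level >= 0 && level <= NUM_VANISH_LEVELS;
}

/* Correct: "<" rejects index 4. */
int IsValidLevel(int level)
{
    return level >= 0 && level < NUM_VANISH_LEVELS;
}

/* Checks the address before using the value it points to.
   Comparing *pDuration to NULL instead would test the value, and would
   dereference a possibly NULL pointer in the process. */
int HasPositiveDuration(const int *pDuration)
{
    if (pDuration == NULL)
        return 0;
    return *pDuration > 0;
}
```

With these definitions, IsValidLevelBuggy(4) accepts an out-of-range level that IsValidLevel(4) rejects, and HasPositiveDuration(NULL) safely returns 0.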
Back to our friend the Vanish bug. The following shows a Checking defect scenario where the ability handler doesn't check the flag for displaying the effect duration or checks the wrong flag to determine the effect duration.
    void HandleAbility(ABILITY_STRUCT ability)
    {
        PurgeEffect(ability.purge);
        if (ability.attackDelay > 0)
            StartAttackDelayTimer(ability.attackDelay);
        if (ability.immunityDuration == TRUE)  // should be checking ability.displayImmunityDuration!
            DisplayAbilityDuration(ability.immunityDuration);
    }
Timing defects have to do with the management of shared and real-time resources. Some processes may require time to start or finish, such as saving game information to a hard disk. Operations that depend on that data should be held off until the process they depend on has completed. A user-friendly way of handling this is to present a transition such as an animated cut scene or a "splash" screen with a progress bar that shows the player that the information is being saved. Once the save operation is complete, the game resumes.
Other timing-sensitive game operations include preloading audio and graphics so that they are immediately available when the game needs them. Many of these functions are now handled in the gaming hardware, but the software still may need to wait for some kind of notification, such as a flag that gets set, an event that gets sent to an event handler, or a routine that gets called once the data is ready for use.
Note: As an example of an audio event notification scheme, Microsoft DirectMusic provides an AddNotificationType routine, which programmers can set up to notify their game when the music has started, stopped, been removed from the queue, looped, or ended. SetNotificationHandle is used to assign an event handle (created by the CreateEvent function), which is used when the game calls WaitForSingleObject with the notification handle, and then calls GetNotificationPMsg to retrieve the notification event.
User inputs can also require special timing considerations. Double-clicks or repeated presses of a button may cause special actions in the game. There could be mechanisms in the game platform operating system to handle this or the game team may put its own into the code.
In MMORPG and multiplayer mobile games, information is flying around between players and the game server(s). This information has to be reconciled and handled in the proper order or the game behavior will be incorrect. Sometimes the game software tries to predict and fill in what is going on while it is waiting for updated game information. When your character is running around, this can result in jittery movement or even a "rubber band" effect, where you see your avatar run a certain distance and, all of a sudden, you see your character being attacked way back from where you thought you were.
Getting back to the familiar Vanish bug, let's look at a Timing defect scenario. In this case, pretend that one function starts up an animation for casting the Vanish ability, and a global variable g_animationDone is set when the animation has finished playing. Once g_animationDone is TRUE, the duration should be displayed. A Timing defect can occur if the ShowDuration function is called without waiting for an indication that the Vanish animation has completed. The animation will overwrite anything that gets put on the screen. Here's what the defective portion of code might look like:

    StartAnimation(VANISH_ABILITY);
    ShowDuration(TRUE, g_vanishImmunityTime[level]);
And this would be the correct code:
    StartAnimation(VANISH_ABILITY);
    while (g_animationDone == FALSE)
        ;  // wait for TRUE
    ShowDuration(TRUE, g_vanishImmunityTime[level]);

Build/Package/Merge defects, or simply Build defects, are the result of mistakes in using the game source code library system, managing changes to game files, or identifying and controlling which versions get built.
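The busy-wait loop stalls everything else while the animation plays, and a C compiler is allowed to read a plain global once and spin forever unless the flag is declared in a way that forces a fresh read (for example, volatile). A common alternative is to check the animation state once per frame and defer the display. The sketch below assumes a frame-based update loop; the VanishUi structure and both routines are hypothetical.

```c
typedef struct {
    int framesLeft;       /* frames of animation still to play */
    int pendingDuration;  /* duration waiting to be shown, 0 if none */
    int shownDuration;    /* what has been displayed, 0 if nothing yet */
} VanishUi;

/* Starts the animation and remembers the duration to report afterward. */
void StartVanish(VanishUi *ui, int animationFrames, int duration)
{
    ui->framesLeft      = animationFrames;
    ui->pendingDuration = duration;
    ui->shownDuration   = 0;
}

/* Called once per frame by the game loop. */
void UpdateVanish(VanishUi *ui)
{
    if (ui->framesLeft > 0) {
        ui->framesLeft--;  /* animation still owns the screen this frame */
        return;
    }
    if (ui->pendingDuration > 0) {
        ui->shownDuration   = ui->pendingDuration;  /* safe to display now */
        ui->pendingDuration = 0;
    }
}
```

With a two-frame animation and a five-second duration, the first two calls to UpdateVanish show nothing and the third shows 5, so the message can never be overwritten by the animation.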
Building is the act of compiling and linking source code and game assets such as graphics, text, and sound files in order to create an executable game. Configuration management software is often used to help manage and control the use of the game files. Each file may contain more than one asset or code module. Each unique instance of a file is identified by a unique version identifier.
The specification of which versions of each file to build is done in a configuration specification, or config spec for short. Trying to specify the individual version of each file to build can be time-consuming and error-prone, so many configuration management systems provide the ability to label each version. A group of specific file versions can be identified by a single label in the config spec.
Label | Usage |
---|---|
[DevBuild] | Identifies files that programmers are using to try out new ideas or bug fix attempts. |
[PcOnly] | Developing games for multiple platforms may require a different version of the same file that is built for only one of the supported platforms. |
[TestRelease] | Identifies a particular set of files to use for a release to the testers. Implies that the programmer is somewhat certain the changes will work. If testing is successful, the next step might be to change the label to an "official" release number. |
[Release1.1] | After successful building and testing, a release label can be used to "remember" which files were used. This is especially helpful if something breaks badly later on and the team needs to backtrack either to debug the new problem or revert to previous functionality. |
Each file has a special evolutionary path called the mainline. Any new versions of files that are derived from one already on the mainline are called branches. Files on branches can also have new branches that evolve separately from the first branch. The changes made on one or more branches can be combined with other changes made in parallel by a process called a merge. Merging can be done manually, automatically, or with some assistance from the configuration management system, such as highlighting which specific lines of code differ between the two versions being merged together. A version tree provides a graphical view of all versions of a file and their relationship to one another. See Figures 3.1 through 3.3 for examples of how a version tree evolves as a result of adding and updating files in various ways.
When a programmer wants to make a change to a file using a configuration management system, the file gets checked out. Then, once the programmer is satisfied with the changes and wants to return the new file as a new version of the original one, the file is checked in. If at some point in time the programmer changes her mind, the file check out can be cancelled and no changes are made to the original version of the file.
With that background, let's explore some of the ways a mistake can be made.
Specifying a wrong version or label in the configuration specification may still result in successfully generating a game executable, but it will not work as intended. It may be that only one file is wrong, and it has a feature used by only one type of character in one particular scenario. Mistakes like this keep game testers in business.
It's also possible that the configuration specification is correct, but one or more programmers did not properly apply the label to the version that needed to be built. The label can be left off, left behind on an earlier version, or typed in wrong so that it doesn't exactly match the label in the config spec.
Another problem can occur as a result of merging. If a common portion of code is changed in each version being merged, it will take skill to merge the files and preserve the functionality in both changes. The complexity of the merge increases when one version of a file has deleted the portion of code that was updated by the version it is being merged with. If a real live person is doing the merges, these problems may be easier to spot than if the build computer is making these decisions and changes entirely on its own.
Sometimes the code will give clues that something is wrong with the build. Comments in the code like // TAKE THIS OUT BEFORE SHIPPING! could be an indication that a programmer forgot to move a label or check a newer version of the file back into the system before the build process started.
Referring back to Figure 3.3, assume the following for the Vanish code:
Versions 1 and 2 do not display the Vanish duration.
Version 1.1 introduced the duration display code.
Merging versions 2 and 1.1 produces version 3, but deletes the part of the code in version 1.1 that displays the duration.
For the Vanish display bug, here are some possible Build defect type scenarios:
The merge that produced version 3 deleted the part of the code in version 1.1 that displays the duration. Version 3 gets built but we get no duration display.
Versions 1.1 and 2 were properly merged, so the code in version 3 will display the duration. However, the label used by the build specification has not been moved up from version 2 to version 3, so version 2 gets built and we get no duration display.
Versions 1.1 and 2 were properly merged, so the code in version 3 will display the duration. The build label was also moved up from version 2 to version 3. However, the build specification was hard-coded to build version 2 of this file instead of using the label, so we get no duration display.
Algorithm defects include efficiency or correctness problems that result from some calculation or decision process. Think of an algorithm as a process for arriving at a result (for example, the answer is 42) or an outcome (for example, the door opens). Each game is packed with algorithms that you may not even notice if they are working right. Improper algorithm design is often at the root of ways people find to gain an unexpected advantage in a game. Here are some places where you can find algorithms and Algorithm defects in games from various genres:
Sports
CPU opponent play, formation, and substitution choices
CPU trade decisions
Modeling the play calling and decision making of an actual coach or opponent
The individual AI behavior for all positions for both teams in the game
Determining camera angle changes as the action moves to various parts of the field/court/ice, etc.
Determining penalties and referee decisions
Determining player injuries
Player stat development during the course of the season
Enabling special powerups, awards, or modes (NBA Street Vol.2, NCAA Football 2005)
RPG, Adventure
Opposing and friendly character dialog responses
Opposing and friendly character combat decisions and actions
Damage calculations based on skills, armor, weapon type and strength, etc.
Saving throw calculations
Determining the result of using a skill, for example stealth, crafting, persuading, etc.
Experience point calculations and bonuses
Ability costs, duration, and effects
Resources and conditions needed to acquire and use abilities and items
Weapon and ability targeting, area of effect, and damage over time
Racing
CPU driver characteristics, decisions, and behaviors: when to pit stop, use powerups, etc.
Damage and wear calculations for cars, and damaged car behavior
Rendering car damage
Automatic shifting
Factoring effects of environment such as track surface, banking, weather
CPU driver taunts
Casino Games, Collectible Card Games, Board Games
Opposing player styles and degree of skill
Applying the rules of the game
House rules, such as when dealer must stay in Blackjack
Betting options and payouts/rewards
Fair distribution of results, for example no particular outcome (card, dice roll, roulette number, etc.) is favored
Fighting
CPU opponent strike (offensive) and block (defense) selection
CPU team selection and switching in and out during combat
Damage/point calculation, including environmental effects
Calculating and rendering combat effects on the environment
Calculating and factoring fatigue
Enabling special moves, chains, etc.
Strategy
CPU opponent movement and combat decisions
CPU unit creation and deployment decisions
Resource and unit building rules (pre-conditions, resources needed, etc.)
Damage and effect calculations
Enabling the use of new units, weapons, technologies, devices, etc.
First Person Shooters (FPS)
CPU opponent and teammate AI
Opposing and friendly character combat decisions and actions
Damage calculations based on skills, armor, weapon type and strength, etc.
Weapon targeting, area of effect, and damage over time
Environmental effects on speed, damage to player, deflection or concentration of weapons (for example, Unreal Tournament Flak Cannon rounds will deflect off of walls)
Puzzle Games
Points, bonus activation, and calculations
Determining criteria for completing a round or moving to the next level
Determining success of puzzle goals, such as forming a special word, or matching a certain number of blocks
Enabling special powerups, awards, or modes
To complicate matters further, some game titles incorporate more than one game "type" and its algorithms. For example, Star Wars: Knights of the Old Republic (KOTOR) is an RPG/Adventure game that also has points in the story line where you can play a card game against non-player characters in the game and engage in swoop bike racing (though not both at the same time!). Unreal Tournament 2004 is typically considered an FPS, but it also incorporates adventure and sports elements at various stages of the tournament.
Some other areas where Algorithm type defects can appear in the game code are graphics rendering engines and routines, mesh overlay code, z-buffer ordering, collision detection, and attempts to minimize the processing steps to render new screens.
For the Vanish bug, consider an Algorithm defect scenario where the duration value is calculated rather than taken from an array or a file. Also suppose that a duration of 0 or less will not get displayed on the screen. If the calculation (algorithm) fails by always producing a 0 or negative number result, or the calculation is missing altogether, then the duration will not get displayed.
The immunity duration granted by Vanish is one second at Level 1, two seconds at Level 2, and five seconds at Level 3. This relationship can be expressed by the equation
vanishDuration = (1 << level) - level;
So at Level 1, this becomes 2 - 1 = 1. For Level 2, 4 - 2 = 2, and Level 3, 8 - 3 = 5. These are the results we want, according to the specification.
Now what if by accident the modulus (%) operator was used instead of the left shift (<<) operator? This would give a result of 0 - 1 = -1 for Level 1, 1 - 2 = -1 for Level 2, and 1 - 3 = -2 for Level 3. The immunity duration would not get displayed, despite the good code that is in place to display this duration to the user. An Algorithm defect has struck!
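The arithmetic can be checked with a short sketch. It assumes the shift starts from 1, which is what the worked values 2 - 1, 4 - 2, and 8 - 3 imply; the two function names are hypothetical.

```c
/* Intended calculation: yields 1, 2, and 5 seconds for levels 1..3. */
int VanishDuration(int level)
{
    return (1 << level) - level;
}

/* Defective variant: '%' typed in place of '<<'.
   Only call with level >= 1; a level of 0 would divide by zero. */
int VanishDurationBuggy(int level)
{
    return (1 % level) - level;
}
```

For levels 1 through 3 the defective version returns -1, -1, and -2. Every result is negative, so under the rule that a duration of 0 or less is not displayed, nothing would ever reach the screen.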
Documentation defects occur in the fixed data assets that go into the game. This includes text, audio, and graphics file content, as listed here:
Text
Dialogs
User interface elements (labels, warnings, prompts, etc.)
Help text
Instructions
Quest journals
Audio
Sound effects
Background music
Dialog (human, alien, animal)
Ambient sounds (running water, birds chirping, etc.)
Celebration songs
Video
Cinematic introductions
Cut scenes
Environment objects
Level definitions
Body part and clothing choices
Items (weapons, vehicles, etc.)
This special type of defect is not the result of improper code. The errors themselves are in the bytes of data retrieved from files or defined as constants. This data is subsequently used by statements or function calls that print or draw text on the screen, play audio, or write data to files. Defects of this type are detectable by reading the text, listening to the audio, checking the files, and paying careful attention to the graphics.
String constants in the source code that get displayed or written to a file are also potential sources of Documentation type errors. When the game has options for multiple languages, putting string constants directly in the code can cause a defect. Even though it might be the proper string to display in one language, there will be no way to provide a translated version if the user selects an alternate language.
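One common way to keep display text out of the code is a string table indexed by language and message. The sketch below is an assumption about how such a table might look; the enum names, the LookupMessage helper, and both message strings (including the French wording) are illustrative only.

```c
#include <stddef.h>

typedef enum { LANG_ENGLISH, LANG_FRENCH, LANG_GERMAN, LANG_COUNT } Language;
typedef enum { MSG_VANISH_DURATION, MSG_COUNT } MessageId;

/* Entries left out of the initializer are NULL, which stands for
   "translation not delivered yet" (LANG_GERMAN in this sketch). */
static const char *const g_messages[LANG_COUNT][MSG_COUNT] = {
    [LANG_ENGLISH] = { [MSG_VANISH_DURATION] = "You have %d seconds of super-stealth." },
    [LANG_FRENCH]  = { [MSG_VANISH_DURATION] = "Vous avez %d secondes de super-furtivite." },
};

/* Returns the text for the requested language, falling back to English
   when no translation has been supplied. */
const char *LookupMessage(Language lang, MessageId id)
{
    const char *text = g_messages[lang][id];
    return text != NULL ? text : g_messages[LANG_ENGLISH][id];
}
```

With the text held in data, a wrong or missing translation is a Documentation defect that can be fixed by editing the table, and a tester can walk the table language by language without touching the code paths that display it.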
The examples in this section take a brief detour from the Vanish bug and examine some other bugs fixed in the Dark Age of Camelot 1.70i release, which appear at the end of the "New Things and Bug Fixes" list:
If something damages you with a DoT and then dies, you see "A now dead enemy hits you for X damage" instead of garbage.
This could be a Documentation type defect where a NULL string, or no string, was provided for this particular message, instead of the message text that is correctly displayed in the new release. However, there may be other causes in the code. Note that this problem has the condition "... and then dies" so maybe there is a Checking step that had to be added to retrieve the special text string. A point to remember here is that the description of the defect is usually not sufficient to determine the specific defect type, although it may help to narrow it down. Someone has to get into the bad code to determine how the defect occurred.
Grammatical fixes made to bug report submissions messages, autotrain messages, and grave error messages.
This one is almost certainly a Documentation type defect. No mention is made of any particular condition under which these are incorrect. The error is grammatical, so text was provided and displayed, but the text itself was faulty.
Sabotage ML delve no longer incorrectly refers to siege equipment.
This description refers to doing a /delve command in the game for the Sabotage Master Level ability. The quick conclusion is that this was a Documentation defect fixed by correcting the text. Another less likely possibility is that the delve text was retrieved for some other ability similar to Sabotage due to a faulty pointer array index, perhaps due to an Assignment or Function defect.
The last ODC defect type that needs to be discussed is the Interface type. An interface occurs at any point where information is being transferred or exchanged. Inside the game code, Interface defects occur when something is wrong in the way one module makes a call to another. If the parameters passed on somehow don't match what the calling routine intended, then undesired results occur. Interface defects can be introduced in a variety of ways. Fortunately, these too fall into logical categories:
Calling a function with the wrong value of one or more arguments
Calling a function with arguments passed in the wrong order
Calling a function with a missing argument
Calling a function with a negated parameter value
Calling a function with a bitwise inverted parameter value
Calling a function with an argument incremented from its intended value
Calling a function with an argument decremented from its intended value
Here is how each of these could be the cause of the Vanish problem. Let's use ShowDuration, which was introduced earlier in this chapter, and give it the following function prototype:
void ShowDuration(BOOLEAN_T bShow, int duration);
This routine does not return any value, and takes a project-defined Boolean type to determine whether or not to show the value, plus a duration value, which is to be displayed if it is greater than 0. So, here are the Interface type defect examples for each of the seven causes:
ShowDuration(TRUE, g_vanishSpeed[level]);
In this case, the wrong global array is used to get the duration (speed instead of duration). This could result in the display of the wrong value or no display at all if a 0 is passed.
ShowDuration(g_vanishDuration[level], TRUE);
Let's say the BOOLEAN_T data type is #defined as int, so inside ShowDuration the duration value (first parameter) will be compared to TRUE, and the TRUE value (second parameter) will be used as the number to display. If the duration value does not match the #define for TRUE, then no value will be displayed. Also, if TRUE is #defined as 0 or a negative number, then no value will be displayed because of our rule for ShowDuration that a duration less than or equal to zero does not get displayed.
ShowDuration(TRUE);
No duration value is provided. If it defaults to 0 as a result of a local variable being declared within the ShowDuration routine, then no value will be displayed.
ShowDuration(TRUE, g_vanishDuration[level] | 0x8000);
Here's a case where the code is unnecessarily fancy and gets into trouble. An assumption was made that the high-order bit in the duration value acts as a flag that must be set to cause the value to be displayed. This could be left over from an older implementation of this function or a mistake made by trying to reuse code from some other function. Instead of the intended result, this changes the sign bit of the duration value and negates it. Since the value used inside of ShowDuration will be less than zero, it will not be displayed.
ShowDuration(TRUE, g_vanishDuration[level] ^ TRUE);
More imaginary complexity here has led to an Exclusive OR operation performed on the duration value. Once again, this is a possible attempt to use some particular bit in the duration value as an indicator for whether or not to display the value. In the case where TRUE is 0xFFFF, this will invert all of the bits in the duration, causing it to be passed in as a negative number, thus altering its value and preventing it from being displayed.
ShowDuration(TRUE, g_vanishDuration[level+1]);
This can happen when an incorrect assumption is made that the level value needs to be incremented to start with array element 1 for the first duration. When level is 3, this could result in a 0 duration, since g_vanishDuration[4] is not defined. That would prevent the value from being displayed.
ShowDuration(TRUE, g_vanishDuration[level-1]);
Here the wrong assumption is made that the level value needs to be decremented to start with array element 0 for the first duration. When level is 1, this could return a 0 value and prevent the value from being displayed.
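A few of these cases can be tried out with a small model of the display rule. The sketch below is not the chapter's ShowDuration: DurationShown is a hypothetical stand-in that returns the value that would be displayed (0 when nothing is shown) so the outcome can be inspected, and the TRUE value of 1 and the array contents are assumptions.

```c
typedef int BOOLEAN_T;
#define TRUE  1
#define FALSE 0

/* Index 0 means "ability not owned"; levels 1..3 follow the sidebar. */
const int g_vanishDuration[4] = {0, 1, 2, 5};

/* Models the rule for ShowDuration: display only when bShow matches TRUE
   and the duration is greater than 0. Returns the displayed value,
   or 0 when nothing is displayed. */
int DurationShown(BOOLEAN_T bShow, int duration)
{
    if (bShow != TRUE || duration <= 0)
        return 0;
    return duration;
}
```

With these assumptions, DurationShown(TRUE, g_vanishDuration[3]) yields 5 as intended. Swapping the arguments at Level 3 yields 0, but swapping them at Level 1 yields 1, which happens to be the right answer, so a test run only at Level 1 would miss the defect. The decremented index, DurationShown(TRUE, g_vanishDuration[1 - 1]), also yields 0.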
Okay, some of these examples are way out there, but pay attention to the variety of ways every single parameter of every single function call can be a ticking time bomb. One wrong move can cause a subtle, undetected, or severe Interface defect.