Chapter 14: Play Testing and Ad Hoc Testing | Game Testing All in One (Game Development Series)

Although the vast majority of this book is designed to help you take a methodical, structured approach to testing a game, this chapter focuses on a more chaotic, unstructured, yet no less crucial approach to game testing.

Ad hoc testing, sometimes referred to as "general" testing, describes searching for defects in a less structured way. Play testing describes playing the game to test for such subjective qualities as balance, difficulty, and the often-elusive "fun factor." Because ad hoc testing is closer to the more traditional structured testing described in earlier chapters, this chapter examines it first.

Ad Hoc Testing

Ad hoc is a Latin phrase that can be translated as "to this particular purpose." It is, in its purest form, a single test improvised to answer a specific question.

No matter how thorough your test planning and test design, how complex your test suite, or how carefully other test leads or the project manager have reviewed your work, there is always something you (and they) might have missed.

Ad hoc testing allows you as an individual tester to explore investigative paths that may have occurred to you, even subconsciously or unconsciously, in the course of performing structured test suites on the game. During the course of testing a game you will have almost daily thoughts along the lines of "I wonder what happens if I do ...?"

Ad hoc testing gives you the opportunity to answer those questions. It is the mode of testing that best enables you to explore the game, wandering through it as you would a maze.

There are two main types of ad hoc testing. The first is free testing, which allows the professional game tester to "depart from the script" and improvise tests on the fly. The second is directed testing, which is intended to solve a specific problem or find a specific solution.

Testing From the Right Side of Your Brain

Because it is a more intuitive and less structured form of testing, I sometimes call free testing "right-brain testing." Nobel Prize-winning psychobiologist Roger W. Sperry asserted that the two halves of the human brain tend to process information in very different ways. The left half of the brain is much more logical, mathematical, and structured. The right half is more intuitive, creative, and attuned to emotions and feelings. It is also the side that deals best with complexity and chaos.

Note

For a good summary of Sperry's ideas on this topic, especially as they apply to creativity, see Chapter 3 of Edwards, Drawing on the Right Side of the Brain (Tarcher/Perigee, 1989).

As the videogame industry continues to grow, there is continued pressure to offer bigger, better, and more in every aspect of a game's design: more features, more user customization, more content, and more complexity. At its best, ad hoc testing allows you as a tester to explore what at times can appear to be an overwhelmingly complex game design.

Ad hoc testing also presents you with an opportunity to test the game as you would play it. What type of gamer are you? Do you like to complete every challenge in every level and unlock every unlockable? Do you like to rush or build up? Do you favor a running game or a passing game? Do you rabbit through levels or explore them leisurely? Ad hoc testing allows you to approach the game as a whole and test it according to whatever style of play you prefer (for an expanded description of player types, see Chapter 12, "Cleanroom Testing").

"Fresh Eyes"

Just as familiarity breeds contempt, it can also breed carelessness on the part of testers forced to exercise the same part of the game over and over. Battle fatigue often sets in over the course of a long project. It's very easy to become "snowblind," a condition in which you've been looking at the same assets for so long that you can no longer recognize anomalies as they appear. You need a break.

Ad hoc testing can allow you to explore modes and features of the game that are beyond your primary area of responsibility. Depending on the manner in which your project is managed, you as a tester may be assigned to one specific area, mode, feature, or section of the game. All the test suites you perform on each build may focus on that specific area. Ad hoc testing allows you to move beyond to other areas, and allows other testers to explore your area, without a test suite to guide them.

By using ad hoc testing, you can put fresh eyes on various parts of the game and find previously overlooked issues.

This method can include the following:

Assigning members of the multiplayer team to play through the single-player campaign.
Assigning campaign testers to skirmish (or multiplayer) mode.
Assigning the config/compatibility/install tester to the multiplayer team.
Assigning testers from another project entirely to spend a day (or part of a day) on your game.
Asking non-testers from other parts of the company to play the game (see the section "Play Testing" later in this chapter).
Performing ad hoc testing early will quickly help to reveal any lingering deficiencies in your test plans, combinatorial tables, and test flow diagrams (see the following sidebar).

Who Turned the Lights On?

A venerable PC games publisher operated a handful of test labs in its various studios around the country, and the local test managers often would send builds of their current projects to each other for ad hoc testing and "idiot checking."

When one test manager handed the latest build of another studio's PC Formula One racing game to two of his testers, he was surprised to see them back in his office minutes later. "Crashed it already," they reported proudly.

"How?" the manager cried. "You've barely had time to get past the main menu!"

"We turned the headlights on!"

As you might expect, the default time in the default track in the default mode was "day." When the two testers started their race in this mode, they turned the headlights on "just to see what happens." The game crashed instantly.

Needless to say, this counterintuitive pair of settings (time = day and headlights = on) was added to the combinatorial tables by the chastened but wiser test lead.
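The moral of the sidebar is that "pointless" pairs of settings belong in the table too. As a rough sketch only (the setting names and values below are invented for illustration, and a real racing game would have far more of them), generating every pairing mechanically makes it harder to skip the combinations nobody expects a player to choose:

```python
from itertools import product

# Hypothetical settings, invented for this example.
settings = {
    "time_of_day": ["day", "night"],
    "headlights": ["off", "on"],
}


def all_combinations(options):
    """Return every combination of the given settings as a list of dicts."""
    names = list(options)
    value_lists = [options[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


combos = all_combinations(settings)
for combo in combos:
    print(combo)
```

The list includes the day-with-headlights-on case alongside the obvious ones, which is exactly the row the test lead in the story had left out.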

Tip

The "fresh eyes" concept is applicable to structured testing as well. It's wise to have testers rotate the specific suites they're responsible for periodically, even every build.

Making Order Out of Chaos

Ad hoc testing is a natural complement to structured testing, but it is by no means a substitute for it. Whether you have been given a specific assignment by your test lead or you're playing through the single-player campaign "just to see what happens," your testing should be documented, verifiable, and worthwhile.

Set Goals and Stick to Them

Before you begin, you should have a goal. It need not (and should not) be as complex or as well thought out as the test cases and test suites discussed earlier. But you need to know where you're going so you don't wind up wasting your (and the project's) time. Briefly write out your test goal before you launch the game.

(Whether you actually achieve the goal of your free testing is less important. If, in the course of trying to reach your goal, you stumble upon a defect you hadn't intended to find, that's great. That is what free testing is all about.)

This goal can be very simple, but it must be explicit. Here are some examples:

How far can I play in story mode?
Can I play a full game by making only three-point shots?
Is there a limit to the number of turrets I can build in my base?
Can I deviate from the strategy suggested in the mission briefing and still win the battle?
Is there anywhere in the level I can get my character stuck in the geometry?

If you're leading a multiplayer test, let every other tester know the purpose of the game session before it starts. Successful multiplayer testing requires communication, coordination, and cooperation, even if it seems that the testers are merely running around the level trying to shoot each other. In most circumstances, one tester should direct all of the other players in order to reach an outcome successfully. This can often be as difficult as herding kittens. If one tester in a multiplayer test loses sight of the aim of the test, the amount of time wasted is multiplied by the number of testers in the game. Don't let your team fall into this trap.

Tip

In your testing career, avoid the use of the verb "to play" when you refer to game testing. This will help to counter the widely held notion that your department "just plays games for a living." It will also help to reinforce to your test team that your work is just that, work. I've taken to correcting people who refer to testing as playing with the following observation: "The first time you play through a game, you're playing. The fortieth time, you're working."

If You're Not Taping, You're Not Testing

You should constantly take notes as you're testing through the game. Game designer Will Wright (The Sims) has said that gameplay is made up of "interesting decisions." It is imperative that you keep track of these decisions meticulously and diligently, writing down which options you choose, paths you take, weapons you equip, plays you call, and so on. That way, when you encounter a defect, you will be better able to come up with a reproducible path (see the following sidebar, "How to Be a Repro Man (or Woman)").
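If you happen to keep your notes electronically rather than on paper, the habit might look something like the following sketch. Everything in it (the `SessionLog` class, its fields, and the build label) is hypothetical and not a tool described in this book; it only illustrates recording a goal up front and then each decision in order, so that the list can later be turned into numbered reproduction steps.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class SessionLog:
    """A hypothetical note-taking helper for one free-testing session."""
    goal: str
    build: str
    decisions: list = field(default_factory=list)

    def note(self, decision):
        """Record one decision along with the time it was made."""
        self.decisions.append((datetime.now(), decision))

    def repro_steps(self):
        """Return the recorded decisions as numbered steps, in order."""
        return [f"{i}. {text}" for i, (_, text) in enumerate(self.decisions, start=1)]


session = SessionLog(
    goal="Is there a limit to the number of turrets I can build in my base?",
    build="hypothetical-build-042",
)
session.note("Started a skirmish on the default map")
session.note("Built a turret next to the command center")
print("\n".join(session.repro_steps()))
```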

Documentation may be difficult when you're in the middle of a 12-trick chain in a Tony Hawk-style stunt game. That's where videotape becomes an almost indispensable test tool. Every tester should have a console, a TV, and a VCR with a tape to record every move they make in the game. (PC testers will need a video card with a TV output.)

Taping should not become a crutch, or an excuse for less-than-diligent work on the part of the tester. It should serve as a research tool and a last-resort means of reporting a defect. Use the following steps as a guide when you are taping:

Start the VCR and press the record button before you start the game. (It's too easy to forget, otherwise.)
When you come to a defect you can't reproduce, rewind the tape, study the tape, then show it to your test lead and colleagues to discuss what may have caused the bug and whether anyone else has seen the same behavior in similar circumstances.
If you absolutely, positively cannot reproduce the defect, rip a clip of the video to an .AVI file or other digital format. This will allow you to attach the video to a bug report, email it to the developer, or even keep it on your computer for reference.
Once you've filled up a videotape, replace it with a fresh one in your VCR. Keep the old tape in a safe spot for a couple of days, however, in case you need to refer back to it.

It's generally safe to record over the old tape once the next build enters test.

Free testing should have clear goals. The work done should be documented (via videotape) and documentable (through clear, concise, reproducible bug reports). It should also be worthwhile. The following are but a few of the common pitfalls you should avoid when free testing:

Competing with other testers in multiplayer games. It's not about your individual score or win/loss record, it's about delivering a good product.
Competing against the AI (or yourself) in single-player games.
Spending a lot of time testing features that may be cut. You may be made aware that a certain mode or feature is "on the bubble," that is, in danger of being eliminated from the game. Adjust your focus accordingly.
Testing the most popular features of the game. Communicate frequently with your test lead and colleagues so you can stay current with what areas, features, and modes have been covered (and re-covered) already. Focus on the "unexplored territory."
Spending a disproportionate amount of time testing features that are infrequently used. You're wasting your (and the project's) time spending day after day exploring every nook and cranny of the map editor in your RTS, for example. Only about 15% of all users typically ever enter a map editor, and fewer than 5% actually use it to create maps. You want those folks to have a good experience, but not if it places the other 85% of your players at risk.

Avoid Groupthink

Because ad hoc testing depends on the instincts, tastes, and prejudices of the individual tester, it's important as a test manager to create an environment where testers think differently from one another. Gamers are not a uniform, homogeneous group; your test lab shouldn't be, either. If you've staffed your lab with nothing but hardcore gamers, you won't find all the bugs, nor will you ship the best product.

Groupthink is a term coined by social psychologist Irving Janis to describe a situation in which flawed decisions or actions are taken because a group under pressure often sees a decay in its "mental efficiency, reality testing and moral judgment." One common aspect of groupthink is a tendency toward self-censorship, where individuals within a group fail to voice doubts or dissent out of a fear of being criticized, ostracized, or worse. This is a danger in game testing because the majority of people who aggressively seek game tester jobs are men in their early 20s, young enough that pressure to conform to the peer group is still very strong.

Note

For more information on groupthink, see Janis, Victims of Groupthink (Houghton Mifflin, 1972).

Tip

Turn your hardcore gamers into hardcore testers. Hardcore gaming is not the same as hardcore testing. So-called "hardcore" gamers are generally a masochistic lot: they willingly pay for games weeks and months before they're even released; they gladly suffer through launch week server overload problems; they love to download patches. Use the methods described in this book to get them to understand that bug-fixing patches can be the exception, rather than the rule. All it takes is careful test planning, design, and execution.

You may encounter attitudes in your test lab such as

"Everybody has broadband, so we don't need to test modem play."
"Nobody likes the L.A. Clippers, so I won't test using them as my team."
"Everybody played StarCraft, so we don't need to test the tutorial in our own RTS."
"Nobody likes CTF (capture the flag) mode, so we don't need to spend a lot of time on it."
"Nobody uses melee weapons, so I'll just use guns."

Your job as a tester, and test manager, is to be aware of your and your team's pets and pet peeves, and to create an atmosphere in which a variety of approaches are discussed and respected freely and frequently. Cultivate and encourage different types of play styles. Recruit sports gamers. Recruit casual and non-gamers. Foster diversity.

Testing as Detective Work

The second broad category of ad hoc testing is directed testing. You could best describe this method as "detective testing" because of its specific, investigative nature. The simplest form of directed testing answers a very specific question, such as

Does the new compile work?
Can you access all the characters?
Are the cut-scenes interruptible?
Is saving still broken?

The more complex type of directed testing becomes necessary when testers find a major defect that is difficult or seemingly impossible to reproduce. The tester has "broken the game," but can't figure out how she or he did it. As in a good homicide case, the tester has a body (the bug) and an eyewitness (the tester or a colleague). Unlike a homicide case, the focus is not on "whodunnit." The perpetrator is a defect in the code. The focus is "howdithappen."

How to Be a Repro Man (or Woman)

One of the most critical bits of information in any bug report is the rate of reproduction. In a defect tracking database, this field may be called (among other things) frequency, occurrence rate, "happens," or repro rate. All these various terms are used to describe the same thing. Reproduction rate can be defined as the rate at which, following the steps described in the bug report, anyone will be able to reproduce a defect.

This information is generally expressed as a percentage ranging from 100% to "once," but this can be misleading. Assume, for example, that you find a defect during the course of your free testing. After a little research, you narrow down the steps to a reproducible path. You follow those steps and get the bug to happen again. You could, reasonably, report the defect as occurring 100% of the time, because you tried the steps twice and it happened both times. However, it may be just as likely that the bug is only reproducible 50% of the time or less, and you just got lucky, as though you had flipped a penny and got it to land heads up twice in a row.

For this reason, many QA labs report the repro rate as the number of attempts paired with the number of observed occurrences (for example, "8 occurrences out of 10 attempts"). This information is far more useful and accurate, because it allows your test lead, the project manager, and anyone else on the team to evaluate how thoroughly the bug has been tested. It also serves to keep you honest about the amount of testing you've given the defect before you write your report. How likely are you to report that a crash bug happens "once" if you only tried to reproduce it once? If you want to maintain your credibility as a member of the test team, you won't make a habit of this.
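The penny comparison is easy to check with a little arithmetic. The sketch below is illustrative only: both function names are made up for this example, and it assumes each attempt is independent of the others, which real bugs do not always honor. It formats a repro rate in the "occurrences out of attempts" style and shows how often a bug with a given true rate would reproduce on every single try.

```python
def format_repro_rate(occurrences, attempts):
    """Report a repro rate as observed occurrences paired with attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be at least 1")
    if not 0 <= occurrences <= attempts:
        raise ValueError("occurrences must be between 0 and attempts")
    return f"{occurrences} occurrences out of {attempts} attempts"


def chance_of_streak(true_rate, attempts):
    """Probability that a bug reproduces on every one of `attempts` tries,
    assuming each try is independent and succeeds with `true_rate`."""
    return true_rate ** attempts


print(format_repro_rate(8, 10))   # 8 occurrences out of 10 attempts
print(chance_of_streak(0.5, 2))   # 0.25
```

Under that independence assumption, a bug that really occurs only half the time will still reproduce twice in a row one time in four, so two successes out of two attempts is weak evidence for a 100% rate.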

On the other hand, with certain defects, even a relatively novice tester can be certain that a bug occurs 100% of the time without iterative testing. Bugs relating to fixed assets, such as a typo in in-game text, can safely be assumed to occur 100% of the time.

The word "anyone" is critical to the definition above, because a defect report is not very helpful if the tester who found the bug is the only one able to re-create it. Because videogame testing is often skill-based, it is not uncommon to encounter a defect in a game (especially a sports, fighting, platform jumper, or stunt game) that can only be reproduced by one tester, but that tester can reproduce the bug 100% of the time. In an ideal situation, that tester will collaborate closely with other members of the team so that they can zero in on a path that will allow the others to re-create the bug.

If this is not possible due to time or other resource constraints, be prepared to send a videotape or .AVI clip of the defect to the development team or, in the worst cases, send the tester to the developer to do a live demonstration of the bug. This is very costly and time consuming because, in addition to any travel expenses, the project is also paying for the cost of having the tester away from the lab (and not testing) for a period of time.

In summary, the more reproducible a bug is, the more likely it is that it will be fixed. So always strive to be a true "repro man."

Directed testing commonly begins when one or more testers report a "random" crash in the game. This is a very frustrating experience, because it often delays running complete test suites and a significant amount of time may be spent restarting the application and re-running tests. Unstable code, especially in the later phases of the project, can be very stressful. Again, remember Rule 1: Don't panic.

Tip

"Random" crashes are seldom random. Use directed testing and the scientific method to eliminate uncertainty along your path to being able to reproduce the bug often enough so that you can get the development team to find and fix it.

The Scientific Method

It's no coincidence that the department where game testers work is often called the lab. Like most laboratories, it's a place where the scientific method is used both to investigate and to explore. If, like most of us, you've forgotten this lesson from middle school science class, here's a review of the steps in the scientific method:

Observe some phenomenon.
Develop a theory (a hypothesis) as to what caused the phenomenon.
Use the hypothesis to make a prediction; for example, if I do this, it will happen again.
Test that prediction by retracing the steps in your hypothesis.
Repeat steps 3 and 4 until you are reasonably certain your hypothesis is true.

These steps provide the structure for any investigative directed testing. Assume you've encountered a quirky defect in a PC game that seems very hard to reproduce. It may be a condition that breaks a script, gets your character stuck in the geometry of the level, causes the audio to drop out suddenly, or that favorite of game testers and players alike, a crash to your PC's desktop. Here's what you do:

First, review your notes. Quickly jot down any information about what you were doing when the defect occurred, while it's still fresh in your mind. Review the videotape. Determine as best you can the very last thing you were doing in the game before it crashed.

Second, process all this information and make your best educated guess as to what specific combination and order of inputs may have caused the crash. Before you can retrace your steps, you have to determine what they were. Write down the input path you think most likely caused the crash.

Third, read over the steps in your path until you are satisfied with them. You guess that if you repeat them, the defect will occur again.

Fourth, reboot your computer, restart the game, and retrace your steps. Did you get the crash to occur again?

Fifth, if you did, great! Write it up. If you didn't, change one (and only one) step in your path. Try the path again, and so on, until you successfully re-create the defect.
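The "change one (and only one) step" discipline in the fifth step can be sketched as a simple loop. This is an illustration under stated assumptions, not a tool from this book: the function name, the step names, and the `simulated_crash` stand-in are all invented. In the lab, the `reproduces` check is a tester rebooting, restarting the game, and performing the path by hand. The sketch also always varies your original guess one step at a time, rather than stacking changes on top of each other.

```python
def find_repro_path(base_path, alternatives, reproduces):
    """Try the guessed path, then variants that differ from it in one step.

    base_path:    list of steps you believe led to the defect
    alternatives: dict mapping a step's index to replacement steps to try
    reproduces:   callable that takes a path and returns True if the
                  defect occurred when that path was followed
    """
    if reproduces(base_path):
        return base_path
    for index, options in alternatives.items():
        for option in options:
            candidate = list(base_path)
            candidate[index] = option  # change exactly one step
            if reproduces(candidate):
                return candidate
    return None  # no single-step change reproduced the defect


# Stand-in for a real attempt; in this made-up scenario the crash
# needs both of these steps somewhere in the path.
def simulated_crash(path):
    return "open inventory" in path and "quick-save" in path


guess = ["load level 3", "open map", "quick-save"]
suspects = {1: ["open options menu", "open inventory"]}
print(find_repro_path(guess, suspects, simulated_crash))
```

Changing one step at a time keeps the result interpretable: when a variant finally reproduces the defect, you know which change made the difference.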

Unfortunately, games can be so complex that this process can take a very long time if you don't get help. Don't hesitate to discuss the problem with your test lead or fellow testers. The more information you can share, the more brainstorming you can do, the more "suspects" you can eliminate, and the sooner you'll nail the bug.