Play Testing


Play testing is entirely different from the types of testing discussed so far in this book. The previous chapters have concerned themselves with the primary question of game testing: Does the game work?

Play testing concerns itself with a different but arguably more important question: Does the game work well?

The difference between these two questions is obvious. The word "well" implies an awful lot in four little letters. The answer to the first question is binary; the answer is either yes or no. The answer to the second question is far from binary because of its subjective nature. It can lead to a lot of other questions:

  • Is the game too easy?

  • Is the game too hard?

  • Is the game easy to learn?

  • Are the controls intuitive?

  • Is the interface clear and easy to navigate?

And the most important question of all:

  • Is the game fun?

Unlike the other types of testing covered so far, play testing concerns itself with matters of judgment, not fact. As such, it is some of the most difficult testing you can do.

A Balancing Act

Balance is one of the most elusive concepts in game design, yet it is also one of the most important. Balance refers to the game achieving a point of equilibrium between various, usually conflicting, goals:

  • Challenging, but not frustrating

  • Easy to get into, but deep enough to compel you to stay

  • Simple to learn, but not simplified

  • Complex, but not baffling

  • Long, but not too long

Balance can also refer to a state of rough equality between different competing units in a game:

  • Orcs vs. humans

  • Melee fighters vs. ranged fighters

  • The NFC vs. the AFC

  • The Serpent Clan vs. the Lotus Clan

  • Paul vs. Yoshimitsu

  • Sniper rifles vs. rocket launchers

  • Rogues vs. warlocks

  • Purple triangles vs. red squares

The test team may be asked by the development team or project manager for balance testing at any point in the project life cycle. It is often prudent to suggest delaying any serious consideration of balance until at least Alpha, because it is hard to form useful opinions about a game if key systems are still being implemented.

Once the game is ready for gameplay testing, it is important that test feedback be as specific, organized, and detailed as any other defect report. Some project managers may ask you to report balance issues as bugs in the defect-tracking database; others may ask the test lead to keep gameplay and balance feedback separate from defects. In either case, express your gameplay observations so that they seem fact-based, and hence authoritative.

Let's examine some feedback I collected from my testers when we conducted balance testing on Battle Realms, a PC RTS developed by Liquid Entertainment. It became clear very early in the course of play testing that the Lotus Warlock unit may be overpowered. One tester wrote:

 Lotus Warlocks do too much damage and need to be nerfed. 

If you've ever spent any time on Internet message boards, comments like this should look very familiar. The tester is not specific. How much damage is too much? Relative to what? If nerfed means "made less powerful," how much less? 50%? 50 points? The development team is not very likely to take this comment seriously, thinking it's an impulsive, emotional reaction. (And it was. The tester had just been on the receiving end of a warlock rush.)

 Lotus Warlocks should have a 5-second cooldown added to their attack. 

This tester is overly specific. He has identified a problem (overpowered warlocks) and gone too far by appointing himself game designer and declaring that the solution is a five-second cooldown (that is, a delay of five seconds between the end of a unit's attack and the beginning of its next attack). This comment presumes three things: that the warlocks are indeed overpowered, that the designers agree that the best solution is to implement a cooldown, and that the code has been written (or can be written) to support a cooldown between attacks. The development team is likely to bristle at this presumption (even if it is a viable solution).

 Lotus Warlocks are more powerful than the other three races' highest-level units. Their attack does approximately 10% more damage than the Dragon Samurai, Serpent Ronin and Wolf Clan Werewolf. They get three attacks in the same time it takes the other clans' heavy units to do two attacks. Players who choose to play as the Lotus Clan win 75% of their games. 

This comment is specific and fact-based. It gives the producers and designers enough information for them to start thinking about rebalancing the units. It does not, however, suggest how the problem should be solved.
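The figures in that third comment can be combined and checked with a few lines of arithmetic. The following is a minimal sketch, not the method used on Battle Realms: the function names and the match log are hypothetical, and it assumes that damage per attack and attack frequency simply multiply, ignoring armor, range, and other factors the comment does not mention.

```python
from collections import Counter


def relative_damage_rate(damage_ratio, attacks, rival_attacks):
    """Damage per unit time relative to a rival unit.

    damage_ratio: damage per attack divided by the rival's damage per attack.
    attacks, rival_attacks: attacks each unit lands in the same time window.
    """
    return damage_ratio * attacks / rival_attacks


def win_rates(matches):
    """Map each clan to the share of its games that it won.

    matches: iterable of (clan, won) pairs, one per player per game.
    """
    played = Counter()
    won = Counter()
    for clan, did_win in matches:
        played[clan] += 1
        if did_win:
            won[clan] += 1
    return {clan: won[clan] / played[clan] for clan in played}


# Figures from the tester's comment: about 10% more damage per attack,
# and three attacks in the time rival heavy units need for two.
rate = relative_damage_rate(1.10, 3, 2)
print(f"Warlock damage rate vs. rival heavy units: {rate:.2f}x")

# Hypothetical match log, invented purely for illustration.
log = [
    ("Lotus", True), ("Lotus", True), ("Lotus", True), ("Lotus", False),
    ("Dragon", False), ("Dragon", True), ("Serpent", False), ("Wolf", False),
]
print(win_rates(log))
```

Under these assumptions, the two damage observations together imply roughly 65% more damage per unit of time, a stronger statement than either figure alone. A tally like win_rates is only as trustworthy as the number of games behind it, so a report should state the sample size alongside the percentage.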

Sometimes, however, testers may have suggestions to make.

"It's Just a Suggestion"

Play testing occurs constantly during defect testing. Because testers are not robots, they will always be forming opinions and making judgments, however unconscious, about the game they are testing. Occasionally, a tester may feel inspired to suggest a design change. In some labs, these are called "suggestion bugs," and are frequently ignored. Because bugs stress out programmers, artists, and project managers, they rarely appreciate the bug list being cluttered up with a lot of suggestion, or "severity S," defects.

A far more successful process of making your voice heard as a tester, if you're convinced you've got a valuable (and reasonable) idea for a design change, is the following:

  • Ask yourself whether this is a worthwhile change. "Zorro's hat should be blue," is not a worthwhile change.

  • Express your idea in the positive. "The pointer color is bad," is a far less helpful comment than, "Making the pointer green will make it much easier to see."

  • Sleep on it. It may not seem like such a good idea in the morning.

  • Discuss it with your fellow testers. If they think it's a good idea, then discuss it with your test lead.

  • Ask your test lead to discuss it with the project manager or lead designer.

  • If your test lead convinces the development team that your idea has merit, at that point you may be asked to enter the suggestion into the defect database as a bug so that it can be tracked like any other change. Only do this if you are asked to.

I know this process works. As a tester, I have had design tweaks I suggested incorporated into more than a dozen games; yet I've never written a single suggestion bug.

It's Hard Work Making a Game Easy

One element of game balance that becomes the most difficult to pin down late in the development cycle is, ironically, difficulty. Games take months and years to develop. By the time a game enters full-bore testing, the game testers will likely have completed the game more often than even the most ardent fan. The design and development team may have been playing the game for more than a year. Over the course of game development, the following take place:

  • Skills improve with practice. If you couldn't grind a rail for more than 10 feet when you got the first test build of a stunt game, you can now grind for hours and pull off 20-trick chains without breaking a sweat.

  • AI patterns, routes, and strategies are memorized. The behavior of even the most sophisticated AI opponents becomes predictable as you spend weeks playing against them.

  • Puzzles stop being puzzling. In adventure games or other types of games with hide-and-seek puzzle elements, once you learn how to solve a puzzle or where an item is hidden, it's impossible to unlearn it.

  • Tutorials stop tutoring. It's very difficult to continue to evaluate how effective a lesson is if you've already learned the lesson.

  • Jokes become stale.

  • What was once novel becomes very familiar. And familiarity breeds contempt.

The upside of all this is that, on release day, the development and test teams are the best players of their own game on the planet. (This won't last long, though, so you should enjoy "schooling" new players online while you can.)

The downside is that you (and the rest of the project team) lose your ability to objectively evaluate difficulty as the game approaches release. Nothing of what is supposed to be fresh and new to a player seems fresh and new to you. That is why you need another set of fresh eyes: outside gameplay testers.

External Gameplay Testing

External testing begins with resources outside of the test and development teams, but still inside your company. These opinions and data can come from the marketing department, as well as other business units. It's a good idea to have everyone who is willing, from the CFO to the part-time receptionist, to play test the game if there are questions that remain to be answered.

Here we must be careful to keep in mind Dr. Werner Heisenberg's warning that the act of observing something changes the reality observed. Even small children are aware they're participating in a focus group or play test. Because they (or their adult counterparts) are often eager to please, they may tell you what they think you want to hear. Entire books have been written on how to manage this problem with consumer research.

Note

For more information on managing problems with consumer research, see Sudman and Wansink, Consumer Panels (South-Western, 2002).

Although outside gameplay testing and opinion gathering is an effort typically initiated by the development or design teams, it is often implemented and managed by the test team.

Subject Matter Testing

If your game takes place in the real world, past or present, the development team may wisely choose to have subject matter experts review the game for accuracy.

During the development of the PC jet fighter simulator Flanker, producers at the publisher, SSI, used the Internet to pull together a small group of American and Russian fighter pilots who were given Beta builds of the game. Their feedback about the realism of the game, from the feel of the planes to the Russian-language labels on the cockpit dials, proved invaluable.

These experts posted their comments to a closed message board, and their feedback was carefully recorded, verified, and passed on to the development team. The game was released to very good reviews and was given high marks for its realistic creation of Soviet-era fighter planes.

Such an expert panel tends to be relatively small and easy to manage. It's much more challenging to manage a mass Beta test effectively.

Beta Testing

External Beta testing can give you some very useful data. It can also give you a ton of useless data if the testing is not managed properly.

As mentioned in Chapter 5, there are two types of Beta testing: closed and open. Closed Beta occurs first and is carefully controlled. Closed Beta testers are screened carefully and usually have to answer a lot of questions before they are accepted into the Beta test. These questions can range from the technical specifications of their computer to which specific games they've played recently.

The simplest type of closed Beta testing occurs on console or other offline platforms. Testers are recruited to come into the publisher's or developer's offices, review and play the game, and then fill out a questionnaire or participate in a focus group discussion.

Open Beta occurs after closed Beta concludes. Open Beta is open to all who are interested in participating. Although developers will still solicit some level of gameplay feedback from this much larger group, their role may be primarily to load test the network code and shake out such items as the login system, matchmaking, overall network stability, lag, and so on.

Although Beta testers won't run test cases, they will report defects, in addition to providing gameplay feedback. Most Beta test managers email a bug reporting form or host a bug reporting site that allows Beta testers to report defects, make comments, and ask questions.

Besides playing the game the way it would "normally" be played, here are some other strategies you can adopt as an individual Beta tester:

  • Try to create infinite point-scoring, money-making, or experience-producing strategies.

  • Try to find ways to get stuck in the game environment, such as a pinball bouncing forever between two bumpers or an adventurer who falls into the river and can't get out.

  • Spend some time in every feature, mode, or location provided in the game.

  • Spend all of your time in one feature, mode, or location and fully explore its options and functions.

  • Try to find ways to access prohibited modes or locations such as locked race tracks.

  • See what happens when you try to purchase, acquire, or use items and abilities that were designed for characters at a level much higher than yours.

  • Try to be the first to accomplish something in the game, such as becoming the first level 2 character, the first to enter a particular town, the first to win a match, the first to form a clan, and so on.

  • Wear, wield, and/or activate as many stat-increasing items as you can at one time, such as armor or powerups.

  • Try to be the one with the "most" of something in the game, such as wins, points, money, trophies, or vassals.

Likewise, you can conspire with other Beta testers to create situations that might not have been foreseen by the game developers, or were impossible for them to test, such as:

  • Get as many people as you can to show up in the same location in the game.

  • Get as many people as you can to log into the game at the same time.

  • Get as many people as you can to join the same match at the same time.

  • Get as many people as you can to send you an in-game message at the same time.

  • Create an in-game chat/message group with as many people as possible.

  • Get multiple people to try to give you items at the same time.

  • Get as many people as you can to stand within range of your "area of effect" spell.

  • Get as many people as you can to cast stat-increasing or stat-decreasing spells on you.
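Crowd scenarios like these are valuable because they stress code paths that assume only one thing happens at a time. The sketch below is purely illustrative and is not taken from any real game: ToyInventory and simultaneous_gifts are hypothetical names, and threads stand in for players. It mimics many players handing items to one character at the same moment and checks that the inventory never exceeds its capacity.

```python
import threading


class ToyInventory:
    """Hypothetical stand-in for a player's inventory with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self._lock = threading.Lock()

    def receive(self, item):
        """Accept the item if there is room; return True when accepted."""
        # The lock keeps the capacity check and the append together, so two
        # givers cannot both claim the last free slot.
        with self._lock:
            if len(self.items) < self.capacity:
                self.items.append(item)
                return True
            return False


def simultaneous_gifts(inventory, givers):
    """Have every giver hand over one item at (nearly) the same moment."""
    barrier = threading.Barrier(givers)
    results = [False] * givers

    def give(index):
        barrier.wait()  # hold each thread until all givers are ready
        results[index] = inventory.receive(f"gift-{index}")

    threads = [threading.Thread(target=give, args=(i,)) for i in range(givers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


inventory = ToyInventory(capacity=5)
results = simultaneous_gifts(inventory, givers=20)
accepted = sum(results)
print(f"accepted {accepted} of {len(results)} gifts; "
      f"inventory holds {len(inventory.items)} items")
```

In a real Beta the "threads" are people, and the interesting result is any case where the count comes out wrong: duplicated items, lost items, or an inventory holding more than it should. Those are the observations worth writing up with exact numbers of participants and items.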

Who Decides?

Ultimately, decisions that relate to changing the design, rebalancing, adding (or cutting) features, even delaying the release to allow more time for "polish" are not made by game testers. The testers' role is to supply the appropriate decision-makers and stakeholders with the best opinions and information they can, so that the best decisions can be made.




Game Testing All in One
Game Testing All in One (Game Development Series)
ISBN: 1592003737
Year: 2005
Pages: 205
