The Importance of a Technically Stable Launch


When the industry speaks of the launches of games such as World War II Online and Anarchy Online (AO), the comment heard most often is, "What made them think they were ready to launch???"

Those two games have become the epitome of the bad launch, and it has cost them dearly in long-term revenue. Neither game was ready to launch when it did and, compounding the problem, both games sold over 40,000 units at retail in the first two days, well above the peak simultaneous user levels seen in the final tests. This created an extreme, yet untested, load on the login and game servers. Both crashed and burned on launch day, experienced ongoing technical problems during the first two weeks post-launch, and ended up with far fewer paying subscribers because of these issues.

So, why did they launch? From public statements made by company representatives and a little deduction, the main reason was apparently financial: the desire to start generating cash flow (or, in PlayNet/Cornered Rat's case with World War II Online, the need, as evidenced by their later bankruptcy) and to see a return on the investment after a long and expensive development process. The results were obviously not what they expected: poor word of mouth and initial customer retention rates far below the industry average.

The lesson: Learn from the mistakes of others.

Some of the problems others have seen at launch are listed here:

  • The game gets swamped by too many players trying to log on during the first day of service; login servers and the game itself slow to a crawl or crash repeatedly from the load.

    These days, it doesn't take much to create enough hype on the Internet to get 25,000–35,000 people to buy the retail package on day one. Of course, these are the motivated buyers; they are buying the retail SKU because they are going to log on to your game. If you aren't ready to handle 25,000–35,000 simultaneous players, you're already in the hole, trying to climb out.

    To state the painfully obvious, nothing drives away potential subscribers faster than not being able to log in and play.

  • The guys upstairs want to start seeing some cash flow, so they order a launch, regardless of the state of bug fixes and technical stability in general.

    This problem comes in two forms:

    • Developers or QA misstating the situation: Most executives today don't fully grasp the technical or support problems of these games. They realize that they don't know, and they depend on the developers and QA to state accurately and clearly whether the game is ready to launch. Whether from pressure, fear, inexperience with the process, or overestimating their own ability to fix problems quickly, developers and QA often misstate the actual condition of the game to give executives a "feel good" moment about ordering the launch. In a sense, this comes under the heading of lying to the stakeholders.

    • Executives ignoring the advice of the developers and/or the QA department: Even when the development team and QA give accurate reports and recommend a delay in launch, executives have been known to order it anyway, based on the mistaken belief that any cash flow is better than none, or simply out of the need to start getting some cash in the door. [1] Don't launch with bugs just because you want the income flow. A bad launch actually costs you revenue in the long run because it creates bad word of mouth and reviews for the game. Launching too early hurts the return on investment (ROI), especially in the short and middle terms.

      [1] One of the authors uses an example of a persistent world project she consulted on a few years back. "What the company wanted was to improve acquisition and retention of subscribers. They had a horrible launch that generated some controversy, so I started by trying to back-track the reasons for it," she says. "One of the documents I was shown was the Quality Assurance book listing the unfixed bugs. There were well over 400 pages of them, including known crash bugs in both the game client and server. On the first page of the book was a letter from the head of QA recommending a launch delay, countersigned by the Producer and the Lead Tester and receipt of which was acknowledged by a senior management person. That letter was dated only a few days before the launch actually took place. Senior management obviously did not understand at the time the ramifications of launching in that state." She added, "This kind of thing is not at all unusual in the industry."

  • The client is shipped to retail with known, sometimes serious, problems because the development team is certain it can fix the problems by the time the game hits the shelves.

    While we admire the confidence of coders in their abilities to fix problems quickly, this borders on arrogance; it hasn't actually happened in any of the poor launches seen since 1996. It takes far more than 30 days to find, fix, and test even a couple dozen bugs, and most PWs have launched with hundreds of known bugs, never mind the hidden, unexpected ones that lie in wait for launch.

    We have yet to see a situation where a development team was able to come through on this promise. Based on that experience, development teams shouldn't fool themselves that they can do it, and those responsible for green-lighting a launch shouldn't believe it can be done.

  • Player relations is swamped by email and in-game help requests due to understaffing and/or the number of technical problems in the game.

    How you publicly handle technical instability with the players is just as important as how you deal with it internally. Assuming you don't launch prematurely and don't hit many unanticipated technical snags, there will still be some technical problems due to the sheer load of people trying to access the game at the same time and, in a perfect example of Finagle's Law according to Niven, [2] some hidden systems and game mechanics bugs will reveal themselves only when the billing clock is ticking.

    [2] "The perversity of the universe tends to a maximum."

    To our knowledge, no PW launch since 1996 has featured an adequately staffed or trained Player Relations department; everyone keeps getting taken by surprise by the sheer load of email and in-game help requests.

    Here are some guidelines to help you plan for the load:

  • As a general rule of thumb, your direct, individual contacts with the player base per month, meaning in-game help requests, emails, technical support requests, and billing and account management issues, can very easily equal the total number of subscribers. In other words, if you think you'll sign up 40,000 subscribers the first month, you should plan for 40,000 separate help requests.

  • Each help request takes time to resolve. Most of them should be fairly easy to resolve, especially common issues such as, "I'm stuck on the game map!" However, those minutes can add up quickly, as you can see from Table 10.1.

    Table 10.1. Support Hours

    [Table 10.1, an image in the original, tabulates the support hours and staffing expense generated by various request volumes and average resolution times.]

    As you can see from Table 10.1, even at an average resolution time of one minute per player help request, the hours required to deal with those requests stack up quickly, as does the expense of hiring employees to deal with them. Since it is standard in the industry to give the first month free as a trial period, the more successful you are in attracting players in the first month, the higher the personnel expense will be for what amounts to freeloading players.

    If the game is technically stable and the game mechanics don't present many unanticipated problems, or as you find and fix problems, the number of requests should settle to a lower level after the first two to three months. If it doesn't, the team needs to sit down, figure out why, and make recommendations on what should be fixed to lower those request totals. Also, once the player relations and other support people have some experience under their belts, you'll find that resolution times drop, sometimes dramatically. However, every game is different and, depending on its complexity, help requests and resolution times may remain high for a long time, perhaps for the life of the game.

    Examining Table 10.1, you can see why emphasizing a technically stable launch, including monitoring and logging tools, and having full-featured tools for the player relations and other support staff can be critical to the profit margin. Keeping careful track of these metrics, striving for stability at launch, shooting for quick resolution times, and adjusting support service where and when it is needed can mean savings of literally hundreds of thousands of dollars in the first six months of operations.
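
    To make this arithmetic concrete, here is a minimal sketch in Python. The one-request-per-subscriber rule and the one-minute resolution time are the rules of thumb from the text; the 160-hour working month and the sample subscriber counts are hypothetical assumptions, not figures from Table 10.1.

        import math

        def support_load(subscribers, minutes_per_request=1.0,
                         requests_per_subscriber=1.0,
                         hours_per_rep_per_month=160):
            """Estimate monthly support hours and headcount from subscribers."""
            requests = subscribers * requests_per_subscriber  # rule of thumb: ~1 per subscriber
            hours = requests * minutes_per_request / 60.0
            reps = math.ceil(hours / hours_per_rep_per_month)
            return hours, reps

        for subs in (10_000, 40_000, 100_000):
            hours, reps = support_load(subs)
            print(f"{subs:>7,} subscribers -> {hours:>7,.0f} support hours/month, ~{reps} reps")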

  • Your Beta phase will help you determine the number of support people needed at launch.

    You aren't just testing the game through Beta; you're also testing your assumptions about workflow and staffing levels for player relations, community relations, and billing and account management. Actual staffing levels are discussed in other sections. We mention it here because the warning signals on staffing levels given out during testing are often ignored or misunderstood. There is a general tendency to assume that once the technical problems found in Beta are fixed, no other serious ones will reveal themselves, and that the number of player help requests at launch will therefore drop. History has shown this to be an enthusiastic assumption, not one grounded in experience.

    If the staff was overwhelmed by 20,000 simultaneous testers during the final load testing and you expect to ship 50,000 units to retail on the first day, you should assume that most of those 50,000 units will be sold in the first week and that the buyers will try to connect to the game. If the staff was hard-pressed with 20,000 players, imagine how it will be for them with 30,000–50,000 simultaneous players.

  • It is highly recommended that you "overstaff" for the first month or two.

    Since you're likely to see the biggest rush of new customers and the most problems with your service during the first month or two of live operations, it is better to overspend than underspend on personnel during this time. Better the additional cost of an extra 15 or 20 gamemasters (GMs) for a couple of months than the risk of bad word of mouth ruining the game's reputation during this critical period. The idea is to be able to respond faster to player help requests, especially in-game and via email. More rapid player relations response times produce higher player satisfaction, which in turn produces higher retention rates. They also act as an acquisition tool through good word of mouth ("Hey, these guys really know what they are doing!"). If you are successful in this strategy, you may find your subscriber numbers soaring beyond even inflated expectations and the temporary support personnel becoming permanent.

    How many, with whom, and when to begin the overstaffing is a matter of theory, not established fact, because no one has done it yet. [3] Much also depends on whether the game is technically stable; if the game has serious connection or server-crashing problems, or serious in-game bugs that affect the performance of characters, no number of GMs is going to be able to keep up with the email and in-game petition load.

    [3] Star Wars Galaxies Creative Director Raph Koster and Executive Producer Rich Vogel have both mentioned this concept in seminars and lectures. Sony Online's Star Wars Galaxies is expected to launch early in 2003, so we may see the first instance of it then.

    Assuming that a launch is technically stable and has no serious game-impeding bugs, our experience with other launches indicates that a basic formula of one player relations person at launch per 1,000 retail units in the first shipment is probably a good number to start with (this arithmetic is sketched in code after the list of conditions below). Following this formula, if you are shipping 50,000 units to retail in the first print run, you should have 50 player relations people on hand on launch day, 100 for 100,000 units shipped, and so on.

    Next let us consider when and with whom. This is where it gets stickier. It takes time to train a GM for any particular product; these are complex products and you can't just drag someone in off the street and expect him/her to do even an adequate job. That means you have to bring these people in, at an absolute minimum, 30 days ahead of launch, and 60 days would be much better.

    Considering the expense, we'll assume that most publishers will choose the 30-day option. The only viable avenue for finding temporary employees for a three-month contract quickly is, of course, temporary worker agencies, any number of which can be found in most metropolitan areas. Using "temps" allows you to scale quickly for the launch and then scale the support personnel numbers to meet the actual need after the launch phase has settled out. While in most cases we would not recommend temporary assignment workers for a duty as critical as player relations, the risk is acceptable for a launch phase under the following conditions:

  • The temps are used for Tier One help requests only, meaning the easiest problems to resolve.

    All other problems are escalated to the permanent staff, who are better able to resolve them quickly and efficiently.

  • Access to in-game powers is extremely limited to maintain security.

    Temps should not be able to manipulate or change player/character stats, create or delete objects in the game, remove objects from or place objects on player characters, or have any effect on non-player characters (NPCs). At most, temps should be able to "unstick" players from game terrain by moving them a very limited distance, and they should have the power to teleport themselves to any location within the game (a sketch of such a permission set follows this list).

  • All temporary worker in-game actions and "chat" messages are logged, and those logs are reviewed on a daily basis.

    This helps you maintain quality control during the launch phase; even the limited powers noted here can be abused to give the temp's friends an advantage. For example, a temp who can teleport anywhere in the game can "scout" out locations or other players for friends, giving them the advantage of knowing where good treasure or vital NPCs are located or where enemies might be located for ambushing. As temps have no overriding loyalty to the company or the game, you should assume that at least one of them will try to get away with this.
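
    The limited powers just listed might be expressed as a permission set along these lines. This is a minimal sketch in Python; the GMPermissions structure and flag names are illustrative assumptions, not any particular game's tooling.

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class GMPermissions:
            unstick_player: bool = False         # move a stuck player a very short distance
            teleport_self: bool = False          # move the GM's own avatar anywhere in the game
            edit_player_stats: bool = False      # manipulate player/character statistics
            spawn_or_delete_items: bool = False  # create/delete objects, place/remove gear
            control_npcs: bool = False           # any effect on non-player characters
            actions_logged: bool = True          # every action and chat line logged for daily review

        # Tier One temps: unstick and self-teleport only; everything else escalates.
        TEMP_GM = GMPermissions(unstick_player=True, teleport_self=True)

        # Permanent staff carry broader (still fully logged) powers for escalations.
        PERMANENT_GM = GMPermissions(unstick_player=True, teleport_self=True,
                                     edit_player_stats=True, spawn_or_delete_items=True,
                                     control_npcs=True)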
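
    Returning to the staffing formula given earlier (one player relations person per 1,000 retail units in the first shipment), here is a quick sketch of that arithmetic, with hypothetical shipment sizes:

        import math

        def launch_gms(first_shipment_units, units_per_gm=1_000):
            """One player relations person per 1,000 units in the first shipment."""
            return math.ceil(first_shipment_units / units_per_gm)

        for units in (25_000, 50_000, 100_000):
            print(f"{units:,} units shipped -> {launch_gms(units)} player relations staff at launch")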

How Much Hardware and Bandwidth?

Previous sections in this book have discussed technical stability and the importance of achieving it before launching a game. "Technical stability" for launch doesn't just mean that the software code works well; it also means making sure the physical infrastructure to handle the load is in place and tested before the paying customers hit the front door. In this case, "infrastructure" means having enough servers and server clusters, routers, and bandwidth capacity to handle the load, and to be able to grow gracefully if the load outstrips expectations.

The key phrase is "to grow gracefully." If the number of subscribers creating accounts outstrips the ability of the available infrastructure to handle them, the game will see increased latency on the game servers as players overload the machines and the bandwidth capacity, traffic jams and delays on the login servers, and refused player connections as login servers or server clusters either hit their assigned peak connections or just plain give up and "die" from the overload. If this occurs, it doesn't matter if you launch with the most stable code and balanced game mechanics ever seen in the PW market; the players' perception will be that the game is buggy, was launched too soon, and/or is not ready for prime time.

To prevent this from happening, you have to plan for both the expected load and an unexpected overload at launch, and have the resources on hand to deal with them. At minimum, you need enough extra hardware and bandwidth capacity on hand to set up and integrate a new server cluster on demand. How much more you'll need depends on a couple of factors.

How Many Servers/World Iterations?

Most server clusters for current PWs are configured to handle between 2,000 and 3,000 simultaneous users, out of a total of 10,000 or so paying accounts per server cluster. The actual numbers vary between games; for example, the original server cluster for Funcom's AO (a "dimension") was designed to host 10,000 simultaneous players and all of the game's subscriber accounts. On the whole, however, a configuration of 2,000–3,000 users/10,000 accounts per cluster is the most common.

Depending on how many load testers participated during the open Beta tests, you probably had between two and five server clusters running at any one time (20,000–50,000 load testers). Most publishers assume this is the launch load they'll have to handle, but that doesn't take into account the following:

  • The number of potential players who generally don't participate in testing. A significant number of people don't want to put time into a game until it has launched and is supposedly stable. There are no hard-and-fast numbers for this category, but you can generally get a pretty good idea from retail pre-order numbers.

  • Players who subscribe to multiple accounts for themselves, friends, and/or family members. This is another number that is tough to anticipate. As a self-protection measure, a publisher should assume that a minimum of 10% of the peak number of open Beta testers will open multiple accounts.

  • Marketing "buzz" around the game versus retail pre-orders and the number of units being shipped to retail in the first print run. Marketing and Press Relations departments have a bad rap for being "weasels" and not even playing the games they market, but they are usually quite good at creating a sense of anticipation for a PW release. In fact, they are generally so successful and so far out in front of the development team that the latter gets taken by surprise when the subscriber load quickly outstrips their own estimates and growth schedules.

The only way for publishers to protect themselves in any of these situations is to look at the initial print run and pre-orders.

How Many Units Are You Shipping to Retail?

Boxed goods publishers want to ship as many shelf units (often called SKUs) to retail as they think the market can handle. It makes sense, considering the normal business model for publishers: build the game, hype the game, and then sell as many units as possible during the "sweet spot," the first three to six months of a game's normal shelf life.

This makes perfect sense for a solo-play home game, but it can present problems for a PW game service. The problems a service can experience scale up with the number of total and simultaneous subscribers. The trick, then, is to correctly anticipate the launch load and add both additional server and bandwidth capacity to allow for graceful growth. There are two good metrics that can be used to estimate planned overcapacity levels: the first is the number of testers that took part in the final open Beta phase load tests; the second is the number of retail pre-orders and planned first print-run for the game.

  • Total simultaneous testers: The testers who come in during the open Beta are generally the motivated players interested in playing the game. The number of simultaneous testers is pretty self-explanatory; if you hit 50,000 and the load tests went smoothly, you can assume that most of them are interested in the game and will purchase the retail unit.

    There is no way to gauge precisely how many of them have pre-ordered or will pre-order the retail unit, how many plan on being in line to buy it on launch day, or how many people outside the test (friends, a guild, or a team from another game) they have convinced to buy the game. The best "fudge factor" is simply to assume that at least 80% of them will buy the game and participate in launch week.

  • Retail pre-orders: A somewhat more precise indicator of the number of people who will try to play on launch day is the number of pre-orders at retail chains. If the tester load is scaling up to 50,000 during open Beta but the retail pre-orders stand at 75,000 two months before launch, you have a happy problem: the need to lay in more hardware and bandwidth. You can assume most of those buyers will be picking up their copies in the first couple days of launch and getting online.

    About two months from the projected launch date, you should look at the pre-order and total open Beta tester numbers and determine how many server clusters you're likely to actually need on launch day. If you have 75,000 pre-orders and you can comfortably hold 10,000 accounts per cluster, you'll need 8 server clusters to handle the expected load. At this point, you must decide how much over-capacity to stock up for. As stated earlier, a minimum of one extra server cluster on hand is a necessity, not a luxury. Depending on the hype surrounding the game and how the estimates are fine-tuned during the last two months (if marketing is doing its job correctly, pre-orders of the retail unit will continue to grow), you may want to add one or two more extra clusters in reserve.
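
    The pieces of this estimate can be pulled together in a short sketch. Combining the 80% tester-conversion fudge factor, retail pre-orders, and the 10%-of-peak-testers multiple-account allowance into one formula is our own assembly of the chapter's separate rules of thumb; the input figures are the examples from the text.

        import math

        def clusters_needed(peak_beta_testers, retail_preorders,
                            accounts_per_cluster=10_000, reserve_clusters=1):
            converting_testers = 0.80 * peak_beta_testers  # ~80% of testers buy and play
            multi_accounts = 0.10 * peak_beta_testers      # >=10% of peak testers open extras
            expected_accounts = max(converting_testers, retail_preorders) + multi_accounts
            live = math.ceil(expected_accounts / accounts_per_cluster)
            return live, live + reserve_clusters

        live, with_reserve = clusters_needed(50_000, 75_000)
        print(f"{live} live clusters; stock {with_reserve} including the reserve")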

Staged Launch

One way to cut the risk of being overwhelmed during the launch phase is to stage out the launch. This is accomplished by not shipping all available retail units from the warehouse to the stores, but breaking them into three or four shipments and parceling them out in daily or weekly allotments. For example, instead of sending 100,000 units to retail on the first day, send 25,000, and then send another 25,000 a few days later, and so on. This effectively limits the number of people who can sign up new accounts at any one time, greatly reducing the chances that your hardware and bandwidth will be overwhelmed. Even just three days between major influxes of new players can give you much-needed time to do emergency troubleshooting; it is much better to inconvenience 25,000 subscribers than 50,000.

Staging your launch also allows you to gauge the "last-minute" sales and pre-orders caused by the enforced shortage and decide if you need to order server clusters and bandwidth over and above your current reserves.
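
As a minimal sketch, a staged shipping schedule might be planned along these lines; the launch date, allotment count, and interval below are hypothetical, with the 100,000-unit print run taken from the example above.

    from datetime import date, timedelta

    def staged_shipments(total_units, allotments=4, days_between=3,
                         start=date(2003, 6, 2)):  # hypothetical street date
        """Split a print run into equal retail allotments a few days apart."""
        per_wave = total_units // allotments
        return [(start + timedelta(days=i * days_between), per_wave)
                for i in range(allotments)]

    for ship_date, units in staged_shipments(100_000):
        print(ship_date.isoformat(), f"{units:,} units")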

A last warning on this: If demand outstrips resources, there is going to be the temptation to install your test server cluster as a live production server "just until we can get in new test hardware." Don't do it; without a test server, you'll be in the position of having to test fixes and changes on a live production server cluster, and this is completely unacceptable. Not only will the players on that cluster howl like banshees (and rightly so; they aren't paying money to be guinea pigs), your reputation will suffer as well. All it will take is one major crash of the "test" server for you to lose all trust from the player base.

How Much Bandwidth Capacity?

By the time open Beta is in full swing, you should have a good idea of the bandwidth consumption per connection to your game; that is, the total bit rate per individual user and per distinct server cluster. These metrics, combined with your estimate of the number of production and reserve server clusters needed on launch day and the peak number of simultaneous users you'll have to support, will give you a pretty firm idea of the amount of bandwidth you'll need to have available.

Beyond that, you'll need to estimate a growth rate for the game over the first three to six months and make sure that the bandwidth capacity to handle that growth is on hand. The best recommendation we can make is to estimate your need at launch (plus the reserve clusters) and tack on enough extra capacity to handle at least one additional server cluster. If you are co-locating clusters at the network operations centers (NOCs) of an Internet backbone provider, such as Exodus, this isn't a problem; most backbone NOCs have plenty of capacity and can step up availability in an instant. If you plan on hosting your own server farm and laying in your own fiber, this is more problematic; getting the local phone company to actually lay in the line can be like pulling teeth. Average wait times in major metropolitan areas are running over 30 days, so this isn't something you can wait until the last minute to get done.
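
As a rough sketch of that estimate: take the per-connection bit rate measured during open Beta, multiply by the peak simultaneous users across the planned clusters (including the reserve), and add headroom. The 4 kbps per connection, 2,500 users per cluster, and 20% headroom figures below are hypothetical placeholders, not numbers from the text.

    def bandwidth_mbps(kbps_per_user, users_per_cluster, live_clusters,
                       reserve_clusters=1, headroom=1.20):
        """Estimate launch bandwidth in Mbps, including reserve clusters and headroom."""
        peak_users = users_per_cluster * (live_clusters + reserve_clusters)
        return peak_users * kbps_per_user * headroom / 1000.0

    # e.g., 4 kbps per connection, 2,500 simultaneous users per cluster, 8 live clusters
    print(f"~{bandwidth_mbps(4.0, 2_500, 8):,.0f} Mbps at launch")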


