A Note on Separate Test Teams


Some XP teams choose to have a separate test team and keep the acceptance testing tasks separate from development. We disagree with this approach for several reasons:

  • Testers on a separate team have no input into the team's velocity and thus no control over their own workload.

  • Time for acceptance testing tasks becomes invisible when those tasks aren't included with the story tasks. As the end of the iteration approaches, everyone but the testers tends to forget about testing tasks.

  • Writing, automating, and running acceptance tests is the responsibility of the entire XP team, including the customer.

  • No story is complete until the acceptance tests have passed.

We're not saying this can never work, but it requires strong leadership and communication. It's simpler to include testers as part of the development team and testing tasks as part of the story tasks.

Iteration planning is best learned by doing. You'll find lots of pointers in XP publications, but they may not specifically address testing tasks. Here's an example of breaking out and estimating test infrastructure and testing tasks for the phone-directory application used in the preceding examples. For the purpose of a clear example, we're using a story containing functionality that, in a real project, might be broken out into multiple stories.

Example 6

Story: Create a screen that will allow the user to search on the Web for businesses by category of business and location.

In Chapter 9, we defined the acceptance tests shown in Table 13.1 for this story.

Table 13.1. All acceptance tests for directory application Story 1

Action: Search (all tests)

1. Data: A category/location combination for which some businesses exist
   Expected result: Success: a list of each business within the category and location

2. Data: A category/location combo for which no businesses exist
   Expected result: Failure: empty list

3. Data: Enough concurrent users and category/location combinations to generate 200 searches and 1,000 hits/second
   Expected result: Success: each user gets appropriate results list within a reasonable response time

4. Data: Business name/location combo for which some businesses exist
   Expected result: Success: list of businesses that match the name at the specified location

5. Data: Business address/location combo for which some businesses exist
   Expected result: Success: list of businesses that match the address at the specified location

6. Data: Search that retrieves more than 20 matching businesses
   Expected result: Success: list has a "Search within these results" option

7. Data: Misspelled category/location combo for which businesses exist
   Expected result: Failure: contains a "Did you mean?" type message and a list of categories that includes the correct category

Let's say this is the first story in our first iteration. Below is the way we broke out and estimated the tasks in ideal hours. Remember, this is just an example. If your stories aren't this complex, you won't necessarily need all these tasks. We just want to illustrate all the possibilities.

Test Infrastructure Tasks

We need to select and implement a test tool that will allow us to run tests through the user interface. Assume we have a short list of tools we've used before, so we'll be able to decide on one fairly quickly. We estimate 3 hours.

Now, say we have a machine allocated for a test system. We need to install the base software on it and create some shell scripts to pull over our software from the integration system whenever we have a new build. We estimate 3 hours.
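
To make the task concrete, here is a minimal sketch of the kind of pull script we have in mind, assuming the integration system is reachable over ssh. The host alias, artifact path, and install directory are hypothetical, and scp/tar stand in for whatever transfer and packaging mechanism the project actually uses.

import subprocess

INTEGRATION_HOST = "integration"                        # hypothetical ssh alias
BUILD_ARTIFACT = "/builds/directory-app/latest.tar.gz"  # hypothetical artifact path
TEST_APP_DIR = "/opt/directory-app"                     # hypothetical install dir

def pull_latest_build():
    # Copy the newest build from the integration system to the test system...
    subprocess.run(
        ["scp", f"{INTEGRATION_HOST}:{BUILD_ARTIFACT}", "/tmp/latest.tar.gz"],
        check=True,
    )
    # ...and unpack it into the application directory on the test system.
    subprocess.run(
        ["tar", "-xzf", "/tmp/latest.tar.gz", "-C", TEST_APP_DIR], check=True
    )

if __name__ == "__main__":
    pull_latest_build()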

We also need some shell or database scripts to load, save, and restore the data in the test environment. We estimate 2 hours.
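
As a sketch of what those scripts might look like, assume purely for illustration that the test system keeps its data in a single SQLite file; a real environment may well use a different database and its own dump/restore utilities, and the file names here are hypothetical.

import shutil
import sqlite3

TEST_DB = "directory.db"            # hypothetical test database file
SNAPSHOT = "directory_snapshot.db"  # hypothetical known-good copy

def save_test_data():
    # Take a consistent snapshot of the current test data.
    src = sqlite3.connect(TEST_DB)
    dest = sqlite3.connect(SNAPSHOT)
    src.backup(dest)
    dest.close()
    src.close()

def restore_test_data():
    # Put the test data back to the known-good snapshot before a test run.
    shutil.copyfile(SNAPSHOT, TEST_DB)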

We want to find or create a tool to save and report results for each run of our functional tests. The whole team needs to look at this and decide the best approach. We estimate 8 hours.
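
One lightweight possibility, sketched here under the assumption that a flat CSV file per project is enough; the file name and fields are our own invention.

import csv
from datetime import datetime

RESULTS_FILE = "acceptance_results.csv"  # hypothetical results file

def record_result(test_name, passed, notes=""):
    # Append one row per test per run so results can be trended over time.
    with open(RESULTS_FILE, "a", newline="") as f:
        csv.writer(f).writerow(
            [datetime.now().isoformat(), test_name,
             "PASS" if passed else "FAIL", notes]
        )

def summarize():
    # Report pass/fail counts for everything recorded so far.
    with open(RESULTS_FILE, newline="") as f:
        rows = list(csv.reader(f))
    passed = sum(1 for r in rows if r[2] == "PASS")
    print(f"{passed}/{len(rows)} recorded results passed")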

Acceptance Testing Tasks

We know we'll need to work with the customer to define categories and business names. The customer should also come up with the searches to retrieve the various types of results. While we're compiling that, we'll also define error messages, the look and feel of the search form and results list, and probably come up with additional typical and potentially tricky user scenarios. In addition to all this, we'll need to figure out how many concurrent users we need for the load test and determine what "reasonable" response time is. We estimate 4 hours.

Once we've defined the details, we'll need to load test data into the test system and also into whatever format is necessary for the test automation to access it, especially for the load test, which will require a lot of different searches. We expect to obtain test data from the customer and load it into the system using standard database utilities. Once again, we estimate 4 hours.
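
For example, if the customer hands us the data as a spreadsheet exported to CSV, loading it could be as plain as the sketch below; the file name, table, and column names are hypothetical, and we're again assuming a SQLite test database just for illustration.

import csv
import sqlite3

def load_businesses(csv_path="businesses.csv", db_path="directory.db"):
    # Insert each customer-supplied business record into the test database.
    conn = sqlite3.connect(db_path)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            conn.execute(
                "INSERT INTO businesses (name, category, address, location) "
                "VALUES (?, ?, ?, ?)",
                (row["name"], row["category"], row["address"], row["location"]),
            )
    conn.commit()
    conn.close()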

We're going to break out the test automation tasks for this story by anticipating how we'll write an executable version of the test. We'll do that for real later in the iteration (Chapter 16). For now, based on the acceptance tests we've defined, it appears we can handle all of them, except test 3, with a single search module.

This module will be parameterized with the search specification as a category and location, a business name and location, or a business address and location. It will perform the search according to the parameters and return the results list. We expect to first build a version of search that calls the system code directly (half an hour) and later a version in the test tool that goes through the user interface (2.5 hours), for a total of 3 hours to code and test this module.
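
To give a feel for the direct-call version, here is one way the module could be shaped. This is a sketch only: the find_businesses callable is a stand-in for whatever entry point the system code actually exposes, and the UI-driving version built with the test tool would replace it later.

def search(find_businesses, location, category=None, name=None, address=None):
    # Run one search and return the results list. Exactly one of category,
    # name, or address is supplied along with location, matching the three
    # parameterizations described above.
    criteria = {"category": category, "name": name, "address": address}
    given = {key: value for key, value in criteria.items() if value is not None}
    assert len(given) == 1, "specify exactly one of category, name, or address"
    return find_businesses(location=location, **given)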

Four variations on the expected results are possible:

  1. A list of businesses that match the criteria

  2. An empty list when no matches exist

  3. A list with a "Search within" option

  4. An empty list when a term has been misspelled, with a "Did you mean?" suggestion

We expect to need a verifyList module to recognize each of these variations. We estimate 1 hour to code and test this module.
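
A sketch of how verifyList might distinguish the four variations, assuming the results come back as a simple record holding the businesses list plus flags for the "Search within" and "Did you mean?" features; those field names are hypothetical.

def verify_list(result_page, expected):
    # Check that the results page matches one of the four expected variations.
    businesses = result_page["businesses"]
    if expected == "matches":
        return len(businesses) > 0
    if expected == "empty":
        return businesses == []
    if expected == "search_within":
        return len(businesses) > 20 and result_page["search_within_offered"]
    if expected == "did_you_mean":
        return businesses == [] and result_page["did_you_mean"] is not None
    raise ValueError("unknown expectation: " + expected)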

Because test 3 involves multiple users and has performance criteria besides the functionality validated by our search module, we'll need a different module for that test, which we'll call multiSearch. This module will use search and verifyList to perform the searches and validate the results. It will coordinate the concurrent users doing the searches and will collect and validate the response times. Because we aren't as comfortable with multiuser automation, we expect to have to experiment to find the best way to do this, so we'll estimate a 4-hour design spike and another 4 hours to code and test this module.
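
The shape we have in mind is roughly the sketch below: one thread per simulated user, each timing its own search. Here run_one_search stands for a call into the search and verifyList modules, and the 2-second target is only an illustrative stand-in for whatever "reasonable" turns out to mean.

import time
from concurrent.futures import ThreadPoolExecutor

def multi_search(search_specs, run_one_search, max_response_seconds=2.0):
    # Run the given search specifications concurrently, one thread per
    # simulated user, and collect the response time for each search.
    def timed(spec):
        start = time.perf_counter()
        run_one_search(spec)  # performs and verifies a single search
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=max(1, len(search_specs))) as pool:
        times = list(pool.map(timed, search_specs))

    # The test passes only if every simulated user got results in time.
    return times, all(t <= max_response_seconds for t in times)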

Finally, setup and reporting for all tests except test 3 are trivial; we won't even estimate those. For test 3 to be valid, we need a good 2 hours at peak load, plus time to ramp up beforehand and additional time to analyze the results afterward, especially if the response times aren't good enough. There's a good chance we'll have to do this test twice, so we'll plan for two execution tasks of 4 hours each and two analysis tasks of 4 hours each.

Table 13.2 shows our tasks and estimates.

Table 13.2. Example 6 tasks and estimates

Task                                               Estimate (hours)

Select test tool                                                  3
Set up test system                                                3
Create scripts to load/save/restore test data                     2
Find or build test results reporting tools                        8

Define search test details                                        4
Load search test data                                             4
Code and test search module                                       3
Code and test verifyList module                                   1
multiSearch design spike                                          4
Code and test multiSearch module                                  4

First load test execution                                         4
First load test analysis                                          4
Second load test execution                                        4
Second load test analysis                                         4

Total                                                            52


