Equivalence Class Testing | Section I - Black Box Testing Techniques

On the fourth day of his exploration of the Amazon, Byron climbed out of his inner tube, checked the latest news on his personal digital assistant (hereafter PDA) outfitted with wireless technology, and realized that the gnawing he felt in his stomach was not fear—no, he was not afraid, rather elated—nor was it tension—no, he was actually rather relaxed—so it was in all probability a parasite.

— Chuck Keelan

Introduction

Equivalence class testing is a technique used to reduce the number of test cases to a manageable level while still maintaining reasonable test coverage. This simple technique is used intuitively by almost all testers, even though they may not be aware of it as a formal test design method. Many testers have logically deduced its usefulness, while others have discovered it simply because of lack of time to test more thoroughly.

Consider this situation. We are writing a module for a human resources system that decides how we should process employment applications based on a person's age. Our organization's rules are:

0–16	Don't hire
16–18	Can hire on a part-time basis only
18–55	Can hire as a full-time employee
55–99	Don't hire[*]
[*]Note: If you've spotted a problem with these requirements, don't worry. They are written this way for a purpose and will be repaired in the next chapter.

Observation

With these rules our organization would not have hired Doogie Howser, M.D., or Col. Harland Sanders: one too young, the other too old.

Should we test the module for the following ages: 0, 1, 2, 3, 4, 5, 6, 7, 8, ..., 90, 91, 92, 93, 94, 95, 96, 97, 98, 99? If we had lots of time (and didn't mind the mind-numbing repetition and were being paid by the hour) we certainly could. If the programmer had implemented this module with the following code we should test each age. (If you don't have a programming background don't worry. These examples are simple. Just read the code and it will make sense to you.)

 If (applicantAge == 0) hireStatus="NO";
 If (applicantAge == 1) hireStatus="NO";
 …
 If (applicantAge == 14) hireStatus="NO";
 If (applicantAge == 15) hireStatus="NO";
 If (applicantAge == 16) hireStatus="PART";
 If (applicantAge == 17) hireStatus="PART";
 If (applicantAge == 18) hireStatus="FULL";
 If (applicantAge == 19) hireStatus="FULL";
 …
 If (applicantAge == 53) hireStatus="FULL";
 If (applicantAge == 54) hireStatus="FULL";
 If (applicantAge == 55) hireStatus="NO";
 If (applicantAge == 56) hireStatus="NO";
 …
 If (applicantAge == 98) hireStatus="NO";
 If (applicantAge == 99) hireStatus="NO";

Given this implementation, the fact that any set of tests passes tells us nothing about the next test we could execute. It may pass; it may fail.

Luckily, programmers don't write code like this (at least not very often). A better programmer might write:

 If (applicantAge >= 0 && applicantAge <=16)
     hireStatus="NO";
 If (applicantAge >= 16 && applicantAge <=18)
     hireStatus="PART";
 If (applicantAge >= 18 && applicantAge <=55)
     hireStatus="FULL";
 If (applicantAge >= 55 && applicantAge <=99)
     hireStatus="NO";

Given this typical implementation, it is clear that for the first requirement we don't have to test 0, 1, 2, ... 14, 15, and 16. Only one value needs to be tested. And which value? Any one within that range is just as good as any other one. The same is true for each of the other ranges. Ranges such as the ones described here are called equivalence classes. An equivalence class consists of a set of data that is treated the same by the module or that should produce the same result. Any data value within a class is equivalent, in terms of testing, to any other value. Specifically, we would expect that:

If one test case in an equivalence class detects a defect, all other test cases in the same equivalence class are likely to detect the same defect.
If one test case in an equivalence class does not detect a defect, no other test case in the same equivalence class is likely to detect the defect.
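
As a minimal sketch of this idea in Python, the range-based implementation can be exercised with one value per class. The function name hire_status and the representative ages are illustrative assumptions, not part of the original module. The chosen ages sit away from the shared boundaries 16, 18, and 55, which the requirements leave ambiguous.

```python
def hire_status(applicant_age):
    """Range-based rules, mirroring the range checks shown above.

    The boundary overlaps (16, 18, 55) are kept exactly as written in the
    requirements; a later check overwrites the result of an earlier one.
    """
    status = None
    if 0 <= applicant_age <= 16:
        status = "NO"
    if 16 <= applicant_age <= 18:
        status = "PART"
    if 18 <= applicant_age <= 55:
        status = "FULL"
    if 55 <= applicant_age <= 99:
        status = "NO"
    return status


# One representative per equivalence class, picked arbitrarily from the
# interior of each range (hypothetical choices).
representatives = {8: "NO", 17: "PART", 30: "FULL", 70: "NO"}

for age, expected in representatives.items():
    assert hire_status(age) == expected
```

Any other interior value, say 3 instead of 8, would be expected to tell us the same thing about the first class.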

Key Point

A group of tests forms an equivalence class if you believe that:

They all test the same thing.
If one test catches a bug, the others probably will too.
If one test doesn't catch a bug, the others probably won't either.

Cem Kaner, Testing Computer Software

This approach assumes, of course, that a specification exists that defines the various equivalence classes to be tested. It also assumes that the programmer has not done something strange such as:

 If (applicantAge >= 0 && applicantAge <=16)
     hireStatus="NO";
 If (applicantAge >= 16 && applicantAge <=18)
     hireStatus="PART";
 If (applicantAge >= 18 && applicantAge <=41)
     hireStatus="FULL";
 // strange statements follow
 If (applicantAge == 42 && applicantName == "Lee")
     hireStatus="HIRE NOW AT HUGE SALARY";
 If (applicantAge == 42 && applicantName != "Lee")
     hireStatus="FULL";
 // end of strange statements

 If (applicantAge >= 43 && applicantAge <=55)
     hireStatus="FULL";
 If (applicantAge >= 55 && applicantAge <=99)
     hireStatus="NO";

Using the equivalence class approach, we have reduced the number of test cases from 100 (testing each age) to four (testing one age in each equivalence class)—a significant savings.

Now, are we ready to begin testing? Probably not. What about input values like 969, -42, FRED, and &$#!@? Should we create test cases for invalid input? The answer is, as any good consultant will tell you, "it depends." To understand this answer we need to examine an approach that came out of the object-oriented world called design-by-contract.

Note

According to the Bible, the age of Methuselah when he died was 969 years (Gen 5:27). Thanks to the Gideons who made this data easily accessible in my hotel room without the need for a high speed Internet connection.

In law, a contract is a legally binding agreement between two (or more) parties that describes what each party promises to do or not do. Each of these promises is of benefit to the other.

In the design-by-contract approach, modules (called "methods" in the object-oriented paradigm, but "module" is a more generic term) are defined in terms of pre-conditions and post-conditions. Post-conditions define what a module promises to do (compute a value, open a file, print a report, update a database record, change the state of the system, etc.). Pre-conditions define what that module requires so that it can meet its post-conditions. For example, if we had a module called openFile, what does it promise to do? Open a file. What would legitimate pre-conditions of openFile be? First, the file must exist; second, we must provide the name (or other identifying information) of the file; third, the file must be "openable," that is, it cannot already be exclusively opened by another process; fourth, we must have access rights to the file; and so on. Pre-conditions and post-conditions establish a contract between a module and others that invoke it.

Testing-by-contract is based on the design-by-contract philosophy. Its approach is to create test cases only for the situations in which the pre-conditions are met. For example, we would not test the openFile module when the file did not exist. The reason is simple. If the file does not exist, openFile does not promise to work. If there is no claim that it will work under a specific condition, there is no need to test under that condition.

For More Information

See Bertrand Meyer's book Object-Oriented Software Construction for more on design-by-contract.

At this point testers usually protest. Yes, they agree, the module does not claim to work in that case, but what if the pre-conditions are violated during production? What does the system do? Do we get a misspelled word on the screen or a smoking crater where our company used to be?

A different approach to design is defensive design. In this case the module is designed to accept any input. If the normal pre-conditions are met, the module will achieve its normal post-conditions. If the normal pre-conditions are not met, the module will notify the caller by returning an error code or throwing an exception (depending on the programming language used). This notification is actually another one of the module's post-conditions. Based on this approach we could define defensive testing: an approach that tests under both normal and abnormal pre-conditions.
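
To make the defensive style concrete, here is a small Python sketch. The function names validate_age and rejects are hypothetical, and the 0 to 99 range is taken from the hiring rules earlier in the chapter. The module accepts any input and reports a violated pre-condition by raising an exception, so the exception itself becomes something we can test.

```python
def validate_age(value):
    """Defensive check: accept any input, reject what breaks the pre-condition.

    Returns the age unchanged when it is an integer from 0 to 99.
    Otherwise raises ValueError, which is itself one of the module's
    post-conditions under defensive design.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"age must be an integer, got {value!r}")
    if not 0 <= value <= 99:
        raise ValueError(f"age must be between 0 and 99, got {value}")
    return value


def rejects(value):
    """Test helper: True when validate_age raises ValueError for value."""
    try:
        validate_age(value)
    except ValueError:
        return True
    return False


# Defensive testing covers normal and abnormal pre-conditions alike.
assert validate_age(42) == 42
assert all(rejects(bad) for bad in (969, -42, "FRED", "&$#!@"))
```

Under testing-by-contract, only the first assertion would be written; the second exists only because the module promises to handle abnormal input.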

Insight

A student in one of my classes, let's call him Fred, said he didn't really care which design approach was being used, he was going to always use defensive testing. When I asked why, he replied, "If it doesn't work, who will get the blame - those responsible or the testers?"

How does this apply to equivalence class testing? Do we have to test with inputs like -42, FRED, and &$#!@? If we are using design-by-contract and testing-by-contract the answer is No. If we are using defensive design and thus defensive testing, the answer is Yes. Ask your designers which approach they are using. If they answer either "contract" or "defensive," you know what style of testing to use. If they answer "Huh?" that means they are not thinking about how modules interface. They are not thinking about pre-condition and post-condition contracts. You should expect integration testing to be a prime source of defects that will be more complex and take more time than anticipated.

Technique

The steps for using equivalence class testing are simple. First, identify the equivalence classes. Second, create a test case for each equivalence class. You could create additional test cases for each equivalence class if you have the time and money. Additional test cases may make you feel warm and fuzzy, but they rarely discover defects the first doesn't find.

Insight

A student in one of my classes, let's call her Judy, felt very uncomfortable about having only one test case for each equivalence class. She wanted at least two for that warm and fuzzy feeling. I indicated that if she had the time and money that approach was fine but suggested the additional tests would probably be ineffective. I asked her to keep track of how many times the additional test cases found defects that the first did not and let me know. I never heard from Judy again.

Different types of input require different types of equivalence classes. Let's consider four possibilities. Let's assume a defensive testing philosophy of testing both valid and invalid input. Testing invalid inputs is often a great source of defects.

If an input is a continuous range of values, then there is typically one class of valid values and two classes of invalid values, one below the valid class and one above it. Consider the Goofy Mortgage Company (GMC). They will write mortgages for people with incomes between $1,000/month and $83,333/month. With an income below $1,000/month you don't qualify. With an income over $83,333/month you don't need GMC; just pay cash.

For a valid input we might choose $1,342/month. For invalids we might choose $123/month and $90,000/month.

Figure 3-1: Continuous equivalence classes
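
The three classes for a continuous range can be sketched in a few lines of Python. The function name income_class and the class labels are assumptions made for illustration, as is treating both limits as inclusive; the test values are the ones suggested above.

```python
MIN_INCOME = 1_000    # dollars per month, lower limit from the GMC rule
MAX_INCOME = 83_333   # dollars per month, upper limit from the GMC rule


def income_class(monthly_income):
    """Return the equivalence class a monthly income falls into.

    Assumes both limits are inclusive, which the rule does not state.
    """
    if monthly_income < MIN_INCOME:
        return "invalid-low"
    if monthly_income > MAX_INCOME:
        return "invalid-high"
    return "valid"


# One representative per class.
assert income_class(123) == "invalid-low"
assert income_class(1_342) == "valid"
assert income_class(90_000) == "invalid-high"
```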

If an input condition takes on discrete values within a range of permissible values, there are typically one valid and two invalid classes. GMC will write a single mortgage for one through five houses. (Remember, it's Goofy.) Zero or fewer houses is not a legitimate input, nor is six or greater. Neither are fractional or decimal values such as 2 1/2 or 3.14159.

Figure 3-2: Discrete equivalence classes

For a valid input we might choose two houses. Invalids could be -2 and 8.
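
A discrete input adds one wrinkle that a continuous one lacks: values that lie between the permitted ones. The Python sketch below (houses_class and its labels are hypothetical names) treats fractional values as their own kind of invalid input, alongside the low and high classes.

```python
def houses_class(count):
    """Classify the number of houses requested on a single GMC mortgage."""
    # Fractional or decimal values such as 2.5 or 3.14159 are not
    # legitimate inputs; bool is excluded because it subclasses int.
    if isinstance(count, bool) or not isinstance(count, int):
        return "invalid-not-a-whole-number"
    if count < 1:
        return "invalid-low"
    if count > 5:
        return "invalid-high"
    return "valid"


assert houses_class(2) == "valid"
assert houses_class(-2) == "invalid-low"
assert houses_class(8) == "invalid-high"
assert houses_class(2.5) == "invalid-not-a-whole-number"
```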

GMC will make mortgages only for a person. They will not make mortgages for corporations, trusts, partnerships, or any other type of legal entity.

Figure 3-3: Single selection equivalence classes

For a valid input we must use "person." For an invalid we could choose "corporation" or "trust" or any other random text string. How many invalid cases should we create? We must have at least one; we may choose additional tests for additional warm and fuzzy feelings.

GMC will make mortgages on Condominiums, Townhouses, and Single Family dwellings. They will not make mortgages on Duplexes, Mobile Homes, Treehouses, or any other type of dwelling.

Figure 3-4: Multiple selection equivalence class

For valid input we must choose from "Condominium," "Townhouse," or "Single Family." While the rule says choose one test case from the valid equivalence class, a more comprehensive approach would be to create test cases for each entry in the valid class. That makes sense when the list of valid values is small. But, if this were a list of the fifty states, the District of Columbia, and the various territories of the United States, would you test every one of them? What if the list were every country in the world? The correct answer, of course, depends on the risk to the organization if, as testers, we miss something that is vital.
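
When the valid list is short, testing every entry costs almost nothing. A minimal Python sketch, assuming a hypothetical dwelling_is_valid check backed by a set of the three permitted dwelling types:

```python
VALID_DWELLINGS = {"Condominium", "Townhouse", "Single Family"}


def dwelling_is_valid(dwelling_type):
    """True when GMC writes mortgages on this type of dwelling."""
    return dwelling_type in VALID_DWELLINGS


# The valid class is small, so exercise every member of it.
for dwelling in sorted(VALID_DWELLINGS):
    assert dwelling_is_valid(dwelling)

# The invalid class is open-ended; a few representatives stand in for it.
for dwelling in ("Duplex", "Mobile Home", "Treehouse"):
    assert not dwelling_is_valid(dwelling)
```

For a long list, such as every country in the world, the loop over the valid class would be replaced by a sample whose size reflects the risk of missing an entry.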

Key Point

Rarely will we have the time to create individual tests for every separate equivalence class of every input value.

Now, rarely will we have the time to create individual tests for every separate equivalence class of every input value that enters our system. More often, we will create test cases that test a number of input fields simultaneously. For example, we might create a single test case with the following combination of inputs:

Table 3-1: A test case of valid data values.
Monthly Income	Number of Dwellings	Applicant	Dwelling Types	Result
$5,000	2	Person	Condo	Valid

Each of these data values is in the valid range, so we would expect the system to perform correctly and for the test case to report Pass.

It is tempting to use the same approach for invalid values.

Table 3-2: A test case of all invalid data values. This is *not* a good approach.
Monthly Income	Number of Dwellings	Applicant	Dwelling Types	Result
$100	8	Partnership	Treehouse	Invalid

If the system accepts this input as valid, clearly the system is not validating the four input fields properly. If the system rejects this input as invalid, it may do so in such a way that the tester cannot determine which field it rejected. For example:

ERROR: 653X-2.7 INVALID INPUT

In many cases, errors in one input field may cancel out or mask errors in another field so the system accepts the data as valid. A better approach is to test one invalid value at a time to verify the system detects it correctly.

Table 3-3: A set of test cases varying invalid values one by one.
Monthly Income	Number of Dwellings	Applicant	Dwelling Types	Result
$100	1	Person	SingleFam	Invalid
$1,342	0	Person	Condo	Invalid
$1,342	1	Corporation	Townhouse	Invalid
$1,342	1	Person	Treehouse	Invalid
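
Test cases of this shape can be generated mechanically. The Python sketch below starts from a baseline of valid values and swaps in one invalid value per test case; the field names and the helper one_invalid_at_a_time are illustrative assumptions, while the data values come from Tables 3-1 and 3-2.

```python
# A baseline of valid values, one per input field (from Table 3-1).
VALID_BASELINE = {
    "monthly_income": 5_000,
    "dwellings": 2,
    "applicant": "Person",
    "dwelling_type": "Condo",
}

# One representative invalid value per field (from Table 3-2).
INVALID_VALUES = {
    "monthly_income": 100,
    "dwellings": 8,
    "applicant": "Partnership",
    "dwelling_type": "Treehouse",
}


def one_invalid_at_a_time(baseline, invalid_values):
    """Build one test case per field, replacing only that field's value."""
    cases = []
    for field, bad_value in invalid_values.items():
        case = dict(baseline)      # copy so the baseline stays untouched
        case[field] = bad_value
        cases.append((field, case))
    return cases


cases = one_invalid_at_a_time(VALID_BASELINE, INVALID_VALUES)
for field, case in cases:
    print(f"invalid {field}: {case}")
```

Because each generated case differs from the baseline in exactly one field, a rejection can be attributed to that field.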

For additional warm and fuzzy feelings, the inputs (both valid and invalid) could be varied.

Table 3-4: A set of test cases varying invalid values one by one but also varying the valid values.
Monthly Income	Number of Dwellings	Applicant	Dwelling Types	Result
$100	1	Person	Single Family	Invalid
$1,342	0	Person	Condominium	Invalid
$5,432	3	Corporation	Townhouse	Invalid
$10,000	2	Person	Treehouse	Invalid

Another approach to using equivalence classes is to examine the outputs rather than the inputs. Divide the outputs into equivalence classes, then determine what input values would cause those outputs. This has the advantage of guiding the tester to examine, and thus test, every different kind of output. But this approach can be deceiving. In the previous example, for the human resources system, one of the system outputs was NO, that is, Don't Hire. A cursory view of the inputs that should cause this output would yield {0, 1, ..., 14, 15}. Note that this is not the complete set. In addition {55, 56, ..., 98, 99} should also cause the NO output. It's important to make sure that all potential outputs can be generated, but don't be fooled into choosing equivalence class data that omits important inputs.
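
One way to avoid being fooled is to work backward from the outputs systematically. The Python sketch below groups the ages 0 through 99 by the output they produce. It assumes the range-based implementation from the Introduction, in which a later check overwrites an earlier one at the shared boundaries; hire_status and input_ranges_by_output are hypothetical names.

```python
from itertools import groupby


def hire_status(applicant_age):
    """Range-based rules; a later check overwrites an earlier one."""
    status = None
    if 0 <= applicant_age <= 16:
        status = "NO"
    if 16 <= applicant_age <= 18:
        status = "PART"
    if 18 <= applicant_age <= 55:
        status = "FULL"
    if 55 <= applicant_age <= 99:
        status = "NO"
    return status


def input_ranges_by_output(ages):
    """Map each output to the contiguous input ranges that produce it."""
    ranges = {}
    # groupby yields one group per run of consecutive ages that share
    # the same output, so a repeated output shows up as several runs.
    for status, run in groupby(ages, key=hire_status):
        run = list(run)
        ranges.setdefault(status, []).append((run[0], run[-1]))
    return ranges


ranges = input_ranges_by_output(range(100))

# Under the stated assumption, NO is produced by two separate ranges.
assert ranges["NO"] == [(0, 15), (55, 99)]
```

Each range listed for an output is a candidate equivalence class, so the NO output calls for two input classes rather than one.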

Examples

Example 1

Referring to the Trade Web page of the Brown & Donaldson Web site described in Appendix A, consider the Order Type field. The designer has chosen to implement the decision to Buy or Sell through radio buttons. This is a good design choice because it reduces the number of test cases the tester must create. Had this been implemented as a text field in which the user entered "Buy" or "Sell" the tester would have partitioned the valid inputs as {Buy, Sell} and the invalids as {Trade, Punt, ...}. What about "buy", "bUy", "BUY"? Are these valid or invalid entries? The tester would have to refer back to the requirements to determine their status.

Insight

Let your designers and programmers know when they have helped you. They'll appreciate the thought and may do it again.

With the radio button implementation no invalid choices exist, so none need to be tested. Only the valid inputs {Buy, Sell} need to be exercised.

Example 2

Again, referring to the Trade Web page, consider the Quantity field. Input to this field can be between one and four numeric characters (0, 1, ..., 8, 9) with a valid value greater than or equal to 1 and less than or equal to 9999. A set of valid inputs is {1, 22, 333, 4444} while invalid inputs are {-42, 0, 12345, SQE, $#@%}.
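
The Quantity rule combines a length restriction, a character restriction, and a value range. A Python sketch of a check against that rule follows; quantity_is_valid is a hypothetical name and the real B&D page may validate differently. It is exercised with the valid and invalid sets listed above.

```python
def quantity_is_valid(text):
    """Check a Quantity entry: one to four digits, value from 1 to 9999."""
    if not 1 <= len(text) <= 4:
        return False
    # Compare against ASCII digits only; str.isdigit would also accept
    # digit characters from other scripts.
    if not all(ch in "0123456789" for ch in text):
        return False
    return 1 <= int(text) <= 9999


valid_inputs = ["1", "22", "333", "4444"]
invalid_inputs = ["-42", "0", "12345", "SQE", "$#@%"]

assert all(quantity_is_valid(q) for q in valid_inputs)
assert not any(quantity_is_valid(q) for q in invalid_inputs)
```

Note that each invalid input fails for a different reason: a sign character, a value below the range, too many digits, letters, and special characters.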

Insight

Very often your designers and programmers use GUI design tools that can enforce restrictions on the length and content of input fields. Encourage their use. Then your testing can focus on making sure the requirement has been implemented properly with the tool.

Example 3

On the Trade page the user enters a ticker Symbol indicating the stock to buy or sell. The valid symbols are {A, AA, AABC, AAC, ..., ZOLT, ZOMX, ZONA, ZRAN}. The invalid symbols are any combination of characters not included in the valid list. A set of valid inputs could be {A, AL, ABE, ACES, AKZOY} while a set of invalids could be {C, AF, BOB, CLUBS, AKZAM, 42, @#$%}.

For More Information

Click on the Symbol Lookup button on the B&D Trade page to see the full list of stock symbols.

Example 4

Rarely will we create separate sets of test cases for each input. Generally it is more efficient to test multiple inputs simultaneously within tests. For example, the following tests combine Buy/Sell, Symbol, and Quantity.

Table 3-5: A set of test cases varying invalid values one by one.
Buy/Sell	Symbol	Quantity	Result
Buy	A	10	Valid
Buy	C	20	Invalid
Buy	A	0	Invalid
Sell	ACES	10	Valid
Sell	BOB	33	Invalid
Sell	ABE	-3	Invalid

Applicability and Limitations

Equivalence class testing can significantly reduce the number of test cases that must be created and executed. It is most suited to systems in which much of the input data takes on values within ranges or within sets. It makes the assumption that data in the same equivalence class is, in fact, processed in the same way by the system. The simplest way to validate this assumption is to ask the programmer about their implementation.

Equivalence class testing is equally applicable at the unit, integration, system, and acceptance test levels. All it requires are inputs or outputs that can be partitioned based on the system's requirements.

Summary

Equivalence class testing is a technique used to reduce the number of test cases to a manageable size while still maintaining reasonable coverage.
This simple technique is used intuitively by almost all testers, even though they may not be aware of it as a formal test design method.
An equivalence class consists of a set of data that is treated the same by the module or that should produce the same result. Any data value within a class is equivalent, in terms of testing, to any other value.

Practice

The following exercises refer to the Stateless University Registration System Web site described in Appendix B. Define the equivalence classes and suitable test cases for the following:
1. ZIP Code—five numeric digits.
2. State—the standard Post Office two-character abbreviation for the states, districts, territories, etc. of the United States.
3. Last Name—one through fifteen characters (including alphabetic characters, periods, hyphens, apostrophes, spaces, and numbers).
4. User ID—eight characters at least two of which are not alphabetic (numeric, special, nonprinting).
5. Student ID—eight characters. The first two represent the student's home campus while the last six are a unique six-digit number. Valid home campus abbreviations are: AN, Annandale; LC, Las Cruces; RW, Riverside West; SM, San Mateo; TA, Talbot; WE, Weber; and WN, Wenatchee.

References

Beizer, Boris (1990). Software Testing Techniques. Van Nostrand Reinhold.

Kaner, Cem, Jack Falk, and Hung Quoc Nguyen (1999). Testing Computer Software (Second Edition). John Wiley & Sons.

Myers, Glenford J. (1979). The Art of Software Testing. John Wiley & Sons.
