11.6 Introduction to Statistical Testing

The reliability of a system is a function of the potential for failure in the software. A software failure event, in turn, is precipitated by a software fault in the code. Current thinking in the software development field is that faults in programs are randomly distributed through a software system by processes known only to Nature. For these faults to express themselves as software failures, all of the code must be exercised in some meaningful way during the testing process. The success of the testing process is then measured in a more or less ad hoc fashion: by the extent to which all possible paths of a program have been taken at least once, by whether the program has succumbed to extreme external input values, and so on.

In those program modules that have large values of the fault surrogate measure FI, it is reasonable to suppose that the numbers of faults will also be large. If the code is made to execute a significant proportion of its time in modules having large FI values, then the potential exposure to software faults will be great and, thus, the software will be likely to fail.

The actual number of software faults, like other measures of software quality, can be known only at the point at which the software has finally been retired from service. Then, and only then, can it be said that all of the relevant faults have been isolated and removed from the software system. Software complexity, on the other hand, can be measured very early in the software life cycle. In some cases, these measures of software complexity can be extracted from design documents, and some of them are very good leading indicators of potential software faults. The first step in the software testing process, then, is to construct suitable surrogate measures for software faults.

11.6.1 The Goal of Statistical Testing

The central thesis of statistical testing is that programs fail because of indigenous faults, and that these faults are directly related to measurable software attributes. If you can understand the relationship between faults and specific attributes, you will know where to look for faults. Of particular utility in this role as a fault surrogate is FI. The important thing for the following discussion is that there exists some measurable program module attribute, say φ, that is highly correlated with software faults. Intuitively (and empirically), a program that spends a high proportion of its time executing a module set Mf of high φ values will be more failure prone than one that seeks out program modules with smaller values of this attribute.
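To make this concrete, here is a minimal sketch, assuming (purely for illustration) that each module has a known FI value and that a test scenario's execution profile records the fraction of time spent in each module; the scenario's fault exposure can then be summarized as the profile-weighted sum of FI values. The module names and numbers are invented, not taken from the text.

```python
# Hypothetical sketch: a scenario-level fault-exposure surrogate computed by
# weighting each module's FI value by the proportion of execution time the
# scenario spends in that module.  All names and numbers are illustrative.

def exposure(fi, profile):
    """Profile-weighted FI: sum of p_m * FI_m over the modules executed.

    fi      -- dict mapping module name -> FI (fault index) value
    profile -- dict mapping module name -> fraction of execution time (sums to 1)
    """
    return sum(profile[m] * fi[m] for m in profile)

fi = {"parser": 42.0, "planner": 97.5, "executor": 63.1}

# Two test scenarios with different execution profiles.
scenario_a = {"parser": 0.70, "planner": 0.10, "executor": 0.20}
scenario_b = {"parser": 0.15, "planner": 0.65, "executor": 0.20}

print(exposure(fi, scenario_a))  # spends most of its time in low-FI code
print(exposure(fi, scenario_b))  # spends most of its time in high-FI code: more failure prone
```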

From the earlier discussion, we see that a program can execute any one of a number of basic functionalities. Associated with each of these functionalities is an attendant value of φf. The first step in statistical testing, at the beginning of the test process, is to understand the nature of the variability of φ between the functions that the program can execute. The next step is to understand the nature of the variance of φ within functionalities. The greater the variability in the φ of a program observed during normal test scenarios, the more testing will be required. It is normally presumed that a major objective of software testing is to find all of the potential faults in the software. We cannot, however, claim that a test has been adequate until we have some reasonable assessment of the variability in the behavior of the system.

In this new perspective, the entire nature of the test process is changed. Our objective in this new approach is to understand the program that has been designed. It may well be that the design of a program will not lend itself to testing in any reasonable timeframe, given a high ratio of within-functionality variability to between-functionality variability. We cannot hope to determine the circumstances under which a program might fail until we first understand the behavior of the program itself.
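As a rough illustration of this ratio, the following sketch compares average within-functionality variance with between-functionality variance for a handful of invented φ observations; the data, and the reading of a large ratio as a testability warning, are assumptions made for the example rather than a procedure given in the text.

```python
# Rough sketch of the within- vs. between-functionality variability comparison.
# Each key is a functionality; the list holds the functional-complexity (phi)
# values observed on repeated tests of that functionality.  Data are invented.
from statistics import mean, variance

observed_phi = {
    "f1": [31.2, 30.8, 33.0, 29.9],
    "f2": [58.4, 61.0, 57.7, 60.3],
    "f3": [44.1, 45.0, 43.6, 44.8],
}

# Average within-functionality variance (equal group sizes, so this is the
# pooled estimate): how much phi moves between tests of the SAME functionality.
within = mean(variance(v) for v in observed_phi.values())

# Between-functionality variance: how much mean phi differs ACROSS functionalities.
between = variance([mean(v) for v in observed_phi.values()])

ratio = within / between
print(f"within = {within:.2f}, between = {between:.2f}, ratio = {ratio:.3f}")
# A large ratio suggests individual functionalities are themselves highly
# variable, and the program may not be testable within a reasonable budget.
```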

The next logical step in the statistical testing process is to seek to identify the precise nature of the mapping of functionality to φf. That is, we need to identify the characteristics of test scenarios that cause our criterion measure of φf to be large. Test scenarios whose values of φf are large are those that will most likely provide maximum exposure to the latent faults in a program. In this new view, a program can be stress tested by choosing test cases that maximize φf.

11.6.2 Estimating the Functional Parameters

The initial stage in the testing process is primarily concerned with developing an understanding of the behavior of the software system that has been created by the design process. As such, all test suites during this phase will be carefully architected to express a single functionality. For each of these program functionalities fi, let us define a random variable a(i) on the domain of values of the functional complexity of the ith function. As each test of functionality fi is conducted, we will have a sample data point for a(i). After a sequence of n tests of fi, we can compute the sample mean

ā(i) = (a1(i) + a2(i) + ··· + an(i)) / n

which is an estimate of the parameter μ(i), the mean functional complexity of the ith functionality. Similarly, it will be possible to compute the sample variance

s²(i) = [(a1(i) − ā(i))² + ··· + (an(i) − ā(i))²] / (n − 1)

which is an estimate of the parameter σ²(i), the variance of a(i) for the ith functionality. Note that the contribution to the variability of each test is due largely to the set of modules that a functionality may, but need not, execute. If this set is empty, the range of the functional complexity for that functionality will be considerably constrained.

Without any loss of generality, let us assume that the a(i) are defined on a normal probability distribution. This distribution is specified succinctly by its mean μ(i) and its variance σ²(i), of which the sample mean ā(i) and the sample variance s²(i) are sufficient statistics. It is now possible to construct the standard error of the estimate ā(i) as

s(ā(i)) = s(i) / √n

for a set of n tests of the ith function.
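A minimal sketch of these estimates, assuming we simply have the n observed functional-complexity values for a given functionality in a list (the numbers are invented):

```python
# Per-functionality parameter estimates: sample mean, sample variance, and the
# standard error of the mean, computed from the observed a_j(i) values.
from math import sqrt
from statistics import mean, stdev

a_i = [58.4, 61.0, 57.7, 60.3, 59.1, 62.2]   # n = 6 tests of functionality f_i

n = len(a_i)
a_bar = mean(a_i)        # sample mean, the estimate of mu(i)
s = stdev(a_i)           # sample standard deviation (divisor n - 1)
std_err = s / sqrt(n)    # standard error of the estimate a_bar(i)

print(f"a_bar(i) = {a_bar:.2f}, s^2(i) = {s**2:.2f}, SE = {std_err:.2f}")
```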

Once we have initiated a testing phase, it would be desirable to formulate a mathematical statement of a stopping rule for determining the conclusion of the test phase. A stopping rule for this first test phase is based on the attainment of an a priori condition, stated in terms of a bound b and a significance level α, that

Pr(ā(i) − b ≤ μ(i) ≤ ā(i) + b) ≥ 1 − α

for every functionality fi. That is, the initial test phase will continue until we have attained an estimate for each of the mean functional complexities of the software system within the 100(1 − α) percent confidence limits set as a condition of the test plan.
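Under the normality assumption above, the stopping criterion might be checked as in the following sketch, where the bound b, the level α, and the sample values are invented for illustration:

```python
# Sketch of the stopping rule: testing of functionality f_i continues until the
# half-width of the 100(1 - alpha) percent confidence interval for mu(i) falls
# below the bound b chosen a priori in the test plan.
from math import sqrt
from statistics import mean, stdev
from scipy.stats import norm

def phase_complete(samples, b, alpha):
    """True when the (1 - alpha) confidence half-width for mu(i) is within b."""
    n = len(samples)
    if n < 2:
        return False
    half_width = norm.ppf(1 - alpha / 2) * stdev(samples) / sqrt(n)
    return half_width <= b

a_i = [58.4, 61.0, 57.7, 60.3, 59.1, 62.2]
print(phase_complete(a_i, b=2.0, alpha=0.05))  # keep testing f_i while this is False
```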

As the test process progresses during this first test phase, it may well emerge that the variation in the functional complexity is inordinately large. We can begin to see in advance that a significant amount of the total test resources will be consumed by this first parametric estimation phase alone. In this circumstance, it is quite possible that the software system is not testable. This being the case, we are now in a position to specify that the implementation of certain functionalities, specifically those for which the within-functionality variability is so large as to demand too many test resources, be redesigned and recoded. Not all software can be reasonably or economically tested.

This view represents a fundamental shift in test philosophy. In the past, software testers were obliged to take their best shot at whatever systems were designed and coded by staff outside the test process. Through the systematic introduction of measurement processes in the testing phase of the life cycle it is now possible to set criteria a priori for the testability of software systems.

In looking at the parameter estimation phase of software test, we are working strictly with the within-functionality variability of a design. Let us now focus on the variability among the set of all functionalities. The emphasis in this next phase is on the construction of a sequence of tests that exercise the set of operations O, which represents the user's view of the system.

11.6.3 The Optimal Allocation of Test Resources

It is our thesis that it is entirely unrealistic to suppose that we could construct a fault-free software system, or test a large system exhaustively enough to find all possible faults in it. It is not really of interest that a system has faults in it. What is relevant, however, is that there are no faults in the range of functions that the consumer of the software will typically execute. It would take unlimited test resources to certify that a system is, in fact, fault-free. What we would like to do is to allocate our finite test resources to maximize the exposure of the software system to such faults as might be present.

Let us assume that we are investigating a system designed to implement a set of n functional requirements. As a result of the design process, each of these functions fi will have an associated functional complexity φi. From the initial stages of the test process we will have an estimate ai for φi. We can now formulate an objective function for the allocation of test resources as follows:

Q = a1x1 + a2x2 + ··· + anxn

where xi is the amount of test resources (time) that will be allocated to the test of the ith function and Q is a measure proportional to FI (quality). It would be well to remember that functional complexity is an expected value for the FI of the design implementation of each functionality, and that FI was constructed to be a surrogate for software faults. Thus, a functionality whose functional complexity is large can be expected to be fault prone. By maximizing the value of Q in the objective function above, a test plan maximizes its exposure to the embedded faults in the system. Clearly, the most direct way to maximize Q is to allocate all test resources to the single function fi whose functional complexity is the largest; that is, allocate all resources to the function for which ai > aj for all j ≠ i.

The real object of the test process at this stage is to maximize our exposure to the software faults that will have the greatest impact on the functions the user will be performing with the software. In this sense, the test process is constrained by the customer's operational profile. As per the earlier discussion, a user of the software has a distinct set of operations O that he or she perceives the software will perform. These operations are then implemented in a rather more precise set of software functions F. From this operational profile we can then construct the following constraints on the test process to ensure that each of the user's operations receives its due during the test process:

b11x1 + b12x2 + ··· + b1nxn ≥ c1
b21x1 + b22x2 + ··· + b2nxn ≥ c2
⋮
bm1x1 + bm2x2 + ··· + bmnxn ≥ cm

The coefficients bij are proportionality constants that reflect the implementation of an operation in a set of appropriate functions, such that

bi1 + bi2 + ··· + bin = 1 for each operation i,

and ci represents the test resources assigned to the ith operation. The total test effort T is simply the sum of the time apportioned to each of the functionalities, T = x1 + x2 + ··· + xn.
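Taken together, the objective function and these constraints form a small linear program. The following sketch solves one such program with scipy.optimize.linprog; the ai, bij, and ci values are invented, and a cap on the total test effort T is added here so that the maximization is bounded.

```python
# Sketch: optimal allocation of test resources as a linear program.
# Maximize Q = sum(a_i * x_i) subject to B x >= c_ops (operational-profile
# constraints), sum(x) <= T (total test budget), and x >= 0.
import numpy as np
from scipy.optimize import linprog

a = np.array([35.0, 60.0, 45.0, 80.0])      # estimated functional complexities a_i
B = np.array([[0.6, 0.4, 0.0, 0.0],         # b_ij: how operation i exercises function j
              [0.0, 0.2, 0.5, 0.3],
              [0.1, 0.1, 0.3, 0.5]])
c_ops = np.array([20.0, 30.0, 25.0])        # minimum test resources per operation c_i
T = 120.0                                   # total test resources available

# linprog minimizes, so maximize Q = a.x by minimizing -a.x.
# B x >= c_ops is rewritten as -B x <= -c_ops; the budget row enforces sum(x) <= T.
res = linprog(
    c=-a,
    A_ub=np.vstack([-B, np.ones((1, len(a)))]),
    b_ub=np.concatenate([-c_ops, [T]]),
    bounds=[(0, None)] * len(a),
    method="highs",
)

print("allocation x:", np.round(res.x, 2))
print("maximized Q :", -res.fun)
```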


