Optimal improvement of the lower bound performance of partition testing strategies

T. Y. Chen    Y. T. Yu

Indexing terms: Optimal test distribution, Partition testing, Proportional sampling strategy, Random testing, Software testing, Subdomain testing

Abstract: Although partition testing strategies are intuitively more appealing than random testing, previous empirical and analytical studies show that under unfavourable circumstances partition testing can be very ineffective. The problem of maximally improving the lower bound performance of partition testing by the choice of appropriate test distributions is investigated. An algorithm that generates optimal test distributions for this purpose is proposed and analysed. Moreover, the algorithm can also serve to systematically approximate the proportional sampling strategy, which has previously been proved to be at least as good as random testing.

1 Introduction

Random testing is a common approach to testing a program [1-6]. It simply selects test cases randomly from the input domain, i.e. from the set of all possible inputs to the program. The selection can be based on a uniform distribution, or according to the operational profile of the program’s usage [7]. The main advantages of random testing are that the amount of overhead required in the automatic generation of test cases is minimal, and that it is easily amenable to statistical analysis and estimates of program reliability.

An alternative approach called subdomain testing encompasses strategies in which the input domain is divided into subdomains and test cases are selected from each subdomain [6, 8, 9]. The subdomains are usually formed from the functional specifications or the structural properties of the program. In general, these subdomains may overlap with one another. In this paper, however, we shall restrict our attention to the special case called partition testing, in which all subdomains are actually disjoint [3-5, 10, 11]. For example, in the path coverage testing method, the set of all inputs that traverse a given program path forms a subdomain which is disjoint from all subdomains corresponding to other program paths.

Intuitively, there are strong reasons to believe that subdomain testing should perform better than random testing in revealing failures. First, subdomain testing strategies make use of more information about the specifications or the program code. Secondly, subdomain testing explicitly requires that every subdomain, which usually corresponds to a particular feature of the program or the specification, be ‘exercised’ by selecting and executing at least one test case from the subdomain. Intuitively, in subdomain testing, these features are systematically ‘covered’, whereas in random testing it is likely that some software features never get ‘exercised’. Thirdly, many subdomain testing strategies are meant to be fault-based, i.e. the subdomains are designed in the hope of revealing particular faults more successfully. Indeed, Weyuker and Jeng [10] have argued that fault-based subdomain testing strategies can be very effective in uncovering faults in programs.

© IEE, 1997. IEE Proceedings online no. 19971792. Paper first received 21st May and in revised form 10th November 1997. The authors are with the Department of Computer Science, University of Melbourne, Parkville 3052, Australia.

Unfortunately, recent studies [1, 2, 4, 6, 10, 11] have shown that general subdomain testing is not always better than random testing. Worse still, in extreme cases the totality of all n test cases selected by a subdomain testing strategy may be only marginally more effective than one single test case selected using random testing [4, 6, 10]. This worst case occurs when the actual failure-causing inputs (the inputs that produce incorrect outputs) all reside in one particular subdomain from which only one test case is selected, and the size of that subdomain is comparable to the entire input domain. Those test cases selected from the subdomains with no failure-causing inputs are simply ‘wasted’, and the only useful test case is slightly more effective than one test case selected from the entire input domain.

In this paper, we show how the lower bound performance of partition testing can actually be maximally improved by appropriately allotting test cases to the subdomains. We shall describe and analyse an algorithm that computes the optimal test distributions for this purpose. Our algorithm requires only nominal computation overhead once a partitioning scheme has been chosen. Moreover, the algorithm can also serve to approximate the proportional sampling strategy [12], which is a special class of partition testing strategies that has been proved to have a better chance than random testing in detecting at least one failure [4]. The algorithm therefore can replace the informal intuitive guidelines suggested by Chan et al. [12].

2 Optimal test distributions

Consider the following situation: given a program to test, the tester has determined that there are k testing requirements, each of which must be satisfied at least once. For example, the program may perform one of k functions depending on which input is selected. Resource considerations usually limit the number n of test cases to be executed. A problem naturally arises: what is the best way of distributing these n test cases?

IEE Proc.-Softw. Eng., Vol. 144, No. 5-6, October-December 1997

Of course, if n < k, it would be impossible to test all the k functions using n test cases. The problem is also trivial if n = k, as the only solution then is to select exactly one test case for each requirement. In practice, we may be able to run more test cases, and if so, the performance of the testing varies with the distribution of test cases.

Several recent studies [3, 13, 14] have suggested different test distributions that will optimise certain aspects of the quality of the testing. Tsoukalas et al. [3] compare random testing with partition testing by looking at the upper confidence bounds for the cost-weighted performance of the two strategies. They show that when no failures are detected, the test distribution is optimal if the number of test cases selected from a subdomain D_i is proportional to the product of the cost of failure for D_i and the probability that a randomly selected input will belong to D_i.

Li and Malaiya [13] show that for the purpose of defect removal, the optimal test distribution depends on the operational profile and the defect detectability profile of the program. They suggest that test inputs should have a distribution more biased than the operational profile towards the frequently used domains if only limited testing can be afforded, but that they should be more evenly distributed if very high reliability is to be achieved through extensive testing.

Gutjahr [14], on the other hand, presents a method to determine a test distribution that yields an unbiased estimator of the software failure cost with minimum variance.

In this paper, our concern is the fault-detecting ability of partition testing strategies. Ideally, we would like to find a test distribution that would guarantee that the testing will always be effective. This is only possible, however, if we have some information or assumptions on the location of the failure-causing inputs. For example, we have shown in [5, 6] that if we know the relative failure rate of each subdomain, then there are several ways of distributing the test cases to ensure that the testing would be more effective than random testing.

Unfortunately, although the number and locations of the failures are fixed for a given program, more often than not the tester has no knowledge of them prior to testing. If, for the chosen test distribution, the failure-causing inputs happen to be located unfavourably, the testing can be very ineffective. This should clearly be avoided if possible, especially when software failure costs are high.

Although recent studies have proved that some special classes of partition testing strategies are at least as effective as random testing, there are associated practical difficulties yet to be resolved. For instance, Weyuker and Jeng [10] observe that this is true if all subdomains are equal in size and the same number of test cases is selected from each subdomain. However, in practice, subdomains rarely have the same size. In [4] we have shown that the restriction of equal subdomain sizes can be removed, provided that test cases are allocated in proportion to the size of the subdomains. Such a way of allocating test cases is called the proportional sampling strategy [12], but it is not always possible to follow this strategy strictly [12], as the number of test cases must be an integer. We shall discuss this in more detail in Section 5.

In the following work, we investigate the test distribution problem under the assumption of a complete lack of information about the failures. We address this problem using the maximin criterion, which is well known in the fields of artificial intelligence, games and decisions, operations research and business studies. To cope with the uncertainty due to the lack of information, the maximin criterion aims at getting the best out of the worst cases. However, since the selection of test cases is done probabilistically, we consider the ‘best’ or ‘worst’ only in terms of a probabilistic measure of the effectiveness of the testing.

In the literature, two metrics have been commonly used to assess the fault-detecting ability of a testing strategy. They are the P-measure, the probability of detecting at least one failure [4, 10], and the E-measure, the expected number of failures detected [6, 9]. In [6] we have shown that the two measures are not only closely related but also possess very similar properties. In particular, when all failure rates (the proportions of the failure-causing inputs in the subdomains) are small, the value of the E-measure is a good approximation of the P-measure. For these reasons, we shall mainly use the simpler E-measure to facilitate our analysis. The effect of using the P-measure will be discussed in Section 5.

Thus, in our context, the maximin criterion chooses the test distribution whose lower bound E-measure is the maximum possible. This ensures that, with the same total number of test cases, partition testing will on average perform as well as possible, even when the failure-causing inputs are located adversely.

One may wonder whether it would be too conservative to use the maximin criterion in practice. However, we believe that there are ample reasons for testers to take a more conservative approach. First, if failures are numerous, easy to detect or located favourably, they will probably be caught with most test distributions anyway, so there is no need to worry about the favourable circumstances. On the other hand, we do need to be concerned with the possibility of our tests being too ineffective, especially when software failure costs are high. Ineffective tests not only leave possible program faults undetected, they also reduce our confidence in the program’s reliability. Given the lack of information about the actual failures, it is more advisable to avoid unnecessary risks than to hope for the best.

Although the maximin criterion aims principally at improving the worst case performance, it also improves the effectiveness of the testing in other ways. In a sense, improving the worst case helps increase the probability that the program is correct, particularly when no failures are detected. Indeed, the reliability and dependability of a program are often based on conservative measures rather than optimistic ones. By improving the worst case performance, the conservative estimate of the program’s reliability or dependability is also raised.

3 Formulation of the MME problem

Let us introduce some notation to be used throughout this paper. Let D denote the input domain and D_i

denote the ith subdomain, i = 1, ..., k, where k is the number of subdomains. Those inputs in D that produce incorrect outputs are called failure-causing inputs. Let d, n and m be the total numbers of inputs, test cases selected and failure-causing inputs, respectively, in the entire input domain D. If m = 0 or m = d, then the testing will be equally (in)effective regardless of the test distribution chosen. Thus, we assume d > m > 0. We also assume that d ≥ k and n ≥ k.

Following previous studies [4, 6, 10], we assume that test cases are selected randomly and uniformly from each subdomain, and that the cost of a failure is the same for all inputs. The fault-detecting ability of the testing, as measured by the expected number of failures detected (the E-measure) [6], is given by

E = Σ_{i=1}^{k} (n_i m_i) / d_i        (1)

where d_i, n_i and m_i are the numbers of inputs, test cases selected and failure-causing inputs in subdomain D_i, respectively. As in other studies on optimal test distributions [3, 13, 14], we have assumed that all subdomains are disjoint, and so Σ_{i=1}^{k} d_i = d, Σ_{i=1}^{k} n_i = n and Σ_{i=1}^{k} m_i = m.
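As a concrete illustration, eqn. 1 is a simple sum over the subdomains and can be computed directly. The following Python sketch (the function name is ours, not from the paper) uses exact rational arithmetic:

```python
from fractions import Fraction

def e_measure(d, nv, mv):
    """E-measure (eqn. 1): expected number of failures detected.

    d[i], nv[i] and mv[i] are the subdomain size, the number of test
    cases selected and the number of failure-causing inputs in
    subdomain i, respectively.
    """
    return sum(Fraction(ni * mi, di) for di, ni, mi in zip(d, nv, mv))
```

For example, two subdomains of size 10 with test vector (2, 3) and failure vector (1, 2) give E = 2·1/10 + 3·2/10 = 4/5.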

To facilitate our exposition, we adopt the following notation and definitions:

(i) I_k and I_d denote the sets of all positive integers not greater than k and d, respectively. Thus we write i ∈ I_k instead of i = 1, ..., k, and m ∈ I_d instead of 0 < m ≤ d.

(ii) A failure distribution, or a failure vector, m = (m_1, ..., m_k) is a k-tuple of integers representing a particular distribution of failure-causing inputs in the subdomains.

(iii) W(m) denotes the set of all possible distributions of m failure-causing inputs, i.e.

W(m) = { (m_1, ..., m_k) : 0 ≤ m_i ≤ d_i for all i ∈ I_k, and Σ_{i=1}^{k} m_i = m }

(iv) A test distribution, or a test vector, n = (n_1, ..., n_k) is a k-tuple of integers representing the distribution of test cases in the subdomains.

(v) V(n) denotes the set of all possible distributions of n test cases, i.e.

V(n) = { (n_1, ..., n_k) : n_i ≥ 1 for all i ∈ I_k, and Σ_{i=1}^{k} n_i = n }

(vi) A test vector n is said to be feasible if n ∈ V(n).

(vii) The sampling rate σ_i of a subdomain D_i is defined as σ_i = n_i/d_i. Similarly, a sampling rate vector σ = (σ_1, ..., σ_k) is a k-tuple of positive real numbers representing the sampling rates of all the subdomains.

As explained, our objective is to find a test distribution that maximises the lower bound performance of the testing when a particular partitioning scheme has been chosen, i.e. when the values of k, d_1, ..., d_k are fixed. With this in mind, the E-measure depends on the actual failure distribution m and the chosen test distribution n. We shall make this dependency explicit by denoting the E-measure as E(m, n).

For any fixed m and n and any feasible test vector n ∈ V(n), we define the lower bound E-measure of the test vector as

E_(m, n) = min_{m ∈ W(m)} E(m, n)

The MaxiMin E-measure (MME) Problem is to find feasible test vectors n ∈ V(n) such that the lower bound E-measure is maximised, i.e. to solve the following equation for n:

E_(m, n) = max_{n′ ∈ V(n)} E_(m, n′)        (2)

4 Maximin algorithm for solving the MME problem

From our formulation of the MME problem, it is obvious that the solutions depend on the actual number m of failure-causing inputs and the total number n of test cases selected. We denote by S(m, n) the set of all solutions n of eqn. 2. For the time being, we shall treat the value of n as predetermined by the tester and therefore known in advance (but see Section 5 for the possibility of varying n). While the number m of failure-causing inputs is fixed for any given program, we do not have any knowledge of it prior to testing. Indeed, we shall show at the end of this Section that, in general, no single test distribution can be optimal for all possible values of m. Our primary concern, however, is when the number of failure-causing inputs is small relative to the size of the input domain. After all, when m is large, failures will be relatively easy to detect, and the difference in performance between using subdomain and random testing would be much less significant. Fortunately, when m is small, there does exist an optimal test distribution.

We shall first present an algorithm that solves the Single Failure MME Problem, which is the MME problem restricted to the particular case when m = 1. Then we shall show that the algorithm also produces solutions to the general Multiple Failures MME Problem, provided that m is small relative to d.

4.1 Single failure MME problem

We first consider the special case when m = 1. Suppose that the failure-causing input is in D_j. Then m_j = 1 and m_i = 0 for all i ≠ j. By eqn. 1, E(m, n) = n_j/d_j = σ_j. Thus, the lower bound E-measure of an arbitrary test vector n will be given by

E_(1, n) = min_{i ∈ I_k} σ_i

Simple as it is, this observation illuminates the direction for solving the MME problem. It tells us that, for any test distribution, the testing will be least effective if the failure-causing input falls in the subdomain that has the lowest sampling rate. To avoid the testing becoming too ineffective, when we allocate test cases incrementally, priority should be given at each step to the subdomains with the lowest sampling rate so far.

We propose an incremental algorithm, called the Maximin Algorithm, to achieve this. Since it is necessary that n_i ≥ 1 for all i, we initially allocate one test case to every subdomain to ensure coverage. In each subsequent step, we allocate one test case to a subdomain that currently has the lowest sampling rate,


until all test cases are exhausted. Formally, the algorithm is stated as follows:

The Maximin Algorithm
(i) Set n_i := 1 and σ_i := 1/d_i for i = 1, ..., k.
(ii) Set q := n − k.
(iii) While q > 0, repeat the following:
(a) Find j ∈ I_k such that σ_j = min_{i ∈ I_k} σ_i.
(b) Set n_j := n_j + 1.
(c) Set σ_j := σ_j + 1/d_j.
(d) Set q := q − 1.

For convenience, we shall refer to a test vector returned by the Maximin Algorithm as a maximin test vector.

Obviously, the algorithm always terminates, although it is non-deterministic: in step (iii)(a), when there is more than one j such that σ_j ≤ σ_i for all i ∈ I_k, different choices of j result in different test vectors. Thus, it is meaningful to speak of the set A(n) of all maximin test vectors.
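The steps above translate directly into code. The following Python sketch (our own rendering, not from the paper) resolves the non-determinism by always choosing the lowest-indexed subdomain among those with the minimum sampling rate, and uses exact rational arithmetic so that ties are detected exactly:

```python
from fractions import Fraction

def maximin(d, n):
    """Maximin Algorithm: distribute n test cases over k subdomains of
    sizes d[0..k-1], returning one maximin test vector."""
    k = len(d)
    assert n >= k, "need at least one test case per subdomain"
    nv = [1] * k                                   # step (i): ensure coverage
    rates = [Fraction(1, di) for di in d]          # sampling rates n_i / d_i
    for _ in range(n - k):                         # steps (ii) and (iii)
        j = min(range(k), key=lambda i: rates[i])  # lowest sampling rate
        nv[j] += 1                                 # step (iii)(b)
        rates[j] += Fraction(1, d[j])              # step (iii)(c)
    return nv
```

For instance, for subdomain sizes (20, 40, 50) and n = 6 it returns [1, 2, 3], in line with Example 4.4 below; note that a different tie-breaking rule would yield a different, equally valid, maximin test vector.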

We now establish two lemmas showing that the algorithm always produces feasible test vectors that satisfy an important property.

Lemma 4.1: The Maximin Algorithm always returns a feasible test vector, i.e. A(n) ⊆ V(n).

Proof: That n_i ≥ 1 for all i should be obvious. Also, note that q + Σ_{i=1}^{k} n_i = n is an invariant of the loop. Upon exit, q = 0, and hence Σ_{i=1}^{k} n_i = n.

Lemma 4.2: Let n* = (n*_1, ..., n*_k) ∈ A(n) and σ*_min = min_{i ∈ I_k} n*_i/d_i. Then for all i ∈ I_k,

(n*_i − 1)/d_i ≤ σ*_min

(See Appendix for proof.)

Lemma 4.2 tells us that a maximin test vector always has the property that removing a test case from any subdomain will reduce the corresponding sampling rate to a minimum among all subdomains. This property will be used in the Appendix to prove the correctness of the Maximin Algorithm.

Proposition 4.3: The Maximin Algorithm always produces solutions to the Single Failure MME Problem, i.e. A(n) ⊆ S(1, n). (See Appendix for proof.)

Thus, any maximin test vector is a solution to the Single Failure MME Problem, although the converse is not true: not all solutions to the Single Failure MME Problem can be generated by the Maximin Algorithm. Moreover, since the test vector generated by the Maximin Algorithm may not be unique, in general there can be more than one solution to the Single Failure MME Problem. This is illustrated by the following example.

Example 4.4: Suppose k = 3, m = 1, d_1 = 20, d_2 = 40 and d_3 = 50. If n = 6, the Maximin Algorithm gives the unique test vector n* = (1, 2, 3). The corresponding sampling rate vector is σ* = (0.05, 0.05, 0.06), giving E_(1, n*) = 0.05.

Now if one more test case is to be selected using the Maximin Algorithm, it can be selected either from D_1 or D_2. In either case the lower bound E-measure remains 0.05.

However, if we choose the last test case from D_3 instead, then the lower bound E-measure is still 0.05.


To summarise, when n = 7, A(n) = {(2, 2, 3), (1, 3, 3)} and S(1, n) = A(n) ∪ {(1, 2, 4)}.
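For small cases, the set S(1, n) can be checked by brute force: enumerate all feasible test vectors and keep those maximising the lowest sampling rate. A Python sketch (exponential enumeration, for illustration only; the helper names are ours):

```python
from fractions import Fraction
from itertools import product

def single_failure_solutions(d, n):
    """Brute-force S(1, n): all feasible test vectors maximising the
    lower bound E-measure min_i n_i/d_i."""
    k = len(d)
    # feasible test vectors: n_i >= 1 for all i, summing to n
    feasible = [v for v in product(range(1, n - k + 2), repeat=k)
                if sum(v) == n]
    def low(v):  # lower bound E-measure of test vector v (single failure)
        return min(Fraction(vi, di) for vi, di in zip(v, d))
    best = max(low(v) for v in feasible)
    return {v for v in feasible if low(v) == best}
```

For d = (20, 40, 50) and n = 7 this yields {(2, 2, 3), (1, 3, 3), (1, 2, 4)}, confirming that S(1, n) strictly contains A(n) in this case.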

4.2 Multiple failures MME problem

We now turn to the general case of m ≥ 1. First, we need to identify the failure distribution m that minimises the E-measure for a given test vector n. We have seen that in the single failure case, the E-measure is minimised when the failure-causing input is in the subdomain of the lowest sampling rate. Suppose we now have another failure-causing input. To keep the E-measure minimal, this input should also be placed in a subdomain with a lower sampling rate rather than one with a higher sampling rate. This observation is stated in the next lemma.

Lemma 4.5: Let n ∈ V(n) with σ_i > σ_j. Let m ∈ W(m) be such that m_i ≥ 1 and m_j < d_j. Suppose that the failure vector m′ is formed by moving one failure-causing input of m from D_i to D_j. Then E(m′, n) < E(m, n).

Proof: This follows immediately from the fact that E(m′, n) − E(m, n) = σ_j − σ_i < 0.

Thus, to keep the E-measure as small as possible, failure-causing inputs should occur in a subdomain with the lowest sampling rate. If that subdomain is full of failure-causing inputs, the remaining failure-causing inputs should be placed in the subdomain with the next lowest sampling rate until it too is full, and so on. Our next proposition precisely identifies the resulting worst case failure distribution.

Proposition 4.6: Let n ∈ V(n) and m ∈ W(m). Suppose that the subdomains have been renumbered in such a way that σ_1 ≤ ... ≤ σ_k. Let s_1 = 0 and let s_i denote the total size of the first i − 1 subdomains for i > 1. Then E(m, n) = E_(m, n) if m represents the distribution in which all the m failure-causing inputs cluster in the subdomains with the lowest sampling rates, i.e. if there exists h ∈ I_k such that s_h < m ≤ s_h + d_h and

m_i = d_i        if i < h
m_i = m − s_h    if i = h
m_i = 0          if i > h

(See Appendix for proof.)

Proposition 4.6 shows that when the E-measure is a minimum, in general the failure-causing inputs may occur in several subdomains. Of special interest is the particular case when all failure-causing inputs are in one single subdomain. If so, then the lower bound E-measure is determined solely by the lowest sampling rate among the subdomains, and the Maximin Algorithm developed for the single failure case can be used. We need a lemma before proving this.

Lemma 4.7: Let n* ∈ A(n) and n ∈ V(n) be such that their lowest sampling rates are respectively σ*_j = min_{i ∈ I_k} σ*_i and σ_h = min_{i ∈ I_k} σ_i. If m ≤ d_j and m ≤ d_h, then E_(m, n) ≤ E_(m, n*).

Proof: By Proposition 4.3, n* ∈ S(1, n). Hence σ*_j ≥ σ_h. Since m ≤ d_j and m ≤ d_h, we have E_(m, n) = m σ_h ≤ m σ*_j = E_(m, n*).

Proposition 4.8: If every subdomain is large enough to hold all the failure-causing inputs, then the Maximin Algorithm will also produce solutions to the Multiple Failures MME Problem. In other words, A(n) ⊆ S(m, n) if m ≤ d_i for all i ∈ I_k.
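Proposition 4.6 also gives a direct way to compute the lower bound E-measure of any test vector: fill the subdomains with failure-causing inputs in order of increasing sampling rate. A Python sketch (the helper name is ours, under the assumptions stated above):

```python
from fractions import Fraction

def lower_bound_e(d, nv, m):
    """E_(m, n): lower bound E-measure of test vector nv against m
    failure-causing inputs, via the worst-case failure distribution of
    Proposition 4.6 (fill lowest sampling rates first)."""
    order = sorted(range(len(d)), key=lambda i: Fraction(nv[i], d[i]))
    total, remaining = Fraction(0), m
    for i in order:
        mi = min(remaining, d[i])           # fill subdomain i
        total += Fraction(nv[i], d[i]) * mi # contribution n_i * m_i / d_i
        remaining -= mi
        if remaining == 0:
            break
    return total
```

For example, for subdomain sizes (50, 23, 15, 12) and test vector (6, 3, 2, 1), the 30 worst-placed failure-causing inputs give E_ = 12·(1/12) + 18·(6/50) = 3.16.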


Proof: Let n* ∈ A(n). For any n ∈ V(n), if m ≤ d_i for all i ∈ I_k, then by Lemma 4.7, E_(m, n) ≤ E_(m, n*). Hence n* ∈ S(m, n).

Proposition 4.8 requires that the number of failure-causing inputs does not exceed the size of the smallest subdomain. This should not cause any problem if all subdomains are not too small and there are not too many failure-causing inputs. But if there is an extremely small subdomain, this condition may turn out to be very restrictive. For instance, if one of the subdomains contains a single input, then only the single failure case will be able to satisfy the condition in Proposition 4.8.

We note, however, that Lemma 4.7 requires only that the number of failure-causing inputs does not exceed the size of the subdomain that has the lowest sampling rate. Now if a subdomain is relatively small, its sampling rate will correspondingly be relatively large even if only a single test case is selected from it. In other words, the subdomain that has the lowest sampling rate will never be the smallest subdomain. Taking this into account, we find another sufficient condition as follows:

Proposition 4.9: If m ≤ d/n, then A(n) ⊆ S(m, n).

Proof: Let n* ∈ A(n) and n ∈ V(n), with lowest sampling rates σ*_j = min_{i ∈ I_k} σ*_i and σ_h = min_{i ∈ I_k} σ_i respectively. Then σ_h = (1/d) Σ_{i=1}^{k} d_i σ_h ≤ (1/d) Σ_{i=1}^{k} d_i σ_i = (1/d) Σ_{i=1}^{k} n_i = n/d. Thus m ≤ d/n ≤ 1/σ_h ≤ n_h/σ_h = d_h. In a similar way, it can be proved that m ≤ d_j. By Lemma 4.7, E_(m, n) ≤ E_(m, n*). Since this is true for any arbitrary n ∈ V(n), we have n* ∈ S(m, n).

Note that the conditions stated in Propositions 4.8 and 4.9 are not necessary ones. This means that the Maximin Algorithm also works in many cases even when m > d/n and m > d_i for some i, as shown in the following example.

Example 4.10: Suppose k = 4, n = 12, d_1 = 50, d_2 = 23, d_3 = 15 and d_4 = 12. Here d/n = 100/12 ≈ 8.33 and min_{i ∈ I_k} d_i = 12. Then the only maximin test vector is n* = (5, 3, 2, 2), with sampling rate vector σ* = (0.100, 0.130, 0.133, 0.167).

Now, by Propositions 4.8 and 4.9, n* is a solution to the Multiple Failures MME Problem for m ≤ 12 and m ≤ 8.33, respectively. However, it can be verified that n* is actually the only solution to the Multiple Failures MME Problem for m ≤ 21.

In fact, if m > d/n, then even for random testing the expected number of failures detected will be greater than one. Moreover, the probability P_r of detecting at least one failure for random testing will then be given by

P_r = 1 − (1 − m/d)^n
    > 1 − (1 − 1/n)^n        since m > d/n
    ≥ 1 − e^(−1)             since (1 − 1/n)^n is increasing in n
    ≈ 0.6321

In this case even random testing has a fair chance of detecting failures, and hence the maximin criterion may not be the most appropriate one. So, for all practical purposes where the maximin criterion is to be used, the Maximin Algorithm should suffice to solve the Multiple Failures MME Problem.
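The bound above is easy to check numerically. A small Python sketch (the helper name is ours), assuming random testing draws n inputs uniformly with replacement from the whole domain:

```python
import math

def random_p_measure(d, m, n):
    """Probability that random testing (n uniform draws, with
    replacement, from a domain of size d containing m failure-causing
    inputs) detects at least one failure."""
    return 1.0 - (1.0 - m / d) ** n
```

With d = 100 and n = 12 as in Example 4.10, any m > d/n ≈ 8.33 (say m = 9) already gives P_r above 1 − e^(−1) ≈ 0.6321.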

To complete our analysis, we also present the following example, which demonstrates that the set of solutions of the Multiple Failures MME Problem may actually depend on the exact value of m if m is not restricted to being ‘small’. In other words, no test distribution can be a solution to the MME problem for all m in such cases.

Example 4.11: Consider example 4.10, where k = 4, n = 12, d₁ = 50, d₂ = 23, d₃ = 15 and d₄ = 12. It can easily be shown that the test vector π̂ = (5, 3, 2, 2), with sampling rate vector σ̂ = (0.1, 0.130, 0.133, 0.167), is the only solution to the Multiple Failures MME Problem for m ≤ 21.

However, for larger values of m, π̂ may not be a solution to the Multiple Failures MME Problem. For example, when m = 30, the test vector π = (6, 3, 2, 1), with sampling rate vector σ = (0.120, 0.130, 0.133, 0.083), has a greater lower bound E-measure. This is because, by proposition 4.6, E̲(30, π) = 12 × 0.083 + (30 − 12) × 0.12 ≈ 3.16, whereas E̲(30, π̂) = 30 × 0.1 = 3. Thus E̲(30, π) > E̲(30, π̂) and therefore π̂ ∉ S(30, n). Moreover, since π̂ is the unique element in S(1, n), it is clear that no test distribution can be optimal for both m = 1 and m = 30.
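The lower bound E-measure values used in this example follow mechanically from proposition 4.6: cluster the m failure-causing inputs in the subdomains with the lowest sampling rates. A minimal sketch of that computation (our own illustration; the function name is an assumption, not the paper's):

```python
def lower_bound_e_measure(m, sizes, alloc):
    """Worst-case expected number of failures detected, obtained by
    clustering the m failure-causing inputs in the subdomains with the
    lowest sampling rates n_i/d_i (proposition 4.6)."""
    remaining, expected = m, 0.0
    # Visit subdomains in order of increasing sampling rate.
    for rate, d_i in sorted((n_i / d_i, d_i) for n_i, d_i in zip(alloc, sizes)):
        m_i = min(remaining, d_i)     # fill this subdomain with failures
        expected += m_i * rate
        remaining -= m_i
        if remaining == 0:
            break
    return expected

sizes = [50, 23, 15, 12]
# For m = 30, pi = (6, 3, 2, 1) has a greater lower bound E-measure
# than the maximin vector (5, 3, 2, 2): about 3.16 versus 3.0.
assert lower_bound_e_measure(30, sizes, [6, 3, 2, 1]) > \
       lower_bound_e_measure(30, sizes, [5, 3, 2, 2])
```

Running this on example 4.11's data reproduces the comparison in the text.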

5 Discussion

We have basically solved the MME problem for a small number of failure-causing inputs. In this Section, we discuss some variations of the MME problem.

5.1 MaxiMin P-measure problem

As mentioned earlier, the P-measure, i.e. the probability of detecting at least one failure, is another commonly used metric for quantifying the effectiveness of partition testing strategies. Its value is given by the following formula [4, 10]:

P(m, π) = 1 − Π_{i=1}^{k} (1 − m_i/d_i)^{n_i}

The MaxiMin P-measure (MMP) Problem can be formulated in a similar way to the MME problem. Thus, for fixed m and n and any feasible test vector π ∈ V(n), the lower bound P-measure of the test vector is defined as

P̲(m, π) = min_{m ∈ W(m)} P(m, π)    (3)

The MMP problem is then to find feasible test vectors π* ∈ V(n) such that the lower bound P-measure is maximised, i.e. to solve the following equation for π*:

P̲(m, π*) = max_{π ∈ V(n)} P̲(m, π)    (4)

However, since the formula for the P-measure is more complicated, it is expected that an exact analysis of the MMP problem would be much more difficult. In [8], we performed an analysis of the worst case failure distribution using the P-measure. For fixed m and n and any test vector π ∈ V(n), we have successfully characterised a failure distribution m that would lead to the lower bound P-measure of π, and we have also found

IEE Proc-Softw. Eng., Vol. 144, No. 5-6. October-December 1997 215

the precise value of P̲(m, π) [8]. Essentially, we have solved eqn. 3, but the MMP problem, i.e. the solution to eqn. 4, remains open.

Fortunately, we have shown in [6] that the P-measure possesses very similar properties to the E-measure, and in particular the former can be approximated by the latter when all failure rates involved are small. In view of this, when the number of failure-causing inputs is small, it is reasonable to conjecture that the Maximin Algorithm developed in this paper should produce test vectors that are at least close to the solutions of the MaxiMin P-measure Problem.
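To illustrate the approximation numerically (our own sketch, not the paper's code; it assumes the standard formulas P(m, π) = 1 − Π_i (1 − m_i/d_i)^{n_i} and E(m, π) = Σ_i n_i m_i/d_i):

```python
def p_measure(ms, sizes, alloc):
    """Probability of detecting at least one failure."""
    miss = 1.0
    for m_i, d_i, n_i in zip(ms, sizes, alloc):
        miss *= (1.0 - m_i / d_i) ** n_i
    return 1.0 - miss

def e_measure(ms, sizes, alloc):
    """Expected number of failures detected."""
    return sum(n_i * m_i / d_i for m_i, d_i, n_i in zip(ms, sizes, alloc))

sizes, alloc = [50, 23, 15, 12], [5, 3, 2, 2]
ms = [1, 0, 0, 0]                # one failure-causing input: all failure rates small
p, e = p_measure(ms, sizes, alloc), e_measure(ms, sizes, alloc)
assert p <= e                    # the P-measure never exceeds the E-measure
assert abs(p - e) < 0.01         # and the two nearly coincide here
```

With the single failure-causing input above, P ≈ 0.096 while E = 0.1, so a test vector that is good for one measure is very nearly as good for the other.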

5.2 Approximating the proportional sampling strategy

We have proved in [4] that if test cases are selected in proportion to the subdomain sizes (i.e. if proportional sampling is used), then partition testing will be at least as good as random testing in terms of the probability of detecting at least one failure. Very often, the proportional sampling strategy cannot be strictly applied, as the number of test cases selected from each subdomain must be a positive integer. For instance, consider example 4.4, where k = 3, d₁ = 20, d₂ = 40 and d₃ = 50. If n = 6, then proportional sampling would require n₁ = 1.09, n₂ = 2.18 and n₃ = 2.73, which is clearly not feasible. Chan et al. [12] have proposed several practical guidelines to approximate the strategy, but their approach is less formal and basically an ad hoc one.

In fact, the Maximin Algorithm can also be viewed as a way of approximating the proportional sampling strategy. Intuitively, the algorithm iteratively allocates test cases to the relatively under-sampled subdomains. If the subdomain sizes and the number of test cases are such that proportional sampling is actually possible, the algorithm will eventually produce such a test distribution. Otherwise, the Maximin Algorithm will produce test distributions that follow the proportional sampling strategy as closely as practically possible. In example 4.4, for instance, the solution test vector for n = 6 is (1, 2, 3), which is reasonably close to the test distribution (1.09, 2.18, 2.73) required by strictly following the proportional sampling strategy. Indeed, applying the Maximin Algorithm to the examples used in [12] actually gives the same test distributions as those resulting from the use of their approximation methods.
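This greedy, iterative character can be sketched as follows. The code below is our own reading of the algorithm as described in this section, not a transcription of the paper's pseudocode: allocate one test case to each subdomain, then repeatedly give the next test case to a subdomain with the lowest current sampling rate (tie-breaking by lowest index is an assumption):

```python
def maximin_allocation(sizes, n):
    """Greedy sketch of the Maximin Algorithm: each successive test case is
    given to a subdomain with the lowest sampling rate n_i/d_i, so the
    minimum rate rises as fast as possible. Requires n >= k."""
    k = len(sizes)
    if n < k:
        raise ValueError("need at least one test case per subdomain")
    alloc = [1] * k                       # step (i): one test case per subdomain
    for _ in range(n - k):
        # Subdomain with the lowest sampling rate (lowest index wins ties).
        j = min(range(k), key=lambda i: alloc[i] / sizes[i])
        alloc[j] += 1
    return alloc

# Example 4.4: sizes (20, 40, 50) with n = 6 gives (1, 2, 3), close to the
# ideal proportional allocation (1.09, 2.18, 2.73).
assert maximin_allocation([20, 40, 50], 6) == [1, 2, 3]
# Example 4.10: sizes (50, 23, 15, 12) with n = 12 gives the maximin
# test vector (5, 3, 2, 2).
assert maximin_allocation([50, 23, 15, 12], 12) == [5, 3, 2, 2]
```

On both of the paper's worked examples this sketch reproduces the stated maximin test vectors.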

With our approach, we have the added benefit of knowing for sure that the testing will have a best possible lower bound performance. Moreover, our approach is universally applicable, as long as the relative subdomain sizes can be estimated. In contrast to the several guidelines Chan et al. [12] propose to cater for different situations, we make use of a uniform approach to handle all combinations of subdomain sizes. In a sense, our approach here produces the same solutions to the problem studied in [12] but with additional merits, in a more elegant manner and founded on a sound theoretical basis.

5.3 Varying the number of test cases

For the convenience of our analysis, we have assumed that the total number n of test cases is predetermined and fixed. This restriction, however, is more apparent than real, as the Maximin Algorithm basically allocates test cases incrementally. In fact, the Maximin Algorithm successively produces solutions to the MME problem for t test cases, where t = k, k + 1, ..., n. Thus,


if the tester decides to run fewer test cases, the test distribution generated from an earlier iteration in the Maximin Algorithm can be used. On the other hand, if at any time n test cases distributed according to the Maximin Algorithm have been executed and subsequently the tester decides to execute n′ more test cases, then with an appropriate change in the initialisation step (step (i)), the Maximin Algorithm can be adapted to compute the distribution of these additional test cases in a way that ensures the best possible lower bound performance of the testing.
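A sketch of this adaptation (our own illustration, not the paper's code): instead of starting from one test case per subdomain, the initialisation step seeds the allocation with the test cases already executed, and the same greedy rule then distributes the n′ additional test cases (tie-breaking by lowest index is again an assumption):

```python
def extend_allocation(sizes, executed, n_extra):
    """Distribute n_extra further test cases, starting from an allocation
    `executed` that has already been run, by repeatedly topping up a
    subdomain with the lowest current sampling rate n_i/d_i."""
    alloc = list(executed)                # changed initialisation step (i)
    k = len(sizes)
    for _ in range(n_extra):
        j = min(range(k), key=lambda i: alloc[i] / sizes[i])
        alloc[j] += 1
    return alloc

# Continuing example 4.4 after running (1, 2, 3): five more test cases are
# spread so that the lowest sampling rate keeps rising.
assert extend_allocation([20, 40, 50], [1, 2, 3], 5) == [2, 4, 5]
```

The result coincides with running the greedy allocation afresh for the full total, which is what makes the incremental use safe.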

6 Conclusions

There is no doubt that subdomain testing has its own merit. Its close relationship with the concept of ‘coverage’ adds to its intuitive appeal. Previous studies have suggested that subdomain testing is particularly effective if it is fault-based [10], or if the subdomains are revealing [15]. In the latter case, in theory, any representatives from the subdomains should be able to reveal the faults in the program.

Unfortunately, in practice subdomains are seldom truly revealing [10], and the actual faults in the program are not necessarily any of those faults on which the subdomain constructions are based. Our initial assumptions about the type of faults in the program may turn out to be totally wrong. If so, subdomain testing may not perform as well as is often thought. For example, we may suspect that the program contains domain errors, and so we use the domain testing strategy [16]. But if the actual errors are such that the failures are not close to the domain borders, the tests may turn out to be much less effective than we had hoped.

The maximin criterion is designed to protect against the worst case of the testing. It guarantees that the lower bound effectiveness of the tests is improved in the best possible way. So even when the worst happens, the tests will only be slightly less effective than random testing. Conversely, when the best happens, that is, when the subdomains are truly revealing or the faults are exactly as expected, the tests will still be very effective. Thus, in the complete absence of any information about the failure distribution, the use of the maximin criterion as a basis for determining the test distribution is certainly worthy of consideration.

As explained, the maximin criterion is best suited for the tester who prefers not to take unnecessary risks at the expense of achieving higher gains (of the effectiveness of the testing). As such, it may not be the best one for all situations. Clearly, if some other information regarding the failures is known, this information should be taken into account when determining the test distribution. For example, in [5, 6] we have shown that if the failure rates of the subdomains can be estimated, then the sampling rates can be chosen to ensure better performance than random testing.

In this paper, we have investigated the problem of generating test distributions which satisfy the maximin criterion. We propose and analyse an algorithm, called the Maximin Algorithm, for achieving this purpose. This algorithm is easy to understand and can be implemented with negligible overhead. No knowledge of the failure-causing inputs or failure rates is required. The algorithm is applicable to any program with subdomain sizes that are known or can be estimated.


7 Acknowledgments

References

In particular, n_j/d_j > n̂_j/d_j, and so

n_j > n̂_j    (7)

By lemma 4.2, for all i ∈ I_k, n̂_j/d_j ≥ (n̂_i − 1)/d_i. Combining this with eqn. 6, we have, for all i ∈ I_k, n_i/d_i > (n̂_i − 1)/d_i, which implies n_i > n̂_i − 1. Since both n_i and n̂_i − 1 are integers, we have

n_i ≥ n̂_i for all i ∈ I_k    (8)

Now we have

n = Σ_{i=1}^{k} n_i    since π ∈ V(n)
  > Σ_{i=1}^{k} n̂_i    by (7) and (8)
  = n                  since π̂ ∈ V(n)

which is a contradiction.

Proposition 4.6: Let π ∈ V(n) and m ∈ W(m). Suppose that the subdomains have been renumbered in such a way that σ₁ ≤ ... ≤ σ_k. Let s₁ = 0, and let s_i denote the total size of the first i − 1 subdomains for i > 1. Then E̲(m, π) = E(m, π) if m represents the distribution in which all the m failure-causing inputs cluster in the subdomains with the lowest sampling rates, i.e. if there exists h ∈ I_k such that s_h < m ≤ s_h + d_h and

m_i = d_i        if i < h
m_i = m − s_h    if i = h
m_i = 0          if i > h

Proof: Suppose that m and π are as given in the proposition. Let m′ = (m′₁, ..., m′_k) ∈ W(m). Our aim is to prove that E(m′, π) ≥ E(m, π).

If h = 1, then

E(m′, π) = Σ_{i=1}^{k} m′_i σ_i ≥ Σ_{i=1}^{k} m′_i σ₁ = m σ₁ = E(m, π)

If h > 1, from the construction of m, we have

E(m, π) = Σ_{i=1}^{h−1} d_i σ_i + (m − s_h) σ_h

and hence

E(m′, π) − E(m, π)
= Σ_{i=1}^{k} m′_i σ_i − Σ_{i=1}^{h−1} d_i σ_i − (Σ_{i=1}^{k} m′_i − Σ_{i=1}^{h−1} d_i) σ_h
= Σ_{i=1}^{k} m′_i (σ_i − σ_h) − Σ_{i=1}^{h−1} d_i (σ_i − σ_h)
= Σ_{i=h}^{k} m′_i (σ_i − σ_h) + Σ_{i=1}^{h−1} (m′_i − d_i)(σ_i − σ_h)
≥ 0

since when i ≥ h, σ_i ≥ σ_h, and when i < h, m′_i − d_i ≤ 0 and σ_i − σ_h ≤ 0. Hence E(m′, π) ≥ E(m, π).
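As a numerical sanity check on this inequality (our own illustration, not part of the paper), one can enumerate every feasible failure distribution m′ for a small configuration and confirm that the clustered distribution of proposition 4.6 attains the minimum E-measure:

```python
import itertools

def e_measure(ms, sizes, alloc):
    """E(m, pi) = sum over subdomains of n_i * m_i / d_i."""
    return sum(n_i * m_i / d_i for m_i, d_i, n_i in zip(ms, sizes, alloc))

def clustered(m, sizes, alloc):
    """The worst-case distribution of proposition 4.6: fill subdomains in
    order of increasing sampling rate n_i/d_i."""
    ms = [0] * len(sizes)
    for i in sorted(range(len(sizes)), key=lambda i: alloc[i] / sizes[i]):
        ms[i] = min(m, sizes[i])
        m -= ms[i]
    return ms

sizes, alloc, m = [5, 4, 3], [1, 1, 2], 6
worst = e_measure(clustered(m, sizes, alloc), sizes, alloc)
# Every feasible m' (0 <= m'_i <= d_i, summing to m) does no better.
for ms in itertools.product(*(range(d + 1) for d in sizes)):
    if sum(ms) == m:
        assert e_measure(ms, sizes, alloc) >= worst - 1e-12
```

The exhaustive loop is only practical for toy configurations, but it exercises exactly the inequality E(m′, π) ≥ E(m, π) proved above.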



Recommended