A Requirements-Based Partition Testing Framework Using ......A Requirements-Based Partition Testing...

A Requirements-Based Partition

Testing Framework Using Particle

Swarm Optimization Technique

by

Afshar Ganjali

A thesis

presented to the University of Waterloo

in fulfillment of the

thesis requirement for the degree of

Master of Applied Science

in

Electrical and Computer Engineering

Waterloo, Ontario, Canada, 2008

c© Afshar Ganjali 2008

I hereby declare that I am the sole author of this thesis. This is a true copy of the

thesis, including any required final revisions, as accepted by my examiners.

I understand that my thesis may be made electronically available to the public.

ii

Abstract

Modern society is increasingly dependent on the quality of software systems.

Software failure can cause severe consequences, including loss of human life. There

are various ways of fault prevention and detection that can be deployed in different

stages of software development. Testing is the most widely used approach for

ensuring software quality.

Requirements-Based Testing and Partition Testing are two of the widely used

approaches for testing software systems. Although both of these techniques are

mature and are addressed widely in the literature and despite the general agree-

ment on both of these key techniques of functional testing, a combination of them

lacks a systematic approach. In this thesis, we propose a framework along with a

procedural process for testing a system using Requirements-Based Partition Testing

(RBPT). This framework helps testers to start from the requirements documents

and follow a straightforward step by step process to generate the required test cases

without loosing any required data. Although many steps of the process are man-

ual, the framework can be used as a foundation for automating the whole test case

generation process.

Another issue in testing a software product is the test case selection problem.

Choosing appropriate test cases is an essential part of software testing that can

lead to significant improvements in efficiency, as well as reduced costs of combi-

natorial testing. Unfortunately, the problem of finding minimum size test sets is

NP-complete in general. Therefore, artificial intelligence-based search algorithms

have been widely used for generating near-optimal solutions. In this thesis, we also

propose a novel technique for test case generation using Particle Swarm Optimiza-

tion (PSO), an effective optimization tool which has emerged in the last decade.

Empirical studies show that in some domains particle swarm optimization is equally

well-suited or even better than some other techniques. At the same time, a particle

swarm algorithm is much simpler, easier to implement, and has just a few parame-

ters that the user needs to adjust. These properties make PSO an ideal technique

for test case generation. In order to have a fair comparison of our newly proposed

algorithm against existing techniques, we have designed and implemented a frame-

work for automatic evaluation of these methods. Through experiments using our

evaluation framework, we illustrate how this new test case generation technique

can outperform other existing methodologies.

iii

Acknowledgements

It has been much work during these last two years. But at the same time, it has

been a lot of fun. A great part of this fun is due to the people I have been involved

with, both in my research and teaching duties, and in my personal life.

I have to start with showing my gratitude to Professor Ladan Tahvildari, who

has officially been my supervisor, but unofficially much more. I would like to

appreciate her care and support which was not limited to my research work. Her

insight and direction has enriched the content of this thesis.

I would like to acknowledge the support of Research In Motion (RIM) company.

Without the assistance of RIM, my time at University of Waterloo would not have

been as enriching. I should thank Gary Cort, Spencer Hill, Weining Liu and Julie

Rastelli for the countless chats and their kind assistance throughout this work.

I would also like to thank Professor Dasiewics and Professor Kontogiannis for

accepting to be members of my dissertation committee. I must thank them for tak-

ing the time out of their busy schedules to review my thesis and for their insightful

comments and suggestions.

I should also thank all the members of Software Technologies and Applied Re-

search (STAR) group for their moral support and valuable feedbacks.

Finally, would like to express my sincere gratitude to my family. My father, my

mother, my brother Yashar and my sister-in-law Hamideh have always been a great

support and have had the necessary understanding. This accomplishment would

have been more difficult to achieve without their constant encouragement.

iv

To my father and mother

for their infinite love, understanding and support.

v

Contents

List of Figures viii

List of Tables ix

List of Algorithms x

1 Introduction 1

1.1 The Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2 Thesis Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2 Backgrounds and Related Works 6

2.1 Software Testing Techniques . . . . . . . . . . . . . . . . . . . . . . 6

2.2 Requirements-Based Software Testing . . . . . . . . . . . . . . . . . 11

2.3 Partition Testing Techniques . . . . . . . . . . . . . . . . . . . . . . 14

2.4 Combinatorial Test Case Generation as an Optimization Problem . 16

2.5 Overview of Existing Combination Strategies . . . . . . . . . . . . . 18

2.5.1 Ant Colony Algorithm . . . . . . . . . . . . . . . . . . . . . 19

2.5.2 Genetic Algorithms . . . . . . . . . . . . . . . . . . . . . . . 19

2.5.3 Simulated Annealing . . . . . . . . . . . . . . . . . . . . . . 20

2.5.4 Tabu Search . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.5.5 AETG: Automatic Efficient Test Generator . . . . . . . . . 23

2.5.6 IPO: In-Parameter-Order . . . . . . . . . . . . . . . . . . . . 24

2.5.7 CATS Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 26

2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

vi

3 A Framework for Requirements-Based Partition Testing 28

3.1 Proposed Layered Architecture for the RBPT Framework . . . . . . 29

3.1.1 Features Layer . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.1.2 Atomic Features Layer . . . . . . . . . . . . . . . . . . . . . 33

3.1.3 Test Scenarios Layer . . . . . . . . . . . . . . . . . . . . . . 34

3.1.4 Frame Sets Layer . . . . . . . . . . . . . . . . . . . . . . . . 35

3.1.5 Test Frames Layer . . . . . . . . . . . . . . . . . . . . . . . 37

3.1.6 Test Cases Layer . . . . . . . . . . . . . . . . . . . . . . . . 37

3.2 RBPT-Based Test Case Generation Process . . . . . . . . . . . . . 39

3.2.1 Requirements Modeling . . . . . . . . . . . . . . . . . . . . 39

3.2.2 Test Case Generation . . . . . . . . . . . . . . . . . . . . . . 41

3.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4 Particle Swarm Optimization for Test Case Generation 42

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.2 Test Suites as Covering Arrays . . . . . . . . . . . . . . . . . . . . . 44

4.3 Particle Swarm Optimization for Software Testing . . . . . . . . . . 45

4.3.1 PSO Technique . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.3.2 PSO for Test Case Generation . . . . . . . . . . . . . . . . . 47

4.4 Empirical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . 52

4.4.1 Test Case Generation Framework . . . . . . . . . . . . . . . 52

4.4.2 Experimental Comparison with Other Algorithms . . . . . . 56

4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

5 Conclusion and Future Work 60

5.1 Thesis Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . 60

5.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

References 63

vii

List of Figures

2.1 Distribution of Bugs and Required Effort for Fixing Them [35] . . . 13

2.2 Nesting of the Optimization Problem Categories [41] . . . . . . . . 17

3.1 RBPT Layered Structure 1 . . . . . . . . . . . . . . . . . . . . . . . 29

3.2 RBPT Layered Structure 2 . . . . . . . . . . . . . . . . . . . . . . . 30

3.3 Frame Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.4 Test Frames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

3.5 Test Case Generation Process Diagram . . . . . . . . . . . . . . . . 40

4.1 A Simple Outline for PSO with Synchronous Update . . . . . . . . 46

4.2 Boundary conditions keep particles inside the limited area by chang-

ing their velocity in the appropriate direction. . . . . . . . . . . . . 50

4.3 Cyclic Walls Boundary Condition: Particle resides in the search

space by jumping to the other end point of the dimension without

any interference in its velocity. . . . . . . . . . . . . . . . . . . . . . 51

4.4 Test Case Generation Framework . . . . . . . . . . . . . . . . . . . 53

4.5 An example of 3 test suites, illustrating the PUTE measure. . . . . 55

4.6 Comparison of PSO with other existing test case generation algorithms. 59

viii

List of Tables

4.1 CA(N = 9; t = 3, k = 4, v = 2) . . . . . . . . . . . . . . . . . . . . . 44

4.2 An example test set and 3 test suites which provide 100% 2-wise

coverage on the values of the variables in the test set. . . . . . . . . 55

4.3 Selected Combination Strategies and their Settings . . . . . . . . . 57

ix

List of Algorithms

1 : Ant Colony Algorithm Outline . . . . . . . . . . . . . . . . . . . . 20

2 : Genetic Algorithms Outline . . . . . . . . . . . . . . . . . . . . . 21

3 : Simulated Annealing Outline . . . . . . . . . . . . . . . . . . . . . 22

4 : Tabu Search Outline . . . . . . . . . . . . . . . . . . . . . . . . . 23

5 : AETG Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

6 : IPO Outline (Horizontal Growth) . . . . . . . . . . . . . . . . . . 25

7 : IPO Outline (Vertical Growth) . . . . . . . . . . . . . . . . . . . . 25

8 : CATS Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

x

Chapter 1

Introduction

The development of high quality software requires considerable investment in qual-

ity assurance resources. Software testing as an important part of this process is

both expensive and time consuming. The whole testing process by various esti-

mates can take as much as 20% to more than 50% of the total development budget

of a software project and adds considerably to the length of the development cy-

cle [3, 5, 9, 53]. A tester is normally responsible to bring out a right mix of business

process knowledge, technical expertise and cutting edge technology for the company

to be able to deliver flexible and scalable services to the customers.

The Institute of Electrical and Electronics Engineers (IEEE) defines test [28] as

“a set of one or more test cases”. The IEEE also defines testing as “the process of

analyzing a software item to detect the differences between existing and required

conditions and to evaluate the features of the software item”. This definition makes

testers responsible for both verification and validation. Verification simply answers

the question “Does the system do what it is supposed to do?”. For verifying the

system, a tester should investigate the accuracy or correctness of the system ac-

cording to its specification. This comparison of the system’s response to what is

expected is straightforward if there is a well-defined specification that states what

the correct system response will be. This specification is called test standard in the

literature [49]. It is virtually impossible to automate testing if there is no standard

for the expected response and the automated test program can not make on the

fly subjective judgements about the correctness of the outcome. Therefore having

a test standard is essential for the automation of verification process. On the other

hand, validation is the process by which we confirm that the system is designed to

do things in the right way and it answers the question “Is what the system doing

correct?”. Validation is necessary to check for problems with the specification and

1

to demonstrate that the system is operational.

Different testing groups from different organizations do the verification and

validation using various test approaches [49]. There is a plethora of testing meth-

ods and testing techniques, serving multiple purposes in different life cycle phases.

Classified by purpose, software testing can be divided into correctness testing, per-

formance testing, reliability testing and security testing. Classified by life cycle

phase, software testing can be classified into the following categories: requirements

phase testing, design phase testing, program phase testing, installation phase test-

ing, acceptance testing and maintenance testing. By scope, software testing can

be categorized as follows: unit testing, component testing, integration testing, and

system testing.

From the long list of existing techniques and methods for testing, we will focus

on three of them which are widely used in software companies for testing their prod-

ucts. These three approaches are: Requirements-Based Testing (RBT), Partition

Testing and Combinatorial (Interaction) Testing. In Section 1.1, we talk about two

problems related to these methods. Section 1.2 briefly describes our proposed tech-

niques for solving these problems. Finally, Section 1.3 describes the organization

of the rest of this thesis.

1.1 Problem Description

In this thesis, we are going to address two different problems. The first problem

is related to the integration of requirements-based testing process and partition

testing. Both of these techniques are mature and have been studied widely in the

literature. However, despite the general agreement on both of these key techniques

of functional testing, there is no systematic approach for combining them. In this

thesis, we aim at presenting a simple framework for testing groups. This framework

can be used in various domains as a guideline for testing departments.

The second issue that we are going to focus on is the general problem of com-

binatorial test case generation using software specifications such as requirements.

Applying the partitioning techniques for testing, we come up with a list of vari-

ables (parameters), generated from the product specifications and a set of values for

each variable. We know that a common source of system faults is the unexpected

interaction between system components [58]. Therefore, for reducing the risk of in-

teraction problems we should test a large number of possible test configurations. A

test configuration can simply be defined as a combination of the different values of

2

the variables in the system. Let us consider k independent variables in the system

under test. Here, independence of variables means that the selection of a particular

value for one variable does not effect the selection of any other values for other vari-

ables. Now let us assume variable i has ni possible values, which are enumerated

as 1 . . . ni. A test configuration consists of a selection of values for each parameter

and hence can be indicated by a k-tuple. Since each test configuration ends up

as a different test case and each test case requires some time to be executed and

investigated by testers, the number of test configurations is the major cost factor

in the testing process. Clearly, the number of potential test configurations which is

equal to Πk1ni grows exponentially. Testing all these possible configurations called

exhaustive testing is almost impossible in practice due to the time and money con-

straints. The solution is to reduce the required test effort by putting a limit on the

required coverage of the possible value combinations and reducing the number of

test configurations.

In the literature, various coverage criteria are defined which can be used for

limiting the number of test configurations. One of the most well-known coverage

criteria is pairwise testing. In pairwise testing, all combinations of the values of

any two variables should be covered by at least one test case [52]. Based on the

observation that most faults are caused by interactions of at most two factors,

empirical results show that pairwise testing is practical and effective [7, 10, 15, 32].

Independent of which coverage criteria gets used, the goal of the combinatorial test

case generation is to reach the coverage goal using the minimum number of test

configurations.

1.2 Thesis Contribution

The major contribution of this thesis is to address the two problems described in

Section 1.1. For the first problem, we propose a novel framework which combines the

two mature techniques of requirements-based testing and partition testing into one

unified technique. We also define a procedural process for testing a system using this

new “Requirements-Based Partition Testing” (RBPT) framework. For the second

problem, we introduce a method for combinatorial test case generation applying

Particle Swarm Optimization (PSO) technique. The following list describes our

contributions in more details.

• Proposing a layered framework for requirements-based partition testing.

3

• Using particle swarm optimization for effective combinatorial test case gen-

eration.

• Proposing a simple boundary condition, called “Cyclic Walls”, for PSO that

can be used for solving the problems which have a finite search space.

• Developing a framework for automatic comparison of different test case gen-

eration algorithms.

• Introducing a new test case generation metric for assessing the effectiveness

of the test suites generated by different combination strategies.

• Illustrating through empirical experiments that PSO can be as effective as

other existing techniques for combinatorial test case generation.

1.3 Thesis Organization

The rest of the thesis is organized as follows:

• Chapter 2 presents a survey of the related works, and gives an overview on

the required background material. In the first section, it introduces differ-

ent software testing techniques and gives a brief definition for each of them.

Section 2.2 reviews requirements-based software testing. Section 2.3 presents

how partition testing works in general, and then focuses on category parti-

tion method which will be used later. Finally, the last section describes the

combinatorial test case generation problem as an optimization problem, and

presents a survey of existing combination strategies in the literature.

• Chapter 3 is about our first problem: integration of requirements-based test-

ing and partition testing. The first section of the chapter presents the layered

structure of the proposed framework, and the second section puts all the lay-

ers of the framework together and defines a test case generation process which

traverses all the layers of the framework one by one for generating a complete

test suite. The main goal of this process is to formalize and automate the

required activities for testing a newly developed system.

• Chapter 4 proposes a novel technique for test case generation, using Particle

Swarm Optimization (PSO), an effective optimization tool which has emerged

in the last decade. For comparing the results from PSO combination strategy

4

with other existing techniques in the literature, a benchmark framework is

presented in Section 4.4.1 of this thesis. Section 4.4.1 also proposes a new

effectiveness measure which produces better and more precise assessments

from the output of such algorithms. Finally, in Section 4.4.2 the proposed

benchmarking framework and effectiveness measure are used for executing

some experiments. Results show that PSO combination strategy can be as

effective as other existing algorithms.

• Chapter 5 reviews the thesis contributions, and outlines future directions.

5

Chapter 2

Concepts and Related Works

Modern society is increasingly dependent on the quality of software systems. Soft-

ware failure can cause severe consequences, including loss of human life at extreme.

There are various ways of fault prevention and detection that can be deployed in

different stages of software development. Testing is the most widely used approach

for ensuring software quality.

In this chapter we will have an overview on existing software testing techniques.

The first section of the chapter briefly describes a wide range of testing methods.

Then, in the second and third sections, we focus on requirements-based testing and

partition testing respectively and explain what these techniques are and how they

are important in the testing of software systems. Finally in Section 2.4, we show

that the problem of combinatorial test case generation is an optimization problem

and review some of the existing methods for handling this problem in practical

cases.

2.1 Software Testing Techniques

Software testing is the process of executing a program or system with the intent of

finding errors and defects [37]. It also involves any activity aimed at evaluating an

attribute or capability of a program or system and determining that it meets its

required results [25]. Unlike other physical processes, software can fail in many un-

expected ways. Detecting all of the different failure modes for software is generally

infeasible. There are an abundance of software testing techniques in the literature

such as black or white box testing, static or dynamic testing, partition testing,

requirements-based testing, mutation testing. Most of these techniques and testing

6

methods are not very different from 20 years ago. Although there are many tools

and techniques available to use, an efficient testing technique also requires a tester’s

creativity, experience and intuition. Here in this section, we will have a brief review

on some of the well-known existing software testing methods.

• Static Testing Vs. Dynamic Testing: There are many approaches and

techniques that can be used in software testing. Reviews, walkthroughs or

inspections are some of these methods that can be considered as static testing,

whereas actually executing programmed code with a given set of test cases

is referred to as dynamic testing. Static testing is not essential and can be

omitted, even though it is very useful for avoiding a large group of defects.

Dynamic testing takes place when programs begin to be used for the first time

- which is normally considered the beginning of the testing stage. This may

actually begin before the program is 100% complete in order to test particular

sections of code (modules or discrete functions).

• Black Box Testing Vs. White Box Testing: The black box testing ap-

proach is a testing method in which test data are derived from the specified

functional requirements without any knowledge of the final program struc-

ture [42]. It is also referred to as requirements-based testing [25]. Since only

the functionality of the software module is of concern, black box testing is also

referred to as functional testing. In this approach tester treats the software

under test as a black box. The assumption is that the tester has access only to

the inputs, outputs and the specification, and the functionality is determined

by observing the outputs to corresponding inputs. For testing, various inputs

are exercised and the outputs are compared against specification to validate

the correctness. All test cases are derived from the specification documents

and no implementation details of the code are considered.

On the other hand in white box testing, contrary to black box testing, software

is viewed as a white box as the structure and flow of the software under test

are visible to the tester. Testing plans are made according to the details

of the software implementation, such as programming language, logic, and

styles. Test cases are derived from the program structure. White box testing

may also be referred to as glass-box testing or design-based testing.

There are many techniques available in white box testing. Some of these

techniques try to test the software exhaustively according to different coverage

criteria, for example by executing each line of code at least once (statement

7

coverage), traversing every branch statements (branch coverage), or covering

all the possible combinations of true and false condition predicates (multiple

condition coverage). Control flow testing, loop testing, and data flow testing

are some other examples of white box methods that map the corresponding

flow structure of the software into a directed graph. Test cases are carefully

selected based on the criterion that all the nodes or paths are covered or

traversed at least once. By doing so we may also discover unnecessary “dead”

code which is of no use, or never gets executed and can not be discovered by

functional testing.

We should note that many testing strategies may not be easily classified into

black box testing or white box testing. One reason is that all the testing

techniques will need some knowledge of the specification of the software under

test. Another reason is that the idea of specification itself is broad and it may

contain any requirement including the structure, programming language, and

programming style as part of the specification content.

• Unit Testing: In computer programming, unit testing is a procedure used

to validate that individual units of source code are working properly. A unit

is the smallest testable part of an application. In procedural programming a

unit may be an individual program, function, procedure, etc., while in object-

oriented programming, the smallest unit is a method; which may belong to a

base/super class, abstract class or derived/child class. Unit testing is typically

done by developers and not by software testers or end-users.

• Mutation Testing: In mutation testing, the original program code is changed

and many mutated programs are created, each containing one fault. Each

faulty version of the program is called a mutant. Test data are selected based

on the effectiveness of failing the mutants. The more mutants a test case can

kill, the better the test case is considered. The problem with mutation testing

is that it is computationally too expensive.

• Random Testing: In random testing, the test case selection process is very

simple and straightforward: they are randomly chosen. Study in [16] indi-

cates that under certain very restrictive conditions, random testing can be as

effective as partitioning testing. They showed consistent small differences in

effectiveness between partition testing methods and random testing. These

results were interpreted in favor of random testing since it is generally less

work to construct test cases in random testing since partitions do not have

8

to be constructed. But later investigations in [23] concluded that the Du-

ran/Ntafos model in [16] was unrealistic. One reason was that the overall

failure probability was too high. Some other studies in [21] followed up these

results and showed theoretically that partition testing is consistently more

effective than random testing under realistic assumptions. More recent re-

sults have been produced that favor partition testing over random testing in

practical cases [43]. Effectively combining random testing with other testing

techniques may yield more powerful and cost effective testing strategies.

• Combinatorial Testing and Pairwise Testing: If we partition the input

domain of a software system into a set of variables each having a set of possible

values, combinatorial testing method requires that for any given t > 1, all t-

wise combinations of the values of those variables be tested by at least one

test case. Pairwise testing is a special case of combinatorial testing, where

t = 2. In pairwise testing, given any pair of input variables (parameters)

of a system, every combination of valid values of the two variables must be

covered by at least one test.

Exhaustive testing is impractical due to resource constraints. It is not practi-

cal to cover all the parameter interactions. We need a good trade-off between

test effort and test coverage. Empirical studies show that many faults are

caused by the interactions between two variables (parameters). Hence, pair-

wise testing can be used as an effective testing method to make a balance

between test effort and test coverage. Studies in [6, 13] argue that the testing

of all pairwise interactions in a software system finds a large percentage of

the existing faults and provide empirical results to show that this type of test

coverage is effective.

• Performance Testing: Not all software systems have specifications on per-

formance explicitly. However, every system will have implicit performance

requirements. The software should not take infinite time or infinite resource

to execute. “Performance bugs” sometimes are used to refer to those design

problems in software that cause the system performance to degrade. Perfor-

mance has always been a great concern. Performance evaluation of a software

system usually includes resource usage, throughput, stimulus-response time

and queue lengths (the average or maximum number of tasks waiting to be

serviced by selected resources). Typical resources that need to be consid-

ered include network bandwidth requirements, CPU cycles, disk space, disk

access operations, and memory usage [50]. The goal of performance testing

9

can be performance bottleneck identification, performance comparison and

evaluation, etc. The typical method of doing performance testing is using a

benchmark designed to be representative of the typical system usage [54].

• Reliability Testing: Software reliability refers to the probability of failure

free operation of a system. It is related to many aspects of software, including

the testing process. Directly estimating software reliability by quantifying its

related factors can be difficult. Testing is an effective sampling method to

measure software reliability. Software testing (usually black box testing) can

be used to obtain failure data, and an estimation model can be further used

to analyze the data to estimate the present reliability and predict future

reliability. Therefore, based on the estimation, the developers can decide

whether to release the software, and the users can decide whether to adopt

and use the software. Risk of using software can also be assessed based on

the reliability information.

Hamlet in [22] advocates that the primary goal of testing should be to mea-

sure the dependability of tested software. There is agreement on the intuitive

meaning of dependable software: it does not fail in unexpected or catastrophic

ways [22]. Robustness testing and stress testing are variances of reliabil-

ity testing based on this simple criterion. IEEE defines the robustness of a

software component as the degree to which it can function correctly in the

presence of exceptional inputs or stressful environmental conditions [1]. Ro-

bustness testing differs with correctness testing in the sense that the functional

correctness of the software is not of concern. It only watches for robustness

problems such as machine crashes, process hangs or abnormal termination.

Stress testing, or load testing, is often used to test the whole system rather

than the software alone. In such tests the software or system are exercised

with or beyond the specified limits. Typical stress includes resource exhaus-

tion, bursts of activities, and sustained high loads.

• Security Testing: Software quality, reliability and security are tightly cou-

pled. Flaws in software can be exploited by intruders to open security holes.

With the development of the Internet, software security problems are becom-

ing even more severe. Many critical software applications and services have

integrated security measures against malicious attacks. The purpose of secu-

rity testing of these systems include identifying and removing software flaws

that may potentially lead to security violations, and validating the effective-

ness of security measures. Simulated security attacks can be performed to

10

find vulnerabilities.

2.2 Requirements-Based Software Testing

One of the first sources of information for the testers should be requirements. We

know that testing the software is an integral part of building a system. However,

if the software is based on inaccurate requirements, then even with a well-written

code, the software will be unsatisfactory. The specification must contain all the

requirements that are to be solved by our system. The specification should also

explicitly specify everything our system must do and the conditions under which it

must perform. In order for the requirements to be considered testable, the require-

ments ideally should have all of the following characteristics [36]:

• Deterministic: Given an initial system state and a set of inputs, one must

be able to predict exactly what the outputs will be.

• Unambiguous and Readable: All project members must get the same

meaning from the requirements; otherwise those requirements are ambiguous.

• Correct: The relationships between causes and effects must be described

correctly.

• Complete: All requirements should be included. No omissions are allowed.

• Non-redundant: The requirements should provide a non-redundant set of

functions and events.

• Placed under change control: Requirements, like all other deliverables of

a project, should be placed under change control.

• Traceable: Requirements must be traceable to each other, to the objectives,

to the design, to the test cases and to the code.

• Written in a consistent style: Requirements should be written in a con-

sistent style to make them easier to understand.

• Explicit: Requirements must never be implied.

• Logically consistent: There should be no logic errors in the relationships

between causes and effects.

11

• Reusable: Good requirements can be reused on future projects.

• Terse: Requirements should be written in a brief manner, with as few words

as possible.

• Annotated for criticality: Not all requirements are critical. Each require-

ment should note the degree of impact a defect in it would have on pro-

duction. In this way, the priority of each requirement can be determined,

and the proper amount of emphasis placed on developing and testing each

requirement.

• Feasible: If the software design is not capable of delivering the requirements,

then the requirements are not feasible.

The Requirements-Based Testing (RBT) process [36], which is one of the most

widely used software testing techniques, is based on these characteristics of good

requirements and addresses two major issues: first, validating that the requirements

are correct, complete, unambiguous, and logically consistent; and second, designing

a necessary and sufficient (from a black box perspective) set of test cases from those

requirements to ensure that the design and code fully meet those requirements.

One of the most important issues to be overcome in the process is to reduce the

immensely large number of potential tests down to a reasonable size test set.

According to recent studies, the majority of defects have their root cause in

poorly defined requirements [35] (see Figure 2.1). On the other hand, the cost of

fixing an error is cheaper the earlier it is found. If a defect was introduced while

coding, you just fix the code and recompile. However, if a defect has its roots in

poor requirements and is not discovered until integration testing then you must redo

the requirements, design, code, the tests, the user documentation, and the training

materials. All this extra work can send projects over budget and over schedule. If

a defect introduced during the requirements phase is not found until integration

testing or production, it will cost hundreds or even thousands of times more than

the case where it is found and fixed in the requirements phase. Therefore, the overall

RBT strategy is to integrate testing throughout the development life cycle and focus

on the quality of the requirements specification. Testing starts at the beginning of

the project, not at the end of the coding and we apply tests to assure the quality of

the requirements. This leads to early defect detection which has been shown to be

much less expensive than finding defects during integration testing or later. The

RBT process also has a focus on defect prevention, not just defect detection and

12

(a) Distribution of Bugs (b) Distribution of Effort to Fix Bugs

Figure 2.1: Distribution of Bugs and Required Effort for Fixing Them [35]

hence it minimizes expensive rework by minimizing requirements related defects

that could have been discovered, or prevented, early in the project’s life.

One of the most challenging aspects of the requirements-based testing is com-

municating with the people who are supplying the requirements. If we have a con-

sistent way of recording requirements, we can make it possible for the stakeholders

to participate in the requirements process. This way, as soon as a requirement

becomes visible we can start testing it and ask the stakeholders detailed questions.

We can apply a variety of tests to ensure that each requirement is relevant, and that

everyone has the same understanding of its meaning. We can ask the stakeholders

to define the relative value of requirements. We can also define a quality measure

for each requirement, and we can use that quality measure to test the eventual

solutions.

Prioritizing the requirements [4, 55] is another important issue which should

be considered in RBT process. If we can establish the relative priorities of the

requirements, then it helps greatly in establishing the rank of the tests that are

designed to verify the requirements and the amount of test coverage that will be

provided. Ranking provides a valuable tool for designers and developers to pass on

their knowledge and assumptions of the relative importance of various features in

the system.

13

2.3 Partition Testing Techniques

The term partition testing refers to a very general family of testing strategies. The

primary characteristic of these strategies is that the program’s input domain is

divided into subsets, with the tester selecting one or more element from each sub-

domain. In the testing literature, it is common not to restrict the term partition

to the formal mathematical meaning of a division into disjoint subsets, which to-

gether span the space being considered. Instead, testers generally use it in the

more informal sense to refer to a division into (possibly overlapping) subsets of the

domain. The goal of such a partitioning is to make the division in such a way that

when the tester selects test cases based on the subsets, the resulting test set is a

good representation of the entire domain. A partition can be defined using all the

information about a program. It can be based on requirements or specifications

(one form of black box testing), on features of the code (structural testing), even

on the process by which the software was developed, or on the suspicions and fears

of a programmer [23]. Ideally, the partitioning divides the domain into sub-domains

with the property that within each sub-domain, either the program produces the

correct answer for every element or the program produces an incorrect answer for

every element. Such a sub-domain is called revealing [56] or homogeneous [23]. If

a partition’s sub-domains are revealing, one need only randomly select an element

from each subset and run the program on that test case in order to determine

program faults. Informal guidelines for creating such a partition and theoretical

properties are discussed in [44, 56]. In practice, it is common for the division of the

input domain to be into non-disjoint subsets, and it is extremely unusual for the

sub-domains to be truly revealing.

Partition testing has two extreme cases: exhaustive testing and random testing.

Exhaustive testing requires that every element of the input domain be explicitly

tested. As a partition testing technique, therefore, exhaustive testing simply corre-

sponds to the division of the input domain into single element sub-domains. The

other extreme case is random testing. In this case, the partition consists of one

class, namely, the entire domain. Random testing can, therefore, be viewed as a

degenerate form of partition testing.

The strength of partition testing is its ability to use any and all available in-

formation during the software development life cycle, and to examine information

in combinations that may not have been thought of during development. Intu-

itively, the source of program bugs and defects is some unlikely combination of

requirements, design, and programmer inattention. By including these factors in

14

the sub-domain definition, we can be confident that nothing is missed in testing.

Good sub-domains are defined and refined throughout development as information

arises.

The category partition method [39] is one of the most effective partitioning tech-

niques. This method provides a way to quickly translate a design specification to a

test specification. It guides the tester to create functional test cases by decomposing

functional specifications into test specifications for major functions of the software.

It identifies those elements that influence the functionality and generates test cases

by methodically varying the elements over all values of interest. Thus, it can be

considered a black box integration technique. The category partition method pro-

vides a general systematic procedure for creating test specifications. The testers

main job is to develop categories, which are defined to be the major characteristics

of the input domain of the function under test, and to partition each category into

equivalence classes of inputs called choices. By definition, choices in each category

must be disjoint, and together the choices in each category must cover the input

domain. The steps below show the method in brief:

1. Analyze the specification to identify the individual functional units that can

be tested separately.

2. Identify the input domain, that is the categories (also called parameters or

variables) that affect the behavior of the function.

3. Partition each category into choices (also called values).

4. Specify combinations of choices to be tested.

5. Convert the test frames produced by the tool into test cases, and organizes

the test cases into test scripts.

Using this method, the obvious tests could be enumerated quickly and com-

pletely, leaving more time to think about more subtle issues. One of the first

benefits of this method is that we can achieve a fairly uniform coverage across a

large problem space.

15

2.4 Combinatorial Test Case Generation as an

Optimization Problem

Many problems of both practical and theoretical importance can be expressed as

a problem of choosing a “best” configuration or set of parameters to achieve some

goal. In the domains of computer science and operations research a hierarchy of

such problems has emerged, together with a corresponding collection of techniques

for their solution. The most general problem of such kind is the general nonlinear

programming problem:

Find x to

minimize f(x)

subject to gi(x) ≥ 0 i = 1, . . . ,m

hj(x) = 0 j = 1, . . . , p

where f , gi and hj are general function of the parameter x ∈ Rn. Defining some

specific conditions on the functions f , gi and hi results in different sets of problems.

The techniques for solving such problems in different sets are studied separately

in different branches of mathematics, operations research and computer science.

For example when f is convex, gi concave and hj linear, we have what is called

a convex programming problem. Inequalities involving concave functions define a

convex feasible region for the problem and the problem concerns the minimization

of a convex function on a convex set. The most well-known property of this problem

is that if a local minimum exists, then it is a global minimum [41].

In another situation when f and all the gi and hj functions are linear, we come

up with the linear programming problem. Linear programming is an important field

of optimization and many practical problems in operations research and engineering

fields can be expressed as linear programming problems. Any problem in this class

reduces to the selection of a solution from among a finite set of possible solutions.

The problem is what we can call combinatorial. The finite set of candidate solu-

tions is the set of vertices of the convex polytope defined by the linear constraints.

The widely used simplex algorithm [12] finds an optimal solution to a linear pro-

gramming problem in a finite number of steps, though it is not a polynomial time

algorithm. This algorithms is based on the idea of improving the cost by moving

from vertex to vertex of the polytope.

Another set of optimization problems is the integer linear programs. These come

about when we consider linear programs and try to find the best solution with the

restriction that it should have integer valued coordinates. The general integer linear

16

Nonlinear Programming

Convex Programming

Linear Programming

Integer Programming

Flow and Matching

Figure 2.2: Nesting of the Optimization Problem Categories [41]

programming problem is itself NP-complete.

Flow and matching problems which are special cases of both linear programs

and integer linear programs are another subset of optimization problems that can

be solved much more efficiently than even general linear programs. Figure 2.2

indicates the nesting of the problems mentioned so far.

In general an optimization problem can be defined as a pair (F, c) where F is

any set, the domain of feasible points (solutions) and c is the cost function which

is a mapping:

c : F → R

The problem is to find an f ∈ F for which:

c(f) ≤ c(y) for all y ∈ F

Such a point f is called a globally optimal solution or simply an optimal solution.

Considering the definition of combinatorial test case generation problem from

Section 1.1, we can say that this problem is an instance of optimization problem.

The set F will be the set of all possible test suites that we may use for testing the

software under test. The c will refer to a combination of two functions. First the

cost of executing the test cases included in F and second the coverage gained from

execution of those test cases. A globally optimal solution will be a test suite which

17

results in higher coverage of value interactions by including a minimum number of

test cases.

Optimization problems seem to divide into two categories: those with continuous

variables and those with discrete variables. The latter is called combinatorial. In

the continuous problems, we are generally looking for a set of real numbers or even

a function whereas in the combinatorial problems, we are looking for an object

from a finite or possibly countably infinite set. The test case generation problem

because of its discrete nature is a combinatorial optimization problem.

Now it is clear the we can reduce the test case generation problem to an opti-

mization problem. Hence, we may try to use the mature techniques in the research

operation and computer science fields to tackle the problem. However we do not

have that much chance because studies in [33] shows that combinatorial test case

generation problem is NP-complete. Therefore, we get limited to the approaches

that are practical for solving the NP-complete problems of moderate size, namely:

approximation, enumerative techniques, and local search methods.

In the next section, we review some of existing test case generation techniques

which try to find near-optimal solutions for combinatorial test case generation prob-

lem.

2.5 Overview of Existing Combination Strategies

Combination strategies are a class of test case selection methods where test cases

are identified by choosing interesting values, and then combining those values of

test object parameters. The combinations are selected based on some combinatorial

strategy.

There are various combination strategies introduced in the literature. They can

get classified into different categories according to their specifications. For example

we have a group of deterministic combination strategies like Orthogonal Arrays

(OA) [11, 57] which enable us to anticipate the number of test frames. Contrary

to this, non-deterministic algorithms like Genetic Algorithms (GA) may result in

different test frames or even different number of test frames in each execution.

Iterative combination strategies can be stopped by reaching the required number

of test cases or the expected coverage. On the other hand instant algorithms such

as OA generate the whole set of test frames together and can not be stopped in the

middle. Some of the combination strategies are designed to generate the required

18

test frames for a fixed predetermined amount of coverage such as OA which is used

for reaching pairwise coverage. On the other hand some other algorithms such as

GA are flexible and can be configured to generate the desired extent of coverage

over the values. These characteristics of different algorithms should be considered

for selecting a suitable algorithm in different cases.

In this section, we briefly introduce some of the well-known combination strate-

gies that are used for test case generation.

2.5.1 Ant Colony Algorithm

One of the combination strategies for generating test cases is based on Ant Colony

Algorithm (ACA) [48]. ACA was first used to solve the traveling salesman problem

(TSP) [14], but has been successfully used to solve other combinatorial problems.

An ACA was inspired by the behavior of natural ant colonies in finding paths from

the colony to food. The concept of an ACA is to mimic this behavior with simulated

ants crawling the graph representing the possible solutions for the problem. Each

ant represents one candidate solution.

An ACA algorithm is based on a set of assumptions. The first assumption is

that each path from a starting point to an ending point in the solutions graph

is associated with a candidate solution to a given problem. The second idea in

the algorithm comes from the concept of pheromone deposition by ants. When an

ant reaches the ending point, the amount of pheromone deposited on each edge of

the path followed by this ant is proportional to the quality of the corresponding

candidate solution. The third assumption is that when an ant has to choose among

different edges at a given point, the edge with a larger amount of pheromone is

chosen with higher probability. As a result, the ants eventually converge to a short

path, hopefully the optimum or a near-optimal solution to the target problem.

Algorithm 1 shows the outline of the ACA procedure for test case generation.

2.5.2 Genetic Algorithms

Genetic Algorithm (GA) [48] mimics the evolution of simple, single celled organisms.

It is based on the concept that the candidate solution created by swapping two good

candidates is also good. GAs have been widely used in solving problems ranging

from optimizations to machine learning.

19

Algorithm 1 : Ant Colony Algorithm Outline

1: Let UC be a set of all tuples of parameter values that are not yet covered bythe selected test frames;

2: while UC is not empty do3: Place m ants at the starting point (initialize the population of candidates);4: for a specified number of iterations do5: for all ant k do6: Generate a candidate test frame TFk;7: Evaluate TFk;8: Lay pheromone;9: end for

10: Apply pheromone evaporation;11: Each ant leaves more pheromone on the traversed path;12: end for13: Let TF be the best test frame found;14: Add TF to the test set;15: Remove those tuples in UC that are covered by TF ;16: end while

In a GA each candidate solution must be encoded as a chromosome which is

usually a string of values; by evolving the population of chromosomes, a good indi-

vidual (solution) is eventually obtained. In test case generation problem, however,

a test frame can be directly treated as a chromosome because a test frame is sim-

ply a string of values. The fitness function is used to estimate the goodness of a

candidate solution. We define the fitness function for a test frame as the number

of new t-wise combinations that are not covered by the given test set but are cov-

ered by that test frame. At the initialization, the initial population of candidate

test frames is generated at random. After the initialization, the GA goes into the

evaluation loop. The GA continues to evolve until the stopping conditions are met.

At each generation, the best chromosomes in the population are kept and survive

to the next generation intact. The remaining test cases in the next population are

created by selecting a set pf parent chromosomes and applying an appropriate type

of crossover and mutation on those parents. Algorithm 2 shows the outline of the

GA method for test case generation.

2.5.3 Simulated Annealing

The Simulated Annealing (SA) algorithm is modeled after the effect of a slow cool-

ing process on the molecules of a metallic substance [40]. Just as cooling brings

these molecules to an optimal rest energy, this algorithm slowly converges the state

20

Algorithm 2 : Genetic Algorithms Outline


2: while UC is not empty do3: Create an initial population P consisting of m candidates;4: for a specified number of iterations do5: Identify Elite individuals for survival consisting of σ best individuals from

P ;6: Apply selection to individuals in P to create Pmating, consisting of (m−σ)

individuals;7: Crossover Pmating;8: Mutate Pmating;9: P = Elite+ Pmating;

10: end for11: Let TF be the best test frame found;12: Add TF to the test set;13: Remove those tuples in UC that are covered by TF ;14: end while

being examined toward an optimal state. Prior to running, an initial positive tem-

perature Tinitial and a decimal decrement factor α lying strictly between 0 and 1

must be provided as input. The algorithm’s simulated cooling schedule is deter-

mined entirely by these two parameters, as Tinitial is multiplied successively by α

after each pass through the main loop.

At each step, the algorithm randomly selects one candidate solution from the

neighbourhood of the current state and evaluates its fitness. The neighborhood can

be defined in various ways according to the specifications of the problem domain.

If the neighbouring solution is more fit than the current solution, then the neigh-

bouring array becomes the new candidate test frame. However, should the selected

neighbouring solution be less fit than the current solution, the SA heuristic is em-

ployed. This heuristic is the main feature of the SA algorithm and guarantees that

at each step, there is a non-zero probability of moving to a state which is less fit

than the current state. SA can accept worse states according to some probability

which is called the acceptance probability. The acceptance probability is dependent

on the current temperature and the cost difference of the two candidate solutions.

Normally as the temperature decreases, the probability of accepting a worse state

decreases. Algorithm 3 indicates the high level outline of the SA algorithm for test

case generation.

21

Algorithm 3 : Simulated Annealing Outline


2: Let Tinitial, T and Tfinal be the initial, current and final temperatures respec-tively;

3: while UC is not empty do4: Generate a random initial solution;5: T = Tinitial;6: while T > Tfinal do7: for a specified number of iterations do8: Generate a candidate solution which is neighbor to the current one;9: if new solution is better than the current one then

10: Accept the move;11: else12: Probabilistically accept the move;13: end if14: end for15: Update temperature: T = T ∗ α;16: end while17: Let TF be the best test frame found;18: Add TF to the test set;19: Remove those tuples in UC that are covered by TF ;20: end while

2.5.4 Tabu Search

The Tabu Search (TS) algorithm [19, 51] is similar to SA algorithm in this way that

both of the algorithms allow the selection of a new state which is less fit than the

current state. In TS unlike SA, each neighbourhood search is exhaustive, rather

than random, ensuring that the state chosen at each step is the best neighbouring

option possible. The TS algorithm has only one input parameter, called the tabu

length and denoted by L. At each step of the search, the TS algorithm evaluates

the fitness of every solution in the neighbourhood of the current one. The solution

in the neighbourhood which is the most fit is selected as the new state, regardless

of whether it is more or less fit than the current state.

One concern in such an algorithm is that it is entirely possible for it to be caught

in an infinite loop and to stop propagating through the search space [19]. To avoid

this situation, the TS algorithm uses a List of forbidden moves, called the tabu list.

At any step, this list contains a history of the L most recent moves, L being the

specified tabu length. Prior to deciding which neighbouring solution shall become

the new state, the algorithm verifies that the move resulting in the most fit solution

22

is not contained in the tabu list. If the move is not forbidden, the fittest solution

is selected as the new state. Otherwise, the algorithm considers the move resulting

in the next best solution, and then the next until the move is not contained in the

tabu list. If every locally available move is forbidden, then the algorithm stops and

gets restarted using a new initial solution.

Algorithm 4 : Tabu Search Outline


2: while UC is not empty do3: Initialize the tabu memory;4: Generate a random initial solution;5: for a specified number of iterations do6: Generate the complete neighbourhood of current solution;7: Update current solution to the best neighbor solution which is not re-

stricted by the tabu list;8: Update the tabu list;9: end for

10: Let TF be the best test frame found;11: Add TF to the test set;12: Remove those tuples in UC that are covered by TF ;13: end while

2.5.5 AETG: Automatic Efficient Test Generator

The Automatic Efficient Test Generator (AETG) algorithm proposed in [9, 10] is

a greedy algorithm which constructs a test set by repeatedly adding a test frame

that covers a large number of non-covered value tuples (interactions). Because

of this, the resulting test set has the property that a test frame created earlier

has more significant impact on interaction coverage. This is an important practical

advantage because in practice it is often the case that not all test cases are executed

due to time or cost constraints; even in such a situation, the tester can maximize

the interaction coverage by simply performing the tests in an earliest first manner.

The AETG algorithm works in an incremental manner. This algorithm starts

with an empty test set and adds one test frame at a time until the 100% coverage

for all t-tuples of values is achieved. The AETG algorithm uses a greedy strategy

in selecting each test frame; it creates many different candidate test frames and

selects from them the one that covers the greatest number of new combinations.

Algorithm 5 presents a high level outline for AETG method for test case generation.

23

Algorithm 5 : AETG Outline


2: while UC is not empty do3: for k times do4: {Make a new candidate test frame:}5: Select the variable and the value included in most tuples in UC;6: Put the rest of the variables into a random order;7: For each variable in the sequence determined by previous step, select the

value that together with previous selected values in the candidate testframe is included in or covers most tuples in UC;

8: end for9: Let TF be the test frame among these k generated test frames that covers

the most tuples in UC;10: Add TF to the test set;11: Remove those tuples in UC that are covered by TF ;12: end while

The number of test frames generated by the AETG algorithm is related to the

number of constructed candidates (k in the algorithm) for each test frame. In

general, larger values of k yield smaller numbers of test frames. However, Cohen

et al. [9] report that using values larger than 50 will not dramatically decrease the

number of test frames.

2.5.6 IPO: In-Parameter-Order

For a system with two or more variables, the In-Parameter-Order (IPO) combina-

tion strategy [33, 52] generates a test suite that satisfies 100% pairwise coverage

for the values of the first two variables. The test suite is then extended to satisfy

pairwise coverage for the values of the first three variables, and continues to do

so for the values of each additional variable until all variables are included in the

test suite. To extend the test suite with the values of the next variable, the IPO

strategy uses two algorithms. The first algorithm presented as Algorithm 6 which

is related to horizontal growth of the test suite extends the existing test frames in

the test suite with values of the next variable.

The second algorithm, vertical growth, shown as Algorithm 7, creates additional

test frames such that the test suite satisfies pairwise coverage for the values of the

new variable.

IPO algorithm has some advantages over AETG algorithm. AETG is funda-

24

Algorithm 6 : IPO Outline (Horizontal Growth)

1: Let τ be a test suite that satisfies pairwise coverage for the values of parametersp1 to pi−1;

2: Assume that parameter pi contains the values v1, v2, . . . , vq;3: Let π be pairs between values of pi and values of p1 to pi−1;4: if |τ | ≤ q then5: for 1 ≤ j ≤ |τ | do6: Extend the jth test frame in τ by adding value vj;7: Remove from π pairs covered by the extended test frame;8: end for9: else

10: for 1 ≤ j ≤ q do11: Extend the jth test frame in τ by adding value vj;12: Remove from π pairs covered by the extended test frame;13: end for14: for q < j ≤ |τ | do15: Extend the jth test frame in τ by adding one value of pi such that the

resulting test covers the most number of pairs in π;16: Remove from π pairs covered by the extended test frame;17: end for18: end if

Algorithm 7 : IPO Outline (Vertical Growth)

1: Let τ be the set of already selected test frames;2: Let π be the set of still uncovered pairs;3: Let τ ′ be an empty set;4: for each pair in π do5: {Assume that the pair contains value w of variable pk, 1 ≤ k < i, and value

u of pi}6: if τ ′ contains a test frame with “-” (non-determined value) as the value of pk

and u as the value of pi then7: Modify this test frame by replacing the “-” with w;8: else9: Add a new test frame to τ ′ that has w as the value of pk, u as the value of

pi, and “-” as the value of every other parameter;10: end if11: end for12: τ = τ + τ ′;

25

mentally non-deterministic, whereas IPO is deterministic and this characteristic

makes it more predictable in terms of the size of the final test suite. Also compar-

ing the efficiency, AETG has a higher order of complexity, both in terms of time

and space, than IPO.

2.5.7 CATS Algorithm

The Constrained Array Test System (CATS) algorithm for generating test cases is

based on a heuristic algorithm that can be custom designed to satisfy t-wise cover-

age. The algorithm was described by Sherwood [47] and an outline of the algorithm

that generates a test suite to satisfy t-wise coverage is shown in algorithm 8.

Algorithm 8 : CATS Outline


2: Let Q be the set of all possible combinations (test frame) not yet selected;3: while UC is not empty do4: Select test frame TF from Q by finding the combination that covers most

pairs in UC. If more than one combination covers the same amount selectthe first one encountered;

5: Add TF to the test set;6: Remove TF from Q;7: Remove those tuples in UC that are covered by TF ;8: end while

The CATS algorithm has some similarities with AETG. The main difference

is that CATS examines the whole list of unused test frames to find one that adds

as much new coverage as possible while AETG constructs test case candidates

one parameter value at a time based on coverage information. The constructed

candidates are then evaluated to select the best possible, which is added to the test

suite. In CATS it is guaranteed that best test frame is always selected while in

AETG there are no guarantees. However, for large test problems, AETG is more

efficient since only a small set of test frames has to be evaluated in each step. The

non-deterministic nature of the AETG algorithm makes it impossible to exactly

calculate the number of test frames in a test suite generated by the algorithm.

26

2.6 Summary

This chapter presented an overview of existing software testing methods. We

discussed about requirements-based software testing and partition testing and re-

viewed the reasons why these techniques are popular among software testers. We

will come back to these two techniques in the next chapter that we try to put them

together in an integrated framework. We also reinvestigated the combinatorial test

case generation problem as an optimization problem and surveyed a group of dif-

ferent combination strategies in the literature which try to solve the NP-complete

test case generation problem using enumerative and local search techniques.

27

Chapter 3

A Framework for

Requirements-Based Partition

Testing

“Requirements-Based Testing” is a validation testing technique where we consider

each requirement and derive a set of test cases for that requirement. There is also

another technique in the literature, called “Partition Testing”, which is used for

minimizing the number of permutations and combinations of input data. In this

second technique the assumption is that input data and output results often fall

into different classes where all members of a class are related. Each of these classes

is an equivalence partition or domain where the program behaves in an equivalent

way for each class member. Hence, we just need test cases to be chosen from each

partition. Both of these techniques have advantages for the testing process.

Although both of these techniques are mature and are addressed widely in the

literature and despite the general agreement on both of these key techniques of

functional testing, a combination of them has not been studied systematically. In

this chapter we propose a framework along with a procedural process for testing a

system using “Requirements-Based Partition Testing” (RBPT). The idea is putting

the two techniques together might give a solid technique which can be used in

various domains for functional testing of large systems.

28

3.1 Proposed Layered Architecture for the RBPT

Framework

In our RBPT framework the process starts from a list of requirements1. These

requirements normally come from a requirements group in the organization. In

most cases, these requirements are defined in natural language and do not follow

any formal structure. Our goal is to process these requirements to come up with a

list of test cases that can verify the whole set of requirements effectively. As Fig-

ure 3.1 illustrates, we need some intermediate structured components to transmit

the available requirements through a process which ends up to a list of required

test cases.

Requirements

R1

Rn

R3

R2

Test Cases

TC1

TCm

TC3

TC2

Intermediate Layersof

Structured Components

Figure 3.1: The RBPT process starts with a list of requirements and ends up to alist of required test cases.

What we propose in our framework is a layered architecture and a step by

step process. This process can transform the requirements smoothly through those

intermediate layers toward the final list of required test cases. We use five layers of

intermediate components (other than requirements and test cases themselves) for

our process, namely: i) Features, ii) Atomic Features, iii) Test Scenarios, iv) Frame

Sets, and v) Test Frames (Test Configurations).

Figure 3.2 shows these layers of components and the relations from each layer

to the next one. In the remainder of this section, we will discuss these intermediate

components layer by layer and explain how we can go forward from each layer to

the next one.

1In this chapter our emphasis is on functional requirements.

29

Requirements

R1

Rn

R3

R2

Atomic Features

AF1

AFk

AF3

AF2

Features

F1

Fd

F3

F2

Test Scenarios

TS1

TSt

TS3

TS2

Frame Sets

FS1

FSt

FS3

FS2

Test Cases

TC1

TCm

TC3

TC2

Test Frames(Test Configurations)

TF1,1

TF1,p1

TF2,1

TF2,p2

TFt,1

TFt,pt

* * 1 1..* 1 1..* 1 1

1 *

1 *

1 *

1 1

Layer 1

Layer 5

Layer 4

Layer 3

Layer 2Figure 3.2: Five Intermediate Layers of Components Between Requirements andTest Cases and Their Relations

3.1.1 Features Layer

As mentioned before, requirements normally come from a requirements group in

the organization and do not follow any formal structure. This lack of accuracy in

defining the requirements may cause various problems for testers. Here is a partial

list of such problems.

• The requirements may not be understandable by testers.

• The requirements may need to be validated to assure that they define the

various aspects of the system in a correct way. They also may need to be

checked not to have any contradictions or inconsistencies among them.

• The language used in the definition of the requirements may not be suitable

to be used in the testing group. Testers sometimes prefer to use their own

words and language for describing the system. This helps them to understand

the requirements in a more clear manner.

• Sometimes a group of the requirements may not be testable. They may need

to be changed in a way that can be verified by testers.

30

• Some requirements may refer to the same thing and cause some redundancy

or one requirement may refer to more than one aspect of the system and may

be required to get broken.

In our framework, we think about testing group as an independent department

in the organization. This means that, because requirements come from a different

group, testers can not change them directly. Requirements are defined by require-

ments group and have got approved. Testers can only use this output as their

baseline and define their process based on that.2

To handle these problems, we put a layer in the process called “Features Layer”.

This layer is shown in Figure 3.2 as Layer 1. In one sentence we can say that features

are a translation of requirements for testing group. Testers should generate a list of

features considering the existing requirements. Putting such a layer in the process

gives testers this freedom to rephrase the requirements however they prefer, using

their own words and language. This layer in the process has a bunch of benefits for

testers. They can use it to get along with the problems they face in terms of the

requirements:

• Features are more understandable for testers. Simply because they have de-

fined them themselves! However they need to have a clear understanding of

the requirements to be able to generate the list of features. According to

Ostrand [39]:

. . . the tester has to ask the specification writer to clarify the

intention of a particular section or sentence. These questions are

themselves a form of testing, as they may expose errors in the spec-

ification before any code is written or in the implementation before

any code is executed.

If testers be able to produce the features from requirements, we can make sure

that they have reached a high level of comprehension of the requirements.

• Testers need to define the features according to the existing requirements.

This gives them this opportunity to validate the requirements at the same

2This does not mean that we deny the cooperation between different departments in the orga-nization. Testing group may be able to give some feedback to the requirements group to changethe defined requirements. However, since the requirements are a type of approved configurationitem and may get used by other departments, testers won’t be able to make changes in themfreely as they like.

31

time. At the end of the day when the list of features is ready we know that

they are consistent and complete and can be used for the rest of the process.

Again according to Ostrand [39]:

. . . not only do natural language specifications lack structure,

but they are frequently incomplete, ambiguous, or self-contradictory.

The process of transforming the specification into an intermediate

representation can be useful for revealing such problems.

• Unlike the requirements which come from requirements group, features are

items inside the testing group. Testers will be able to change them whenever

they like and also they can use their own words for describing the system.

This makes the whole process more understandable for testers.

• If a requirement is not testable according to the understanding of testers,

they can consider it in the features they produce. They may rephrase the

requirement or transform it to an equivalent form to make it testable. Even

they may produce a null feature assigned to that specific requirement to show

that they have not covered that requirement.

• The redundancy and complexity problem of the requirements can also get

fixed using the features layer. Redundant or similar requirements can get

mixed in one feature and complex requirements, on the other side, may get

broken to two or more simpler features.

We should note that although we put features layer as an intermediate compo-

nent layer in the process to manipulate the requirements in a better way, we still

need to keep an eye on the requirements themselves. By defining the features we

do no remove the requirements. They are still our baseline and our final goal is to

verify the system under test against those approved requirements. Features help us

to make the the testing process more smooth but we should keep track of the rela-

tions between the requirements and their correspondent features continuously. As

Figure 3.2 indicates the relation from requirements to features is a many to many

relationship. This provides complete flexibility to a tester to arrange the features

according to his/her own preference. A many to many relationship also supports

removing non-testable requirements in features layer, breaking one requirement to

two or more features, combining similar requirements into one feature and adding

extra features required by testers. All these relations should be recorded. We will

be able to use these data in order to trace back the process to the requirements

and measure the requirements coverage of the generated final test cases.

32

3.1.2 Atomic Features Layer

The second intermediate layer in our framework is called “Atomic Features Layer”,

as Figure 3.2 depicts. The purpose of this layer is breaking the features to a set of

atomic, non-breakable items which are named atomic features in this context.

After generating a list of testable features and assigning them to the correspond-

ing requirements, we should continue the process toward the generation of final test

cases. In most cases, the requirements and hence the related features are expressed

in a general manner. These general features may associate with different aspects

of the system under test or even different components in the system. Normally we

can not test such general features directly. Hence, before taking any other step,

we need to decompose these features as much as we can. For doing this, testers

should break each one of these general features to a non-empty set of atomic fea-

tures which can not get broken anymore. These atomic features should refer to

an atomic functionality of the system. They should expose a one-directional view

from the system under test.

Since features and atomic features are defined in natural language, giving a

comprehensive definition of them is not straightforward. However, the goal of

the layer is clear: we need to make features as much simple and pure as we can.

Later in the process we may come up with a situation where we find out that some

atomic features can be decomposed into simpler features. In those cases, we need to

come back to the atomic features and polish them for resolving the inconsistencies

occurred in the process.

Some people may argue that we can combine Layers 1 and 2 in the framework

and decompose the requirements to atomic features directly. This is possible, how-

ever it has some drawbacks. Separating features layer from atomic features makes

the process more smooth and applicable for testers. These two layers have some

similarities but they have different goals embedded in their nature. Features layer

tries to rephrase the requirements for the purpose of making them understandable

for testers while the main concern for atomic features is that these features get sep-

arated from each other as much as possible. Putting these layers together makes

the process hard to comprehend and manage for testers in practice.

As Figure 3.2 shows for the relations from Layer 1 to Layer 2, each feature

should get broken to one or more atomic features in the second layer. We assign

exactly one atomic feature to a feature on the first layer (i.e. copy the feature

itself to the next layer as an atomic feature) when it is atomic itself and can not

33

get decomposed anymore. This way we have exactly one parent feature for each

atomic feature and we can trace back to the requirements layer if required.

3.1.3 Test Scenarios Layer

Now that we have decomposed features to atomic features, next step would be

to start testing these atomic features. Actually requirements layer and the first

two intermediate layers in the framework help testers to get the requirements as an

input, understand and prepare them for testing. From the second intermediate layer

forward the main testing process gets started. In this third layer, testers should try

to figure out how they are supposed to test each atomic feature by writing a “Test

Scenario” for it.

A test scenario is a story that describes a hypothetical situation. In testing, we

check how the program copes with this hypothetical situation. The ideal scenario

test is i) motivating, ii) credible, iii) complex and iv) easy to evaluate [29]. Cem

Kaner, in his paper published at 2003, talks about these characteristics of an ideal

test scenario in detail. He also provides a thorough list of guidelines for generating

good scenarios. A good test scenario should also contain a list of pre-conditions,

steps and post-conditions which should be concerned during the testing of its related

atomic feature.

The test scenarios in our framework are slightly different from what Kaner intro-

duces. We limit the test scenarios to be defined for each atomic feature separately.

This means that testers go through atomic features one by one and define a set

of scenarios for testing that specific atomic feature. This way each test scenario

would be assigned to exactly one atomic feature. Normally, the scenarios described

by Kaner and used in scenario testing, try to test the business process flows from

end to end. This results in the following problems:

• Finding good scenarios will be a complex process and will require a large

amount of detailed and technical knowledge from different aspects of the

system under test. Gaining such information will take a large portion of

testers’ time.

• Figuring out which requirements are covered and which ones still need other

scenarios to be defined for them, will need an extraordinary amount of effort.

For getting along with these problems, we define the test scenarios by focusing

34

on the atomic features, one at a time. When we limit the domain to just one atomic

feature we can benefit from following enhancements:

• Testers deal with just a small part of the system under test, so they can

prepare suitable effective scenarios for testing that portion of the system more

efficaciously.

• By defining a set of scenarios for testing each atomic feature we get assured

that we have not missed any requirement to be tested. Since atomic features

cover all the requirements, we reach 100% requirement coverage by defining

the required test scenarios for each atomic feature.

The relation between atomic features and test scenarios in Figure 3.2 emphasizes

that we need at least one test scenario for each atomic feature.

3.1.4 Frame Sets Layer

In this step, we have a list of test scenarios which we should verify them. In our

framework, as its names implies, we use “Partition Testing” and more specifically

“Category Partition Method” [39] for this purpose. Again, we go through test

scenarios one by one. For verifying a specific test scenario we follow the standard

approach for category partition method, described by Ostrand et al. in their 1988

paper [39]:

The tester specifies categories of environments and input values,

partitions each category into a set of mutually exclusive choices, and

describes constraints that govern the interactions between occurrences

of choices from different categories.

As mentioned above testers should analyze the test scenarios one by one. Ac-

cording to the input values and environment conditions embedded in each test sce-

nario and its associated atomic feature, testers generate a list of categories (called

also “Variables” or “Parameters” in the literature). We use the term “Variable” in

the remainder of this chapter for referring to these categories. A “Variable” is a ma-

jor property or characteristic of an input parameter or environment condition that

can affect the test scenario’s execution behaviour. After determining the variables

involved in the test scenario, testers partition each variable into its distinct choices.

35

Each choice is in fact a partition class of the possible values that can be assigned

to one variable. In another word, we partition the possible values of each variable

to a number of mutually exclusive partition classes and call each class a choice.

Sometimes in the literature the term “Value” itself may be used to refer to a choice

(a class of values). Following that nomenclature we also use the term “Values”

instead of choices. Actually each value is a representative for one of the partition

classes of its related variable. We need just one representative from each partition

class because we assume that all the elements in that partition are equivalent in

terms of their effectiveness to reveal a defect. According to Ostrand [39]:

The idea is that all elements within an equivalence class are essen-

tially the same for the purposes of testing. If the testing’s main emphasis

is to attempt to show the presence of errors, then the assumption is that

any element of a class will expose the error as well as any other one.

If the testing’s main emphasis is to attempt to give confidence in the

software’s correctness, then the assumption is that correct results for a

single element in a class will provide confidence that all elements in the

class would be processed correctly.

After determining the variables and values involved in one test scenario, we

put all those variables and their related values together in a group called “Frame

Set”. Then associate that specific frame set to its corresponding test scenario. As

Figure 3.2 depicts, frame sets have a one-to-one relation with test scenarios and

contain the variables and values that affect the execution of their associated test

scenario. We call them frame sets because as we will see later, in the next chapter,

the variables and values of each frame set get used for generating a set of test

frames.

Another important issue is the priority of the variables, inside their containing

frame set. In another word, testers should prioritize the variables relatively. These

priorities will be used later for generating effective test frames. Assigning a high

priority to a variable means that as a tester we are supposed to consider different

values of that variable in testing to a higher extend of coverage. On the other

hand, assigning a low priority to a variable means that a lower extend of coverage

is required for testing the test scenario behind each frame set against the values of

that specific variable. We call a group of variables with same priority a “Priority

Set” as Figure 3.3 illustrates.

Most of the time testers will need extra information from other departments

such as “marketing” and/or “customer service” to find out which values or variables

36

...

Priority Set K(highest coverage)

Priority Set 2

Variables ... ...

Priority Set 1(lowest coverage)

...

LegendVariablePriority Set

Figure 3.3: A Frame Set contains a set of prioritized variables.

should be considered as testers’ main concern in the process. The historical data

can be also considered as a major factor in the prioritization of variables, if they

are available from previous similar projects and their results.

3.1.5 Test Frames Layer

Next layer in our framework is the “Test Frames Layer”. As its names implies,

in this layer we get the variables and values of each frame set and generate the

required “Test Frames” (Test Configurations). A test frame is an instance of a

frame set. For producing a test frame for an specific frame set, we assign to each

variable included in that frame set, one of its possible values.

Each frame set can be thought about as a two dimensional matrix. The columns

of that matrix are indexed by the variables included in the frame set. Each row

of the matrix corresponds to one test frame, by assigning one possible value to

each variable in the frame set. Figure 3.4 shows two frame sets, their variables and

generated test frames. The length of the test frames (the number of values included

in the test frame) may vary from one frame set to another. Actually the length of

the test frames inside each frame set is equal to the number of variables associated

with that specific frame set.

For generating test frames, there are different algorithms which combine the

values of the different variables based on some combinatorial strategy. These al-

gorithms are called “Combination Strategies” in the literature. Each combination

strategy is a test case selection method [20]. We had an overview of different com-

bination strategies in Section 2.5.

3.1.6 Test Cases Layer

The last layer of components in our framework is the “Test Cases Layer”. Test

cases are our final entities to be produced. The ultimate goal of our framework

37

Frame Sets and Including Test Frames

Var1,1 Var1,r1Var1,2

Val1,1,1FS1 TF1,1

TF1,p1 Val1,p1,r1Val1,p1,2Val1,p1,1

Val1,1,r1Val1,1,2

Vart,1 Vart,rtVart,2

Valt,1,1FSt TFt,1

TFt,pt Valt,pt,rtValt,pt,2Valt,pt,1

Valt,1,rtValt,1,2

ValueVariable

: The j-th Variable inside the i-th Frame Set

: The k-th Value assigned to the j-th Variable inside the i-th Frame Set

: The j-th Test Frame inside the i-th Frame Set

: The i-th Frame Set

: The number of Frame Sets

FSi

TFi,j

Vari,j

t

Vari,j,k

: The number of Test Frames inside the i-th Frame Setpi

ri : The number of Variables inside the i-th Frame Set

Figure 3.4: Each Test Frame assigns one possible value to each variable includedin the frame set.

38

was to generate a thorough list of test cases to verify the system under test against

its predefined requirements. In this last step, we generate one test case for each

test frame. There is a one-to-one relationship between test cases and test frames

as shown in Figure 3.2.

Each test case consists of a test frame (test configuration) and a scenario behind

that test frame. Hence for producing a test case we simply combine each test frame

with its associated test scenario. Since each test frame is associated with exactly

one frame set and each frame set has a one-to-one relation with exactly one test

scenario, so each test frame will have exactly one test scenario assigned to that.

By combining each test frame with its corresponding test scenario we end up to an

executable test case.

The test frames layer and the test cases layer can be produced automatically,

unlike the previous layers which require manual process by testers.

3.2 RBPT-Based Test Case Generation Process

Now that we have got introduced with the layered architecture of the proposed

framework, the next step will be to put everything together and define a test case

generation process which traverses these layers for generating a complete test suite.

The main goal of this process will be to formalize and automate the required ac-

tivities for testing a newly developed system.

Figure 3.5 presents our proposed process. It consists of two main stages: “Re-

quirements Modeling” and “Test Case Generation”. The first stage helps testers

build a requirements model that provides a semi-formal representation of the re-

quirements. This model is then used in test case generation stage to automatically

generate the optimal set of test cases that is needed to test the product adequately.

Next, testers execute those test cases (outside the scope of the framework) and

record the results. The following sections elaborate further on each aforementioned

stages.

3.2.1 Requirements Modeling

The first stage of our proposed process deals with the first two intermediate layers

of our framework: Features and Atomic Features Layers. The main goal of this

stage is to address the validation of the requirements to make sure that they are

39

Test Case GenerationRequirements Modeling

Time Line

Translation & Validation

Funtional Decomposition

Requirements Features

Test Scenario Definition

Variable-Value Identification

Test Scenarios Frame Sets Test Frames Test Cases

Variable-Value Combination

Atomic Features

Test Frame-Scenario

Combination

Legend

Manual Process

Automated Process Data Component

Data Flow

Figure 3.5: Test Case Generation Process Diagram

correct, complete, unambiguous, and logically consistent. As Figure 3.5 illustrates,

requirements modeling stage consists of two main steps, namely: i) Translation and

Validation and ii) Functional Decomposition.

In the first step mentioned above testers review the requirements. They try

to understand the requirements and validate them against their expectations from

the system under test. They also check that those requirements are in consistency

with each other. This step is critical in the whole process, because in this step we

develop a foundation for further testing and our future steps will rely heavily on

the correctness and completeness of the product requirements. Next, testers rewrite

the requirements as they like, using their own language and make a list of features

as described in Section 3.1.1. The output of this step will be a list of consistent

features which build the first intermediate layer in our proposed framework. The

relation between requirements and features should be carefully tracked during this

step and stored for further use.

The second step in the requirements modeling stage is functional decomposition

of the features. As we discussed in Section 3.1.2 we need to break the features as

much as possible from functionality point of view. In this step we do this decom-

position and generate a list of atomic features which build the second intermediate

layer of our framework. This decomposition helps us to concentrate on different

modules and issues more accurately and increase the effectiveness of our testing pro-

cess extensively. Again the relation between features and atomic features should

be stored for future use in the process.

40

3.2.2 Test Case Generation

Test case generation is the second stage of our proposed process. In this stage our

goal is to attain a necessary and sufficient set of test cases from software require-

ments to ensure that the design and code fully satisfy those requirements. Figure 3.5

depicts a four step plan for achieving this goal. These steps are called: i) Test Sce-

nario Definition, ii) Variable-Value Identification, iii) Variable-Value Combinations,

and iv) “Test Frame”-Scenario Combination.

The first step tries to use kind of scenario testing for verifying the atomic features

generated in the previous step. Identifying good test scenarios requires considering

an extended list of guidelines and specifications [29].

The second step starts to implement the category partition method [39] in our

process. In this step testers define a set of variables and their associated values for

each test scenario.

“Variable-Value Combination” step is simply for combining the chosen values

of the participating variables with the purpose of producing efficient and complete

set of test frames (test configurations).

In the final step, testers should put together the generated test frames and their

corresponding test scenarios. For having a meaningful test case we need to have

useful scenarios and also the parameters and variables inside that scenario should

get assigned an appropriate value. Test scenarios in the third layer of our proposed

framework are responsible for generation of perfect scenarios and test frames of

the fifth layer are in charge for producing efficient combinations of values. In this

step testers combine these two layers and the output will be a set of desired test

cases which can get executed for verifying the system under test against predefined

requirements.

3.3 Summary

In this chapter we designed a layered framework along with a procedural process

for testing a system using “Requirements-Based Partition Testing” (RBPT). We

discussed about each layer of the framework and how it is useful. The chapter

also put all the layers in the framework together and defined a test case generation

process. This process can be used to traverse the layers in the framework for

generating a complete test suite.

41

Chapter 4

Particle Swarm Optimization for

Test Case Generation

Choosing appropriate test cases is an essential part of software testing that can

lead to significant improvements in efficiency, as well as reduced costs of combina-

torial testing. Finding minimum size test sets is NP-complete. Therefore, artificial

intelligence-based search algorithms have been widely used for generating near-

optimal solutions. In this chapter, we propose a novel technique for test case gen-

eration, using Particle Swarm Optimization (PSO), an effective optimization tool

which has emerged in the last decade. Through some experiments, we illustrate how

this new technique can outperform other existing test generation methodologies.

4.1 Introduction

Software testing is a fundamental part of the software development life cycle (SDLC).

This stage of SDLC usually takes a big portion of the time required before soft-

ware release. Different testing methodologies are available to unravel any bugs

that might have been committed during previous phases. Choosing appropriate

test cases, can significantly increase the efficiency of testing, and at the same time

reduce any costs associated with it. This problem has been studied extensively in

the software engineering literature.

Partition testing and combinatorial testing refer to two sets of effective tech-

niques which are widely used for generating effective test cases. Partition testing

techniques [56, 39] divide the input domain of the program under test into subsets

with the testers choosing one or more elements from each subset. The assumption

42

is that all the elements in one subset are the same for the purpose of testing and

revealing bugs. These partitioning techniques result in input parameter models,

which are representations of the input space of the system under test via a set of

parameters and values for these parameters. Combinatorial testing [6] is usually

referred to a group of test case selection algorithms and techniques which use com-

binatorial designs for generating efficient test suites. Usually the large number of

parameters and values in the input parameter model needs a large set of combina-

tions to be tested. Combination strategies try to minimize the required number of

combinations for testing and generate test suites of reasonable size without loss of

effectiveness of test cases.

There are various combination strategies in the literature [20]. Some of them

are deterministic and generate a fixed number of test frames for the same input

while the others are non-deterministic and include some sort of randomness. Non-

deterministic algorithms may result in different test frames or even different number

of test frames during each execution on the same input. A group of deterministic al-

gorithms use algebraic notions such as orthogonal arrays and covering arrays. Since

construction of minimum size test set is NP-complete [33] it is unlikely to find an

efficient polynomial algorithm which always generates the optimal test set. Hence,

most of the non-deterministic algorithms get help from heuristic-based techniques

and artificial intelligence-based search algorithms. Various search-based algorithms

have been developed in the literature which try to generate a near-optimal test

set in a very short time. Genetic algorithms, ant colony algorithm and simulated

annealing are some of the well-known artificial intelligence-based techniques which

are used for test case generation [48, 38]. However these algorithms have started

getting competition from other heuristic search techniques, such as the Particle

Swarm Optimization (PSO). Various works (e.g. [59, 8, 24, 26]) show that particle

swarm optimization is equally well-suited or even better than some other techniques

in different domains. At the same time, a particle swarm algorithm is much simpler,

easier to implement and has just a few number of parameters that the user has to

adjust. These properties make PSO an ideal technique for test case generation.

In this paper we propose a new algorithm based on swarm particle optimization

technique (Section 4.3).

For comparing the resulting PSO combination strategy with other existing tech-

niques in the literature, we developed a framework which can be used for automatic

evaluation of different algorithms (Section 4.4.1). Normally, the size of the gener-

ated test suites gets used as a metric for comparison of different test case generation

techniques. In Section 4.4.1, we also propose a new effectiveness measure which

43

produces better and more precise assessment from the output of such algorithms.

Finally in Section 4.4.2, we use our proposed framework and effectiveness measure

for executing some experiments and show that PSO combination strategy can be

as effective as other existing algorithms and even beat them in some cases.

4.2 Test Suites as Covering Arrays

In the combinatorial design literature a covering array CAλ(N ; t, k, v), is an N × karray on v symbols such that every N × t sub-array contains all the t-tuples from

those v symbols at least λ times [11]. In other words, any subset of t-columns of

this array will contain each t-tuples of the symbols at least λ times. When λ = 1

we use the notation CA(N ; t, k, v). In such an array, t is called the strength, k the

degree and v the order. A covering array is optimal if it contains the minimum

possible number of rows. We use the notation CAλ(t, k, v) when the number of

rows is not determined yet. The following table shows a covering array with 9 rows

which covers all the 3-tuples of the 2 symbols (0 and 1 in this example) from 4

variables (parameters).

Test Frames Variables(Configurations) (Parameters)

1 00012 00103 01004 01115 10006 10107 11018 11109 1011

Table 4.1: CA(N = 9; t = 3, k = 4, v = 2)

This covering array is equivalent to a test suite which contains 9 test frames.

Each test frame assigns values to 4 variables where each variable has 2 values. The

whole test suite provides 3-wise coverage on the values of these 4 variables. The

mentioned covering array in the above example is not optimal. We can generate a

covering array with 8 (instead of 9) rows which gives us the same coverage. Finding

an optimal covering array of strength t, or equivalently generating a minimum size

test suite with t-wise coverage, is an NP-complete problem. If the number of values

44

for each variable is different we use mixed level covering arrays. A mixed level

covering array MCAλ(N ; t, k, (v1v2 . . . vk)) is a covering array where each variable

has vi distinct values and v =∑k

i=1 vi. Each column i of the mixed level covering

array contains only elements from the vi values of the ith variable. We use a

shorthand notation to describe covering arrays by combining equal entries in (vi :

1 ≤ i ≤ k). For example three entries each equal to two can be written as 23.

Consider a CA(N ; t, (wk11 wk22 . . . wks

s ). In this array we have:

k =s∑i=1

ki and v =s∑i=1

kiwi

The same shorthand notation can also be used for mixed level covering arrays.

4.3 Particle Swarm Optimization for Software Test-

ing

In this section, we will have an overview on particle swarm optimization technique

and we will show how we can use it for test case generation. Section 4.3.1 describes

the outline of the PSO, and Section 4.3.2 provides the details of applying PSO in

test case generation for combinatorial testing.

4.3.1 PSO Technique

Particle Swarm Optimization (PSO) is a very effective optimization tool, which

has emerged in the last decade. It was first introduced in 1995 by Kennedy and

Eberhart [17, 31]. Although, the original aim was to simulate the behavior of a

group of birds or a school of fish looking for food, it was quickly realized that it

can be applied in optimization problems.

PSO is similar to GA (Genetic Algorithm) and ACA (Ant Colony Algorithm)

in the way that it is a population-based meta heuristic algorithm. It is an approach

that manipulates a number of candidate solutions at once. A solution is referred

to as a particle, the whole population is referred to as a swarm. Each particle

represents a solution and moves in the search space to find better positions in the

space or in another word better solutions for the problem. Each particle also holds

the information essential for its movement such as:

45

• Its current position: xi

• Its current velocity: vi

• The best position it has achieved so far which is called personal best : pBesti

• The best position achieved by the particles in its neighborhood which is called

local best : lBesti

• The best position achieved by the particles in the whole swarm which is called

global best : gBesti

Particles adjust their velocity to move towards their personal best, local best

and the swarm’s global best.

PSO starts with a set of random solutions by assigning a random position to

each particle. Then similar to other local search algorithms it iteratively updates

the position of the particles in the hope of finding better solutions. During these

iterations each particle explores the search space by changing its position according

to an update rule. Update rule normally guides each particle toward the best

positions achieved by the particle itself, its neighbor particles and the best position

achieved by the whole swarm. This leads to further explorations of regions that

turned out to include more profitable solutions. Figure 4.1 shows the general outline

of the PSO algorithm.

• Initialize the swarm– While termination criteria is not met

• For each particle– Update the particle’s velocity (using update rule)– Update the particle’s position– Update the particle’s personal best

• End For• Update the lBest for each particle• Update the gBest

– End While• End

Figure 4.1: A Simple Outline for PSO with Synchronous Update

PSO can be synchronous or asynchronous depending on the location of lBest

update. This can be done outside the for loop as Figure 4.1 illustrates or can be

moved inside the loop. The former is called synchronous PSO and the latter is

asynchronous. Asynchronous version usually produces better results as it causes

46

the particles to use a more up-to-date information, however this might not be the

case, depending on the underlying problem.

4.3.2 PSO for Test Case Generation

Here we propose our method for test case generation using PSO.

Particle Initialization in Discrete PSO: In PSO each particle is a vector. The

order of each vector is the same as the order of the problem’s search space. In the

test case generation problem, we want to find test frames that give us better cov-

erage on the values of the related variables. If each test frame contains D variables

then the search space and hence the particle vectors will also be D-dimensional.

Since PSO is designed for solving continuous problems, each dimension of the par-

ticles should be able to hold any real number. However, the test case genera-

tion problem and many other optimization problems are set in a space featuring

discrete variables, so we require the use of a discrete version of PSO for dealing

with these problems. For dealing with this problem, first we initialize each parti-

cle vector with discrete values. Each particle will be a D-dimensional vector say

xj = (x1j , x

2j , . . . , x

Dj ) where each dimension is an integer between 0 and vi (number

of the values of the ith variable). Also, during the execution of the algorithm, we

simply round the calculated velocities to the nearest integer number. Since the ini-

tial positions and the velocities of particles are integers, each particle is guaranteed

to have integer positions at all times.

Particle Motions: In PSO particles move around the search space using an update

rule. The update rule is normally in the following form:

vdj (t) = ωvdj (t− 1) (4.1)

+ crdj (pBestdj (t− 1)− xdj (t− 1)) (4.2)

+ c′r′dj (lBest

dj (t− 1)− xdj (t− 1)) (4.3)

In each iteration of PSO, the velocity of each particle (vj) gets updated according

to the update rule and the particle moves around in the search space by adding

the newly calculated velocity to its current position. As we mentioned before, we

round the value of each dimension to the nearest integer number after each update.

47

In the above rule t is the time or the iteration number where j and d refer to

the particle index and the dimension respectively.

The first line of the update rule is called the inertia component which accommo-

dates the fact that a particle should not change its direction of movement suddenly.

The ω factor is the inertia weight which can be adjusted to increase or decrease

the amount of freedom a particle has for changing its direction. This parameter

regulates the trade-off between the global (wide-ranging) and local (nearby) explo-

ration abilities of the swarm. A small inertia weight facilitates global exploration

(searching new areas), while a large one tends to facilitate local exploration, i.e.

fine-tuning the current search area. A suitable value for the inertia weight usually

provides balance between global and local exploration abilities and consequently

results in a reduction of the number of iterations required to locate the optimum

solution. According to [2], it is better to initially set the inertia weight to a large

value, in order to promote global exploration of the search space, and gradually

decrease it to get more refined solutions.

The second and third lines of the update rule are called cognitive and social

components respectively. Here, c and c′ factors in the beginning of these two com-

ponents are acceleration coefficients which adjust the weight between cognitive and

social components of the update rule. Increasing c shifts the weight toward cogni-

tive component and causes the particles trust their own experience more and move

toward their pBest. On the other hand increasing c′ makes the social component

more impressive for particles and guides them toward their lBest. These factors

should get configured considering the problem domain.

For generating randomness in the update rule, two random factors r and r′ are

used which are random real numbers between 0 and 1. These two random factors

are generated for each dimension of each particle. Using the same random number

for all the dimensions of a particle results in linear PSO which usually produces

sub-optimal solutions.

Boundary Condition: In the test case generation problem each variable has a

fixed number of values and each dimension of the particle vector should refer to

one of the values of its correspondent variable. In other words, for a particle j,

all dimensions of its position xj should lie in [0, vi] where vi is the number of the

values for the ith variable. For meeting this condition during the execution of the

algorithm, we need to define a higher and lower bound for velocity dimensions and

set a boundary condition to handle the overflow situations where a particle flies out

of the permitted search space.

48

Setting the the maximum velocity dimension allowed for the particles, V imax,

is an important factor in PSO. If the maximum velocity is too high, particles

can fly past optimal solutions easily, resulting in poor final results. On the other

hand, if it is too low, particles can get stuck in local optimum. Since the particle

dimensions should be bound to the numbers between 0 and vi, defining V imax = vi/2

and restricting the velocity dimensions to be in [−V imax, V

imax], seems to be a good

choice for the test case generation problem because it both bounds the velocity and

provides coverage on the whole space.

Even after limiting the velocity of each particle, we need a boundary condition to

handle the cases where a particle goes out of the permitted limits. There are a few

different boundary conditions described in the literature such as absorbing walls,

reflecting walls and invisible walls which are proposed in [45]. For the absorbing

walls boundary condition when the particle reaches the boundary of a dimension,

the velocity in that dimension changes to zero. For the reflecting walls boundary

condition the sign of the velocity in the related dimension toggles when particle

reaches the boundary. Finally for the invisible walls boundary condition the particle

is allowed to fly through the boundary of the dimension, but the fitness of such a

particle outside the boundaries is not computed. Damping walls [27] is another

boundary condition which tries to lie in between the two absorbing and reflecting

techniques. In this boundary condition whenever a particle tries to escape the

search space in any of the dimensions, part of the velocity in that dimension gets

absorbed by the boundary and the particle is then reflected back to the search space

with a damped velocity along with a reversal of sign. Figure 4.2 depicts how these

boundary conditions work.

49

Bounded Dimension (d)

(t)djx

)(tdjx 1+

Absorbing


(t)djx

)(tdjx 1+

Invisible


(t)djx

)(tdjx 1+

Reflecting


(t)djx

)(tdjx 1+

Damping

Figure 4.2: Boundary conditions keep particles inside the limited area by changingtheir velocity in the appropriate direction.

The problem with these boundary conditions is that they all try to keep the

particle in the limited area by manipulating the velocity of the particle. Previously,

we mentioned that each particle adjusts its velocity to move towards its personal

best, local best and the swarm’s global best. This is how the particle gets directed

toward better solutions. The quick change of the velocity by boundary conditions

interrupts the particle’s smooth motion toward the target point. This is unavoidable

in the situations where the search space is not bounded from one or more directions.

However, in some other optimization problems such as test case generation, the

search space is limited from all sides. In order to avoid the interruption in the

velocity in such problems, we propose a novel boundary condition which we call

“cyclic walls”. This boundary condition can be used for the problems with finite

search space.

Figure 4.3 illustrates how cyclic walls boundary condition works. With this

boundary condition, whenever a particle tries to scape the search space in any of the

dimensions, it continues its motion with the same velocity, starting from the other

bound of that dimension. This happens by reseting the position in that dimension

to the other end point of the limited interval for that specified dimension. This

way the particles resides inside the search space without any extra manipulation

of its velocity. This can be helpful because the velocity and moving direction of

the particles are more important than their current position. Actually this is the

moving direction that guides the particles according to the experience of the swarm

toward more profitable regions and any interference in the velocity outside of the

update rule is not desirable.

50


(t)djx

)(tdjx 1+

Figure 4.3: Cyclic Walls Boundary Condition: Particle resides in the search spaceby jumping to the other end point of the dimension without any interference in itsvelocity.

Neighborhoods: As the above update rule implies, the motion of each particle

gets affected by the pBest of its neighbors. In another word particles share their

personal best information with each other. Selecting a proper neighborhood has

influence on the algorithm in many ways. It can improve the convergence speed of

the algorithm and helps in avoiding getting stuck at local optimum.

Various neighborhood topologies have been introduced [18, 30]. The most obvi-

ous topology is the gBest model in which all the particles of the swarm are neighbor

to each other and lBest is actually equal to gBest. This model causes fast propa-

gation of information in a swarm but can get stuck easily in local optimum.

Other models define a more specific neighborhood for each particle. One way

is to select the neighbors of a particle among those ones that are closest to it in

the search space. Being close gets defined based on the distance of the particles

in Cartesian space. This approach is more accurate but might be computationally

expensive. For reducing the cost we define the neighborhood based on the data

structure which maintains the particles. For example if the particles are stored in a

matrix (or array), we consider those ones next to each other in the matrix as neigh-

bors. Finally in order to have a fixed size for all the neighborhoods in the swarm,

we assume a cyclic nature for any of the data structures, when selecting neighbor

particles, even if that is not the case in reality. Still the size of the neighborhood

should be adjusted according to the experimental results in the domain.

Fitness Function: The fitness function is used to estimate the goodness of a

candidate solution. We define the fitness function F (s) for a test case “s” as the

number of new t-tuples from the values of the related variables that are not covered

by the given test set but are covered by “s”.

Stagnation Condition: In stagnation situation where there is no improvement

51

of gBest over a number of iterations, called stale period, we reset the position of

all the particles to refresh the search one more time.

Termination Criteria: The same as other local search techniques, various ter-

mination criteria can be used for PSO such as: i) Reaching a maximum number of

iterations, ii) Reaching a maximum number of evaluations, iii) Reaching an accept-

able solution, and iv) Reaching a maximum number of stagnations.

We use the second criteria for this problem and stop the algorithms after M

iterations where M is a fixed number which should be determined before evalua-

tions. We choose this criteria because first of all it gives a better estimate of the

cost of the algorithm in comparison with other criteria (The major cost in local

search algorithms is usually related to the evaluation part.). Secondly it results in

a fairer comparison of different algorithms. Those algorithms are better which gen-

erate better solutions using fewer number of evaluations and accordingly causing

less cost.

4.4 Empirical Experiments

For comparing our proposed PSO method with other algorithms, we have developed

a framework which can be used for automatic evaluation of different techniques.

Section 4.4.1 describes this framework from a high level point of view. This frame-

work is used in the Section 4.4.2 for a comprehensive empirical comparison between

different test case generation techniques.

4.4.1 Test Case Generation Framework

For having a precise evaluation of our proposed technique and a fir comparison

with other existing methods, we have developed a test case generation framework.

Figure 4.4 schematically presents the high level structure of this framework.

Test Case Generation Process: The framework obtains its settings as an input

from user (Figure 4.4). These settings include parameters such as the number of

test sets to be generated, number of variables in each test set and maximum number

of values for each variable. Then “Automatic Random Test Set Generator” builds

a group of required test sets and “Input Tables” and stores the settings of each

test set in one corresponding input table. The input tables also maintain the

52

Figure 4.4: Test Case Generation Framework

relationships and constraints between different variables or values. As shown in

Figure 4.4, these input tables get passed to the “Test Frame Generator” which has

access to a group of “Combination Strategies”. Then, the test frame generator

executes each combination strategy on all of the “Input Tables” for a fixed number

of times and places the “Test Frames” generated by each combination strategy for

each input table in a separate “Frame Set”. Finally, “Frame Set” analyzer goes

through all of the frame sets and computes the required metrics and gathers a list

of predetermined statistics. The output of the framework will be a set of “Results”

according to the values of those metrics which ranks the combination strategies by

their efficiency.

Generated Test Objects: Although lots of research has been done on different

combination strategies there is no any comprehensive set of input data to be used

for evaluation of these techniques. To deal with this lack of required input we

decided to create a large set of test sets that feature different grades of complexity.

AS we mentioned in Section 4.2, each test set with variables having varying number

of values is equivalent to a multi level covering array MCA(N ; t, k, (v1v2 . . . vk)). k

and vi which refer to the number of variables and values can be any positive integer

greater than 2. We defined an upper bound of 20 for k and an upper bound of 10

for vi which is the case in most of the practical situations. This way we will have a

19× 9 table, each cell of which with indices (i, j) refers to test sets with i variables

and maximum number of j values for each variable. According to such a table,

“Automatic Random Test Set Generator” 4.4 builds one test set related to each

53

cell of the table. This provides us 171 test sets with various amount of complexity

which can be used in benchmarking of different combination strategies.

Test Case Generation Metric: After executing different combination strategies

on the test objects produced by the framework’s test set generator, we need to

evaluate the output of these different algorithms. Normally the size of the generated

test suites gets used as a criteria for evaluation of each technique. Since most of the

existing techniques perform similar in terms of efficiency, the size of the produced

test suites will be also close to each other. In order to have a better evaluation of

the results and exaggerate the distance between these techniques, we need a finer

metric to be used for comparison. Here, we propose a novel metric for assessing the

output of the test case generation techniques which helps us to have a more precise

benchmarking.

One of the main goals in the test case generation problem is to reach higher

coverage using fewer number of test cases. Using the test suite size as the eval-

uation metric just shows a snapshot from the final result at the end point where

the combination strategies have reached 100% coverage. The rate of the coverage

growth during the test case generation process gets ignored, though it plays a big

role in recognition of profitable combination strategies. In practice, a test suite is

more preferable if it reaches higher coverage faster than the others. We formalized

this goal by introducing a measure called “Percentage Uncovered Tuples Extension

(PUTE)”. For a test set S with strength t and size n, we define PUTE as the

summation of the uncovered percentage of the t-tuples over all of the test frames

TFi included in the test set:

PUTE(S) =n∑i=1

U(TFi) (4.4)

Where U(TFi) is the percentage of the uncovered t-tuples after adding the ith test

frame. PUTE can get any value greater than zero while lower PUTE numbers

mean faster (better) coverage growth rates. PUTE is similar to AFPD metric

defined in [46] which is a measure of how rapidly a prioritized test suite detects

faults.

To illustrate this measure, consider an example test set with 3 variables of

size 2 and three test suites, S1, S2 and S3, generated by different algorithms on

this test set in Table 4.2. These three test suites provide full 2-wise coverage on

the values of the variables in the test set in different ways. Figure 4.5 shows the

54

1 2 3 40

10

20

30

40

50

60

70

80

90

100

(a) PUTE= 150

1 2 3 4 5 60

10

20

30

40

50

60

70

80

90

100

(b) PUTE= 175

1 2 3 4 5 60

10

20

30

40

50

60

70

80

90

100

(c) PUTE= 225

Figure 4.5: An example of 3 test suites, illustrating the PUTE measure.

Variable ValuesV ar1 a, bV ar2 A, BV ar3 1, 2

Test Test Percentage PercentageSuite Frames Coverage Uncovered

a A 1 25% 75%a B 2 50% 50%

S1 b A 2 75% 25%b B 1 100% 0%a A 1 25% 75%a B 2 50% 50%b A 2 75% 25%

S2 b B 2 83% 17%b A 1 92% 8%b B 1 100% 0%a A 1 25% 75%a A 2 42% 58%a B 1 58% 42%

S3 a B 2 67% 33%b A 1 83% 17%b B 2 100% 0%

Table 4.2: An example test set and 3 test suites which provide 100% 2-wise coverageon the values of the variables in the test set.

55

percentage of the covered tuples versus the number of test frames generated by each

test suite. For example in Figure 4.5a, the first test frame covers 3 of the 2-tuples,

producing 25 percent 2-wise coverage. The second test frame covers 3 more tuples

and add another 25 percent to whole coverage resulting in 50% 2-wise coverage.

In Figure 4.5, the area inside the inscribed rectangles (dashed boxes) represents

the percentage of tuples covered over the corresponding number of test frames of

the test suite. The solid lines connecting the corners of the inscribed rectangles

interpolate the gain in the percentage of tuples covered. The area above the curve

thus represents the the summation of the uncovered percentage of the 2-tuples over

all of the test frames included in the test set. This area is the test suite’s percentage

uncovered tuples extension (PUTE). The first test suite reaches full coverage by

just 4 test frames while the other two test suites include 6 test frames. Although

S2 and S3 have the same number of test frames included, S2 is more preferable,

because it reaches the higher coverage faster. The PUTE measure gives us the

ranking we need by reflecting lower values for better solutions and we can use it

for comparison of combination strategies even if they have the same number of test

frames in their final result.

4.4.2 Experimental Comparison with Other Algorithms

Various combination strategies have been developed and deployed for combinato-

rial test case generation. Here, in this section we compare our PSO technique with

some of these existing algorithms. Table 4.3 summarizes the selected algorithms

and the settings that we have used for executing them along with a list of references

for each algorithm. These techniques belong to different categories. For example,

ACA, GA and PSO are population-based search algorithms while SA and TS are

trajectory search methods. AETG and IPO are also 2 other well-known greedy

techniques which are included in our empirical study. We have also included Ran-

dom Testing method to see how much other techniques perform above random. All

these algorithms are implemented as part of our test case generation framework.

For our proposed PSO method, we used some of the settings suggested in [59]

and [34] as shown in Table 4.3 and executed the algorithm in asynchronous update

mode, with a neighborhood size of 10.

Our experiments are done using the 171 test objects which are generated ran-

domly by our random test set generator. In order to acquire results with sufficient

statistical significance, all our experiments were repeated 5 times. The comparison

of algorithms is in terms of 2-wise coverage.

56

Common SettingsParameter ValueCoverage Criteria 2-wise CoverageStale Period 10Max No. of Evaluations 1200

Parameter ValueNo. of Particles 40Inertia Weight (ω) Linearly decreased

from 0.9 to 0.4Acceleration Coefficients (c, c′) 1.49445Max Velocity Dimension V i

max vi/2Boundary Condition Cyclic WallspBest Update Mode AsynchronousNeighborhood Size 10

Parameter ValueNo. of Ants 20Initial Pheromone 0.4Update Factor 0.01Decay Factor 0.5

Parameter ValuePopulation Size 25Elite Size 1Selection Probability 0.8Crossover Probability 0.75Mutation Probability 0.03Massive Mutation Probability 0.25

Parameter ValueInitial Temperature 1000Final Temperature 0.01Temperature Reduction Factor 0.85No. of Iterations per Temperature 100

Parameter ValueTabu Length 15

Parameter ValueRepeat 50

Random Testing

Table 4.3: Selected Combination Strategies and their Settings

57

Figure 4.6 reflects the results of our experiments. It can be seen that PSO

algorithm produces test suites at least as effective as those produced by other

algorithms.

4.5 Summary

Particle Swarm Optimization (PSO) is a relatively recent heuristic search method

that is based on the idea of collaborative behavior of animals living in swarms. In

this chapter, we proposed a new test case generation algorithm for t-wise testing

based on particle swarm optimization technique. Other contributions of this work

are as follows:

• We proposed a simple boundary condition, called cyclic walls, for PSO that

can be used for solving the problems which have a finite search space.

• We developed a framework for automatic comparison of different test case

generation algorithms.

• We introduced a new test case generation metric for assessing the effectiveness

of the test suites, generated by different combination strategies.

• We illustrated through empirical experiments that PSO can be as effective as


We would like to note that more experiments with further test objects taken

from various application domains must be carried out in order to be able to make

more general statements about the relative performance of particle swarm opti-

mization and other techniques when applied to software testing. Systematically

varying the algorithm settings for the experiments would also help to draw more

comprehensive conclusions.

58

PSO

0

2000

4000

6000

8000

10000

12000

14000

16000

18000

20000

1 10 19 28 37 46 55 64 73 82 91 100 109 118 127 136 145 154 163

Test Objects

Ave

rage

PU

TE o

ver 5

runs

(a) Particle Swarm Optimization

ACA

0

2000

4000

6000

8000

10000

12000

14000

16000

18000

20000

1 10 19 28 37 46 55 64 73 82 91 100 109 118 127 136 145 154 163

Test Objects

Ave

rage

PU

TE o

ver 5

runs

(b) Ant Colony Algorithm

GA

0

2000

4000

6000

8000

10000

12000

14000

16000

18000

1 10 19 28 37 46 55 64 73 82 91 100 109 118 127 136 145 154 163Test Objects

Ave

rage

PU

TE o

ver 5

runs

(c) Genetic Algorithms

SA

0

5000

10000

15000

20000

25000

1 10 19 28 37 46 55 64 73 82 91 100 109 118 127 136 145 154 163

Test Objects

Ave

rage

PU

TE o

ver 5

runs

(d) Simulated Annealing

TS

0

10000

20000

30000

40000

50000

60000

70000

1 10 19 28 37 46 55 64 73 82 91 100 109 118 127 136 145 154 163

Test Objects

Ave

rage

PU

TE o

ver 5

runs

(e) Tabu Search

AETG

0

5000

10000

15000

20000

25000

1 10 19 28 37 46 55 64 73 82 91 100 109 118 127 136 145 154 163

Test Objects

Ave

rage

PU

TE o

ver 5

runs

(f) Automatic Efficient Test CaseGenerator

IPO

0

2000

4000

6000

8000

10000

12000

14000

16000

18000

20000

1 10 19 28 37 46 55 64 73 82 91 100 109 118 127 136 145 154 163

Test Objects

Ave

rage

PU

TE o

ver 5

runs

(g) In Parameter Order

Random

0

2000

4000

6000

8000

10000

12000

14000

16000

18000

20000

1 10 19 28 37 46 55 64 73 82 91 100 109 118 127 136 145 154 163

Test Objects

Ave

rage

PU

TE o

ver 5

runs

(h) Random Testing

Figure 4.6: Comparison of PSO with other existing test case generation algorithms.

59

Chapter 5

Conclusion and Future Work

In this chapter, we summarize the findings of the thesis and outline future directions

that could be pursued from this research. Section 5.1 summarizes the contributions

of the work presented in the thesis while Section 5.2 outlines some potential future

work to extending this research.

5.1 Thesis Contributions

This work addressed two problems. First was the problem of not having a sys-

tematic clear approach for using requirements-based testing and partition testing

together. We proposed a layered framework for this purpose and showed that how

a testing group can use this framework as a guideline for testing any software sys-

tem against its requirements defined by the requirements group of an organization.

The second problem was about effective combinatorial test case generation. We

argued that since this is an NP-complete problem, typically artificial intelligence-

based algorithms and enumerative and local search methods are used for finding a

near-optimal solution. We also proposed a novel algorithm using particle swarm op-

timization technique and discussed that this method can be as effective as existing

techniques and can outperform most of them in average. The major contributions

of the framework can be summarized as follows:

• Design and develop a novel layered framework which integrates requirements-

based testing and partition testing techniques into one simple and straight-

forward testing process.

60

• Design and develop a novel test case generation technique using particle

swarm optimization technique. We showed that this technique can outper-

form other existing methods.

• Propose a simple boundary condition, called cyclic walls, for PSO that can

be used for solving the problems which have a finite search space.

• Design and develop a framework for automatic comparison of different test

case generation algorithms.

• Propose a new test case generation metric for assessing the effectiveness of

the test suites, generated by different combination strategies.

• Set-up empirical experiments which confirm that PSO can be as effective as


5.2 Future Work

There are numerous ways to extend this research work. Adding more capabilities

to our proposed RBPT framework using other testing techniques can be the subject

of future research. Some other issues which need more investigation and research

are: realizations of the framework and empirical studies on the performance of the

framework; using different variations of PSO algorithm; using different parame-

ters for PSO; considering constraints among variables and values for combinatorial

testing; and prioritizing test cases according to the available data through RBPT

framework.

• Adding Capabilities to RBPT Framework: As we mentioned earlier in

the thesis, a large list of various software testing methods have been emerged

during time. Normally each technique has some advantages over other meth-

ods in specific situations. We can use the strengths of different testing tech-

niques and apply them on different layers of our proposed framework to add

more capabilities to our testing process.

• Realizations of the RBPT Framework and Empirical Studies: Unfor-

tunately lack of required data prevented us to be able to realize our framework

and compare its effectiveness against existing approaches. More studies would

help us better understand pros and cons of the framework. Future studies

61

can be conducted on a large set of software systems, ideally from different do-

mains. Large software systems from industry, the likely users of approaches

such as this, are good candidates for such experimentation.

• Using Variations of PSO: Different variations of PSO have been studied

in the literature. Future studies can use these variations with different neigh-

borhood topologies, different boundary conditions and also cooperative and

adaptive versions of PSO to conduct more experiments, in hope of reaching

a more solid test case generation algorithm.

• Considering Variable-Value Constraints: In some cases, there will be

conflicts among different variables and values in the input domain of the

software under test. A conflict exists when the result of combining two or

more values of different variables does not make sense. In their basic forms,

combination strategies generate test suites that satisfy the desired coverage

without using any semantic information. Hence, invalid test cases may be

selected as part of the final test suite. How to handle these conflicts has

not been adequately investigated. Future studies can clarify which constraint

handling methods work for which combination strategies and how the size of

the test suite and efficiency of the algorithms is affected by the constraint

handling method.

• Test Case Prioritization: One of the major concerns in our RBPT frame-

work is to consider the required data for test case prioritization. Test case

prioritization techniques schedule test cases for execution in an order that

attempts to maximize some objective functions. A variety of objective func-

tions are applicable such as rate of fault detection. An improved rate of fault

detection during regression testing can provide faster feedback on a system

under regression test and let debuggers begin their work earlier than might

otherwise be possible. Future studies can investigate integration of these pri-

oritization techniques into our RBPT framework. Lots of useful raw data can

be gathered in different layers of the framework which can be used as an input

for these prioritization techniques.

62

References

[1] IEEE standard glossary of software engineering terminology.

http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=00159342, 1990.

10

[2] T. Beielstein, K. E. Parsopoulos, and M. N. Vrahatis. Tuning PSO parameters

through sensitivity analysis. Sonderforschungsbereich (SFB) 531, 2002. 48

[3] B. Beizer. Software System Testing and Quality Assurance. Van Nostrand

Reinhold/co Wiley, 1984. 1

[4] P. Berander and A. Andrews. Requirements Prioritization, pages 69–94. 2005.

13

[5] R. V. Binder. Testing Object-Oriented Systems: Models, Patterns, and Tools.

Addison-Wesley Professional, 1999. 1

[6] K. Burr and W. Young. Combinatorial test techniques: Table-based automa-

tion, test generation and code coverage. In Proceedings of the International

Conference on Software Testing Analysis and Review, pages 503–513, 1998. 9,

43

[7] K. Burroughs, A. Jain, and R. L. Erickson. Improved quality of protocol

testing through techniques of experimental design. In Proceedings of the IEEE

International Conference on Communications, pages 745–752 vol.2, 1994. 3

[8] B. Clow and T. White. An evolutionary race: A comparison of genetic algo-

rithms and particle swarm optimization for training neural networks. In Pro-

ceedings of the International Conference on Artificial Intelligence, volume 2,

pages 582–588, 2004. 43

[9] D. M. Cohen, S. R. Dalal, M. L. Fredman, and G. C. Patton. The AETG

system: An approach to testing based on combinatorial design. IEEE Trans-

actions on Software Engineering, 23:437–444, 1997. 1, 23, 24

63

[10] D. M. Cohen, S. R. Dalal, J. Parelius, and G. C. Patton. The combinatorial

design approach to automatic test generation. IEEE Software, 13:83–88, 1996.

3, 23

[11] M. B. Cohen, P. B. Gibbons, W. B. Mugridge, and C. J. Colbourn. Construct-

ing test suites for interaction testing. In Proceedings of the 25th International

Conference on Software Engineering, pages 38–48, 2003. 18, 44

[12] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to

Algorithms. The MIT Press, 2nd edition, 2001. 16

[13] S. R. Dalal, A. Jain, N. Karunanithi, J. M. Leaton, C. M. Lott, G. C. Patton,

and B. M. Horowitz. Model-based testing in practice. In Proceedings of the

International Conference on Software Engineering, pages 285–294, 1999. 9

[14] M. Dorigo, V. Maniezzo, and A. Colorni. The ant system: Optimization by

a colony of cooperating agents. IEEE Transactions on Systems, Man, and

Cybernetics, Part B, 26:29–41, 1996. 19

[15] I. S. Dunietz, W. K. Ehrlich, B. D. Szablak, C. L. Mallows, and A. Iannino.

Applying design of experiments to software testing: experience report. In Pro-

ceedings of the 19th International Conference on Software Engineering, pages

205–215, 1997. 3

[16] J. W. Duran and S. C. Ntafos. An evaluation of random testing. IEEE Trans-

actions on Software Engineering, 10:438–444, 1984. 8, 9

[17] R. Eberhart and J. Kennedy. A new optimizer using particle swarm theory. In

Proceedings of the 6th International Symposium on Micro Machine and Human

Science, pages 39–43, 1995. 45

[18] R. C. Eberhart, R. Dobbins, and P. K. Simpson. Computational Intelligence

PC Tools. Morgan Kaufmann Pub, 1996. 51

[19] F. Glover and F. Laguna. Tabu Search. Kluwer Academic Publishers, 1997.

22

[20] M. Grindal, J. Offutt, and S. F. Andler. Combination testing strategies: a

survey. Software Testing, Verification and Reliability, 15:167–199, 2005. 37, 43

[21] W. J. Gutjahr. Partition testing vs. random testing: the influence of uncer-

tainty. IEEE Transactions on Software Engineering, 25:661–674, 1999. 9

64

[22] D. Hamlet. Foundations of software testing: Dependability theory. In Foun-

dations of Software Engineering, pages 128–139, 1994. 10

[23] D. Hamlet and R. Taylor. Partition testing does not inspire confidence. IEEE

Transactions on Software Engineering, 16:1402–1411, 1990. 9, 14

[24] R. Hassan, B. Cohanim, and O. Weck. A comparison of particle swarm

optimization and the genetic algorithm. In Proceedings of the 46th

AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Mate-

rials Conference, 2005. 43

[25] W. C. Hetzel and B. Hetzel. The Complete Guide to Software Testing. John

Wiley and Sons, Inc., 1991. 6, 7

[26] R. J. W. Hodgson. Partical swarm optimization applied to the atomic cluster

optimization problem. In Proceedings of the Genetic and Evolutionary Com-

putation Conference, pages 68–73, 2002. 43

[27] T. Huang and A. S. Mohan. A hybrid boundary condition for robust particle

swarm optimization. IEEE Antennas and Wireless Propagation Letters, 4:112–

117, 2005. 49

[28] M. L. Hutcheson. Software Testing Fundamentals: Methods and Metrics. Wi-

ley, 1st edition, 2003. 1

[29] C. Kaner. The power of ‘what if...’ and nine ways to fuel your imagination.

Software Testing and Quality Engineering Magazine, 5:16–22, 2003. 34, 41

[30] J. Kennedy. Small worlds and mega-minds: Effects of neighborhood topology

on particle swarm performance. In Proceedings of the Congress on Evolutionary

Computation, volume 3, pages 1931–1938, 1999. 51

[31] J. Kennedy and R. Eberhart. Particle swarm optimization. In Proceedings

of the IEEE International Conference on Neural Networks, volume 4, pages

1942–1948, 1995. 45

[32] D. R. Kuhn and M. J. Reilly. An investigation of the applicability of design

of experiments to software testing. In Proceedings of the 27th Annual NASA

Goddard/IEEE Software Engineering Workshop, pages 91–95, 2002. 3

[33] Y. Lei and K. C. Tai. In-Parameter-Order: A test generation strategy for

pairwise testing. In Proceedings of the 3rd IEEE International High-Assurance

Systems Engineering Symposium, pages 254–261, 1998. 18, 24, 43

65

[34] J. J. Liang, A. K. Qin, P. N. Suganthan, and S. Baskar. Comprehensive learn-

ing particle swarm optimizer for global optimization of multimodal functions.

IEEE Transactions on Evolutionary Computation, 10:281–295, 2006. 56

[35] J. Martin. An Information Systems Manifesto. Prentice Hall, 1984. viii, 12,

13

[36] G. Mogyorodi. Requirements-based testing: An overview. In Proceedings of the

39th International Conference and Exhibition on Technology of Object-Oriented

Languages and Systems, pages 286–295, 2001. 11, 12

[37] G. J. Myers. The Art of Software Testing. John Wiley and Sons, 1979. 6

[38] K. J. Nurmela and P. R. J. Ostergard. Constructing covering designs by sim-

ulated annealing, 1993. 43

[39] T. J. Ostrand and M. J. Balcer. The category-partition method for specifying

and generating fuctional tests. Commun. ACM, 31:676–686, 1988. 15, 31, 32,

35, 36, 41, 42

[40] R. H. J. M. Otten and L. P. P. P. Van Ginneken. The Annealing Algorithm.

Kluwer Academic Publishers, 1st edition, 1989. 20

[41] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms

and Complexity. Dover Publications, 1998. viii, 16, 17

[42] W. E. Perry. A Standard for Testing Application Software. Auerbach Publi-

cations, 1986. 7

[43] S. C. Reid. An empirical analysis of equivalence partitioning, boundary value

analysis and random testing. In Proceedings of the 4th International Software

Metrics Symposium, pages 64–73, 1997. 9

[44] D. J. Richardson and L. A. Clarke. A partition analysis method to increase

program reliability. In Proceedings of the 5th International Conference on Soft-

ware Engineering, pages 244–253, 1981. 14

[45] J. Robinson and Y. Rahmat-Samii. Particle swarm optimization in electromag-

netics. IEEE Transactions on Antennas and Propagation, 52:397–407, 2004.

49

66

[46] G. Rothermel, R. H. Untch, C. Chengyun, and M. J. Harrold. Prioritizing

test cases for regression testing. IEEE Transactions on Software Engineering,

27:929–948, 2001. 54

[47] G. Sherwood. Effective testing of factor combinations. In Proceedings of the

3rd International Conference on Software Testing, Analysis & Review, 1994.

26

[48] T. Shiba, T. Tsuchiya, and T. Kikuno. Using artificial life techniques to gener-

ate test cases for combinatorial testing. In Proceedings of the 28th Annual In-

ternational Computer Software and Applications Conference, volume 1, pages

72–77, 2004. 19, 43

[49] S. Siegel. Object Oriented Software Testing: A Hierarchical Approach. John

Wiley & Sons, 1st edition, 1996. 1, 2

[50] C. U. Smith. Performance Engineering of Software Systems. Addison-Wesley

Pub, 1990. 9

[51] J. Stardom. Metaheuristics and the search for covering and packing arrays.

PhD thesis, Simon Fraser University, 2001. 22

[52] K. C. Tai and Y. Lei. A test generation strategy for pairwise testing. IEEE

Transactions on Software Engineering, 28:109–111, 2002. 3, 24

[53] N. Tracey, J. Clark, K. Mander, and J. McDermid. An automated framework

for structural test-data generation. In Proceedings of the 13th IEEE Interna-

tional Conference on Automated Software Engineering, pages 285–288, 1998.

1

[54] F. I. Vokolos and E. J. Weyuker. Performance testing of software systems. In

Proceedings of the 1st International Workshop on Software and Performance,

pages 80–87, 1998. 10

[55] I. Weerd, S. Brinkkemper, R. Nieuwenhuis, J. Versendaal, and L. Bijlsma. To-

wards a reference framework for software product management. In Proceedings

of the 14th IEEE International Requirements Engineering Conference, pages

312–315, 2006. 13

[56] E. J. Weyuker and T. J. Ostrand. Theories of program testing and the appli-

cation of revealing subdomains. IEEE Transactions on Software Engineering,

SE-6:236–246, 1980. 14, 42

67

[57] A. W. Williams. Determination of test configurations for pair-wise interaction

coverage. In Proceedings of the IFIP TC6/WG6.1 13th International Confer-

ence on Testing Communicating Systems: Tools and Techniques, pages 59–74,

2000. 18

[58] A. W. Williams and R. L. Probert. A measure for component interaction

test coverage. In Proceedings of the ACS/IEEE International Conference on

Computer Systems and Applications, page 304, 2001. 2

[59] A. Windisch, S. Wappler, and J. Wegener. Applying particle swarm opti-

mization to software testing. In Proceedings of the 9th Annual Conference on

Genetic and Evolutionary Computation, pages 1121–1128, 2007. 43, 56

68

Date post:	12-Aug-2020
Category:	Documents
Upload:	others
View:	5 times
Download:	0 times

A Requirements-Based Partition Testing Framework Using ......A Requirements-Based Partition Testing...

Documents