Misc Topics in Testing
McCabe’s Cyclomatic Complexity
Number of “linearly independent paths”
– useful in defining test coverage (see later)
– counts the number of closed loops in the graph
• FA() = 0
• Fs(m1,m2) = m1 + m2
• FC(m1,m2) = m1 + m2 + 1
• Fl(m1) = m1 + 1
v(P) = #edges - #nodes +2 (Familiar?)
McCabe: Example
Edges = 12
Nodes = 10
v = 12 - 10 + 2 = 4
4 Lin. Indep. Paths
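The formula v(P) = #edges - #nodes + 2 can be checked mechanically. The sketch below is only an illustration: the function name and the small flowgraph (an if-then-else followed by a while loop) are hypothetical and are not the 12-edge graph from the slide.

```python
def cyclomatic_complexity(edges):
    """Return v(P) = #edges - #nodes + 2 for a single-entry, single-exit flowgraph."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Hypothetical flowgraph: if-then-else on nodes 1-4, then a while loop on nodes 4-6.
example_edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 4), (4, 6)]

print(cyclomatic_complexity(example_edges))
```

With 7 edges and 6 nodes this gives 7 - 6 + 2 = 3, which is one more than the number of decision nodes (1 and 4).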
More generally...
• Can define a set of prime flowgraphs
– those which cannot be broken down by nesting
– corresponding to the statements of the language
• And a measure for each
• Yields a Prime Decomposition Theorem:
– “The decomposition of a flowgraph into primes is unique”
A more general approach to CFGs
• For any language, a Prime Flowgraph is one which cannot be broken down by sequencing or nesting
[Figure: example prime flowgraphs, labelled “if then”, “repeat until”, “cases”, ..., and an unnamed prime marked “??”]
Hierarchical measures (again)
• Define measure for each prime flowgraph
• Define measure for sequencing
• Define measure for nesting
Eg. number of nodes:
nd(P) = #nodes in P, for each prime P
$nd(F_1; \ldots; F_n) = \sum_{i=1}^{n} nd(F_i) - n + 1$
$nd(F(F_1, \ldots, F_n)) = nd(F) + \sum_{i=1}^{n} nd(F_i)$
Example: Structuredness
• Whether a program is structured can be seen as a measure as follows:
str(P) = 1 if P is one of the allowed primes
0 otherwise
str(F1;...;Fn) = min(str(F1),...,str(Fn))
str(F(F1,...,Fn)) = min(str(F),str(F1),...,str(Fn))
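The structuredness measure can be evaluated recursively over a decomposition tree. The following is a minimal sketch: the classes `Prime`, `Seq` and `Nest` and the contents of `ALLOWED_PRIMES` are assumptions made for illustration, not part of the slides.

```python
from dataclasses import dataclass
from typing import List, Union

# Assumption: which primes count as "allowed" depends on the language in question.
ALLOWED_PRIMES = {"statement", "if-then", "if-then-else", "while", "repeat-until"}


@dataclass
class Prime:
    name: str


@dataclass
class Seq:                      # F1; ...; Fn
    parts: List["Flowgraph"]


@dataclass
class Nest:                     # F(F1, ..., Fn)
    outer: Prime
    inner: List["Flowgraph"]


Flowgraph = Union[Prime, Seq, Nest]


def structured(g):
    """Return 1 if every prime in the decomposition is allowed, else 0."""
    if isinstance(g, Prime):
        return 1 if g.name in ALLOWED_PRIMES else 0
    if isinstance(g, Seq):
        return min(structured(part) for part in g.parts)
    return min([structured(g.outer)] + [structured(part) for part in g.inner])


good = Seq([Prime("statement"), Nest(Prime("while"), [Prime("if-then")])])
bad = Nest(Prime("while"), [Prime("two-exit-loop")])
print(structured(good), structured(bad))
```

Taking the minimum means a single disallowed prime anywhere in the tree makes the whole program unstructured.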
Linearly Independent Paths
• The vector representation of a path is a vector which counts the number of occurrences of each edge.
• A set of paths is l.i. if none can be represented as a linear combination of the others (in the vector representation).
First number each edge
[Figure: the example flowgraph with its edges numbered 1 to 12]
A path can be represented as a vector counting edges visited
A = (1,0,1,0,1,1,0,1,0,0,0,1)
B = (1,0,1,0,1,0,1,0,0,0,1,1)
C = (1,0,1,0,1,0,1,0,1,1,1,1)
D = (0,1,0,1,1,1,0,1,0,0,0,1)
Now can add and subtract vectors:
Eg. D-A = (-1,1,-1,1,0,0,0,0,0,0,0,0)
So E = B+D-A
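Linear independence of the path vectors can be checked by computing the rank of the matrix they form. The sketch below uses plain Gaussian elimination over fractions. The helper name `rank` is an assumption, and the labels A to D follow the order in which the vectors are listed.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length integer vectors, by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


A = (1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1)
B = (1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1)
C = (1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1)
D = (0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1)
E = tuple(b + d - a for a, b, d in zip(A, B, D))  # E = B + D - A

print(rank([A, B, C, D]), rank([A, B, C, D, E]))
```

The four paths have rank 4, matching v = 4, and adding E does not raise the rank because E is a linear combination of the others.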
How do we find test sets?
• Given a test strategy it is not easy to find test cases that exercise the required paths
– Even for Statement Coverage some parts of the code may be unreachable
– A single path can achieve Branch Coverage for: while(...) do “some complex program”, but this is unlikely to be possible in practice
Domain Partitioning
What have we been doing?
• Partitioning input space according to some property
• Selecting Test case inputs which are representatives of each partition – Eg to ensure different paths executed
• Assuming behaviour similar for all values of partition
Boundary Value Analysis
• Also important to test software at the boundaries of the partitions.
– Less than (or equal)?
– length of list (or n-1)?
– closure reversal (“not <” is not “>”)?
• How do we identify boundaries?
Single variable case
• Open and closed intervals
– Both ends closed: [min, max]
– Half open: [min, max) or (min, max]
– Both ends open: (min, max)
[Figure: a number line divided into partitions P1, P2, P3]
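For a single integer variable, boundary tests can be generated from the interval and its open or closed ends. This is a rough sketch under one common convention, assumed here rather than taken from the slides: the "on point" lies on the boundary, and the "off point" lies one step outside a closed boundary or one step inside an open one. The function name and parameters are hypothetical.

```python
def boundary_tests(lo, hi, lo_closed=True, hi_closed=True, step=1):
    """Return (value, expected_in_domain) pairs around both ends of an interval."""
    tests = []
    for bound, closed, outward in ((lo, lo_closed, -step), (hi, hi_closed, step)):
        tests.append((bound, closed))                # on point: inside only if closed
        if closed:
            tests.append((bound + outward, False))   # off point just outside
        else:
            tests.append((bound - outward, True))    # off point just inside
    return tests


# Half-open interval [0, 10)
print(boundary_tests(0, 10, lo_closed=True, hi_closed=False))
```

Each pair states whether the value is expected to fall inside the partition, so a wrong comparison operator (< versus <=) shows up as a failed expectation at the on point.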
Multiple variables
• Input domains are multi-dimensional
• Boundaries are hyperplanes
• Can be open or closed at each intersection
[Figure: a multi-variable input domain with labels: open boundary, closed boundary, on point, off point, extreme point]
Finding Test Cases
• CFGs model software
• Test strategy to select paths to test
• Data flow Analysis to choose “best” test paths
• Now need to find test inputs which exercise those paths
Example
• Find All DU paths for example program
• Find test cases which execute the paths
smallest(int p) (* p > 2 *)
{
  int q = 2;
  while (p mod q > 0 AND q < sqrt p) do
    q := q+1;
  if (p mod q = 0)
    then print(q,’is factor’)
    else print(p,’is prime’);
}
[Figure: the program, its CFG with nodes 1 to 8, and the definitions (d) and uses (u) of p and q marked against the nodes]
ADUP
• p: 123, 12343, 1235, 123435, 12357, 1234357
• q: 23, 234, 235, 2356, 43, 434, 435, 4356
ADUP
• p: 123, 1235, 123435, 12357, 1234357
• q: 23, 234, 235, 2356, 43, 434, 435, 4356
Subpaths subsumed | 100% coverage | Test Input | Test Output
12357 | 123578 | p=3 | 3 is prime
1234357 | 12343578 | p=5 | 5 is prime
2356 | 123568 | p=4,6,8... | 2 is sm fact
434 | 12343435_8 | p=4,8,12... 9,10,..15 | 11 is prime
4356 | 12343568 | p=9,15,21.. | 3 is sm fact
How were test cases found?
• Required outcome at each predicate node
• Consider all requirements together
• Guess a value that will satisfy them
• Can we improve on this!
Symbolic Execution
• How to find test inputs to exercise a path?
– Need certain choice at each predicate node
– Give a symbolic value to each variable
– Walk the path collecting requirements on symbolic input
• Then have a set of inequalities to solve
• Example: Find test cases for each path by symbolic execution:
smallest(p)
{
  int q = 2;
  while (p mod q > 0 AND q < sqrt p) do
    q := q+1;
  if (p mod q = 0)
    then print(q,’is factor’)
    else print(p,’is prime’);
}
Path 123578
• Initially: p = X, q = Y
• q = 2: p = X, q = 2
• while (F): X mod 2 = 0 OR 2 >= sqrt X; candidates X=4,6,8,... or X=3,4
• if (F): X mod 2 > 0; candidates X=3,5,7,...
• Solutions: X=3
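Path 12343578
• Initially: p = X, q = Y
• q = 2: p = X, q = 2
• while (T): X mod 2 > 0 (candidates X=3,5,7,...) and 2 < sqrt X (candidates X=5,6,7..)
• q = 3: p = X, q = 3
• while (F): X mod 3 = 0 OR 3 >= sqrt(X); candidates X=3,6,9.. or X=3,4..9
• if (F): X mod 3 > 0; candidates X=4,5,7,8,..; prints “X is prime”
• Solutions: X=5,7
• Output: 5 is prime, 7 is prime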
Path 123568
• Initially: p = X, q = Y
• q = 2: p = X, q = 2
• while (F): X mod 2 = 0 OR 2 >= sqrt X; candidates X=4,6,8,.. or X=3,4
• if (T): X mod 2 = 0; candidates X=4,6,8,..; prints “Y is sm fact”
• Solutions: X=4,6,8..
• Output: 2 is sm fact
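Path 12343568
• Initially: p = X, q = Y
• q = 2: p = X, q = 2
• while (T): X mod 2 > 0 (candidates X=3,5,7..) and 2 < sqrt X (candidates X=5,6,7..)
• q = 3: p = X, q = 3
• while (F): X mod 3 = 0 OR 3 >= sqrt(X); candidates X=3,6,9.. or X=3,4..9
• if (T): X mod 3 = 0; candidates X=3,6,9..; prints “Y is sm fact”
• Solutions: X=9,15,21..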
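Path 12343435_8
• q = 2, while (T): X mod 2 > 0 (candidates X=3,5,7..) and 2 < sqrt X (candidates X=5,6,7..); solutions so far 5,7,9,11,13..
• q = 3, while (T): X mod 3 > 0 (candidates X=4,5,7,8..; solutions so far 5,7,11,13,17..) and 3 < sqrt X (candidates X=10,11,12..; solutions so far 11,13,17,19..)
• q = 4, while (F): X mod 4 = 0 (candidates X=4,8,12..; none from this) OR 4 >= sqrt(X) (candidates X=3,4..16; solutions 11,13)
• if (_): X mod 4 ? 0; the condition X mod 4 = 0 must be false, so X=11,13
• Solutions: X=11,13
• Output: 11 is prime, 13 is prime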
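The solutions found by hand above can be cross-checked by brute force. The sketch below is not symbolic execution: it simply runs an instrumented Python translation of smallest(p) and records the path taken. The node numbering (1 entry, 2 q = 2, 3 while predicate, 4 q := q+1, 5 if predicate, 6 print factor, 7 print prime, 8 exit) is an assumption inferred from the path strings on the slides, and the function names are hypothetical.

```python
import math


def smallest_path(p):
    """Run smallest(p) and return (path, output) using the assumed node numbers 1-8."""
    path = [1, 2]                 # 1: entry, 2: q = 2
    q = 2
    while True:
        path.append(3)            # 3: while predicate
        if not (p % q > 0 and q < math.sqrt(p)):
            break
        path.append(4)            # 4: q := q + 1
        q += 1
    path.append(5)                # 5: if predicate
    if p % q == 0:
        path.append(6)            # 6: print factor
        output = f"{q} is factor"
    else:
        path.append(7)            # 7: print prime
        output = f"{p} is prime"
    path.append(8)                # 8: exit
    return "".join(map(str, path)), output


def inputs_for_path(target, candidates):
    """Return the candidate inputs whose execution follows exactly the target path."""
    return [p for p in candidates if smallest_path(p)[0] == target]


for target in ("123578", "12343578", "123568", "12343568", "1234343578", "1234343568"):
    print(target, inputs_for_path(target, range(3, 30)))
```

Under these assumptions the search agrees with the hand-derived solutions (3; 5,7; even numbers; 9,15,21..; 11,13) and finds no input for 1234343568, the variant of the last path that takes the true branch of the if.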
Difficulties with Symbolic Execution
• Generally, many paths are not feasible
• Conditions can become complex:
– when complex expressions on rhs of assignments
– then program variables are complex expressions in terms of the symbolic vars
• Sets of conditions can be computationally complex to solve
Possible Solutions
• Computational Complexity:
– Use numerical methods to calculate the tests
• Straight line equivalents
• Program Instrumentation
– Adaptive testing (later)
• Complex predicates
– Condition/Decision strategies (later)
• Many Infeasible paths
– Adaptive testing (later)
Straight Line equivalents
• Construct the “straight line” program corresponding to the path required.
• replace predicates with path constraints
– a real valued expression which records the requirement as a minimisation
• Solve the path constraints using numerical methods
Path Constraints
• Eg. if(x = y) is replaced by c1 := abs(x-y)
• and if(x>y) is replaced by c2 := x-y
• Then we must minimise the ci
• Can use numerical methods to do this
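As a rough illustration of solving a path constraint numerically, the sketch below minimises c1 = abs(x-y) over integer inputs with a simple neighbourhood search. The search routine is a stand-in chosen for brevity, not the numerical method the slides have in mind, and the names `c1` and `minimise` are hypothetical.

```python
def c1(x, y):
    """Path constraint for if (x = y): zero exactly when the branch is taken."""
    return abs(x - y)


def minimise(cost, start, max_steps=1000):
    """Greedy integer descent: move to the best unit-step neighbour until no improvement."""
    point = list(start)
    for _ in range(max_steps):
        if cost(*point) == 0:
            break
        neighbours = []
        for i in range(len(point)):
            for delta in (-1, 1):
                trial = point.copy()
                trial[i] += delta
                neighbours.append(trial)
        best = min(neighbours, key=lambda t: cost(*t))
        if cost(*best) >= cost(*point):
            break                      # local minimum reached
        point = best
    return tuple(point)


solution = minimise(c1, (7, 2))
print(solution, c1(*solution))
```

Starting from an input that misses the branch, the search ends at a point where the constraint is zero, which is an input that takes the x = y branch.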
Program instrumentation
• generally - a method to allow testing of a unit in place by augmenting program
• Here - add function calls which record value of key variables
• replace predicates with calls which guarantee correct path is taken
• run program to generate conditions
• Again use numerical methods to solve
Conditions and Decisions
• Above strategies do not take account of predicates with more than one conjunct
• There are more strategies which distinguish
– Conditions - the individual clauses of predicate, from
– Decisions - the outcome of evaluating the whole predicate
Condition Coverage
• Achieve all possible combinations of simple Boolean conditions at each decision node
• In critical real-time applications over half of statements may be Boolean expressions
• Several variants of strategies which account for individual conditions
Example Condition Strategies
• Decision coverage (DC)
– every decision tested in each possible outcome
• Condition/Decision coverage (C/DC)
– as above plus, every condition in each decision tested in each possible outcome
• Modified Condition/Decision (MC/DC)
– as above plus, every condition shown to independently affect a decision outcome (by varying that condition only)
• Multiple-condition coverage (M-CC)
– all possible combinations of conditions within each decision taken
Modified Condition/Decision Coverage
• Multiple-condition coverage is strongest but grows exponentially in # conditions
• Modified C/D is linear like C/D
• Eg. For A and B
– (T,T) required to exercise decision true
– (F,T) required for independence of A
– (T,F) required for independence of B
– (F,F) not required
• MC/DC (among others) is required for flight-critical commercial avionics software
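The independence requirement can be explored by enumerating, for each condition, the pairs of test vectors that differ only in that condition and change the decision outcome. This is a minimal sketch. The helper name `independence_pairs` is an assumption, and it treats the decision as a plain Boolean function with no short-circuit effects.

```python
from itertools import product


def independence_pairs(decision, n):
    """Map each condition index to the vector pairs that show its independent effect."""
    pairs = {i: [] for i in range(n)}
    for vec in product([True, False], repeat=n):
        for i in range(n):
            if vec[i]:  # visit each pair once, starting from the vector with condition i true
                flipped = vec[:i] + (False,) + vec[i + 1:]
                if decision(*vec) != decision(*flipped):
                    pairs[i].append((vec, flipped))
    return pairs


pairs = independence_pairs(lambda a, b: a and b, 2)
needed = {vec for pair_list in pairs.values() for pair in pair_list for vec in pair}
print(pairs)
print(sorted(needed, reverse=True))
```

For A and B this reproduces the slide: the needed vectors are (T,T), (T,F) and (F,T), and (F,F) appears in no independence pair.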
Further Problems with Symb. Ex.
• When loop conditions are input dependent
• When array indices are input dependent
• When external functions are called
Adaptive Testing
The above approach has been in 4 stages:
1) Construct the control flow graph
– a parsing problem - automatable
– can also add “instrumentation” here
2) Choose the test paths
– According to some test strategy
– CFG - possibly with data flow considerations
Four stages (cont.)
3) Choose the test cases
– by symbolic execution and simultaneous inequalities
– or by backwards substitution
– can reveal infeasible paths, requiring reverting to stage 2
4) Execute the test cases
– Only now do we execute the program
• Adaptive testing merges stages 2), 3) and 4)
Problems with 4-stage approach
• Infeasible paths (stage 3) require selection of new paths (return to stage 2)
• Computational complexity of test case selection
Adaptive testing develops test cases one at a time and uses result of previous test case execution to help select next test case
Inductive Strategies
• Choose first test input x1 (perhaps at random)
• Execute test and record path taken, p1
• Say k-1 tests have been done giving {(x1,p1),...,(xk-1,pk-1)}
• use some strategy to select xk
Several such strategies exist.
Diagonalisation
Important “method” in Mathematics:
• Cantor’s uncountability of Reals
• Gödel’s Incompleteness
• Undecidability of Halting problem
For list of lists, find a new list by choosing an element different from each on the diagonal
A11, A12, A13, ...
A21, A22, A23, ...
A31, A32, A33, ...
...
New = B1, B2, B3, ... where B1 ≠ A11, B2 ≠ A22, B3 ≠ A33, ...
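For lists of bits the diagonal construction is a one-liner: flip each diagonal element. The sketch below assumes a finite square list of 0/1 values purely for illustration, and the function name is hypothetical.

```python
def diagonal_complement(rows):
    """Build a list that differs from row i at position i, for every i."""
    return [1 - rows[i][i] for i in range(len(rows))]


rows = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
new = diagonal_complement(rows)
print(new)
```

Because the new list disagrees with row i at position i, it cannot equal any row of the original list of lists.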
Diagonalisation (2)
• Each path pi gives a conjunctive predicate Pi
• These predicates characterise a set of non-overlapping subdomains of the input space
• We must find a new input xk not in any Pi
• Let Pi be conjunction of Ci,1,Ci,2,...Ci,ki
• For each i, choose xk to violate some Ci,j
– eg. xk not in Ci,i
Path Prefix Strategy
[Prather and Myers, IEEE Trans. SE-13(7) 1987]
For Branch coverage
• For a path p, define its reversible prefix q
– the initial portion of p to the first decision node where the branches are not yet fully covered
• A reversal of p is then any path with same reversible prefix but then a different continuation
Path Prefix Strategy (2)
• Choose first input in some way and execute to give first path, p1
• Given p1,...,pk-1, let pi be path with shortest reversible prefix
• Choose next input to give a reversal of pi
• Execute and add the new path to set of paths
Path Prefix: earlier example
• Choose first input p = 3 (say)
– execution gives path p1 = 12357
– Reversible prefix = 123, Reversal = 1234....
• Deduce second input, p = 5
– execution gives path p2 = 12343578
– reversible prefix 123435
– path p1 also now has reversible prefix 1235
– choose shorter p2, Reversal = 12356
• Deduce 3rd input, p = 4
– execution gives path p3 = 123568
• All branches covered
Problems with Path prefix
• Still need to deduce input for new path
– the inversion problem (later)
• Still may get infeasible paths
– absolute infeasibility - a path can never be executed
– relative infeasibility - a path cannot be the continuation of any of the current reversible prefixes
Example of relative infeasibility
simple(bool x, y)
if(x = true) then S1 else S2;
if(x xor y = true) then S3 else S4;
if(x and y = true) then S5 else S6;
[Figure: the three conditionals in sequence, decision nodes 1, 2, 3]
in1 = (false,false), p1 = F,F,F
reverse at 1 gives: in2 = (true,false), p2 = T,T,F
reverse p1 at 2 gives F,F,T - infeasible
reverse p2 at 2 gives T,T,T - infeasible, but T,F,T is feasible, eg in3 = (true,true)
Conditionals in sequence:
– # paths to node grows exponentially
– # previous nodes grows linearly
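Because simple(x, y) has only four possible inputs, the feasible paths can be enumerated directly. The sketch below records the outcome of the three decisions for each input. The function name `simple_path` is hypothetical.

```python
from itertools import product


def simple_path(x, y):
    """Outcomes of the three decisions in simple(x, y), in order."""
    return (x, x != y, x and y)   # x != y is xor for Booleans


feasible = {simple_path(x, y): (x, y) for x, y in product([False, True], repeat=2)}
for path, inputs in sorted(feasible.items()):
    print(path, "from input", inputs)
```

Only four of the eight outcome combinations occur. F,F,T and T,T,T are missing, which is why those reversals are infeasible, while T,F,T is reached by (true,true).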
The Inversion Problem
• How do we find the input which reverses the decision at Pk ?
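[Figure: input domain D; the region P1&...&Pk-1 is divided into Pk and not Pk, with x and x’ on opposite sides of the boundary]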
The Inversion Problem (2)
• Need to find x’ given x
• Done by Back Substitution
• execute with x recording all states for prefix
• pick change of a variable to change Pk
• substitute back through program logic to calculate required input
– same as for 4 step approach but with actual values
– For real-valued conditions can use grad(Pk) to cross boundary via normal
Advantages of adaptive approach
• Informal common sense tells us:
– Change only one thing at a time
– Exploit nearness of previous test cases to the required path
• Formal analysis gives us:
– overall complexity of adaptive approach is less than 4 stage approach [Myers, SEJ 7(1) 1992]
References
• ADTEST, Gallagher and Narasimhan, IEEE Trans. SE-23(8), 1997.
• Symbolic Execution, Girgis, SEJ 7(4), 1992.
• Instrumentation, Luo, Probert and Ural, Software Engineering Journal (SEJ) 10(6), 1995.
• Path Prefix, Prather and Myers, IEEE Trans. SE-13(7), 1987.
• Complexity of adaptive, Myers, SEJ 7(1), 1992.
• MC/DC, Chilenski and Miller, SEJ 9(5) 1994.