Automated Testing, and Generating Complex Input
(and other applications)
Course: Software Testing & Verification 2014/15
Wishnu Prasetya
Content
• Automated Testing
• BNF to describe inputs
• Generating inputs
• Coverage
• Regular expression
• Other applications of regular expression
Important note: generating complex inputs is a non-trivial task, but it is only partially addressed by AO. For example, Chapter 5 sets up the right background but then focuses mostly on mutation. Here we complement that with additional material. The sections of AO that are usable for input generation are:
• Section 5.1.1 (very short!) about BNF
• Section 2.7 about regular expressions. This section is actually about using regular expressions to, e.g., calculate the test paths needed to deliver a given graph coverage. This is not directly related to input generation, but we will discuss it as well.
Automated Testing
• You already know QuickCheck:
  – quickCheck (\s -> isSorted . sort $ s)
• There are QuickCheck ports for e.g. C# and Java
• There are other automated unit testing tools, e.g. T3 (Java), EvoSuite (Java)
Automated Testing
• Consider testing the following cases:
  – int divide(int x, int y)
  – void sendmail(String to, subject, body)
  – class File { ... open() close() write(x) }
Automated Testing
• To automatically generate test cases you need at least:
  – a way to automatically generate valid and invalid inputs, and sequences of test steps (for positive as well as negative tests)
• And test oracles...
  – concrete-value oracles, e.g. return == 0.01, are not going to scale up
  – thus this can only work in conjunction with property-based testing.
Generating test sequences
• Model-based testing: describe valid sequences of operations on your class/program as a finite state automaton (model), from which you can systematically generate sequences.
[Figure: finite state automaton for File, with transitions labelled open(), write(), and close()]
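A minimal sketch of how operation sequences could be generated from such a model. The two states (Closed, Opened) and their transitions are an assumption about the File automaton, which is not fully recoverable from the slide; the names Op, St, step and seqsFrom are illustrative only.

```haskell
-- Operations of the (assumed) File model.
data Op = Open | Write | Close deriving (Eq, Show)

-- States of the (assumed) File model.
data St = Closed | Opened deriving (Eq, Show)

-- Transitions allowed in each state: (operation, next state).
step :: St -> [(Op, St)]
step Closed = [(Open, Opened)]
step Opened = [(Write, Opened), (Close, Closed)]

-- All valid operation sequences of exactly n steps, starting in state s.
seqsFrom :: St -> Int -> [[Op]]
seqsFrom _ 0 = [[]]
seqsFrom s n = [op : rest | (op, s') <- step s, rest <- seqsFrom s' (n - 1)]
```

For example, seqsFrom Closed 2 yields [[Open,Write],[Open,Close]]. Sequences that are not in this list (e.g. starting with Write) can serve as negative tests.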
Generating inputs
• Valid inputs can be generated simply by filtering randomly generated values:
  – quickCheck (\s -> isSorted s ==> sort s == s)
• But this presumes that it is not hard to produce valid inputs by accident. That is not always true, e.g. when generating a valid ZIP code or email address.
• Is there a more direct way to do this? (In fact, there are several.)
Describing “allowed” inputs
• Using Backus-Naur Form (BNF) notation / context-free grammar (see the example on p. 171):

  S → Brace | Curly | ε        (implicitly THREE production rules!)
  Brace → "(" S ")" S
  Curly → "{" S "}" S

• Terminology: start symbol, terminal, non-terminal, epsilon (not in the book; check Wikipedia or the lecture notes of Languages & Compilers)
Production Rule
• A production rule has the form N → Z, where N is a non-terminal and Z is a sequence of symbols.
• A rule like A → a(B|C)d is seen as shorthand for a set of production rules:
  A → aBd
  A → aCd
• People often use extended BNF, e.g.: Brace → ( "(" S ")" )*
Example: NL post codes
  NLpostcode → Area Space Street
  Area → FirstDigit Digit Digit Digit
  Street → Letter Letter
  FirstDigit → 1 | 2 ...
  Digit → 0 | 1 | 2 ...
  Letter → a | b | c ... | A | B | C ...
  Space → " " *

• Note: sometimes there are additional constraints, e.g. codes above 9999 XL do not actually exist (they do not map to an existing address). A constraint is not always expressible in BNF, or it is expressible but not conveniently.
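The postcode grammar can be turned into both a generator and a recognizer. The following is a sketch under assumptions: the generator always emits exactly one space (the grammar allows any number), and the names postcodes, isNLPostcode and isStreet are illustrative, not part of the course material.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- Lazily enumerates postcodes in grammar order, with a single space.
postcodes :: [String]
postcodes =
  [ [d1, d2, d3, d4, ' ', l1, l2]
  | d1 <- ['1' .. '9']                         -- FirstDigit
  , d2 <- digits, d3 <- digits, d4 <- digits   -- Digit Digit Digit
  , l1 <- letters, l2 <- letters               -- Letter Letter
  ]
  where
    digits  = ['0' .. '9']
    letters = ['a' .. 'z'] ++ ['A' .. 'Z']

-- Recognizer for NLpostcode -> Area Space Street.
isNLPostcode :: String -> Bool
isNLPostcode (d1 : d2 : d3 : d4 : rest) =
  d1 `elem` ['1' .. '9']
    && all isDigit [d2, d3, d4]
    && isStreet (dropWhile (== ' ') rest)      -- Space -> " " *
isNLPostcode _ = False

-- Street -> Letter Letter
isStreet :: String -> Bool
isStreet [l1, l2] = all (\c -> isAsciiLower c || isAsciiUpper c) [l1, l2]
isStreet _        = False
```

Because the list is lazy, take 2 postcodes gives ["1000 aa","1000 ab"] without building the whole list. Constraints that BNF cannot express conveniently (such as non-existing codes) would have to be applied as an additional filter.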
Generating inputs
  S → Brace | Curly | ε
  Brace → "(" S ")" S
  Curly → "{" S "}" S

A derivation is a series of expansions of the grammar that results in a sequence of terminal symbols. It follows that the sequence is a valid sentence of the grammar. We can use this to generate valid sentences. Example:
  S ⇒ Brace ⇒ ( S ) S ⇒ ( ) S ⇒ ( )
Derivation tree
  S → Brace             // RSb
  S → Curly             // RSc
  S → ε                 // RSe
  Brace → "(" S ")" S
  Curly → "{" S "}" S

A derivation: S ⇒ Brace ⇒ ( S ) S ⇒ ( ) S ⇒ ( )

  RSb
    RBrace
      ( RSe ) RSe

A derivation can also be described by a derivation tree such as the one above. Given such a tree, you can reconstruct what the derived sentence is.
One more example
  RSb
    RBrace
      ( RSc ) RSe
        RCurly
          { RSe } RSe

  S → Brace             // RSb
  S → Curly             // RSc
  S → ε                 // RSe
  Brace → "(" S ")" S
  Curly → "{" S "}" S
Representing Derivation Tree in Haskell
• data S = RSb Brace | RSc Curly | RSe
• data Brace = RBrace S S
• data Curly = RCurly S S

  S → Brace             // RSb
  S → Curly             // RSc
  S → ε                 // RSe
  Brace → "(" S ")" S
  Curly → "{" S "}" S
  RSb
    RBrace
      ( RSe ) RSe

Every value of type S represents a derivation tree. E.g. the tree above is represented by: RSb (RBrace RSe RSe)
Extracting the sentence
• instance Show S where
    show (RSb b) = show b
    show (RSc c) = show c
    show RSe = ""
• instance Show Brace where
    show (RBrace s1 s2) = "(" ++ show s1 ++ ")" ++ show s2
• instance Show Curly ....
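A self-contained sketch of the sentence extraction. It uses plain functions instead of Show instances (a design choice of this sketch, so that a derived Show stays available for debugging), and it assumes the elided Curly case is analogous to the Brace case. The names sentenceS, sentenceBrace, sentenceCurly and example are illustrative.

```haskell
data S     = RSb Brace | RSc Curly | RSe
data Brace = RBrace S S
data Curly = RCurly S S

-- Reconstructs the derived sentence from a derivation tree.
sentenceS :: S -> String
sentenceS (RSb b) = sentenceBrace b
sentenceS (RSc c) = sentenceCurly c
sentenceS RSe     = ""

sentenceBrace :: Brace -> String
sentenceBrace (RBrace s1 s2) = "(" ++ sentenceS s1 ++ ")" ++ sentenceS s2

-- Assumed to mirror the Brace case.
sentenceCurly :: Curly -> String
sentenceCurly (RCurly s1 s2) = "{" ++ sentenceS s1 ++ "}" ++ sentenceS s2

-- The tree RSb (RBrace (RSc (RCurly RSe RSe)) RSe).
example :: S
example = RSb (RBrace (RSc (RCurly RSe RSe)) RSe)
```

Here sentenceS example evaluates to "({})", and sentenceS (RSb (RBrace RSe RSe)) to "()".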
Generating derivation trees (hence also sentences)
• class Generator t where
    gen :: Int -> [t]    -- generate all trees of height k
    genA :: Int -> [t]   -- generate all trees of height up to k
• genA k = concat . map gen $ [0..k]
Generating derivation trees (hence also sentences)
• instance Generator Brace where
    gen k = [ RBrace s1 s2 | s1 <- subs, s2 <- subs ]
      where subs = gen (k-1)
• instance Generator S where
    gen k = if k <= 0 then [RSe]
            else [ RSb b | b <- gen (k-1) ] ++ [ RSc c | c <- gen (k-1) ]
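The generator can be assembled into a runnable sketch as follows. Assumptions of this sketch: the base case is read as k <= 0, the Curly instance mirrors the Brace instance, genA is a standalone function rather than a class method, and duplicates are removed with nub (the helper distinctTrees is not from the slides).

```haskell
import Data.List (nub)

data S     = RSb Brace | RSc Curly | RSe deriving (Eq, Show)
data Brace = RBrace S S                  deriving (Eq, Show)
data Curly = RCurly S S                  deriving (Eq, Show)

class Generator t where
  gen :: Int -> [t]          -- trees generated for bound k

-- All trees generated for the bounds 0..k (may contain duplicates).
genA :: Generator t => Int -> [t]
genA k = concatMap gen [0 .. k]

instance Generator S where
  gen k
    | k <= 0    = [RSe]
    | otherwise = [RSb b | b <- gen (k - 1)] ++ [RSc c | c <- gen (k - 1)]

instance Generator Brace where
  gen k = [RBrace s1 s2 | s1 <- subs, s2 <- subs]
    where subs = gen (k - 1)

-- Assumed to mirror the Brace instance.
instance Generator Curly where
  gen k = [RCurly s1 s2 | s1 <- subs, s2 <- subs]
    where subs = gen (k - 1)

-- genA with duplicate trees removed.
distinctTrees :: Int -> [S]
distinctTrees = nub . genA
```

With this reading, genA 4 contains 21 trees of which 11 are distinct, and distinctTrees 6 contains 139 trees. The growth is doubly exponential in k, which is why filtering on the derivation trees becomes necessary.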
Representing Derivation Tree in an OO Language
• abstract class S {}
• class RSb extends S { Brace b }
• class RSc extends S { Curly c }
• class RSe extends S {}
• class Brace { S s1 ; S s2 }
• class Curly { S s1 ; S s2 }
But...
• genA 4 produces:
  [ "", "{}", "()", "{{}}{}", "{{}}()", "{()}{}", "{()}()", "({}){}", "({})()", "(()){}", "(())()" ]
• genA 6 produces 139 derivations...
• Too many... However, you actually have the full derivation trees, which you can exploit for filtering.
We can see...
• We can see which terminals are produced.
• We can see which rules were used.
• We can infer which non-terminals were produced during the derivation.

  S → Brace | Curly | ε
  Brace → "(" S ")" S
  Curly → "{" S "}" S

  RSb
    RBrace
      ( RSc ) RSe
        RCurly
          { RSe } RSe
BNF coverage
• (C5.29) Terminal coverage: TR contains each terminal symbol from the given grammar G.
• (C5.30) Production coverage: TR contains each production rule in G.
• Production coverage subsumes terminal coverage, but both are usually too weak.
• Pair-wise production coverage: TR contains every feasible pair (R1,R2) of production rules. Feasible means that they can actually be applied in succession in a derivation from G.
• This can be generalized to k-wise, but that may blow up the size of TR. Alternatively, if G is not too large you can still manually add new requirements to your TR.
Pair-wise production coverage
• A derivation tree t covers a pair of rules R1;R2 if the pair appears as two consecutive nodes in t.
• A set T of derivation trees gives full pair-wise production coverage if every feasible pair of rules R1;R2 is covered by some t in T.
• Analogously for k-wise coverage.

  RSb
    RBrace
      ( RSc ) RSe
        RCurly
          { RSe } RSe
Example
• { "()" , "{}" } gives full terminal as well as production coverage.
• Combinations of brace-curly and curly-brace can only be enforced by pair-wise coverage.
• But none of those coverage criteria can distinguish between e.g. ({}) and (){}

  S → Brace | Curly | ε
  Brace → "(" S ")" S
  Curly → "{" S "}" S
Rule-rule coverage
• alts(N) = the set of production rules of non-terminal N; alts(R,i) = the set of production rules of the i-th symbol of rule R, which equals alts(N) if N is the non-terminal at the i-th position.
• A derivation tree t covers R ;i R' if R' appears as the i-th child of some R in t.
• Each Rule-Rule Coverage (ERRC): for every rule R and every applicable i, TR includes R ;i R' for every R' ∈ alts(R,i).
• For example, TR includes: RBrace ;1 RSb

  S → Brace | Curly | ε     // RSb, RSc, RSe
  Brace → "(" S ")" S       // RBrace
  Curly → "{" S "}" S       // RCurly
ERRC example
• Importantly, ERRC also requires these to be in TR:
  – <RBrace;1 RSb>, <RBrace;1 RSc>, <RBrace;1 RSe>
  – <RBrace;3 RSb>, <RBrace;3 RSc>, <RBrace;3 RSe>
  – Similarly for Curly
• Just ({}) covers RBrace;1 RSc, but not RBrace;3 RSc
• Similarly, (){} covers RBrace;3 RSc, but not RBrace;1 RSc
• Example of a test set giving full ERRC coverage:
  – (), ({}), (){}, (()), ()(), {}, {()}, {}(), {{}}, {}{}

  S → Brace | Curly | ε     // RSb, RSc, RSe
  Brace → "(" S ")" S       // RBrace
  Curly → "{" S "}" S       // RCurly
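The ERRC claims above can be checked mechanically. The sketch below parses each test sentence into its derivation tree and collects the covered requirements R ;i R'. Assumptions of this sketch: symbol positions are counted from 0 (which matches the positions 1 and 3 used for RBrace above), the requirements RSb ;0 RBrace and RSc ;0 RCurly are added for the S rules, and all function names are illustrative.

```haskell
import Data.List (nub)
import Data.Maybe (mapMaybe)

data S     = RSb Brace | RSc Curly | RSe
data Brace = RBrace S S
data Curly = RCurly S S

-- Recursive-descent parser: returns the tree for S and the unread input.
parseS :: String -> Maybe (S, String)
parseS ('(' : r) = do
  (s1, r1) <- parseS r
  (s2, r2) <- parseS =<< closing ')' r1
  Just (RSb (RBrace s1 s2), r2)
parseS ('{' : r) = do
  (s1, r1) <- parseS r
  (s2, r2) <- parseS =<< closing '}' r1
  Just (RSc (RCurly s1 s2), r2)
parseS r = Just (RSe, r)

-- Consumes the expected closing bracket, if present.
closing :: Char -> String -> Maybe String
closing c (x : r) | c == x = Just r
closing _ _ = Nothing

-- Succeeds only if the whole input is a sentence of the grammar.
parseTree :: String -> Maybe S
parseTree str = case parseS str of
  Just (t, "") -> Just t
  _            -> Nothing

-- A requirement (R, i, R'): rule R' is the i-th child of rule R.
type Req = (String, Int, String)

ruleS :: S -> String
ruleS (RSb _) = "RSb"
ruleS (RSc _) = "RSc"
ruleS RSe     = "RSe"

covered :: S -> [Req]
covered (RSb (RBrace s1 s2)) = ("RSb", 0, "RBrace") : children "RBrace" s1 s2
covered (RSc (RCurly s1 s2)) = ("RSc", 0, "RCurly") : children "RCurly" s1 s2
covered RSe                  = []

children :: String -> S -> S -> [Req]
children r s1 s2 = (r, 1, ruleS s1) : (r, 3, ruleS s2) : covered s1 ++ covered s2

-- The full ERRC test requirement set for this grammar.
allReqs :: [Req]
allReqs = [("RSb", 0, "RBrace"), ("RSc", 0, "RCurly")]
  ++ [(r, i, r') | r <- ["RBrace", "RCurly"], i <- [1, 3], r' <- ["RSb", "RSc", "RSe"]]

-- Requirements covered by a set of test sentences (invalid sentences are skipped).
coveredBy :: [String] -> [Req]
coveredBy = nub . concatMap covered . mapMaybe parseTree

fullERRC :: [String] -> Bool
fullERRC tests = all (`elem` coveredBy tests) allReqs

slideTests :: [String]
slideTests = ["()", "({})", "(){}", "(())", "()()", "{}", "{()}", "{}()", "{{}}", "{}{}"]
```

fullERRC slideTests evaluates to True, whereas fullERRC ["()", "{}"] is False even though that set gives full production coverage.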
ERRC example
• Importantly, ERRC also requires these to be in TR:
  – <RBrace;1 RSb>, <RBrace;1 RSc>, <RBrace;1 RSe>
  – <RBrace;3 RSb>, <RBrace;3 RSc>, <RBrace;3 RSe>
  – Similarly for Curly
• ERRC does not force you to cover all "combinations". ARRC (next slide) does, but it may produce a very large TR.

  S → Brace | Curly | ε     // RSb, RSc, RSe
  Brace → "(" S ")" S       // RBrace
  Curly → "{" S "}" S       // RCurly
All-combinations
• Let R be a rule producing k non-terminals. A combination of R is a vector c = (R ;1 R'1 , ... , R ;k R'k).
• A derivation tree t covers such a combination c if it "appears" in t.
• All Rule-Rule Coverage (ARRC): for every rule R, TR includes every combination of R.

  RSb
    RBrace
      ( RSe ) RSe
Subsumption
[Figure: subsumption diagram relating ARRC, ERRC, 3-wise production coverage, pair-wise production coverage, production coverage, and terminal coverage]
Regular expression
• Example: (aa | bb)* , ( "(" ")" | "{" "}" )*
• Easy to write, but not as expressive as BNF.
• Syntax:
  terminal
  rexp | rexp
  rexp rexp
  rexp*
  rexp+
  ( rexp )

AO also has e^(m-n), for iterating e at least m times and at most n times. But this can be expressed with sequence and |.
The sentences of an Rexp
• L(e) = the set of sentences described by the rexp e.
• Defined as below:
  L(terminal) = { terminal }
  L(d e) = { s++t | s ∈ L(d), t ∈ L(e) }
  L(e | f) = L(e) ∪ L(f)
  L(e+) = L(e e*)
  L(e*) = { ε } ∪ L(e+)
Regular expression
• Can be equivalently described by a BNF grammar, but this is beyond our scope; check the course Languages and Compilers.
• In practice people use e.g. the POSIX extensions; e.g. to describe NL post codes:
  [1-9][[:digit:]][[:digit:]][[:digit:]][[:blank:]]*[[:alpha:]][[:alpha:]]
Generating sentences
• Discussed in AO. We'll generalize; let's represent a regular expression with values of this type:
  data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp
• ε is just Term ""
Generating sentences
  gen :: Rexp -> [String]
  gen (Term s) = [s]
  gen (Alt d e) = gen d ++ gen e
  gen (Seq d e) = [ s++t | s <- gen d, t <- gen e ]
  gen (Star e) = [""] ++ gen (Seq e (Star e))

• This will generate all derivable strings, but of course it may not terminate.
• Make it finite, e.g. by only expanding Star a finite number of times.
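One way to make the generator finite is to bound how often each Star is unrolled. The following is a minimal sketch of that idea; the function genB, its bound parameter n, and the value example are assumptions of this sketch, not definitions from AO.

```haskell
data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

-- genB n e: the sentences of e where every Star is unrolled at most n times.
-- The result may contain duplicates.
genB :: Int -> Rexp -> [String]
genB _ (Term s)  = [s]
genB n (Alt d e) = genB n d ++ genB n e
genB n (Seq d e) = [s ++ t | s <- genB n d, t <- genB n e]
genB n (Star e)  = concatMap times [0 .. n]
  where
    body    = genB n e
    -- times i: all concatenations of exactly i sentences of the body.
    times 0 = [""]
    times i = [s ++ t | s <- body, t <- times (i - 1)]

-- The regular expression (aa | bb)*
example :: Rexp
example = Star (Alt (Term "aa") (Term "bb"))
```

For instance, genB 1 example yields ["","aa","bb"], and genB 2 (Star (Term "a")) yields ["","a","aa"].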
Another application: model-based testing
• A model, in terms of a state automaton; example:
• An execution of such an automaton is a path through it, from the initial to final state.
• Can be equivalently described by a regular expression:
  a*b(c|d)ef
[Figure: state automaton with transitions a, b, c, d, e, f leading from the initial state to the final state]
Representing a control flow graph (CFG) with Rexp
• More on the equivalence between regular expressions and state automata is discussed in the course Languages & Compilers.
• Notice that a state automaton is a graph! So, it can be seen as describing a control flow graph. It follows that we can represent a CFG with a regular expression.
• To distinguish the arrows in the CFG, we will first assign a unique label to each.
Simple example
• Not so complicated, but things can get a bit confusing when you have nested loops.
• Next slides describe a conversion algorithm; this is from AO 2.7.1
  a*b(c|d1d2)ef
[Figure: CFG from the entry node to the exit node with edges labelled a, b, c, d1, d2, e, f]
© Ammann & Offutt
You can merge sequential edges
• Assuming one single end node; else add a virtual end node.
• Combine/multiply sequential edges
• Example: combine edges h and i
[Figure: graph with nodes 0 to 6 and edges a to i, shown before and after edges h and i are combined into a single edge hi (node 5 disappears)]
Introduction to Software Testing (Ch 2)
You can merge parallel edges
• Combine parallel edges (edges with the same source and target)
• Example : Combine edges b and c
[Figure: the same graph before and after the parallel edges b and c are combined into a single edge b + c]
You can remove self-loops
• Combine all self-loops (loops from a node to itself)
• Add a new "dummy" node
• The self-loop becomes an edge into the dummy node, labelled with an exponent (e.g. e*)
• Merge the resulting sequential edges with multiplication
[Figure: node 3 with self-loop e is split into 3 and a dummy node 3' connected by e*; the sequence d, e*, f is then merged into a single edge de*f from node 2 to node 4]
You can remove “middle node”
• A middle node is neither an initial nor a final node.
• Replace the middle node by inserting edges from all predecessors to all successors.
• But the middle node should not have a self-loop.
• Multiply the path expressions of all incoming edges with those of all outgoing edges
[Figure: middle node 3 with incoming edges A and B (from nodes 1 and 2) and outgoing edges C and D (to nodes 4 and 5) is removed, leaving the edges AC, AD, BC, BD]
Example of removing middle
• Remove node 2
• Edges (1, 2) and (2, 4) become one edge
• Edges (4, 2) and (2, 4) become a self-loop
[Figure: before, edges b + c from node 1 to node 2, de*f from node 2 to node 4, and g from node 4 to node 2; after, a single edge bde*f + cde*f from node 1 to node 4 and a self-loop gde*f on node 4]
Keep doing it until only one edge is left …
• Start: a from 0 to 1; bde*f + cde*f from 1 to 4; self-loop gde*f on 4; hi from 4 to 6
• Remove node 1: abde*f + acde*f from 0 to 4
• Remove the self-loop on 4: (gde*f)* from 4 to a dummy node 4'
• Remove nodes 4 and 4': abde*f (gde*f)* hi + acde*f (gde*f)* hi from 0 to 6
Applications
• The obvious one: we can use gen exp to get the set of “all” possible test paths through the CFG, but this is perhaps not very useful because what we ultimately want are test cases.
• But you can calculate some other useful information.
  a*b(c|d1d2)ef
[Figure: the CFG of the simple example, with edges a, b, c, d1, d2, e, f between the entry node and the exit node]
Calculating the number of paths in the CFG
• Iterating more than once is treated as equivalent to iterating just once.
• AO: you can do that by "transforming" your expression:
  a*b(c|d1d2)ef → (ε|a)b(c|d1d2)ef → (1 + 1) · 1 · (1 + 1·1) · 1 · 1 = 4
More generally...
  cnt :: Rexp -> Int
  cnt (Term s) = 1
  cnt (Alt d e) = cnt d + cnt e
  cnt (Seq d e) = cnt d * cnt e
  cnt (Star e) = 1 + cnt e
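As a check, the counting function can be run on the example expression a*b(c|d1d2)ef. The sketch below repeats the Rexp type to stay self-contained; the value cfg is an illustrative encoding of that expression.

```haskell
data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

-- Number of paths, where a loop is taken either zero times or once.
cnt :: Rexp -> Int
cnt (Term _)  = 1
cnt (Alt d e) = cnt d + cnt e
cnt (Seq d e) = cnt d * cnt e
cnt (Star e)  = 1 + cnt e

-- a*b(c|d1d2)ef
cfg :: Rexp
cfg = foldr1 Seq
  [ Star (Term "a")
  , Term "b"
  , Alt (Term "c") (Seq (Term "d1") (Term "d2"))
  , Term "e"
  , Term "f"
  ]
```

cnt cfg evaluates to 4: two choices for the loop a* times two choices for the branch (c | d1d2).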
Other applications
1. Calculating the longest path through the CFG/regular-exp
2. Calculating the minimum number of paths that would cover all branches (assuming loops always have an exit edge)
3. Calculating a minimalistic set of test paths that would satisfy (2).
4. ...
We can do them too, by folding...
  maxLength :: Rexp -> Int
  maxLength (Term s) = length s
  maxLength (Alt d e) = maxLength d `max` maxLength e
  maxLength (Seq d e) = maxLength d + maxLength e
  maxLength (Star e) = maxLength e

  minCnt :: Rexp -> Int
  minCnt (Term s) = 1
  minCnt (Alt d e) = minCnt d + minCnt e
  minCnt (Seq d e) = minCnt d `max` minCnt e
  minCnt (Star e) = minCnt e

  minPaths :: Rexp -> [String]   ... do this yourself.
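Since these functions all follow the same recursion scheme, they can be written with one generic fold. This is a sketch; foldRexp and its argument order (term, alt, seq, star) are assumptions of this sketch, and minPaths is deliberately left out.

```haskell
data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

-- Generic fold: one function per constructor, in the order Term, Alt, Seq, Star.
foldRexp :: (String -> a) -> (a -> a -> a) -> (a -> a -> a) -> (a -> a) -> Rexp -> a
foldRexp term alt sq star = go
  where
    go (Term s)  = term s
    go (Alt d e) = alt (go d) (go e)
    go (Seq d e) = sq (go d) (go e)
    go (Star e)  = star (go e)

-- Longest path, measured in characters of the labels, loops taken once.
maxLength :: Rexp -> Int
maxLength = foldRexp length max (+) id

-- Minimum number of paths needed to cover all branches.
minCnt :: Rexp -> Int
minCnt = foldRexp (const 1) (+) max id

-- Number of paths, with loops taken zero times or once.
cnt :: Rexp -> Int
cnt = foldRexp (const 1) (+) (*) (1 +)

-- a*b(c|d1d2)ef
cfg :: Rexp
cfg = foldr1 Seq
  [ Star (Term "a")
  , Term "b"
  , Alt (Term "c") (Seq (Term "d1") (Term "d2"))
  , Term "e"
  , Term "f"
  ]
```

On this example minCnt cfg is 2 and cnt cfg is 4. maxLength cfg is 8 rather than 6, because the labels d1 and d2 each count as two characters under the definition that uses length s.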
Complementary operation analysis
• Examples of "complementary" operations (here called C/create and D/destruct):
  – push and pop
  – fileOpen and fileClose
  – getLock and releaseLock
• An execution path that contains more destructs than creates is suspicious. It is not necessarily an error, because at the actual run it may still happen that some of the destructs simply have no effect.
• Actually, #C ≥ #D should hold along any prefix of an execution path.
Complementary operation analysis
• Given an execution t, let δ(t) = number of C's in t – number of D's in t.
• δ(t) has to be ≥ 0, otherwise unsafe.
• Actually, for all prefixes s of t, check that δ(s) ≥ 0
• But can we check this for all executions of the CFG?

  (CD* C | CC) D
[Figure: CFG whose edges are labelled C and D, described by the regular expression above]
Complementary operation analysis
• Consider the regexp that equivalently describes the graph.
• We'll write a function δ :: Rexp → [Formula] to generate formulas describing all possible δ's of all sentences of the regular expression.
• Checking safety is then "simple":
  safe :: Rexp → Bool
  safe e = all [ f ≥ 0 is valid | f ∈ δ e ]
The algorithm
  δ :: Rexp → [Formula]
  δ C = [ "1" ]
  δ D = [ "-1" ]
  δ ε = [ "0" ]
  δ (Alt d e) = δ d ∪ δ e
  δ (Seq d e) = [ k + m | k ∈ δ d , m ∈ δ e ]
  δ (Star e) = [ k * n | k ∈ δ e ], where n is a fresh name

(This does not include the extension to generate constraints over prefixes)
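The algorithm can be sketched in Haskell as follows. This is a simplification under stated assumptions: a formula is stored as its constant plus the list of coefficients of its fresh-name terms (the names themselves are not kept), every fresh name is assumed to range over the natural numbers, and any terminal other than "C" and "D" counts as 0. Under those assumptions a formula is ≥ 0 for all values exactly when its constant and all its coefficients are non-negative. The names Formula, delta, valid, safe and example are illustrative, and the prefix constraints are not covered.

```haskell
data Rexp = Term String | Seq Rexp Rexp | Alt Rexp Rexp | Star Rexp

-- Formula c ks stands for c + k1*x1 + ... + kj*xj, where each xi is a
-- distinct term built from fresh names; only c and the ki are stored.
data Formula = Formula Int [Int] deriving (Eq, Show)

delta :: Rexp -> [Formula]
delta (Term "C") = [Formula 1 []]
delta (Term "D") = [Formula (-1) []]
delta (Term _)   = [Formula 0 []]
delta (Alt d e)  = delta d ++ delta e
delta (Seq d e)  =
  [ Formula (c1 + c2) (ks1 ++ ks2)
  | Formula c1 ks1 <- delta d, Formula c2 ks2 <- delta e ]
-- Multiplying by a fresh name n turns the constant c into the term c*n
-- and keeps the coefficient of every existing term.
delta (Star e)   = [Formula 0 (c : ks) | Formula c ks <- delta e]

-- f >= 0 holds for all natural values of the fresh names.
valid :: Formula -> Bool
valid (Formula c ks) = c >= 0 && all (>= 0) ks

safe :: Rexp -> Bool
safe = all valid . delta

-- (C D* C | C C) D
example :: Rexp
example =
  Seq (Alt (Seq (Term "C") (Seq (Star (Term "D")) (Term "C")))
           (Seq (Term "C") (Term "C")))
      (Term "D")
```

For the example, delta yields the formulas 1 - n and 1, so safe example is False: iterating D twice or more makes the balance negative.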