Outline
- Introduction
- Test selection
- Mining specifications as test oracles
- Mining specifications from passing tests
- Our approach: mining specifications from unverified tests
  - Potential benefits
  - Difficulties of applying previous approaches directly
  - Mining predicate rules offline
- Experiments
- Future work
Test Selection
Software testing is the most commonly used and most effective technique for improving the quality of software programs. Testing involves three main steps:
- Generating a set of test inputs: there have been various practical approaches to automatic test-input generation.
- Executing those inputs on the program under test: this does not require intensive human labor.
- Checking whether the test executions reveal faults: this remains a largely manual task.
Test selection helps to reduce the cost of test-result inspection by selecting a small subset of tests that are likely to reveal faults.
Mining Specification as Test Oracles
A specification can describe:
- Temporal properties of programs, such as "fopen() should be followed by fclose()".
- Algebraic properties of programs, such as get(S, arg).state == S (a state-preserving function).
- Operational properties of programs, such as idx >= 0 for a method get(array, idx) (a precondition).
Such a specification can be mined from execution traces and then used as a test oracle for finding failing tests.
Mining Specification from Passing Tests
Dynamic invariant detection: Daikon [Ernst01]
- Given a set of passing tests, mine the implicit rules over the variables dynamically.
- Templates include invariants over any variable (x = a, ...), invariants over two numeric variables (y = ax + b, ...), invariants over three numeric variables (z = ax + by + c, ...), and so on.
- Daikon simply checks the candidate rules against each observation and immediately discards those that are violated.
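The discard-on-violation step can be sketched as follows. This is an illustration of the idea, not Daikon's actual implementation; the traces and candidate invariants are made up:

```python
# Minimal sketch of Daikon-style invariant filtering (illustrative, not Daikon).
# Each trace sample maps variable names to observed values; a candidate
# invariant is discarded as soon as any sample violates it.

def mine_invariants(samples, candidates):
    """Keep only the candidate invariants that hold on every observed sample."""
    surviving = dict(candidates)  # name -> predicate over a sample
    for sample in samples:
        for name in list(surviving):
            if not surviving[name](sample):
                del surviving[name]  # violated once -> discarded immediately
    return set(surviving)

# Hypothetical traces observing variables x and y.
traces = [{"x": 1, "y": 3}, {"x": 2, "y": 5}, {"x": 4, "y": 9}]

candidates = {
    "x >= 1":     lambda s: s["x"] >= 1,
    "y = 2x + 1": lambda s: s["y"] == 2 * s["x"] + 1,
    "x = 1":      lambda s: s["x"] == 1,  # violated by the second trace
}

print(mine_invariants(traces, candidates))  # {'x >= 1', 'y = 2x + 1'}
```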
Mining Specification from Passing Tests
Test selection based on dynamic invariant detection: Jov [Xie03]
- Start from some passing tests and mine the operational model using Daikon.
- Generate a new test using Jtest (a commercial Java unit testing tool) and mine a new operational model.
- If the new operational model differs from the original one, add the new test to the test suite.
- Repeat the process.
Mining Specification from Passing Tests
Test selection based on dynamic invariant detection: Eclat [Pacheco05]
- Similar to the work on Jov, but further distinguishes illegal inputs from fault-revealing inputs.
- A violation of the operational model does not necessarily imply a fault, since the previously seen behavior may be incomplete.
- Violations are classified into three types:
  - Normal operation: no entry or exit violations, or some entry violations but no exit violations.
  - Fault-revealing: no entry violations, some exit violations.
  - Illegal input: some entry violations and some exit violations.
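The three-way classification on this slide can be captured by a small decision function (an assumed helper written for illustration, not Eclat's actual API):

```python
# Sketch of the violation classification above (assumed helper, not Eclat code).
def classify(entry_violations, exit_violations):
    """Classify a test run by its operational-model violation counts."""
    if exit_violations == 0:
        return "normal operation"  # no exit violations, with or without entry ones
    if entry_violations == 0:
        return "fault-revealing"   # exit violations only
    return "illegal input"         # both entry and exit violations

print(classify(0, 0))  # normal operation
print(classify(2, 0))  # normal operation
print(classify(0, 1))  # fault-revealing
print(classify(3, 2))  # illegal input
```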
Mining Specification from Passing Tests
Test selection based on dynamic invariant detection: DIDUCE [Hangal02]
- Based on another type of invariant, focusing on variable values.
- Mines invariants from long-running program executions.
- Issues a warning whenever an invariant is violated, and relaxes an invariant after it has been violated.
Mining Specification from Passing Tests
The drawback:
- We may have only a small set of passing tests, or even no passing tests at all, so the observed behavior is incomplete.
- A specification mined from a small set of tests can be noisy: many violations of such a specification are false positives.
Our Approach: Mining Specification from Unverified Tests
- We propose to mine common operational models, which need not hold over all observed traces, from a (potentially large) set of unverified tests, based on mining predicate rules.
- The mined common operational models can then be used to select suspicious tests.
Our Approach: Mining Specification from Unverified Tests
Potential benefits:
- Training a behavior model on a large set of unverified tests may capture the typical software behaviors without bias, hence reducing the noise.
- It is relatively easy to collect execution data from a large set of tests without verifying them.
Our Approach: Mining Specification from Unverified Tests
Difficulties of applying previous approaches directly:
- A common behavior model need not hold over the whole set of tests, so we cannot directly discard candidate models when they are violated.
- Alternatively, we could generate and collect all the potential models at runtime and evaluate them after running all the tests. However, such an approach can incur high runtime overhead if Daikon-like operational models, which are large in number, are used.
Our Approach: Mining Specification from Unverified Tests
Mining predicate rules offline:
- Collect the values of simple predicates at runtime.
- Generate and evaluate predicate rules as potential operational models after running all the tests.
- A predicate rule is an implication relationship between predicates. When a rule's confidence is less than 1, the higher the rule's confidence, the more suspicious its violations are as indicators of a failure.
- We then select tests that violate the mined predicate rules for result inspection.
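To make "confidence" concrete: for a rule x => y, it is the fraction of runs satisfying x that also satisfy y. A minimal sketch (the data layout and predicate names are assumptions):

```python
# Illustrative sketch: confidence of a predicate rule x => y over recorded runs.
# Each run maps a predicate name to the boolean value observed in that run.

def confidence(runs, x, y):
    """Fraction of runs satisfying predicate x that also satisfy predicate y."""
    support_x = [run for run in runs if run[x]]
    if not support_x:
        return 0.0
    return sum(1 for run in support_x if run[y]) / len(support_x)

runs = [
    {"p": True,  "q": True},
    {"p": True,  "q": True},
    {"p": True,  "q": False},  # the one run where p holds but q does not
    {"p": False, "q": True},
]

print(confidence(runs, "p", "q"))  # 2/3: p => q holds in 2 of the 3 runs with p true
```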
Our Approach: Mining Specification from Unverified Tests
[This slide showed a worked example over several tests; the original figure is not preserved. The predicate rule in the example corresponds to a precondition that the program should satisfy in passing tests and whose violation makes the program fail.]
- We observe that a failure is not likely to be predicted by the violation of a single predicate.
- The mined predicate rule is similar to, and weaker than, the real operational model. Its violation should also lead to the violation of the real operational model and indicate a failure, such as in Test 5 of the example.
Our Approach: Mining Specification from Unverified Tests
Mining predicate rules:
- We use cbi-tools [Liblit05] to instrument the programs.
  - Branches: at each conditional (branch), two predicates are tracked, indicating whether the true or false branch was ever taken.
  - Returns: at each scalar-returning function call site, three predicates are tracked: whether the returned value was ever < 0, == 0, or > 0.
- For each predicate y, we mine the rules X => y and X => !y, where X is a conjunction of other predicates.
  - To reduce complexity, our current implementation mines only the rules where X is a single predicate.
  - We plan to use advanced data mining techniques, such as association rule mining, to mine more general rules in future work.
- There may be a large number of predicate rules.
  - For each predicate y, we select only the most confident rule X => y and the most confident rule X => !y.
  - Alternatively, we can set a threshold min_conf and select all the rules whose confidence is higher than min_conf.
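The single-predicate case can be sketched as follows: for each target predicate y, find the most confident rules x => y and x => !y. The run representation and predicate names below are assumptions, not actual cbi-tools output:

```python
# Sketch: mine, for each target predicate y, the most confident single-predicate
# rules x => y and x => !y over a set of runs (illustrative data layout).
# Each run maps predicate names to the booleans observed during that execution.

def most_confident_rules(runs, target):
    """Return the best (x, conf) for x => target and for x => !target."""
    best_pos, best_neg = (None, 0.0), (None, 0.0)
    predicates = set(runs[0]) - {target}
    for x in predicates:
        support = [run for run in runs if run[x]]  # runs where x holds
        if not support:
            continue
        conf_pos = sum(1 for r in support if r[target]) / len(support)
        conf_neg = 1.0 - conf_pos  # confidence of x => !target
        if conf_pos > best_pos[1]:
            best_pos = (x, conf_pos)
        if conf_neg > best_neg[1]:
            best_neg = (x, conf_neg)
    return best_pos, best_neg

runs = [
    {"branch_true": True,  "ret_neg": False, "y": True},
    {"branch_true": True,  "ret_neg": False, "y": True},
    {"branch_true": False, "ret_neg": True,  "y": False},
    {"branch_true": True,  "ret_neg": False, "y": True},
]

print(most_confident_rules(runs, "y"))  # (('branch_true', 1.0), ('ret_neg', 1.0))
```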
Our Approach: Mining Specification from Unverified Tests
Test selection:
- We select only a small subset of the tests, such that each selected predicate rule is violated at least once.
- Initially, the set of selected tests is empty. We sort the selected predicate rules in descending order of confidence.
- Going from top to bottom, if a rule is not violated by any of the previously selected tests, we select the first test that violates it.
- In this greedy way, all the selected rules end up violated by the selected tests.
- We also rank the selected tests in the order of selection.
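The greedy selection above can be sketched as follows. The rule representation is an assumption, and "the first test that violates the rule" is approximated here by the smallest test id:

```python
# Sketch of the greedy test selection described above (assumed data layout):
# each rule carries a confidence and the set of test ids that violate it.

def select_tests(rules):
    """rules: list of (confidence, violating_test_ids). Returns ranked test ids."""
    selected = []  # tests, ranked in order of selection
    for _, violators in sorted(rules, key=lambda r: -r[0]):  # by confidence, desc
        if violators and not any(t in violators for t in selected):
            selected.append(min(violators))  # first test that violates this rule
    return selected

rules = [
    (0.95, {3, 7}),  # highest-confidence rule: pick test 3
    (0.90, {3}),     # already covered by test 3
    (0.80, {5, 9}),  # not covered yet: pick test 5
]

print(select_tests(rules))  # [3, 5]
```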
Experiments: Preliminary Results
Subject 1: the Siemens suite
- 130 faulty versions of 7 programs that range in size from 170 to 540 lines.
- On average, only 1.53% (45/2945) of the original tests need to be checked.
- Despite its small size, the selected test set can still reveal 74.6% (97/130) of the faults, while random sampling reveals only 45.4% (59/130) of the faults.
Experiments: Preliminary Results
Subject 2: the grep program
- A Unix utility that searches a file for a pattern; it comprises 13,358 lines of C code.
- There are 3 buggy versions that fail 3, 4, and 132 times, respectively, when running the 470 tests.
- Our approach selects 82, 86, and 89 tests for these versions, which reveal all 3 faults. In addition, for each version, at least one failing test is ranked in the top 20.
- We also randomly selected 20 tests for each version. Across 5 rounds of random selection, the selected tests never revealed the faults of the first two versions but always revealed the fault of the third version.
Future Work
- Combine our approach with automatic test generation tools.
- Study the characteristics of mined common operational models and compare them with invariants mined by Daikon.
- Explore mining more general rules containing several predicates.
References
[Ernst01] M. D. Ernst, J. Cockrell, W. G. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE Transactions on Software Engineering, 27(2):99–123, 2001.
[Hangal02] S. Hangal and M. S. Lam. Tracking down software bugs using automatic anomaly detection. In ICSE, pages 291–301, 2002.
[Xie03] T. Xie and D. Notkin. Tool-assisted unit test selection based on operational violations. In ASE, pages 40–48, 2003.
[Liblit05] B. Liblit, M. Naik, A. X. Zheng, A. Aiken, and M. I. Jordan. Scalable statistical bug isolation. In PLDI, pages 15–26, 2005.
[Pacheco05] C. Pacheco and M. D. Ernst. Eclat: Automatic generation and classification of test inputs. In ECOOP, pages 504–527, 2005.
Thank you!