Outline
- Introduction
- Test selection
- Mining specifications as test oracles
- Mining specifications from passing tests
- Our approach: mining specifications from unverified tests
  - Potential benefits
  - Difficulties of applying previous approaches directly
  - Mining predicate rules offline
- Experiments
- Future work
Test Selection
Software testing is the most commonly used and most effective technique for improving the quality of software programs. Testing involves three main steps:
- Generating a set of test inputs: there have been various practical approaches to automatic test-input generation.
- Executing those inputs on the program under test: this does not require intensive human labor.
- Checking whether the test executions reveal faults: this remains a largely manual task.
Test selection helps to reduce the cost of test-result inspection by selecting a small subset of tests that are likely to reveal faults.
Mining Specification as Test Oracles
A specification can describe:
- Temporal properties of programs, such as "fopen() should be followed by fclose()".
- Algebraic properties of programs, such as get(S, arg).state == S (a state-preserving function).
- Operational properties of programs, such as idx >= 0 for a method get(array, idx) (a precondition).
Such a specification can be mined from execution traces and then used as a test oracle for finding failing tests.
Mining Specification from Passing Tests
Dynamic invariant detection: Daikon [Ernst01]
- Given a set of passing tests, mine the implicit rules over the variables dynamically.
- Templates include invariants over any variable (x = a, ...), invariants over two numeric variables (y = ax + b, ...), invariants over three numeric variables (z = ax + by + c, ...), and so on.
- Daikon simply checks the candidate rules against each observation and immediately discards those that are violated.
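The discard-on-violation step can be sketched as follows. This is an illustration of the idea, not Daikon's actual implementation; the traces and candidate invariants are made up:

```python
# Minimal sketch of Daikon-style invariant filtering (illustrative, not Daikon).
# Each trace sample maps variable names to observed values; a candidate
# invariant is discarded as soon as any sample violates it.

def mine_invariants(samples, candidates):
    """Keep only the candidate invariants that hold on every observed sample."""
    surviving = dict(candidates)  # name -> predicate over a sample
    for sample in samples:
        for name in list(surviving):
            if not surviving[name](sample):
                del surviving[name]  # violated once -> discarded immediately
    return set(surviving)

# Hypothetical traces observing variables x and y.
traces = [{"x": 1, "y": 3}, {"x": 2, "y": 5}, {"x": 4, "y": 9}]

candidates = {
    "x >= 1":     lambda s: s["x"] >= 1,
    "y = 2x + 1": lambda s: s["y"] == 2 * s["x"] + 1,
    "x = 1":      lambda s: s["x"] == 1,  # violated by the second trace
}

print(mine_invariants(traces, candidates))  # {'x >= 1', 'y = 2x + 1'}
```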
Mining Specification from Passing Tests
Test selection based on dynamic invariant detection: Jov [Xie03]
- Start from some passing tests and mine the operational model using Daikon.
- Generate a new test using Jtest (a commercial Java unit testing tool) and mine a new operational model.
- If the new operational model differs from the original one, add the new test to the test suite.
- Repeat the process.
Mining Specification from Passing Tests
Test selection based on dynamic invariant detection: Eclat [Pacheco05]
- Similar to the work on Jov, but further distinguishes illegal inputs from fault-revealing inputs.
- A violation of the operational model does not necessarily imply a fault, since the previously seen behavior may be incomplete.
- Violations are classified into three types:
  - Normal operation: no entry or exit violations, or some entry violations but no exit violations.
  - Fault-revealing: no entry violations, some exit violations.
  - Illegal input: some entry violations and some exit violations.
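The three-way classification on this slide can be captured by a small decision function (an assumed helper written for illustration, not Eclat's actual API):

```python
# Sketch of the violation classification above (assumed helper, not Eclat code).
def classify(entry_violations, exit_violations):
    """Classify a test run by its operational-model violation counts."""
    if exit_violations == 0:
        return "normal operation"  # no exit violations, with or without entry ones
    if entry_violations == 0:
        return "fault-revealing"   # exit violations only
    return "illegal input"         # both entry and exit violations

print(classify(0, 0))  # normal operation
print(classify(2, 0))  # normal operation
print(classify(0, 1))  # fault-revealing
print(classify(3, 2))  # illegal input
```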
Mining Specification from Passing Tests
Test selection based on dynamic invariant detection: DIDUCE [Hangal02]
- Based on another type of invariant, focusing on variable values.
- Mines invariants from long-running program executions.
- Issues a warning whenever an invariant is violated, and relaxes an invariant after it has been violated.
Mining Specification from Passing Tests
The drawback:
- We may have only a small set of passing tests, or even no passing tests at all, so the observed behavior is incomplete.
- A specification mined from a small set of tests can be noisy: many violations of such a specification are false positives.
Our Approach: Mining Specification from Unverified Tests
- We propose to mine common operational models, which need not hold over all observed traces, from a (potentially large) set of unverified tests, based on mining predicate rules.
- The mined common operational models can then be used to select suspicious tests.
Our Approach: Mining Specification from Unverified Tests
Potential benefits:
- Training a behavior model on a large set of unverified tests may capture the typical software behaviors without bias, hence reducing the noise.
- It is relatively easy to collect execution data from a large set of tests without verifying them.
Our Approach: Mining Specification from Unverified Tests
Difficulties of applying previous approaches directly:
- A common behavior model need not hold over the whole set of tests, so we cannot directly discard candidate models when they are violated.
- Alternatively, we could generate and collect all the potential models at runtime and evaluate them after running all the tests. However, such an approach can incur high runtime overhead if Daikon-like operational models, which are large in number, are used.
Our Approach: Mining Specification from Unverified Tests
Mining predicate rules offline:
- Collect the values of simple predicates at runtime.
- Generate and evaluate predicate rules as potential operational models after running all the tests.
- A predicate rule is an implication relationship between predicates. When a rule's confidence is less than 1, the higher the rule's confidence, the more suspicious its violations are as indicators of a failure.
- We then select tests that violate the mined predicate rules for result inspection.
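To make "confidence" concrete: for a rule x => y, it is the fraction of runs satisfying x that also satisfy y. A minimal sketch (the data layout and predicate names are assumptions):

```python
# Illustrative sketch: confidence of a predicate rule x => y over recorded runs.
# Each run maps a predicate name to the boolean value observed in that run.

def confidence(runs, x, y):
    """Fraction of runs satisfying predicate x that also satisfy predicate y."""
    support_x = [run for run in runs if run[x]]
    if not support_x:
        return 0.0
    return sum(1 for run in support_x if run[y]) / len(support_x)

runs = [
    {"p": True,  "q": True},
    {"p": True,  "q": True},
    {"p": True,  "q": False},  # the one run where p holds but q does not
    {"p": False, "q": True},
]

print(confidence(runs, "p", "q"))  # 2/3: p => q holds in 2 of the 3 runs with p true
```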
Our Approach: Mining Specification from Unverified Tests
[This slide showed a worked example over several tests; the original figure is not preserved. The predicate rule in the example corresponds to a precondition that the program should satisfy in passing tests and whose violation makes the program fail.]
- We observe that a failure is not likely to be predicted by the violation of a single predicate.
- The mined predicate rule is similar to, and weaker than, the real operational model. Its violation should also lead to the violation of the real operational model and indicate a failure, such as in Test 5 of the example.
Our Approach: Mining Specification from Unverified Tests
Mining predicate rules:
- We use cbi-tools [Liblit05] to instrument the programs.
  - Branches: at each conditional (branch), two predicates are tracked, indicating whether the true or false branch was ever taken.
  - Returns: at each scalar-returning function call site, three predicates are tracked: whether the returned value was ever < 0, == 0, or > 0.
- For each predicate y, we mine the rules X => y and X => !y, where X is a conjunction of other predicates.
  - To reduce complexity, our current implementation mines only the rules where X is a single predicate.
  - We plan to use advanced data mining techniques, such as association rule mining, to mine more general rules in future work.
- There may be a large number of predicate rules.
  - For each predicate y, we select only the most confident rule X => y and the most confident rule X => !y.
  - Alternatively, we can set a threshold min_conf and select all the rules whose confidence is higher than min_conf.
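The single-predicate case can be sketched as follows: for each target predicate y, find the most confident rules x => y and x => !y. The run representation and predicate names below are assumptions, not actual cbi-tools output:

```python
# Sketch: mine, for each target predicate y, the most confident single-predicate
# rules x => y and x => !y over a set of runs (illustrative data layout).
# Each run maps predicate names to the booleans observed during that execution.

def most_confident_rules(runs, target):
    """Return the best (x, conf) for x => target and for x => !target."""
    best_pos, best_neg = (None, 0.0), (None, 0.0)
    predicates = set(runs[0]) - {target}
    for x in predicates:
        support = [run for run in runs if run[x]]  # runs where x holds
        if not support:
            continue
        conf_pos = sum(1 for r in support if r[target]) / len(support)
        conf_neg = 1.0 - conf_pos  # confidence of x => !target
        if conf_pos > best_pos[1]:
            best_pos = (x, conf_pos)
        if conf_neg > best_neg[1]:
            best_neg = (x, conf_neg)
    return best_pos, best_neg

runs = [
    {"branch_true": True,  "ret_neg": False, "y": True},
    {"branch_true": True,  "ret_neg": False, "y": True},
    {"branch_true": False, "ret_neg": True,  "y": False},
    {"branch_true": True,  "ret_neg": False, "y": True},
]

print(most_confident_rules(runs, "y"))  # (('branch_true', 1.0), ('ret_neg', 1.0))
```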
Our Approach: Mining Specification from Unverified Tests
Test selection:
- We select only a small subset of the tests, such that each selected predicate rule is violated at least once.
- Initially, the set of selected tests is empty. We sort the selected predicate rules in descending order of confidence.
- Going from top to bottom, if a rule is not violated by any of the previously selected tests, we select the first test that violates it.
- In this greedy way, all the selected rules end up violated by the selected tests.
- We also rank the selected tests in the order of selection.
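The greedy selection above can be sketched as follows. The rule representation is an assumption, and "the first test that violates the rule" is approximated here by the smallest test id:

```python
# Sketch of the greedy test selection described above (assumed data layout):
# each rule carries a confidence and the set of test ids that violate it.

def select_tests(rules):
    """rules: list of (confidence, violating_test_ids). Returns ranked test ids."""
    selected = []  # tests, ranked in order of selection
    for _, violators in sorted(rules, key=lambda r: -r[0]):  # by confidence, desc
        if violators and not any(t in violators for t in selected):
            selected.append(min(violators))  # first test that violates this rule
    return selected

rules = [
    (0.95, {3, 7}),  # highest-confidence rule: pick test 3
    (0.90, {3}),     # already covered by test 3
    (0.80, {5, 9}),  # not covered yet: pick test 5
]

print(select_tests(rules))  # [3, 5]
```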
Experiments: Preliminary Results
Subject 1: the Siemens suite
- 130 faulty versions of 7 programs that range in size from 170 to 540 lines.
- On average, only 1.53% (45/2945) of the original tests need to be checked.
- Despite its small size, the selected test set can still reveal 74.6% (97/130) of the faults, while random sampling reveals only 45.4% (59/130) of the faults.
Experiments: Preliminary Results
Subject 2: the grep program
- A Unix utility that searches a file for a pattern; it comprises 13,358 lines of C code.
- There are 3 buggy versions that fail 3, 4, and 132 times, respectively, when running the 470 tests.
- Our approach selects 82, 86, and 89 tests for these versions, which reveal all 3 faults. In addition, for each version, at least one failing test is ranked in the top 20.
- We also randomly selected 20 tests for each version. Across 5 rounds of random selection, the selected tests never revealed the faults of the first two versions but always revealed the fault of the third version.
Future Work
- Combine our approach with automatic test generation tools.
- Study the characteristics of mined common operational models and compare them with invariants mined by Daikon.
- Explore mining more general rules containing several predicates.
References
[Ernst01] M. D. Ernst, J. Cockrell, W. G. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE Transactions on Software Engineering, 27(2):99–123, 2001.
[Hangal02] S. Hangal and M. S. Lam. Tracking down software bugs using automatic anomaly detection. In ICSE, pages 291–301, 2002.
[Xie03] T. Xie and D. Notkin. Tool-assisted unit test selection based on operational violations. In ASE, pages 40–48, 2003.
[Liblit05] B. Liblit, M. Naik, A. X. Zheng, A. Aiken, and M. I. Jordan. Scalable statistical bug isolation. In PLDI, pages 15–26, 2005.
[Pacheco05] C. Pacheco and M. D. Ernst. Eclat: Automatic generation and classification of test inputs. In ECOOP, pages 504–527, 2005.
Thank you!