Page 1: Automated Fault Prediction The Ins, The Outs, The Ups, The Downs Elaine Weyuker June 11, 2015.

Automated Fault Prediction

The Ins, The Outs, The Ups, The Downs

Elaine Weyuker
June 11, 2015

Page 2:

To determine which files of a large software system with multiple releases are likely to contain the largest numbers of bugs in the next release.

Page 3:

Help testers prioritize testing efforts.

Help developers decide when to do design and code reviews and what to reimplement.

Help managers allocate resources.

Page 4:

Verified that bugs were non-uniformly distributed among files.

Identified properties that were likely to affect fault-proneness, and then built a statistical model and ultimately a tool to make predictions.

Page 5:

● Size of file (KLOCs)

● Number of changes to the file in the previous 2 releases.

● Number of bugs in the file in the last release.

● Age of file (Number of releases in the system)

● Language the file is written in.

Page 6:

● All of the systems we’ve studied to date use a configuration management system which integrates version control and change management functionality, including bug history.

● Data is automatically extracted from the associated data repository and passed to the prediction engine.

Page 7:

Used Negative Binomial Regression

Also considered machine learning algorithms including:
◦ Recursive Partitioning
◦ Random Forests
◦ BART (Bayesian Additive Regression Trees)

Page 8:

● Consists of two parts.

● The back end extracts data needed to make the predictions.

● The front end makes the predictions and displays them.

Page 9:

Extracts necessary data from the repository.

Predicts how many bugs will be in each file in the next release of the system.

Sorts the files in decreasing order of the number of predicted bugs.

Displays results to user.

Page 10:

Percentage of actual bugs that occurred in the N% of the files predicted to have the largest number of bugs. (N=20)

Considered other measures less sensitive to the specific value of N.
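
The measure above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's code: the function name, the file names, and the fault counts are made up, and rounding the size of the top-N% group up to a whole file is an assumption.

```python
import math


def percent_faults_in_top_files(predicted, actual, top_percent=20):
    """Percentage of actual faults found in the top `top_percent` of files,
    where files are ranked by predicted fault count, highest first."""
    if predicted.keys() != actual.keys():
        raise ValueError("predicted and actual must cover the same files")
    ranked = sorted(predicted, key=lambda f: predicted[f], reverse=True)
    # Assumption: round the group size up so it contains at least one file.
    n_top = math.ceil(len(ranked) * top_percent / 100)
    total = sum(actual.values())
    if total == 0:
        return 0.0
    captured = sum(actual[f] for f in ranked[:n_top])
    return 100.0 * captured / total


# Hypothetical predictions and actual fault counts for five files.
predicted = {"a.c": 5.2, "b.c": 0.3, "c.java": 2.1, "d.sql": 0.1, "e.sh": 0.9}
actual = {"a.c": 6, "b.c": 0, "c.java": 1, "d.sql": 1, "e.sh": 2}
result = percent_faults_in_top_files(predicted, actual, top_percent=20)
print(result)
```

With five files and N=20, the top group is the single file with the largest prediction, so the result is that file's share of all actual faults.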

Page 11:

System  Years Followed  Releases  LOC    % Faults in Top 20% of Files
NP      4               17        538K   83%
WN      2               9         438K   83%
VT      2.25            9         329K   75%
TS      9+              35        442K   81%
TW      9+              35        384K   93%
TE      7               27        327K   76%
IC      4               18        1520K  91%
AR      4               18        281K   87%
IN      4               18        2116K  93%

Page 12:
Page 13:
Page 14:

The Tool

Page 15:

[Diagram of the tool. Labels: Version Mgmt / Fault Database (previous releases); Release to be predicted; User-supplied parameters; Statistical Analysis; Prediction Engine; Fault-proneness predictions]

Page 16:

User enters system name.

User asks for fault predictions for release “Bluestone2008.1”

Available releases are found in the version mgmt database. User chooses the releases to analyze.

User selects 4 file types.

User specifies that all problems reported in System Test phase are faults.

Page 17:

User confirms configuration

User enters filename to save the configuration.

User clicks Save & Run button, to start the prediction process.

Page 18:

Initial prediction view for Bluestone2008.1

All files are listed in decreasing order of predicted faults

Page 19:

Listing is restricted to eC files

Page 20:

Listing is restricted to 10% of eC files

Page 21:

Prediction tool is fully-operational
◦ 750 lines Python for interface
◦ 2150 lines C, 75K bytes compiled for prediction engine

Current version’s backend (written in C) is specific for the internal AT&T configuration management system but can be adapted to other configuration management systems. All that is needed is a source of the data required by the prediction model.

Page 22:

Variations of the Fault Prediction Model

Page 23:

Developers
◦ Counts
◦ Individuals

Amount of Code Change

Calling Structure

Page 24:

1. Standard model
2. Developer counts
3. Individual developers
4. Line-level change metrics
5. Calling structure

Overview

Page 25:

Underlying statistical model
◦ Negative binomial regression

Output (dependent) variable
◦ Predicted fault count in each file of release n

Predictor (independent) variables
◦ KLOC (n)
◦ Previous faults (n-1)
◦ Previous changes (n-1, n-2)
◦ File age (number of releases)
◦ File type (C,C++,java,sql,make,sh,perl,...)

The Standard Model
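
To make the shape of such a model concrete, here is a minimal sketch of how a negative binomial regression turns the predictor variables into an expected fault count, using only the Python standard library. Everything numeric is hypothetical: the slides give no fitted coefficients, so the values in `COEFS` and `FILE_TYPE_OFFSET` are invented for illustration. Using log(KLOC) follows the base model listed on a later slide; entering the fault and change counts untransformed is an assumption.

```python
import math

# Hypothetical coefficients for illustration; the slides give no fitted values.
COEFS = {
    "intercept": -2.0,
    "log_kloc": 0.8,
    "prev_faults": 0.3,       # faults in release n-1
    "prev_changes_n1": 0.2,   # changes in release n-1
    "prev_changes_n2": 0.1,   # changes in release n-2
    "age": -0.05,             # number of prior releases containing the file
}
# Hypothetical per-language offsets (file type is a categorical predictor).
FILE_TYPE_OFFSET = {"C": 0.0, "java": -0.2, "sql": 0.1}


def expected_faults(kloc, prev_faults, prev_changes_n1, prev_changes_n2,
                    age, file_type):
    """Mean fault count under a log link: mu = exp(linear predictor)."""
    linear = (
        COEFS["intercept"]
        + COEFS["log_kloc"] * math.log(kloc)
        + COEFS["prev_faults"] * prev_faults
        + COEFS["prev_changes_n1"] * prev_changes_n1
        + COEFS["prev_changes_n2"] * prev_changes_n2
        + COEFS["age"] * age
        + FILE_TYPE_OFFSET[file_type]
    )
    return math.exp(linear)


def nb_log_pmf(y, mu, alpha):
    """Log-probability of y faults for a negative binomial with mean mu and
    dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


mu = expected_faults(kloc=2.5, prev_faults=3, prev_changes_n1=4,
                     prev_changes_n2=1, age=6, file_type="C")
print(f"expected faults: {mu:.2f}")
print(f"P(no faults): {math.exp(nb_log_pmf(0, mu, alpha=0.5)):.3f}")
```

In practice the coefficients and the dispersion would be estimated from earlier releases by a statistics package; the sketch only shows how a fitted model maps one file's attributes to a predicted count.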

Page 26:

How many different people have worked on the file in the most recent previous release?

How many different people have worked on the file in all previous releases? This is a cumulative count.

How many people who changed the file were working on it for the first time?

Developer counts
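
The three counts can be derived from a change log, as in the sketch below. The record layout (release index, file, developer) and all names are assumptions for illustration; "new" is read here as developers who changed the file in the most recent previous release and in no earlier one.

```python
def developer_counts(changes, file, release):
    """Developer counts for `file`, used to predict faults in `release`.

    changes: iterable of (release_index, file_name, developer) tuples.
    Only changes made before `release` are considered.
    """
    prev = {d for r, f, d in changes if f == file and r == release - 1}
    earlier = {d for r, f, d in changes if f == file and r < release - 1}
    return {
        "previous_release": len(prev),        # developers in release n-1
        "cumulative": len(prev | earlier),    # developers in all prior releases
        "new": len(prev - earlier),           # first-time developers in n-1
    }


# Hypothetical change log.
changes = [
    (1, "x.c", "ann"), (1, "x.c", "bob"),
    (2, "x.c", "bob"), (2, "x.c", "cat"),
    (2, "y.c", "dan"),
]
counts = developer_counts(changes, "x.c", release=3)
print(counts)
```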

Page 27:

Faults per file in releases of System BTS

Page 28:

Standard Model

Page 29:

Developers Changing File in Previous Release

Page 30:

New Developers Changing File in Previous Release

Page 31:

Total Developers Changing File in All Previous Releases

Page 32:

Total developers touching file in all previous releases

Page 33:

None of the developer count attributes uniformly increases prediction accuracy. In all cases, adding a developer count attribute to the standard model sometimes leads to less accurate predictions than the standard model alone. The benefit is never major.

Summary

Page 34:

The standard model includes a count of the number of changes made in the previous two releases. It does not take into account how much code was changed.

We will now look at the impact on predictive accuracy of adding to the model fine-grained information about change size.

Code Change

Page 35:

Number of changes made to a file during a previous release

Number of lines added

Number of lines deleted

Number of lines modified

Relative size of change (line changes/LOC)

Changed/not changed

Measures of Code Change
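
As a small illustration, the measures listed above can be computed for one file and one release as follows. The function and field names are hypothetical, and taking "line changes" to be added + deleted + modified lines is an assumption.

```python
def change_measures(n_changes, added, deleted, modified, loc):
    """Change measures for one file in one release.

    n_changes: number of changes made to the file during the release.
    added, deleted, modified: line-level counts over those changes.
    loc: size of the file in lines of code.
    """
    line_changes = added + deleted + modified
    return {
        "changes": n_changes,
        "lines_added": added,
        "lines_deleted": deleted,
        "lines_modified": modified,
        # Relative size of change: line changes / LOC
        "relative_change": line_changes / loc if loc else 0.0,
        # Binary changed / not changed indicator
        "changed": n_changes > 0,
    }


measures = change_measures(n_changes=3, added=40, deleted=10, modified=50, loc=2000)
print(measures["relative_change"], measures["changed"])
```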

Page 36:

18 releases, 5-year lifespan

IC: Large provisioning system
◦ 6 languages: Java (60%), C, C++, SQL, SQL-C, SQL-C++
◦ 3000+ files
◦ 1.5Mil LOC
◦ Average of 395 faults/release

AR: Utility, data aggregation system
◦ >10 languages: Java (77%), Perl, xml, sh, ...
◦ 800 files
◦ 280K LOC
◦ Average of 90 faults/release

Two Subject Systems

Page 37:

Distribution of files, averages over all releases.

Page 38:

System IC Faults per File, by Release

Page 39:

System AR Faults per File, by Release

Page 40:

Univariate models

Base model: log(KLOC), File age, File type

Augmented models:
◦ Previous Changes
◦ Previous {Adds / Deletes / Mods}
◦ Previous {Adds / Deletes / Mods} / LOC (relative churn)
◦ Previous Developers

Prediction Models with Line-level Change Counts

Page 41:

Fault-percentile averages for univariate predictor models: System IC
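
The slides do not define the fault-percentile average. The sketch below assumes a common definition: sort the K files in increasing order of predicted faults, and average, over all N actual faults, the percentile rank k/K of the file each fault is in. The function name and the data are hypothetical.

```python
def fault_percentile_average(predicted, actual):
    """Assumed definition: (1 / (N * K)) * sum over k of k * n_k, where files
    are ranked 1..K by increasing predicted faults and n_k is the number of
    actual faults in the file ranked k."""
    files = sorted(predicted, key=lambda f: predicted[f])  # increasing order
    k_total = len(files)
    n_total = sum(actual[f] for f in files)
    if k_total == 0 or n_total == 0:
        raise ValueError("need at least one file and one actual fault")
    weighted = sum(rank * actual[f] for rank, f in enumerate(files, start=1))
    return weighted / (n_total * k_total)


# Hypothetical predictions and actual fault counts.
predicted = {"a": 3.0, "b": 1.0, "c": 2.0}
actual = {"a": 4, "b": 0, "c": 1}
fpa = fault_percentile_average(predicted, actual)
print(round(fpa, 3))
```

Under this definition the value does not depend on a cutoff such as N=20: a ranking that places the faultiest files last in the increasing order scores close to 1.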

Page 42:

Base Model and Added Variables: System IC

• Base model: KLOC, File age (number of releases), File type (C,C++,java,sql,make,sh,perl,...)

Page 43:

Base Model and Added Variables: System AR

Page 44:

Change information provides important information for fault predictions

{Adds+Deletes+Mods} improves the accuracy of a model that doesn’t include any change information

BUT a simple count of prior changes slightly outperforms {Adds+Deletes+Mods}

Prior changed (a simple binary variable) is nearly as good as either, when added to a model without change info

Lines added is the most effective single change predictor

Lines deleted is least effective single change predictor

Relative changes is no better than absolute changes for predicting total fault count

Summary

Page 45:

Individual Developers

How can we measure the effect that a single developer has on the faultiness of a file?

If developer d modifies k files in release N
◦ how many of those files have bugs in release N+1?
◦ how many bugs are in those files in release N+1?

Page 46:

The BuggyFile Ratio

If d modifies k files in release N, and if b of them have bugs in release N+1, the buggyfile ratio for d is b/k

System IC has 107 programmers.

Over 15 releases, their buggyfile ratios vary between 0 and 1

The average is about 0.4

Page 47:

Average buggyfile ratio, all programmers

Page 48:

Buggyfile ratio for two programmers

Page 49:

Buggyfile ratio: more typical cases

Page 50:

The Bug Ratio

If d modifies k files in release N, and if there are B bugs in those files in release N+1, the bug ratio for d is B/k

The bug ratio can vary between 0 and B

Over 15 releases, we’ve seen a maximum bug ratio of about 8

The average is about 1.5
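
Both ratios follow directly from their definitions (b/k and B/k). The sketch below computes them for one developer and one release; the function name, file names, and fault counts are hypothetical.

```python
def developer_ratios(modified_files, faults_next_release):
    """Return (buggyfile_ratio, bug_ratio) for one developer and one release.

    modified_files: set of files the developer changed in release N.
    faults_next_release: dict mapping file name -> fault count in release N+1.
    """
    k = len(modified_files)
    if k == 0:
        return None  # both ratios are undefined if no files were changed
    counts = [faults_next_release.get(f, 0) for f in modified_files]
    b = sum(1 for c in counts if c > 0)  # b: files with at least one bug
    total_bugs = sum(counts)             # B: all bugs in those files
    return b / k, total_bugs / k


# Hypothetical data: the developer changed five files in release N.
modified = {"a.c", "b.c", "c.c", "d.c", "e.c"}
faults = {"a.c": 2, "c.c": 1, "z.c": 4}  # z.c was not touched by this developer
buggyfile_ratio, bug_ratio = developer_ratios(modified, faults)
print(buggyfile_ratio, bug_ratio)
```

Since b can never exceed k, the buggyfile ratio stays between 0 and 1, while the bug ratio B/k reaches B only when a single file was changed.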

Page 51:

Bug Ratio

Buggyfile Ratio

Page 52:

Problems with these definitions

A file can be changed by more than one developer.

A file may be changed in Rel N and a fault detected in N+1, but that change may not have caused that fault.

A programmer might change many files in identical trivial ways (interface, variable name, ...)

The “best” programmers might be assigned to work on the most difficult files.

For most programmers, the bug ratios vary widely from release to release.

Page 53:

• Is individual programmer bug-proneness helpful for prediction?

• Is this information useful for helping a project succeed?

• Are there better ways to measure it?

• Is it ethical to measure it?

• Does attempting to measure it lead to poor performance and unhappy programmers?

Some final thoughts

Page 54:

Are files that have a high rate of interaction with other files more fault-prone?

Calling Structure

[Diagram: File Q contains Method 1 and Method 2. Files X, Y, and Z are labeled as callees of File Q; Files A and B are labeled as callers of File Q.]

Page 55:

For each file:

◦ number of callers & callees
◦ number of new callers & callees
◦ number of prior new callers & callees
◦ number of prior changed callers & callees
◦ number of prior faulty callers & callees
◦ ratio of internal calls to total calls

Calling Structure Attributes Investigated
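
A few of these attributes can be derived from a file-level call graph, as sketched below. The edge-list representation and the toy graph are hypothetical, and treating an "internal call" as a call between methods of the same file is an assumption; the slides do not define it.

```python
def calling_structure(calls, file):
    """Caller/callee counts and internal-call ratio for `file`.

    calls: list of (caller_file, callee_file) pairs, one per call.
    """
    callees = {dst for src, dst in calls if src == file and dst != file}
    callers = {src for src, dst in calls if dst == file and src != file}
    involved = [(s, d) for s, d in calls if s == file or d == file]
    # Assumption: an internal call has caller and callee in the same file.
    internal = sum(1 for s, d in involved if s == d)
    ratio = internal / len(involved) if involved else 0.0
    return {
        "callers": len(callers),
        "callees": len(callees),
        "internal_call_ratio": ratio,
    }


# Hypothetical toy graph: Q calls X, Y, Z; A and B call Q;
# one method of Q calls another method of Q.
calls = [("Q", "X"), ("Q", "Y"), ("Q", "Z"), ("A", "Q"), ("B", "Q"), ("Q", "Q")]
attrs = calling_structure(calls, "Q")
print(attrs)
```

The "new", "prior changed", and "prior faulty" variants would filter the caller and callee sets by release history before counting.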

Page 56:

Code and history attributes, no calling structure

Code and history attributes, including calling structure

Code attributes only, including calling structure

Fault Prediction by Multi-variable Models

Page 57:

Models applied to C, C++, and C-SQL files of one of the systems studied.

First model built from the single best attribute.

Each succeeding model built by adding the attribute that most improves the prediction.

Stop when no attribute improves.

Fault Prediction by Multi-variable Models
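
The model-building procedure described above is a greedy forward selection. A minimal sketch follows, assuming a caller-supplied `score` function that fits a model on a set of attributes and returns its prediction quality (higher is better). The toy scoring table is invented and stands in for real model fitting.

```python
def forward_select(attributes, score):
    """Greedy forward selection.

    Start from the single best attribute, then repeatedly add the attribute
    that most improves `score`; stop when no attribute improves it.
    """
    selected = []
    best = float("-inf")
    remaining = list(attributes)
    while remaining:
        scored = [(score(tuple(selected + [a])), a) for a in remaining]
        top_score, top_attr = max(scored, key=lambda t: t[0])
        if top_score <= best:
            break  # no remaining attribute improves the prediction
        selected.append(top_attr)
        remaining.remove(top_attr)
        best = top_score
    return selected, best


# Hypothetical, additive toy scores; a real score would fit and evaluate a model.
GAINS = {"kloc": 0.60, "prior_changes": 0.15, "prior_faults": 0.05, "callers": 0.0}


def toy_score(subset):
    return sum(GAINS[a] for a in subset)


selected, best = forward_select(GAINS, toy_score)
print(selected, round(best, 2))
```

In the toy run the attribute with zero gain is never added, which mirrors the stopping rule: selection ends as soon as the best remaining candidate fails to raise the score.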

Page 58:

Code and history attributes, no calling structure

Page 59:

Code, history, and calling structure attributes

Page 60:

Code and calling structure attributes but not numbers of faults or changes in previous releases.

Page 61:

Calling structure attributes do not increase the accuracy of predictions.

History attributes (prior changes, prior faults) increase accuracy, either with or without calling structure.

We only studied these issues for two of the systems.

Summary

Page 62:

The Standard Model performs very well (on all nine industrial systems we have examined)

The augmented models add very little or no additional accuracy

Cumulative developers is the most effective addition to the Standard Model, but still doesn’t guarantee improved prediction or yield significant improvement.

Overall Summary

Page 63:

◦ Will our standard model make accurate predictions for open-source systems?

◦ Will our standard model make accurate predictions for agile systems?

◦ Can we predict which files will contain the faults with the highest severities?

◦ Can predictions be made for units smaller than files?

◦ Can run-time attributes be used to make fault predictions? (execution time, execution frequency, memory use, …)

◦ What is the most meaningful way to assess the effectiveness and accuracy of the predictions?

What’s Ahead?

