CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)

CrashLocator: Locating Crashing Faults Based on Crash Stacks

Rongxin Wu¹, Hongyu Zhang², Shing-Chi Cheung¹ and Sunghun Kim¹

¹ The Hong Kong University of Science and Technology

² Microsoft Research

July 24th, 2014

ISSTA 2014

Background

[Figure: crash reporting workflow. Labels: Software Crash; Crash Information with Crash Stack; Crash Reporting System; Crash Buckets; Bug Reports; Developers.]

Feedbacks From Mozilla Developers

• Locating crashing faults is hard

• Ad hoc approach

“… and look at the crash stack listed. It shows the line number of the code, and then I go to the code and inspect it. If I am unsure what it does I go to the second line of the stack and code and inspect that, and so on and so forth …”

“Some crashes are hard to fix because it is not necessarily indicative of the place where it crashes in the crash stack …”

“I use the top down method of following the crash backwards.”

“Sometimes it can be very difficult.”

Uncertain Fault Location

• The faulty function may not appear in the crash stack

About 33% to 41% of crashing faults in Firefox cannot be located in crash stacks!

[Figure: functions C, E, F, G, with labels Buggy Code, Crash Stack and Crash Point.]

Spectrum-Based Fault Localization

• Related Work
  – Tarantula (J. A. Jones et al., ICSE 2002) (J. A. Jones et al., ASE 2005)
  – Jaccard (R. Abreu et al., TAICPART-MUTATION 2007)
  – Ochiai (R. Abreu et al., TAICPART-MUTATION 2007) (S. Artzi et al., ISSTA 2010)
  – …
• Passing Traces and Failing Traces
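As a reminder of why these techniques need both passing and failing traces, the three scores can be sketched as below. This is a minimal sketch using the commonly cited definitions, not code from the talk; the parameter names (`ef`, `ep`, `total_failed`, `total_passed`) are assumptions introduced here.

```python
import math

# ef: failing runs that execute the function
# ep: passing runs that execute the function
# total_failed / total_passed: number of failing / passing runs overall


def tarantula(ef, ep, total_failed, total_passed):
    """Share of failing runs covering the function, relative to both shares."""
    fail_ratio = ef / total_failed if total_failed else 0.0
    pass_ratio = ep / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def jaccard(ef, ep, total_failed):
    """Failing runs covering the function over all failing runs plus covering passing runs."""
    denom = total_failed + ep
    return ef / denom if denom else 0.0


def ochiai(ef, ep, total_failed):
    """Failing coverage normalised by the geometric mean of failing runs and covering runs."""
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


# Hypothetical function executed by 3 of 4 failing runs and 1 of 6 passing runs.
example = (tarantula(3, 1, 4, 6), jaccard(3, 1, 4), ochiai(3, 1, 4))
```

Every score depends on `ep`, which is the count of passing traces that a crash reporting system does not collect.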

Spectrum-Based Fault Localization

• Are these techniques applicable?

[Figure: what spectrum-based fault localization needs versus what is available.
  – Instrumented Product Software: Failing Traces, Passing Traces. Issues: Privacy Concern; Performance Overhead (C. Luk et al., PLDI 2005)
  – Test Cases. Issue: Effectiveness (S. Artzi et al., ISSTA’10)
  – Crash Stack: f1, f2, f3, …, fn]

Our Research Goal

How to help developers fix crashing faults?

– Locate crashing faults based on crash stack

Our technique: CrashLocator

• Targets locating faulty functions
• No instrumentation needed
• Approximate failing traces
  – Based on crash stacks
  – Use static analysis techniques
• Rank suspicious functions
  – Without passing traces
  – Based on characteristics of faulty functions

Approximate Failing Traces

• Basic Stack Expansion Algorithm

[Figure: the crash stack is expanded over the call graph into Depth-1, Depth-2 and Depth-3 functions.]

Crash Stack (function and source position of each frame)

Function  File    Line
D0        file_0  l0
C1        file_1  l1
B2        file_2  l2
A3        file_3  l3

Approximate Failing Traces

• Basic Stack Expansion Algorithm
  – Function call information only
• Improved Stack Expansion Algorithm
  – Source file position information
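The basic expansion, which uses function call information only, can be sketched as a breadth-first walk over a static call graph. This is an illustrative sketch, not the authors' implementation: the call graph below is hypothetical, and reading "Depth-n" as "reachable within n calls from a crash stack function" is an assumption.

```python
def expand_crash_stack(crash_stack, call_graph, max_depth):
    """Return {function: depth}; crash stack functions themselves have depth 0."""
    depth_of = {func: 0 for func in crash_stack}
    frontier = list(crash_stack)
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for caller in frontier:
            for callee in call_graph.get(caller, []):
                if callee not in depth_of:  # keep the smallest depth seen
                    depth_of[callee] = depth
                    next_frontier.append(callee)
        frontier = next_frontier
    return depth_of


# Hypothetical static call graph: caller -> callees.
call_graph = {
    "A": ["B", "J"],
    "B": ["C", "K"],
    "C": ["D", "M"],
    "D": ["N"],
    "J": ["L"],
    "N": ["P"],
}

# Crash stack listed from the crash point (D) outward to A.
trace = expand_crash_stack(["D", "C", "B", "A"], call_graph, max_depth=2)
```

Because every callee of every reached function is added, the candidate set grows quickly with depth, which is the motivation for pruning with source position information.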

Improved Stack Expansion Algorithm

• Control Flow Analysis

[Figure: CFG of A, containing the calls J() and B(), where B is in the crash stack; the crash stack is expanded to Depth-1, Depth-2 and Depth-3.]

Improved Stack Expansion Algorithm

• Backward Slicing

    Obj D() {
        Obj s;
        int a = M();
        char b = '';
        Obj[] c = N(b);
        s = c[1]; // crash here
        if (s != '') {
            …
        }
        …
    }

variables {s,c}

[Figure: the crash stack expanded to Depth-1, Depth-2 and Depth-3, with functions outside the slice marked "Not in slicing".]
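The slicing step on function D can be sketched as follows. This is a simplified illustration, assuming straight-line code and data dependences only (no control dependence, loops or aliasing); the statement encoding with `defs`, `uses` and `calls` is hypothetical, and line numbers count from the signature of D as line 1.

```python
def backward_slice(statements, crash_line, seed_vars):
    """Collect lines before the crash that define variables the crash depends on."""
    relevant = set(seed_vars)
    sliced_lines = {crash_line}
    for stmt in sorted(statements, key=lambda s: s["line"], reverse=True):
        if stmt["line"] >= crash_line:
            continue  # only statements executed before the crash matter
        if relevant & stmt["defs"]:
            sliced_lines.add(stmt["line"])
            relevant -= stmt["defs"]  # this definition is now explained
            relevant |= stmt["uses"]  # its inputs become relevant instead
    return sliced_lines


def callees_in_slice(statements, sliced_lines):
    """Functions called from statements that belong to the slice."""
    return {c for s in statements if s["line"] in sliced_lines for c in s["calls"]}


# Hand-written model of D() from the slide.
d_statements = [
    {"line": 2, "defs": {"s"}, "uses": set(), "calls": []},      # Obj s;
    {"line": 3, "defs": {"a"}, "uses": set(), "calls": ["M"]},   # int a = M();
    {"line": 4, "defs": {"b"}, "uses": set(), "calls": []},      # char b = '';
    {"line": 5, "defs": {"c"}, "uses": {"b"}, "calls": ["N"]},   # Obj[] c = N(b);
    {"line": 6, "defs": {"s"}, "uses": {"c"}, "calls": []},      # s = c[1]; crash
]

sliced = backward_slice(d_statements, crash_line=6, seed_vars={"s", "c"})
kept_callees = callees_in_slice(d_statements, sliced)
```

In this sketch the call to N stays in the slice because it produces `c`, while the call to M drops out because `a` never reaches the crashing statement.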

After crash stack expansion, there are still a large number of suspicious functions

How to rank the suspicious functions?

Rank suspicious functions

• An empirical study on the characteristics of faulty functions

• Quantify the suspiciousness of suspicious functions

Observation 1: Frequent Function

• Faulty functions appear frequently in the crash traces of the corresponding buckets.

• Function Frequency (FF)

[Figure: crash reports in a crash bucket. More frequent, more suspicious.]

For 89-92% of crashing faults, the associated faulty functions appear in all crash execution traces in the corresponding bucket.

Frequent Function

• Some frequent functions are unlikely to be buggy
  – Entry points (main, _RtlUserThreadStart, …)
  – Event handling routines (CloseHandle)
• In information retrieval, some frequent words are useless stop-words, e.g. “the”, “an”, “a”
  – Inverse Document Frequency (IDF)
• Inverse Bucket Frequency (IBF)
  – If a function appears in many buckets, it is less likely to be buggy

Observation 2: Functions Close to Crash Point

• Faulty functions appear closer to the crash point
  – In Mozilla Firefox, for 84.3% of crashing faults, the distance between the crash point and the associated faulty functions is less than 5.
• Inverse Average Distance to Crash Point (IAD)

Observation 3: Less Frequently Changed Functions

• Functions that do not contain crashing faults are often less frequently changed
  – 94.1% of faulty functions have been changed at least once during the past 12 months
  – Immune Functions (Y. Dang et al., ICSE 2012)
• Less frequently changed functions
  – Functions that have no changes in the past 12 months
  – Suspicious score is 0

Observation 4: Large Functions

• Our prior study (H. Zhang. ICSM 2009) showed that large modules are more likely to be defect-prone

• Function’s Lines of Code (FLOC)

Suspicious Score

• FF (Function Frequency)
• IBF (Inverse Bucket Frequency)
• IAD (Inverse Average Distance to Crash Point)
• FLOC (Function’s Lines of Code)
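One way to combine the four factors is a simple product, matching the "IBF*FF*FLOC*IAD" legend used in the RQ3 chart. The sketch below is illustrative only: the slides name the factors but do not give their formulas, so the per-factor definitions here (log-damped IBF and FLOC, reciprocal of one plus the average distance for IAD) are assumptions, as are all function names and data.

```python
import math


def suspicious_score(func, bucket_traces, buckets_with_func, total_buckets,
                     loc, changed_in_last_12_months):
    """Score one function for one crash bucket.

    bucket_traces: one dict per crash report, mapping each function in the
    approximate trace to its distance from the crash point.
    """
    if not changed_in_last_12_months:
        return 0.0  # unchanged functions get score 0, as stated on the slide
    distances = [t[func] for t in bucket_traces if func in t]
    if not distances:
        return 0.0
    ff = len(distances) / len(bucket_traces)                  # assumed FF
    ibf = math.log(total_buckets / buckets_with_func + 1)     # assumed IBF
    iad = 1.0 / (1.0 + sum(distances) / len(distances))       # assumed IAD
    floc = math.log(loc + 1)                                  # assumed FLOC
    return ff * ibf * iad * floc


# Hypothetical bucket with three crash reports.
bucket_traces = [
    {"main": 5, "parse": 1, "render": 2},
    {"main": 6, "parse": 1},
    {"main": 5, "parse": 2, "render": 3},
]
# function -> (number of buckets containing it, lines of code); made-up values.
stats = {"main": (160, 30), "parse": (4, 200), "render": (10, 120)}

scores = {
    f: suspicious_score(f, bucket_traces, buckets, 160, loc, True)
    for f, (buckets, loc) in stats.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these made-up numbers, `main` ranks last even though it appears in every trace, because it also appears in every bucket and sits far from the crash point.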

Evaluation Subjects

• Mozilla Products
  – 5 releases of Firefox
  – 2 releases of Thunderbird
  – 1 release of SeaMonkey
• 160 crashing faults (buckets)
• Large-Scale
  – More than 2 million LOC
  – More than 120K functions

Evaluation Metrics

• Recall@N: Percentage of successfully located faults by examining top N recommended functions

• Mean Reciprocal Rank (MRR)
  – Measures the quality of ranking results in IR
  – Value range: 0 to 1
  – Higher value means better ranking
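Both metrics can be computed from the rank at which the faulty function first appears for each crashing fault. A minimal sketch follows; the input encoding (a list of ranks, with `None` when the faulty function is not in the ranked list) and the choice to count such faults as 0 in MRR are assumptions.

```python
def recall_at_n(ranks, n):
    """Fraction of faults whose faulty function is ranked within the top n."""
    hits = sum(1 for r in ranks if r is not None and r <= n)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all faults; unlocated faults contribute 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Hypothetical ranks for five crashing faults.
ranks = [1, 3, None, 1, 12]
r1 = recall_at_n(ranks, 1)
r5 = recall_at_n(ranks, 5)
mrr = mean_reciprocal_rank(ranks)
```

For these five hypothetical faults, two are located at rank 1 and three within the top 5, so Recall@1 is 0.4 and Recall@5 is 0.6.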

Experimental Design

• RQ1: How many faults can be successfully located by CrashLocator?

• RQ2: Can CrashLocator outperform the conventional stack-only methods?

• RQ3: How does each factor contribute to the crash localization performance?

• RQ4: How effective is the proposed crash stack expansion algorithm?

RQ1: CrashLocator Performance

System            Recall@1  Recall@5  Recall@10  MRR
Firefox 4.0b4     55.6%     66.7%     77.8%      0.627
Firefox 4.0b5     47.1%     70.6%     70.6%      0.566
Firefox 4.0b6     48.0%     64.0%     64.0%      0.540
Firefox 14.0.1    52.0%     52.0%     56.0%      0.528
Firefox 16.0.1    53.8%     53.8%     53.8%      0.542
Thunderbird 17.0  48.5%     66.7%     78.8%      0.568
Thunderbird 24.0  50.0%     66.7%     66.7%      0.544
SeaMonkey 2.21    55.0%     70.0%     70.0%      0.600
Summary           50.6%     63.7%     67.5%      0.559

RQ2: Comparison with Stack-Only methods

• Conventional Stack-Only Methods
  – StackOnlySampling
  – StackOnlyAverage
  – StackOnlyChangeDate

[Figure: StackOnlySampling, StackOnlyAverage, StackOnlyChangeDate and CrashLocator compared over the Top N Functions (N = 1, 5, 10, 20, 50, 100).]

RQ3: Contribution of Each Factor

• Inverse Bucket Frequency (IBF)
• Function Frequency (FF)
• Function’s Lines of Code (FLOC)
• Inverse Average Distance to Crash Point (IAD)

[Figure: results for IBF, IBF*FF, IBF*FF*FLOC and IBF*FF*FLOC*IAD on ff4.0b4, ff4.0b5, ff4.0b6, ff14.0.1, ff16.0.1, tb17.0, tb24.0, sm2.21 and Summary.]

RQ4: Stack Expansion Algorithms

• Basic Stack Expansion Algorithm
  – Static Call Graph
• Improved Stack Expansion Algorithm
  – Static Call Graph
  – Control Flow Analysis
  – Backward Slicing

[Figure: Basic Stack Trace Expansion versus Improved Stack Trace Expansion on Recall@1, Recall@5, Recall@10, Recall@20, Recall@50 and MRR.]

Conclusions

• Propose a novel technique CrashLocator to locate crashing faults based on crash stack only

• Evaluate on real and large-scale projects

• 50.6%, 63.7%, and 67.5% of crashing faults can be located by examining only the top 1, 5, and 10 functions, respectively

• CrashLocator outperforms Stack-Only methods significantly, with the improvement of MRR at least 32% and the improvement of Recall@10 at least 23%