Post on 01-Dec-2014
CrashLocator: Locating Crashing Faults Based on Crash Stacks
Rongxin Wu (1), Hongyu Zhang (2), Shing-Chi Cheung (1) and Sunghun Kim (1)
(1) The Hong Kong University of Science and Technology
(2) Microsoft Research
July 24th, 2014
ISSTA 2014
2
Background
[Figure: crash reporting workflow; labels: Software Crash, Crash Information with Crash Stack, Crash Reporting System, Crash Buckets, Bug Reports, Developers]
3
Feedback from Mozilla Developers
• Locating crashing faults is hard
• Ad hoc approach
“… and look at the crash stack listed. It shows the line number of the code, and then I go to the code and inspect it. If I am unsure what it does I go to the second line of the stack and code and inspect that, and so on and so forth …”
“Some crashes are hard to fix because it is not necessarily indicative of the place where it crashes in the crash stack …”
“I use the top down method of following the crash backwards.”
“Sometimes it can be very difficult.”
4
Uncertain Fault Location
• The faulty function may not appear in crash stack
About 33%~41% of crashing faults in Firefox cannot be located in crash stacks!
[Figure: call graph over functions A to H; labels: Crash Stack, Crash Point, Buggy Code]
5
Spectrum-Based Fault Localization
• Related Work
  – Tarantula (J. A. Jones et al., ICSE 2002) (J. A. Jones et al., ASE 2005)
  – Jaccard (R. Abreu et al., TAICPART-MUTATION 2007)
  – Ochiai (R. Abreu et al., TAICPART-MUTATION 2007) (S. Artzi et al., ISSTA 2010)
  – …
• Passing Traces and Failing Traces
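For context, the three ranking formulas named on this slide are standard in the spectrum-based literature, and each one needs execution counts from both passing and failing runs. A minimal Python sketch follows; the function name `sbfl_scores` and its argument names are illustrative and not taken from the cited papers.

```python
import math


def sbfl_scores(ef, ep, nf, np_):
    """Suspiciousness of one program element under three standard formulas.

    ef / ep: failing / passing runs that execute the element.
    nf / np_: failing / passing runs that do not execute it.
    """
    total_fail = ef + nf
    total_pass = ep + np_
    fail_ratio = ef / total_fail if total_fail else 0.0
    pass_ratio = ep / total_pass if total_pass else 0.0

    # Tarantula: share of the failing ratio in the combined ratio.
    ratio_sum = fail_ratio + pass_ratio
    tarantula = fail_ratio / ratio_sum if ratio_sum else 0.0

    # Jaccard: failing executions over all failing runs plus passing executions.
    jaccard_den = ef + nf + ep
    jaccard = ef / jaccard_den if jaccard_den else 0.0

    # Ochiai: failing executions over the geometric mean of the two totals.
    ochiai_den = math.sqrt(total_fail * (ef + ep))
    ochiai = ef / ochiai_den if ochiai_den else 0.0

    return {"tarantula": tarantula, "jaccard": jaccard, "ochiai": ochiai}


# An element executed by both failing runs and by one of four passing runs.
example = sbfl_scores(ef=2, ep=1, nf=0, np_=3)
```

All three scores depend on `ep` and `np_`, so without passing traces none of them can be computed as written.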
6
Spectrum-Based Fault Localization
• Are these techniques applicable?
[Figure: inputs to spectrum-based fault localization, with two items marked "x". Labels: Instrumented Product Software; Failing Traces; Passing Traces; Privacy Concern; Performance Overhead (C. Luk et al., PLDI 2005); Test Cases; Effectiveness (S. Artzi et al., ISSTA’10); Crash Stack f1, f2, f3, …, fn]
7
Our Research Goal
How to help developers fix crashing faults?
– Locate crashing faults based on crash stack
8
Our technique: CrashLocator
• Targets locating faulty functions
• No instrumentation needed
• Approximate failing traces
  – Based on crash stacks
  – Use static analysis techniques
• Rank suspicious functions
  – Without passing traces
  – Based on characteristics of faulty functions
9
Approximate Failing Traces
• Basic Stack Expansion Algorithm
Crash Stack: A, B, C, D
Depth-1: E, J, M, N
Depth-2: F, K, L
Depth-3: G, H
[Figure: Call Graph over the functions listed above, expanded from the crash stack one call depth at a time]
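The basic expansion can be sketched as a breadth-first walk over the static call graph, starting from the functions in the crash stack. The edges in `CALL_GRAPH` are an assumption, chosen only so that the walk reproduces the depth sets on this slide; the name `expand_stack` is hypothetical.

```python
def expand_stack(crash_stack, call_graph, max_depth):
    """Map each reachable function to its call depth from the crash stack.

    Depth 0 means the function is in the crash stack itself.
    """
    depth_of = {func: 0 for func in crash_stack}
    frontier = list(crash_stack)
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for caller in frontier:
            for callee in call_graph.get(caller, ()):
                if callee not in depth_of:  # keep the shallowest depth only
                    depth_of[callee] = depth
                    next_frontier.append(callee)
        frontier = next_frontier
    return depth_of


# Assumed call edges (not recoverable from the transcript).
CALL_GRAPH = {
    "A": ["B", "J"], "B": ["C"], "C": ["D", "E"], "D": ["M", "N"],
    "J": ["K", "L"], "E": ["F"], "F": ["G", "H"],
}

depths = expand_stack(["A", "B", "C", "D"], CALL_GRAPH, max_depth=3)
by_depth = {d: sorted(f for f, fd in depths.items() if fd == d) for d in range(4)}
```

With these assumed edges, `by_depth` groups E, J, M, N at depth 1, then F, K, L at depth 2, then G, H at depth 3.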
10
Approximate Failing Traces
• Basic Stack Expansion Algorithm
  – Function call information only
• Improved Stack Expansion Algorithm
  – Source file position information
Crash Stack:
function  position  File    Line
D         0         file_0  l0
C         1         file_1  l1
B         2         file_2  l2
A         3         file_3  l3
11
Improved Stack Expansion Algorithm
• Control Flow Analysis
[Figure: CFG of A, with Entry, an if branch, the calls J()… and B()…, and Exit; label: In Crash Stack]
Crash Stack: A, B, C, D
Depth-1: E, J, M, N
Depth-2: F, K, L
Depth-3: G, H
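One reading of the control flow step, which the slide does not spell out, is that a callee is kept only if its call site lies on some path from the function entry to the call site recorded in the crash stack. The sketch below follows that reading. The block names, the placement of J() and B() on opposite branches, and the helper names are all assumptions.

```python
def reachable(start, edges):
    """Return every node reachable from start by following edges."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def calls_on_paths_to(cfg, calls_in_block, entry, site_block):
    """Callees whose block lies on some path from entry to site_block."""
    reverse = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    # A block is on such a path if entry reaches it and it reaches the site.
    on_path = reachable(entry, cfg) & reachable(site_block, reverse)
    return {callee for block in on_path
            for callee in calls_in_block.get(block, ())}


# Assumed CFG of A: the if has two branches, one calling J, one calling B.
CFG_OF_A = {"entry": ["if"], "if": ["then", "else"],
            "then": ["exit"], "else": ["exit"]}
CALLS = {"then": ["J"], "else": ["B"]}

kept = calls_on_paths_to(CFG_OF_A, CALLS, entry="entry", site_block="else")
```

Under these assumptions `kept` contains only B, so J and everything reached only through J would drop out of the expansion.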
12
Improved Stack Expansion Algorithm
• Backward Slicing
Obj D() {
    Obj s;
    int a = M();
    char b = '';
    Obj[] c = N(b);
    s = c[1];  // crash here
    if (s != '') {
        …
    }
    …
}
variables {s, c}
Crash Stack: A, B, C, D
Depth-1: E, M, N
Depth-2: F
Depth-3: G, H
[Figure: expanded call graph with a "Not in slicing" label]
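The slicing step can be illustrated on the body of D with a deliberately simplified backward slice. The sketch assumes straight-line code and ignores control dependence and aliasing, so it is far weaker than a real slicer. The `Stmt` record and the function name are hypothetical, and `line` counts lines from the function header.

```python
from typing import NamedTuple


class Stmt(NamedTuple):
    line: int       # position counted from the function header
    defines: str    # variable assigned by the statement
    uses: tuple     # variables read by the statement
    calls: tuple    # functions called by the statement


def callees_in_backward_slice(stmts, crash_line, seed_vars):
    """Collect callees of statements that affect seed_vars before crash_line."""
    relevant = set(seed_vars)
    callees = set()
    for stmt in sorted(stmts, key=lambda s: s.line, reverse=True):
        if stmt.line >= crash_line:
            continue  # only statements executed before the crash matter
        if stmt.defines in relevant:
            relevant.update(stmt.uses)   # their inputs become relevant too
            callees.update(stmt.calls)
    return callees


# Statements of D that precede the crashing statement s = c[1].
FUNCTION_D = [
    Stmt(2, "s", (), ()),
    Stmt(3, "a", (), ("M",)),
    Stmt(4, "b", (), ()),
    Stmt(5, "c", ("b",), ("N",)),
]

sliced = callees_in_backward_slice(FUNCTION_D, crash_line=6, seed_vars={"s", "c"})
```

Here `sliced` contains N but not M, because `a` never flows into `s` or `c`.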
13
After crash stack expansion, there are still a large number of suspicious functions
How to rank the suspicious functions?
14
Rank suspicious functions
• An empirical study on the characteristics of faulty functions
• Quantify the suspiciousness of suspicious functions
15
Observation 1: Frequent Function
• Faulty functions appear frequently in the crash traces of the corresponding buckets.
• Function Frequency (FF): more frequent, more suspicious
For 89-92% of crashing faults, the associated faulty functions appear in all crash execution traces in the corresponding bucket.
[Figure: crash reports grouped into a Crash Bucket]
16
Frequent Function
• Some frequent functions are unlikely to be buggy
  – Entry points (main, _RtlUserThreadStart, …)
  – Event handling routine (CloseHandle)
• In information retrieval, some frequent words are useless stop-words, e.g. “the”, “an”, “a”
  – Inverse Document Frequency (IDF)
• Inverse Bucket Frequency (IBF)
  – If a function appears in many buckets, it is less likely to be buggy
17
Observation 2: Functions Close to Crash Point
• Faulty functions appear closer to crash point
  – In Mozilla Firefox, for 84.3% of crashing faults, the distance between crash point and the associated faulty functions is less than 5.
• Inverse Average Distance to Crash Point (IAD)
18
Observation 3: Less Frequently Changed Functions
• Functions that do not contain crashing faults are often less frequently changed
  – 94.1% of faulty functions have been changed at least once during the past 12 months
  – Immune Functions (Y. Dang et al. ICSE 2012)
• Less frequently changed functions
  – Functions that have no changes in past 12 months
  – Suspicious score is 0
19
Observation 4: Large Functions
• Our prior study (H. Zhang. ICSM 2009) showed that large modules are more likely to be defect-prone
• Function’s Lines of Code (FLOC)
20
Suspicious Score
• FF (Function Frequency)
• IBF (Inverse Bucket Frequency)
• IAD (Inverse Average Distance to Crash Point)
• FLOC (Function’s Lines of Code)
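The slides name the four factors but do not give their formulas. The sketch below is one plausible way to combine them. It assumes that the factors are multiplied, that IBF and FLOC are log-damped in the style of IDF, and that functions unchanged in the past 12 months score 0. Every formula shape and name here is an assumption for illustration, not the authors' definition.

```python
import math


def suspicious_score(func, bucket, all_buckets, loc, unchanged=frozenset()):
    """Score func for one crash bucket.

    bucket: list of expanded traces, each a dict {function: distance to crash point}.
    all_buckets: list of buckets (including bucket itself).
    loc: dict {function: lines of code}.
    unchanged: functions with no changes in the past 12 months.
    """
    if func in unchanged:
        return 0.0
    containing = [trace for trace in bucket if func in trace]
    if not containing:
        return 0.0

    # FF: fraction of this bucket's traces that contain the function.
    ff = len(containing) / len(bucket)

    # IBF (assumed IDF-like form): rarer across buckets means higher weight.
    buckets_with_func = sum(1 for b in all_buckets if any(func in t for t in b))
    ibf = math.log(len(all_buckets) / max(1, buckets_with_func) + 1)

    # IAD (assumed form): closer to the crash point means higher weight.
    avg_distance = sum(trace[func] for trace in containing) / len(containing)
    iad = 1.0 / (1.0 + avg_distance)

    # FLOC (assumed log damping): larger functions weigh more.
    floc = math.log(loc.get(func, 0) + 1)

    return ff * ibf * iad * floc


bucket_1 = [{"main": 3, "parse": 1, "render": 0}, {"main": 2, "parse": 0}]
bucket_2 = [{"main": 1, "draw": 0}]
buckets = [bucket_1, bucket_2]
sizes = {"main": 10, "parse": 200, "render": 200, "draw": 50}

ranking = sorted(sizes, key=lambda f: suspicious_score(f, bucket_1, buckets, sizes),
                 reverse=True)
```

In this toy data `main` appears in every bucket and far from the crash point, so it ranks below `parse` and `render` for `bucket_1`.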
21
Evaluation Subjects
• Mozilla Products
  – 5 releases of Firefox
  – 2 releases of Thunderbird
  – 1 release of SeaMonkey
• 160 crashing faults (buckets)
• Large-Scale
  – More than 2 million LOC
  – More than 120K functions
22
Evaluation Metrics
• Recall@N: Percentage of successfully located faults by examining top N recommended functions
• Mean Reciprocal Rank (MRR)
  – Measure the quality of the ranking results in IR
  – Range value: 0 ~ 1
  – Higher value means better ranking
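Both metrics can be computed from the rank at which the faulty function first appears for each crashing fault. The sketch below assumes 1-based ranks, with `None` for faults whose faulty function is not in the ranked list at all; the function names are illustrative.

```python
def recall_at_n(ranks, n):
    """Fraction of faults whose faulty function is ranked within the top n."""
    hits = sum(1 for rank in ranks if rank is not None and rank <= n)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all faults; unlocated faults contribute 0."""
    return sum(1.0 / rank for rank in ranks if rank is not None) / len(ranks)


# Four hypothetical faults: located at ranks 1, 3 and 12, and one not located.
ranks = [1, 3, None, 12]
recall_1 = recall_at_n(ranks, 1)
recall_5 = recall_at_n(ranks, 5)
mrr = mean_reciprocal_rank(ranks)
```

For these made-up ranks, one of four faults is found at the top position and two of four within the top 5.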
23
Experimental Design
• RQ1: How many faults can be successfully located by CrashLocator?
• RQ2: Can CrashLocator outperform the conventional stack-only methods?
• RQ3: How does each factor contribute to the crash localization performance?
• RQ4: How effective is the proposed crash stack expansion algorithm?
24
RQ1: CrashLocator Performance
System            Recall@1  Recall@5  Recall@10  MRR
Firefox 4.0b4     55.6%     66.7%     77.8%      0.627
Firefox 4.0b5     47.1%     70.6%     70.6%      0.566
Firefox 4.0b6     48.0%     64.0%     64.0%      0.540
Firefox 14.0.1    52.0%     52.0%     56.0%      0.528
Firefox 16.0.1    53.8%     53.8%     53.8%      0.542
Thunderbird 17.0  48.5%     66.7%     78.8%      0.568
Thunderbird 24.0  50.0%     66.7%     66.7%      0.544
SeaMonkey 2.21    55.0%     70.0%     70.0%      0.600
Summary           50.6%     63.7%     67.5%      0.559
25
RQ2: Comparison with Stack-Only methods
• Conventional Stack-Only Methods
  – StackOnlySampling
  – StackOnlyAverage
  – StackOnlyChangeDate
26
RQ2: Comparison with Stack-Only methods
[Figure: Recall@N versus Top N Functions (N = 1, 5, 10, 20, 50, 100) for StackOnlySampling, StackOnlyAverage, StackOnlyChangeDate, and CrashLocator; y-axis from 0 to 0.8]
27
RQ3: Contribution of Each Factor
• Inverse Bucket Frequency (IBF)
• Function Frequency (FF)
• Function’s Lines of Code (FLOC)
• Inverse Average Distance to Crash Point (IAD)
28
RQ3: Contribution of Each Factor
[Figure: MRR per subject (ff4.0b4, ff4.0b5, ff4.0b6, ff14.0.1, ff16.0.1, tb17.0, tb24.0, sm2.21, Summary) for IBF, IBF*FF, IBF*FF*FLOC, and IBF*FF*FLOC*IAD; y-axis from 0 to 0.7]
29
RQ4: Stack Expansion Algorithms
• Basic Stack Expansion Algorithm
  – Static Call Graph
• Improved Stack Expansion Algorithm
  – Static Call Graph
  – Control Flow Analysis
  – Backward Slicing
30
RQ4: Stack Expansion Algorithms
[Figure: Basic Stack Trace Expansion versus Improved Stack Trace Expansion on Recall@1, Recall@5, Recall@10, Recall@20, Recall@50, and MRR; y-axis from 0 to 0.8]
31
Conclusions
• Propose a novel technique, CrashLocator, to locate crashing faults based on crash stacks only
• Evaluate on real and large-scale projects
• 50.6%, 63.7%, and 67.5% of crashing faults can be located by examining only the top 1, 5, and 10 functions, respectively
• CrashLocator significantly outperforms Stack-Only methods, improving MRR by at least 32% and Recall@10 by at least 23%