Date posted: 05-Dec-2014
Mining Branching-Time Scenarios
Dirk Fahland
David Lo
Shahar Maoz
Eindhoven University of Technology
Singapore Management University
Tel Aviv University
Understanding Existing Applications
[Figure: a specification provides the understanding of objects and object interplay that is needed to fix bugs, add features, test, document, …]
Usually there is no specification or, worse, it exists but is outdated.
Understanding applications from source code is
• laborious
• time-consuming
• error-prone
Specification Mining
[Figure: automatically extract a specification from the source code; the specification gives the understanding of objects and object interplay]
Specification Mining from Event Logs
[Figure: the source code produces a log; the specification is automatically extracted from the log and gives the understanding of objects and object interplay]
This talk
What is a good specification language to get an overview of how an application works?
Understanding Object Interplay
FTP server: How does Login/Logout work?
In object-oriented applications, the hard part is to understand how the different objects relate to and interact with each other.
[Figure: some classes of an FTP server]
Understanding Object Interplay
Scenario: Login/Logout
[Figure: sequence chart with lifelines A, B, UserCmd, C and the message onConnect()]
An object of class A invokes method onConnect() on an object of class B.
[Figure: the Login/Logout scenario extended with the messages onLogin(), setUser(), setLogout()]
• non-local behavior: multiple objects
• logically related
When does it happen?
This scenario tells us which objects and methods are involved in login/logout. But it does not tell us when this scenario occurs in the application.
Whenever the pre-chart happens, the main chart will eventually happen.
Linear-Time LSCs - Invariants
[Figure: Login LSC with pre-chart, 2 runs]
There can be other behaviors between the behaviors shown in the LSC.
One run shown does not continue with the complete main chart of the LSC.
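A minimal Python sketch of this linear-time reading, under a simplifying assumption that is not taken from the slides: events outside the LSC's alphabet are ignored (they are the "other behaviors" in between), and in the projected run every occurrence of the pre-chart must be followed directly by the complete main chart. The function name and the event names in the usage example are illustrative only.

```python
def linear_lsc_holds(run, pre, main):
    """Check one run against a linear LSC (simplified semantics, see above)."""
    chart = list(pre) + list(main)
    alphabet = set(chart)
    # Drop events the LSC does not mention; they may occur in between.
    projected = [event for event in run if event in alphabet]
    n = len(pre)
    for i in range(len(projected) - n + 1):
        if projected[i:i + n] == list(pre):
            # The pre-chart happened: the main chart must follow completely.
            if projected[i + n:i + len(chart)] != list(main):
                return False
    return True


pre = ["onConnect"]
main = ["onLogin", "setUser", "setLogout"]
complete_run = ["onConnect", "other", "onLogin", "setUser", "other", "setLogout"]
aborted_run = ["onConnect", "onLogin", "other"]

print(linear_lsc_holds(complete_run, pre, main))  # True
print(linear_lsc_holds(aborted_run, pre, main))   # False
```

A run in which the pre-chart never occurs satisfies the LSC vacuously in this sketch.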
Understanding Everything
[Figure: 2 alternative runs with the alternative behaviors FTP download, FTP delete, FTP rename]
Not everything is an invariant.
Linear-Time is Insufficient
[Figure: scenario for the FTP delete command, 2 runs]
This LSC for the delete command does not hold in every run.
Branching Time
We merge all runs on their joint prefixes into an execution tree.
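Merging runs on their joint prefixes is what a prefix tree (trie) does. The following Python sketch shows one way to build such an execution tree; the nested-dict representation and the event names are assumptions made for illustration, not the data structure of the authors' tool.

```python
def build_execution_tree(runs):
    """Merge runs on their shared prefixes into a nested-dict prefix tree."""
    root = {}
    for run in runs:
        node = root
        for event in run:
            # Reuse the child if another run already took this step.
            node = node.setdefault(event, {})
    return root


runs = [
    ["login", "delete", "logout"],
    ["login", "download", "logout"],
]
tree = build_execution_tree(runs)
print(tree)
# {'login': {'delete': {'logout': {}}, 'download': {'logout': {}}}}
```

Both runs share the node for login and branch afterwards, which is exactly the place where alternative behaviors become visible.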
Branching-Time LSCs
Whenever the pre-chart happens, there exists a branch of the execution tree where the main chart happens. [Sibay, Uchitel, Braberman ICSE 2008]
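The existential reading can be sketched in Python on an execution tree given as nested dicts. This is a simplified illustration, not the semantics as formalized in the cited paper: it assumes a non-empty pre-chart, ignores events outside the LSC's alphabet, and uses made-up event names.

```python
def exists_branch(node, main, alphabet):
    """True if some path below `node` completes `main`, skipping foreign events."""
    if not main:
        return True
    for event, child in node.items():
        if event == main[0]:
            if exists_branch(child, main[1:], alphabet):
                return True
        elif event not in alphabet:
            if exists_branch(child, main, alphabet):
                return True
    return False


def branching_lsc_holds(tree, pre, main):
    """Check a branching LSC on a nested-dict execution tree (non-empty pre)."""
    pre, main = tuple(pre), tuple(main)
    alphabet = set(pre) | set(main)

    def walk(node, recent):
        for event, child in node.items():
            seen = recent
            if event in alphabet:
                seen = (recent + (event,))[-len(pre):]
                # Pre-chart just completed here: some branch must show main.
                if seen == pre and not exists_branch(child, main, alphabet):
                    return False
            if not walk(child, seen):
                return False
        return True

    return walk(tree, ())


tree = {"login": {"delete": {}, "download": {}}}
print(branching_lsc_holds(tree, ["login"], ["delete"]))    # True
print(branching_lsc_holds(tree, ["login"], ["download"]))  # True
print(branching_lsc_holds(tree, ["login"], ["rename"]))    # False
```

Neither the delete nor the download scenario would hold as an invariant over both runs, yet each holds here because one branch after login is enough.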
Describing Alternatives
We can define an LSC for the download command that is an alternative to delete, and also an LSC for the rename command.
LSC Mining from Event Logs
[Figure: log → automatically extract → complete set of LSCs (linear / branching) → understanding of objects and object interplay]
We want to discover a set of LSCs that describes all behaviors (or as many of the behaviors as possible).
Logs
Log method calls:
caller1, callee1, method1(…)
caller2, callee2, method2(…)
…
Each execution of the application gives one trace. Run the application multiple times to obtain a log.
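As an illustration of this log structure, the Python sketch below models an event as a (caller, callee, method) triple, a trace as a list of events, and a log as a list of traces. The comma-separated line format is an assumption based on the schematic notation above; the actual file format of the authors' tool is not shown here.

```python
from typing import NamedTuple


class Event(NamedTuple):
    caller: str
    callee: str
    method: str


def parse_trace(lines):
    """Parse one trace from lines of the assumed form 'caller, callee, method(...)'."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first two commas only, so arguments may contain commas.
        caller, callee, method = (part.strip() for part in line.split(",", 2))
        events.append(Event(caller, callee, method))
    return events


# One trace per execution; a log is a list of traces.
log = [
    parse_trace(["A, B, onConnect()"]),
    parse_trace(["A, B, onConnect()", ""]),
]
print(log[0][0].method)  # onConnect()
```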
Desired Outcome
Automatically extract the complete set of LSCs occurring at least s times (s = support) from the log, where the log is represented as an execution tree.
Mining Algorithm
Input: the execution tree.
github.com/scenario-based-tools/sam/
variant of [Lo, Maoz, Khoo ASE 2007]
1. Enumerate all sequences of events occurring ≥ s times (candidate words).
[Figure: tree and candidate words over onConnect(), onLogin(), setLogin(), setLogout()]
Starting from sequences of length 1, recursively append events and check whether the resulting sequence occurs often enough. An efficient implementation uses branch and bound and some heuristics.
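The enumeration step can be sketched in Python as follows. This is a simplified stand-in, not the tool's implementation: it counts only contiguous occurrences in plain traces, and its only pruning is the bound that an extension can never occur more often than the word it extends. The function names and the example traces are hypothetical.

```python
def count_occurrences(traces, word):
    """Count contiguous occurrences of `word` across all traces."""
    n = len(word)
    return sum(
        1
        for trace in traces
        for i in range(len(trace) - n + 1)
        if tuple(trace[i:i + n]) == word
    )


def frequent_words(traces, s):
    """Enumerate all event sequences that occur at least `s` times."""
    if s < 1:
        raise ValueError("support threshold s must be at least 1")
    alphabet = sorted({event for trace in traces for event in trace})
    result = {}

    def extend(word):
        for event in alphabet:
            candidate = word + (event,)
            support = count_occurrences(traces, candidate)
            # Bound: extensions of an infrequent word are infrequent too.
            if support >= s:
                result[candidate] = support
                extend(candidate)

    extend(())
    return result


traces = [
    ["onConnect", "onLogin", "setLogin", "setLogout"],
    ["onConnect", "onLogin", "setLogin", "setLogout"],
    ["onConnect", "onLogin"],
]
for word, support in frequent_words(traces, 3).items():
    print(word, support)
```

With s = 3 only onConnect, onLogin, and the pair onConnect followed by onLogin survive; lowering s admits longer words at the price of a larger search space.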
2. Generate all candidate LSCs from each candidate word: from the LSC with a pre-chart of length 1 to the LSC with a main chart of length 1.
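Read literally, each candidate word is cut at every position into a pre-chart and a main chart. A minimal Python sketch of that reading (the function name is hypothetical and the slide does not spell out further filtering):

```python
def candidate_lscs(word):
    """Split a candidate word into all (pre-chart, main chart) pairs."""
    # k = 1 gives a pre-chart of length 1, k = len(word) - 1 a main chart of length 1.
    return [(word[:k], word[k:]) for k in range(1, len(word))]


word = ("onConnect", "onLogin", "setLogin", "setLogout")
for pre, main in candidate_lscs(word):
    print(pre, "->", main)
```

A word of length n yields n - 1 candidates, so words of length 1 produce no LSC.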
3. Test for each candidate LSC whether it is satisfied ≥ c% of the time.
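How the satisfaction ratio is measured is not spelled out here. One plausible reading, assumed in the Python sketch below, is a confidence value: the fraction of pre-chart occurrences that are followed by the main chart. The sketch uses contiguous matching on plain traces for brevity, and all names are hypothetical.

```python
def confidence(traces, pre, main):
    """Fraction of pre-chart occurrences directly followed by the main chart."""
    pre, main = tuple(pre), tuple(main)
    total = satisfied = 0
    for trace in traces:
        events = tuple(trace)
        for i in range(len(events) - len(pre) + 1):
            if events[i:i + len(pre)] == pre:
                total += 1
                j = i + len(pre)
                if events[j:j + len(main)] == main:
                    satisfied += 1
    # No occurrence of the pre-chart: report 0.0 rather than divide by zero.
    return satisfied / total if total else 0.0


traces = [
    ["onConnect", "onLogin", "setLogin", "setLogout"],
    ["onConnect", "onLogin", "setLogin", "setLogout"],
    ["onConnect", "onLogin"],
]
c = 0.5  # confidence threshold, written as a fraction instead of a percentage
value = confidence(traces, ["onConnect", "onLogin"], ["setLogin", "setLogout"])
print(value >= c)  # True: 2 of 3 pre-chart occurrences continue with the main chart
```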
LSC Mining from Event Logs
What do branching scenarios add to specification mining?
Experiments
CrossFTP server: 54 traces, 50 event types

s    Linear LSC   covered events   avg. length   time [s]   Branching LSC   covered events   avg. length   time [s]
20   7            90%              7             3          7+0             90%              7             3
14   9            90%              5             31         9+12            95%              13            26
10   9            90%              5             1008       9+18            95%              18            685
The branching LSCs contain the linear LSCs plus some strictly branching LSCs (the second summand, e.g. 9+12) that were not found before. Branching LSCs are less frequent: they appear only at lower support thresholds.
Branching LSCs cover more events of the log than linear LSCs alone (95% vs. 90% at s = 14 and s = 10).
Branching LSCs are longer than linear LSCs (average length 13 and 18 vs. 5 at s = 14 and s = 10). In other words, they show more detail for a particular behavior.
Running times for extraction are feasible. Note that the LSCs counted here are those left after removing subsumed ones. Originally, the algorithm finds around 6 million branching LSCs in 685 seconds.
Columba mail client: 104 traces, 79 event types

s / c   Linear LSC   covered events   avg. length   time [s]   Branching LSC   covered events   avg. length   time [s]
20      57           70%              4             159        57+1            71%              9             154
10      205          72%              6             2191       205+53          75%              9             2055
10/.5   163          78%              6             2256       163+44          84%              6             2125

Full data sets and results:
http://dx.doi.org/10.4121/uuid:aa7db920-aae6-4750-8975-cb739262f432
Linear vs. Branching: CrossFTP
What is the qualitative contribution of branching LSCs to specification mining?
Linear LSC: the application life-cycle from end to end (connect, login, logout, clean up).
Linear LSCs: short invariants of individual FTP commands, e.g. an invariant of RENAME.
Branching LSC: FTP command + where it is triggered (figure labels: login, rename command).
The branching LSC fills the gap between large and small invariants.
Branching LSCs: all FTP commands can be triggered in the same situation.
We found all FTP commands supported by the server as alternative LSCs.
Cycles (rename, delete): we could also discover cyclic behavior, e.g. after a rename there can be another delete command.
Take Home Points
• mining branching scenarios: alternatives, cycles
• combined with linear: comprehensive specification
• future work: visualizing results, distributed scenarios
http://github.com/scenario-based-tools/sam/
about.me/dirk.fahland
@dfahland
Q&A: …is branching time really necessary?
Yes. Here is a linear LSC showing a disjunction for continuing after the pre-chart: if the pre-chart happens, then delete or download.
Branching Time vs. Disjunction
The full execution tree satisfies both this linear LSC with disjunction and the two branching LSCs that describe the two alternatives separately.
Removing one branch from the tree (the execution of the download command) violates the branching LSCs, because the one for download no longer has a branch that satisfies it. The tree still satisfies the disjunctive linear LSC, because only one of the alternatives has to hold.
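The difference can be replayed on a toy example. The Python sketch below is a deliberately reduced model and not the formal LSC semantics: the pre-chart is matched only at the start of a run, matching is contiguous, and all runs that share the pre-chart are treated as branches below the same tree node. Function and event names are hypothetical.

```python
def continuations(runs, pre):
    """Continuations of all runs that start with the pre-chart."""
    n = len(pre)
    return [run[n:] for run in runs if run[:n] == pre]


def disjunctive_linear_holds(runs, pre, mains):
    """Every continuation must follow at least one of the alternatives."""
    return all(
        any(cont[:len(main)] == main for main in mains)
        for cont in continuations(runs, pre)
    )


def branching_holds(runs, pre, main):
    """Some continuation must follow the main chart (if the pre-chart occurs)."""
    conts = continuations(runs, pre)
    return not conts or any(cont[:len(main)] == main for cont in conts)


pre = ["login"]
full = [["login", "delete"], ["login", "download"]]
pruned = [["login", "delete"]]  # the download branch has been removed

print(disjunctive_linear_holds(full, pre, [["delete"], ["download"]]))    # True
print(branching_holds(full, pre, ["download"]))                           # True
print(disjunctive_linear_holds(pruned, pre, [["delete"], ["download"]]))  # True
print(branching_holds(pruned, pre, ["download"]))                         # False
```

The disjunctive linear LSC cannot tell the two trees apart, while the branching LSC for download detects that the alternative is gone.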