Date posted: 27-Dec-2015
Cross-language Program Slicing in the .NET Framework
Krisztián Pócza, Mihály Biczó, Zoltán Porkoláb
Eötvös Loránd University, Hungary
Faculty of Informatics, Department of Programming Languages and Compilers

The structure of this presentation
• Motivation
• Theoretical introduction
• Earlier work
• Slicing - .NET Framework
• Technical outlook
• Architecture & algorithm
• Practical experiences
• Summary

Motivation
• What does the term ‘program slicing’ really cover?
  – Mapping mental abstractions of programmers
  – Easing program maintenance tasks (debugging)
• There is a need to integrate slicing into modern debuggers
• Real world applications are composed of several modules written in different languages

Theoretical introduction
• Slicing means finding all statements that might directly or indirectly affect the values of the variables in a set V
  – The result depends on the program location
  – The slicing problem is defined by a pair C = (p, V), where p denotes a program location
  – This pair is called the slicing criterion
  – In the classical case…

Theoretical introduction
• Static vs. dynamic slicing: what’s the difference?
  – The input of the program was disregarded in the previous (static) case
  – If the input is taken into account, we talk about dynamic slicing
  – The dynamic slicing criterion is a triple C = (I, o, V), where I is the program input and o is an occurrence of a statement
  – Occurrence

Theoretical introduction
• Example program and its control flow graph
• CFG is the basis of more advanced concepts

 1  sum = 0
 2  mul = 1
 3  a = 1
 4  b = read()
 5  while (a <= b) {
 6    sum = sum + a
 7    mul = mul * a
 8    a = a + 1
 9  }
10  write(sum)
11  write(mul)

[Figure: control flow graph of the example program with nodes START, 1 to 8, 10, 11 and STOP; the two edges leaving the loop test are labeled T and F]

Remarks:
• Intuitive representation
• Control dependence
• Data dependence
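The control flow graph of the example can be written down as a successor map. The following is a minimal Python sketch, not code from the presentation: the names `CFG` and `execution_path` are illustrative, and the edges are read off the program text (line 9 is only a closing brace, so it is assumed to have no node).

```python
# Successor map of the example program's control flow graph.
# Keys are statement numbers of the example program.
CFG = {
    "START": [1],
    1: [2],
    2: [3],
    3: [4],
    4: [5],
    5: [6, 10],    # loop test: T branch enters the body, F branch leaves it
    6: [7],
    7: [8],
    8: [5],        # back edge to the loop test
    10: [11],
    11: ["STOP"],
    "STOP": [],
}


def execution_path(b):
    """Return the statement numbers executed when read() yields b."""
    a = 0
    path = []
    node = CFG["START"][0]
    while node != "STOP":
        path.append(node)
        if node == 3:
            a = 1              # statement 3: a = 1
        elif node == 8:
            a += 1             # statement 8: a = a + 1
        if node == 5:
            true_succ, false_succ = CFG[5]
            node = true_succ if a <= b else false_succ
        else:
            node = CFG[node][0]
    return path


if __name__ == "__main__":
    print(execution_path(2))
```

Only the variable `a` and the input `b` influence the path, so the walk tracks nothing else. Different inputs give different paths through the same graph, which is the observation dynamic slicing builds on.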

Theoretical introduction
• Post-dominator (m post-dominates n)
  – m and n are nodes in the CFG
  – Every path from n to the STOP node goes through m
• Control dependence (m is control dependent on n)
  – there exists a path p from n to m in the CFG,
  – m is a post-dominator of every node in p except n, and
  – m is not a post-dominator of n
• Data dependence (m is data dependent on n)
  – there is a path p from n to m in the CFG,
  – there is a variable v with v ∈ def(n) and v ∈ ref(m), and
  – for all nodes k of path p other than n and m, v ∉ def(k) holds
• Program Dependence Graph (PDG)
• System Dependence Graph (SDG)
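The definitions of post-dominance and control dependence can be checked on the example CFG with a short Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the edge list is read off the example program, post-dominators are computed with the usual iterative fixed-point scheme, and the dependence of the loop test on itself is deliberately left out. The function names are hypothetical.

```python
# Edges of the example CFG (statement numbers; line 9 is a closing brace).
EDGES = [("START", 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (5, 10),
         (6, 7), (7, 8), (8, 5), (10, 11), (11, "STOP")]


def successors(edges):
    """Build a node -> list of successors map from an edge list."""
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, []).append(dst)
        succ.setdefault(dst, [])
    return succ


def post_dominators(succ, stop="STOP"):
    """pdom[n] = nodes that lie on every path from n to STOP (n included)."""
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[stop] = {stop}
    changed = True
    while changed:
        changed = False
        for n in nodes - {stop}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependences(succ, pdom):
    """Map each node m to the set of nodes n it is control dependent on."""
    deps = {}
    for n, succs in succ.items():
        strict = pdom[n] - {n}
        for s in succs:
            for m in pdom[s]:
                # m post-dominates a successor of n but not n itself
                if m != n and m not in strict:
                    deps.setdefault(m, set()).add(n)
    return deps


SUCC = successors(EDGES)
PDOM = post_dominators(SUCC)
DEPS = control_dependences(SUCC, PDOM)

if __name__ == "__main__":
    print(DEPS)
```

With these edges, only the statements of the loop body come out as control dependent on the loop test at node 5. Making the top-level statements depend on START would need the customary extra START to STOP edge, which this sketch omits.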

Earlier work
• Dynamic slicing
  – Different input, different execution branches
  – Different input, different dynamic slice (see the dynamic slicing criterion)
  – How to track execution paths?
  – Generating a call trace
    • Requires running the program against specified input values
    • Logs the execution path in a convenient format (typically plain text)
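As a language-neutral illustration of logging an execution path, the sketch below uses Python's own `sys.settrace` hook. It is only a stand-in for the debugger-based tracing discussed in this presentation; `record_trace` and `example` are hypothetical names, and the recorded numbers are line offsets inside the traced function.

```python
import sys


def record_trace(func, *args):
    """Run func(*args) and return the executed line offsets, relative to
    the 'def' line of func, in execution order."""
    code = func.__code__
    first = code.co_firstlineno
    trace = []

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None                 # do not trace other functions
        if event == "line":
            trace.append(frame.f_lineno - first)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)          # always restore the old hook
    return trace


def example(b):
    total = 0
    a = 1
    while a <= b:
        total += a
        a += 1
    return total


if __name__ == "__main__":
    # One plain-text line per executed statement, for a given input.
    for offset in record_trace(example, 2):
        print("st", offset)
```

The program has to be run against a concrete input to obtain a trace, and a different input yields a different trace. That is why the dynamic slicing criterion includes the input.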

Earlier work
• Dynamic slicing
  – Studied previously through real-world applications by the Java community
  – Java Platform Debugging Architecture (JPDA)
    • Java Virtual Machine Debug Interface (JVMDI)
    • Java Debug Wire Protocol (JDWP)
    • Java Debug Interface (JDI)
    • Only since JDK 1.3!
  – Custom solutions (JVM hacking)

Slicing - .NET Framework
• Architecture of the .NET Framework
  – BCL (Base Class Library): one library for all languages
  – CTS (Common Type System) and CLS (Common Language Specification): the CLS is a subset of the CTS
  – CLR (Common Language Runtime): managed code lattice

Slicing - .NET Framework
• Key concept of the .NET Framework: language interoperability
• Cross-language debugger
• Cross-language program slicers
  – identify bugs more precisely and at a much earlier stage

[Diagram: "Program slicing", ".NET debugging capabilities" and "Software quality", connected by arrows labeled "simplifies" and "improves"]

Technical outlook
• Earlier: Active Scripting
• Now: script engines compile and interpret code for the CLR
• .NET Debugging Services API
  – Debug any code compiled to IL
  – Debugging capabilities for all modern languages

Technical outlook
• The CLR supports two debugging modes:
  – In-process
    • Inspecting the run-time state of an application
    • Collecting profiling information
  – Out-of-process
    • The debugger runs in a separate process
    • Provides common debugger functionality such as stepping, breakpoints, etc.
• The CLR Debugging Services are implemented as a set of 70+ COM interfaces

Technical outlook
• Design-time interface
  – Responsible for handling debugging events
  – Implemented separately from the CLR
  – The host application resides in a different process
• Has a separate thread for receiving debugger events
• When a debug event occurs (assembly loaded, thread started, breakpoint reached, etc.) the application halts and the debugger thread notifies the debugging service through callback functions

Technical outlook
• Symbol manager
  – Interprets the program database (PDB) files
  – PDB files contain debugging information
  – Enables the unique identification of program elements such as classes, functions, variables and statements
  – The program database can also be used to retrieve their original position in the source code

Technical outlook
• Publisher
  – Enumerates all running managed processes in the system
• Profiler
  – Tracks application performance and the resources used by running managed processes

Architecture & algorithm
• Phase 1: Source code beautification → Recompile in debug mode → Generate call trace → Call trace
• Phase 2: Dynamic slicing algorithm → Cross-language slice

Architecture & algorithm
• Phase 1 produces the input for Phase 2
• Phase 1 steps:
  – Source code beautification:
    • Parsing code: Marcel Debreuil’s library using ANTLR
    • Writing back to a custom alignment: sequence point = line
  – Recompile in debug mode:
    • csc /debug+ …
    • vbc /debug+ …

Architecture & algorithm
• Phase 1 further steps:
  – Generate call trace:
    • Using .NET Debugging Services
    • Find the entry point
    • Place a breakpoint
    • Call the Step/Step In operation until the end
    • The Stepper is derived from MDbg’s source
  – Call trace:
    • Not a step: the output of Phase 1 and the input of Phase 2

Architecture & algorithm
• Phase 2 steps:
  – Dynamic slicing algorithm
    • Input: call trace and beautified source code
    • Output: cross-language slice
    • Language independent/platform independent
    • No .NET-specific features
  – Cross-language slice:
    • Not a step: the output

Usage of Debugger
• Generates output like:

module load
breakpoint hit
st MainClass.cs 10 M
st MainClass.cs 11 M
st MainClass.cs 12 M
st MainClass.cs 13 M
st MainClass.cs 14 M
module load
st MainClass.cs 20 M,R
st MainClass.cs 22 M,R
st Functions.cs 10 M,R,A
st Functions.cs 11 M,R,A
st MainClass.cs 23 M,R
st Functions.cs 15 M,R,P
st Functions.cs 16 M,R,P
…

• Demo program
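A trace in the format shown above is straightforward to read back for the second phase. The Python sketch below assumes that every `st` line has the shape `st <file> <line> <stack>`, and that the last column is a comma-separated call stack of abbreviated method names. That reading of the last column is an assumption, and `Step` and `parse_trace` are hypothetical names.

```python
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    file: str                 # source file of the executed line
    line: int                 # line number (one sequence point per line)
    stack: Tuple[str, ...]    # assumed: call stack, outermost method first


def parse_trace(text: str) -> List[Step]:
    """Keep only the 'st' lines; 'module load' and similar events are skipped."""
    steps = []
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) == 4 and parts[0] == "st":
            steps.append(Step(parts[1], int(parts[2]), tuple(parts[3].split(","))))
    return steps


SAMPLE = """\
module load
breakpoint hit
st MainClass.cs 10 M
st MainClass.cs 20 M,R
st Functions.cs 10 M,R,A
"""

if __name__ == "__main__":
    for step in parse_trace(SAMPLE):
        print(step)
```

Because the file name is part of every step, a single trace can cover modules written in different languages.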

Basics of Dynamic Slicing
• Call trace + source code → dynslice
• Intra-procedural/inter-procedural
• Example program:

 1  int n = askUser();
 2  int i = 0;
 3  int sum = 0;
 4  int prod = 1;
 5  while (i < n)
 6  {
 7    sum += i;
 8    prod *= i;
 9    i++;
10  }
11  Console.WriteLine(sum);

• Define and reference a variable
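
Basics of Dynamic Slicing
• Control Dependence Graph (not CFG)
  – Control Dependence Edges

[Figure: control dependence graph of the example program. Start is the root; nodes 1, 2, 3, 4, 5 and 11 hang from Start, and nodes 7, 8, 9 hang from the loop test 5]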

Action and Variable Store
• Action
  – Its value can be Def or Ref
  – An action is always stored together with the variable it belongs to
• Variable Store (VarStore)
  – (variable, Action) pairs
  – Action means the last action on the variable
  – Method-wide
  – Updated dynamically during dynamic slicing

Intra-procedural operation
• Backward algorithm
• First items in VarStore: (sliceVar, Ref)
• When encountering a statement:
  – A variable with a Ref Action is defined in the statement:
    • The statement is added to the slice
    • In VarStore: Ref -> Def
    • Referenced variables are added to VarStore with a Ref Action
  – A variable with a Def Action is redefined:
    • Nothing to do
    • Would be killed

Intra-procedural operation
• When adding a statement:
  – Add to LoopCond the statements that the current statement is control dependent on
    • The condition or loop test statement when a statement is encountered in its body
• When a statement is encountered:
  – Always check whether it is in LoopCond
  – If yes:
    • Add the referenced variables to VarStore
    • Add the statement to the slice
    • Add its parents to LoopCond

Intra-procedural operation
• Remember the example program?
• Call trace of the example program: 1, 2, 3, 4, (5, 7, 8, 9){n}, 5, 11, where the group in parentheses is repeated n times
• Slicing criterion: (<n=2>, 11^1, {sum}), where 11^1 denotes the first occurrence of statement 11
• Example run on next slide

 1  int n = askUser();
 2  int i = 0;
 3  int sum = 0;
 4  int prod = 1;
 5  while (i < n)
 6  {
 7    sum += i;
 8    prod *= i;
 9    i++;
10  }
11  Console.WriteLine(sum);
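The call trace of the example depends only on the input n. The Python sketch below mirrors the example program statement by statement and compares the recorded trace with the closed form, read as "1, 2, 3, 4, then the loop group n times, then 5, 11". The function names are illustrative, and `askUser()` is replaced by a parameter.

```python
def traced_run(n):
    """Execute the example program for input n, recording statement numbers."""
    trace = [1, 2, 3, 4]        # the four initialisations
    i, total, prod = 0, 0, 1
    while True:
        trace.append(5)         # loop test (i < n)
        if not i < n:
            break
        total += i
        trace.append(7)
        prod *= i
        trace.append(8)
        i += 1
        trace.append(9)
    trace.append(11)            # Console.WriteLine(sum)
    return trace, total


def closed_form_trace(n):
    """The trace as a formula: the loop group (5, 7, 8, 9) is repeated n times."""
    return [1, 2, 3, 4] + [5, 7, 8, 9] * n + [5, 11]


if __name__ == "__main__":
    print(traced_run(2)[0])
```

For n = 2 the trace contains statement 5 three times but statement 11 only once, which is why the criterion has to name an occurrence of a statement and not just a statement.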

Intra-procedural operation

trace  VarStore                     loop-cond  Slice
11     (sum,Ref)                    -          -
5      (sum,Ref)                    -          -
9      (sum,Ref)                    5          -
8      (sum,Ref)                    5          -
7      (sum,Ref),(i,Ref)            5          7
5      (sum,Ref),(i,Ref),(n,Ref)    -          5,7
9      (sum,Ref),(i,Ref),(n,Ref)    5          5,7,9
8      (sum,Ref),(i,Ref),(n,Ref)    5          5,7,9
7      (sum,Ref),(i,Ref),(n,Ref)    5          5,7,9
5      (sum,Ref),(i,Ref),(n,Ref)    -          5,7,9
4      (sum,Ref),(i,Ref),(n,Ref)    -          5,7,9
3      (sum,Def),(i,Ref),(n,Ref)    -          3,5,7,9
2      (sum,Def),(i,Def),(n,Ref)    -          2,3,5,7,9
1      (sum,Def),(i,Def),(n,Def)    -          1,2,3,5,7,9
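The run above can be replayed with a compact Python sketch of the backward, intra-procedural algorithm. It is one reading of the slides, not the authors' code: the def/ref sets and control parents of the example are written out by hand, a statement is dropped from LoopCond once it has been handled, and all names are hypothetical. It is meant to reproduce the VarStore and Slice columns only.

```python
REF, DEF = "Ref", "Def"

# statement number -> (variables defined, variables referenced)
STATEMENTS = {
    1: ({"n"}, set()),
    2: ({"i"}, set()),
    3: ({"sum"}, set()),
    4: ({"prod"}, set()),
    5: (set(), {"i", "n"}),
    7: ({"sum"}, {"sum", "i"}),
    8: ({"prod"}, {"prod", "i"}),
    9: ({"i"}, {"i"}),
    11: (set(), {"sum"}),
}

# statement -> the loop test it is control dependent on
CONTROL_PARENT = {7: 5, 8: 5, 9: 5}


def dynamic_slice(trace, slice_vars):
    """Walk the call trace backward; return (sorted slice, final VarStore)."""
    var_store = {v: REF for v in slice_vars}   # first items: (sliceVar, Ref)
    loop_cond = set()
    result = set()
    for stmt in reversed(trace):
        defs, refs = STATEMENTS[stmt]
        wanted = [v for v in defs if var_store.get(v) == REF]
        if wanted or stmt in loop_cond:
            loop_cond.discard(stmt)
            for v in wanted:
                var_store[v] = DEF             # Ref -> Def
            for v in refs:
                var_store[v] = REF             # referenced variables
            result.add(stmt)
            if stmt in CONTROL_PARENT:
                loop_cond.add(CONTROL_PARENT[stmt])
    return sorted(result), var_store


if __name__ == "__main__":
    trace = [1, 2, 3, 4, 5, 7, 8, 9, 5, 7, 8, 9, 5, 11]   # input n = 2
    print(dynamic_slice(trace, {"sum"}))
```

Statement 7 both defines and references `sum`, so the variable flips to Def and straight back to Ref. That matches the table, where `sum` only becomes Def at statement 3.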

Extend to inter-procedural
• Starts in the same way as the intra-procedural case
• What happens when the last line of a function is reached (going backward)?
  – New VarStore, new LoopCond
  – They have to be maintained until the start of the function
• Context
• Indexing data structure

Extend to inter-procedural
• At the last line of a function:
  – Identify the calling line
  – Identify all output parameters
  – Select those that have a Ref Action in VarStore
  – If there are none → disregard the function
  – New Context + recursive call of DynSlice
• When the calling line is reached:
  – Identify the input parameters used
  – Update the current Context (VarStore and LoopCond)

Creation of Indexing Data Structure
• Used to identify the calling line
• Going through the call trace at the last line of every call would be slow
• A unique ID is given to every unique function call (run) in the call trace
• Entries with the same ID do not have to be contiguous (1,1,1,1,1,2,2,3,3,2,4,4,2,2,5,… )
• While building the structure, store these runs (using a Hashtable)
• At every start, store the previous end
  – Looking up the calling line of a function is then a single operation
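One possible shape for such an index is sketched below in Python. It assumes that each trace entry carries the ID of the function call it belongs to, and that the entry just before the first run of a call is its calling line. Both are interpretations of the bullets above, and the names are hypothetical. Plain dictionaries stand in for the Hashtable.

```python
def build_index(call_ids):
    """Scan the per-step call IDs once.

    Returns (runs, calling_pos):
      runs[cid]        -> list of (start, end) trace positions of that call
      calling_pos[cid] -> position just before the call's first run
                          (-1 for the entry point)
    """
    runs = {}
    calling_pos = {}
    start = 0
    for pos in range(1, len(call_ids) + 1):
        # A run ends at the end of the trace or where the ID changes.
        if pos == len(call_ids) or call_ids[pos] != call_ids[start]:
            cid = call_ids[start]
            runs.setdefault(cid, []).append((start, pos - 1))
            calling_pos.setdefault(cid, start - 1)   # previous end, kept once
            start = pos
    return runs, calling_pos


if __name__ == "__main__":
    ids = [1, 1, 1, 1, 1, 2, 2, 3, 3, 2, 4, 4, 2, 2, 5]
    runs, calling_pos = build_index(ids)
    print(runs[2], calling_pos[3])
```

After the single pass, finding the calling line of a call is one dictionary lookup, however long the trace is.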

Summary
• Language features studied:
  – Value types
  – Basic program constructs
  – Static method calls
• Language features to be studied:
  – Reference types
  – Non-static method calls
  – Delegates, properties, foreach, lock, using
  – Generics, anonymous methods, yield* keyword (in C#)

Q&A
Krisztián Pócza [email protected]
Mihály Biczó [email protected]
Zoltán Porkoláb [email protected]
Thank you for your attention!