
Source code comprehension on evolving software

Date post: 27-May-2015
Upload: sung-kim
Description:
Yida's PQE
Transcript
Page 1: Source code comprehension on evolving software

Source Code Comprehension on Evolving Software: A Literature Survey

Yida Tao

Supervisor: Sunghun Kim

Page 2: Source code comprehension on evolving software

Motivation

Code Change Comprehension

Tao et al., FSE’12: code change comprehension is
• Frequently required
• In major development activities, in particular the code-review process

Bacchelli & Bird, ICSE’13:
• “…review and understand code they have not seen before may be more common than a developer working on new code”
• “From interviews, no other code review challenge emerged as clearly as understanding the submitted change”

References:
• How do software engineers understand code changes? An exploratory study in industry. Tao et al., FSE’12
• Expectations, outcomes, and challenges of modern code review. Bacchelli and Bird, ICSE’13

Page 3: Source code comprehension on evolving software

Outline

Code Change Comprehension
• Program Differencing: describing code changes
• Code Change Summarization: explaining code changes
• Querying and Filtering: customization

Page 4: Source code comprehension on evolving software

Program Differencing

• Text Differencing
• Syntactic Differencing
• Semantic Differencing

Page 5: Source code comprehension on evolving software

Text Differencing

• Flat representation of a program
  • Sequence of strings
• Unix diff
  • Only outputs added/deleted lines; cannot detect modified lines
  • Hard to determine when a code fragment is moved upward or downward
• Ldiff (Canfora et al., ICSE’09)
  • An enhanced line differencing tool
• Limitations
  • Changes to *characters*
  • No syntactic-structure information
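The two Unix diff limitations listed on this slide can be reproduced with a small sketch. It uses Python's `difflib` as a stand-in for Unix diff (an assumption; the slide does not name an implementation), and the helper name `line_diff` and the sample lines are hypothetical.

```python
import difflib


def line_diff(old, new):
    """Return only the removed ("- ") and added ("+ ") lines of a line-based diff."""
    return [line for line in difflib.ndiff(old, new)
            if line.startswith(("- ", "+ "))]


old = ["int total = 0;", "log(total);", "return total;"]

# One character changed: reported as one deleted line plus one added line.
modified = ["int total = 1;", "log(total);", "return total;"]

# One line moved upward: also reported as a deletion plus an addition.
moved = ["log(total);", "int total = 0;", "return total;"]

for label, new in (("modified", modified), ("moved", moved)):
    print(label, line_diff(old, new))
```

In both cases the output contains only "-" and "+" lines, so the reader has to work out that the first is a modification and the second is a move.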

Page 6: Source code comprehension on evolving software

Syntactic Differencing

• Structured representation of a program
  • Abstract syntax tree; XML
• ChangeDistiller (Fluri et al., TSE’07)
  • Tree differencing
  • Node: bigram string similarity
  • Control structure: subtree similarity
  • Output: tree edit script (insert, delete, move, update)
• XML differencing
  • srcML (Maletic & Collard, ICSM’04): embeds abstract syntax and structure within the source code
  • diffX (Al-Ekram et al., CASCON ’05)
• Limitation
  • Cannot describe how the behavior of a program is changed
  • Still reports differences for behavior-preserving changes
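The bigram string similarity used to match nodes can be illustrated as follows. This is a minimal sketch that assumes the Dice coefficient over character bigrams; the function names are hypothetical, and ChangeDistiller's actual matching thresholds are not shown.

```python
from collections import Counter


def bigrams(text):
    """All overlapping two-character substrings of text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def bigram_similarity(a, b):
    """Dice coefficient over character bigrams, in the range 0.0 to 1.0."""
    left, right = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(left.values()) + sum(right.values())
    if total == 0:
        # Strings shorter than two characters have no bigrams.
        return 1.0 if a == b else 0.0
    shared = sum((left & right).values())  # multiset intersection
    return 2 * shared / total


print(bigram_similarity("int count = 0;", "int counter = 0;"))
print(bigram_similarity("int count = 0;", "return null;"))
```

Two statements whose similarity exceeds a chosen threshold would be treated as the same node with an update, rather than as a delete plus an insert.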

Page 7: Source code comprehension on evolving software

Semantic Differencing

• Semantic diff (Jackson and Ladd, ICSM’94)
  • Method-level
  • Variable dependencies comparison
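The idea of comparing variable dependencies can be sketched on a toy scale. The code below is an illustration only, not Jackson and Ladd's tool: it assumes straight-line Python code made of simple `name = expression` assignments, and the function names and the `outputs` parameter are hypothetical.

```python
import ast


def dependencies(source):
    """Map each assigned variable to the set of input names it depends on."""
    deps = {}
    for stmt in ast.parse(source).body:
        simple = (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                  and isinstance(stmt.targets[0], ast.Name))
        if not simple:
            raise ValueError("this sketch supports simple assignments only")
        used = {n.id for n in ast.walk(stmt.value) if isinstance(n, ast.Name)}
        resolved = set()
        for name in used:
            # Replace intermediate variables by the inputs they depend on.
            resolved |= deps.get(name, frozenset({name}))
        deps[stmt.targets[0].id] = frozenset(resolved)
    return deps


def semantic_diff(old_source, new_source, outputs):
    """Report output variables whose dependency sets differ between versions."""
    old_deps, new_deps = dependencies(old_source), dependencies(new_source)
    return {v: (old_deps.get(v), new_deps.get(v))
            for v in outputs if old_deps.get(v) != new_deps.get(v)}


old = "t = a + b\nr = t * 2\n"
renamed = "tmp = a + b\nr = tmp * 2\n"   # textual change, same dependencies
changed = "t = a + c\nr = t * 2\n"       # r now depends on c instead of b

print(semantic_diff(old, renamed, ["r"]))
print(semantic_diff(old, changed, ["r"]))
```

A text or syntax diff reports the rename of the temporary variable, while the dependency comparison reports only the version in which the output `r` depends on different inputs.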

Page 8: Source code comprehension on evolving software

Semantic Differencing (cont.)

• JDiff (Apiwattanapong et al., ASE’04, 06)
  • Extended control-flow graph (ECFG)
  • Dynamic binding, class hierarchy, exception handling, etc.

Page 9: Source code comprehension on evolving software

Semantic Differencing (cont.)

• Differential symbolic execution (Person et al., FSE’08)
  • “Executing” a program using symbolic values
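"Executing" a program on symbolic values can be illustrated with a deliberately small sketch. It assumes straight-line code over linear integer arithmetic; the `Linear` class and `same_effect` helper are hypothetical. Differential symbolic execution as published also handles branches and path conditions with the help of decision procedures, which this sketch does not attempt.

```python
class Linear:
    """Symbolic value of the form const + c1*x1 + c2*x2 + ... (linear only)."""

    def __init__(self, coeffs=None, const=0):
        self.coeffs = {k: v for k, v in (coeffs or {}).items() if v != 0}
        self.const = const

    @staticmethod
    def var(name):
        return Linear({name: 1})

    @staticmethod
    def lift(value):
        return value if isinstance(value, Linear) else Linear(const=value)

    def __add__(self, other):
        other = Linear.lift(other)
        names = self.coeffs.keys() | other.coeffs.keys()
        merged = {n: self.coeffs.get(n, 0) + other.coeffs.get(n, 0) for n in names}
        return Linear(merged, self.const + other.const)

    __radd__ = __add__

    def __mul__(self, factor):
        if isinstance(factor, Linear):
            raise TypeError("this sketch supports multiplication by constants only")
        return Linear({n: c * factor for n, c in self.coeffs.items()},
                      self.const * factor)

    __rmul__ = __mul__

    def __eq__(self, other):
        other = Linear.lift(other)
        return self.coeffs == other.coeffs and self.const == other.const


def same_effect(old_version, new_version, *parameter_names):
    """Run both versions on symbolic arguments and compare the symbolic results."""
    args = [Linear.var(name) for name in parameter_names]
    return old_version(*args) == new_version(*args)


def double_old(x):
    return x * 2


def double_refactored(x):      # different text, same behavior
    return x + x


def double_changed(x):         # behavior actually changes
    return x * 2 + 1


print(same_effect(double_old, double_refactored, "x"))
print(same_effect(double_old, double_changed, "x"))
```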

Page 10: Source code comprehension on evolving software

Outline

Code Change Comprehension
• Program Differencing
  • Text Differencing
  • Syntactic differencing
  • Semantic differencing
• Code Change Summarization: explaining code changes
• Querying and Filtering: customization

Page 11: Source code comprehension on evolving software

Code Change Summarization

• LSdiff (Kim and Notkin, ICSE’09)
  • Group related changes
  • Detect potential inconsistencies in a code change
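Grouping related changes under a rule, and flagging the entities that break the rule as potential inconsistencies, can be sketched as below. This is a simplified illustration, not LSdiff's inference algorithm: the fact representation, the `infer_rules` name, and the `min_support` threshold are all assumptions.

```python
def infer_rules(has_property, changes, min_support=0.75):
    """Find rules "entities with property P received change C", with exceptions.

    has_property: dict mapping a property to the set of entities that have it.
    changes:      dict mapping a change kind to the set of entities it touched.
    Returns tuples (property, change, matching entities, exceptions).
    """
    rules = []
    for prop, holders in sorted(has_property.items()):
        for change, changed in sorted(changes.items()):
            matched = holders & changed
            if holders and len(matched) / len(holders) >= min_support:
                exceptions = holders - changed   # candidates for a missed update
                rules.append((prop, change, sorted(matched), sorted(exceptions)))
    return rules


facts = {"declares toString()": {"Axis", "Chart", "Legend", "Plot"}}
edits = {"added constructor": {"Axis", "Chart", "Legend", "Title"}}

for prop, change, matched, exceptions in infer_rules(facts, edits):
    print(f"All types with '{prop}' got '{change}': {matched}, except {exceptions}")
```

The exception list is where a reviewer would look first: it names the entity that the systematic change may have skipped.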

Page 12: Source code comprehension on evolving software

Code Change Summarization (cont.)

• DeltaDoc (Buse and Weimer, ASE’10)
  • Symbolic execution: obtain path predicates for each statement in both versions
  • Identify statements that are added, deleted, or have a changed predicate
  • Summarization
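The second step on this slide, comparing statements and their path predicates across versions, can be sketched as follows. The sketch assumes the predicates are already available as plain strings (DeltaDoc obtains them through symbolic execution); the function name and the output wording are hypothetical and do not reproduce DeltaDoc's summarization.

```python
def describe_change(old_predicates, new_predicates):
    """Each argument maps a statement to the path predicate that guards it."""
    summary = []
    for statement in sorted(old_predicates.keys() | new_predicates.keys()):
        before = old_predicates.get(statement)
        after = new_predicates.get(statement)
        if before is None:                      # statement added
            summary.append(f"If {after}, do {statement}")
        elif after is None:                     # statement deleted
            summary.append(f"No longer do {statement} (was: if {before})")
        elif before != after:                   # predicate changed
            summary.append(f"Do {statement} if {after}, instead of if {before}")
    return summary


old_version = {"return cached": "key in cache", "log(key)": "debug"}
new_version = {"return cached": "key in cache and fresh(key)",
               "reload(key)": "not fresh(key)"}

for line in describe_change(old_version, new_version):
    print(line)
```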

Page 13: Source code comprehension on evolving software

Code Change Summarization (cont.)

• Multi-document summarization (Rastkar and Murphy, ICSE’13)
  • Linking evolutionary documents (commit log, issue tracking entries)
  • Finding the most informative sentences to extract to form a summary
    • Similarity between a sentence and the title of the enclosing document
    • Overlap between a sentence and the adjacent document
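The two sentence features named on this slide can be turned into a toy ranking. This is a minimal sketch under stated assumptions: word-set overlap stands in for similarity, the two features are simply added, and all names are hypothetical. How the cited work combines its features is not described on the slide.

```python
import re


def words(text):
    """Lower-cased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0


def rank_sentences(title, sentences, adjacent_document):
    """Order sentences by title similarity plus overlap with the linked document."""
    title_words = words(title)
    adjacent_words = words(adjacent_document)
    scored = []
    for sentence in sentences:
        tokens = words(sentence)
        title_similarity = jaccard(tokens, title_words)
        overlap = len(tokens & adjacent_words) / len(tokens) if tokens else 0.0
        scored.append((title_similarity + overlap, sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored]


issue_title = "Fix null pointer in parser"
issue_sentences = [
    "Thanks for the report.",
    "The parser dereferences a null pointer on empty input.",
]
commit_log = "Guard parser against null pointer when input is empty"

print(rank_sentences(issue_title, issue_sentences, commit_log)[0])
```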

Page 14: Source code comprehension on evolving software

Code Change Summarization (cont.)

Challenges
• Evolutionary documents
  • Linkage might not be found (Bachmann et al., FSE’10; Wu et al., FSE’11)
  • Human-written documents may be unavailable or uninformative (Buse and Weimer, ASE’10; Tao et al., FSE’12)
• Automatically generated documents
  • Verbosity
  • Uninteresting changes are identified, e.g., “all types that declared toString() added constructors” (Kim and Notkin, ICSE’09)

[Figure panels: LSdiff, DeltaDoc]

Page 15: Source code comprehension on evolving software

Outline

Code Change Comprehension
• Program Differencing
  • Text Differencing
  • Syntactic differencing
  • Semantic differencing
• Code Change Summarization
  • Rules and exceptions
  • Control-flow changes
  • Evolutionary documentation
• Querying and Filtering: customization

Page 16: Source code comprehension on evolving software

Querying and Filtering

• Specifying and detecting meaningful changes (Yu et al., ASE’11)
  • Normalize the program (user-specified) before differencing
  • Non-trivial to construct the query
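Normalizing a program with user-specified rules before differencing can be sketched as below. The rule functions, the `meaningful_diff` name, and the use of Python's `difflib` are assumptions for illustration; they are not the specification language or tooling of the cited work, and the comment-stripping rule is deliberately naive.

```python
import difflib
import re


def strip_comments(line):
    # Naive rule: also removes a "#" that appears inside a string literal.
    return re.sub(r"\s*#.*$", "", line)


def collapse_whitespace(line):
    return " ".join(line.split())


def normalize(source, rules):
    """Apply every user-chosen rule to every line and drop empty results."""
    normalized = []
    for line in source.splitlines():
        for rule in rules:
            line = rule(line)
        if line:
            normalized.append(line)
    return normalized


def meaningful_diff(old_source, new_source, rules):
    """Diff the normalized programs, keeping only removed and added lines."""
    delta = difflib.ndiff(normalize(old_source, rules), normalize(new_source, rules))
    return [line for line in delta if line.startswith(("- ", "+ "))]


old_source = "x = 1  # counter\ny   =  2\n"
new_source = "x = 1\ny = 3\n"

print(meaningful_diff(old_source, new_source, [strip_comments, collapse_whitespace]))
```

The choice of rules is the "query": with both rules enabled only the change to `y` is reported, and with an empty rule list the comment and spacing edits show up as well.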

Page 17: Source code comprehension on evolving software

Querying and Filtering (cont.)

• Filtering non-essential changes (Kawrykow and Robillard, ICSE’11)
  • Non-essential changes: rename-induced modifications, local variable extraction, trivial keyword modification, whitespace and documentation updates
  • ChangeDistiller (Fluri et al., TSE’07) + Partial program analysis (Dagenais and Robillard, ICSE’08)
  • Goal: improving mining and recommendation accuracy instead of developers’ comprehension
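One of the non-essential categories above, rename-induced modifications, can be approximated at the text level. This sketch assumes the rename map is already known and works on line pairs; the cited approach works on ChangeDistiller's tree changes with partial program analysis, which is not reproduced here. All names are hypothetical.

```python
import re


def is_rename_induced(old_line, new_line, renames):
    """True if applying the known renames to old_line yields exactly new_line."""
    rewritten = old_line
    for old_name, new_name in renames.items():
        pattern = rf"\b{re.escape(old_name)}\b"
        rewritten = re.sub(pattern, lambda _match, name=new_name: name, rewritten)
    return rewritten != old_line and rewritten == new_line


def essential_changes(changed_pairs, renames):
    """Keep only the (old, new) line pairs that a rename alone cannot explain."""
    return [(old, new) for old, new in changed_pairs
            if not is_rename_induced(old, new, renames)]


pairs = [
    ("total = count + 1", "total = size + 1"),   # explained by the rename
    ("return count", "return size * 2"),         # rename plus a real change
]

print(essential_changes(pairs, {"count": "size"}))
```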

Page 18: Source code comprehension on evolving software

Outline

Code Change Comprehension
• Program Differencing
  • Text Differencing
  • Syntactic differencing
  • Semantic differencing
• Code Change Summarization
  • Rules and exceptions
  • Control-flow changes
  • Evolutionary documentation
• Querying and Filtering
  • Meaningful changes
  • Non-essential changes

Page 19: Source code comprehension on evolving software

Research Directions

Source Code Changes
• Program Differencing
  • Text Differencing
  • Syntactic differencing
  • Semantic differencing
• Code Change Summarization
  • Rules and exceptions
  • Control-flow changes
  • Evolutionary documentation
• Querying and Filtering
  • Meaningful changes
  • Non-essential changes
• Work-item-based changes?

Page 20: Source code comprehension on evolving software

Work-item-based Changes

• Multiple work-items in a single code change (e.g., a bug fix + code cleanup + a new feature)
• Very difficult to understand (Tao et al., FSE’12)

[Figure: JFreeChart revision 1083, annotated with: trivial keyword removal, bug fix, formatting]

Page 21: Source code comprehension on evolving software

Work-item-based Change Detection

• Multiple work-items in a single code change (e.g., a bug fix + code cleanup + a new feature)
  • Very difficult to understand (Tao et al., FSE’12)
  • Change decomposition
    • Program slicing (entity dependencies)
    • Pattern matching (similarities)
• A single work-item spreads across multiple code changes (e.g., 5 changes to finally fix a bug completely)
  • Change aggregation
    • Linkage to the same issue
    • Heuristics like time duration, commit authors, program dependencies, etc.
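Change decomposition by entity dependencies can be sketched as grouping the changed entities into connected components of a dependency graph. This is one possible reading of the proposal, not an implementation from the talk: real slicing would derive the dependency edges from the program, whereas here they are given as input, and all names are hypothetical.

```python
def decompose(changed_entities, dependencies):
    """Group changed entities that are linked by dependencies.

    changed_entities: list of entity names touched by one code change.
    dependencies:     iterable of (a, b) pairs meaning a and b depend on each other.
    """
    neighbours = {entity: set() for entity in changed_entities}
    for a, b in dependencies:
        if a in neighbours and b in neighbours:   # ignore unchanged entities
            neighbours[a].add(b)
            neighbours[b].add(a)

    groups, seen = [], set()
    for start in changed_entities:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:                               # depth-first traversal
            node = stack.pop()
            group.append(node)
            for nxt in sorted(neighbours[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


changed = ["Parser.parse", "Parser.check", "Logger.format", "Chart.draw"]
edges = [("Parser.parse", "Parser.check"), ("Chart.draw", "Axis.scale")]

for work_item in decompose(changed, edges):
    print(work_item)
```

Each resulting group is a candidate work-item that could be reviewed on its own.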

Page 22: Source code comprehension on evolving software

Research Directions

Code Change Comprehension
• Program Differencing
  • Text Differencing
  • Syntax differencing
  • Semantic differencing
• Code Change Summarization
  • Rules and exceptions
  • Control-flow changes
  • Evolutionary documentation
• Querying and Filtering
  • Meaningful changes
  • Non-essential changes
• Work-item change detection
  • Change decomposition
  • Change aggregation

Page 23: Source code comprehension on evolving software

Research Directions

Code Change Comprehension
• Program Differencing
  • Text Differencing
  • Syntax differencing
  • Semantic differencing
• Code Change Summarization
  • Rules and exceptions
  • Control-flow changes
  • Evolutionary documentation
• Querying and Filtering
  • Meaningful changes
  • Non-essential changes
  • Work-item-specific changes
• Work-item change detection
  • Change decomposition
  • Change aggregation

Page 24: Source code comprehension on evolving software

Research Directions

Code Change Comprehension
• Program Differencing
  • Text Differencing
  • Syntax differencing
  • Semantic differencing
• Code Change Summarization
  • Rules and exceptions
  • Control-flow changes
  • Evolutionary documentation
• Querying and Filtering
  • Meaningful changes
  • Non-essential changes
  • Work-item-specific changes
• Concrete Execution
• Work-item change detection
  • Change decomposition
  • Change aggregation

Page 25: Source code comprehension on evolving software

Explaining code changes with executions of co-changed test cases

• Test cases
  • Best documentation for source code
• Test cases co-changed with source code
  • Documentation for code changes?
  • Mostly synchronous co-evolution of production and test code (Zaidman et al., Empirical Software Engineering’11)
• Differential test executions
  • Co-changed test cases T
  • Executing T on the old version P and the new version P’
  • Comparing executions to explain changed behaviors

From StackExchange (http://programmers.stackexchange.com/questions/154439/quality-of-code-in-unit-tests?newsletter=1&nlcode=67628%7c1a35):
• “Unit tests are one of the best sources of documentation for your system, and arguably the most reliable form”
• “Unit tests are often the first thing you look at when trying to grasp what some piece of code does”
• “They can also serve as a starting point for people new to the code base”
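The differential test execution idea (run the co-changed tests T on the old version P and the new version P’, then compare) can be sketched as follows. This is a minimal illustration that compares only return values and exception types; the proposal on the slide does not specify what is recorded during execution, and all names are hypothetical.

```python
def run(test, program):
    """Execute one test against one program version and record the outcome."""
    try:
        return ("returned", test(program))
    except Exception as error:
        return ("raised", type(error).__name__)


def differential_execution(tests, old_program, new_program):
    """Report the tests whose observable outcome differs between P and P'."""
    report = {}
    for name, test in tests.items():
        before = run(test, old_program)
        after = run(test, new_program)
        if before != after:
            report[name] = (before, after)
    return report


def divide_old(a, b):                 # old version P
    return a / b


def divide_new(a, b):                 # new version P'
    return 0 if b == 0 else a / b


co_changed_tests = {
    "divide_by_two": lambda divide: divide(6, 2),
    "divide_by_zero": lambda divide: divide(1, 0),
}

for name, (before, after) in differential_execution(
        co_changed_tests, divide_old, divide_new).items():
    print(f"{name}: {before} -> {after}")
```

Only the test that exercises the changed behavior appears in the report, which is what would make it usable as an explanation of the change.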

Page 26: Source code comprehension on evolving software

Research Directions

Code Change Comprehension
• Program Differencing
  • Text Differencing
  • Syntax differencing
  • Semantic differencing
• Code Change Summarization
  • Rules and exceptions
  • Control-flow changes
  • Evolutionary documentation
• Querying and Filtering
  • Meaningful changes
  • Non-essential changes
  • Work-item-specific changes
• Concrete Execution
  • Co-changed test cases
  • Differential test execution
• Work-item change detection
  • Change decomposition
  • Change aggregation

