PhD Proposal


Techniques for Detecting and Preventing Copy-and-Paste Errors during Software Development

A Dissertation Proposal
By Patricia Jablonski

Engineering Science
Clarkson University

September 5, 2007

Outline

Copying and pasting code
Modifying copy-and-pasted code
Our proposed solution (CnP)
Our proof of concept (CReN)
Demo of CReN
Related Eclipse features
Evaluation plan
Proposed plan

Copying and Pasting Code

A common form of software reuse
  Reuse copied code as a template
Why copy and paste code?
  Duplicate code exactly
  Defer creating an abstraction
  Experiment and test
Results in code clones
  Multiple similar code fragments
What happens when code needs modification?

Modifying Copy-and-Pasted Code (1 of 2)

Expensive software maintenance
  Original copied code could be erroneous
  Changes need to be made to each instance
Solutions: clone detection and removal, clone tracking tools
  Linked editing and simultaneous editing
    Clones are selected and linked together so that a modification to one clone can be applied simultaneously to all of the clones linked to it

Modifying Copy-and-Pasted Code (2 of 2)

Manual modifications can result in undetected errors and unintended inconsistencies
Solution: error detection tools
  CP-Miner tool
    Uses identifier mapping, “forget-to-change” vs. “change”, and unchanged ratio
  DECKARD-based tool
    Uses a count of unique identifiers
What about proactive error prevention?
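
As a rough illustration of the unchanged ratio mentioned for CP-Miner, the sketch below computes, for one identifier, the fraction of its occurrences that were left unchanged after pasting. The class name, method names, and the threshold value are assumptions made for this example, and the two fragments are assumed to be already aligned token by token; this is not CP-Miner's actual code.

```java
import java.util.List;

// Hypothetical helper for illustration only; not taken from CP-Miner.
class UnchangedRatio {

    // Fraction of the occurrences of `name` in the original fragment whose
    // counterpart at the same token position in the pasted fragment is
    // still `name`. Assumes both lists have the same length.
    static double of(String name, List<String> original, List<String> pasted) {
        int total = 0;
        int unchanged = 0;
        for (int i = 0; i < original.size(); i++) {
            if (original.get(i).equals(name)) {
                total++;
                if (pasted.get(i).equals(name)) {
                    unchanged++;
                }
            }
        }
        return total == 0 ? 0.0 : (double) unchanged / total;
    }

    // A ratio above 0 but at or below the threshold suggests that most
    // occurrences were renamed and a few were forgotten.
    // The threshold is an illustrative parameter chosen by the caller.
    static boolean looksForgotten(double ratio, double threshold) {
        return ratio > 0.0 && ratio <= threshold;
    }
}
```

For the aligned token lists (a, i, a, i, a) and (b, i, b, i, a), the ratio for `a` is 1/3, which is flagged with a threshold of 0.4, while the ratio for `i` is 1.0 (never renamed) and is not flagged.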

Our Proposed Solution (CnP)

Provide automated tool support in the IDE
  Eclipse, Java
Improve software quality during development
What are the main features of the CnP tool?
  Tracks & highlights copy-pasted statements
  Detects inconsistencies based on inferences of the programmer’s intention
  Inconsistencies are based on inferred rules
What is the current status of CnP?

Our Proof of Concept (CReN)
Design and Implementation (1 of 5)

Consistent renaming usage pattern
  Identifier (for example, variable name) renaming within a copy-and-paste clone
  Manual renaming can result in inconsistencies
What are the main features of the CReN tool?
  Tracks & highlights copy-pasted statements
  Automatically renames all instances of an identifier in a group when any one instance in the group is modified
  The inferred rules can be refined by the programmer

Our Proof of Concept (CReN)
Design and Implementation (2 of 5)

Tracking copy-and-paste clones
  No clone detection tool or manual selection
  Clone region: Java file name + clone’s range
Obtaining ASTs from clone locations
  Abstract syntax tree (AST) API in Eclipse
  AST captures the source code characters & their absolute position in the source code
  Each ASTNode has starting/ending positions denoting character positions within the node
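
A minimal sketch of how a clone region (file name plus character range) might be represented and kept up to date while the file is edited. The class, its fields, and the edit-handling policy are assumptions for illustration, not CReN's implementation. In Eclipse JDT, the node values passed to `contains` would come from `ASTNode.getStartPosition()` and `ASTNode.getLength()`; the sketch itself does not depend on JDT.

```java
// Hypothetical data structure; names and policy are illustrative.
class CloneRegion {
    final String fileName; // Java file that contains the clone
    final int offset;      // absolute character offset of the clone's first character
    final int length;      // number of characters in the clone

    CloneRegion(String fileName, int offset, int length) {
        this.fileName = fileName;
        this.offset = offset;
        this.length = length;
    }

    // True if a node spanning [nodeStart, nodeStart + nodeLength)
    // lies entirely inside this clone region.
    boolean contains(int nodeStart, int nodeLength) {
        return nodeStart >= offset && nodeStart + nodeLength <= offset + length;
    }

    // Region after `delta` characters are inserted at `editOffset`
    // (or removed, if delta is negative). Simplification: edits that
    // straddle a region boundary are not handled.
    CloneRegion afterEdit(int editOffset, int delta) {
        if (editOffset < offset) {
            return new CloneRegion(fileName, offset + delta, length); // region shifts
        }
        if (editOffset <= offset + length) {
            return new CloneRegion(fileName, offset, length + delta); // region grows or shrinks
        }
        return this; // edit after the region: nothing changes
    }
}
```

With this policy, typing three characters before a region at offset 100 moves it to offset 103, while typing inside it only changes its length.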

Our Proof of Concept (CReN)
Design and Implementation (3 of 5)

Matching identifiers between clones
  Determine relationships of identifiers between copy-and-pasted code fragments
  Identifiers in the copied code are matched with their corresponding identifiers in the pasted code
  When the code has just been pasted, its contents are identical to the copied fragment, only at a different location
  Rules are inferred across all clones
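
Because a freshly pasted fragment is textually identical to the copied one, corresponding identifiers sit at the same offset relative to the start of each fragment. The sketch below pairs them on that basis. It is an assumption-laden simplification: it finds identifier-like tokens with a regular expression (so keywords and words inside strings or comments would also match), whereas a real tool would walk the AST. All names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not CReN's implementation.
class IdentifierMatcher {
    // Rough approximation of a Java identifier.
    private static final Pattern IDENT = Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    // Maps the absolute offset of each identifier-like token in the copied
    // fragment to the absolute offset of its counterpart in the pasted
    // fragment. `fragment` is the shared text; the two start offsets say
    // where the copy and the paste begin in their files.
    static Map<Integer, Integer> match(String fragment, int copiedStart, int pastedStart) {
        Map<Integer, Integer> pairs = new LinkedHashMap<>();
        Matcher m = IDENT.matcher(fragment);
        while (m.find()) {
            pairs.put(copiedStart + m.start(), pastedStart + m.start());
        }
        return pairs;
    }
}
```

For the fragment `sum += a[i];` copied from offset 40 and pasted at offset 200, the identifiers `sum`, `a`, and `i` are paired as 40 with 200, 47 with 207, and 49 with 209.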

Our Proof of Concept (CReN)
Design and Implementation (4 of 5)

Partitioning identifiers into groups
  Determine relationships of identifiers within copy-and-pasted code fragments
  Identifiers in the copied and pasted code are partitioned into groups and mapped to each other
  Defines the group of identifiers that are to be renamed together
  Want group of identifiers that resolve to the same variable – use binding, if available
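
One way to picture the partitioning step: collect the occurrences of each identifier in a fragment into a group. The sketch below groups by name only, which is an assumption standing in for binding information; with Eclipse JDT, calling `resolveBinding()` on name nodes would separate same-named identifiers that refer to different variables. Class and method names are hypothetical, and the regular expression also matches keywords and text inside strings or comments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; groups by name, not by binding.
class IdentifierGroups {
    private static final Pattern IDENT = Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    // Returns, for each identifier name, the start offsets of its
    // occurrences within the fragment, in order of first appearance.
    static Map<String, List<Integer>> partition(String fragment) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        Matcher m = IDENT.matcher(fragment);
        while (m.find()) {
            groups.computeIfAbsent(m.group(), k -> new ArrayList<>()).add(m.start());
        }
        return groups;
    }
}
```

For `sum = sum + a[i] * b[i];` this yields four groups: `sum` at offsets 0 and 6, `a` at 12, `i` at 14 and 21, and `b` at 19. Each group is a candidate set of occurrences to rename together.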

Our Proof of Concept (CReN)
Design and Implementation (5 of 5)

Refining the inferred rules
  When the code is initially pasted, the inferred rule assumes that all identifiers that would resolve to the same program entity should be renamed consistently
  Programmer can choose to exclude the currently renamed identifier from the group (this instance is deleted from the vector)
  The updated rule is inferred across all clones
Let’s see if CReN can detect/prevent errors...
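
The rule refinement described on this slide can be sketched as a group of occurrences from which the programmer may remove one instance; renaming then applies only to the instances still in the group. The class below is a hypothetical, single-shot illustration (it does not update the stored offsets after a rename) and is not CReN's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a rename group with programmer-controlled exclusion.
class RenameGroup {
    private final String name;           // current identifier name
    private final List<Integer> offsets; // start offsets of occurrences still in the group

    RenameGroup(String name, List<Integer> offsets) {
        this.name = name;
        this.offsets = new ArrayList<>(offsets);
    }

    // The programmer opts one occurrence out of consistent renaming.
    void exclude(int offset) {
        offsets.remove(Integer.valueOf(offset));
    }

    // Returns a copy of `text` in which every occurrence still in the
    // group is replaced by `newName`. Replacing from right to left keeps
    // the earlier offsets valid when the new name has a different length.
    String renameIn(String text, String newName) {
        List<Integer> sorted = new ArrayList<>(offsets);
        Collections.sort(sorted, Collections.reverseOrder());
        StringBuilder sb = new StringBuilder(text);
        for (int start : sorted) {
            sb.replace(start, start + name.length(), newName);
        }
        return sb.toString();
    }
}
```

With the text `sum = sum + a[i] * b[i];` and a group for `i` at offsets 14 and 21, renaming to `j` gives `sum = sum + a[j] * b[j];`. After `exclude(21)`, the same rename gives `sum = sum + a[j] * b[i];`.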

Our Proof of Concept (CReN)
Usage and Demonstration

Three examples from literature show an inconsistent renaming of identifiers within a copy-and-pasted clone in production code

Z. Li, S. Lu, S. Myagmar, and Y. Zhou, “CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code”, USENIX-ACM SIGOPS Symposium on Operating Systems Design and Implementation (OSDI), 2004.

B. Liblit, A. Aiken, A.X. Zheng, and M.I. Jordan, “Bug Isolation via Remote Program Sampling”, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2003.

L. Jiang, Z. Su, and E. Chiu, “Context-Based Detection of Clone-Related Bugs”, European Software Engineering Conference (ESEC) and ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), 2007.

Demo of CReN

Demonstrate how CReN would catch each identifier renaming error in the examples as if they were currently being written
(Some) CReN future work
  Consistent renaming of any kind of identifier
  Allow “undo” of taking identifier out of group
  Consistent renaming in a user-defined scope
  Apply renaming across all related clones
How are other Eclipse features related to CReN?

Related Eclipse Features

Find & Replace
  Text-based search, manually started
  Not limited to within a code fragment
Rename Refactoring
  Automatically applies to the whole project
  Binding is important for it to work
Linked Renaming
  Like Rename Refactoring, but applies to file
What are our next steps in our research?
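
To illustrate why a purely text-based replacement differs from renaming confined to a code fragment, the sketch below contrasts a plain substring replace over the whole text with a whole-word replace limited to a character range. It does not model Eclipse's Find & Replace dialog or its refactorings; the class and method names are hypothetical, and the word-boundary test is only an approximation of Java identifier boundaries.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison for illustration only.
class ScopedRename {

    // Plain substring replacement over the whole text: it also hits
    // partial matches and code outside the fragment of interest.
    static String replaceEverywhere(String text, String oldName, String newName) {
        return text.replace(oldName, newName);
    }

    // Whole-word replacement limited to the range [start, end).
    static String replaceInRange(String text, int start, int end,
                                 String oldName, String newName) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(oldName) + "\\b");
        String renamed = p.matcher(text.substring(start, end))
                          .replaceAll(Matcher.quoteReplacement(newName));
        return text.substring(0, start) + renamed + text.substring(end);
    }
}
```

On `x = i; if (i > 0) y = i;`, replacing `i` with `k` everywhere produces `x = k; kf (k > 0) y = k;`, which corrupts the keyword `if`. Limiting the rename to the first statement (range 0 to 6) produces `x = k; if (i > 0) y = i;`.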

Evaluation Plan

We tested CReN with the three examples
We plan to perform controlled experiments
  Give a homework assignment to students
  Require them to use Eclipse & CnP plug-in
  Have them write a suitable application
We plan to evaluate in terms of:
  Usefulness, usability (user error), user experience, accuracy (false negatives & false positives), performance
What is our plan after CReN is fully evaluated?

Proposed Plan

Determine usage patterns by using clone detection tools
What other kinds of errors could CnP handle?
  Lexical/naming pattern inconsistencies
    Substring is the same on both sides of =
    Naming pairs like left/right, top/bottom
  Type inconsistencies
    Inferences can be made about types at the same positions across clones
Improve the management and visualization of clones
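
One possible reading of the naming-pair idea, sketched as a simple check: in an assignment, if the left side mentions one member of a pair (such as left) and the right side mentions only the other member (right), the statement may be a forgotten edit. The pair list, class name, and heuristic are assumptions for illustration, not a design taken from the proposal, and plain substring matching will produce false positives (for example, "copyright" contains "right").

```java
import java.util.Locale;

// Hypothetical heuristic for illustration only.
class NamingPairCheck {
    private static final String[][] PAIRS = { {"left", "right"}, {"top", "bottom"} };

    // Flags a simple assignment "lhs = rhs;" whose two sides use opposite
    // members of a naming pair. Assumes the first '=' is the assignment
    // operator (no ==, <=, or compound operators).
    static boolean isSuspicious(String assignment) {
        int eq = assignment.indexOf('=');
        if (eq < 0) {
            return false;
        }
        String lhs = assignment.substring(0, eq).toLowerCase(Locale.ROOT);
        String rhs = assignment.substring(eq + 1).toLowerCase(Locale.ROOT);
        for (String[] pair : PAIRS) {
            if (mixes(lhs, rhs, pair[0], pair[1]) || mixes(lhs, rhs, pair[1], pair[0])) {
                return true;
            }
        }
        return false;
    }

    // True if lhs mentions only `a` while rhs mentions only `b`.
    private static boolean mixes(String lhs, String rhs, String a, String b) {
        return lhs.contains(a) && !lhs.contains(b)
            && rhs.contains(b) && !rhs.contains(a);
    }
}
```

Under this heuristic, `r.left = s.left;` passes, while `r.right = s.left;` and `r.top = s.bottom;` are flagged for the programmer to review.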

Conclusion

Copy-and-paste will remain a common programming practice, which can result in undetected errors
Error detection and prevention should happen during software development, not only “after-the-fact”
So far, we have implemented one of three parts of the proposed CnP tool, called CReN
  Automatic tracking of copy-and-paste clones
  Consistent renaming of identifiers within copy-and-paste clones

Questions / Comments

Extra Slides (CReN Demo Screen Shots)
