Detecting and Characterizing Semantic Inconsistencies in Ported Code

Date post: 22-Feb-2016
Page 1: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Detecting and Characterizing Semantic Inconsistencies in Ported Code

Baishakhi Ray*, Miryung Kim*, Suzette Person+, Neha Rungta!

* The University of Texas at Austin
+ NASA Langley Research Center
! NASA Ames Research Center

Page 2: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Motivation

Developers port code from a reference implementation to a target implementation. [Ray et al., Al-Ekram et al., Kim et al.]

Ported changes must be adapted to fit the target context. [Kim et al.]

Faulty adaptation often leads to porting errors. [Chou et al., Juergens et al., Li et al., Jiang et al.]

(Figure: a change ported from the reference to the target.)

Page 3: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Outline

Empirical study of porting errors
Classification scheme for porting errors
SPA: Semantic Porting Analysis
Evaluation
Conclusion

Page 4: Detecting and Characterizing Semantic Inconsistencies in Ported Code

How are porting errors introduced?

Reference: ExportMemoryDialog.java

    if (!containsKey(IMemoryExporter)) setProperty(IMemoryExporter);

Original Target (after porting): ImportMemoryDialog.java

    if (!containsKey(IMemoryExporter)) setProperty(IMemoryExporter);

Fixed Target (after fix): ImportMemoryDialog.java

    if (!containsKey(IMemoryImporter)) setProperty(IMemoryImporter);

Page 5: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Study Methodology

Reference: ExportMemoryDialog.java

    if (!containsKey(IMemoryExporter)) setProperty(IMemoryExporter);

Original Target: ImportMemoryDialog.java

    if (!containsKey(IMemoryExporter)) setProperty(IMemoryExporter);

Fixed Target: ImportMemoryDialog.java
Log: Fix copy&paste error in last commit

    if (!containsKey(IMemoryImporter)) setProperty(IMemoryImporter);

Tools used: Repertoire [Ray et al.] to identify ported edits (reference to original target); git blame to trace later fixes to the ported lines (original target to fixed target).

Page 6: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Empirical Study of Porting Errors

           KLOC     Developers   Years   Total porting errors
FreeBSD    4,479    405          18      113
Linux      14,998   6,839        3       182

Developers frequently introduce porting errors in the codebase.

Page 7: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Outline

Empirical study of porting errors
Classification scheme for porting errors
SPA: Semantic Porting Analysis
Evaluation
Conclusion

Page 8: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Inconsistent Control Flow

Reference:

    for (p ..) {
      for (kg ..) {
        ...
    +   if (ke->ke_cpticks == 0)
    +     continue;
        ....
      }

Target:

    for (p) {
      …
    + if (ke->ke_cpticks == 0)
    +   continue;
      …
    }

Page 9: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Inconsistent Identifier Renamings

Reference:

      ...
    + bp->b_flags |= B_ASYNC;
    + bp->b_flags &= ~B_INVAL;
      ...
    + VOP_STRATEGY(vp, bp);
      …

Target:

      ...
    + rabp->b_flags |= B_ASYNC;
    + rabp->b_flags &= ~B_INVAL;
      ...
    + VOP_STRATEGY(vp, bp);
      …

Page 10: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Inconsistent Renamings of Related Identifiers

Reference:

      ...
    + if (INDEX < lowest_ofdm)
    +   ofdm |= RATE >> OFDM_RATE;
      ...

Target:

      ...
    + if (INDEX < lowest_ofdm)
    +   ofdm |= RATE >> CCK_RATE;
      ...

Page 11: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Inconsistent Data Flow

Reference:

      while ((ch = getopt(argc, argv,...)) != -1)
      …
      switch (ch) {
      ...
    + case 'o':
    +   if (strcmp(optarg, "space") == 0) {
    +     opt = FS_OPTSPACE;
      …

Target:

      parse_uuid(const char *s, uuid_t *uuid) {
      ...
      switch (*s)
      …
    + case 'e':
    +   if (strcmp(optarg, "efi") == 0) {
    +     uuid_t efi = GPT_ENT_TYPE_EFI;
      …

Page 12: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Redundant Operation

Reference:

      memset(&tsf_tlv, …));
      ...
    + memcpy(*buffer, &tsf_tlv);

Target:

      memcpy(&tsf_val, time_stamp, …);
      ..
    + memcpy(*buffer, &tsf_val);
      ...
      memcpy(*buffer, &tsf_val);

Page 13: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Distribution of Porting Errors

                             FreeBSD   Linux
Total                        113       182
Inconsistent Control Flow    8%        13%
Inconsistent Renaming        48%       41%
Inconsistent Data Flow       28%       14%
Redundant Operations         12%       26%
Other                        25%       14%

Page 14: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Outline

Empirical study of porting errors
Classification scheme for porting errors
SPA: Semantic Porting Analysis
Evaluation
Conclusion

Page 15: Detecting and Characterizing Semantic Inconsistencies in Ported Code

SPA Overview

Input: Reference and Target patches

Analyze the semantic differences between ported edits in reference and target context.

Output: Types of potential porting inconsistencies


Page 16: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Motivating Example

Reference:

    R(int flags, int bufsize, ostatfs osb) {
    R1. + cnt = bufsize / size(ostatfs);
    R2. + size = cnt + size(ostatfs);
    R3. + err = copy(osb, sp, size);
    R4.   return error;
    }

Target:

    T(int flags, int bufsize, stat osb) {
    T1.   if (flags == 3) { return 0; }
    T2. + cnt = bufsize / size(ostatfs);
    T3. + size = cnt + size(stat);
    T4. + if (size)
    T5. +   buf = new stat();
    T6. + err = copy(osb, buf, size);
    T7. + err = copy(osb, buf, size);
    T8.   return (err);
    }

Page 17: Detecting and Characterizing Semantic Inconsistencies in Ported Code

1. Identify Edited Nodes

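(Figure: control flow graphs of the reference and target methods; "+" marks edited nodes.)

Reference nodes: method_decl; + cnt = ..; + size = ..; + err = ..; ret err

Target nodes: method_decl; if (flags == 3), with branches T (ret 0) and F; + cnt = ..; + size = ..; + if (size), with branch T (+ buf = ..); + err = ..; err = ..; ret err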

Page 18: Detecting and Characterizing Semantic Inconsistencies in Ported Code

2. Compute Ported Nodes

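(Figure: control flow graphs of the reference and target methods; "+" marks edited nodes.)

Reference nodes: method_decl; + cnt = ..; + size = ..; + err = ..; ret err

Target nodes: method_decl; if (flags == 3), with branches T (ret 0) and F; + cnt = ..; + size = ..; + if (size), with branch T (+ buf = ..); + err = ..; err = ..; ret err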
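The slides do not say how SPA pairs edited nodes across the two patches. As a rough illustration only, the sketch below assumes a plain textual-similarity matcher (Python's `difflib`) as a stand-in; the function name `match_ported` and the 0.8 threshold are hypothetical choices, not part of SPA.

```python
from difflib import SequenceMatcher


def match_ported(ref_edits, tgt_edits, threshold=0.8):
    """Pair each edited reference statement with its most similar edited
    target statement, keeping only pairs at or above the threshold."""
    pairs = {}
    for ref in ref_edits:
        best, best_score = None, 0.0
        for tgt in tgt_edits:
            score = SequenceMatcher(None, ref, tgt).ratio()
            if score > best_score:
                best, best_score = tgt, score
        if best is not None and best_score >= threshold:
            pairs[ref] = best
    return pairs


# Edited ("+") statements from the motivating example.
ref_edits = [
    "cnt = bufsize / size(ostatfs);",
    "size = cnt + size(ostatfs);",
    "err = copy(osb, sp, size);",
]
tgt_edits = [
    "cnt = bufsize / size(ostatfs);",
    "size = cnt + size(stat);",
    "if (size)",
    "buf = new stat();",
    "err = copy(osb, buf, size);",
]

ported = match_ported(ref_edits, tgt_edits)
# Target edits with no reference counterpart are target-only additions.
unmatched = [t for t in tgt_edits if t not in ported.values()]
```

Under this assumption the three reference edits pair with their adapted target counterparts, while `if (size)` and `buf = new stat();` remain unmatched.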

Page 19: Detecting and Characterizing Semantic Inconsistencies in Ported Code

3. Detect Impacted Nodes

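(Figure: control flow graphs of the reference and target methods; "+" marks edited nodes.)

Reference nodes: method_decl; + cnt = ..; + size = ..; + err = ..; ret err

Target nodes: method_decl; if (flags == 3), with branches T (ret 0) and F; + cnt = ..; + size = ..; + if (size), with branch T (+ buf = ..); + err = ..; err = ..; ret err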

Page 20: Detecting and Characterizing Semantic Inconsistencies in Ported Code

4. Find Inconsistent Control Flow

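(Figure: control flow graphs of the reference and target methods; "+" marks edited nodes.)

Reference nodes: method_decl; + cnt = ..; + size = ..; + err = ..; ret err

Target nodes: method_decl; if (flags == 3), with branches T (ret 0) and F; + cnt = ..; + size = ..; + if (size), with branch T (+ buf = ..); + err = ..; err = ..; ret err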

Page 21: Detecting and Characterizing Semantic Inconsistencies in Ported Code

4. Find Inconsistent Control Flow

Reference:

    R(int flags, int bufsize, ostatfs osb) {
    R1. + cnt = bufsize / size(ostatfs);
    R2. + size = cnt + size(ostatfs);
    R3. + err = copy(osb, sp, size);
    R4.   return error;
    }

Target:

    T(int flags, int bufsize, stat osb) {
    T1.   if (flags == 3) { return 0; }
    T2. + cnt = bufsize / size(ostatfs);
    T3. + size = cnt + size(stat);
    T4. + if (size)
    T5. +   buf = new stat();
    T6. + err = copy(osb, buf, size);
    T7. + err = copy(osb, buf, size);
    T8.   return (err);
    }
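One way to picture this step is to compare the conditions that guard each ported statement in the two contexts. The sketch below is an assumption about the idea, not SPA's actual algorithm: the guard lists are written by hand from the motivating example (the early return at T1 makes every later target statement depend on `flags == 3` being false), and the function name is hypothetical.

```python
def inconsistent_control_flow(ref_guards, tgt_guards, ported_pairs):
    """Return ported pairs whose guarding conditions differ between
    the reference and the target context."""
    report = []
    for ref_id, tgt_id in ported_pairs:
        if ref_guards[ref_id] != tgt_guards[tgt_id]:
            report.append((ref_id, tgt_id, ref_guards[ref_id], tgt_guards[tgt_id]))
    return report


# Hand-written guards (assumed): the reference statements run unconditionally.
ref_guards = {"R1": [], "R2": [], "R3": []}
# In the target, T1 returns early when flags == 3, so T2, T3 and T6
# execute only when that condition is false.
tgt_guards = {
    "T2": ["!(flags == 3)"],
    "T3": ["!(flags == 3)"],
    "T6": ["!(flags == 3)"],
}
ported_pairs = [("R1", "T2"), ("R2", "T3"), ("R3", "T6")]

report = inconsistent_control_flow(ref_guards, tgt_guards, ported_pairs)
```

A real analysis would derive the guards from control dependences and would normalize renamed identifiers before comparing conditions; both are omitted here.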

Page 22: Detecting and Characterizing Semantic Inconsistencies in Ported Code

5. Detect Inconsistent Renamings

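(Figure: control flow graphs of the reference and target methods; "+" marks edited nodes.)

Reference nodes: method_decl; + cnt = ..; + size = ..; + err = ..; ret err

Target nodes: method_decl; if (flags == 3), with branches T (ret 0) and F; + cnt = ..; + size = ..; + if (size), with branch T (+ buf = ..); + err = ..; err = ..; ret err

Compared statement pairs:

    R2. size = cnt + size(ostatfs);       T3. size = cnt + size(stat);
    R1. cnt = bufsize / size(ostatfs);    T2. cnt = bufsize / size(ostatfs);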
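The two statement pairs on this slide can be checked with a simple identifier mapping: if one reference identifier lines up with more than one target identifier, the renaming is inconsistent. The sketch below is a minimal illustration under that assumption; the positional token alignment and the function names are hypothetical and much simpler than a real analysis.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def rename_map(pairs):
    """For each reference identifier, collect the target identifiers
    that appear at the same position in the paired statement."""
    mapping = {}
    for ref_stmt, tgt_stmt in pairs:
        ref_ids = IDENT.findall(ref_stmt)
        tgt_ids = IDENT.findall(tgt_stmt)
        if len(ref_ids) != len(tgt_ids):
            continue  # this sketch only handles positionally aligned statements
        for r, t in zip(ref_ids, tgt_ids):
            mapping.setdefault(r, set()).add(t)
    return mapping


def inconsistent_renamings(pairs):
    """Keep only identifiers mapped to more than one target name."""
    return {r: ts for r, ts in rename_map(pairs).items() if len(ts) > 1}


pairs = [
    ("cnt = bufsize / size(ostatfs);", "cnt = bufsize / size(ostatfs);"),  # R1 / T2
    ("size = cnt + size(ostatfs);", "size = cnt + size(stat);"),          # R2 / T3
]
result = inconsistent_renamings(pairs)
```

Here `ostatfs` stays `ostatfs` in T2 but becomes `stat` in T3, so it is the only identifier reported.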

Page 23: Detecting and Characterizing Semantic Inconsistencies in Ported Code

5. Detect Inconsistent Renamings

Reference:

    R(int flags, int bufsize, ostatfs osb) {
    R1. + cnt = bufsize / size(ostatfs);
    R2. + size = cnt + size(ostatfs);
    R3. + err = copy(osb, sp, size);
    R4.   return error;
    }

Target:

    T(int flags, int bufsize, stat osb) {
    T1.   if (flags == 3) { return 0; }
    T2. + cnt = bufsize / size(ostatfs);
    T3. + size = cnt + size(stat);
    T4. + if (size)
    T5. +   buf = new stat();
    T6. + err = copy(osb, buf, size);
    T7. + err = copy(osb, buf, size);
    T8.   return (err);
    }

Page 24: Detecting and Characterizing Semantic Inconsistencies in Ported Code

6. Identify Inconsistent Data Flow

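(Figure: control flow graphs of the reference and target methods; "+" marks edited nodes.)

Reference nodes: method_decl; + cnt = ..; + size = ..; + err = ..; ret err

Target nodes: method_decl; if (flags == 3), with branches T (ret 0) and F; + cnt = ..; + size = ..; + if (size), with branch T (+ buf = ..); + err = ..; err = ..; ret err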

Page 25: Detecting and Characterizing Semantic Inconsistencies in Ported Code

6. Identify Inconsistent Data Flow

Reference:

    R(int flags, int bufsize, ostatfs osb) {
    R1. + cnt = bufsize / size(ostatfs);
    R2. + size = cnt + size(ostatfs);
    R3. + err = copy(osb, sp, size);
    R4.   return error;
    }

Target:

    T(int flags, int bufsize, stat osb) {
    T1.   if (flags == 3) { return 0; }
    T2. + cnt = bufsize / size(ostatfs);
    T3. + size = cnt + size(stat);
    T4. + if (size)
    T5. +   buf = new stat();
    T6. + err = copy(osb, buf, size);
    T7. + err = copy(osb, buf, size);
    T8.   return (err);
    }
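One plausible reading of this step is a comparison of where each operand of a ported statement gets its value. The sketch below assumes straight-line code (branches such as T1 and T4 are ignored), a hand-supplied mapping of ported statements, and positional operand alignment; all helper names are hypothetical, and this is not presented as SPA's implementation.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")
ASSIGN = re.compile(r"\s*([A-Za-z_]\w*)\s*=(?!=)(.*)")


def reaching_defs(stmts):
    """Map each statement label to [(used name, label of its latest
    earlier definition or None)]. Function names are treated like
    variables, which is a simplification."""
    last_def, uses = {}, {}
    for label, text in stmts:
        m = ASSIGN.match(text)
        rhs = m.group(2) if m else text
        uses[label] = [(v, last_def.get(v)) for v in IDENT.findall(rhs)]
        if m:
            last_def[m.group(1)] = label
    return uses


def inconsistent_data_flow(ref, tgt, ported):
    """Flag operands whose defining statement in the target is not the
    ported counterpart of the defining statement in the reference."""
    ref_uses, tgt_uses = reaching_defs(ref), reaching_defs(tgt)
    report = []
    for r, t in ported.items():
        for (rv, rdef), (tv, tdef) in zip(ref_uses[r], tgt_uses[t]):
            if ported.get(rdef) != tdef:
                report.append((r, t, rv, tv))
    return report


ref = [("R1", "cnt = bufsize / size(ostatfs);"),
       ("R2", "size = cnt + size(ostatfs);"),
       ("R3", "err = copy(osb, sp, size);")]
tgt = [("T2", "cnt = bufsize / size(ostatfs);"),
       ("T3", "size = cnt + size(stat);"),
       ("T5", "buf = new stat();"),
       ("T6", "err = copy(osb, buf, size);")]
ported = {"R1": "T2", "R2": "T3", "R3": "T6"}  # assumed ported-node mapping

flow_report = inconsistent_data_flow(ref, tgt, ported)
```

With these inputs only the `sp` / `buf` operand of R3 / T6 is flagged: `buf` is defined by T5, which has no counterpart in the reference, while `sp` has no definition inside the reference patch.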

Page 26: Detecting and Characterizing Semantic Inconsistencies in Ported Code

7. Detect Redundant Operation

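(Figure: control flow graph of the target method; "+" marks edited nodes.)

Target nodes: method_decl; if (flags == 3), with branches T (ret 0) and F; + cnt = ..; + size = ..; + if (size), with branch T (+ buf = ..); + err = copy(); err = copy(); ret err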

Page 27: Detecting and Characterizing Semantic Inconsistencies in Ported Code

7. Detect Redundant Operation

Reference:

    R(int flags, int bufsize, ostatfs osb) {
    R1. + cnt = bufsize / size(ostatfs);
    R2. + size = cnt + size(ostatfs);
    R3. + err = copy(osb, sp, size);
    R4.   return error;
    }

Target:

    T(int flags, int bufsize, stat osb) {
    T1.   if (flags == 3) { return 0; }
    T2. + cnt = bufsize / size(ostatfs);
    T3. + size = cnt + size(stat);
    T4. + if (size)
    T5. +   buf = new stat();
    T6. + err = copy(osb, buf, size);
    T7. + err = copy(osb, buf, size);
    T8.   return (err);
    }
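In the target, T6 and T7 perform the same copy twice. A minimal sketch of how such a repeat could be flagged is given below, assuming straight-line code and simple `x = expr;` assignments; the helper names are hypothetical, and side effects of the called function are ignored.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")
ASSIGN = re.compile(r"\s*([A-Za-z_]\w*)\s*=(?!=)(.*)")


def parse_assign(stmt):
    """Return (assigned variable, names used on the right-hand side),
    or None if the statement is not a simple assignment."""
    m = ASSIGN.match(stmt)
    return (m.group(1), set(IDENT.findall(m.group(2)))) if m else None


def redundant_operations(stmts):
    """Flag (i, j) when statement j repeats statement i and none of its
    operands were reassigned from statement i up to statement j."""
    flagged = []
    for j, stmt in enumerate(stmts):
        parsed = parse_assign(stmt)
        if parsed is None:
            continue
        _, used = parsed
        for i in range(j - 1, -1, -1):
            if stmts[i] != stmt:
                continue
            changed = {p[0] for p in map(parse_assign, stmts[i:j]) if p}
            if not (used & changed):
                flagged.append((i, j))
            break  # only the nearest earlier copy is considered
    return flagged


target = [
    "cnt = bufsize / size(ostatfs);",
    "size = cnt + size(stat);",
    "buf = new stat();",
    "err = copy(osb, buf, size);",  # T6
    "err = copy(osb, buf, size);",  # T7
]
flagged = redundant_operations(target)
```

A statement such as `x = x + 1;` repeated twice is not flagged, because the first copy changes an operand of the second.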

Page 28: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Outline

Empirical study of porting errors
Classification scheme for porting errors
SPA: Semantic Porting Analysis
Evaluation
Conclusion

Page 29: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Evaluation

RQ1. Can SPA accurately detect porting inconsistencies?

RQ2. Can SPA accurately categorize porting inconsistencies?

Implementation
Java static analysis framework
Extends LASE and Sydit [Meng et al.], and uses Crystal [Aldrich et al.]

Page 30: Detecting and Characterizing Semantic Inconsistencies in Ported Code

RQ1. Can SPA accurately detect porting inconsistencies?

Example 1

Reference:

      x = 5
    + foo(x)

Target:

      x = x + y
      x = 5
    + foo(x)

✔ SPA correctly reports No Inconsistency.

Example 2

Reference:

      for (i = 0; i < n;) {
    +   foo(i)
        i++;
      }

Target:

      i = 0;
      while (i < n) {
    +   foo(i)
        i++;
      }

✖ SPA incorrectly reports Inconsistency.

Page 31: Detecting and Characterizing Semantic Inconsistencies in Ported Code

RQ1. Can SPA accurately detect porting inconsistencies?

                  Eclipse CDT   Mozilla
                  SPA           SPA
Total             63            42
Detected          43            34
False positive    15            9
False negative    3             -

SPA detects inconsistencies with 65% to 73% precision and 90% recall.
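The reported percentages can be reproduced from the table, assuming the standard definitions (the slides do not state the formulas): true positives = detected minus false positives, precision = true positives / detected, recall = true positives / (true positives + false negatives). The function name is hypothetical.

```python
def precision_recall(detected, false_pos, false_neg=None):
    """Precision and recall from raw counts; recall is None when the
    false-negative count is not reported."""
    true_pos = detected - false_pos
    precision = true_pos / detected
    recall = None if false_neg is None else true_pos / (true_pos + false_neg)
    return precision, recall


# Counts from the table above.
cdt_p, cdt_r = precision_recall(43, 15, 3)  # Eclipse CDT
moz_p, moz_r = precision_recall(34, 9)      # Mozilla, false negatives shown as "-"

print(f"Eclipse CDT: precision {cdt_p:.1%}, recall {cdt_r:.1%}")
print(f"Mozilla:     precision {moz_p:.1%}")
```

This gives 28/43 (about 65.1%) precision and 28/31 (about 90.3%) recall for Eclipse CDT, and 25/34 (about 73.5%) precision for Mozilla, in line with the 65% to 73% precision and 90% recall quoted on the slide.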

Page 32: Detecting and Characterizing Semantic Inconsistencies in Ported Code

RQ1. Can SPA accurately detect porting inconsistencies?

SPA improves precision by 14 to 17 percentage points w.r.t. earlier tools.

(Figure: bar charts on a 0 to 100 scale comparing SPA with Jiang's Tool on precision and recall, and SPA with Dejavu on precision.)

Page 33: Detecting and Characterizing Semantic Inconsistencies in Ported Code

RQ2. Can SPA accurately categorize porting inconsistencies?

Category                                    Detected   Ground truth   False positive   False negative
Inconsistent Control Flow                   33         23             12               2
Inconsistent Identifier Renaming            7          7              2                2
Inconsistent Related Identifier Renaming    5          4              2                1
Inconsistent Data Flow                      17         5              12               0
Total                                       62         39             26               3

Page 34: Detecting and Characterizing Semantic Inconsistencies in Ported Code

RQ2. Can SPA accurately categorize porting inconsistencies?

Reference:

      int x;
      x = 5;
    + foo(x)

Target:

      int x = 5;
    + foo(x)

✖ SPA incorrectly reports this as Inconsistent Data Flow.

Page 35: Detecting and Characterizing Semantic Inconsistencies in Ported Code

RQ2. Can SPA accurately categorize porting inconsistencies?

Category                                    SPA   Ground truth   False positive   False negative
Inconsistent Control Flow                   33    23             12               2
Inconsistent Identifier Renaming            7     7              2                2
Inconsistent Related Identifier Renaming    5     4              2                1
Inconsistent Data Flow                      17    5              12               0
Total                                       62    39             26               3

SPA categorizes inconsistencies with 58% to 63% precision and 92% to 100% recall.

Page 36: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Summary

Study different types of porting errors in practice.
Detect and categorize potential porting errors successfully.

Future Work
Integrate SPA with an integrated development environment (IDE).
Investigate other complementary approaches to detect porting errors.

Page 38: Detecting and Characterizing Semantic Inconsistencies in Ported Code

Acknowledgement

We thank Na Meng for discussions and for help in designing and implementing SPA. Google Summer of Code 2012. Supported by National Science Foundation grants CCF-1149391, CCF-1117902, SHF-0910818, and CNS-1239498.

Page 39: Detecting and Characterizing Semantic Inconsistencies in Ported Code

RQ1. Can SPA accurately detect porting inconsistencies?

                  Eclipse CDT             Mozilla
                  SPA    Jiang’s Tool     SPA    Dejavu
Detected          43     56               34     42
False positive    15     29               9      17
False negative    3      4                -      -

SPA detects inconsistencies with 65% to 73% precision and 90% recall.

SPA improves precision by 14 to 17 percentage points w.r.t. earlier tools.
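The 14 to 17 point improvement follows from the counts in the table, again assuming precision = (detected - false positives) / detected, which the slides do not spell out. The dictionary layout and function names below are illustrative only.

```python
def precision(detected, false_pos):
    """Fraction of reported inconsistencies that are true positives."""
    return (detected - false_pos) / detected


# (detected, false positives) per tool, taken from the table above.
results = {
    "Eclipse CDT": {"SPA": (43, 15), "Jiang's Tool": (56, 29)},
    "Mozilla": {"SPA": (34, 9), "Dejavu": (42, 17)},
}


def improvement_points(project):
    """Precision gain of SPA over the other tool, in percentage points."""
    tools = results[project]
    baseline = next(name for name in tools if name != "SPA")
    return 100 * (precision(*tools["SPA"]) - precision(*tools[baseline]))


for project in results:
    print(f"{project}: +{improvement_points(project):.1f} percentage points")
```

Jiang's Tool reaches 27/56 (about 48.2%) precision on Eclipse CDT and Dejavu reaches 25/42 (about 59.5%) on Mozilla, so SPA's gain is roughly 16.9 and 14.0 points respectively.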

