Assessing Product Line Derivation Operators Applied to Java Source Code:
An Empirical Study
João Bosco Ferreira Filho, Simon Allier, Olivier Barais, Mathieu Acher and Benoit Baudry
SPLC 2015, July 20 - 24, 2015, Nashville, TN, USA
[Figure: feature model (Feature, cardinality 0..1) with its variation points]
Variability is omnipresent in numerous kinds of artefacts
Given a kind of artefact (expressed in a language), you want to make it vary
You want variants of a…
Word document
Java, HTML, CSS or C++ program
Class diagram
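State machine model
[Figure: state machines over states A, B, C with transitions t1, t2, t3, and a variant with states A and C and transition t2]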
You need a solution for (de)activating, adding, removing, or substituting some elements, and thus deriving variants of a…
Ideally applicable to any kind of artefact expressed in a language
Word document
Java, HTML, CSS or C++ program
Class diagram
State machine model
Common Variability Language (CVL): automatically deriving products (e.g., models)
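[Figure: CVL derivation. Variation points (Object Existence 1, Link Existence 1, Link Existence 2, Link Existence 3) over a base state machine (states A, B, C; transitions t1, t2, t3) are fed to the Derivation Engine, whose Derivation Operators yield the Derived Products.]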
Ideally applicable to any kind of artefact expressed in a language (conformant to a metamodel)
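The derivation idea on this slide (variation points are resolved, then operators are applied to a base model) can be sketched as follows. This is a toy illustration, not the CVL engine: the `StateMachine` class, the `derive` function, the boolean resolution format, and the transition endpoints are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StateMachine:
    """Toy base model: named states plus (name, source, target) transitions."""
    states: set = field(default_factory=set)
    transitions: set = field(default_factory=set)


def derive(base, object_existence, link_existence):
    """Hypothetical derivation engine.

    object_existence maps a state name to True/False (keep or remove);
    link_existence does the same for transition names. Unlisted elements
    are kept. Transitions whose endpoint was removed are dropped as well,
    which is already a small language-specific specialization.
    """
    states = {s for s in base.states if object_existence.get(s, True)}
    transitions = {
        (name, src, dst)
        for (name, src, dst) in base.transitions
        if link_existence.get(name, True) and src in states and dst in states
    }
    return StateMachine(states, transitions)


base = StateMachine(
    states={"A", "B", "C"},
    # Transition endpoints are assumed; the slides only name t1, t2, t3.
    transitions={("t1", "A", "B"), ("t3", "B", "C"), ("t2", "A", "C")},
)

# Resolution: state B is deselected; all links are nominally kept.
product = derive(base, {"B": False}, {})
print(sorted(product.states))       # ['A', 'C']
print(sorted(product.transitions))  # [('t2', 'A', 'C')]
```

Without the endpoint check in `derive`, removing B would leave t1 and t3 dangling, which is the kind of invalid product a generic operator can produce.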
Previous work shows:
#1 Using CVL “as is” does not work. It is highly beneficial to specialize derivation operators for a given language [Filho et al. SPLC’13]; it is mandatory in an industrial context [Filho et al. STTT’14]
#2 It is hard for users not to make mistakes: verification techniques, e.g., [Czarnecki et al. GPCE’06, Batory et al. GPCE’07, Classen et al. ICSE’10]; or support for preventing errors and guiding users when specifying variability in an IDE
e.g., automatically FIND THESE PERCENTAGES!
Which derivation operations are more likely to work? Which are more likely to derive uncompilable code? Can we identify operations subject to specialization?
#1 Using CVL “as is” does not work. We need to specialize derivation operators for a given language [Filho et al. SPLC’13] [Filho et al. STTT’14]
#2 It is hard for users not to make mistakes: support for preventing errors and guiding users when specifying variability in an IDE
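[Figure: variability model with nodes VS1 to VS6 (cardinality 0..2) bound to variation points Object Existence 1 to Object Existence 5, annotated with success rates (>80%, <10%, >40%), over a base state machine (states A, B, C; transitions t1, t2, t3)]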
For any programming/modeling “language”
http://grepcode.com/file/repo1.maven.org/maven2/com.codahale.metrics/metrics-core/3.0.0-BETA3/com/codahale/metrics/CsvReporter.java
[Figure: source code from the file above annotated with Object Existence 1, Object Existence 2, and Object Existence 3]
Kind of derivation operations by example
Object Existence expresses whether a given object will be part of the derived variant; executing it deletes or adds a source code element (e.g., statements, assignments, blocks, literals, etc.) in the original program.
Link Existence expresses whether there is a relationship between two elements. In the case of Java programs, we consider as a link any relationship between classes: association, composition, inheritance, etc. (e.g., removing an extends Class A from a class's header).
Object Substitution expresses that a given program element will be replaced by another of the same type, e.g., a method substituted by another method.
Link End Substitution expresses that a relationship between a class A and a class B will be replaced by another relationship of the same type between class A and a third class C (e.g., A extends C instead of A extends B).
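To make the four kinds of operations concrete, here is a minimal sketch over a toy program model. The dictionary layout and the four function names are assumptions for illustration; the study itself operates on real Java source code, not on this structure.

```python
import copy

# Toy program model (an assumption for illustration): each class has an
# optional superclass and a dict of method name -> body text.
program = {
    "A": {"extends": "B", "methods": {"run": "return 1;", "stop": "return 0;"}},
    "B": {"extends": None, "methods": {}},
    "C": {"extends": None, "methods": {}},
}


def object_existence(prog, cls, method):
    """Object Existence: remove a method from the derived variant."""
    out = copy.deepcopy(prog)
    del out[cls]["methods"][method]
    return out


def link_existence(prog, cls):
    """Link Existence: remove the 'extends' link from a class header."""
    out = copy.deepcopy(prog)
    out[cls]["extends"] = None
    return out


def object_substitution(prog, cls, target, replacement):
    """Object Substitution: replace a method body with another method's body."""
    out = copy.deepcopy(prog)
    out[cls]["methods"][target] = out[cls]["methods"][replacement]
    return out


def link_end_substitution(prog, cls, new_parent):
    """Link End Substitution: redirect the 'extends' link to another class."""
    out = copy.deepcopy(prog)
    out[cls]["extends"] = new_parent
    return out


print(object_existence(program, "A", "stop")["A"]["methods"])
print(link_existence(program, "A")["A"]["extends"])
print(object_substitution(program, "A", "run", "stop")["A"]["methods"]["run"])
print(link_end_substitution(program, "A", "C")["A"]["extends"])
```

Each function returns a modified copy, so the original program stays available for deriving further variants.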
Empirical Study
Methodology
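[Figure: methodology pipeline. A CVL variation point (Object Existence, Link Existence, Object Substitution, or Link End Substitution) drives a derivation operation that transforms a program p into p'; p' then goes through compilation and testing and is classified as a counterexample, a variant, or a sosie [Baudry ISSTA’14].]
A random operator is applied to a random code element in the program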
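The random selection step could look like the sketch below. It is only an assumption about how such sampling might be organized: the `APPLICABLE` mapping, the element descriptions, and the choice to pick the element first and then a compatible operator are all hypothetical, not taken from the study's tooling.

```python
import random

# Which operator kinds may target which element kinds
# (an assumed, simplified mapping).
APPLICABLE = {
    "object": ["ObjectExistence", "ObjectSubstitution"],
    "link": ["LinkExistence", "LinkEndSubstitution"],
}


def pick_operation(elements, rng):
    """Pick a random code element, then a random operator that applies to it."""
    kind, name = rng.choice(elements)
    return rng.choice(APPLICABLE[kind]), name


# Hypothetical code elements extracted from a program p.
elements = [
    ("object", "if statement in Foo.java"),
    ("object", "field Foo.count"),
    ("link", "Foo extends Bar"),
]

rng = random.Random(42)  # fixed seed so the sampling is reproducible
for _ in range(3):
    print(pick_operation(elements, rng))
```

Passing an explicit `random.Random` instance keeps runs reproducible, which matters when hundreds of thousands of derived programs have to be traced back to the operation that produced them.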
Study (the big picture)
8 Java projects
86 CVL operations: Object Existence Field, Object Existence Interface, Object Existence Foreach, Object Existence Break, …, Object Substitution SuperAccess, Object Substitution Annotation, …, Link Existence …, LinkEnd Subst …
Derivation of programs p', followed by Compilation and Testing
370,000 programs
% Counterexamples (does not compile)
% Variants (only compiles)
% Sosies (compiles and passes the test suites)
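The three measurements can be computed per operation as in the sketch below. It assumes the categories are mutually exclusive (a variant compiles but fails at least one test); the function names and the sample outcomes are made up for illustration and are not data from the study.

```python
from collections import Counter


def classify(compiles, tests_pass):
    """Map a derived program's outcome to one of the three categories."""
    if not compiles:
        return "counterexample"
    return "sosie" if tests_pass else "variant"


def percentages(outcomes):
    """outcomes: non-empty list of (compiles, tests_pass) pairs for one operation."""
    counts = Counter(classify(c, t) for c, t in outcomes)
    total = sum(counts.values())
    return {
        k: 100.0 * counts[k] / total
        for k in ("counterexample", "variant", "sosie")
    }


# Made-up outcomes for one hypothetical operation, for illustration only.
outcomes = [(False, False)] * 6 + [(True, False)] * 3 + [(True, True)]
print(percentages(outcomes))
# {'counterexample': 60.0, 'variant': 30.0, 'sosie': 10.0}
```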
Subject Programs
All have JUnit tests (statement coverage greater than 70%)
Variables and measurements
Derivation: Object Existence Field, Object Existence Interface, Object Existence Foreach, Object Existence Break, …, Object Substitution SuperAccess, Object Substitution Annotation, …, Link Existence …, LinkEnd Subst …
% Counterexamples (does not compile)
% Variants (only compiles)
% Sosies (compiles and passes the test suites)
Instrumentation
Instrumentation (2)
370,000 programs, one month of computation
Compilation and Testing
Results and Findings
• 86% of the possible pairs (CVL operator + type of Java code element) resulted in compilable programs at least once
  ▪ Many possibilities to vary a Java program
• We found 72 types of CVL-based derivation operations that could produce compilable Java programs
  – e.g., substituting one child or parent class by another,
  – suppressing an if statement,
  – introducing/suppressing a method invocation,
  – etc.
• There are operations that will always lead to counterexamples or to variants
[Figure: success rate per operation, ranging from ~0% to ~100%]
• Operations with a low success rate should not be discarded outright; they may instead be specialized
  ▪ Looking into those operations
    • Qualitatively analysing them with the help of dedicated tooling
Visualization Tool
[Figure: screenshot of the visualization tool with three numbered views: Project View, Class View, and Operation View]
Results and Findings
• Specialization to avoid recurrent errors
  ▪ Simple specializations
    • e.g., removing a try in the case of removing a catch
  ▪ Static-analysis-based specializations
    • e.g., identifying strongly connected classes to be removed together
• Specialized and typed operators
  ▪ Object Existence
    • Class existence, field existence, parameter existence
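As an illustration of a static-analysis-based specialization, the sketch below groups a class with the classes it is mutually dependent on (its strongly connected component), so that a specialized Object Existence could remove them together. The dependency graph and the function names are hypothetical, and the mutual-reachability check is a naive stand-in for a linear-time algorithm such as Tarjan's; it is adequate only for small graphs.

```python
def reachable(graph, start):
    """All nodes reachable from start (including start) by following edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def removal_group(graph, cls):
    """Classes mutually dependent with cls (its strongly connected component)."""
    forward = reachable(graph, cls)
    return {other for other in forward if cls in reachable(graph, other)}


# Hypothetical class dependency graph: class -> classes it references.
deps = {
    "Order": ["Invoice"],
    "Invoice": ["Order", "Money"],
    "Money": [],
}

print(sorted(removal_group(deps, "Order")))  # ['Invoice', 'Order']
print(sorted(removal_group(deps, "Money")))  # ['Money']
```

Removing the whole group avoids the dangling references between its members, although other classes that reference the group would still need to be checked.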
• Varying entire blocks of code instead of single instructions is more likely to generate correct programs
  ▪ Do, For, ForEach, While, If, Throw
    • 70% to 98% of variants when coupled with Object Existence
  ▪ Fine-grained variability works better (i.e., it is easier to put variability inside a method than to manipulate coarse elements like « interface »)
  ▪ Anomaly: Classes or Methods → 0.1%
    – Invoked in other parts of the code
• Object Existence is more likely to generate variants
  ▪ Object Existence → 21.97% of variants
  ▪ Link Existence → 11.81% of variants
  ▪ Object Substitution → 9.22% of variants
  ▪ Link End Substitution → 4.51% of variants
• Overall, CVL is not safe to be directly applied to Java
  ▪ Many different ways to vary a program
    • But high probability to break it (90%)
  ▪ Generic language to vary any base model type, but specialization is clearly required
Conclusions
• Large-scale assessment of derivation operations
  ▪ More than 370,000 derived products
• Many kinds of operations
  ▪ 86 ways of varying a Java program
    • Some of them were never considered before by variability studies
• Quantitatively and qualitatively supported insights
  ▪ Extensive panorama of success rates for each operation
  ▪ Visualization tool for analyzing the transformations
• Opens new perspectives for supporting variability in languages, especially in Java
Future Work
• Using the results to
  ▪ Devise specialized derivation operators for Java
  ▪ Help current variability-supporting IDEs to incorporate new possibilities of variation, knowing the risks of doing so
• Apply derivation operators in different areas, for different objectives (e.g., resilience), such as software diversification
Long-term vision: variability-aware IDE, (anti-)patterns for any language through automated exploration
Questions?