Date post: 23-Dec-2015
V. Winter, J. Guerrero, A. James, C. Reinke LINKING SYNTACTIC AND SEMANTIC MODELS OF JAVA SOURCE CODE WITHIN A PROGRAM TRANSFORMATION SYSTEM
Transcript

OUTLINE

• Introduction

• Motivation: The need for static analysis

• Why transformation systems are interesting in this setting

• Creating a rule in PMD

• Creating a rule in Sextant

• GPS-Traverse

• Overview

• Example: Constructing a call-graph

• Technical details of GPS-Traverse

SOURCE-CODE ANALYSIS

• Is heavily employed across the public and private sectors, including:

• the top 5 commercial banks

• 5 of the top 7 computer software companies

• 3 of the top 5 commercial aerospace and defense industry leaders

• the 3 largest armed services of the US

• 3 of the leading 4 accounting firms

• 2 of the top 3 insurance companies

SOURCE-CODE ANALYSIS

• It has been argued that source-code analysis can play an important role with respect to software assurance within an Agile development process

• The FDA is recommending (and may eventually mandate) the use of static-analysis tools for the development of medical device software.

• GrammaTech’s CodeSonar is a static-analysis tool that the FDA is currently using to investigate failures in recalled medical devices.

STATIC-ANALYSIS TOOLS

• Are frequently rule-based

• Utilize a variety of software models (e.g., AST, call-graph, control-flow graph)

• In an OO implementation, involve traversals of object-structures using the visitor pattern.

• Make use of pattern recognition (e.g., matching).

• May transform source-code (e.g., inserting markers/annotations to control analysis)

• Query software models

• Aggregate information

CREATING A RULE IN PMD

Avoid using while-loops without curly braces

CREATING A RULE IN PMD

• Step 1: Figure out what to look for. In this case we want to capture the convention that while-loops must use braces.

• Construct a compilation unit containing an instance of the syntactic property you want to detect.

    class Example {
        void bar() {
            while (baz)
                buz.doSomething();
        }
    }

AST GENERATION

• PMD uses JavaCC to generate an AST (Abstract Syntax Tree) corresponding to the source code.

    CompilationUnit
     TypeDeclaration
      ClassDeclaration:(package private)
       UnmodifiedClassDeclaration(Example)
        ClassBody
         ClassBodyDeclaration
          MethodDeclaration:(package private)
           ResultType
           MethodDeclarator(bar)
            FormalParameters
           Block
            BlockStatement
             Statement
              WhileStatement
               Expression
                PrimaryExpression
                 PrimaryPrefix
                  Name:baz
               Statement
                StatementExpression:null
                 PrimaryExpression
                  PrimaryPrefix
                   Name:buz.doSomething
                  PrimarySuffix
                   Arguments

PATTERN SELECTION

• Select and generalize the smallest portion of the AST containing the pattern in which you are interested. Make sure you discriminate good patterns from bad patterns (e.g., blocks versus no blocks). Consult the Java grammar as needed.

    CompilationUnit
     TypeDeclaration
      ClassDeclaration:(package private)
       UnmodifiedClassDeclaration(Example)
        ClassBody
         ClassBodyDeclaration
          MethodDeclaration:(package private)
           ResultType
           MethodDeclarator(bar)
            FormalParameters
           Block
            BlockStatement
             Statement
              WhileStatement
               Expression
                PrimaryExpression
                 PrimaryPrefix
                  Name:baz
               Statement
                StatementExpression:null
                 PrimaryExpression
                  PrimaryPrefix
                   Name:buz.doSomething
                  PrimarySuffix
                   Arguments

CREATE RULE

    public class WhileLoopsMustUseBracesRule extends AbstractRule {
        public Object visit(ASTWhileStatement node, Object data) {
            // Child 0 is the loop condition; child 1 is the loop body.
            SimpleNode firstStmt = (SimpleNode) node.jjtGetChild(1);
            if (!hasBlockAsFirstChild(firstStmt)) {
                addViolation(data, node);
            }
            return super.visit(node, data);
        }
    }

CREATE PATTERN MATCHER

    // pattern matcher
    private boolean hasBlockAsFirstChild(SimpleNode node) {
        return (node.jjtGetNumChildren() != 0
                && (node.jjtGetChild(0) instanceof ASTBlock));
    }
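The rule and its matcher can be tried without a PMD installation using the minimal sketch below. The Node class and the node-kind strings are hypothetical stand-ins for PMD's JavaCC-generated AST classes and are not part of PMD's API; only the shape of the check (the second child of a while statement must wrap a block) follows the slides.

```java
import java.util.ArrayList;
import java.util.List;

public class WhileBracesSketch {
    // Hypothetical stand-in for a JavaCC AST node: a kind plus ordered children.
    static final class Node {
        final String kind;
        final List<Node> children = new ArrayList<>();

        Node(String kind, Node... kids) {
            this.kind = kind;
            for (Node k : kids) children.add(k);
        }
    }

    // Pattern matcher: does this Statement node wrap a Block?
    static boolean hasBlockAsFirstChild(Node stmt) {
        return !stmt.children.isEmpty() && stmt.children.get(0).kind.equals("Block");
    }

    // Traverse the whole tree and collect every while statement whose body is not a block.
    static List<Node> findViolations(Node root) {
        List<Node> violations = new ArrayList<>();
        collect(root, violations);
        return violations;
    }

    private static void collect(Node node, List<Node> out) {
        if (node.kind.equals("WhileStatement") && node.children.size() > 1
                && !hasBlockAsFirstChild(node.children.get(1))) {
            out.add(node);
        }
        for (Node child : node.children) collect(child, out);
    }

    // Builds WhileStatement(Expression, Statement(body)), mirroring the AST shown earlier.
    static Node whileLoop(Node body) {
        return new Node("WhileStatement", new Node("Expression"), new Node("Statement", body));
    }

    public static void main(String[] args) {
        Node unbraced = whileLoop(new Node("StatementExpression"));
        Node braced = whileLoop(new Node("Block"));
        Node method = new Node("Block", unbraced, braced);
        System.out.println(findViolations(method).size()); // prints 1
    }
}
```

Keeping the matcher separate from the traversal, as the slides do, means the same traversal can host other syntactic rules by swapping in a different predicate.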

ADD RULE TO RULESET

• Add the newly created rule to the PMD ruleset

    <?xml version="1.0"?>
    <ruleset name="My custom rules"
        xmlns="http://pmd.sf.net/ruleset/1.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://pmd.sf.net/ruleset/1.0.0 http://pmd.sf.net/ruleset_xml_schema.xsd"
        xsi:noNamespaceSchemaLocation="http://pmd.sf.net/ruleset_xml_schema.xsd">
      <rule name="WhileLoopsMustUseBracesRule"
          message="Avoid using 'while' statements without curly braces"
          class="WhileLoopsMustUseBracesRule">
        <description>Avoid using 'while' statements without using curly braces</description>
        <priority>3</priority>
        <example>
    <![CDATA[
    public void doSomething() {
        while (true)
            x++;
    }
    ]]>
        </example>
      </rule>
    </ruleset>

IN SEXTANT

Avoid using while-loops without curly braces

CREATE BASIC RULE PATTERN

    strategy WhileLoopsMustUseBracesRule:
        Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
        →
        Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]

ADD SPECIFIC PATTERN CONSTRAINT

    strategy WhileLoopsMustUseBracesRule:
        Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
        →
        Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
        if { not(<Statement>_1 = Statement[:] <Block>_1 [:]) }

ADD METRIC/ACTION

    strategy WhileLoopsMustUseBracesRule:
        Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
        →
        Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
        if {
            not(<Statement>_1 = Statement[:] <Block>_1 [:])
            andalso sml.addViolation(<Statement>_1)
        }

OBSERVATIONS

• Primitive operations in transformation systems include:

• Parsing

• Matching

• Traversal

• The software models that transformation systems typically operate on are terms – either concrete or abstract syntax trees.

• This makes the foundational framework of transformation systems well-suited for rule-based source-code analysis systems, especially systems whose rules have syntax-based specifications.

SEMANTIC RULES

Use equals() instead of == to compare objects

JAVA’S INTEGER CACHE

• Some rules require semantic analysis

• The implementation of such rules requires the ability to query semantic models (i.e., software models other than an AST)

    package p1;

    public class A {
        static void myEq(Integer x, Integer y) {
            System.out.println(x == y);
        }

        public static void main(String[] args) {
            myEq(100, 100);
            myEq(200, 200);
        }
    }
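The surprise in this program comes from boxing: Java guarantees that boxed int values from -128 to 127 are cached, so both arguments of the first call refer to the same Integer object, while larger values usually get a fresh object each time. The sketch below contrasts reference comparison with equals(); the class and method names are illustrative only, and the second result assumes a JVM whose cache upper bound has not been raised.

```java
public class IntegerCacheDemo {
    // Reference comparison, as on the slide: true only when both boxes are the same object.
    static boolean sameObject(Integer x, Integer y) {
        return x == y;
    }

    // Value comparison: what the "use equals()" rule asks for.
    static boolean sameValue(Integer x, Integer y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        System.out.println(sameObject(100, 100)); // true: values in -128..127 are always cached
        System.out.println(sameObject(200, 200)); // false on a default JVM (cache bound is configurable)
        System.out.println(sameValue(200, 200));  // true
    }
}
```

A checker cannot flag this from syntax alone: it has to know that both operands of == have a reference type, which is exactly the kind of question a semantic model answers.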

GPS-TRAVERSE

Linking Syntactic and Semantic Models within a Transformation System

GPS-TRAVERSE

• GPS-Traverse

• enables contextual information to be transparently tracked during transformation.

• is a collection of transformations whose purpose is to associate terms with the contexts in which they are defined

• This association is based on:

• Structural properties

• Nested classes

• Local classes

• Anonymous classes

• Frame variables currently in scope

• Generic variables currently in scope

NESTED CLASSES

    package p1;

    class B1 {
        class B2 {
            class B3 {
                int x;
            }
        }
    }

FIELDS VERSUS LOCAL VARIABLES

    class B {
        int x = 1;

        void f() {
            {
                ... x ...
                int x = 2;
                ... x ...
            }
        }
    }
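The two occurrences of x on this slide denote different declarations even though they sit in the same block. A runnable version, with illustrative names that are not from the slides, makes the difference observable:

```java
public class ShadowingDemo {
    int x = 1; // field

    // Returns the value that each occurrence of the simple name x denotes.
    int[] f() {
        int before = x;        // no local x declared yet: resolves to the field
        int x = 2;             // local variable; shadows the field from here to the end of the block
        int after = x;         // resolves to the local variable
        int viaThis = this.x;  // the field is still reachable through this
        return new int[] { before, after, viaThis };
    }

    public static void main(String[] args) {
        int[] r = new ShadowingDemo().f();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 1 2 1
    }
}
```

Resolving a name therefore depends on the position of the reference relative to local declarations, not just on the enclosing class.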

GENERIC TYPES VERSUS STANDARD TYPES

    class C<T> {
        class T {
            T T; // field T of type <T>
        }
    }

IN SUMMARY…

• GPS-Traverse: term → context

• In turn, a tuple of the form (term, context) provides the basis for a variety of semantic analysis functions

• A particularly useful such analysis function is called resolution

RESOLUTION

• Resolution is a semantic analysis function that operates on terms denoting references

• The resolution function used by Java is highly complex and involves:

• Static evaluation

• Type analysis

• Overloading, overriding, shadowing

• Generic analysis

• Local analysis

• Visibility – public, protected, package private, private

• Subtyping

• Imports: single-type, on-demand, and static

USES OF RESOLUTION

• Resolution is a prerequisite for a variety of software-based analysis and manipulation activities such as:

• Bootstrapping semantic models

• Software metrics

• API usage analysis

• Refactoring

• Slicing

• Migration – a well-formed complement of slicing

• Join point recognition

• Resolution-informed transformation is well-suited for many of these activities

• And finally, resolution-informed transformation can also play a key role in the construction of semantic models of software such as the call graph of a software system

EXAMPLE: CALL GRAPH

    package p1;

    public class A extends C {
        class innerA extends B1 {
            void g(byte b) {
                f(b + 0);
                f(0);
            }
        }
    }

    class B1 extends B2 {
        private void f(int x) { }
    }

    class B2 {
        void f(long x) { }
        void f(short x) { }
    }

    class C {
        void f(int x) { }
    }
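To check which declarations the two calls to f actually reach, the example can be made observable. The variant below is a sketch, not the authors' code: the methods return a label instead of void, the package declaration is dropped, and the CallResolutionDemo driver is added; the class hierarchy is otherwise unchanged.

```java
class B2 {
    String f(long x)  { return "B2.f(long)"; }
    String f(short x) { return "B2.f(short)"; }
}

class B1 extends B2 {
    // Private, so it is not inherited by subclasses of B1.
    private String f(int x) { return "B1.f(int)"; }
}

class C {
    String f(int x) { return "C.f(int)"; }
}

class A extends C {
    class innerA extends B1 {
        String[] g(byte b) {
            // Both arguments have type int: b + 0 is promoted to int, and 0 is an int literal.
            return new String[] { f(b + 0), f(0) };
        }
    }
}

public class CallResolutionDemo {
    static String[] run() {
        A outer = new A();
        A.innerA inner = outer.new innerA();
        return inner.g((byte) 1);
    }

    public static void main(String[] args) {
        for (String target : run()) {
            System.out.println(target); // prints B2.f(long) twice
        }
    }
}
```

Both calls reach B2.f(long). The search starts in innerA because it inherits methods named f, so C.f(int) in the enclosing class is never considered; B1.f(int) is private and not inherited; and an int argument can be widened to long but not narrowed to short. A call-graph builder has to reproduce each of these steps to draw the right edge.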

TECHNICAL DETAILS

Bascinet, the TL System, and Sextant

BASCINET

• A Netbeans-based IDE supporting the development of TL programs

• Syntax-directed editors for TL, ML, and EBNF files

• Code-folding for both TL and ML

• Hyperlinks from MLton compiler output to ML source code

• Integrated with third-party visualization tools such as Cytoscape, GraphViz, and TreeMap

• Solves some key system-level problems:

• Discrete concurrent (forgetful) application of a transformation to a file hierarchy

{ transformation } x {file1, file2, …}

• Continuous sequential (stateful) application of a transformation to a file hierarchy

state1 = transformation( state0, file1)

state2 = transformation( state1, file2)
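The difference between the two application modes is essentially the difference between a map and a fold over the file list. The Java sketch below illustrates that reading; the method names and the use of strings as stand-ins for files are assumptions made for the example, not part of Bascinet.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ApplicationModes {
    // Discrete (forgetful): each file is transformed independently, so the work can run in parallel.
    static <F, R> List<R> applyDiscrete(Function<F, R> transformation, List<F> files) {
        return files.parallelStream().map(transformation).collect(Collectors.toList());
    }

    // Continuous (stateful): the state produced for one file is the input for the next.
    static <S, F> S applyContinuous(BiFunction<S, F, S> transformation, S state0, List<F> files) {
        S state = state0;
        for (F file : files) {
            state = transformation.apply(state, file);
        }
        return state;
    }

    public static void main(String[] args) {
        List<String> files = List.of("A.java", "B.java", "C.java");
        // Forgetful: one independent result per file.
        System.out.println(applyDiscrete(String::length, files)); // prints [6, 6, 6]
        // Stateful: a running total threaded through the files in order.
        System.out.println(applyContinuous((total, f) -> total + f.length(), 0, files)); // prints 18
    }
}
```

Per-file style checks fit the discrete mode, whereas whole-program metrics or a call graph need the continuous mode so that information gathered in one file is available when processing the next.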

THE TL SYSTEM

• Input: GLR Parser

• Output: Abstract Prettyprinter

• TL – A language for specifying higher-order transformation

• First-order matching on concrete syntax trees

• First-order and higher-order generic traversals

• Standard combinators plus special-purpose combinators

• Modular

• Partially type-checked

• ML – A functional programming language tightly integrated with TL

• Computation is expressed in terms of modules written in TL and ML.

TL

• The terms being manipulated are concrete syntax trees

• The computational unit is the conditional rewrite rule:

term_lhs → term_rhs if { condition }

• Rules (also called strategies) can be bound to identifiers:

r: term_lhs → term_rhs if { condition }

• Strategies can be constructed by composing rules using a variety of combinators:

r1 <+ r2

r1 <; r2

• Strategies can be applied to terms using traversals and iterators:

TDL myStrategy myTerm
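The slides do not spell out what the combinators mean, so the Java sketch below fixes one plausible reading purely for illustration: `<+` as left-biased choice, `<;` as sequential composition, and TDL as a top-down, left-to-right traversal that leaves a node unchanged where the strategy does not apply. The Term and Strategy types and all method names are hypothetical and are not part of TL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class CombinatorSketch {
    // Hypothetical term: a label plus ordered subterms.
    static final class Term {
        final String label;
        final List<Term> kids;
        Term(String label, List<Term> kids) { this.label = label; this.kids = kids; }
    }

    static Term leaf(String label) { return new Term(label, List.of()); }

    // A strategy either rewrites a term or reports that it does not apply.
    interface Strategy { Optional<Term> apply(Term t); }

    // Assumed reading of r1 <+ r2: try r1; fall back to r2 only if r1 does not apply.
    static Strategy choice(Strategy r1, Strategy r2) {
        return t -> {
            Optional<Term> r = r1.apply(t);
            return r.isPresent() ? r : r2.apply(t);
        };
    }

    // Assumed reading of r1 <; r2: apply r1, then apply r2 to the result.
    static Strategy seq(Strategy r1, Strategy r2) {
        return t -> r1.apply(t).flatMap(r2::apply);
    }

    // Assumed reading of TDL: rewrite the root first, then the subterms from left to right.
    static Term topDownLeft(Strategy s, Term t) {
        Term here = s.apply(t).orElse(t); // keep the node as-is where s does not apply
        List<Term> kids = new ArrayList<>();
        for (Term k : here.kids) kids.add(topDownLeft(s, k));
        return new Term(here.label, kids);
    }

    // Rule that relabels a node carrying the given label.
    static Strategy rename(String from, String to) {
        return t -> t.label.equals(from) ? Optional.of(new Term(to, t.kids)) : Optional.empty();
    }

    static String show(Term t) {
        if (t.kids.isEmpty()) return t.label;
        StringBuilder sb = new StringBuilder(t.label).append("(");
        for (int i = 0; i < t.kids.size(); i++) {
            if (i > 0) sb.append(",");
            sb.append(show(t.kids.get(i)));
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        Term term = new Term("a", List.of(leaf("b"), leaf("a")));
        Strategy either = choice(rename("a", "x"), rename("b", "y"));
        System.out.println(show(topDownLeft(either, term))); // prints x(y,x)
        Strategy both = seq(rename("a", "x"), rename("x", "z"));
        System.out.println(show(topDownLeft(both, term)));   // prints z(b,z)
    }
}
```

Under this reading, a strategy describes what to do at one node and the traversal decides where to do it, which is what lets the same rule be reused with different traversals.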

    import_closed GPS.Locator

    module CyclomaticComplexity

        strategy initialize: ...

        strategy outputResults: ...

        strategy collectMetrics:
            TDL ( GPS.Locator.enter <; ccAnalysis <; GPS.Locator.exit )

        strategy ccAnalysis: MethodCC <+ ConstructorCC

        strategy MethodCC: ...

        strategy ConstructorCC: ...

    end // module

GPS-TRAVERSE

• Transformationally maintains a semantic model which can be queried in a variety of ways:

• getContextKey

• getEnclosingContextKey

• currentContextType

• enclosingContextType

• withinContextType

• inMethod

• isGeneric

• isLocalGeneric

• isVar
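One way to picture how such queries can be answered is a stack of contexts that is pushed on entering a declaration and popped on leaving it. The Java sketch below is a hypothetical analogue, not the GPS-Traverse implementation: the dotted key format, the Kind enumeration, and the enter/exit signatures are assumptions, and only three of the listed queries are modelled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

public class ContextTracker {
    enum Kind { PACKAGE, CLASS, METHOD }

    private static final class Frame {
        final Kind kind;
        final String key;
        Frame(Kind kind, String key) { this.kind = kind; this.key = key; }
    }

    private final Deque<Frame> stack = new ArrayDeque<>();

    // Called when the traversal enters a declaration; the new key extends the enclosing key.
    void enter(Kind kind, String name) {
        String key = stack.isEmpty() ? name : stack.peek().key + "." + name;
        stack.push(new Frame(kind, key));
    }

    // Called when the traversal leaves the declaration.
    void exit() {
        stack.pop();
    }

    String getContextKey() {
        return stack.peek().key;
    }

    String getEnclosingContextKey() {
        Iterator<Frame> it = stack.iterator();
        it.next();            // skip the current context
        return it.next().key; // the context that encloses it
    }

    boolean inMethod() {
        return stack.stream().anyMatch(f -> f.kind == Kind.METHOD);
    }

    public static void main(String[] args) {
        ContextTracker gps = new ContextTracker();
        gps.enter(Kind.PACKAGE, "p1");
        gps.enter(Kind.CLASS, "B1");
        gps.enter(Kind.CLASS, "B2");
        gps.enter(Kind.CLASS, "B3");
        System.out.println(gps.getContextKey());          // prints p1.B1.B2.B3
        System.out.println(gps.getEnclosingContextKey()); // prints p1.B1.B2
        System.out.println(gps.inMethod());               // prints false
        gps.exit();
        System.out.println(gps.getContextKey());          // prints p1.B1.B2
    }
}
```

The driver replays the nested-classes example (package p1, classes B1, B2, B3), so the key printed first identifies the context in which the field x is declared.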

    strategy CallGraph:
        <SelectorOptExpression>_methodCall
        →
        <SelectorOptExpression>_methodCall
        if {
            isMethodCall <SelectorOptExpression>_methodCall
            andalso sml.GPS_inMethod()
            andalso <key>_methodContext = sml.GPS_getContextKey()

            // semantic query
            andalso <key>_calledMethod = sml.resolve( <key>_methodContext, <SelectorOptExpression>_methodCall )
            andalso sml.outputPP( <key>_methodContext )
            andalso sml.output(" calls ")
            andalso sml.outputPP( <key>_calledMethod )
        }

    strategy isMethodCall:
        // basic call
        SelectorOptExpression[:] <TypeArgsOpt>_1 <Id>_1 <Arguments>_1 [:]
        →
        SelectorOptExpression[:] <TypeArgsOpt>_1 <Id>_1 <Arguments>_1 [:]
        <+
        // embedded call
        ...

Questions?

THE END

