Refactoring Functional Programs
Simon Thompson
with
Huiqing Li
Claus Reinke
www.cs.kent.ac.uk/projects/refactor-fp
AFP04 2
Session 2
Overview
Review mini-project.
Implementation of HaRe.
Larger-scale examples.
Case study.
Mini-project feedback
Refactorings performed.
Refactorings and language features?
Machine support feasible? Useful?
‘Not-quite’ refactorings? Support possible here?
Examples
Argument permutations (NB partial application).
(Un)group arguments.
Slice function for a component of its result.
Error handling / exception handling.
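The note about partial application in the first item can be made concrete with a small sketch. The functions sub and sub' are hypothetical, invented here for illustration: a fully applied call only needs its arguments swapped, whereas a partial application has to be rewritten as a lambda (or with flip).

```haskell
-- Before the refactoring: subtract the second argument from the first.
sub :: Int -> Int -> Int
sub x y = x - y

-- After permuting the two arguments (hypothetical result).
sub' :: Int -> Int -> Int
sub' y x = x - y

-- A fully applied use: the arguments are simply swapped.
before1, after1 :: Int
before1 = sub 10 3
after1  = sub' 3 10

-- A partial application: swapping is not possible, so the missing
-- argument is made explicit.
before2, after2 :: [Int]
before2 = map (sub 10) [1, 2, 3]
after2  = map (\y -> sub' y 10) [1, 2, 3]   -- or: map (flip sub' 10)
```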
More examples
Introduce type synonym, selectively.
Introduce ‘branded’ type.
Modify the return type of a function from T to Maybe T, Either T S, [T].
Ditto for input types … and modify variable names correspondingly.
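A minimal sketch of the return-type change from T to Maybe T, and of the corresponding change at a use site. The names ages, lookupAge, lookupAgeM and describe are assumptions made for this example only.

```haskell
-- Hypothetical example data, invented for this sketch.
ages :: [(String, Int)]
ages = [("ann", 31), ("bob", 27)]

-- Before: return type T (here Int); fails at run time on a missing key.
lookupAge :: String -> Int
lookupAge name = head [ a | (n, a) <- ages, n == name ]

-- After: return type Maybe T; absence is visible in the type.
lookupAgeM :: String -> Maybe Int
lookupAgeM name = case [ a | (n, a) <- ages, n == name ] of
  (a : _) -> Just a
  []      -> Nothing

-- A client has to change correspondingly: every use site must now
-- say what happens in the Nothing case.
describe :: String -> String
describe name = maybe "unknown" show (lookupAgeM name)
```

This is why such a change is a candidate for tool support: the edit to the definition is small, but every client has to be visited.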
Implementing HaRe
Proof of concept …
To show proof of concept it is enough to:
build a stand-alone tool,
work with a subset of the language,
pretty print the results of refactorings.
… or a useful tool?
Integrate with existing program development tools: a stand-alone program that links to the editors emacs and vim; other IDEs are also possible.
Work with the complete language: Haskell 98?
Preserve the formatting and comments in the refactored source code.
Allow users to extend and script the system.
The refactorings in HaRe
Rename
Delete
Lift / Demote
Introduce definition
Remove definition
Unfold
Generalise
Add / remove params
All these refactorings are module aware.
Move def between modules
Delete / add to exports
Clean imports
Make imports explicit
Data type to ADT
The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
Information needed
Syntax: replace the function called sq, not the variable sq … parse tree.
Static semantics: replace this function sq, not all the sq functions … scope information.
Module information: what is the traffic between this module and its clients … call graph.
Type information: replace this identifier when it is used at this type … type annotations.
Infrastructure: decisions
Build a tool that can interoperate with emacs, vim, … yet act separately.
Leverage existing libraries for processing Haskell 98, for tree transformation … as few modifications as possible.
Be as portable as possible, in the Haskell space.
Abstract interface to compiler internals?
Haskell landscape (end 2002)
Parser: many
Type checker: few
Tree transformations: few
Difficulties
Haskell 98 vs. Haskell extensions.
Libraries: proof of concept vs. distributable.
Source code regeneration.
Real project
Programatica
Project at OGI to build a Haskell system …
… with integral support for verification at various levels: assertion, testing, proof etc.
The Programatica project has built a Haskell front end in Haskell, supporting syntax, static, type and module analysis …
… freely available under BSD licence.
The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
First steps … lifting and friends
Use the Haddock parser … full Haskell given in 500 lines of data type definitions.
Work by hand over the Haskell syntax: 27 cases for expressions …
Code for finding free variables, for instance …
Finding free variables ‘by hand’
instance FreeVbls HsExp where
  freeVbls (HsVar v) = [v]
  freeVbls (HsApp f e) = freeVbls f ++ freeVbls e
  freeVbls (HsLambda ps e) = freeVbls e \\ concatMap paramNames ps
  freeVbls (HsCase exp cases) = freeVbls exp ++ concatMap freeVbls cases
  freeVbls (HsTuple _ es) = concatMap freeVbls es
  … etc.
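The same hand-written style can be tried out on a much smaller language. The sketch below is self-contained and uses a hypothetical four-constructor type Exp in place of the Haddock syntax tree; it is not HaRe code. It adds nub so that each variable is reported once, which also keeps the list difference in the lambda case exact.

```haskell
import Data.List (nub, (\\))

-- A deliberately tiny expression type, standing in for the
-- 27 expression cases of full Haskell.
data Exp
  = Var String
  | App Exp Exp
  | Lam [String] Exp
  | Tuple [Exp]
  deriving (Show, Eq)

-- One clause per constructor: only the Lam clause does anything
-- interesting, the rest just collect results from the children.
freeVbls :: Exp -> [String]
freeVbls (Var v)    = [v]
freeVbls (App f e)  = nub (freeVbls f ++ freeVbls e)
freeVbls (Lam ps e) = freeVbls e \\ ps
freeVbls (Tuple es) = nub (concatMap freeVbls es)
```

Even at this size, three of the four clauses are pure plumbing; that ratio is what motivates generating the traversal.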
This approach
Boilerplate code … 1000 lines for 100 lines of significant code.
Error prone: significant code lost in the noise.
Want to generate the boilerplate and the tree traversals …
… DrIFT: Winstanley, Wallace
… Strafunski: Lämmel and Visser
Strafunski
Strafunski allows a user to write general (read generic), type safe, tree traversing programs, with ad hoc behaviour at particular points.
Top-down / bottom-up, type preserving / unifying, full / stop / one.
Strafunski in use
Traverse the tree accumulating free variables from components, except in the case of lambda abstraction, local scopes, …
Strafunski allows us to work within Haskell …
Other options? Generic Haskell, Template Haskell, AG, …
Rename an identifier
rename :: (Term t) => PName -> HsName -> t -> Maybe t
rename oldName newName = applyTP worker
  where
    worker = full_tdTP (idTP `adhocTP` idSite)
    idSite :: PName -> Maybe PName
    idSite v@(PN name orig)
      | v == oldName = return (PN newName orig)
    idSite pn = return pn
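For readers without Strafunski to hand, the shape of this traversal can be approximated using only Data.Data from base. This is a sketch under stated assumptions, not the HaRe implementation: the types PName and Exp are simplified stand-ins, adhoc plays the role of idTP combined with adhocTP, fullTopDown plays the role of full_tdTP, and the Maybe monad of the original is dropped.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, Typeable, cast, gmapT)
import Data.Maybe (fromMaybe)

-- An identifier paired with the location of its defining occurrence,
-- loosely modelled on the PN name orig pattern above.
data PName = PN String Int
  deriving (Show, Eq, Data)

data Exp
  = Var PName
  | App Exp Exp
  | Lam PName Exp
  deriving (Show, Eq, Data)

-- Lift a function on one type to a function on all types: it acts
-- as the identity at every other type.
adhoc :: (Typeable a, Typeable b) => (b -> b) -> a -> a
adhoc f = fromMaybe id (cast f)

-- Apply a generic transformation at every node, top-down.
fullTopDown :: Data a => (forall b. Data b => b -> b) -> a -> a
fullTopDown f x = gmapT (fullTopDown f) (f x)

-- Rename every occurrence of exactly this identifier (same name and
-- same defining location), leaving other identifiers alone.
rename :: Data t => PName -> String -> t -> t
rename oldName newName = fullTopDown (adhoc idSite)
  where
    idSite :: PName -> PName
    idSite v@(PN _ orig)
      | v == oldName = PN newName orig
    idSite pn = pn
```

The only type-specific code is idSite; an identifier that happens to be spelled the same but is defined elsewhere is left untouched because its location differs.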
The coding effort
Transformations: straightforward in Strafunski …
… the chore is implementing the conditions that guarantee the transformation preserves meaning.
This is where much of our code lies.
Move f from module A to B
Is f defined at the top-level of B?
Are the free variables in f accessible within module B?
Will the move require recursive modules?
Remove the definition of f from module A.
Add the definition to module B.
Modify the import/export lists in module A, B and the client modules of A and B if necessary.
Change uses of A.f to B.f or f in all affected modules.
Resolve ambiguity.
The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
  where sq :: Int->Int
        sq x = x ^ pow
        pow = 2 :: Int
main = sumSquares 10 20
Promote the definition of sq to top level
Program rendering example
module Main where
sumSquares x y = sq pow x + sq pow y where pow = 2 :: Int
sq :: Int->Int->Int
sq pow x = x ^ pow
main = sumSquares 10 20
Using a pretty printer: comments lost and layout quite different.
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
  where sq :: Int->Int
        sq x = x ^ pow
        pow = 2 :: Int
main = sumSquares 10 20
Promote the definition of sq to top level
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq pow x + sq pow y
  where pow = 2 :: Int
sq :: Int->Int->Int
sq pow x = x ^ pow
main = sumSquares 10 20
Layout and comments preserved.
Token stream and AST
White space and comments in the token stream.
Modification of the AST guides the modification of the token stream.
After a refactoring, the program source is extracted from the token stream not the AST.
Heuristics associate comments with program entities.
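The idea of rendering from the token stream can be sketched in a few lines. The Token type and the functions below are hypothetical and much simpler than what HaRe does: here the token stream is edited directly, whereas in HaRe the modification of the AST guides the edit. The point illustrated is only that white space and comments survive because they are never regenerated.

```haskell
-- A token stream that keeps everything a parser would discard.
data Token
  = Ident String
  | Symbol String
  | White String      -- spaces and newlines
  | Comment String    -- comment text, including the leading "--"
  deriving (Show, Eq)

-- Rendering from the token stream reproduces the source exactly.
render :: [Token] -> String
render = concatMap text
  where
    text (Ident s)   = s
    text (Symbol s)  = s
    text (White s)   = s
    text (Comment s) = s

-- A refactoring edits only the tokens it has to; layout and comments
-- are untouched.
renameToken :: String -> String -> [Token] -> [Token]
renameToken old new = map step
  where
    step (Ident s) | s == old = Ident new
    step t = t

example :: [Token]
example =
  [ Comment "-- square a number", White "\n"
  , Ident "sq", White " ", Ident "x", White " ", Symbol "=", White " "
  , Ident "x", White " ", Symbol "*", White " ", Ident "x", White "\n" ]
```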
Production tool
Programatica parser and type checker
Refactor using a Strafunski engine
Render code from the token stream and syntax tree.
Production tool (optimised)
Programatica parser and type checker
Refactor using a Strafunski engine
Render code from the token stream and syntax tree.
Pass lexical information to update the syntax tree and so avoid reparsing.
What have we learned?
Emerging Haskell libraries make it practical(?)
Efficiency and robustness:
• type checking large systems,
• linking,
• editor script languages (vim, emacs).
Limitations of editor interactions.
Reflections on Haskell itself.
Reflections on Haskell
Cannot hide items in an export list (cf import).
Field names for prelude types?
Scoped class instances not supported.
‘Ambiguity’ vs. name clash.
‘Tab’ is a nightmare!
Correspondence principle fails …
Correspondence
Operations on definitions and operations on expressions can be placed in one to one correspondence
(R.D.Tennent, 1980)
Correspondence
Definitions               Expressions
where                     let
f x y = e                 \x y -> e
f x | g1 = e1 | g2 = e2   f x = if g1 then e1 else if g2 … …
Function clauses
f x | g1 = e1
f x | g2 = e2
Can ‘fall through’ a function clause … no direct correspondence in the expression language.
f x = if g1 then e1 else if g2 …
No clauses for anonymous functions … no reason to omit them.
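A small sketch of the mismatch, using a hypothetical function classify. In the clausal definition the first clause can fall through when its guard fails; the anonymous function has a single clause, so the same behaviour has to be spelled out as nested conditionals.

```haskell
-- Definition style: if the guard in the first clause fails, matching
-- falls through to the clauses that follow.
classify :: Int -> String
classify n | n < 0 = "negative"
classify 0 = "zero"
classify n | even n    = "even"
           | otherwise = "odd"

-- Expression style: a lambda has a single clause, so the fall-through
-- becomes an explicit chain of conditionals.
classify' :: Int -> String
classify' = \n ->
  if n < 0 then "negative"
  else if n == 0 then "zero"
  else if even n then "even"
  else "odd"
```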
Work in progress
‘Fold’ against definitions … find duplicate code.
All, some or one? Effect on the interface …
f x = … e … e …
Traditional program transformations
• Short-cut fusion
• Warm fusion
Where next?
Opening up to users: API or little language?
Link with other IDEs (and front ends?).
Detecting ‘bad smells’.
More useful refactorings supported by us.
Working without source code.
API
Refactorings
Refactoring utilities
Strafunski
Haskell
DSL
Refactorings
Refactoring utilities
Strafunski
Haskell
Combining forms
Larger-scale examples
More complex examples in the functional domain; often link with data types.
Dawning realisation that some refactorings can be pretty powerful.
Bidirectional … no right answer.
Algebraic or abstract type?
data Tr a
  = Leaf a |
    Node a (Tr a) (Tr a)
Interface: Tr, Leaf, Node
flatten :: Tr a -> [a]
flatten (Leaf x) = [x]
flatten (Node _ s t)
  = flatten s ++
    flatten t
Algebraic or abstract type?
data Tr a
= Leaf a |
Node a (Tr a) (Tr a)
isLeaf = …
isNode = …
…
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
flatten :: Tr a -> [a]
flatten t
| isLeaf t = [leaf t]
| isNode t
= flatten (left t)
++ flatten (right t)
Algebraic or abstract type?
Pattern matching syntax is more direct …
… but can achieve a considerable amount with field names.
Other reasons? Simplicity (due to other refactoring steps?).
Allows changes in the implementation type without affecting the client: e.g. might memoise
Problematic with a primitive type as carrier.
Allows an invariant to be preserved.
Outside or inside?
data Tr a
= Leaf a |
Node a (Tr a) (Tr a)
isLeaf = …
isNode = …
…
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
flatten :: Tr a -> [a]
flatten t
| isLeaf t = [leaf t]
| isNode t
= flatten (left t)
++ flatten (right t)
Outside or inside?
data Tr a
= Leaf a |
Node a (Tr a) (Tr a)
isLeaf = …
isNode = …
flatten t = …
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode, flatten
Outside or inside?
If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten.
The more outside the better, therefore.
If inside can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type.
Layered types possible: put the utilities in a privileged zone.
Memoise flatten :: Tr a->[a]
data Tree a
= Leaf { val::a } |
Node { val::a, left,right::(Tree a) }
leaf = Leaf
node = Node
flatten (Leaf x) = [x]
flatten (Node x l r) =
(x : (flatten l ++ flatten r))
data Tree a
= Leaf { val::a, flatten:: [a] } |
Node { val::a, left,right::(Tree a), flatten::[a] }
leaf x = Leaf x [x]
node x l r = Node x l r (x : (flatten l ++ flatten r))
Memoise flatten
Invisible outside the implementation module, if tree type is already an ADT.
Field names in Haskell make it particularly straightforward.
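A runnable version of the memoised representation is sketched below. The function flattenSlow is a hypothetical addition that recomputes the list by pattern matching, included only to check that the cached field agrees with it. In an implementation module the export list would offer the type and leaf, node, val and flatten, but not the constructors, so clients could not build a tree with an inconsistent cache.

```haskell
-- Each node caches its own flattening in the flatten field.
data Tree a
  = Leaf { val :: a, flatten :: [a] }
  | Node { val :: a, left, right :: Tree a, flatten :: [a] }

-- Smart constructors: the only places where the cached list is built.
leaf :: a -> Tree a
leaf x = Leaf x [x]

node :: a -> Tree a -> Tree a -> Tree a
node x l r = Node x l r (x : (flatten l ++ flatten r))

-- Hypothetical reference version that ignores the cache.
flattenSlow :: Tree a -> [a]
flattenSlow (Leaf x _)     = [x]
flattenSlow (Node x l r _) = x : (flattenSlow l ++ flattenSlow r)
```

Because flatten is now a field selector, client code that calls flatten t is unchanged by the refactoring.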
Data type or existential type?
Data type:
data Shape
  = Circle Float |
    Rect Float Float
area :: Shape -> Float
area (Circle r) = pi*r^2
area (Rect h w) = h*w
perim :: Shape -> Float
perim (Circle r) = 2*pi*r
perim (Rect h w) = 2*(h+w)
Existential type:
data Shape
  = forall a. Sh a => Shape a
class Sh a where
  area :: a -> Float
  perim :: a -> Float
data Circle = Circle Float
instance Sh Circle where
  area (Circle r) = pi*r^2
  perim (Circle r) = 2*pi*r
data Rect = Rect Float Float
instance Sh Rect where
  area (Rect h w) = h*w
  perim (Rect h w) = 2*(h+w)
Constructor or constructor?
With a derived function:
data Expr
  = Epsilon | .... |
    Then Expr Expr |
    Star Expr
plus e = Then e (Star e)
With a constructor:
data Expr
  = Epsilon | .... |
    Then Expr Expr |
    Star Expr |
    Plus Expr
Monadification: expressions
data Expr
  = Lit Integer |      -- Literal integer value
    Vbl Var |          -- Assignable variables
    Add Expr Expr |    -- Expression addition: e1+e2
    Assign Var Expr    -- Assignment: x:=e
type Var = String
type Store = [ (Var, Integer) ]
lookup :: Store -> Var -> Integer
lookup st x = head [ i | (y,i) <- st, y==x ]
update :: Store -> Var -> Integer -> Store
update st x n = (x,n):st
Monadification: evaluation
Direct style:
eval :: Expr -> Store -> (Integer, Store)
eval (Lit n) st
  = (n,st)
eval (Vbl x) st
  = (lookup st x,st)
Monadic style:
evalST :: Expr -> State Store Integer
evalST (Lit n)
  = do return n
evalST (Vbl x)
  = do st <- get
       return (lookup st x)
Monadification: evaluation 2
Direct style:
eval :: Expr -> Store -> (Integer, Store)
eval (Add e1 e2) st
  = (v1+v2, st2)
    where
      (v1,st1) = eval e1 st
      (v2,st2) = eval e2 st1
eval (Assign x e) st
  = (v, update st' x v)
    where
      (v,st') = eval e st
Monadic style:
evalST :: Expr -> State Store Integer
evalST (Add e1 e2)
  = do v1 <- evalST e1
       v2 <- evalST e2
       return (v1+v2)
evalST (Assign x e)
  = do v <- evalST e
       st <- get
       put (update st x v)
       return v
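Putting the two evaluation slides together gives a program that can be run to check that the direct and monadic evaluators agree. The State type here is a minimal hand-written one, an assumption made so that the sketch depends only on base; a library state monad would normally be used instead.

```haskell
import Prelude hiding (lookup)

type Var = String
type Store = [(Var, Integer)]

data Expr = Lit Integer | Vbl Var | Add Expr Expr | Assign Var Expr

lookup :: Store -> Var -> Integer
lookup st x = head [ i | (y, i) <- st, y == x ]

update :: Store -> Var -> Integer -> Store
update st x n = (x, n) : st

-- A minimal state monad, written out so the sketch needs only base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- Direct style: the store is threaded through by hand.
eval :: Expr -> Store -> (Integer, Store)
eval (Lit n) st = (n, st)
eval (Vbl x) st = (lookup st x, st)
eval (Add e1 e2) st = (v1 + v2, st2)
  where (v1, st1) = eval e1 st
        (v2, st2) = eval e2 st1
eval (Assign x e) st = (v, update st' x v)
  where (v, st') = eval e st

-- Monadic style: the threading is done by the monad.
evalST :: Expr -> State Store Integer
evalST (Lit n) = return n
evalST (Vbl x) = do
  st <- get
  return (lookup st x)
evalST (Add e1 e2) = do
  v1 <- evalST e1
  v2 <- evalST e2
  return (v1 + v2)
evalST (Assign x e) = do
  v <- evalST e
  st <- get
  put (update st x v)
  return v
```

For any expression e and store st, runState (evalST e) st should equal eval e st; that equation is the sense in which monadification is a refactoring.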
Classes and instances
type Store = [(Var, Int)]
empty :: Store
empty = []
get :: Var -> Store -> Int
get v st = head [ i | (var,i) <- st, var==v]
set :: Var -> Int -> Store -> Store
set v i = ((v,i):)
Classes and instances
type Store = [(Var, Int)]
empty :: Store
get :: Var -> Store -> Int
set :: Var -> Int -> Store -> Store
empty = []
get v st = head [ i | (var,i) <- st, var==v]
set v i = ((v,i):)
Classes and instances
class Store a where
  empty :: a
  get :: Var -> a -> Int
  set :: Var -> Int -> a -> a
instance Store [(Var, Int)] where
  empty = []
  get v st = head [ i | (var,i) <- st, var==v]
  set v i = ((v,i):)
Need newtype wrapper in Haskell 98 …
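A sketch of the newtype wrapper, assuming the store is a list of (Var, Int) pairs as the definitions of get and set imply. The name ListStore is hypothetical. Haskell 98 requires an instance head to be a type constructor applied to distinct type variables, which a list of pairs is not, so the representation is wrapped.

```haskell
type Var = String

class Store a where
  empty :: a
  get   :: Var -> a -> Int
  set   :: Var -> Int -> a -> a

-- The wrapper gives the instance a plain type constructor as its head.
newtype ListStore = ListStore [(Var, Int)]

instance Store ListStore where
  empty = ListStore []
  get v (ListStore st) = head [ i | (var, i) <- st, var == v ]
  set v i (ListStore st) = ListStore ((v, i) : st)
```

The cost of the wrapper is the packing and unpacking in each method; the benefit is that other representations can be added as further instances without touching clients.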
Understanding a program
Take a working semantic tableau system written by an anonymous 2nd year student …
… refactor to understand its behaviour.
Nine stages of unequal size.
Reflections afterwards.
An example tableau
((AC)((AB)C))
((AB)C)(AC)
CA
(AB)C
A B
Make B True
Make A and C False
v1: Name types
Built-in types
[Prop]
[[Prop]]
used for branches and tableaux respectively.
Modify by adding
type Branch = [Prop]
type Tableau = [Branch]
Change required throughout the program.
Simple edit: but be aware of the order of substitutions: avoid
type Branch = Branch
v2: Rename functions
Existing names
tableaux
removeBranch
remove
become
tableauMain
removeDuplicateBranches
removeBranchDuplicates
and add comments clarifying the (intended) behaviour.
Add test datum.
Discovered some edits left undone in stage 1.
Use of the type checker to catch errors.
test will be useful later?
v3: Literate to normal script
Change from literate form:
Comment …
> tableauMain tab
>   = ...
to
-- Comment …
tableauMain tab = ...
Editing easier: implicit assumption was that it was a normal script.
Could make the switch completely automatic?
v4: Modify function definitions
From explicit recursion:
displayBranch
:: [Prop] -> String
displayBranch [] = []
displayBranch (x:xs)
= (show x) ++ "\n" ++
displayBranch xs
to
displayBranch
:: Branch -> String
displayBranch
= concat . map (++"\n") . map show
Abstraction: move from explicit list representation to operations such as map and concat which could be over any collection type.
First time round added incorrect (but type correct) redefinition … only spotted at next stage.
Version control: un/redo etc.
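An incorrect but type-correct redefinition of this kind is cheap to catch by keeping both versions side by side and comparing them on test data. The sketch below does that; it uses any Show-able element in place of Prop, which is an assumption since the Prop type is not shown, and the names displayRec and displayPipe are hypothetical.

```haskell
-- The original, explicitly recursive definition.
displayRec :: Show a => [a] -> String
displayRec []     = []
displayRec (x:xs) = show x ++ "\n" ++ displayRec xs

-- The refactored definition, built from list operations.
displayPipe :: Show a => [a] -> String
displayPipe = concat . map (++ "\n") . map show

-- A check that the two agree on a given input.
agreeOn :: Show a => [a] -> Bool
agreeOn xs = displayRec xs == displayPipe xs
```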
v5: Algorithms and types (1)
removeBranchDup :: Branch -> Branch
removeBranchDup [] = []
removeBranchDup (x:xs)
  | x == findProp x xs = [] ++ removeBranchDup xs
  | otherwise = [x] ++ removeBranchDup xs
findProp :: Prop -> Branch -> Prop
findProp z [] = FALSE
findProp z (x:xs)
  | z == x = x
  | otherwise = findProp z xs
v5: Algorithms and types (2)
removeBranchDup :: Branch -> Branch
removeBranchDup [] = []
removeBranchDup (x:xs)
  | findProp x xs = [] ++ removeBranchDup xs
  | otherwise = [x] ++ removeBranchDup xs
findProp :: Prop -> Branch -> Bool
findProp z [] = False
findProp z (x:xs)
  | z == x = True
  | otherwise = findProp z xs
v5: Algorithms and types (3)
removeBranchDup :: Branch -> Branch
removeBranchDup = nub
findProp :: Prop -> Branch -> Bool
findProp = elem
v5: Algorithms and types (4)
removeBranchDup :: Branch -> Branch
removeBranchDup = nub
Fails the test! Two duplicate branches output, with different ordering of elements.
The algorithm used is the 'other' nub algorithm, nubVar:
nub [1,2,0,2,1] = [1,2,0]
nubVar [1,2,0,2,1] = [0,2,1]
Code using lists in a particular order to represent sets.
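The difference between the two algorithms is easy to reproduce. The definition of nubVar below is a reconstruction from the behaviour of removeBranchDup in the earlier versions, not the definition from the case study: it drops an element whenever the same value occurs later, so the last occurrence survives, whereas the library nub keeps the first.

```haskell
import Data.List (nub)

-- Keep the last occurrence of each value (reconstructed; see lead-in).
nubVar :: Eq a => [a] -> [a]
nubVar [] = []
nubVar (x:xs)
  | x `elem` xs = nubVar xs
  | otherwise   = x : nubVar xs

-- The two functions return the same set of elements, possibly in a
-- different order, which matters when lists stand in for sets.
sameElements :: Eq a => [a] -> Bool
sameElements xs = all (`elem` nubVar xs) (nub xs)
               && length (nub xs) == length (nubVar xs)
```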
v6: Library function to module
Add the definition:
nubVar = …
to the module
ListAux.hs
and replace the definition by
import ListAux
Editing easier: implicit assumption was that it was a normal script.
Could make the switch completely automatic?
v7: Housekeeping
Renamings: including foo and bar and contra (becomes notContra).
An instance of filter, looseEmptyLists, is defined using filter, and subsequently inlined.
Put auxiliary function into a where clause.
Generally cleans up the script for the next onslaught.
v8: Algorithm (1)
splitNotNot :: Branch -> Tableau
splitNotNot ps = combine (removeNotNot ps) (solveNotNot ps)
removeNotNot :: Branch -> Branch
removeNotNot [] = []
removeNotNot ((NOT (NOT _)):ps) = ps
removeNotNot (p:ps) = p : removeNotNot ps
solveNotNot :: Branch -> Tableau
solveNotNot [] = [[]]
solveNotNot ((NOT (NOT p)):_) = [[p]]
solveNotNot (_:ps) = solveNotNot ps
v8: Algorithm (2)
splitXXX removeXXX solveXXX for each of nine rules.
The algorithm applies rules in a prescribed order, using an integer value to pass information between functions.
Aim: generic versions of split remove solve
Change order of rule application … effect on duplicates.
Add map sort to top level pipeline before duplicate removal.
v9: Replace lists by sets.
Wholesale replacement of lists by a Set library.
map becomes mapSet
foldr becomes foldSet (careful!)
filter becomes filterSet
The library exposes the representation: pick, flatten.
Use with discretion … further refactoring possible.
Library needed to be augmented with
primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
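One way a function of this type could be defined is sketched below over a list-backed Set. The representation, the helper elems and the example pairs are all assumptions; the Set library used in the case study is not shown. The step function receives an element, the rest of the set and the result for the rest, which is what distinguishes primitive recursion from a plain fold.

```haskell
-- Hypothetical representation: a list without duplicates.
newtype Set a = Set [a]

elems :: Set a -> [a]
elems (Set xs) = xs

-- Primitive recursion: unlike a fold, the step function also sees
-- the remainder of the set.
primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
primRecSet _ z (Set [])     = z
primRecSet f z (Set (x:xs)) = f x (Set xs) (primRecSet f z (Set xs))

-- Example use: pair each element with every element of the rest.
pairs :: Set a -> [(a, a)]
pairs = primRecSet (\x rest acc -> [ (x, y) | y <- elems rest ] ++ acc) []
```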
v9: Replace lists by sets (2)
Drastic simplification: no explicit worries about
… ordering (and equality), (removal of) duplicates.
Hard to test intermediate stages: type change is all or nothing …
… work with dummy definitions and the type checker.
Further opportunities: why choose one rule from a set when could apply to all elements at once? Gets away from picking on one value (and breaking the set interface).
Conclusions of the case study
Heterogeneous process: some small, some large.
Are all these stages strictly refactorings: some semantic changes always necessary too?
Importance of type checking for hand refactoring … and of testing when there are any semantic changes.
Undo, redo, reordering the refactorings … CVS.
In this case, directional … not always the case.
Teaching and learning design
Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.
Learning a language: learn how you could modify the programs that you have written …
… appreciate the design space, and
… the features of the language.
Conclusions
Refactoring + functional programming: good fit.
Real benefit from using available libraries … with work.
Want to use the tool in building itself.
Much more to do than we have time for.