Refactoring Functional Programs
Huiqing Li
Claus Reinke
Simon Thompson
Computing Lab, University of Kent
SBLP 2003 2
Refactoring
Refactoring means changing the design of program …
… without changing its behaviour.
Refactoring comes in many forms
• micro refactoring as a part of program development
• major refactoring as a preliminary to revision
• as a part of debugging, …
As programmers, we do it all the time.
Refactoring functional programs
• What is possible?
• What is different about functional programs?
• Building a usable tool vs. …
• … building a tool that will be used.
• Reflection on language design.
• Experience, demonstration, next steps.
Refactoring
Paper or presentation: moving sections about; amalgamating sections; moving inline code to a figure; animation; …
Proof: introduce lemma; remove, amalgamate hypotheses, …
Program: the topic of the lecture
Overview
Example refactorings
Refactoring functional programs
Generalities
Tooling: demo, rationale, design.
Catalogue of refactorings
Larger-scale examples … and a case study
Conclusions
Rename
f x y = …
Name may be too specific, if the function is a candidate for reuse.
findMaxVolume x y = …
Make the specific purpose of the function clearer.
Needs scope information: just change this f and not all fs (e.g. local definitions or variables).
Lift / demote
f x y = … h …
  where
    h = …
Hide a function which is clearly subsidiary to f; clear up the namespace.
f x y = … (h y) …
h y = …
Makes h accessible to the other functions in the module (and beyond?).
Needs free variable information: which of the parameters of f is used in the definition of h?
Need h not to be defined at the top level, … , DMR (the "dreaded" monomorphism restriction).
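A minimal sketch of the lift step, with hypothetical names (fLocal, fLifted, hLifted) and a made-up body for h, since the slide elides it. The point it illustrates is the free variable analysis: h uses the parameter y of f, so y has to become an explicit argument once h moves to the top level.

```haskell
-- Before: h is local to f and silently uses f's parameter y.
fLocal :: Int -> Int -> Int
fLocal x y = x + h
  where
    h = y * y

-- After lifting: y was free in h, so it becomes an explicit parameter,
-- and the use inside f passes it along.
fLifted :: Int -> Int -> Int
fLifted x y = x + hLifted y

hLifted :: Int -> Int
hLifted y = y * y
```

Demoting is the same step read from bottom to top: hLifted moves back under f and its parameter disappears again.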
Introduce and use a type definition
f :: Int -> Char
g :: Int -> Int
…
Reuse supported (a synonym is transparent, but can be misleading).
type Length = Int
f :: Length -> Char
g :: Int -> Length
Clearer specification of the purpose of f,g. (Morally) can only apply to lengths.
Avoid name clashes
Problem with instance declarations (Haskell specific).
Introduce and use branded type
f :: Int -> Char
g :: Int -> Int
…
Reuse supported, but lose the clarity of specification.
data Length
  = Length {length::Int}
f :: Length -> Char
g :: Int -> Length
Can only apply to lengths.
Needs function call information: where are (these definitions of) f and g called?
• Change the calls of f … and the call sites of g.
Choice of data and newtype (Haskell specific).
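A small sketch of the branded type, under stated assumptions: it uses newtype rather than data, names the field getLength (not the slide's length, to avoid clashing with the Prelude), and invents bodies for f and g, which the slide leaves out. The helpers callF and callG are hypothetical call sites showing the wrapping and unwrapping that the refactoring has to insert.

```haskell
-- Branded type: a distinct type, unlike a transparent synonym.
newtype Length = Length { getLength :: Int }
  deriving (Show, Eq)

-- Hypothetical bodies; only the signatures come from the slide.
f :: Length -> Char
f l = if getLength l > 0 then '+' else '0'

g :: Int -> Length
g n = Length (abs n)

-- Call sites change too: arguments of f are wrapped ...
callF :: Int -> Char
callF n = f (Length n)

-- ... and results of g are unwrapped where an Int is needed.
callG :: Int -> Int
callG n = getLength (g n)
```

With the synonym version, `f 3` type checks even when 3 is not meant as a length; with the branded type it is rejected until the caller writes `f (Length 3)`.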
Lessons from the first examples
Changes are not limited to a single point or even a single module: diffuse and bureaucratic …
… unlike traditional program transformation.
Many refactorings bidirectional …
… there is no single correct design.
Refactoring functional programs
Semantics: can articulate preconditions and …
… verify transformations.
Absence of side effects makes big changes predictable and verifiable …
… unlike OO.
XP is second nature to a functional programmer.
Language support: expressive type system, abstraction mechanisms, HOFs, …
Composing refactorings
Interesting refactorings can be built from simple components …
… each of which looks trivial in its own right.
A set of examples …
… which we have implemented.
Example program
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: [String] -> String
    table = concat . format
Examples
Lift definitions from local to global
Demote a definition before lifting its container
Lift a definition with dependencies
Example 1
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: [String] -> String
    table = concat . format
Example 1 lift
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: [String] -> String
    table = concat . format
Example 1
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    table :: [String] -> String
    table = concat . format
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
Example 1 lift
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    table :: [String] -> String
    table = concat . format
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
Example 1
showAll :: Show a => [a] -> String
showAll = table . map show
table :: [String] -> String
table = concat . format
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
Example 2
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: [String] -> String
    table = concat . format
Example 2 demote
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: [String] -> String
    table = concat . format
Example 2
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    table :: [String] -> String
    table = concat . format
      where
        format :: [String] -> [String]
        format [] = []
        format [x] = [x]
        format (x:xs) = (x ++ "\n") : format xs
Example 2 lift
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    table :: [String] -> String
    table = concat . format
      where
        format :: [String] -> [String]
        format [] = []
        format [x] = [x]
        format (x:xs) = (x ++ "\n") : format xs
Example 2
showAll :: Show a => [a] -> String
showAll = table . map show
table :: [String] -> String
table = concat . format
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
Example 2 lift
showAll :: Show a => [a] -> String
showAll = table . map show
table :: [String] -> String
table = concat . format
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
Example 2
showAll :: Show a => [a] -> String
showAll = table . map show
table :: [String] -> String
table = concat . format
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
Example 3
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: [String] -> String
    table = concat . format
Example 3 lift with dependencies
showAll :: Show a => [a] -> String
showAll = table . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: [String] -> String
    table = concat . format
Example 3
showAll :: Show a => [a] -> String
showAll = table format . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: ([String] -> [String]) -> [String] -> String
    table format = concat . format
Example 3 rename
showAll :: Show a => [a] -> String
showAll = table format . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: ([String] -> [String]) -> [String] -> String
    table format = concat . format
Example 3
showAll :: Show a => [a] -> String
showAll = table format . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: ([String] -> [String]) -> [String] -> String
    table fmt = concat . fmt
Example 3 lift
showAll :: Show a => [a] -> String
showAll = table format . map show
  where
    format :: [String] -> [String]
    format [] = []
    format [x] = [x]
    format (x:xs) = (x ++ "\n") : format xs
    table :: ([String] -> [String]) -> [String] -> String
    table fmt = concat . fmt
Example 3
showAll :: Show a => [a] -> String
showAll = table format . map show
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
table :: ([String] -> [String]) -> [String] -> String
table fmt = concat . fmt
Example 3 unfold/inline
showAll :: Show a => [a] -> String
showAll = table format . map show
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
table :: ([String] -> [String]) -> [String] -> String
table fmt = concat . fmt
Example 3
showAll :: Show a => [a] -> String
showAll = (concat . format) . map show
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
table :: ([String] -> [String]) -> [String] -> String
table fmt = concat . fmt
Example 3 delete
showAll :: Show a => [a] -> String
showAll = (concat . format) . map show
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
table :: ([String] -> [String]) -> [String] -> String
table fmt = concat . fmt
Example 3
showAll :: Show a => [a] -> String
showAll = (concat . format) . map show
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
Example 3 new definition
showAll :: Show a => [a] -> String
showAll = (concat . format) . map show
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
Name? table
Example 3
showAll :: Show a => [a] -> String
showAll = table . map show
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
table :: [String] -> String
table = concat . format
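The claim that each step preserves behaviour can be checked on a small input. The sketch below puts the starting point (renamed showAllNested here, a hypothetical name, so that both versions fit in one module) next to the fully lifted end point and compares their results.

```haskell
-- Starting point: format and table are local to the function.
showAllNested :: Show a => [a] -> String
showAllNested = table . map show
  where
    format :: [String] -> [String]
    format []     = []
    format [x]    = [x]
    format (x:xs) = (x ++ "\n") : format xs

    table :: [String] -> String
    table = concat . format

-- End point: every definition is at the top level.
showAll :: Show a => [a] -> String
showAll = table . map show

format :: [String] -> [String]
format []     = []
format [x]    = [x]
format (x:xs) = (x ++ "\n") : format xs

table :: [String] -> String
table = concat . format

-- The two versions agree on this sample input.
agree :: Bool
agree = showAllNested [1, 2, 3 :: Int] == showAll [1, 2, 3 :: Int]
```

A test like agree is only evidence, not proof; the preconditions on each refactoring are what justify the equivalence in general.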
Beyond the text editor
All the refactorings can – in principle – be implemented using a text editor, but this is
• tedious,
• error-prone,
• difficult to reverse, …
With machine support refactoring becomes
• low-cost: easy to do and to undo,
• reliable,
• a full part of the programmer's repertoire.
Information needed
Syntax: replace the function called sq, not the variable sq …
… parse tree.
Static semantics: replace this function sq, not all the sq functions …
… scope information.
Module information: what is the traffic between this module and its clients …
… call graph.
Type information: replace this identifier when it is used at this type …
… type annotations.
Machine support invaluable
Current practice: editor + type checker (+ tests).
Our project: automated support for a repertoire of refactorings …
… integrated into the existing development process: tools such as vim and emacs.
Demonstration of the tool, hosted in vim.
Proof of concept …
To show proof of concept it is enough to:
• build a stand-alone tool,
• work with a subset of the language,
• ‘pretty print’ the refactored source code in a standard format.
… or a useful tool?
To make a tool that will be used we must:
• integrate with existing program development tools: the program editors emacs and vim.
• work with the complete Haskell 98 language,
• preserve the formatting and comments in the refactored source code.
Consequences
To achieve this we chose to:
• build a tool that can interoperate with emacs, vim, … yet act separately.
• leverage existing libraries for processing Haskell 98, for tree transformation, yet …
… modify them as little as possible.
• be as portable as possible, in the Haskell space.
The Haskell background
Libraries
• parser: many
• type checker: few
• tree transformations: few
Difficulties
• Haskell98 vs. Haskell extensions.
• Libraries: proof of concept vs. distributable.
• Source code regeneration.
• Real project
First steps … lifting and friends
Use the Haddock parser … full Haskell given in 500 lines of data type definitions.
Work by hand over the Haskell syntax: 27 cases for expressions …
Code for finding free variables, for instance …
Finding free variables … 100 lines
instance FreeVbls HsExp where
  freeVbls (HsVar v) = [v]
  freeVbls (HsApp f e)
    = freeVbls f ++ freeVbls e
  freeVbls (HsLambda ps e)
    = freeVbls e \\ concatMap paramNames ps
  freeVbls (HsCase exp cases)
    = freeVbls exp ++ concatMap freeVbls cases
  freeVbls (HsTuple _ es)
    = concatMap freeVbls es
… etc.
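The same idea in a self-contained form, as a sketch only: Exp is a hypothetical four-constructor stand-in for the parser's HsExp, and parameters are plain names rather than patterns. The sketch removes bound names with filter instead of `\\`, because `\\` from Data.List deletes only the first occurrence of each name, which matters when a variable is used more than once.

```haskell
import Data.List (nub)

-- Toy expression type (an assumption, not the Haddock syntax tree).
data Exp
  = Var String
  | App Exp Exp
  | Lam [String] Exp
  | Tuple [Exp]

-- One hand-written case per constructor: this is the boilerplate
-- that grows with the size of the real syntax.
freeVbls :: Exp -> [String]
freeVbls (Var v)    = [v]
freeVbls (App f e)  = nub (freeVbls f ++ freeVbls e)
freeVbls (Lam ps e) = filter (`notElem` ps) (freeVbls e)
freeVbls (Tuple es) = nub (concatMap freeVbls es)
```

For example, `freeVbls (Lam ["x"] (App (Var "x") (Var "y")))` yields `["y"]`.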
This approach
Boiler plate code …
… 1000 lines for 100 lines of significant code.
Error prone: significant code lost in the noise.
Want to generate the boiler plate and the tree traversals …
… DrIFT: Winstanley, Wallace
… Strafunski: Lämmel and Visser
Strafunski
Strafunski allows a user to write general (read generic) tree traversing programs …
… with ad hoc behaviour at particular points.
Traverse through the tree accumulating free variables from component parts, except in the case of lambda abstraction, local scopes, …
Strafunski allows us to work within Haskell … other options are under development.
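The style of traversal described here can be sketched without Strafunski itself. The block below is an illustration using Data.Data from GHC's base library as a stand-in, not the Strafunski API; Expr and freeVars are hypothetical names. Only the two interesting constructors get their own case, and one generic default handles every other node.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, gmapQ)
import Data.Typeable (cast)

-- Toy syntax tree (an assumption for this sketch).
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr
  | Tuple [Expr]
  | Lit Int
  deriving (Show, Data)

-- Generic traversal with ad hoc behaviour at particular points:
-- variables contribute themselves, lambdas remove their binder,
-- and any other node just collects from its immediate children.
freeVars :: Data d => d -> [String]
freeVars d = case cast d :: Maybe Expr of
  Just (Var v)   -> [v]
  Just (Lam x b) -> filter (/= x) (freeVars b)
  _              -> concat (gmapQ freeVars d)
```

Adding a constructor such as `If Expr Expr Expr` to Expr would need no change to freeVars, which is the saving over the hand-written instance.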
Production tool (version 0)
Programatica parser and type checker
→ Refactor using a Strafunski engine
→ Pretty print from the augmented Programatica syntax tree
Production tool (version 1)
Programatica parser and type checker
→ Refactor using a Strafunski engine
→ Pretty print from the augmented Programatica syntax tree
Pass lexical information to update the syntax tree and so avoid reparsing
Experience so far
We can do it … but …
• efficiency
• formalising static semantics
• change management (CVS etc.)
• user interface
• interface to other tools
• problems of getting code to work
• different systems working together
• clash of instance: global problem
• Haskell in the large (e.g. 20 minute link time)
Catalogue of refactorings
• name (a phrase)
• label (a word)
• description
• left-hand code
• right-hand code
• comments
  • l to r
  • r to l
  • general
• primitive / composed
• cross-references
  • internal
  • external (Fowler)
• category (just one) or … classifiers (keywords)
• language
  • specific (Haskell, ML etc.)
  • feature (lazy etc.)
• conditions
  • left / right
  • analysis required (e.g. names, types, semantic info.)
  • which equivalence?
• version info
  • date added
  • revision number
Preconditions
Preconditions: renaming
The existing binding structure must not be affected.
No binding for the new name may exist in the same binding group.
No binding for the new name may intervene between the binding of the old name and any of its uses …
… as the renamed identifier would be captured by the renaming.
Conversely, the binding to be renamed must not intervene between bindings and uses of the new name.
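A small illustration of why these conditions matter, using hypothetical functions double, bump and resize. Renaming the top-level double to bump breaks the "no intervening binding" condition: a local bump sits between the new top-level binding and its use, so the use is captured and the program still compiles but computes something else.

```haskell
-- Before renaming.
double :: Int -> Int
double n = n * 2

resize :: Int -> Int
resize x = double x + bump 1
  where
    bump n = n + 10

-- After a careless rename of double to bump (precondition ignored).
bump :: Int -> Int
bump n = n * 2

resizeCaptured :: Int -> Int
resizeCaptured x = bump x + bump 1   -- both uses now mean the local bump
  where
    bump n = n + 10
```

Here `resize 3` is 17 while `resizeCaptured 3` is 24, and the type checker raises no error, which is why the tool has to check the binding structure itself.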
Preconditions: lifting
• Widening the scope of the binding must not capture independent uses of the name in the outer scope.
• There should be no existing definition of the name in the outer binding group (irrespective of whether or not it is used).
• The binding to be promoted must not make use of bindings in the inner scope. Instead lambda lift over these; extra conditions apply:
• The binding must be a simple binding of a function or constant, not a pattern.
• Any argument must not be used polymorphically.
Larger-scale examples
More complex examples in the functional domain; often link with data types.
Dawning realisation that some refactorings can be pretty powerful.
Bidirectional … no right answer.
Algebraic or abstract type?
data Tr a
  = Leaf a |
    Node a (Tr a) (Tr a)
Interface: Tr, Leaf, Node
flatten :: Tr a -> [a]
flatten (Leaf x) = [x]
flatten (Node _ s t)
  = flatten s ++
    flatten t
Algebraic or abstract type?
data Tr a
  = Leaf a |
    Node a (Tr a) (Tr a)
isLeaf = …
isNode = …
…
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
flatten :: Tr a -> [a]
flatten t
  | isLeaf t = [leaf t]
  | isNode t
    = flatten (left t)
      ++ flatten (right t)
Algebraic or abstract type?
Pattern matching syntax is more direct …
… but can achieve a considerable amount with field names.
Other reasons? Simplicity (due to other refactoring steps?).
Allows changes in the implementation type without affecting the client: e.g. might memoise
Problematic with a primitive type as carrier.
Allows an invariant to be preserved.
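A runnable sketch of the abstract presentation. The selector and constructor-function bodies are assumptions, since the slides elide them; the names follow the interface listed on the slide. In a real module the header would export the type name without its constructors, shown here only as a comment so that the block stays a single file.

```haskell
-- Intended header (comment only in this sketch):
--   module Tree (Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode) where

data Tr a
  = Leaf a
  | Node a (Tr a) (Tr a)

isLeaf, isNode :: Tr a -> Bool
isLeaf (Leaf _) = True
isLeaf _        = False
isNode = not . isLeaf

leaf :: Tr a -> a
leaf (Leaf x) = x
leaf _        = error "leaf: not a Leaf"

left, right :: Tr a -> Tr a
left  (Node _ l _) = l
left  _            = error "left: not a Node"
right (Node _ _ r) = r
right _            = error "right: not a Node"

mkLeaf :: a -> Tr a
mkLeaf = Leaf

mkNode :: a -> Tr a -> Tr a -> Tr a
mkNode = Node

-- Client code: written against the interface only, no pattern matching.
flatten :: Tr a -> [a]
flatten t
  | isLeaf t  = [leaf t]
  | otherwise = flatten (left t) ++ flatten (right t)
```

Because flatten never mentions Leaf or Node, the representation behind Tr could change (for instance to cache a size) without touching it.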
Outside or inside?
data Tr a
  = Leaf a |
    Node a (Tr a) (Tr a)
isLeaf = …
…
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
flatten :: Tr a -> [a]
flatten t
  | isLeaf t = [leaf t]
  | isNode t
    = flatten (left t)
      ++ flatten (right t)
Outside or inside?
data Tr a
  = Leaf a |
    Node a (Tr a) (Tr a)
isLeaf = …
…
flatten = …
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode, flatten
Outside or inside?
If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten.
The more outside the better, therefore.
If inside can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type.
Layered types possible: put the utilities in a privileged zone.
Replace function by constructor
data Expr = Star Expr |
            Then Expr Expr | …
plus e = Then e (Star e)
plus is just syntactic sugar; reduce the number of cases in definitions.
[Character range is a better example.]
data Expr = Star Expr |
            Plus Expr |
            Then Expr Expr | …
Can treat Plus differently, e.g. literals (Plus e) = literals e
but require each function over Expr to have a Plus clause.
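Both designs side by side, as a sketch: the types are suffixed A and B so that they fit in one module, and the Lit constructor and the full definition of literals are assumptions filling in what the slide elides. It shows the trade-off stated above: with a constructor, literals can treat Plus specially, at the price of an extra clause.

```haskell
-- Design A: plus is a function, i.e. sugar over Then and Star.
data ExprA = LitA Char | StarA ExprA | ThenA ExprA ExprA

plusA :: ExprA -> ExprA
plusA e = ThenA e (StarA e)

literalsA :: ExprA -> [Char]
literalsA (LitA c)    = [c]
literalsA (StarA e)   = literalsA e
literalsA (ThenA a b) = literalsA a ++ literalsA b

-- Design B: Plus is a constructor, so each function needs a Plus clause.
data ExprB = LitB Char | StarB ExprB | PlusB ExprB | ThenB ExprB ExprB

literalsB :: ExprB -> [Char]
literalsB (LitB c)    = [c]
literalsB (StarB e)   = literalsB e
literalsB (PlusB e)   = literalsB e   -- the special treatment from the slide
literalsB (ThenB a b) = literalsB a ++ literalsB b
```

In design A the expansion of plus mentions its argument twice, so `literalsA (plusA (LitA 'a'))` is `"aa"`, whereas `literalsB (PlusB (LitB 'a'))` is `"a"`.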
Other examples ...
Modify the return type of a function from T to Maybe T, Either T T' or [T].
Would be nice to have field names in Prelude types.
Add an argument; (un)group arguments; reorder arguments.
Move to monadic presentation: important case study.
Flat or layered datatypes (Expr: add BinOp type).
Various possibilities for error handling/exceptions.
… Tableau case study.
Change of user interface
Refactor the existing text-based application …
… so that it can have textual or graphical user interface.
Changing functionality?
The aim is not to change functionality …
… or at least not required functionality.
What level of behaviour is visible?
May change incidental properties …
… cf legacy systems: preserve their essential properties but not their accidental ones.
Other uses of refactoring
Understand someone else’s code …
… make it your own.
Learning a language: learn how you could modify the programs that you have written …
… appreciate the design space.
Conclusions
Refactoring + functional programming: good fit.
Stresses the type system: generic traversal …
Practical tool … not ‘yet another type tweak’.
Leverage from available libraries … with work.
We are eager to use the tool in building itself!
Understanding: semantic tableaux
Take a working semantic tableau system written by an anonymous 2nd year student …
… refactor to understand its behaviour.
Nine stages of unequal size.
Reflections afterwards.
An example tableau
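[Figure: a semantic tableau for a formula over the propositions A, B and C; the logical connectives did not survive extraction. Annotations on the open branches: Make B True; Make A and C False.]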
v1: Name types
Built-in types
[Prop]
[[Prop]]
used for branches and tableaux respectively.
Modify by adding
type Branch = [Prop]
type Tableau = [Branch]
Change required throughout the program.
Simple edit: but be aware of the order of substitutions: avoid
type Branch = Branch
v2: Rename functions
Existing names
tableaux
removeBranch
remove
become
tableauMain
removeDuplicateBranches
removeBranchDuplicates
and add comments clarifying the (intended) behaviour.
Add test datum.
Discovered some edits undone in stage 1.
Use of the type checker to catch errors.
test will be useful later?
v3: Literate normal script
Change from literate form:
Comment …
> tableauMain tab
>   = ...
to
-- Comment …
tableauMain tab
  = ...
Editing easier: implicit assumption was that it was a normal script.
Could make the switch completely automatic?
v4: Modify function definitions
From explicit recursion:
displayBranch
  :: [Prop] -> String
displayBranch [] = []
displayBranch (x:xs)
  = (show x) ++ "\n" ++
    displayBranch xs
to
displayBranch
  :: Branch -> String
displayBranch
  = concat . map (++"\n") . map show
More abstract … move somewhat away from the list representation to operations such as map and concat which could appear in the interface to any collection type.
First time round added incorrect (but type correct) redefinition … only spotted at next stage.
Version control: undo, redo, merge, … ?
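A quick way to guard against the kind of slip mentioned above is to keep both versions for a while and compare them on sample data. In this sketch the functions are generalised to any Show instance (an assumption, since Prop is not defined in these slides) and given distinct hypothetical names.

```haskell
-- Explicit recursion, as in the original script.
displayBranchRec :: Show a => [a] -> String
displayBranchRec []     = []
displayBranchRec (x:xs) = show x ++ "\n" ++ displayBranchRec xs

-- Pipeline of library functions, as in the refactored script.
displayBranchPipe :: Show a => [a] -> String
displayBranchPipe = concat . map (++ "\n") . map show

-- Compare the two on one input; True means they agree there.
sameOn :: Show a => [a] -> Bool
sameOn xs = displayBranchRec xs == displayBranchPipe xs
```

Agreement on a few inputs does not prove equivalence, but it would have caught a type-correct yet wrong redefinition before the next stage.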
v5: Algorithms and types (1)
removeBranchDup :: Branch -> Branch
removeBranchDup [] = []
removeBranchDup (x:xs)
  | x == findProp x xs = [] ++ removeBranchDup xs
  | otherwise = [x] ++ removeBranchDup xs
findProp :: Prop -> Branch -> Prop
findProp z [] = FALSE
findProp z (x:xs)
  | z == x = x
  | otherwise = findProp z xs
v5: Algorithms and types (2)
removeBranchDup :: Branch -> Branch
removeBranchDup [] = []
removeBranchDup (x:xs)
  | findProp x xs = [] ++ removeBranchDup xs
  | otherwise = [x] ++ removeBranchDup xs
findProp :: Prop -> Branch -> Bool
findProp z [] = False
findProp z (x:xs)
  | z == x = True
  | otherwise = findProp z xs
v5: Algorithms and types (3)
removeBranchDup :: Branch -> Branch
removeBranchDup = nub
findProp :: Prop -> Branch -> Bool
findProp = elem
v5: Algorithms and types (4)
removeBranchDup :: Branch -> Branch
removeBranchDup = nub
Fails the test! Two duplicate branches output, with different ordering of elements.
The algorithm used is the 'other' nub algorithm, nubVar:
nub [1,2,0,2,1] = [1,2,0]
nubVar [1,2,0,2,1] = [0,2,1]
The code is dependent on using lists in a particular order to represent sets.
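The difference can be reproduced directly. The definition of nubVar below is a sketch reconstructed from the explicit recursion in stage v5 (2), generalised to any Eq type; it is an assumption that this matches the student's intent, but it does give the results quoted above. It keeps the last occurrence of each element, whereas the library nub keeps the first.

```haskell
import Data.List (nub)

-- Drop an element whenever it occurs again later in the list,
-- so the last occurrence of each value survives.
nubVar :: Eq a => [a] -> [a]
nubVar []     = []
nubVar (x:xs)
  | x `elem` xs = nubVar xs
  | otherwise   = x : nubVar xs

-- The two functions side by side on the slide's example.
example :: ([Int], [Int])
example = (nub [1, 2, 0, 2, 1], nubVar [1, 2, 0, 2, 1])
```

As sets the two results are equal; they differ only as lists, which is exactly the dependency on ordering that the test exposed.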
v6: Library function to module
Add the definition:
nubVar = …
to the module
ListAux.hs
and replace the definition by
import ListAux
Editing easier: implicit assumption was that it was a normal script.
Could make the switch completely automatic?
v7: Housekeeping
Renamings: including foo and bar and contra (becomes notContra).
An instance of filter, looseEmptyLists, is defined using filter, and subsequently inlined.
Put auxiliary function into a where clause.
Generally cleans up the script for the next onslaught.
v8: Algorithm (1)
splitNotNot :: Branch -> Tableau
splitNotNot ps = combine (removeNotNot ps) (solveNotNot ps)
removeNotNot :: Branch -> Branch
removeNotNot [] = []
removeNotNot ((NOT (NOT _)):ps) = ps
removeNotNot (p:ps) = p : removeNotNot ps
solveNotNot :: Branch -> Tableau
solveNotNot [] = [[]]
solveNotNot ((NOT (NOT p)):_) = [[p]]
solveNotNot (_:ps) = solveNotNot ps
v8: Algorithm (2)
splitXXX, removeXXX, solveXXX
are present for each of nine rules.
The algorithm applies rules in a prescribed order, using an integer value to pass information between functions.
Aim: generic versions of split, remove, solve
Have to change order of rule application …
… which has a further effect on duplicates.
Add map sort to top level pipeline prior to duplicate removal.
v9: Replace lists by sets.
Wholesale replacement of lists by a Set library.
map → mapSet
foldr → foldSet (careful!)
filter → filterSet
The library exposes the representation: pick, flatten. Use with discretion … further refactoring possible.
Library needed to be augmented with
primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
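One way such a function could look, as a sketch only: the Set library used in the case study is not shown in the slides, so the list-backed newtype below is a hypothetical representation, and only the type of primRecSet comes from the slide. The combining function receives the current element, the rest of the set, and the result for the rest, which is what distinguishes primitive recursion from a plain fold.

```haskell
-- Hypothetical list-backed representation of sets.
newtype Set a = Set [a]

-- Primitive recursion: f sees the element, the remaining set,
-- and the recursive result for that remaining set.
primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
primRecSet _ z (Set [])     = z
primRecSet f z (Set (x:xs)) = f x (Set xs) (primRecSet f z (Set xs))

-- A fold is the special case that ignores the remaining set.
sizeSet :: Set a -> Int
sizeSet = primRecSet (\_ _ n -> n + 1) 0

-- A use that needs the remaining set: pair each element with
-- the number of elements still to come.
withRemaining :: Set a -> [(a, Int)]
withRemaining = primRecSet (\x rest acc -> (x, sizeSet rest) : acc) []
```

For instance, `withRemaining (Set "abc")` gives `[('a',2),('b',1),('c',0)]` under this representation.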
v9: Replace lists by sets (2)
Drastic simplification: no need for explicit worries about … ordering and its effect on equality, … (removal of) duplicates.
Difficult to test whilst in intermediate stages: the change in a type is all or nothing …
… work with dummy definitions and the type checker.
Further opportunities: … why choose one rule from a set when we could apply it to all elements at once? Gets away from picking on one value (and breaking the set interface).
Conclusions of the case study
Heterogeneous process: some small, some large.
Are all these stages strictly refactorings: some semantic changes always necessary too?
Importance of type checking for hand refactoring …
… and testing when there are any semantic changes.
Undo, redo, reordering the refactorings … CVS.
In this case, directional … not always the case.