Type-driven testing in Haskell slides

Post on 10-Apr-2015



Simon Peyton Jones talks about QuickCheck and SmallCheck.


Purely testing

Simon Peyton Jones

Microsoft Research



1. Over the next 10 years, the software battleground will be the control of effects

2. To succeed, we must shift programming perspective from imperative-by-default to functional-by-default

3. A concrete example: testing

o Functional programs are far easier to test

o A functional language is a fantastic test generation language
c.f. static

types 1995-




X := In1
X := X*X
X := X + In2*In2

C, C++, Java, C#, VB: commands, control flow
o Do this, then do that
o “X” is the name of a cell that has different values at different times

Excel, Haskell: expressions, data flow (no effects)
o No notion of sequence
o “A2” is the name of a (single) value


A bigger example

N-shell of atom A

Atoms accessible in N hops (but no fewer) from A

A 50-shell of a 100k-atom model of amorphous silicon, generated using F#

Thanks: Jon Harrop

1-shell of atom A

2-shell of atom A

A bigger example

To find the N-shell of A

• Find the (N-1) shell of A

• Union the 1-shells of each of those atoms (that is, find all the neighbours of those atoms)

• Delete the (N-2) shell and (N-1) shell of A

Suppose N=4 (figure labels: A's 3-shell; 1-shell of 3-shell atoms; A's 4-shell; A's 2-shell and 3-shell)

nShell :: Graph -> Int -> Atom -> Set Atom
nShell g 0 a = unitSet a
nShell g 1 a = neighbours g a
nShell g n a = (mapUnion (neighbours g) s1) – s1 – s2
  where
    s1 = nShell g (n-1) a
    s2 = nShell g (n-2) a

unitSet    :: a -> Set a
(–)        :: Set a -> Set a -> Set a
mapUnion   :: (a -> Set b) -> Set a -> Set b
neighbours :: Graph -> Atom -> Set Atom

nShell n needs

• nShell (n-1), which needs
    o nShell (n-2)
    o nShell (n-3)

• nShell (n-2), which needs
    o nShell (n-3)
    o nShell (n-4)


BUT, the two calls to (nShell g (n-2) a) must yield the same result

And so we can safely share them

• Memo function, or

• Return a pair of results

Same inputs, same outputs

“Referential transparency”

“No side effects”
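The two sharing options above can be tried out with a small runnable sketch. It assumes a graph represented as a map from each atom to its set of neighbours (the slides do not say how Graph is defined), uses Data.Set from the containers package in place of the slide's unitSet and set-difference operator, and assumes no atom is its own neighbour. The names shells, nShell' and pathGraph are made up for this illustration. The second version follows the "return a pair of results" idea: each step hands back both the current shell and the previous one, so nothing is recomputed.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set
import Data.Set (Set)

type Atom  = Int
-- Assumed representation: each atom maps to the set of its neighbours.
type Graph = Map.Map Atom (Set Atom)

neighbours :: Graph -> Atom -> Set Atom
neighbours g a = Map.findWithDefault Set.empty a g

mapUnion :: Ord b => (a -> Set b) -> Set a -> Set b
mapUnion f = Set.unions . map f . Set.toList

minus :: Ord a => Set a -> Set a -> Set a
minus = Set.difference

-- Direct transcription of the slide: the recursive calls overlap,
-- so the same shells are recomputed many times.
nShell :: Graph -> Int -> Atom -> Set Atom
nShell _ 0 a = Set.singleton a
nShell g 1 a = neighbours g a
nShell g n a = (mapUnion (neighbours g) s1 `minus` s1) `minus` s2
  where
    s1 = nShell g (n - 1) a
    s2 = nShell g (n - 2) a

-- "Return a pair of results": shells g n a = (n-shell, (n-1)-shell).
shells :: Graph -> Int -> Atom -> (Set Atom, Set Atom)
shells _ 0 a = (Set.singleton a, Set.empty)
shells g n a = ((mapUnion (neighbours g) s1 `minus` s1) `minus` s2, s1)
  where
    (s1, s2) = shells g (n - 1) a

nShell' :: Graph -> Int -> Atom -> Set Atom
nShell' g n a = fst (shells g n a)

-- A path graph 0 - 1 - 2 - 3 - 4 to experiment with.
pathGraph :: Graph
pathGraph = Map.fromList
  [ (i, Set.fromList [j | j <- [i - 1, i + 1], j >= 0, j <= 4]) | i <- [0 .. 4] ]
```

Because nShell is pure, replacing it by nShell' cannot change any result; only the amount of work differs. For example, nShell' pathGraph 2 2 is the set containing atoms 0 and 4.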

Purity pays: understanding

X1.insert( Y )
X2.delete( Y )

What does this program do?

• Would it matter if we swapped the order of these two calls?

• What if X1=X2?

• I wonder what else X1.insert does?

Lots of heroic work on static analysis, but hampered by unnecessary effects

Purity pays: verification

void Insert( int index, object value )

requires (0 <= index && index <= Count)

ensures Forall{ int i in 0:index; old(this[i]) == this[i] }

{ ... }


The pre and post-conditions are

written in... a functional language

Also: object invariants

But: invariants temporarily broken

Hence: “expose” statements



Purity pays: maintenance

The type of a function tells you a LOT

about it

Large-scale data representation changes in a multi-100kloc code base can be done:

o change the representation

o compile until no type errors

reverse :: [a] -> [a]

Purity pays: performance

Execution model is not so close to machine

o Hence, bigger job for compiler, execution may be slower

But: algorithm is often more important than raw efficiency

And: purity supports radical optimisations

o nShell runs 100x faster in F# than C++

Why? More sharing of parts of sets.

o SQL, XQuery query optimisers

Real-life example: Smoke Vector Graphics

library: 200kloc C++ became 50kloc OCaml, and

ran 5x faster

Purity pays: parallelism

Pure programs are “naturally parallel”

No mutable state

means no locks,

no race hazards

Results totally unaffected by parallelism

(1 processor or zillions)


o Google's map/reduce

o SQL on clusters







Purity pays: parallelism

Can I run this LINQ query in parallel?

Race hazard because of the side effect in the 'where' clause

May be concealed inside calls

Parallel query is correct/reliable only if the

expressions in the query are 100% pure

int index = 0;

List<Customer> top10 = (from c in customers

where index++ < 10

select c).ToList();
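For contrast, a rough Haskell counterpart of the query is sketched below. The Customer type and its custName field are invented for the illustration. "The first ten" is expressed with take rather than a mutable counter, so there is no hidden state for a parallel evaluation to race on.

```haskell
-- Hypothetical record standing in for the Customer class on the slide.
data Customer = Customer { custName :: String }
  deriving (Eq, Show)

-- Pure: the result depends only on the input list.
top10 :: [Customer] -> [Customer]
top10 = take 10
```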

Purity pays: testing

Testing is tremendously important in


Regression tests check for, well,

regressions. Only catches 15% of bugs.

Desperately needed: semi-automatic test

generators. Challenges:

o How do we say what to test?

o How do we generate test data?

Purity pays: testing

In an imperative or OO language, you must

• set up the state of the object, and the external state it reads or writes

• make the call(s)

• inspect the state of the object, and the external state

• perhaps copy part of the object or global state, so that you can use it in the postcondition

Purity pays: testing in Haskell

• How do we say what to test?

Answer: write a Haskell function

• Ordinary Haskell (no new language)

• Type-checked

• May involve inter-relationships

prop_union :: Ord a => Set a -> Bool
prop_union s = union s s == s

prop_revapp xs ys = reverse (xs ++ ys)
                      ==
                    (reverse ys) ++ (reverse xs)

No “old” s
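Since a property is just a function returning Bool, it can be exercised even without a test library. The sketch below uses only Data.Set from containers and checks the properties on every list over {0,1} of length at most 3 (an exhaustive, SmallCheck-flavoured check rather than QuickCheck's random one). The names smallLists, prop_revappWrong and counterexamples are made up for the illustration; the "wrong" variant appends the reversed lists in the original order.

```haskell
import qualified Data.Set as Set

prop_union :: Set.Set Int -> Bool
prop_union s = Set.union s s == s

prop_revapp :: [Int] -> [Int] -> Bool
prop_revapp xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs

-- A plausible-looking but false variant: the operands are not swapped.
prop_revappWrong :: [Int] -> [Int] -> Bool
prop_revappWrong xs ys = reverse (xs ++ ys) == reverse xs ++ reverse ys

-- Every list over {0,1} of length 0 to 3, shortest first.
smallLists :: [[Int]]
smallLists = concatMap listsOfLength [0 .. 3]
  where
    listsOfLength :: Int -> [[Int]]
    listsOfLength 0 = [[]]
    listsOfLength n = [x : xs | x <- [0, 1], xs <- listsOfLength (n - 1)]

-- Inputs on which the false variant fails.
counterexamples :: [([Int], [Int])]
counterexamples =
  [ (xs, ys) | xs <- smallLists, ys <- smallLists, not (prop_revappWrong xs ys) ]
```

The first counterexample found is ([0],[1]): reversing [0,1] gives [1,0], whereas the false variant expects [0,1].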

Purity pays: testing in Haskell

• How do we generate test data?

Answer: use the QuickCheck library

• QuickCheck is just a Haskell library

• No new tools to learn

• Lightweight, so more likely to be used

Main> quickCheck prop_union

*** OK Passed 100 tests


SMS encoding

Pack 7-bit characters into 8-bit bytes, and


Pack and unpack should be inverses

pack :: [Word8] -> [Word8]

unpack :: [Word8] -> [Word8]

prop_pack :: [Word8] -> Bool

prop_pack s = unpack (pack s) == s
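To make the property concrete, here is one possible pack/unpack pair. It is a sketch under assumptions, not the encoder used in the talk or the real SMS bit layout: characters are assumed to fit in 7 bits, bits are emitted least-significant first, the last byte is padded with zeros, and unpack drops a trailing group shorter than 7 bits. The helpers toBits, fromBits and chunks are invented for the illustration.

```haskell
import Data.Bits (setBit, testBit)
import Data.Word (Word8)

-- The lowest n bits of a byte, least significant first.
toBits :: Int -> Word8 -> [Bool]
toBits n w = [testBit w i | i <- [0 .. n - 1]]

-- Rebuild a byte from up to 8 bits; missing high bits count as zero.
fromBits :: [Bool] -> Word8
fromBits bs = foldl (\acc (i, b) -> if b then setBit acc i else acc) 0 (zip [0 ..] bs)

chunks :: Int -> [a] -> [[a]]
chunks _ [] = []
chunks n xs = let (h, t) = splitAt n xs in h : chunks n t

-- Concatenate the 7-bit characters and cut the bit stream into bytes.
pack :: [Word8] -> [Word8]
pack = map fromBits . chunks 8 . concatMap (toBits 7)

-- Cut the byte stream into 7-bit groups, dropping an incomplete last group.
unpack :: [Word8] -> [Word8]
unpack = map fromBits . filter ((== 7) . length) . chunks 7 . concatMap (toBits 8)

prop_pack :: [Word8] -> Bool
prop_pack s = unpack (pack s) == s
```

With this encoder, eight 7-bit characters occupy exactly seven bytes and round-trip cleanly, so prop_pack (replicate 8 65) holds. Seven characters also occupy seven bytes, and unpack cannot tell the padding from an eighth zero character, so prop_pack (replicate 7 65) is False. That is the kind of corner case that motivates restricting or shaping the test data.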



If too much data is discarded, QuickCheck

warns you (e.g. False ==> condition

should not just say “passed”!)

prop_pack s = length s == 8 ==> unpack (pack s) == s

Prelude> quickCheck prop_ins

*** Gave up! Passed only 53 tests:


Danger of skewed data distribution

Chances of a list being ordered decrease

with size => test distribution will be

skewed towards small lists

insert :: Ord a => a -> [a] -> [a]

ordered :: Ord a => [a] -> Bool

prop_ins x xs = ordered xs ==> ordered (insert x xs)


Show data distribution

prop_ins x xs = ordered xs ==>
                  collect (length xs)
                  (ordered (insert x xs))

Prelude> quickCheck prop_ins

*** Gave up! Passed only 53 tests:

39% 1

22% 0

20% 2

15% 3

1% 6
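The skew can be made plausible by counting. The sketch below does not use QuickCheck; it enumerates every list of a given length over the small alphabet 0..3 (an arbitrary choice for the illustration) and counts how many are ordered. The names listsOfLength and orderedCounts are made up, and the precondition is written as a plain implication instead of QuickCheck's ==> operator.

```haskell
import Data.List (insert)

ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

-- All lists of the given length over the alphabet [0..3].
listsOfLength :: Int -> [[Int]]
listsOfLength 0 = [[]]
listsOfLength n = [x : xs | x <- [0 .. 3], xs <- listsOfLength (n - 1)]

-- For each length: (length, number of ordered lists, number of lists).
orderedCounts :: [(Int, Int, Int)]
orderedCounts =
  [ (n, length (filter ordered ls), length ls)
  | n <- [0 .. 5], let ls = listsOfLength n ]

-- The insertion property with the precondition as an ordinary implication.
prop_ins :: Int -> [Int] -> Bool
prop_ins x xs = not (ordered xs) || ordered (insert x xs)
```

Over this alphabet every list of length 0 or 1 is ordered, 10 of the 16 lists of length 2 are, and only 56 of the 1024 lists of length 5 are. A generator that discards unordered lists therefore ends up testing mostly short ones.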


Generators allow you to control the shape

and distribution of your data

prop_pack2 = forAll (vectorOf 8 arbitrary) prop_pack

prop_pack s = unpack (pack s) == s


arbitrary :: Gen Word8

vectorOf :: Int -> Gen a -> Gen [a]

forAll :: Gen a -> (a -> Bool) -> Property

Digression: how can this work?


prop_rev :: [Int] -> Bool

prop_rev xs = xs == reverse (reverse xs)

prop_revapp :: [Int] -> [Int] -> Bool

prop_revapp xs ys = xs++ys == reverse xs ++ reverse ys

Prelude> quickCheck prop_rev


Prelude> quickCheck prop_revapp


What type does quickCheck have????

Prelude> :i quickCheck

quickCheck :: Testable p => p -> IO ()

Type classes

If a function works for every type that has particular properties, the type of the function says just that

delete :: forall w. Eq w => [w] -> w -> [w]
    “for all types w that support the Eq operations”
sort :: Ord a => [a] -> [a]
serialise :: Show a => a -> String
square :: Num n => n -> n

Otherwise, it must work for any type whatsoever

reverse :: [a] -> [a]
filter :: (a -> Bool) -> [a] -> [a]

Type classes

square :: Num n => n -> n
square x = x*x

Works for any type 'n' that supports the Num operations

class Num a where
  (+)    :: a -> a -> a
  (*)    :: a -> a -> a
  negate :: a -> a

The class declaration says what the Num operations are

FORGET all you know about OO classes!

instance Num Int where
  a + b    = plusInt a b
  a * b    = mulInt a b
  negate a = negInt a

An instance declaration for a type T says how the Num operations are implemented on T's

plusInt :: Int -> Int -> Int
mulInt  :: Int -> Int -> Int
etc, defined as primitives

How type classes work

When you write this...

square :: Num n => n -> n
square x = x*x

...the compiler generates this

square :: Num n -> n -> n
square d x = (*) d x x

The “Num n =>” turns into an extra value argument to the function. It is a value of data type Num n

A value of type (Num T) is a vector of the Num operations for type T

How type classes work

The class decl translates to:
• A data type decl for Num
• A selector function for each class operation

class Num a where
  (+)    :: a -> a -> a
  (*)    :: a -> a -> a
  negate :: a -> a

data Num a
  = MkNum (a->a->a)
          (a->a->a)
          (a->a)
          ...

(*) :: Num a -> a -> a -> a
(*) (MkNum _ m _ ...) = m

How type classes work

An instance decl for type T translates to a value declaration for the Num dictionary for T

instance Num Int where
  a + b    = plusInt a b
  a * b    = mulInt a b
  negate a = negInt a

dNumInt :: Num Int
dNumInt = MkNum plusInt mulInt negInt ...


All this scales up nicely

sumSq :: Num n => n -> n -> n
sumSq x y = square x + square y

sumSq :: Num n -> n -> n -> n
sumSq d x y = (+) d (square d x)
                    (square d y)

Extract addition operation from d

Pass on d to square

You can build big overloaded functions by calling smaller overloaded functions
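The translation can be written out by hand in ordinary Haskell. This is a sketch of the idea, not GHC's actual output: the dictionary type is called NumD (with fields plusD, timesD, negateD) so that it does not clash with the real Num class, and the dictionaries are filled in from the Prelude operations rather than from primitives such as plusInt.

```haskell
-- The "vector of operations": one field per class method.
data NumD a = MkNumD
  { plusD   :: a -> a -> a
  , timesD  :: a -> a -> a
  , negateD :: a -> a
  }

-- What an instance declaration turns into: a dictionary value.
dNumInt :: NumD Int
dNumInt = MkNumD (+) (*) negate

dNumDouble :: NumD Double
dNumDouble = MkNumD (+) (*) negate

-- "Num n =>" has become an explicit dictionary argument d.
square :: NumD n -> n -> n
square d x = timesD d x x

-- The dictionary is passed on to square, and addition is selected from it.
sumSq :: NumD n -> n -> n -> n
sumSq d x y = plusD d (square d x) (square d y)
```

For example, sumSq dNumInt 3 4 evaluates to 25, and the same sumSq works at Double when given dNumDouble.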

Example: complex numbers

data Cpx a = Cpx a a

instance Num a => Num (Cpx a) where
  (Cpx r1 i1) + (Cpx r2 i2) = Cpx (r1+r2) (i1+i2)
  fromInteger n = Cpx (fromInteger n) 0

class Num a where
  (+) :: a -> a -> a
  (-) :: a -> a -> a
  fromInteger :: Integer -> a

inc :: Num a => a -> a
inc x = x + 1

Even literals are overloaded

“1” means “fromInteger 1”
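In dictionary-passing terms, an instance with a context (Num a => Num (Cpx a)) becomes a function from dictionaries to dictionaries, and the literal 1 becomes a call to the fromInteger field. The sketch below is a hand-written illustration with invented names (NumD, plusD, fromIntegerD, dNumCpx) and only two of the class operations.

```haskell
-- A cut-down dictionary with just the two operations needed here.
data NumD a = MkNumD
  { plusD        :: a -> a -> a
  , fromIntegerD :: Integer -> a
  }

dNumInt :: NumD Int
dNumInt = MkNumD (+) fromInteger

data Cpx a = Cpx a a
  deriving (Eq, Show)

-- "instance Num a => Num (Cpx a)": build the Cpx dictionary from the one for a.
dNumCpx :: NumD a -> NumD (Cpx a)
dNumCpx d = MkNumD
  { plusD        = \(Cpx r1 i1) (Cpx r2 i2) -> Cpx (plusD d r1 r2) (plusD d i1 i2)
  , fromIntegerD = \n -> Cpx (fromIntegerD d n) (fromIntegerD d 0)
  }

-- "inc x = x + 1", with the literal 1 read as fromInteger 1.
inc :: NumD a -> a -> a
inc d x = plusD d x (fromIntegerD d 1)
```

Here inc dNumInt 41 is 42, and inc (dNumCpx dNumInt) (Cpx 2 3) is Cpx 3 3: the same definition of inc works at both types because the dictionary decides what + and 1 mean.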

Properties can be overloaded too

The type signature tells quickCheck

whether to generate Ints or Floats

prop_assoc :: Num a => a -> a -> a -> Bool

prop_assoc x y z = (x+y)+z == x+(y+z)

Prelude> quickCheck (prop_assoc :: Int -> Int -> Int -> Bool)


Prelude> quickCheck (prop_assoc :: Float -> Float -> Float -> Bool)
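Why the choice of type matters can be seen without running QuickCheck at all. In the sketch below the property carries an explicit Eq constraint, which current GHC versions require because Num no longer implies Eq. The names intHolds and floatCounterexample, and the particular test values, are chosen for the illustration.

```haskell
prop_assoc :: (Eq a, Num a) => a -> a -> a -> Bool
prop_assoc x y z = (x + y) + z == x + (y + z)

-- Int addition wraps around on overflow but stays associative,
-- so the property holds even at the extremes.
intHolds :: Bool
intHolds = and [prop_assoc x y z | x <- vals, y <- vals, z <- vals]
  where
    vals = [minBound, -3, 0, 7, maxBound] :: [Int]

-- At Float, 1 is too small to register next to 1e20:
-- (1e20 + (-1e20)) + 1 is 1, but 1e20 + ((-1e20) + 1) is 0.
floatCounterexample :: Bool
floatCounterexample = prop_assoc (1e20 :: Float) (-1e20) 1
```

So intHolds is True while floatCounterexample is False: the same overloaded property is a theorem at one type and falsifiable at another.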


Back to QuickCheck

quickCheck :: Testable a => a -> IO ()

class Testable a where
  test :: a -> RandSupply -> Bool

class Arbitrary a where
  arby :: RandSupply -> a

instance Testable Bool where
  test b r = b

instance (Arbitrary a, Testable b)
      => Testable (a->b) where
  test f r = test (f (arby r1)) r2
    where (r1,r2) = split r

split :: RandSupply -> (RandSupply, RandSupply)

A completely different example:


prop_rev :: [Int] -> Bool

test prop_rev r
  = test (prop_rev (arby r1)) r2      -- Using instance for (->)
    where (r1,r2) = split r
  = prop_rev (arby r1)                -- Using instance for Bool

Generating arbitrary values

class Arbitrary a where
  arby :: RandSupply -> a

instance Arbitrary Int where
  arby r = randInt r

instance Arbitrary a
      => Arbitrary [a] where
  arby r | even (randInt r1) = []                 -- Generate Nil value
         | otherwise         = arby r2 : arby r3  -- Generate cons value
    where
      (r1,r') = split r
      (r2,r3) = split r'

split :: RandSupply -> (RandSupply, RandSupply)

randInt :: RandSupply -> Int
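The simplified classes from these slides can be assembled into a tiny self-contained test driver. Everything about RandSupply below is an assumption made for the sketch: it is a seed for a small linear congruential generator, which is good enough to demonstrate the mechanism but is not how the real QuickCheck library generates data. The list instance also deviates from the slide: it picks a length between 0 and 5 first, so that generation is guaranteed to terminate with this toy generator. The names step and checkN are invented.

```haskell
-- Hypothetical random supply: just a seed for a toy generator.
newtype RandSupply = RandSupply Int

step :: Int -> Int
step s = (s * 1103515245 + 12345) `mod` 2147483648

split :: RandSupply -> (RandSupply, RandSupply)
split (RandSupply s) =
  (RandSupply (step (2 * s + 1)), RandSupply (step (2 * s + 2)))

-- A pseudo-random Int between -100 and 100.
randInt :: RandSupply -> Int
randInt (RandSupply s) = (step s `div` 65536) `mod` 201 - 100

class Arbitrary a where
  arby :: RandSupply -> a

class Testable a where
  test :: a -> RandSupply -> Bool

instance Testable Bool where
  test b _ = b

-- A function is testable if its argument can be generated
-- and its result is testable: this handles any number of arguments.
instance (Arbitrary a, Testable b) => Testable (a -> b) where
  test f r = test (f (arby r1)) r2
    where (r1, r2) = split r

instance Arbitrary Int where
  arby = randInt

-- Variation on the slide: choose a length first, then the elements.
instance Arbitrary a => Arbitrary [a] where
  arby r = take n (map arby (supplies r'))
    where
      (r1, r')   = split r
      n          = randInt r1 `mod` 6
      supplies s = let (a, b) = split s in a : supplies b

-- Run a property against n different seeds.
checkN :: Testable p => Int -> p -> Bool
checkN n p = all (\seed -> test p (RandSupply seed)) [1 .. n]

prop_rev :: [Int] -> Bool
prop_rev xs = xs == reverse (reverse xs)

prop_revapp :: [Int] -> [Int] -> Bool
prop_revapp xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs
```

Both checkN 100 prop_rev and checkN 100 prop_revapp type-check with the same checkN, even though one property takes one argument and the other takes two. The instance for functions peels off one argument at a time until a Bool is left.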

Three take-away thoughts

1. Testing pure functions is a lot easier than

testing stateful ones

2. To generate tests you need a “domain

specific language”

o Functional languages (higher order) are ideal for this purpose

3. You can use (2) without (1)

Testing imperative programs

Imperative program (e.g. Web Service)

[Diagram labels: model, Results]

John Hughes's company, Quviq, does just this for telecoms software
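The idea of testing a stateful program against a pure model can be sketched in a few lines. The example is invented for illustration and has nothing to do with Quviq's actual tools: the "imperative program" is a mutable stack held in an IORef, the model is a plain list, and a test runs the same command sequence through both and compares the observable results.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)
import Data.List (mapAccumL)

-- Commands a test can issue to the system under test.
data Cmd = Push Int | Pop
  deriving Show

-- Pure model: a stack as a list. Returns the new model state
-- and the value the command should make visible to the caller.
stepModel :: [Int] -> Cmd -> ([Int], Maybe Int)
stepModel st       (Push x) = (x : st, Nothing)
stepModel (x : st) Pop      = (st, Just x)
stepModel []       Pop      = ([], Nothing)

-- Stand-in for the imperative program: a stack in a mutable cell.
newtype Stack = Stack (IORef [Int])

newStack :: IO Stack
newStack = Stack <$> newIORef []

stepReal :: Stack -> Cmd -> IO (Maybe Int)
stepReal (Stack ref) (Push x) = modifyIORef ref (x :) >> return Nothing
stepReal (Stack ref) Pop = do
  st <- readIORef ref
  case st of
    []         -> return Nothing
    (x : rest) -> writeIORef ref rest >> return (Just x)

-- Run the commands against both and compare what they return.
agrees :: [Cmd] -> IO Bool
agrees cmds = do
  s    <- newStack
  real <- mapM (stepReal s) cmds
  let model = snd (mapAccumL stepModel [] cmds)
  return (real == model)
```

The command sequences are ordinary data, so they can be generated, shrunk and replayed by a pure test generator even though the program under test is not pure.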

Other testing tools for Haskell

HUnit (unit testing)

Lazy SmallCheck (exhaustive testing)

Catch (static analysis for pattern match failure)

Haskell Program Coverage Tool (so you can see where your tests reach)

Time and space profiling


Standing back....

Mainstream languages are hamstrung by gratuitous (i.e. unnecessary) effects: effects are part of the fabric of computation

Future software will be effect-free by default

o With controlled effects where necessary

o Statically checked by the type system

T = 0; for (i=0; i<N; i++) { T = T + i }
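As a small illustration of "effect-free by default, controlled effects in the types", the loop above can be written in Haskell without any assignment. The function names sumBelow and reportSum are invented for the sketch. The pure function has an ordinary type, and the one place that performs an effect (printing) says so in its type.

```haskell
import Data.List (foldl')

-- The loop as a pure fold: no mutable T, no loop counter.
sumBelow :: Int -> Int
sumBelow n = foldl' (+) 0 [0 .. n - 1]

-- An effectful computation is marked as such by its IO type.
reportSum :: Int -> IO ()
reportSum n = putStrLn ("T = " ++ show (sumBelow n))
```

For instance, sumBelow 5 is 10, and no caller of sumBelow needs to worry about what else it might do.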

And the future is here...

Functional programming has fascinated

academics for decades

But professional-developer interest in

functional programming has sky-rocketed

in the last 5 years.

Suddenly, FP is cool, not geeky.

[Graphs: practitioners against time (1yr, 5yr, 10yr, 15yr)]

Most research languages: the quick death

Successful research languages: the slow death

C++, Java, Perl, Ruby: the regrettable absence of death (threshold of immortality)

The second life?

“Learning Haskell is a great way of

training yourself to think functionally so

you are ready to take full advantage of

C# 3.0 when it comes out”

(blog Apr 2007)

“I'm already looking at coding

problems and my mental

perspective is now shifting

back and forth between purely

OO and more FP styled


(blog Mar 2007)


Lots of other great examples

Erlang: widely respected and admired as

a shining example of functional

programming applied to an important


F#: now being commercialised by


OCaml, Scala, Scheme: academic

languages being widely used in industry

C#: explicitly adopting functional ideas

(e.g. LINQ)

Sharply rising activity

GHC bug tracker


Haskell IRC channel


Jan 20   Austin Functional Programming       Austin
Feb 9    FringeDC                            Washington DC
Feb 11   PDXFunc                             Portland
Feb 12   Fun in the afternoon                London
Feb 13   BayFP                               San Francisco
Feb 16   St-Petersburg Haskell User Group    Saint-Petersburg
Feb 19   NYFP Network                        New York
Feb 20   Seattle FP Group                    Seattle


Commercial Users

of Functional Programming


CUFP 2008 is part of a new Functional Programming Developer Conference (tutorials, tools, recruitment, etc)

Victoria, British Columbia, Sept 2008

Same meeting: workshops on Erlang, ML, Haskell, Scheme.

Speakers describing applications in:

banking, smart cards, telecoms, data

parallel, terrorism response training,

machine learning, network services,

hardware design, communications

security, cross-domain security

Summary

The languages and tools of functional programming are being used to make money fast

The ideas of functional programming are

rapidly becoming mainstream

In particular, the Big Deal for

programming in the next decade is the

control of effects, and functional

programming is the place to look for


Quotes from the front line

“Learning Haskell has completely reversed my feeling that static typing is an old outdated idea.”

“Changing the type of a function in Python will lead to strange

runtime errors. But when I modify a Haskell program, I already

know it will work once it compiles.”

“Our chat system was implemented by 3 other groups (two Java,

one C++). Haskell implementation is more stable, provides more

features, and has about 70% less code.”

“I'm no expert, but I got an order of magnitude improvement in code size and 2 orders of magnitude improvement in development time”

“My Python solution was 50 lines. My Haskell solution was 14 lines, and I was quite pleased. Your Haskell solution was 5.”

“C isn't hard; programming in C is hard. On the other hand, Haskell is hard, but programming in Haskell is easy.”