POC of a distributed architecture for financial computations

Post on 29-Jun-2015

description

Presentation given during the Open-XKE at Xebia France. It is the result of a POC on building a distributed architecture for financial computations. It covers Scala, functional programming, Stream, the Iteratee pattern, Akka Actors and Akka Cluster.

transcript

Distributed system for financial computations

By Xavier Bucchiotty

ME

@xbucchiotty

https://github.com/xbucchiotty

http://blog.xebia.fr/author/xbucchiotty

Build a testable, composable and scalable cash-flow system

Step 1: Stream API
Step 2: Iteratees
Step 3: Akka actor
Step 4: Akka cluster

Use case: Financial debt management

CAUTION

initial = 1000 €, duration = 5 years, fixed interest rate = 5%

Date         Amort   Interests   Outstanding
2013-01-01   200 €   50 €        800 €
2014-01-01   200 €   40 €        600 €
2015-01-01   200 €   30 €        400 €
2016-01-01   200 €   20 €        200 €
2017-01-01   200 €   10 €        0 €

date = last date + (1 year)

amort = initial / duration

outstanding = last outstanding - amort

interests = last outstanding * rate

// Builds the next row from the previous one
val f = (last: Row) => new Row {
  def date = last.date + (1 year)
  def amortization = last.amortization
  def outstanding = last.outstanding - amortization
  def interests = last.outstanding * fixedRate
}
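
The four rules above can be checked with a small self-contained sketch. It is only an illustration under assumptions: the `Row` case class, the use of `BigDecimal` for euro amounts and `java.time.LocalDate` for dates are hypothetical stand-ins, not the types used in the talk.

```scala
import java.time.LocalDate

object RowRecurrenceSketch {
  // Hypothetical, simplified row: amounts are plain BigDecimal values in euros.
  final case class Row(date: LocalDate,
                       amortization: BigDecimal,
                       interests: BigDecimal,
                       outstanding: BigDecimal)

  val initial: BigDecimal   = BigDecimal(1000)
  val duration: Int         = 5
  val fixedRate: BigDecimal = BigDecimal("0.05")

  // First row: interests are computed on the initial amount.
  val first: Row = {
    val amortization = initial / BigDecimal(duration)
    Row(LocalDate.of(2013, 1, 1), amortization, initial * fixedRate, initial - amortization)
  }

  // Every following row depends only on the previous one.
  val f: Row => Row = last => Row(
    date         = last.date.plusYears(1),
    amortization = last.amortization,
    interests    = last.outstanding * fixedRate,
    outstanding  = last.outstanding - last.amortization
  )
}
```

Applying `f` repeatedly to `first` should give the remaining rows of the example loan (2014 to 2017).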

Step 1: Stream API

first, f(first), f(f(first)), ...

case class Loan( ... ) {
  def first: Row
  def f: (Row => Row)
  def rows = Stream.iterate(first)(f).take(duration)
}

case class Portfolio(loans: Seq[Loan]) {
  // Seq has no `stream` method; `toStream` gives the lazy view
  def rows =
    loans.toStream.flatMap(_.rows)
}
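
A minimal runnable sketch of the same idea, under assumptions: it uses `LazyList`, the Scala 2.13+ successor of `Stream` (the slides target Scala 2.10), and a simplified hypothetical `Row` that stores a year instead of a date.

```scala
object LoanStreamSketch {
  // Hypothetical, simplified types for illustration only.
  final case class Row(year: Int,
                       amortization: BigDecimal,
                       interests: BigDecimal,
                       outstanding: BigDecimal)

  final case class Loan(initial: BigDecimal, duration: Int, rate: BigDecimal, startYear: Int) {
    private val amortization = initial / BigDecimal(duration)

    def first: Row = Row(startYear, amortization, initial * rate, initial - amortization)

    def f: Row => Row = last =>
      Row(last.year + 1, last.amortization, last.outstanding * rate, last.outstanding - last.amortization)

    // Rows are computed lazily, only when a consumer asks for them.
    def rows: LazyList[Row] = LazyList.iterate(first)(f).take(duration)
  }

  final case class Portfolio(loans: Seq[Loan]) {
    // Rows of all loans, one loan after the other (sequential).
    def rows: LazyList[Row] = loans.to(LazyList).flatMap(_.rows)
  }
}
```

Because the rows are lazy, asking for `loan.rows.head` evaluates only the first row; the rest is computed on demand.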

Date         Amort   Interests   Total paid

Loan 1
2013-01-01   200 €   50 €        250 €
2014-01-01   200 €   40 €        240 €
2015-01-01   200 €   30 €        230 €
2016-01-01   200 €   20 €        220 €
2017-01-01   200 €   10 €        210 €

Loan 2
2013-01-01   200 €   50 €        250 €
2014-01-01   200 €   40 €        240 €
2015-01-01   200 €   30 €        230 €
2016-01-01   200 €   20 €        220 €
2017-01-01   200 €   10 €        210 €

Loan 3
2013-01-01   200 €   50 €        250 €
2014-01-01   200 €   40 €        240 €
2015-01-01   200 €   30 €        230 €
2016-01-01   200 €   20 €        220 €
2017-01-01   200 €   10 €        210 €

Total: 3450 €

// Produce rows
val totalPaid = portfolio.rows
  // Transform rows to amount
  .map(row => row.interests + row.amortization)
  // Consume amount
  .foldLeft(0 EUR)(_ + _)

type RowProducer = Iterable[Row]

type RowTransformer[T] = (Row=>T)

type AmountConsumer[T] = (Iterable[Amount]=>T)

RowProducer (Iterable[Row])

// Loan
Stream.iterate(first)(f) take duration

// Portfolio
loans => loans flatMap (loan => loan.rows)

+ on demand computation
- sequential computation

RowTransformer (Row => T)

object RowTransformer {
  val totalPaid = (row: Row) =>
    row.interests + row.amortization
}

+ function composition
- type limited to «map»

AmountConsumer (Iterable[Amount] => T)

object AmountConsumer {
  def sum = (rows: Iterable[Amount]) => rows.foldLeft(Amount(0, EUR))(_ + _)
}

+ function composition
- synchronism
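
The three roles (producer, transformer, consumer) compose as plain functions. The sketch below is a hedged illustration: `Amount` is assumed to be a bare `BigDecimal`, `Row` is reduced to two fields, and `run` and `totalPaidInCents` are hypothetical helpers that do not appear in the talk.

```scala
object PipelineSketch {
  // Simplified, hypothetical types for illustration only.
  type Amount = BigDecimal
  final case class Row(amortization: Amount, interests: Amount)

  type RowProducer       = Iterable[Row]
  type RowTransformer[T] = Row => T
  type AmountConsumer[T] = Iterable[Amount] => T

  val totalPaid: RowTransformer[Amount] = row => row.interests + row.amortization

  val sum: AmountConsumer[Amount] = amounts => amounts.foldLeft(BigDecimal(0))(_ + _)

  // Function composition: a transformer can be extended with andThen.
  val totalPaidInCents: RowTransformer[Amount] = totalPaid.andThen(_ * BigDecimal(100))

  // Produce, transform, consume. The call is synchronous: it returns only
  // once every row has been consumed.
  def run[T](producer: RowProducer, transformer: RowTransformer[Amount], consumer: AmountConsumer[T]): T =
    consumer(producer.map(transformer))

  // The five rows of the example loan (amortization 200 €, decreasing interests).
  val oneLoan: RowProducer =
    List(50, 40, 30, 20, 10).map(i => Row(BigDecimal(200), BigDecimal(i)))
}
```

With the example loan, `run(oneLoan, totalPaid, sum)` adds 250 + 240 + 230 + 220 + 210.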

Step 1: Stream API

5000 loans, 50 rows: ~ 560 ms

Pros
+ On demand computation
+ Function composition

Cons
- Sequential computation
- Synchronism
- Transformation limited to «map»

Step 2: Iteratees

Integrating Play iteratees

libraryDependencies ++= Seq(
  "com.typesafe.play" %% "play-iteratees" % "2.2.0-RC2")

[Diagram: the Enumerator (producer) sends an Input to the Iteratee (consumer), which answers with a Status]

Iteratees are immutable
Asynchronous by design
Type safe

Enumerator: enumerate and interleave

Data producer

case class Loan(initial: Amount, duration: Int, rowIt: RowIt) {
  // The Stream of step 1 is wrapped into an Enumerator
  def rows(implicit ctx: ExecutionContext) =
    Enumerator.enumerate(
      Stream.iterate(first)(f).take(duration)
    )
}

Producers can be combined

case class Portfolio(loans: Seq[Loan]) {
  def rows(implicit ctx: ExecutionContext) =
    Enumerator.interleave(loans.map(_.rows))
}

Iteratee

Consumer as a state machine

Iteratees consume Input

object Input {
  case class El[+E](e: E)
  case object Empty
  case object EOF
}

and propagate a state

object Step {
  case class Done[+A, E](a: A, remaining: Input[E])
  case class Cont[E, +A](k: Input[E] => Iteratee[E, A])
  case class Error[E](msg: String, input: Input[E])
}

[Diagram: the Enumerator sends El(...) to an Iteratee holding count = 0; the Iteratee computes, answers with the status Continue and yields a new Iteratee holding count = 1]

[Diagram: the Enumerator sends EOF; the Iteratee computes and answers with the status Done]

[Diagram: the Enumerator sends El(...); the Iteratee computes, ends with error = "Runtime Error" and answers with the status Error]

val last: RowConsumer[Option[Row]] = {
  def step(last: Option[Row]): K[Row, Option[Row]] = {
    case Input.Empty => Cont(step(last))
    case Input.EOF => Done(last, Input.EOF)
    case Input.El(e) => Cont(step(Some(e)))
  }
  Cont(step(Option.empty[Row]))
}
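
To make the state machine concrete without the Play dependency, here is a toy model in plain Scala. It is deliberately simplified and is not Play's API: it is synchronous, has no `Error` state, and the names `ToyIteratee`, `run` and `fold` are assumptions made for this sketch only.

```scala
object ToyIteratee {
  sealed trait Input[+E]
  object Input {
    final case class El[+E](e: E) extends Input[E]
    case object Empty extends Input[Nothing]
    case object EOF extends Input[Nothing]
  }

  // An iteratee is either finished (Done) or waiting for more input (Cont).
  sealed trait Iteratee[E, A]
  final case class Done[E, A](result: A) extends Iteratee[E, A]
  final case class Cont[E, A](k: Input[E] => Iteratee[E, A]) extends Iteratee[E, A]

  // Consumer that remembers the last element it has seen.
  def last[E]: Iteratee[E, Option[E]] = {
    def step(acc: Option[E]): Input[E] => Iteratee[E, Option[E]] = {
      case Input.Empty => Cont[E, Option[E]](step(acc))
      case Input.EOF   => Done[E, Option[E]](acc)
      case Input.El(e) => Cont[E, Option[E]](step(Some(e)))
    }
    Cont[E, Option[E]](step(None))
  }

  // Consumer that folds every element into an accumulator.
  def fold[E, A](zero: A)(f: (A, E) => A): Iteratee[E, A] = {
    def step(acc: A): Input[E] => Iteratee[E, A] = {
      case Input.Empty => Cont[E, A](step(acc))
      case Input.EOF   => Done[E, A](acc)
      case Input.El(e) => Cont[E, A](step(f(acc, e)))
    }
    Cont[E, A](step(zero))
  }

  // Toy producer: feeds every element, then EOF, and reads the final state.
  def run[E, A](elements: Iterable[E], it: Iteratee[E, A]): Option[A] = {
    def feed(current: Iteratee[E, A], in: Input[E]): Iteratee[E, A] = current match {
      case Cont(k) => k(in)
      case done    => done // a finished iteratee ignores further input
    }
    val afterElements = elements.foldLeft(it)((current, e) => feed(current, Input.El(e)))
    feed(afterElements, Input.EOF) match {
      case Done(a) => Some(a)
      case _       => None // still waiting for input after EOF
    }
  }
}
```

Each input produces a new immutable iteratee value instead of mutating the previous one, which is the property the slides call "Iteratees are immutable".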

// Before: consuming an Iterable
object AmountConsumer {
  val sum: AmountConsumer[Amount] =
    (rows: Iterable[Amount]) => rows.foldLeft(Amount(0, EUR))(_ + _)
}

// After: consuming with an Iteratee
object AmountConsumer {
  val sum: AmountConsumer[Amount] =
    Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)
}

import RowTransformer.totalPaid
import AmountConsumer.sum

val totalPaidComputation: Future[Amount] = portfolio.rows.run(sum)

// Same computation written with the |>>> operator
val totalPaidComputation: Future[Amount] = portfolio.rows |>>> sum

Enumeratee: map and filter

[Diagram: the Enumerator (producer) emits Input[A]; the Enumeratee (transformation) turns it into Input[B] for the Iteratee (consumer), which answers with a Status]

Data transformation

object RowTransformer {
  val totalPaid =
    Enumeratee.map[Row](row =>
      row.interests + row.amortization
    )
}

Data filtering

def until(date: DateMidnight) = Enumeratee.filter[Row](
  row => !row.date.isAfter(date)
)

// Before (Stream API)
type RowProducer = Iterable[Row]
type RowTransformer[T] = (Row=>T)
type AmountConsumer[T] = (Iterable[Amount]=>T)

// After (iteratees)
type RowProducer = Enumerator[Row]
type RowTransformer[T] = Enumeratee[Row, T]
type AmountConsumer[T] = Iteratee[Amount, T]

Futures are composable

map, flatMap, filter, onComplete, onSuccess, onFailure, recover
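
A short sketch of Future composition with the standard library only. The `totalPaid` function and its fixed result are hypothetical placeholders for an asynchronous loan computation; `onSuccess` and `onFailure` are left out because they were removed from later Scala versions.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureCompositionSketch {
  // Hypothetical asynchronous computation: total paid for one loan, in euros.
  def totalPaid(loanId: Int): Future[BigDecimal] = Future {
    if (loanId < 0) throw new IllegalArgumentException(s"unknown loan $loanId")
    BigDecimal(1150)
  }

  // map / sequence: combine several asynchronous results without blocking.
  def portfolioTotal: Future[BigDecimal] =
    Future.sequence(List(1, 2, 3).map(totalPaid)).map(_.sum)

  // recover: turn a failed computation into a fallback value.
  def withFallback: Future[BigDecimal] =
    totalPaid(-1).recover { case _: IllegalArgumentException => BigDecimal(0) }

  // Blocking is only needed at the very edge, for example in a test.
  def await[T](future: Future[T]): T = Await.result(future, 5.seconds)
}
```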

// Produce rows
val totalPaidComputation: Future[Amount] = portfolio.rows &> totalPaid |>>> sum

// Blocking the thread to wait for the result
val totalPaid =
  Await.result(
    totalPaidComputation,
    atMost = defaultTimeout)

totalPaid should equal(3480 EUR)

We still have function composition, and the code is now prepared for asynchronism

RowProducer

// Loan
Enumerator.enumerate(Stream.iterate(first)(f).take(duration))

// Portfolio
Enumerator.interleave(loans.map(_.rows))

+ on demand computation
+ parallel computation

RowTransformer

val totalPaid = Enumeratee.map[Row](row =>
  row.interests + row.amortization
)

+ function composition
+ map, filter, ...

AmountConsumer

def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)

+ function composition
+ asynchronism

5000 loans, 50 rows

Step 1: Stream API   ~ 560 ms
Step 2: Iteratees    ~ 3500 ms ?

simple test

complex test: Thread.sleep(((Math.random() * 1000) % 2) toLong)

5000 loans, 50 rows

                     simple test   with pause
Step 1: Stream API   ~ 560 ms      ~ 144900 ms
Step 2: Iteratees    ~ 3500 ms     ~ 157285 ms

?

The cost of using this implementation of iteratees is greater than the gain from interleaving for such small operations

Bulk interleaving

// Portfolio
val split = loans.map(_.stream).grouped(loans.size / 4)
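
The idea of bulk interleaving can be sketched with plain Futures instead of Enumerators: loans are split into a few chunks, each chunk is processed sequentially, and only the chunks run in parallel. This is an illustration under assumptions, not the code of the talk: a loan is modeled as the list of amounts it pays, and `totalPaid` and `chunks` are hypothetical names.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object BulkSketch {
  // Hypothetical stand-in for a loan: the amounts paid each year, in euros.
  type Loan = List[BigDecimal]

  def totalPaid(loans: Seq[Loan], chunks: Int): Future[BigDecimal] = {
    // Integer division: when the size is not a multiple of `chunks`,
    // grouped yields one extra, smaller group.
    val chunkSize = math.max(1, loans.size / chunks)

    val partialSums: List[Future[BigDecimal]] =
      loans.grouped(chunkSize).toList.map { group =>
        // Inside a chunk the rows are summed sequentially, so the
        // scheduling overhead is paid once per chunk and not once per loan.
        Future(group.iterator.flatten.sum)
      }

    Future.sequence(partialSums).map(_.sum)
  }
}
```

The trade-off is the one measured below: fewer, larger units of work reduce the coordination overhead while still allowing parallel execution.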

5000 loans, 50 rows

                     simple test   with pause
Step 1: Stream API   ~ 560 ms      ~ 144900 ms
Step 2: Iteratees    ~ 4571 ms     ~ 39042 ms

Pros
+ On demand computation
+ Function composition
+ Parallel computation
+ Asynchronism

Cons
- No error management
- No elasticity
- No resilience

Step 3: Akka actor

Integrating Akka

libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-actor" % "2.2.0")

Actors are objects

They communicate with each other by messages, asynchronously

case class Compute(loan: Loan)

class Backend extends Actor {
  def receive = {
    // Computes all the rows of the loan and sends them back to the sender
    case Compute(loan) =>
      sender.tell(msg = loan.stream.toList, sender = self)
  }
}

case class Loan( ... ) {
  def rows(implicit calculator: ActorRef, ctx: ExecutionContext) = {
    val responseFuture = ask(calculator, Compute(this))
    val rowsFuture = responseFuture.mapTo[List[Row]]
    rowsFuture.map(Enumerator.enumerate(_))
  }
}

val system = ActorSystem.create("ScalaIOSystem")

val calculator = system.actorOf(
  Props[Backend].withRouter(RoundRobinRouter(nrOfInstances = 10)),
  "calculator")

Supervision

val simpleStrategy = OneForOneStrategy() {
  case _: AskTimeoutException => Resume
  case _: RuntimeException => Escalate
}

system.actorOf(Props[Backend]....withSupervisorStrategy(simpleStrategy)), "calculator")

[Diagram: inside the actor system, a Router dispatches Compute messages to Routee 1, Routee 2 and Routee 3; when a routee fails with an AskTimeoutException, it is resumed]

RowProducer

// Loan
ask(calculator, Compute(this))
  .mapTo[List[Row]]
  .map(Enumerator.enumerate(_))

// Portfolio
Enumerator.interleave(loans.map(_.rows))

+ parallel computation
- on demand computation

RowTransformer

val totalPaid = Enumeratee.map[Row](row =>
  row.interests + row.amortization
)

+ Nothing changed

AmountConsumer

def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)

+ Nothing changed

5000 loans, 50 rows

                     simple test   with pause
Step 1: Stream API   ~ 560 ms      ~ 144900 ms
Step 2: Iteratees    ~ 4571 ms     ~ 39042 ms
Step 3: Akka actor   ~ 4271 ms     ~ 40882 ms

Pros
+ Function composition
+ Parallel computation
+ Asynchronism
+ Error management

Cons
- No on demand computation
- No elasticity
- No resilience

Step 4: Akka cluster

Integrating Akka Cluster

libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-cluster" % "2.2.0")

Cluster Router: ClusterRouterConfig

Can create actors on different nodes of the cluster
Role
Local actors or not
Control the number of actors per node and per system

Cluster Router: AdaptiveLoadBalancingRouter

Collects metrics (CPU, heap, load) via JMX or Hyperic Sigar and balances the load accordingly

// Before: local router
val calculator = system.actorOf(
  Props[Backend].withRouter(RoundRobinRouter(nrOfInstances = 10)),
  "calculator")

// After: cluster-aware router
val calculator = system.actorOf(
  Props[Backend].withRouter(
    ClusterRouterConfig(local = localRouter, settings = clusterSettings)),
  "calculator")

[Diagram: a Router dispatching to routees spread over three actor systems]

Elasticity

application.conf

cluster {
  seed-nodes = [
    "akka.tcp://ScalaIOSystem@127.0.0.1:2551",
    "akka.tcp://ScalaIOSystem@127.0.0.1:2552"]
  auto-down = on
}

[Diagram: the same Router and routees spread over three actor systems]

Resilience

RowProducer

// Loan
ask(calculator, Compute(this))
  .mapTo[List[Row]]
  .map(Enumerator.enumerate(_))

// Portfolio
Enumerator.interleave(loans.map(_.rows))

+ Nothing changed

RowTransformer

val totalPaid = Enumeratee.map[Row](row =>
  row.interests + row.amortization
)

+ Nothing changed

AmountConsumer

def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)

+ Nothing changed

Pros
+ Function composition
+ Parallel computation
+ Asynchronism
+ Error management
+ Elasticity
+ Resilience

Cons
- No on demand computation
- Network serialization

5000 loans, 50 rows

                                            simple test   with pause
Step 1: Stream API                          ~ 560 ms      ~ 144900 ms
Step 2: Iteratees                           ~ 4571 ms     ~ 39042 ms
Step 3: Akka actor                          ~ 4271 ms     ~ 40882 ms
Step 4: Akka cluster (1 node / 2 actors)    ~ 6213 ms     ~ 77957 ms
Step 4: Akka cluster (2 nodes / 4 actors)   ~ 5547 ms     ~ 39695 ms

Conclusion

Step 1: Stream API
powerful library
low memory
performance when single threaded

Step 2: Iteratees
elegant API
enable asynchronism and parallelism

Step 3: Akka actor
error management
control on parallel execution via configuration

Step 4: Akka cluster
elasticity
resilience
monitoring

It’s all about trade-offs

But do you really need distribution?

Hot subject

Recent blog post from «Mandubian» on Scalaz stream machines and iteratees [1]

Recent presentation from «Heather Miller» on spores (distributable closures) [2]

Recent release of Scala 2.10.3 and performance optimization of Promise

Release candidate of play-iteratee module with performance optimization

Lots of stuff in the roadmap of Akka cluster 2.3.0

Thank you for watching!