Agenda
A bit of Tradeshift history
Typical sources of failures
Event sourcing
Actors, Akka, and Clustering
Example use case: collaboration
Tradeshift in 2011
Frontend
Backend
x10
Postgresql
Two major software components
the frontend
the backend
<10 developers
Tradeshift in 2016
Frontend
Backend
SFTP
Public API
App Backend (old)
App Backend (new)
Workflow
Payments (old)
Payments (new)
Conversions
Others...
x150
30 deployed components (and growing)
150 developers
250,000 LoC in the backend
Scaling of Tradeshift's systems
Moore's law applies to AWS
Single point of not quite failing often enough
2016 directive:
All new components must be clustered
Yeah, what about the 30-ish existing ones?
New architecture is needed
Scaling of Tradeshift's development process
2011: "We're a Java shop"
2016: Not really, at least not anymore
Groovy and Grails
Python, Go, Ruby for infrastructure
Crazy JavaScript people
Scala
But still, mostly Java
Empower teams to pick their own language and frameworks
Typical sources of failures
Frontend, Backend, Public API, App Backend (old)
We're down!
We overloaded the database
which caused the backend to respond slowly
which caused the frontend to respond slowly
which caused our users' web browsers to respond slowly
which caused our users to reload their page
GOTO 10
Enter the buzzwords
Let it crash [2003, Armstrong]
Micro-services [2005, Rodgers]
Self-contained systems [2015, scs-architecture.org]
Service 2
Service 1
Self-contained systems
No outgoing calls while handling an incoming request (except to our own databases)
All inter-service communication must be asynchronous
This implies data replication
No single points of failureSystem must be clustered
Design must trivially scale to 10x expected load
Event sourcing
User #1
Created
Name changed to Alice
Pet added: Gary the Goldfish
Pet removed: Gary the Goldfish
User #2
Created
Name changed to Bob
Pet added: Charlie the Cat
System considers an append-only event journal the only source of truth
An aggregate is one unit of information to which (and only which) an event atomically applies
Events have a guaranteed order, but only within an aggregate
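The User #1 and User #2 example can be sketched in plain Java. This is a minimal in-memory illustration, not Akka and not Tradeshift's code: all class names (UserEvent, UserState, Journal, EventSourcingDemo) are hypothetical, and a real journal would be a durable store. The point it shows is that state is never stored directly; it is rebuilt by replaying one aggregate's events in order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical event types for the User aggregate from the example.
interface UserEvent {}
final class Created implements UserEvent {}
final class NameChanged implements UserEvent {
    final String name;
    NameChanged(String name) { this.name = name; }
}
final class PetAdded implements UserEvent {
    final String pet;
    PetAdded(String pet) { this.pet = pet; }
}
final class PetRemoved implements UserEvent {
    final String pet;
    PetRemoved(String pet) { this.pet = pet; }
}

// Current state of one aggregate; only ever derived from its events.
final class UserState {
    String name = "";
    final Set<String> pets = new TreeSet<>();

    void apply(UserEvent event) {
        if (event instanceof NameChanged) {
            name = ((NameChanged) event).name;
        } else if (event instanceof PetAdded) {
            pets.add(((PetAdded) event).pet);
        } else if (event instanceof PetRemoved) {
            pets.remove(((PetRemoved) event).pet);
        }
        // Created carries no data in this sketch.
    }
}

// In-memory stand-in for an append-only journal, ordered per aggregate only.
final class Journal {
    private final Map<String, List<UserEvent>> eventsByAggregate = new HashMap<>();

    void append(String aggregateId, UserEvent event) {
        eventsByAggregate.computeIfAbsent(aggregateId, id -> new ArrayList<>()).add(event);
    }

    List<UserEvent> eventsFor(String aggregateId) {
        return Collections.unmodifiableList(
            eventsByAggregate.getOrDefault(aggregateId, Collections.emptyList()));
    }

    // Rebuilds state by replaying the aggregate's events in order.
    UserState replay(String aggregateId) {
        UserState state = new UserState();
        for (UserEvent event : eventsFor(aggregateId)) {
            state.apply(event);
        }
        return state;
    }
}

final class EventSourcingDemo {
    static Journal exampleJournal() {
        Journal journal = new Journal();
        journal.append("user-1", new Created());
        journal.append("user-1", new NameChanged("Alice"));
        journal.append("user-1", new PetAdded("Gary the Goldfish"));
        journal.append("user-1", new PetRemoved("Gary the Goldfish"));
        journal.append("user-2", new Created());
        journal.append("user-2", new NameChanged("Bob"));
        journal.append("user-2", new PetAdded("Charlie the Cat"));
        return journal;
    }

    public static void main(String[] args) {
        Journal journal = exampleJournal();
        UserState alice = journal.replay("user-1");
        UserState bob = journal.replay("user-2");
        System.out.println(alice.name + " " + alice.pets); // Alice []
        System.out.println(bob.name + " " + bob.pets);     // Bob [Charlie the Cat]
    }
}
```

Each aggregate's list is independent of the others, which is why ordering only needs to hold within an aggregate.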
Event sourcing
Nice scalability properties
Each aggregate can process changes independently
All information that spans >1 aggregate is materialized using event listeners
Traditionally only applied inside a system
Synchronous APIs only ("get customer history")
Why not expose event stream itself?
Eventual consistency
Latency implications
Security implications
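The point about materializing information that spans more than one aggregate can be sketched as an event listener that maintains a read-side view. This is a plain-Java illustration under assumptions: PetAddedEvent, NotifyingJournal, PetOwnerView and MaterializationDemo are hypothetical names, and the listener is called synchronously here, whereas in a real system it would run after the fact, which is where eventual consistency comes from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Hypothetical event: some user aggregate gained a pet.
final class PetAddedEvent {
    final String pet;
    PetAddedEvent(String pet) { this.pet = pet; }
}

// Journal stand-in that notifies listeners after each append.
final class NotifyingJournal {
    private final List<BiConsumer<String, PetAddedEvent>> listeners = new ArrayList<>();

    void subscribe(BiConsumer<String, PetAddedEvent> listener) {
        listeners.add(listener);
    }

    void append(String aggregateId, PetAddedEvent event) {
        // A real journal would persist the event before notifying anyone.
        for (BiConsumer<String, PetAddedEvent> listener : listeners) {
            listener.accept(aggregateId, event);
        }
    }
}

// Read-side view spanning all user aggregates: pet name -> owning user id.
final class PetOwnerView {
    private final Map<String, String> ownerByPet = new TreeMap<>();

    void onEvent(String aggregateId, PetAddedEvent event) {
        ownerByPet.put(event.pet, aggregateId);
    }

    String ownerOf(String pet) {
        return ownerByPet.get(pet);
    }
}

final class MaterializationDemo {
    static PetOwnerView exampleView() {
        NotifyingJournal journal = new NotifyingJournal();
        PetOwnerView view = new PetOwnerView();
        journal.subscribe(view::onEvent);
        journal.append("user-1", new PetAddedEvent("Gary the Goldfish"));
        journal.append("user-2", new PetAddedEvent("Charlie the Cat"));
        return view;
    }

    public static void main(String[] args) {
        System.out.println(exampleView().ownerOf("Charlie the Cat")); // user-2
    }
}
```

The view answers a question ("who owns this pet?") that no single aggregate can answer, without the aggregates having to coordinate.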
Implementation
The actor model
An actor is an entity that responds only to messages, by
sending messages to other actors
creating other (child) actors
adjusting its behaviour
Akka is a toolkit for writing actors in Java
Actor is a normal Java class that extends UntypedActor or AbstractActor
Message is an immutable, serializable, Java class
Parent actor is the supervisor of its child actors. On child actor failure, parent decides what to do:
Restart child
Stop child
Escalate
Actor ping pong
public class PongActor extends UntypedActor {
    public void onReceive(Object message) {
        if (message instanceof String) {
            System.out.println("In PongActor - received message: " + message);
            getSender().tell("pong", getSelf());
        }
    }
}
Actor ping pong
public class Initialize {}

public class PingActor extends AbstractActor {
    private int counter = 0;
    private ActorRef pongActor = getContext().actorOf(Props.create(PongActor.class), "pongActor");

    {
        receive(ReceiveBuilder
            .match(Initialize.class, msg -> {
                System.out.println("In PingActor - starting ping-pong");
                pongActor.tell("ping", getSelf());
            })
            .match(String.class, msg -> {
                System.out.println("In PingActor - received message: " + msg);
                counter += 1;
                if (counter == 3) {
                    getContext().system().shutdown();
                } else {
                    getSender().tell("ping", getSelf());
                }
            })
            .build());
    }
}
Actor ping pong
public static void main(String[] args) {
    ActorSystem system = ActorSystem.create();
    ActorRef pingActor = system.actorOf(Props.create(PingActor.class));
    // Sent from outside any actor, so there is no sender
    pingActor.tell(new Initialize(), ActorRef.noSender());
}
Output:
In PingActor - starting ping-pong
In PongActor - received message: ping
In PingActor - received message: pong
In PongActor - received message: ping
In PingActor - received message: pong
In PongActor - received message: ping
In PingActor - received message: pong
Akka persistence
Framework to do event sourcing using actors
Persistence plugins for LevelDB, Cassandra, Kafka, ...
Each PersistentActor has a String identifier, under which events are stored
public class ChatActor extends AbstractPersistentActor {
    private final List<String> messages = new ArrayList<>();

    @Override
    public String persistenceId() { return "chat-1"; }

    private void postMessage(String msg) {
        persist(msg, evt -> {
            messages.add(msg);
            sender().tell(Done.getInstance(), self());
        });
    }

    private void getMessageList() {
        sender().tell(new ArrayList<>(messages), self());
    }

    // ...
}
Akka persistence
public class ChatActor extends AbstractPersistentActor {
    private final List<String> messages = new ArrayList<>();

    private void postMessage(String msg) { /* ... */ }
    private void getMessageList() { /* ... */ }

    @Override
    public String persistenceId() { return "chat-1"; }

    // Replays journaled events on startup to rebuild the in-memory state
    @Override
    public PartialFunction<Object,BoxedUnit> receiveRecover() {
        return ReceiveBuilder
            .match(String.class, messages::add)
            .build();
    }

    // Handles live commands
    @Override
    public PartialFunction<Object,BoxedUnit> receiveCommand() {
        return ReceiveBuilder
            .matchEquals("/list", msg -> getMessageList())
            .match(String.class, this::postMessage)
            .build();
    }
}
Akka remoting and clustering
Transparently lets actors communicate between systems
ActorRef can point to a remote actor
Messages must be serializable (using configurable mechanisms)
akka {
  actor {
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      hostname = "127.0.0.1"
      port = 2552
    }
  }
  cluster {
    seed-nodes = [
      "akka.tcp://[email protected]:2551",
      "akka.tcp://[email protected]:2552"]
  }
}
Akka cluster sharding
Dynamically distributes a group of actors across an Akka cluster
MessageExtractor informs cluster sharding where a particular message should go
class ChatMessage {
    UUID conversation;
    String msg;

    ChatMessage(UUID conversation, String msg) {
        this.conversation = conversation;
        this.msg = msg;
    }
}

class MyMessageExtractor implements ShardRegion.MessageExtractor {
    private final int numberOfShards = 256;

    @Override
    public String entityId(Object command) {
        return ChatMessage.class.cast(command).conversation.toString();
    }

    @Override
    public String shardId(Object command) {
        // floorMod keeps the shard number in [0, numberOfShards) even when hashCode() is negative
        return String.valueOf(Math.floorMod(entityId(command).hashCode(), numberOfShards));
    }

    @Override
    public Object entityMessage(Object command) {
        return ChatMessage.class.cast(command).msg;
    }
}
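One detail worth spelling out when deriving a shard id from hashCode(): Java's % operator can return a negative result, because hashCode() can be negative. Negative shard names would still be valid strings, but they would roughly double the number of distinct shards. A small stand-alone sketch, with the hypothetical helper class ShardIds, showing the mapping with Math.floorMod:

```java
import java.util.UUID;

final class ShardIds {
    // Maps an entity id onto one of numberOfShards shard names.
    static String shardId(String entityId, int numberOfShards) {
        // floorMod keeps the result in [0, numberOfShards) even for negative hash codes.
        return String.valueOf(Math.floorMod(entityId.hashCode(), numberOfShards));
    }

    public static void main(String[] args) {
        String conversation = UUID.fromString("67c67d28-4719-4bf9-bfe6-3944ed961a60").toString();
        System.out.println("conversation " + conversation + " lives on shard "
            + shardId(conversation, 256));
    }
}
```

The same entity id always maps to the same shard, which is what lets cluster sharding route every message for one conversation to the same actor.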
Akka cluster sharding
ShardRegion proxy sits between client and real (remote) persistent actor
Persistent actor names will be their persistence id
public class ChatActor extends AbstractPersistentActor {
    // ...
    @Override
    public String persistenceId() { return getSelf().path().name(); }
}

ActorRef proxy = ClusterSharding.get(system).start(
    "conversations",
    Props.create(ChatActor.class),
    ClusterShardingSettings.create(system),
    new MyMessageExtractor());

// Sent from outside any actor, so there is no sender
proxy.tell(
    new ChatMessage(UUID.fromString("67c67d28-4719-4bf9-bfe6-3944ed961a60"), "hello!"),
    ActorRef.noSender());
Putting it all together
2015: Let's try it out first
2016: Collaboration in production
Real-time text message exchange between employees
Text interface to automated travel agent
In development: documents, FTP
Stuff that works well for us: https://github.com/Tradeshift/ts-reaktive/
ts-reaktive-actors: Persistent actor base classes with reasonable defaults, and HTTP API for event journal
ts-reaktive-marshal: Non-blocking streaming marshalling framework for Akka Streams
ts-reaktive-replication: Master-slave replication across data centers for persistent actors
A bit of refactoring
Introduce a single base class for all commands
public abstract class ChatCommand {}

public class GetMessageList extends ChatCommand {}

public class PostMessage extends ChatCommand {
    private final String message;

    public PostMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}
A bit of refactoring
Introduce a single base class for all events
public abstract class ChatEvent {}

public class MessagePosted extends ChatEvent {
    private final String message;

    public MessagePosted(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}
Independent state class
public class ChatState extends AbstractState<ChatEvent, ChatState> {
    public static final ChatState EMPTY = new ChatState(Vector.empty());

    private final Seq<String> messages;

    private ChatState(Seq<String> messages) {
        this.messages = messages;
    }

    // Returns a new state; the existing one is never modified
    @Override
    public ChatState apply(ChatEvent event) {
        if (event instanceof MessagePosted) {
            return new ChatState(messages.append(
                MessagePosted.class.cast(event).getMessage()));
        } else {
            return this;
        }
    }
}
Plain old Java class, hence easily unit testable (compared to actor)
Stateful persistent actor
public class ChatActor extends AbstractStatefulPersistentActor<
        ChatCommand, ChatEvent, ChatState> {

    public static abstract class Handler
        extends AbstractCommandHandler<ChatCommand, ChatEvent, ChatState> { }

    @Override
    protected ChatState initialState() {
        return ChatState.EMPTY;
    }

    @Override
    protected PartialFunction<ChatCommand,Handler> applyCommand() {
        return new PFBuilder<ChatCommand,Handler>()
            .match(ChatCommand.GetMessageList.class, cmd ->
                new GetMessageListHandler(getState(), cmd))
            .match(ChatCommand.PostMessage.class, cmd ->
                new PostMessageHandler(getState(), cmd))
            .build();
    }
}
Business logic pushed out to ChatState and *Handler
Command handler example
public class PostMessageHandler extends ChatActor.Handler {
    @Override
    public Seq<ChatEvent> getEventsToEmit() {
        return Vector.of(new ChatEvent.MessagePosted(
            ChatCommand.PostMessage.class.cast(cmd).getMessage()));
    }

    @Override
    public Object getReply(Seq<ChatEvent> emittedEvents, long lastSequenceNr) {
        return Done.getInstance();
    }
}
Plain old Java class, hence easily unit testable (compared to actor)
Behaviour of different commands separated into different classes
Event journal over HTTP
We wanted easy consumption of the event log by other systems
HTTP chunked encoding (1 chunk per event), without completion
GET /events?since=1473689920000
200 OK
Content-Type: application/protobuf
Transfer-Encoding: chunked

14
(14 bytes with the first protobuf event)
11
(11 bytes with the second protobuf event)
(TCP stream stalls here)
Additional events can arrive in real time
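How a consumer might pull one event per chunk out of such a response can be sketched in plain Java. Normally an HTTP client library does this decoding; the hypothetical ChunkedEventReader below only illustrates the framing. Note that on the wire HTTP/1.1 chunk sizes are written in hexadecimal, so a 14-byte event is framed as "e" and an 11-byte event as "b". The sketch treats chunk bodies as opaque bytes and does not parse protobuf.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ChunkedEventReader {
    // Reads chunks until the stream ends or a zero-size chunk arrives,
    // handing each chunk body (one event) to the callback as soon as it is complete.
    static void readChunks(InputStream in, Consumer<byte[]> onEvent) throws IOException {
        while (true) {
            String sizeLine = readLine(in);
            if (sizeLine == null) return;                  // stream closed
            int semicolon = sizeLine.indexOf(';');         // ignore chunk extensions
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) return;                         // terminating chunk
            byte[] body = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(body, read, size - read);
                if (n < 0) throw new IOException("stream ended inside a chunk");
                read += n;
            }
            readLine(in);                                  // consume CRLF after the body
            onEvent.accept(body);
        }
    }

    // Convenience wrapper for an already-received byte array.
    static List<byte[]> decode(byte[] wire) {
        List<byte[]> events = new ArrayList<>();
        try {
            readChunks(new ByteArrayInputStream(wire), events::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return events;
    }

    // Frames one event the way a chunked response would: hex size, CRLF, body, CRLF.
    static byte[] frame(byte[] body) {
        byte[] head = (Integer.toHexString(body.length) + "\r\n").getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[head.length + body.length + 2];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(body, 0, out, head.length, body.length);
        out[out.length - 2] = '\r';
        out[out.length - 1] = '\n';
        return out;
    }

    // Reads up to the next line feed; returns null at end of stream.
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) >= 0) {
            if (b == '\n') return stripCr(line);
            line.write(b);
        }
        return line.size() == 0 ? null : stripCr(line);
    }

    private static String stripCr(ByteArrayOutputStream line) {
        String s = new String(line.toByteArray(), StandardCharsets.US_ASCII);
        return s.endsWith("\r") ? s.substring(0, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        byte[] wire = frame("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(wire).size() + " event(s) decoded");
    }
}
```

Against a live connection, readChunks would simply block in read() where the slide says the TCP stream stalls, and continue when the next event arrives.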
Use case: collaboration
Presentation (nodejs)
Workflow (java)
Content (java + akka)
Cassandra (journal)
PostgreSQL (materialization)
Content backend has API to add messages to conversations
Messages go into the event journal
Journal is queryable over HTTP
Presentation backend listens to event stream
Materializes into views that are UI dependent
Can combine other sources as well
Browser talks to both Presentation and Content backends
Web socket stream informs browser of incoming messages
Wrap-up
Scalable systems: check
Scalable development: check
We're not quite there yet
Akka, Cassandra and the reaktive combo in active development
Attitude of "I've done Spring for 10+ years successfully, why would I learn this?"
The proof is in more pudding
Want to get involved?
Get: http://akka.io/
Read: http://doc.akka.io/docs/akka/current/java.html
Chat: https://gitter.im/akka/akka
Hack: https://github.com/Tradeshift/ts-reaktive/ and https://github.com/jypma/ts-reaktive-examples/
Extra slides
Introducing Akka Streams
Graph is a blueprint for a closed, finite network of stages, which communicate by streaming elements
GraphStage<S extends Shape> is one processing stage within a graph, taking elements in through zero or more Inlets, and emitting through Outlets
It's completely up to the stage when and how to respond to arriving elements
All built-in graph stages embrace backpressure and bounded processing
Mostly used graph stages
Source<T, M> has one outlet of type T
Sink<T, M> has one inlet of type T
Flow<A, B, M> has one inlet of type A and one outlet of type B
RunnableGraph<M> has no inlets or outlets
Reactive streams
Akka Streams is a Reactive Streams implementation (just like RxJava and others)
You typically don't interact in terms of publisher and subscriber directly
Hello, streams
final ActorSystem system = ActorSystem.create("QuickStart");
final Materializer materializer = ActorMaterializer.create(system);

final Source<Integer, NotUsed> numbers = Source.range(1, 100);

final Sink<Integer, CompletionStage<Done>> print =
    Sink.foreach(i -> System.out.println(i));

final CompletionStage<Done> done = numbers.runWith(print, materializer);

// Output:
// 1
// 2
// ...
Stream materialization
Graph is only a blueprint: nothing runs until it's given to a materializer, typically ActorMaterializer
All graph stages are generic in their materialized type M
Graph can be materialized (run, runWith) more than once
class Source<T, M> {
    // A graph which materializes into this source's M (ignoring the sink's M2)
    public <M2> RunnableGraph<M> to(Sink<T,M2> sink);

    // Materializes, and returns the M2 of the sink (ignoring this source's M)
    public <M2> M2 runWith(Sink<T, M2> sink, Materializer m) { ... }

    // A graph which materializes into the result of applying [combine] to
    // this source's M and the sink's M2
    public <M2, MR> RunnableGraph<MR> toMat(Sink<T,M2> sink,
                                            Function2<M,M2,MR> combine);
}

class RunnableGraph<M> {
    public M run(Materializer m);
}
Reusable pieces
Source, Sink and Flow are all normal, immutable objects, so they're ideal to be constructed in reusable factory methods:
public Sink<String, CompletionStage<IOResult>> lineSink(String filename) {
    Sink<ByteString, CompletionStage<IOResult>> file = FileIO.toPath(Paths.get(filename));

    // Let's start with some strings
    return Flow.of(String.class) // Flow<String, String, NotUsed>
        // Convert them into bytes (UTF-8), adding a newline
        // We now have a Flow<String, ByteString, NotUsed>
        .map(s -> ByteString.fromString(s + "\n"))
        // Send them into a file, and we want the IOResult of the
        // FileIO sink as materialized value of our own sink
        .toMat(file, Keep.right());
}

// numbers emits Integers, so convert to String before feeding the line sink
numbers.map(i -> i.toString()).runWith(lineSink("numbers.txt"), materializer);
Time-based processing
final Source<Integer, NotUsed> numbers = Source.range(1, 100000000);

final Sink<Integer, CompletionStage<Done>> print =
    Sink.foreach(i -> System.out.println(i));

final CompletionStage<Done> done = numbers
    .throttle(1, Duration.create(1, TimeUnit.SECONDS), 1, ThrottleMode.shaping())
    .runWith(print, materializer);
This does what you expect: print one message per second
No OutOfMemoryError, Akka buffers only as needed: backpressure
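Backpressure in Reactive Streams rests on demand signalling: a subscriber asks for n elements, and the publisher emits no more than that. The sketch below shows the idea with the JDK's java.util.concurrent.Flow interfaces (Java 9+), which mirror the Reactive Streams interfaces. It is not Akka and not how Akka Streams is implemented; RangePublisher and SlowSubscriber are hypothetical names, and the code is single-threaded and skips parts of the specification (for example, signalling an error on a non-positive request).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

// Publisher of the integers [from, to], emitting only as many as were requested.
final class RangePublisher implements Flow.Publisher<Integer> {
    private final int from;
    private final int to;
    int emitted = 0; // exposed for the demo: how many elements have left the publisher

    RangePublisher(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
        subscriber.onSubscribe(new Flow.Subscription() {
            private int next = from;
            private long demand = 0;
            private boolean running = false; // guards against re-entrant request() calls
            private boolean done = false;

            @Override
            public void request(long n) {
                if (done) return;
                demand += n;
                if (running) return;
                running = true;
                // Emit only while the subscriber has outstanding demand.
                while (demand > 0 && next <= to && !done) {
                    demand--;
                    emitted++;
                    subscriber.onNext(next++);
                }
                running = false;
                if (next > to && !done) {
                    done = true;
                    subscriber.onComplete();
                }
            }

            @Override
            public void cancel() {
                done = true;
            }
        });
    }
}

// Subscriber that never asks on its own; the consumer pulls when it is ready.
final class SlowSubscriber implements Flow.Subscriber<Integer> {
    final List<Integer> received = new ArrayList<>();
    boolean completed = false;
    private Flow.Subscription subscription;

    @Override public void onSubscribe(Flow.Subscription s) { subscription = s; }
    @Override public void onNext(Integer item) { received.add(item); }
    @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
    @Override public void onComplete() { completed = true; }

    void pull(long n) { subscription.request(n); }
}

final class BackpressureDemo {
    public static void main(String[] args) {
        RangePublisher numbers = new RangePublisher(1, 100000000);
        SlowSubscriber subscriber = new SlowSubscriber();
        numbers.subscribe(subscriber);
        subscriber.pull(3);
        System.out.println(subscriber.received + ", emitted so far: " + numbers.emitted);
        // prints: [1, 2, 3], emitted so far: 3
    }
}
```

Even though the range holds a hundred million numbers, only three are ever produced, so nothing piles up in memory while the consumer is slow.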
Example Sources
Materialize as ActorRef
Source<T, ActorRef> s = Source.actorRef(10, OverflowStrategy.fail());
Materialize as reactive Subscriber<T>
Source<T, Subscriber<T>> s = Source.asSubscriber();
Read from a reactive Publisher<T> p
Source<T, NotUsed> s = Source.fromPublisher(p);
Emit the same element regularly
Source<T, Cancellable> s = Source.tick(duration, duration, element);
Example Sinks
Send to ActorRef
Sink<T, NotUsed> s = Sink.actorRef(target, "done");
Materialize as reactive Publisher<T>
Sink<T, Publisher<T>> s = Sink.asPublisher(WITH_FANOUT);
Materialize into a CompletionStage of a java.util.List of all elements
Sink<T, CompletionStage<List<T>>> s = Sink.seq();
Example source and flow operators
Send Source<String, M> src to an additional Sink<String> sink
Source<String, M> s = src.alsoTo(sink);
Process batches of concatenated strings, but only if coming in too fast
Source<String, M> s = src.batchWeighted(1000, str -> (long) str.length(), str -> str, (s1, s2) -> s1 + s2);
Process 1 second's worth of elements at a time, but at most 100
Source<List<String>, M> s = src.groupedWithin(100, Duration.create(1, SECONDS));
Invoke a CompletionStage for each element, and resume with the results in order
CompletionStage<Integer> process(String s) { ... }
// First argument: maximum number of CompletionStages in flight (4 is an arbitrary choice)
Source<Integer, M> s = src.mapAsync(4, this::process);