Asynchronous Design with Spring and RTI: 1M Events Per Second

© 2014 SpringOne 2GX. All rights reserved. Do not distribute without permission.

Asynchronous Design: 1M events per second – with Spring

By Stuart Williams

Bio

Stuart Williams

• Software Engineer at Pivotal – RTI project lead

@pidster

What is this all about?

• We built a product using Pivotal products

• Learned some lessons

• We found a few limitations & some room for improvement…

• … but we addressed them & now things go faster.

A lot faster.

Dogfood

• Built with Spring IO Platform
  • Boot, Data, Integration, Reactor, AMQP, SpEL, Shell (and a little Groovy)

• GemFire

• RabbitMQ

[Slide graphic: logos for Spring Framework, Spring Data, Spring Integration, Spring Reactor, Spring AMQP, Spring HATEOAS, Spring Boot and Groovy]

Questions for you

• Heard of Spring Integration?

• Tried it?

• In production?

• Heard of Reactor?

• Tried it?

• In production?

RTI

What is RTI?

• RTI == ‘Real Time Intelligence’
  • Stream processing application
  • Location based services
  • Analytics (e.g. network health)
• Telecom network data
  • ‘Control plane’ is metadata
  • ‘User plane’ is actual data (30x more)
• Rich data model

Input Data Rates

RTI*

• 100k/s average

• 120k/s daily peak

• 1M/s annual peak

*Control-plane only, user-plane is 20x

Input Data Rates

RTI*

• 100k/s baseline

• >120k/s daily peak

• >1M/s annual peak

*Control-plane only, user-plane is 20x

Twitter**

• 6k/s average

• 9k/s daily peak

• 30k/s large events

**Source: @catehstn, https://twitter.com/catehstn/status/494918021358813184

Load Characteristics

• Low numbers of inbound connections

• High rates, micro-bursts

• Occasional peaks of nearly 2x, rare peaks of 10x

• Variable payload size (200B – 300KB)

• Internal fan-outs multiply event rates

More statistics…

• 100k/s order of magnitude
  • 8,640,000,000 (per day)
  • An Integer based counter will ‘roll over’ in ~2 days
• 400Mbps of raw data (‘control plane’)
  • 10Gbps NICs required to support traffic peaks

• Logging! Any verbose errors can blow a disk away

• Queues backing up == #fail
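
The figures above can be checked with a few lines of arithmetic. This is a minimal sketch in plain Java; the class and method names are hypothetical, and the only inputs are the rates quoted on the slide. Note that the time to overflow depends on the rate each counter actually sees: at a sustained 100k/s a signed 32-bit int wraps after about six hours, while ~2 days corresponds to roughly 12k/s per counter. Either way, a long counter removes the problem.

```java
public class RateArithmetic {

    /** Events produced in one day at a sustained rate (events per second). */
    public static long eventsPerDay(long eventsPerSecond) {
        return eventsPerSecond * 86_400L;
    }

    /** Whole seconds until a counter with the given maximum value wraps. */
    public static long secondsToOverflow(long maxValue, long eventsPerSecond) {
        return maxValue / eventsPerSecond;
    }

    /** Average bytes per event implied by a bit rate and an event rate. */
    public static long averageBytesPerEvent(long bitsPerSecond, long eventsPerSecond) {
        return bitsPerSecond / 8 / eventsPerSecond;
    }

    public static void main(String[] args) {
        long rate = 100_000L; // baseline rate from the slide

        System.out.println(eventsPerDay(rate));                         // 8640000000
        System.out.println(secondsToOverflow(Integer.MAX_VALUE, rate)); // 21474 (about 6 hours)
        System.out.println(averageBytesPerEvent(400_000_000L, rate));   // 500

        // A long counter at the same rate lasts millions of years.
        System.out.println(secondsToOverflow(Long.MAX_VALUE, rate) / 86_400L / 365L);
    }
}
```

The 500-byte average sits at the low end of the 200B – 300KB payload range quoted earlier, which suggests that most events are small.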

Architecture

[Architecture diagram. Labels: Ingester, Ingest Grid, Distribution, Analytics, AMQP, Metrics, Firehose, HTTP (x2), WAN, Queue]

Architectural Challenges

• Ingest
  • Responsibility
  • Micro-bursts
• Infrastructure considerations
  • Compute
  • Memory
  • Disk
  • Network

Ingester Architecture

• Spring Integration
  • TCP Server
  • Transformer
  • Filters
• Reactor Stream
• GemFire client
• Single process
  • Multiple protocols – different rates & sizes

Analytics Architecture

• Reactor
• SpEL evaluation
• Hundreds of expression calculations per event
• Collate across nodes
• Output to File or AMQP

Distribution Architecture

• Spring Integration
  • Enrichers
  • Filters
• Reactor Stream
• Output to File / AMQP / JDBC
• Membership checks
• LBS, opt-ins

First Ingester solution

Solution #1 – ‘Naïve’ proof of concept

• Build codecs
  • More on this in John Davies’ “Big Data In Memory” talk later today…
• Spring Integration (SI) pipeline
  • TCP Inbound Adapter
  • Transformer
  • Filters
  • Outbound adapter

Solution #1 results

• Message UUID generation was slow!

• SpEL-based method resolution was slow!

• Standard Channels added overhead!

• Single event output was slow!

Ingester Mark 2

Solution #2 – Use interfaces

• Use the IdGenerator interface

• Use specific endpoint interfaces
  • … we’ll come back to SpEL …

• Use a Chain

• Use an Aggregator to build a batch
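
To illustrate the IdGenerator point: `UUID.randomUUID()` draws from a secure random source, which is expensive when an ID is needed for every message. A counter-based strategy avoids that cost. The sketch below is plain Java under stated assumptions: the nested `IdGenerator` interface is a local stand-in with the same single method as Spring's `org.springframework.util.IdGenerator`, and `CounterIdGenerator` is a hypothetical name, not the code used in RTI.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

public class CounterIdGeneratorSketch {

    /** Local stand-in mirroring the single method of Spring's IdGenerator. */
    public interface IdGenerator {
        UUID generateId();
    }

    /** Builds IDs from an incrementing counter instead of random bytes. */
    public static class CounterIdGenerator implements IdGenerator {
        private final AtomicLong counter = new AtomicLong();

        @Override
        public UUID generateId() {
            // High bits are fixed at 0, so IDs are unique only within this JVM instance.
            return new UUID(0L, counter.incrementAndGet());
        }
    }

    public static void main(String[] args) {
        IdGenerator generator = new CounterIdGenerator();
        System.out.println(generator.generateId()); // 00000000-0000-0000-0000-000000000001
        System.out.println(generator.generateId()); // 00000000-0000-0000-0000-000000000002
    }
}
```

The trade-off is that such IDs are predictable and not globally unique, which is acceptable only when message IDs never leave the process or are not used for correlation across nodes.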

Solution #2 results

• IdGenerator helped a lot

• Specific interfaces not recognised!

• Using <int:chain> helped

• Aggregator helped, but is too slow

• <int-ip:tcp-inbound-channel-adapter> is too slow
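
One way to picture why a chain helps: the handlers are invoked one after another with direct method calls, rather than each step handing its message to a separately declared channel. The sketch below shows that idea in plain Java and is not Spring Integration code; `FilterChainSketch`, `applyChain` and the two example filters are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class FilterChainSketch {

    /** Runs every filter in order with direct calls; returns the events that pass all of them. */
    public static <T> List<T> applyChain(List<T> events, List<Predicate<T>> filters) {
        List<T> accepted = new ArrayList<>();
        for (T event : events) {
            boolean keep = true;
            for (Predicate<T> filter : filters) {
                if (!filter.test(event)) {
                    keep = false;
                    break; // stop at the first rejecting filter
                }
            }
            if (keep) {
                accepted.add(event);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Hypothetical filters: drop empty payloads, then drop oversized ones.
        List<Predicate<String>> filters = new ArrayList<>();
        filters.add(s -> !s.isEmpty());
        filters.add(s -> s.length() <= 5);

        List<String> events = Arrays.asList("a", "", "abcdef", "xyz");
        System.out.println(applyChain(events, filters)); // [a, xyz]
    }
}
```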

Many whiteboards later…

Working version

Solution N

[Diagram labels: Message-only Filters, Reference Data Filters, Batcher; protocol inputs GB, IUPS, IUCS, A, Radius/Diameter, 4G]

Working version

• Netty / Reactor TCP

• IdGenerator

• Specific endpoint interfaces

• Chain

• Reactor Stream based batching

• + many improvements & enhancements

Roundup

Spring Improvements

• Performance
  • Spring Integration
  • SpEL
  • Reactor

• Spring XD benefits from these upgrades

Spring Integration

• UUID generator

• MessageBuilderFactory & MutableMessage

• Dispatcher optimisation

• SpEL parser caching

• Counters are ‘long’

• Interfaces used directly – if you’re specific, SI respects that

Spring Expression Language

• Compilation of expressions
  • Configuration options

• SI just re-evaluates expressions

• Trade-offs & limitations

• Much, much faster

SpEL demo

Reactor

• Drop-in replacements
  • Thread pools, dispatchers
• TCP/UDP Server & Client
  • Much faster – lower resource utilisation
• Stream API
  • Batching and other functionality
• More about Reactor
  • Thu, 8.30am “Building Reactive Applications…”

Batching endpoint code
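
As an illustration of the batching idea only, here is a minimal size-based batcher in plain Java. It is not the Reactor Stream code used in RTI; `SizeBatcher`, its constructor parameters and the sink callback are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class SizeBatcher<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private List<T> current;

    public SizeBatcher(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.current = new ArrayList<>(batchSize);
    }

    /** Buffers one event; hands a full batch to the sink once the size limit is reached. */
    public synchronized void accept(T event) {
        current.add(event);
        if (current.size() >= batchSize) {
            flush();
        }
    }

    /** Emits whatever is buffered, e.g. from a timer or at shutdown. */
    public synchronized void flush() {
        if (current.isEmpty()) {
            return;
        }
        List<T> batch = current;
        current = new ArrayList<>(batchSize);
        sink.accept(batch);
    }

    public static void main(String[] args) {
        SizeBatcher<Integer> batcher = new SizeBatcher<>(3, batch -> System.out.println(batch));
        for (int i = 1; i <= 7; i++) {
            batcher.accept(i);
        }
        batcher.flush();
        // prints [1, 2, 3], then [4, 5, 6], then [7]
    }
}
```

Events held in the buffer are lost if the process dies before a flush, which is the data-loss-for-performance trade-off the summary mentions. A production version would typically also flush on a timeout so that slow periods do not leave events waiting.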

Summary

• Spring Integration is much faster

• Good performance means better resource utilisation

• For cloud applications cost savings can be dramatic

• Batching payloads makes a big difference

• Many applications wait on network IO

• Trade-off risk of data loss for performance

• Reactor FTW

Questions

Learn More. Stay Connected

Tweet #rti #s2gx if you’d like to go faster

@pidster

“Big Data in Memory”

John Davies – Trinity 1-2 4.30pm

@springcentral | spring.io/video

Date post:	02-Jul-2015
Category:	Software
Upload:	spring-io