© 2014 SpringOne 2GX. All rights reserved. Do not distribute without permission.
Asynchronous Design: 1M events per second – with Spring
By Stuart Williams
Bio
Stuart Williams
• Software Engineer at Pivotal – RTI project lead
@pidster
What is this all about?
• We built a product using Pivotal products
• Learned some lessons
• We found a few limitations & some room for improvement…
• … but we addressed them & now things go faster. A lot faster.
Dogfood
• Built with Spring IO Platform
  • Boot, Data, Integration, Reactor, AMQP, SpEL, Shell (and a little Groovy)
• GemFire
• RabbitMQ
[Diagram: Spring Framework, Spring Data, Spring Integration, Spring Reactor, Spring AMQP, Spring HATEOAS, Spring Boot, Groovy]
Questions for you
• Heard of Spring Integration?
• Tried it?
• In production?
• Heard of Reactor?
• Tried it?
• In production?
RTI
What is RTI?
• RTI == ‘Real Time Intelligence’
  • Stream processing application
  • Location based services
  • Analytics (e.g. network health)
• Telecom network data
  • ‘Control plane’ is metadata
  • ‘User plane’ is actual data (30x more)
• Rich data model
Input Data Rates
RTI*
• 100k/s baseline
• >120k/s daily peak
• >1M/s annual peak
*Control-plane only, user-plane is 20x
Twitter**
• 6k/s average
• 9k/s daily peak
• 30k/s large events
**Source @catehstn twitter.com/catehstn/status/494918021358813184
Load Characteristics
• Low numbers of inbound connections
• High rates, micro-bursts
• Occasional peaks of nearly 2x, rare peaks of 10x
• Variable payload size (200B – 300KB)
• Internal fan-outs multiply event rates
More statistics…
• 100k/s order of magnitude
  • 8,640,000,000 (per day)
  • An Integer-based counter will ‘roll over’ in ~2 days
• 400Mbps of raw data (‘control plane’)
  • 10Gbps NICs required to support traffic peaks
• Logging! Any verbose errors can blow a disk away
• Queues backing up == #fail
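The figures above can be checked with a little arithmetic. The sketch below is illustrative only: the class and method names are invented for this example, and the 10x multiplier comes from the "rare peaks of 10x" bullet under Load Characteristics. One thing it shows is that at a sustained 100k/s a signed 32-bit counter would wrap after roughly six hours. The ~2 days quoted above would match a per-counter rate of about 12k/s, which is an inference and not something the slide states.

```java
import java.util.Locale;

// Hypothetical helper for the back-of-envelope numbers; names are illustrative.
class RateMath {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    // Events per day at a sustained rate given in events per second.
    static long eventsPerDay(long ratePerSecond) {
        return ratePerSecond * SECONDS_PER_DAY;
    }

    // Seconds until a signed 32-bit counter that starts at 0 reaches Integer.MAX_VALUE.
    static double secondsToIntOverflow(long ratePerSecond) {
        return (double) Integer.MAX_VALUE / ratePerSecond;
    }

    // Mean payload size in bytes implied by a bit rate and an event rate.
    static double meanBytesPerEvent(long bitsPerSecond, long ratePerSecond) {
        return bitsPerSecond / 8.0 / ratePerSecond;
    }

    public static void main(String[] args) {
        long rate = 100_000L;              // events per second
        long rawBitsPerSecond = 400_000_000L; // 400Mbps of control-plane data

        System.out.println(eventsPerDay(rate)); // 8640000000
        System.out.printf(Locale.ROOT, "%.1f hours%n",
                secondsToIntOverflow(rate) / 3600.0); // 6.0 hours
        System.out.printf(Locale.ROOT, "%.0f bytes%n",
                meanBytesPerEvent(rawBitsPerSecond, rate)); // 500 bytes
        // A 10x peak of 400Mbps exceeds what a 1Gbps NIC can carry.
        System.out.printf(Locale.ROOT, "%.1f Gbps%n",
                rawBitsPerSecond * 10 / 1e9); // 4.0 Gbps
    }
}
```

Declaring the counter as `long` pushes the roll-over far beyond any practical uptime at these rates.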
Architecture
[Architecture diagram; labels: Ingester, Ingest Grid, Distribution, Analytics, AMQP, Metrics, Firehose, HTTP (twice), WAN, Queue]
Architectural Challenges
• Ingest
  • Responsibility
  • Micro-bursts
• Infrastructure considerations
  • Compute
  • Memory
  • Disk
  • Network
Ingester Architecture
Ingester
• Spring Integration
  • TCP Server
  • Transformer
  • Filters
• Reactor Stream
• GemFire client
• Single process
  • Multiple protocols – different rates & sizes
Analytics Architecture
Analytics
• Reactor
• SpEL evaluation
• Hundreds of expression calculations per event
• Collate across nodes
• Output to File or AMQP
Distribution Architecture
Distribution
• Spring Integration
  • Enrichers
  • Filters
• Reactor Stream
• Output to File / AMQP / JDBC
• Membership checks
• LBS, opt-ins
First Ingester solution
Solution #1 – ‘Naïve’ proof of concept
• Build codecs
  • More on this in John Davies’ “Big Data In Memory” talk later today…
• Spring Integration (SI) pipeline
  • TCP Inbound Adapter
  • Transformer
  • Filters
  • Outbound adapter
Solution #1 results
• Message UUID generation was slow!
• SpEL-based method resolution was slow!
• Standard Channels added overhead!
• Single event output was slow!
Ingester Mark 2
Solution #2 – Use interfaces
• Use the IdGenerator interface
• Use specific endpoint interfaces
  • … we’ll come back to SpEL …
• Use a Chain
• Use an Aggregator to build a batch
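The IdGenerator bullet can be illustrated with a cheap, counter-based generator. This is a minimal sketch, assuming the cost in Solution #1 came from random UUID generation. The local `IdGenerator` interface is a stand-in that mirrors the single `generateId()` method of Spring's `org.springframework.util.IdGenerator`, so the example compiles without Spring. `CounterIdGenerator` is a hypothetical name, and its ids are only unique within one JVM run.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for org.springframework.util.IdGenerator (same single method),
// declared locally so the sketch has no Spring dependency.
interface IdGenerator {
    UUID generateId();
}

// Hypothetical counter-based generator: no random source is involved,
// so ids are unique per process but not globally unique.
class CounterIdGenerator implements IdGenerator {
    private final AtomicLong counter = new AtomicLong();

    @Override
    public UUID generateId() {
        // Most significant bits stay 0; least significant bits carry the counter.
        return new UUID(0L, counter.incrementAndGet());
    }
}

class IdGeneratorDemo {
    public static void main(String[] args) {
        IdGenerator ids = new CounterIdGenerator();
        System.out.println(ids.generateId()); // 00000000-0000-0000-0000-000000000001
        System.out.println(ids.generateId()); // 00000000-0000-0000-0000-000000000002
    }
}
```

In a real application the class would implement Spring's interface directly. Whether per-process uniqueness is acceptable depends on how message ids are used downstream.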
Solution #2 results
• IdGenerator helped a lot
• Specific interfaces not recognised!
• Using <int:chain> helped
• Aggregator helped, but is too slow
• <int:tcp-inbound-adapter> is too slow
Many whiteboards later…
Working version
Solution N
[Diagram labels: Message-only Filters, Reference Data Filters, Batcher; GB, IUPS, IUCS, A, Radius/Diameter, 4G]
Working version
• Netty / Reactor TCP
• IdGenerator
• Specific endpoint interfaces
• Chain
• Reactor Stream based batching
• + many improvements & enhancements
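The idea behind batching can be sketched in plain Java. This is not Reactor's Stream API and not the code used in RTI. `Batcher` is a hypothetical size-based buffer that hands events to a sink in groups, so the downstream write happens once per batch and not once per event.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical size-based batcher; a sketch of the technique only.
class Batcher<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private List<T> buffer;

    Batcher(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    // Buffer one event; once the buffer is full, pass the whole batch to the sink.
    synchronized void accept(T event) {
        buffer.add(event);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Emit whatever is buffered, for example from a timer or at shutdown.
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = buffer;
        buffer = new ArrayList<>(batchSize);
        sink.accept(batch);
    }
}

class BatcherDemo {
    public static void main(String[] args) {
        Batcher<String> batcher =
                new Batcher<>(3, batch -> System.out.println("write " + batch));
        for (int i = 1; i <= 7; i++) {
            batcher.accept("e" + i);
        }
        batcher.flush(); // emits the final partial batch
        // write [e1, e2, e3]
        // write [e4, e5, e6]
        // write [e7]
    }
}
```

Events sitting in the buffer are lost if the process dies before a flush, so the batch size (and any flush interval) sets how much data is at risk.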
Roundup
Spring Improvements
• Performance
  • Spring Integration
  • SpEL
  • Reactor
• Spring XD benefits from these upgrades
Spring Integration
• UUID generator
• MessageBuilderFactory & MutableMessage
• Dispatcher optimisation
• SpEL parser caching
• Counters are ‘long’
• Interfaces used directly – if you’re specific, SI respects that
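One way to read the parser caching bullet is that each distinct expression string is parsed once and the parsed form is reused for every later message. The sketch below shows that general technique in plain Java. `ExpressionCache` and the stand-in parser are hypothetical and do not reflect Spring Integration's internals.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache: parses each distinct expression string once and reuses the result.
class ExpressionCache<E> {
    private final Map<String, E> cache = new ConcurrentHashMap<>();
    private final Function<String, E> parser;

    ExpressionCache(Function<String, E> parser) {
        this.parser = parser;
    }

    E get(String expression) {
        // ConcurrentHashMap.computeIfAbsent applies the function at most once per key.
        return cache.computeIfAbsent(expression, parser);
    }
}

class ExpressionCacheDemo {
    public static void main(String[] args) {
        AtomicInteger parses = new AtomicInteger();
        // Stand-in "parser" that counts invocations and upper-cases the text.
        ExpressionCache<String> cache = new ExpressionCache<>(text -> {
            parses.incrementAndGet();
            return text.toUpperCase();
        });
        for (int i = 0; i < 1_000; i++) {
            cache.get("payload.size");
        }
        System.out.println(parses.get()); // 1
    }
}
```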
Spring Expression Language
• Compilation of expressions
  • Configuration options
• SI just re-evaluates expressions
• Trade-offs & limitations
• Much, much faster
SpEL demo
Reactor
• Drop-in replacements
  • Thread pools, dispatchers
• TCP/UDP Server & Client
  • Much faster – lower resource utilisation
• Stream API
  • Batching and other functionality
• More about Reactor
  • Thu, 8.30am “Building Reactive Applications…”
Batching endpoint code
Summary
• Spring Integration is much faster
• Good performance means better resource utilisation
• For cloud applications cost savings can be dramatic
• Batching payloads makes a big difference
• Many applications wait on network IO
• Trade-off risk of data loss for performance
• Reactor FTW
Questions
Learn More. Stay Connected
Tweet #rti #s2gx if you’d like to go faster
@pidster
“Big Data in Memory”
John Davies – Trinity 1-2 4.30pm
@springcentral | spring.io/video