1©MapR Technologies - Confidential
Real-time and Long-time with Storm and Hadoop
Contact:
– [email protected]
– @ted_dunning
Slides and such (available late tonight):
– http://info.mapr.com/ted-utahjug
Hash tags: #mapr #storm
The Challenge
Hadoop is great at processing vats of data
– But sucks for real-time (by design!)
Storm is great for real-time processing
– But lacks any way to deal with batch processing
It sounds like there isn’t a solution
– Neither fashionable solution handles everything
This is not a problem.
It’s an opportunity!
What is Map-Reduce?
Map-reduce programs are defined (mostly) by
– A map function that does independent record transformations (and deletions and replications)
– Reduce functions that do aggregation
Map-reduce programs run in a framework that
– Schedules and re-runs tasks
– Splits the input
– Moves map outputs to reduce inputs
– Receives the results
Inside Map-Reduce
[Figure: Input → Map → Combine → Shuffle and sort → Reduce → Output]
"The time has come," the Walrus said,"To talk of many things:Of shoes—and ships—and sealing-wax
Map output: the, 1; time, 1; has, 1; come, 1; …
After combine, shuffle and sort: come, [3,2,1]; has, [1,5,2]; the, [1,2,1]; time, [10,1,3]; …
Reduce output: come, 6; has, 8; the, 4; time, 14; …
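The word-count flow above can be sketched in a few lines of plain Python. This is only a single-process illustration of the map, combine, shuffle-and-sort and reduce stages, not Hadoop's API; the function names and the way the input is split are assumptions made for the example.

```python
from collections import Counter, defaultdict

def map_words(line):
    """Map: emit (word, 1) for every word in one input record."""
    words = (w.strip('",.:;').lower() for w in line.split())
    return [(w, 1) for w in words if w]

def combine(pairs):
    """Combine: pre-aggregate one map task's output to cut shuffle traffic."""
    partial = Counter()
    for word, n in pairs:
        partial[word] += n
    return list(partial.items())

def shuffle_and_sort(mapper_outputs):
    """Shuffle and sort: group the partial counts by key, ordered by key."""
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for word, n in pairs:
            grouped[word].append(n)
    return sorted(grouped.items())

def reduce_counts(word, counts):
    """Reduce: aggregate all partial counts for one key."""
    return word, sum(counts)

def word_count(splits):
    """Run the stages in order; each split stands in for one map task."""
    mapper_outputs = []
    for split in splits:
        pairs = [pair for line in split for pair in map_words(line)]
        mapper_outputs.append(combine(pairs))
    return dict(reduce_counts(w, c) for w, c in shuffle_and_sort(mapper_outputs))
```

In a real framework the splits, the task scheduling and the movement of map outputs are handled for you; only the map and reduce (and optionally combine) functions are user code.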
Not Just Text
Counting words is easy
Many other problems work as well
– Sessionize user logs
– Very large scale joins
– Large scale matrix recommendations
– Computing the quadrillionth digit of π
Map-reduce is inherently batch oriented
Inside Map-Reduce
[Figure: Input → Map → Shuffle and sort → Reduce → Output]
Input: road1, polyline(p1, p2, p3, …); lake1, polygon(p4, p5, p7, p9, …); road2, polyline(p6, p7, p9, …)
Map output: tile0918-1412, road1; tile1082-8143, road1; tile0014-3284, lake1; tile1082-8143, lake1; …
After shuffle and sort: tile0918-1412, [road1]; tile1082-8143, [road1, lake1]; tile0014-3284, [lake1]; …
Reduce output: tile1082-8143, img#1; tile0014-3284, img#2; tile1082-8143, img#3; …
What is Storm?
A Storm program is called a topology
– Spouts inject data into a topology
– Bolts process data
The units of data are called tuples
All processing is flow-through
Bolts can buffer or persist
Output tuples can be anchored
If a tuple and all of its descendants are not ack’ed in time, the bolt is assumed to have died
Bolts that fail are restarted and un-acked tuples are replayed
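The anchoring and replay rule can be illustrated with a toy model. The sketch below is not Storm's API and not how Storm tracks acks internally; `AckTracker` and its methods are hypothetical names, and time is passed in explicitly so the behaviour is easy to follow.

```python
class AckTracker:
    """Toy model: a root tuple is done only when it and every anchored
    descendant have been acked before the deadline."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.pending = {}  # root id -> (set of un-acked tuple ids, deadline)

    def emit_root(self, root_id, now):
        """A spout injects a new tuple into the topology."""
        self.pending[root_id] = ({root_id}, now + self.timeout_s)

    def anchor(self, root_id, child_id):
        """A bolt emits an output tuple anchored to the root's tuple tree."""
        self.pending[root_id][0].add(child_id)

    def ack(self, root_id, tuple_id):
        """A bolt acks one tuple; the tree completes when nothing is left."""
        if root_id not in self.pending:
            return
        outstanding, _ = self.pending[root_id]
        outstanding.discard(tuple_id)
        if not outstanding:
            del self.pending[root_id]

    def expired(self, now):
        """Roots whose trees were not fully acked in time: replay these."""
        return sorted(r for r, (_, deadline) in self.pending.items()
                      if now >= deadline)
```

A bolt should anchor its outputs before acking its input, otherwise the tree can look complete while work is still in flight.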
Hadoop is Not Very Real-time
[Timeline figure, time t running up to now. Labels: Fully processed; Latest full period; Unprocessed Data; Hadoop job takes this long for this data]
Real-time and Long-time together
[Timeline figure, time t running up to now. Labels: Hadoop works great back here; Storm works here; Blended view]
One Alternative
[Figure labels: Real-time; Long-time; Search Engine; NoSql de Jour; Consumer; ?]
Problems
Simply dumping into a NoSQL engine doesn’t quite work
Insert rate is limited
No load isolation
– Big retrospective jobs kill real-time
Low scan performance
– HBase is pretty good, but not stellar
Difficult to set boundaries
– where does real-time end and long-time begin?
Rough Design – Data Flow
[Figure: data-flow diagram. Components shown: Data Sources, Catcher Cluster, Query Event Spout, ProtoSpout, Counter Bolt, Logger Bolt, Raw Logs, Semi Agg, Hadoop Aggregator, Snap, Long agg]
Closer Look – Catcher Protocol
[Figure: Data Sources and Catcher Cluster]
The data sources and catchers communicate with a very simple protocol.
Hello() => list of catchers
Log(topic, message) => (OK|FAIL, redirect-to-catcher)
Closer Look – Catcher Queues
The catchers forward log requests to the correct catcher and return that host in the reply to allow the client to avoid the extra hop.
Each topic file is appended by exactly one catcher.
Topic files are kept in shared file storage.
[Figure: Catcher Cluster and Topic Files]
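A minimal in-memory sketch of the Hello/Log protocol and the forwarding rule might look like the following. Everything beyond the two calls is an assumption for illustration: the `Catcher` class, choosing a topic's owner by hashing the topic name, and using a Python list in place of a topic file on shared storage. The FAIL path is left out.

```python
import zlib

class Catcher:
    """One catcher; `cluster` is a shared dict of name -> Catcher."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.topic_files = {}  # topic -> appended messages (stand-in for a file)
        cluster[name] = self

    def hello(self):
        """Hello() => list of catchers."""
        return sorted(self.cluster)

    def owner_of(self, topic):
        """Assumed rule: exactly one catcher owns each topic, picked by hash."""
        names = sorted(self.cluster)
        return names[zlib.crc32(topic.encode()) % len(names)]

    def log(self, topic, message):
        """Log(topic, message) => (status, redirect-to-catcher)."""
        owner = self.owner_of(topic)
        if owner == self.name:
            self.topic_files.setdefault(topic, []).append(message)
        else:
            # Forward to the owner: this is the extra hop.
            self.cluster[owner].log(topic, message)
        return "OK", owner

def send_all(cluster, first_contact, topic, messages):
    """Client side: follow the redirect so later messages skip the extra hop."""
    target = first_contact
    for message in messages:
        _status, target = cluster[target].log(topic, message)
    return target
```

Because only the owner ever appends to a topic's file, no coordination between writers is needed for a single topic.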
Closer Look – ProtoSpout
The ProtoSpout tails the topic files, parses log records into tuples and injects them into the Storm topology.
Last fully acked position stored in shared file system.
[Figure: Topic Files feeding the ProtoSpout]
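The tail-and-checkpoint behaviour can be sketched as below. This is a guess at the shape of the idea, not the actual ProtoSpout: the class name, the tab-separated record format, and the use of byte offsets as tuple ids are all assumptions. The checkpoint only advances over a contiguous run of acked records, so a restart resumes from the last fully acked position.

```python
class ProtoSpoutSketch:
    """Tails one topic file and checkpoints the last fully acked offset."""

    def __init__(self, topic_path, checkpoint_path):
        self.topic_path = topic_path
        self.checkpoint_path = checkpoint_path
        self.acked_upto = self._load_checkpoint()
        self.position = self.acked_upto  # next byte to read
        self.in_flight = {}              # start offset -> [end offset, acked?]

    def _load_checkpoint(self):
        try:
            with open(self.checkpoint_path) as f:
                return int(f.read().strip() or 0)
        except FileNotFoundError:
            return 0

    def poll(self):
        """Read complete new lines; return (offset, fields) tuples to emit."""
        tuples = []
        with open(self.topic_path, "rb") as f:
            f.seek(self.position)
            while True:
                start = f.tell()
                line = f.readline()
                if not line.endswith(b"\n"):
                    break  # end of file or a partially written record
                self.in_flight[start] = [f.tell(), False]
                fields = line.decode().rstrip("\n").split("\t")
                tuples.append((start, fields))
                self.position = f.tell()
        return tuples

    def ack(self, offset):
        """Mark one record acked and persist the contiguous acked prefix."""
        self.in_flight[offset][1] = True
        while (self.acked_upto in self.in_flight
               and self.in_flight[self.acked_upto][1]):
            self.acked_upto = self.in_flight.pop(self.acked_upto)[0]
        with open(self.checkpoint_path, "w") as f:
            f.write(str(self.acked_upto))
```

Records after the checkpoint may be emitted again after a restart, so downstream processing has to tolerate replays.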
Closer Look – Counter Bolt
Critical design goals:
– fast ack for all tuples
– fast restart of counter
Ack happens when tuple hits the replay log (tens of milliseconds)
Restart involves replaying semi-aggs + replay log (very fast)
Replay log only lasts until next semi-aggregate goes out
[Figure labels: Incoming records; Counter Bolt; Replay Log; Semi-aggregated records; Real-time / Long-time]
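One way to picture the replay log and semi-aggregates is the sketch below. It is an illustration under stated assumptions, not the actual bolt: `CounterBoltSketch` is a hypothetical name, and plain Python lists stand in for the durable replay log and the semi-aggregate output, both of which would live in shared storage.

```python
from collections import Counter

class CounterBoltSketch:
    """Counts labels; acks once an event is in the replay log."""

    def __init__(self, replay_log, semi_aggs):
        self.replay_log = replay_log  # events since the last semi-aggregate
        self.semi_aggs = semi_aggs    # list of emitted {label: count} records
        self.pending = Counter()      # counts not yet in a semi-aggregate
        self.totals = Counter()       # all counts this bolt knows about
        # Restart: replay the semi-aggregates, then the short replay log.
        for record in self.semi_aggs:
            self.totals.update(record)
        for label in self.replay_log:
            self.pending[label] += 1
            self.totals[label] += 1

    def execute(self, label):
        """Append to the replay log first, then count, then ack."""
        self.replay_log.append(label)
        self.pending[label] += 1
        self.totals[label] += 1
        return "ack"

    def flush(self):
        """Emit a semi-aggregate; the replay log is no longer needed."""
        if self.pending:
            self.semi_aggs.append(dict(self.pending))
        self.pending.clear()
        self.replay_log.clear()
```

Restart cost is bounded by the number of events since the last flush plus the number of semi-aggregates, which is why restart can be fast.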
A Frozen Moment in Time
Snapshot defines the dividing line
All data in the snap is long-time, all after is real-time
Semi-agg strategy allows clean query
[Figure labels: Semi Agg; Hadoop Aggregator; Snap; Long agg]
Guarantees
Counter output volume is small-ish
– the greater of k tuples per 100K inputs or k tuple/s
– 1 tuple/s/label/bolt for this exercise
Persistence layer must provide guarantees
– distributed against node failure
– must have either readable flush or closed-append
HDFS is distributed, but provides no guarantees and has strange semantics
MapRfs is distributed, provides all necessary guarantees
Presentation Layer
Presentation must
– read recent output of Logger bolt
– read relevant output of Hadoop jobs
– combine semi-aggregated records
User will see
– counts that increment within 0-2 s of events
– seamless and accurate meld of short and long-term data
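The blending step can be sketched as a small function that adds the real-time semi-aggregates on top of the long-term aggregate. The names, the record shapes, and the idea of stamping each semi-aggregate with an emission time are assumptions for illustration; the only rule taken from the slides is that data in the snapshot is long-time and everything after it is real-time.

```python
from collections import Counter

def blended_counts(long_agg, snapshot_time, semi_aggs):
    """Combine Hadoop's long-term counts with newer semi-aggregates.

    long_agg:      {label: count} covering everything up to the snapshot
    snapshot_time: the dividing line between long-time and real-time
    semi_aggs:     iterable of (emitted_at, {label: count}) records
    """
    total = Counter(long_agg)
    for emitted_at, counts in semi_aggs:
        # Records at or before the snapshot are already in long_agg.
        if emitted_at > snapshot_time:
            total.update(counts)
    return dict(total)
```

Because semi-aggregates are plain partial counts, adding them is all the presentation layer has to do; no raw events are read at query time.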
Example 2 – AB testing in real-time
I have 15 versions of my landing page
Each visitor is assigned to a version
– Which version?
A conversion or sale or whatever can happen
– How long to wait?
Some versions of the landing page are horrible
– Don’t want to give them traffic
Bayesian Bandit
Compute distributions based on data
Sample p1 and p2 from these distributions
Put a coin in bandit 1 if p1 > p2
Else, put the coin in bandit 2
And it works!
Video Demo
The Code
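Select an alternative
select = function(k) {
  # k has one row per alternative: column 1 counts failures, column 2 counts successes
  n = dim(k)[1]
  p0 = rep(0, length.out=n)
  for (i in 1:n) {
    # sample a plausible success rate from the Beta posterior (uniform prior)
    p0[i] = rbeta(1, k[i,2]+1, k[i,1]+1)
  }
  # pick the alternative with the largest sampled rate
  return (which.max(p0))
}
Select and learn
# "learn" is a name chosen here; test(i) tries alternative i and
# returns the column of k to increment (1 = failure, 2 = success)
learn = function(k, steps, test) {
  for (z in 1:steps) {
    i = select(k)
    j = test(i)
    k[i,j] = k[i,j]+1
  }
  return (k)
}
But we already know how to count!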
The Basic Idea
We can encode a distribution by sampling
Sampling allows unification of exploration and exploitation
Can be extended to more general response models
The Bigger Basic Idea
Online algorithms generally have relatively small state (like counting)
Online algorithms generally have a simple update (like counting)
If we can do this with counting, we can do it with all kinds of algorithms
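As one illustration of an online algorithm beyond counting, a running mean and variance also has small state and a simple update, and two partial states can be merged much as semi-aggregated counts are added. This example is not from the talk; the class and its names are chosen here, and it uses the standard incremental update and pairwise merge formulas.

```python
class RunningStats:
    """Online mean/variance: state is just (n, mean, m2)."""

    def __init__(self, n=0, mean=0.0, m2=0.0):
        self.n, self.mean, self.m2 = n, mean, m2

    def update(self, x):
        """Fold in one observation with constant work."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        """Combine two partial states, like adding two semi-aggregates."""
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self):
        """Population variance of everything seen so far."""
        return self.m2 / self.n if self.n else 0.0
```

The merge step is what makes such a state usable on both sides of a real-time and long-time split: each side keeps its own partial state and the presentation layer combines them.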
Contact:
– [email protected]
– @ted_dunning
Slides and such (available late tonight):
– Send email!!
Hash tags: #mapr #bigdata
We are hiring!
Thank You