Kafka & Storm - FifthElephant 2015 by @bhaskerkode, Helpshift


#kafka #storm

@FifthEl ‘15

I’m @bhaskerkode aka Bosky, Product Engg at Helpshift

you’re here at a bigdata & analytics conference to …

mobile · web · numbers · insights · predict · iot · internal · ?

this talk

mobile · web · numbers · insights · predict · iot · internal · your job · your raise?

this talk really is about sleep

good sleep

Kafka is your glue: it lets different teams break down complex end-to-end problems into smaller, more manageable ones, like a “checkpoint” in a long hike/game/drive

separation of concerns helps you design distributed systems

kafka & stream processing: power combo <insert quote to impress friend>

developed world: wired communication → wireless / mobile

developing world: wireless / mobile

the same may happen with bigdata & analytics:

data ingestion → offline processing → stream processing

data ingestion → stream processing

LinkedIn: http://kafka.apache.org/

Twitter: https://storm.apache.org/

typical analytics ingestion architecture:

mobile / web / iot / internal → HA → biz logic → db / cache / search / infra

task #1:

mobile / web / iot / internal → HA → auth logic → kafka producer → kafka → kafka consumer (storm / spark / samza) → db / cache / search / infra

how kafka helps me sleep better. how? let’s talk numbers

~300 million requests/day to our events kafka producer, ekaf

served by 22 cores, or 11 servers

running on auto-pilot for 1+ years: little maintenance, only scaled servers, zero-downtime support, plenty of metrics

(started with 3 servers replacing 14 JVM producers)

separate auth logic from producer logic

erlang = peaceful sleep · open source · (scalable)

modular design

mobile / web / iot / internal → kafboy → auth logic → ekaf

auth logic failures: HTTP 500 / unauthorized / wrong api’s · success: HTTP 200 · metrics


one thing particularly distinguishes kafka. the secret sauce? let’s take wild guesses

community? NO. let’s take more wild guesses at the secret sauce

let’s illustrate with some examples

You: “Create topic foo with 3 partitions”

kafka: “OK! each partition is set on a different broker. one of the brokers is an elected leader; push your messages to the leader’s host & port please. if the leader goes down, another broker is elected.”

is it the fault-tolerance you get?

NO. let’s take more wild guesses at the secret sauce

You: “create topic foo with 3 partitions”

partitions · leaders · election

You: “hey broker1, where should i push to?”

kafka: “please dial 1-800-metadata to any broker. we all maintain this data. you can make a metadata request with the topic(s) you want, and we will return its partitions, and their broker hosts, ports, leader info, etc.”

is it the separation of concerns?

NO. let’s take more wild guesses at the secret sauce

You: “hey broker1, where should i push to?”

metadata on all brokers

gives host & port of leader for partitions + more info

You: “hey partition1 on broker1, for topic1, here are 10 messages.”

kafka: “thank you for calling partition1. i will append it to a dir/file called topic1/partition1.log. since it’s append-only, like a commit log, you get ordering within a partition for free!”

is it the producer speed you get?

NO. let’s take more wild guesses at the secret sauce

You: “hey partition1 on broker1, for topic1, here are 10 messages.”

append-only commit log → ordering within a partition

make 1 partition for global ordering

You: “hey broker2, where should i consume from?”

kafka: “thank you for connecting to the right broker. you can now read any partition you want, and from any offset. all i care about is which offset to start reading from in a topic/partition.log file”

You: “hey broker1, topic1, partition1: i want all data from message offset …10 (from ZK) onwards”

kafka: “thank you for calling partition1. i’m going to sendfile the bytes you asked for from kernel-space directly to the socket of a consumer using zero copy, thus reducing context switches, with minimal garbage collection. yes, i am badass.”

is it the consumer speed you get?

NO. let’s take more wild guesses at the secret sauce

You: “hey broker1, topic1, partition1: i want all data from message offset …10 (from ZK) onwards”

offset → bytes: sendfile the bytes from that offset to the socket, in kernel space, not userspace
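the broker itself is JVM code (this is the classic FileChannel.transferTo trick), but the same kernel-space idea exists in most runtimes. a toy sketch in Erlang terms, pushing a partition log straight to a connected consumer socket; the port and path are made up:

    %% kernel copies the file's bytes straight to the consumer's socket;
    %% nothing is read into userspace buffers
    {ok, LSock} = gen_tcp:listen(9092, [binary, {active, false}]),
    {ok, Consumer} = gen_tcp:accept(LSock),
    {ok, BytesSent} = file:sendfile("topic1/partition1.log", Consumer).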

You: “for topic1, i want 3 consumers all reading every msg. for topic2, i want the data split between 3 consumers”

kafka: “3 different pipelines/actions on the same input topic1? nice! i can see your team is growing. thank you for grokking the concept of consumer groups for topic2. make sure all 3 of yours use the same group-id, and i’ll take care of the rest!”

is it the consumer parallelism?

NO. let’s take more wild guesses at the secret sauce

You: “for topic1, i want 3 consumers all reading every msg. for topic2, i want the data split between 3 consumers”

can broadcast to all consumers · can split b/w a group of consumers

binary protocol. but importantly, a documented spec of what goes/comes over-the-wire

the one thing that particularly distinguishes kafka & gives it a stellar status in the ecosystem:

the creators of kafka built the brokers & the spec of how to communicate with them (producers & consumers), and let the community speak the protocol

a documented spec of what goes/comes over-the-wire

eg: do you know what is sent by a namenode/jobtracker/datanode over the wire?

(PS: where is the spec? come meet me after)

protocols win over api’s/drivers

the creators of kafka focussed on the spec

If you knew what data a namenode/jobtracker/datanode actually communicates over the wire, it would open up a new world.

allows diff language clients to express themselves best

it’s just data over TCP sockets.

more freedom

more integrations & faster adoption

hex: 0003 0000 00000001 0007 636c69656e7431 00000001 0006 6576656e7473

decimal: 0,3, 0,0, 0,0,0,1, 0,7, 99,108,105,101,110,116,49, 0,0,0,1, 0,6, 101,118,101,110,116,115

1. open a tcp socket to any kafka 0.8+ broker:port
2. send these 29 bytes (shown above in hex and decimal), which ask for metadata for the topic “events”

and you’ll get back metadata for the topic “events”. always.

what this packet means:

0,3 → [2 bytes] metadata api key = 3
0,0 → [2 bytes] api version = 0
0,0,0,1 → [4 bytes] correlation id (for matching replies)
0,7 → [2 bytes] client id length = 7
99,108,105,101,110,116,49 → [7 bytes] “client1”
0,0,0,1 → [4 bytes] number of topics = 1
0,6 → [2 bytes] topic[0] length = 6
101,118,101,110,116,115 → [6 bytes] “events”
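to make the layout concrete, a minimal sketch of building and sending this exact packet from an Erlang shell (“broker1” is a placeholder host). one caveat: on the wire, the protocol also prefixes every request with a 4-byte size field in front of these 29 bytes:

    ClientId = <<"client1">>,
    Topic    = <<"events">>,
    Body = <<3:16,                                  %% api key = metadata
             0:16,                                  %% api version
             1:32,                                  %% correlation id, echoed in the reply
             (byte_size(ClientId)):16, ClientId/binary,
             1:32,                                  %% number of topics
             (byte_size(Topic)):16, Topic/binary>>,
    29 = byte_size(Body),                           %% the 29 bytes from the slide
    {ok, Sock} = gen_tcp:connect("broker1", 9092, [binary, {active, false}]),
    ok = gen_tcp:send(Sock, <<(byte_size(Body)):32, Body/binary>>),
    {ok, Metadata} = gen_tcp:recv(Sock, 0).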

<<0,0,0,0,0,0,0,3,0,0,0,3,0,25,118,97,103,114,97,110,116, 45,117,98,117,110,116,117,45,112,114,101,99,105,115,101,

45,54,52,0,0,35,133,0,0,0,1,0,25,118,97,103,114,97,110, 116,45,117,98,117,110,116,117,45,112,114,101,99,105,115,

101,45,54,52,0,0,35,131,0,0,0,2,0,25,118,97,103,114,97, 110,116,45,117,98,117,110,116,117,45,112,114,101,99,105,

115,101,45,54,52,0,0,35,132,0,0,0,3,0,0,0,2,97,49,0,0,0, 2,0,0,0,0,0,0,0,0,0,3,0,0,0,1,0,0,0,3,0,0,0,1,0,0,0,3,0, 0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0, 2,97,50,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0, 0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,1, 0,0,0,2,0,0,0,2,97,51,0,0,0,2,0,0,0,0,0,0,0,0,0,3,0,0,0, 2,0,0,0,3,0,0,0,1,0,0,0,2,0,0,0,3,0,0,0,1,0,0,0,0,0,1,0, 0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,2,0,0,0,1,0,0,0,2>>

metadata response, decoded:

% 0,0,0,1,  % this is a response to req id 1
% 0,0,0,2,  % number of brokers
% 0,0,0,1,  % broker[0] id
%           % then broker name, host, port
% 0,0,0,3,  % topics len
% 0,0,0,6,  % topic[0] name len
% ……………….. % topic[0] name “events”
% 0,0,0,2,  % topic[0] partitions len
% 0,0,      % topic[0] partition1 error code
% 0,0,0,0,  % topic[0] partition1 id
% 0,0,0,1,  % topic[0] partition1 leader id
% 0,0,0,1,  % topic[0] partition1 replicas len
% 0,0,0,3,  % topic[0] partition1 replica1
% 0,0,0,1,  % topic[0] partition1 isr len
% 0,0,0,3,  % topic[0] partition1 isr1
% another partition’s data here, etc
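a sketch of walking the front of that response with Erlang binary pattern matching, following the field order above (correlation id, then the broker list; the topic metadata follows in the remaining bytes):

    decode_metadata(<<CorrelationId:32, BrokerCount:32, Rest/binary>>) ->
        {Brokers, TopicsBin} = brokers(BrokerCount, Rest, []),
        {CorrelationId, Brokers, TopicsBin}.

    brokers(0, Rest, Acc) ->
        {lists:reverse(Acc), Rest};
    brokers(N, <<NodeId:32, HostLen:16, Host:HostLen/binary, Port:32, Rest/binary>>, Acc) ->
        brokers(N - 1, Rest, [{NodeId, Host, Port} | Acc]).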

similarly, all other operations: encode the request as a packet, send it over a tcp socket

sync produce: handle response


async produce: no response

MessageSet => [Offset MessageSize Message]
  Offset => int64
  MessageSize => int32

Message => Crc MagicByte Attributes Key Value
  Crc => int32
  MagicByte => int8
  Attributes => int8
  Key => bytes
  Value => bytes
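a sketch of encoding one Message per this grammar (MagicByte 0, Attributes 0 meaning no compression; a length of -1 marks a null key, and the Crc covers everything after itself):

    message(Key, Value) ->
        Body = <<0:8,                      %% MagicByte
                 0:8,                      %% Attributes: no compression
                 (bytes_field(Key))/binary,
                 (bytes_field(Value))/binary>>,
        <<(erlang:crc32(Body)):32, Body/binary>>.

    bytes_field(undefined) -> <<-1:32/signed>>;    %% null field
    bytes_field(B)         -> <<(byte_size(B)):32, B/binary>>.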

the joy of knowing how your data is encoded and sent over a tcp socket keeps things simple and lets you sleep better: easier to debug, test, add middlewares to audit, etc.

MQTT encoded messages · crypto encoded


thrift/avro encoded messages

AOL/YAHOO/* packets

brokers focus on the commit logs, etc., and delegate several opinionated areas to the client; the onus is on the client to make smart decisions:

compression · queueing


detect downtime

load balancing to partitions

clients keep things simple. a subset of what clients need to do includes:

1. bootstrap
2. open sockets
3. encode packets
4. decode packets
5. send over tcp
6. route responses
7. handle failures, events
8. state machines

distributed systems need to be built as state machines. don’t trust a client that blocks until an operation is complete

distributed systems need to be built as state machines

BAD:
// send
// wait for response
even if this thread/process is doing nothing until the timeout, the idling is not efficient

distributed systems need to be built as state machines

GOOD:
// listening for socket states
// listening for responses
// send, and continue
// on_recv, route response
// after_timeout, route timeout
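in Erlang, ekaf’s implementation language, the GOOD shape falls out naturally: fire the request, then treat the reply, disconnect, and timeout as routed events. a sketch (route_response/1, route_disconnect/1 and route_timeout/1 are hypothetical handlers):

    send_and_continue(Sock, Request) ->
        ok = inet:setopts(Sock, [{active, once}]),   %% next packet arrives as a message
        ok = gen_tcp:send(Sock, Request).

    handle_next(Sock) ->
        receive
            {tcp, Sock, Reply} -> route_response(Reply);
            {tcp_closed, Sock} -> route_disconnect(Sock)
        after 5000 ->
            route_timeout(Sock)
        end.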

distributed systems need to be built as state machines

set concurrency options for ekaf

set the hostname of a load balancer over your brokers

http://github.com/helpshift/ekaf

distributed systems need to be built as state machines

ekaf hits the ground running with 1 call: publish(topic, message). if no state machine exists yet for the topic, it flows from metadata request → worker pool creation → socket connecting → ready state

http://github.com/helpshift/ekaf

distributed systems need to be built as state machines

if the topic’s state machine has metadata, it knows which broker handles each partition

if the state already has a socket, queue to it

http://github.com/helpshift/ekaf

distributed systems need to be built as state machines

all messages in states before ready are queued.

if the queue hits its size limit OR the flush timeout, send it

http://github.com/helpshift/ekaf
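pulling those slides together, a minimal sketch of that one-call usage. ekaf:publish/2 is the call named above; ekaf_bootstrap_broker is the app env from ekaf’s README of the time (treat exact option names as assumptions and check the repo), and "broker-lb.internal" is a hypothetical load-balancer hostname:

    application:load(ekaf),
    application:set_env(ekaf, ekaf_bootstrap_broker, {"broker-lb.internal", 9092}),
    {ok, _} = application:ensure_all_started(ekaf),
    %% first publish walks the state machine: metadata -> worker pool ->
    %% socket connect -> ready; anything sent before ready is queued and
    %% flushed on batch size or flush timeout
    ekaf:publish(<<"events">>, <<"{\"event\":\"ping\"}">>).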

go through the docs at http://github.com/helpshift/ekaf

tests include broker downtime, adding a broker, etc., & a mini kafka broker for tests:

http://github.com/helpshift/kafkamocker

[chart: worker down → downtime saved/queued · worker up → queue flushed, downtime replayed · max downtime queued bounded by time to connect]

Callbacks used for Metrics

last year 6k/min, now 6k/sec

only scaled up servers

hello-world driven development: measure response time against an endpoint that just echoes

ekaf @Layer (ex-Apple engineers) “The art of powering the Internet’s next messaging system” https://www.youtube.com/watch?v=mv2MBYU8Yls#t=33m5s

ekaf @ a chinese social network: 1 pull request about to be merged

and elsewhere

back to pipelines involving a kafka producer and consumer

active user analytics pipeline @helpshift (~1 billion devices · HA):

SDK → nginx → auth/api #erlang → kafka http producer #kafboy (uses ekaf) → kafka consumer #clj-kafka #clojure → hyperloglog counts → PG · S3 (to disk) → EMR (internal jobs) · (dashboards)
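hyperloglog is what keeps active-user counting cheap at ~1 billion devices: a few KB of registers instead of a giant set of device ids. Helpshift’s version lives in the clojure consumer; purely to show the idea, a toy Erlang sketch with 2^10 registers, skipping the real algorithm’s small/large-range corrections:

    -module(toy_hll).
    -export([new/0, add/2, estimate/1]).

    -define(P, 10).
    -define(M, 1024).                       %% 2^P registers

    new() -> #{}.                           %% register index -> highest rank seen

    add(Item, Regs) ->
        Hash = erlang:phash2(Item, 1 bsl 32),
        <<Idx:?P, Tail:22>> = <<Hash:32>>,  %% first P bits pick a register
        maps:update_with(Idx, fun(R) -> max(R, rank(Tail)) end, rank(Tail), Regs).

    rank(0) -> 23;                          %% rank = position of leftmost 1-bit
    rank(N) -> 22 - trunc(math:log2(N)).

    estimate(Regs) ->
        Sum = lists:sum([math:pow(2, -maps:get(I, Regs, 0)) || I <- lists:seq(0, ?M - 1)]),
        Alpha = 0.7213 / (1 + 1.079 / ?M),
        round(Alpha * ?M * ?M / Sum).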

mail delivery @helpshift:

kafka consumer #clj-kafka → email → “actually sent”

[WIP] ES indexing @helpshift:

kafka consumer #clj-kafka → ES bulk index docs → “actually indexed”

audit/action trails @helpshift:

kafka consumer #clj-kafka → old object vs new object → diff → emit/ignore rows → PG

few rules: objects are namespaced, must have an id

Storm @helpshift

iTunes / Play reviews distributed crawler #go (#master · #worker farm · #controller)

the reviews storm pipeline @helpshift:

kafka producer #shopify/sarama → storm kafka spout → deduplication → tokenization → topic extraction → sentiment analysis → PG

example storm topology. read up more on spouts and bolts (any Q’s?)

instrumenting tips:

• PG and multiple bolts tip
• Metrics at every bolt
• Local statsite → grafana
• Avoid metric explosion

[WIP] segment population:

query ES · job tracking

kafka consumer / storm #clj-kafka

elasticsearch query representing a segment

scheduler

counts in PG · S3

“find users who match level 2, who did not find the easter egg”

fit your use case: population count · moving average

a note on samza’s state

Kafka is your glue: it lets different teams break down complex end-to-end problems into smaller, more manageable ones, like a “checkpoint” in a long hike/game/drive

separation of concerns helps you design distributed systems

numbers

your job

good sleep

your product

kafka + storm

Small Snapshot of Helpshift

Hay Day · Boom Beach · Clash of Clans · Deer Hunter · High School Story · Family Guy · Flipboard · Circa · Wordpress · Misfit · Microsoft Outlook

APP + API · DB · MONITORING · OTHER · ROUTING (HAProxy)

Our SDK is being embedded in a growing list of popular apps

#kafka #storm

Thanks!

I’m @bhaskerkode aka Bosky, Product Engg @Helpshift · bosky@helpshift.com

Find this talk at http://bit.ly/fifthel15-kafka-storm