Date posted: 15-Aug-2015
#kafka #storm
@FifthEl ‘15
I’m @bhaskerkode aka Bosky, Product Engg at Helpshift
you’re here at a bigdata & analytics conference to …
mobile / web / numbers / insights / predict / iot / internal / ?
this talk
mobile / web / numbers / insights / predict / iot / internal / your job / your raise?
this talk really is about sleep
good sleep
Kafka is your glue: it lets different teams break down complex end-to-end problems into smaller, more manageable ones, like a “checkpoint” in a long hike/game/drive
separation of concerns helps you design distributed systems
kafka & stream processing: power combo <insert quote to impress friend>
developed world: wired communication -> wireless / mobile
developing world: straight to wireless / mobile
the same may happen with bigdata & analytics:
data ingestion -> offline processing -> stream processing
data ingestion -> stream processing
typical analytics ingestion architecture
before: mobile / web / HA / iot / internal -> biz logic -> db / cache / search infra
task #1: mobile / web / HA / iot / internal -> auth logic -> kafka producer -> kafka consumer (storm / spark / samza) -> db / cache / search infra
how kafka helps me sleep better. how? let’s talk numbers:
~300 million requests/day to our events kafka producer, ekaf
served by 22 cores, or 11 servers
running on auto-pilot for 1+ years: little maintenance, only scale servers, zero-downtime support, plenty of metrics
(started with 3 servers replacing 14 JVM producers)
separate auth logic from producer logic
erlang = peaceful sleep
open source (scalable)
modular design
mobile / web / iot / internal
-> kafboy (auth logic: HTTP 200, or HTTP 500 on unauthorized / wrong api’s; metrics)
-> ekaf
one thing particularly distinguishes kafka. the secret sauce? let’s take wild guesses
community? NO. let’s take more wild guesses at the secret sauce
let’s illustrate with some examples
You: “Create topic foo with 3 partitions”
kafka: “OK! each partition is set on a different broker. one of the brokers is the elected leader;
push your messages to the leader’s host & port, please. if the leader goes down, another broker is elected.”
is it the fault-tolerance you get?
NO. let’s take more wild guesses at the secret sauce
You: “create topic foo with 3 partitions”
partitions, leaders, election
You: “hey broker1, where should i push to?”
kafka: “please dial 1-800-metadata to any broker. we all maintain this
data. you can make a metadata request with the topic(s) you want, and
we will return its partitions, and their broker hosts, ports, leader
info, etc.”
is it the separation of concerns?
NO. let’s take more wild guesses at the secret sauce
You: “hey broker1, where should i push to?”
metadata on all brokers gives the host & port of the leader for each partition + more info
You: “hey partition1 on broker1, for topic1, here are 10 messages.”
kafka: “thank you for calling partition1. i will append it to a dir/file called topic1/partition1.log.
since it’s append-only, like a commit log, you get ordering within a partition for free!”
is it the producer speed you get?
NO. let’s take more wild guesses at the secret sauce
You: “hey partition1 on broker1, for topic1, here are 10 messages.”
append-only commit log: ordering within a partition
make 1 partition for global ordering
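One common client-side scheme for keeping related messages ordered is to hash the message key to a partition, so all events for one key land on the same partition in order. A minimal sketch (this is one illustrative partitioner, not necessarily ekaf’s default; `pick_partition` is a made-up name):

```python
import zlib

def pick_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition: same key -> same partition,
    so per-key ordering is preserved. Keyless messages could be
    round-robined across partitions instead."""
    return zlib.crc32(key) % num_partitions

# all events for one device land on one partition, in order
p1 = pick_partition(b"device-42", 3)
p2 = pick_partition(b"device-42", 3)
assert p1 == p2
assert 0 <= p1 < 3
```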
You: “hey broker2, where should i consume from?”
kafka: “thank you for connecting to the right broker. you can now read any
partition you want, and from any offset. All i care about is which offset
to start reading from in a topic/partition.log file”
You: “hey broker1, topic1, partition1: i want all data from message offset ….10 (from ZK) onwards”
kafka: “thank you for calling partition1. i’m going to sendfile the bytes you asked for from kernel-space directly to the socket
of a consumer using zero copy, thus reducing context switches & minimizing garbage collection. yes, i am badass.”
is it the consumer speed you get?
NO. let’s take more wild guesses at the secret sauce
You: “hey broker1, topic1, partition1: i want all data from message offset ….10 (from ZK) onwards”
offset -> bytes: sendfile the bytes from offset to socket
kernel space, not userspace
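Conceptually, a partition is just an append-only file, and a consumer read is a seek to an offset plus a sequential read. A toy sketch (real brokers go further and `sendfile()` these bytes from the page cache straight to the socket; the file layout here is simplified):

```python
import os
import tempfile

# A partition is an append-only file; a consumer read is a seek to a
# byte offset followed by a sequential read.
log_path = os.path.join(tempfile.mkdtemp(), "topic1-partition1.log")

def append(msg: bytes):
    """Producer side: append-only writes, like a commit log."""
    with open(log_path, "ab") as f:
        f.write(msg)

def read_from(offset: int) -> bytes:
    """Consumer side: all the broker needs is which offset to start at."""
    with open(log_path, "rb") as f:
        f.seek(offset)
        return f.read()

append(b"msg1|")
append(b"msg2|")
assert read_from(5) == b"msg2|"
```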
You: “for topic1, i want 3 consumers all reading every msg. for topic2, i want the data split between 3 consumers”
kafka: “3 different pipelines/actions on the same input topic1? Nice!
I can see your team is growing.
Thank you for grokking the concept of consumer groups for topic2. Make sure all
3 of yours use the same group-id, and i’ll take care of the rest!”
is it the consumer parallelism?
NO. let’s take more wild guesses at the secret sauce
You: “for topic1, i want 3 consumers all reading every msg. for topic2, i want the data split between 3 consumers”
can broadcast to all consumers; can split b/w a group of consumers
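The broadcast-vs-split semantics can be simulated in a few lines (a toy model: real Kafka does this via group coordination and rebalancing, and the round-robin split below is only illustrative):

```python
from collections import defaultdict

def assign(partitions, consumers):
    """Toy version of consumer-group assignment: partitions are split
    among consumers sharing a group-id; consumers in different
    group-ids each get the full set (broadcast)."""
    by_group = defaultdict(list)
    for consumer, group in consumers:
        by_group[group].append(consumer)
    assignment = defaultdict(list)
    for group, members in by_group.items():
        for i, p in enumerate(partitions):
            assignment[members[i % len(members)]].append(p)
    return dict(assignment)

# topic1: 3 consumers in 3 different groups -> everyone sees every partition
a = assign([0, 1, 2], [("c1", "g1"), ("c2", "g2"), ("c3", "g3")])
assert all(a[c] == [0, 1, 2] for c in ("c1", "c2", "c3"))

# topic2: 3 consumers in ONE group -> data split between them
b = assign([0, 1, 2], [("c1", "g"), ("c2", "g"), ("c3", "g")])
assert sorted(p for ps in b.values() for p in ps) == [0, 1, 2]
```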
a binary protocol. but importantly, a documented spec of what goes/comes over-the-wire.
the one thing that particularly distinguishes kafka & gives it a stellar status in the ecosystem:
the creators of kafka built the brokers & the spec of how to communicate with them (producers & consumers),
and let the community speak the protocol.
eg: do you know what is sent by a namenode/jobtracker/datanode over the wire?
(PS: where is the spec? come meet me after)
protocols win over api’s/drivers
the creators of kafka focused on the spec.
if you knew what data a namenode/jobtracker/datanode actually communicates over the wire, it would open up a new world.
it allows different language clients to express themselves best:
it’s just data over TCP sockets.
more freedom
more integrations & faster adoption
1. open a tcp socket to any kafka 0.8+ broker:port
2. send these 29 bytes, which ask for metadata for the topic “events”:

0,3,0,0,0,0,0,1,0,7,99,108,
105,101,110,116,49,0,0,0,1,0,
6,101,118,101,110,116,115

(or in hex)

0003 0000 00000001 0007 636c69656e7431 00000001 0006 6576656e7473

and you’ll get back metadata for the topic “events”. always.
what this packet means:
0,3                        [2 bytes] metadata code = 3
0,0                        [2 bytes] api version = 0
0,0,0,1                    [4 bytes] int id (for replies)
0,7                        [2 bytes] client id length = 7
99,108,105,101,110,116,49  [7 bytes] “client1”
0,0,0,1                    [4 bytes] no# of topics = 1
0,6                        [2 bytes] topic[0] length = 6
101,118,101,110,116,115    [6 bytes] “events”
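The same 29 bytes can be built with a few `struct.pack` calls. A sketch (the function name `metadata_request` is mine; note also that the Kafka wire format frames each request with a 4-byte big-endian size prefix before these bytes when sent on a real socket):

```python
import struct

def metadata_request(client_id: str, topics, correlation_id=1) -> bytes:
    """Build a Kafka 0.8 MetadataRequest body (api key 3, version 0).
    On the wire, prepend struct.pack(">i", len(body)) before sending."""
    buf = struct.pack(">hhih", 3, 0, correlation_id, len(client_id))
    buf += client_id.encode()
    buf += struct.pack(">i", len(topics))       # number of topics
    for t in topics:
        buf += struct.pack(">h", len(t)) + t.encode()
    return buf

payload = metadata_request("client1", ["events"])
assert len(payload) == 29
assert list(payload) == [0,3,0,0,0,0,0,1,0,7,99,108,105,101,110,116,49,
                         0,0,0,1,0,6,101,118,101,110,116,115]
```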
<<0,0,0,0,0,0,0,3,0,0,0,3,0,25,118,97,103,114,97,110,116, 45,117,98,117,110,116,117,45,112,114,101,99,105,115,101,
45,54,52,0,0,35,133,0,0,0,1,0,25,118,97,103,114,97,110, 116,45,117,98,117,110,116,117,45,112,114,101,99,105,115,
101,45,54,52,0,0,35,131,0,0,0,2,0,25,118,97,103,114,97, 110,116,45,117,98,117,110,116,117,45,112,114,101,99,105,
115,101,45,54,52,0,0,35,132,0,0,0,3,0,0,0,2,97,49,0,0,0, 2,0,0,0,0,0,0,0,0,0,3,0,0,0,1,0,0,0,3,0,0,0,1,0,0,0,3,0, 0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0, 2,97,50,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0, 0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,1, 0,0,0,2,0,0,0,2,97,51,0,0,0,2,0,0,0,0,0,0,0,0,0,3,0,0,0, 2,0,0,0,3,0,0,0,1,0,0,0,2,0,0,0,3,0,0,0,1,0,0,0,0,0,1,0, 0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,2,0,0,0,1,0,0,0,2>>
metadata response, decoded:
% 0,0,0,1,  % this is a response to req id 1
% 0,0,0,2,  % number of brokers
% 0,0,0,1,  % broker[0] id
% then broker name, host, port
% 0,0,0,3,  % topics len
% 0,0,0,6,  % topic[0] name len
% ……………….. % topic[0] name events
% 0,0,0,2,  % topic[0] partitions len
% 0,0,      % topic[0] partition1 error code
% 0,0,0,0,  % topic[0] partition1
% 0,0,0,1   % topic[0] partition1 leaderid
% 0,0,0,1,  % topic[0] partition1 replicas len
% 0,0,0,3,  % topic[0] partition1 replica1
% 0,0,0,1,  % topic[0] partition1 isr len
% 0,0,0,3,  % topic[0] partition1 isr1
% another partition's data here
% etc
similarly, all other operations: encode the request as a packet, send it over the tcp socket
sync produce: send the request, handle the response packet
async produce: send the request, no response
MessageSet => [Offset MessageSize Message]
  Offset => int64
  MessageSize => int32
Message => Crc MagicByte Attributes Key Value
  Crc => int32
  MagicByte => int8
  Attributes => int8
  Key => bytes
  Value => bytes
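The Message layout above can be encoded in a few lines. A sketch (a Python stand-in; magic byte 0 and attributes 0, i.e. no compression, are assumed, and the `bytes` fields are int32-length-prefixed, -1 for null):

```python
import struct
import zlib

def encode_message(key, value) -> bytes:
    """Encode one Kafka 0.8 Message: Crc MagicByte Attributes Key Value.
    The CRC covers everything after the Crc field itself."""
    def kbytes(b):
        # 'bytes' on the wire: int32 length prefix, -1 for null
        return struct.pack(">i", -1) if b is None else struct.pack(">i", len(b)) + b
    body = struct.pack(">bb", 0, 0) + kbytes(key) + kbytes(value)  # magic=0, attrs=0
    return struct.pack(">I", zlib.crc32(body) & 0xffffffff) + body

msg = encode_message(b"k", b"hello")
# 4 (crc) + 1 + 1 + 4+1 (key) + 4+5 (value) = 20 bytes
assert len(msg) == 20
crc, = struct.unpack(">I", msg[:4])
assert crc == zlib.crc32(msg[4:]) & 0xffffffff
```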
the joy of knowing how your data is encoded and sent over a tcp socket: it keeps things simple, lets you sleep better, and makes it easier to debug, test, add middlewares to audit, etc.
your messages can be anything:
- MQTT encoded messages, crypto encoded
- thrift/avro encoded messages
- AOL/YAHOO/* packets
brokers focus on the commit logs, etc. and delegate several opinionated areas to the client; the onus is on the client to make smart decisions:
- compression
- queueing
- detect downtime
- load balancing to partitions
clients keep things simple. a subset of what clients need to do includes:
1. bootstrap
2. open sockets
3. encode packets
4. decode packets
5. send over tcp
6. route responses
7. handle failures, events
8. state machines
distributed systems need to be built as state machines
don’t trust a client that blocks until an operation is complete

BAD:
// send
// wait for response
even if this thread/process is doing nothing until the timeout, the idling is not efficient

GOOD:
// listening for socket states
// listening for responses
// send, and continue
// on_recv, route response
// after_timeout, route timeout
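A minimal sketch of the GOOD pattern: a hypothetical client that tracks in-flight requests by correlation id and routes each response (or timeout) as an event, instead of blocking a thread per request (ekaf does this with Erlang state machines; the `AsyncClient` class here is purely illustrative):

```python
class AsyncClient:
    """Non-blocking request/response routing keyed by correlation id."""
    def __init__(self):
        self.next_id = 0
        self.pending = {}          # correlation id -> callback

    def send(self, request, on_response):
        self.next_id += 1
        self.pending[self.next_id] = on_response
        # ... write request + correlation id to the socket, return immediately
        return self.next_id

    def on_recv(self, corr_id, response):   # fired on a socket-readable event
        self.pending.pop(corr_id)(response)

    def after_timeout(self, corr_id):       # fired by a timer event
        self.pending.pop(corr_id)({"error": "timeout"})

got = []
c = AsyncClient()
rid = c.send({"produce": b"msg"}, got.append)
c.on_recv(rid, {"offset": 7})   # the reply is routed; nothing ever blocked
assert got == [{"offset": 7}]
```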
distributed systems need to be built as state machines
- set concurrency options for ekaf
- set the hostname of a load balancer over your brokers
- ekaf hits the ground running with 1 call: publish(topic, message). if there is no state machine yet, it flows from requesting metadata -> worker pool creation -> socket connecting -> ready state
- if the topic state machine has metadata, it knows which broker serves each partition
- if the state already has a socket, queue it
- all messages in states before ready are queued; if the queue hits its size limit OR its flush timeout, send it
http://github.com/helpshift/ekaf
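The queue-then-flush behavior can be sketched like this (a hypothetical Python stand-in; ekaf implements this as Erlang state machines, and the names `Batcher`, `max_size`, `flush_timeout` are illustrative):

```python
import time

class Batcher:
    """Buffer messages between flushes (or while still connecting);
    send when the batch hits max_size OR flush_timeout, whichever
    comes first."""
    def __init__(self, send, max_size=100, flush_timeout=1.0):
        self.send, self.max_size, self.flush_timeout = send, max_size, flush_timeout
        self.queue, self.last_flush = [], time.monotonic()

    def publish(self, msg):
        self.queue.append(msg)
        if len(self.queue) >= self.max_size:   # size trigger
            self.flush()

    def tick(self):
        """Called periodically by a timer: the timeout trigger."""
        if self.queue and time.monotonic() - self.last_flush >= self.flush_timeout:
            self.flush()

    def flush(self):
        self.send(self.queue)
        self.queue, self.last_flush = [], time.monotonic()

sent = []
b = Batcher(sent.append, max_size=3)
for m in ("a", "b", "c"):
    b.publish(m)
assert sent == [["a", "b", "c"]]   # the size trigger fired
```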
tests include broker downtime, adding a broker, etc. & a mini kafka broker for tests
http://github.com/helpshift/kafkamocker
[diagram: worker up -> worker down (downtime: messages saved) -> worker up (downtime replayed; time to connect; max downtime queue) -> flushed queue]
Callbacks used for Metrics
last year: 6k/min. now: 6k/sec
only scaled up servers
hello-world driven development: response time measured against an endpoint that just echoes
ekaf @Layer (ex-Apple engineers) “The art of powering the Internet’s next messaging system” https://www.youtube.com/watch?v=mv2MBYU8Yls#t=33m5s
ekaf @ a chinese social network: 1 pull request about to be merged
and elsewhere
back to pipelines involving a kafka producer and consumer
active user analytics pipeline @helpshift (~1 billion devices, HA)
SDK -> nginx -> auth/api (#erlang) -> kafka http producer (#kafboy, uses ekaf)
-> kafka consumer (#clj-kafka, #clojure) -> hyperloglog counts -> PG (dashboards)
-> to disk -> S3 -> EMR (internal jobs)
mail delivery @helpshift
kafka consumer (#clj-kafka) -> actually sent
[WIP] ES indexing @helpshift
kafka consumer (#clj-kafka) -> docs -> ES bulk index -> actually indexed
audit/action trails @helpshift
kafka consumer (#clj-kafka) -> old object vs new object -> diff -> emit/ignore rows -> PG
a few rules: objects are namespaced, must have an id
Storm @helpshift
iTunes / Play reviews: distributed crawler (#go): #master, #worker farm, #controller
the reviews storm pipeline @helpshift
kafka producer (#shopify/sarama) -> storm kafka spout
-> deduplication -> tokenization -> topic extraction -> sentiment analysis -> PG
example storm topology: read up more on spouts and bolts (any Q’s?)
instrumenting tips:
• PG and multiple bolts tip
• metrics at every bolt
• local statsite -> grafana
• avoid metric explosion
[WIP] segment population
“find users who match level 2, who did not find the easter egg”
kafka consumer / storm (#clj-kafka) -> elasticsearch query representing the segment
scheduler -> query ES -> job tracking -> counts in PG, S3
fit it to your use case: population count, moving average
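A moving average is the kind of small per-key state a stream processor (a storm bolt or samza task) keeps as events flow past. A minimal sketch (the `MovingAverage` class is illustrative, not from any of the pipelines above):

```python
from collections import deque

class MovingAverage:
    """Streaming moving average over the last n values."""
    def __init__(self, n):
        self.window = deque(maxlen=n)   # old values fall off automatically

    def update(self, x):
        self.window.append(x)
        return sum(self.window) / len(self.window)

ma = MovingAverage(3)
assert ma.update(1) == 1.0
assert ma.update(2) == 1.5
assert [ma.update(x) for x in (3, 4)] == [2.0, 3.0]
```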
a note on samza’s state
Kafka is your glue: it lets different teams break down complex end-to-end problems into smaller, more manageable ones, like a “checkpoint” in a long hike/game/drive
separation of concerns helps you design distributed systems
numbers
your job
good sleep
your product
kafka + storm
Small Snapshot of Helpshift
Hay Day, Boom Beach, Clash of Clans, Deer Hunter, High School Story,
Family Guy, Flipboard, Circa, Wordpress, Misfit, Microsoft Outlook
APP + API
DB
MONITORING
OTHER
ROUTING: HAProxy
Our SDK is being embedded in a growing list of popular apps
#kafka #storm
Thanks!
I’m @bhaskerkode aka Bosky, Product Engg @Helpshift, [email protected]
Find this talk at http://bit.ly/fifthel15-kafka-storm