
Kafka as a message queue

Date post: 21-Jan-2018
Upload: softwaremill

KAFKA AS A MQ CAN YOU DO IT, AND SHOULD YOU DO IT?

Adam Warski, Apache Kafka London Meetup

@adamwarski, SoftwareMill, Kafka London Meetup

THE PLAN

➤ Acknowledgments in plain Kafka

➤ Why selective acknowledgments?

➤ Why not …MQ?

➤ Kmq implementation

➤ Demo

➤ Performance

ACKNOWLEDGMENTS IN PLAIN KAFKA

➤ Offset commits:

➤ Using this, we can implement:

➤ at-least-once processing

➤ at-most-once processing

[Diagram: a topic with partitions 1 to 3 holding messages msg18 to msg25; commit offsets marked at 20 and 24]
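
The two processing modes differ only in whether the offset is committed before or after a message is handled. The following is a minimal in-memory sketch, not the Kafka client API: `consume`, `commit_first` and `crash_at` are hypothetical names, a Python list stands in for one partition, and a single simulated crash is followed by a restart from the last committed offset.

```python
def consume(log, commit_first, crash_at):
    """Simulate a consumer that crashes once while handling `crash_at`,
    then restarts from the last committed offset.

    commit_first=True  -> commit, then process (at-most-once)
    commit_first=False -> process, then commit (at-least-once)
    Returns the messages whose processing completed, in order.
    """
    committed = 0      # next offset to read after a (re)start
    processed = []
    crashed = False
    while committed < len(log):
        offset = committed
        try:
            while offset < len(log):
                msg = log[offset]
                if commit_first:
                    committed = offset + 1
                    if msg == crash_at and not crashed:
                        crashed = True
                        raise RuntimeError("crash before processing")
                    processed.append(msg)
                else:
                    processed.append(msg)
                    if msg == crash_at and not crashed:
                        crashed = True
                        raise RuntimeError("crash before commit")
                    committed = offset + 1
                offset += 1
        except RuntimeError:
            continue   # restart: resume from the committed offset
    return processed


log = ["msg18", "msg19", "msg20"]
print(consume(log, commit_first=True, crash_at="msg19"))
print(consume(log, commit_first=False, crash_at="msg19"))
```

With commit-first, `msg19` is never processed after the crash (at-most-once); with process-first, it is processed twice (at-least-once).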

WHY SELECTIVE ACKNOWLEDGMENTS?

➤ Integrating with external systems

➤ e.g. HTTP/REST endpoints

➤ email

➤ other messaging

➤ Individual calls might fail

➤ should be retried

➤ without retrying the whole batch

➤ without delaying subsequent batches
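
To illustrate why up-to-offset commits make this hard: a committed offset acknowledges everything before it, so a single failed call in the middle of a batch limits how far the consumer can safely commit. A small sketch under that assumption (`safe_commit_offset` is a hypothetical helper, not part of any Kafka library):

```python
def safe_commit_offset(first_offset, results):
    """Return the highest offset that can be committed without
    acknowledging a failed message.

    first_offset: offset of the first message in the batch
    results:      per-message outcome, True for success
    The returned value is the offset of the first message that
    still needs processing (the usual meaning of a committed offset).
    """
    offset = first_offset
    for ok in results:
        if not ok:
            break          # everything from here on is re-read after a restart
        offset += 1
    return offset


# Batch at offsets 20..24, the call for offset 22 failed.
print(safe_commit_offset(20, [True, True, False, True, True]))
```

Here the consumer can commit only up to 22, so the messages at offsets 23 and 24 would be read again together with the failed one, even though their calls succeeded.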

WHY NOT …MQ?

➤ Typical usage scenario for a message queue

➤ RabbitMQ, ActiveMQ, Artemis, SQS …

➤ Kafka:

➤ proven & reliable clustering & replication mechanisms

➤ performance

➤ convenience: reduce operational complexity

AMAZON SQS

➤ Message queue as-a-service

➤ Simple API:

➤ CreateQueue

➤ SendMessage

➤ ReceiveMessage

➤ DeleteMessage

➤ Received messages are blocked for a period of time

➤ visibility timeout
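
The visibility timeout can be modelled in a few lines. This is an in-memory sketch of the idea only, not the SQS API: `VisibilityQueue` and its methods are hypothetical, time is passed in explicitly to keep the example deterministic, and a plain message id replaces the receipt handle that the real service uses for deletes.

```python
import itertools


class VisibilityQueue:
    """In-memory model of receive/delete with a visibility timeout."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}            # id -> [body, visible_at]
        self._ids = itertools.count(1)

    def send_message(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, 0.0]   # visible immediately
        return msg_id

    def receive_message(self, now):
        """Return (id, body) of the first visible message and hide it
        until now + visibility_timeout; None if nothing is visible."""
        for msg_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete_message(self, msg_id):
        """Acknowledge a single message; it is never delivered again."""
        self._messages.pop(msg_id, None)


q = VisibilityQueue(visibility_timeout=30)
a = q.send_message("a")
print(q.receive_message(now=0))    # delivered, then hidden
print(q.receive_message(now=10))   # still hidden
print(q.receive_message(now=30))   # not deleted in time: delivered again
```

A message that is received but not deleted before the timeout expires becomes visible again, which is the per-message redelivery behaviour that kmq aims to reproduce on top of Kafka.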

KMQ: IMPLEMENTATION

➤ Two topics:

➤ queue: messages to process

➤ markers: for each message, start/end markers

➤ same number of partitions

➤ A number of queue clients

➤ here data is processed

➤ A number of redelivery trackers

QUEUE CLIENT

1. Read message from queue

2. Write start [offset] to markers

➤ wait for send to complete!

3. Commit offset to queue

4. Process the message

5. Write end [offset] to markers
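
The five steps can be sketched against in-memory stand-ins. This is an illustration under stated assumptions, not the KmqClient code: `run_queue_client` is a hypothetical function, Python lists stand in for one queue partition and its matching markers partition, and the list index plays the role of the offset.

```python
def run_queue_client(queue, markers, committed, process):
    """One pass over a single queue partition, following steps 1-5.

    queue:     list of messages (index = offset)
    markers:   list standing in for the matching markers partition
    committed: offset to start reading from
    process:   callable; raising an exception means processing failed
    Returns the new committed offset.
    """
    for offset in range(committed, len(queue)):
        msg = queue[offset]                 # 1. read message from queue
        markers.append(("start", offset))   # 2. write start marker; a real
                                            #    client waits for this send
        committed = offset + 1              # 3. commit offset to queue
        try:
            process(msg)                    # 4. process the message
        except Exception:
            continue                        # no end marker: left for redelivery
        markers.append(("end", offset))     # 5. write end marker
    return committed


def flaky(msg):
    if msg == "msg38":
        raise ValueError("external call failed")


markers = []
print(run_queue_client(["msg37", "msg38", "msg39"], markers, 0, flaky))
print(markers)
```

One likely reason for the ordering is that the start marker must be durable before the offset is committed: if the offset were committed first and the client crashed, nothing would record that the message was ever in flight. A failed message blocks nothing here; it simply stays without an end marker.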

[Diagram: queue client flow between the queue topic and the markers topic, each with partitions 1 to 3. Labels: 1. read messages from topic (msg39, msg40); 2. write start markers (start marker offset: 39); 3. commit offsets (offset: 38; msg38); 4. process message (msg37): fail processing, wait for redelivery, or success, confirm message processed; 5. write end markers (end marker offset: 37)]

[Diagram: redelivery tracker. The marker stream feeds a list of started, not ended markers (offset=10, time=1488010644; offset=15, time=1488141843; offset=24, time=1488289812; …); a trigger fires every second to redeliver timed out messages: read & redeliver message (msg10)]

REDELIVERY TRACKER

➤ A Kafka application

➤ consumes the markers topic

➤ Multiple instances for fail-over

➤ Uses Kafka’s automatic partition assignment

REDELIVERY TRACKER

➤ In-memory priority queue

➤ by Kafka’s marker timestamp

➤ messages with start markers, but no end markers

➤ Checks for messages to redeliver at regular intervals

➤ redelivery: seek + send

➤ in order
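
A minimal sketch of such a tracker for a single markers partition, using a heap as the priority queue. The class and method names are hypothetical and the actual kmq implementation may differ; in particular, the seek + send step is left to the caller.

```python
import heapq


class RedeliveryTracker:
    """Tracks started-but-not-ended markers for one markers partition."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._heap = []          # (marker timestamp, offset), oldest first
        self._in_flight = set()  # offsets with a start marker, no end marker

    def on_marker(self, kind, offset, timestamp):
        if kind == "start":
            heapq.heappush(self._heap, (timestamp, offset))
            self._in_flight.add(offset)
        else:  # "end": heap entry is dropped lazily in due()
            self._in_flight.discard(offset)

    def due(self, now):
        """Pop and return the offsets to redeliver, oldest marker first."""
        result = []
        while self._heap and self._heap[0][0] + self.timeout <= now:
            _, offset = heapq.heappop(self._heap)
            if offset in self._in_flight:   # skip messages already ended
                self._in_flight.discard(offset)
                result.append(offset)
        return result


tracker = RedeliveryTracker(timeout=60)
tracker.on_marker("start", 10, timestamp=100)
tracker.on_marker("start", 15, timestamp=110)
tracker.on_marker("start", 24, timestamp=150)
tracker.on_marker("end", 15, timestamp=120)
print(tracker.due(now=165))
print(tracker.due(now=215))
```

Calling `due` at a regular interval corresponds to the periodic check; each returned offset would then be read from the queue topic by seeking to it and sent again. Ordering by timestamp means the check stops at the first marker that has not timed out yet.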

DEMO

PERFORMANCE

➤ 3-node Kafka cluster

➤ m4.2xlarge servers (8 CPUs, 32GiB RAM)

➤ single AZ

➤ 100 byte messages, sent in batches of up to 10

➤ Up to 8 sender/receiver nodes

➤ 64 to 160 partitions

➤ replication-factor=3

➤ min.insync.replicas=2

➤ acks=all (-1)

[Charts: PLAIN KAFKA | KMQ]

LATENCY

➤ Plain Kafka: ~50 milliseconds

➤ kmq: 50 to 130 milliseconds

WHAT IF MESSAGES ARE DROPPED?

➤ 50% drop rate

KMQ INTERNALS

➤ RedeliveryTracker

➤ Implemented in Scala, with a Java API

➤ Uses Akka

➤ One tracking actor per markers topic partition

➤ One redeliver actor per queue topic partition

➤ Started/stopped when partitions are assigned/revoked

➤ KmqClient

➤ Single Java class

➤ + marker value classes

ABOUT ME

➤ Software engineer, co-founder @ SoftwareMill

➤ Custom software development: Scala/Kafka/Java/Cassandra/…

➤ Open-source: sttp, QuickLens, ElasticMQ, Envers, MacWire, …

➤ Blog @ softwaremill.com/blog

➤ Twitter @ twitter.com/adamwarski

SUMMARY

➤ Individual, selective message acknowledgments

➤ similar to SQS

➤ Alternative to batch/up-to-offset acknowledgments in plain Kafka

➤ Storage overhead: additional meta-data topic

➤ Performance overhead: comparable

➤ Integrating with external systems

LINKS

➤ GitHub: https://github.com/softwaremill/kmq

➤ Introductory blog: https://softwaremill.com/using-kafka-as-a-message-queue/

➤ Message queue performance: https://softwaremill.com/mqperf/

➤ @adamwarski

THANK YOU!

