Apache Kafka Event-Streaming Platform for .NET Developers

Post on 09-Aug-2020

Apache Kafka Event-Streaming Platform for .NET Developers

October 2019

@gamussa | #SpringOne | @ConfluentINc

#SpringOne @s1p

I build highly scalable Hello World apps

A company is built on DATA FLOWS

but all we have is DATA STORES

Pre-Streaming

New World: Streaming first

• DB/DWH + many more distributed data systems
• Monolith -> Microservices
• Batch -> Real-time

Origins in Stream Processing

Serving Layer

(Microservices, Elastic, etc.)

Java Apps with Kafka Streams or KSQL

Continuous Computation

High Throughput Streaming platform

API based clustering

Streaming Platform

Storage

Pub / Sub

Processing

Storage

Core Abstraction

● DB - table
● Hadoop - file
● Kafka - ?

LOG

The log is a simple idea

Messages are added at the end of the log

Old New
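The "simple idea" can be sketched in a few lines. This is a toy in-memory model, not Kafka's storage format; the `Log` class and its method names are hypothetical and exist only to show that records are appended at the end and addressed by position (offset).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Log:
    """A toy append-only log: records are only ever added at the end."""
    records: List[bytes] = field(default_factory=list)

    def append(self, record: bytes) -> int:
        """Add a record at the end and return its offset (its position)."""
        self.records.append(record)
        return len(self.records) - 1

    def read(self, offset: int, max_records: int = 10) -> List[bytes]:
        """Read sequentially, starting at the given offset."""
        return self.records[offset:offset + max_records]


log = Log()
first = log.append(b"oldest message")
second = log.append(b"newer message")
third = log.append(b"newest message")
```

Existing records are never modified; a reader only needs to remember an offset to resume where it left off.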

Pub / Sub

Time

C1 C2 C3

Time

Time

A

B

C

D

hash(key) % numPartitions = N
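A minimal sketch of the formula above. The function name `partition_for` is hypothetical, and `zlib.crc32` is only a stand-in hash chosen because it is deterministic and in the standard library; real clients use their own hash functions (the Java client, for example, uses murmur2), so the partition numbers here will not match a real cluster.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition: hash(key) % numPartitions."""
    # zlib.crc32 is a stand-in hash for illustration only.
    return zlib.crc32(key) % num_partitions


num_partitions = 4
p_first = partition_for(b"card-1234", num_partitions)
p_again = partition_for(b"card-1234", num_partitions)
p_other = partition_for(b"card-9999", num_partitions)
```

The useful property is that the same key always maps to the same partition, so all messages for one key stay in order within that partition.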

Messages without a key will be produced in a round-robin fashion

Time
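Round-robin distribution can be sketched as cycling through the partition numbers. This is a simplified model with a hypothetical helper name; it ignores batching and any client-specific partitioner behavior.

```python
from itertools import cycle


def round_robin(num_partitions: int):
    """Yield partition numbers 0, 1, ..., n-1, 0, 1, ... forever."""
    return cycle(range(num_partitions))


partitions = round_robin(3)
# Seven messages spread over three partitions.
assignments = [next(partitions) for _ in range(7)]
```

With three partitions the seven messages land on partitions 0, 1, 2, 0, 1, 2, 0, which spreads load evenly but gives no per-key ordering.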

Consumers have a position all of their own

Old New

Robin is here: Scan

Viktor is here: Scan

Ricardo is here: Scan
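The idea that each consumer keeps its own position can be sketched as follows. The `Consumer` class, its `seek` and `poll` methods, and the message values are hypothetical stand-ins for illustration, not the API of any Kafka client.

```python
log = ["m0", "m1", "m2", "m3", "m4"]  # one shared log, oldest to newest


class Consumer:
    """A toy consumer that tracks its own offset into a shared log."""

    def __init__(self, name: str, offset: int = 0):
        self.name = name
        self.offset = offset

    def seek(self, offset: int) -> None:
        """Move this consumer's position; other consumers are unaffected."""
        self.offset = offset

    def poll(self, log, max_records: int = 2):
        """Scan forward from the current offset and advance past what was read."""
        batch = log[self.offset:self.offset + max_records]
        self.offset += len(batch)
        return batch


robin = Consumer("Robin")
viktor = Consumer("Viktor", offset=2)
ricardo = Consumer("Ricardo", offset=4)

robin_batch = robin.poll(log)
viktor_batch = viktor.poll(log)
ricardo_batch = ricardo.poll(log)
```

Reading does not remove anything from the log; each consumer simply moves its own offset forward, and `seek` lets it jump back to replay older messages.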

Only Sequential Access

Old New

Read to offset & scan

CONSUMER GROUP COORDINATOR

CONSUMERS

CONSUMER GROUP

[Diagram sequence: consumers (C) in consumer groups reading a topic with partitions 0, 1, 2 and 3; in the last frame one consumer is shown with partitions 0 and 3.]
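The consumer group slides can be approximated with a small assignment function. This is a simplified sketch under assumptions: the function `assign` and the consumer names are hypothetical, and real Kafka uses configurable assignment strategies coordinated through the group coordinator rather than this exact rule.

```python
def assign(partitions, consumers):
    """Spread partitions over the live consumers; each partition gets one owner."""
    if not consumers:
        return {}
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Four consumers in the group: one partition each.
before = assign(partitions, ["c1", "c2", "c3", "c4"])

# Consumer c4 leaves the group: its partition is handed to a survivor.
after = assign(partitions, ["c1", "c2", "c3"])
```

In this sketch, after `c4` leaves, `c1` owns partitions 0 and 3 while `c2` and `c3` keep one partition each, so every partition is still read by exactly one member of the group.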

Linearly Scalable Architecture

Single topic:
- Many producer machines
- Many consumer machines
- Many broker machines

No bottleneck!

Producers

Consumers

Replicate to get fault tolerance

replicate

msg

msg

leader

Machine A

Machine B

Partition Leadership and Replication

[Diagram: Topic1 has four partitions (partition1 to partition4), each stored as three replicas spread across Broker 1 to Broker 4. Legend: Leader, Follower.]

Replication provides resiliency

A replica takes over on machine failure
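Failover can be sketched as picking a surviving replica when the leader's machine goes down. This is a deliberately simplified model: the function `elect_leader` and the broker numbering are hypothetical, and real Kafka restricts the choice to in-sync replicas and runs the election on the cluster controller.

```python
def elect_leader(replicas, live_brokers, current_leader=None):
    """Keep the current leader if it is alive, else pick the first live replica."""
    if current_leader in live_brokers:
        return current_leader
    for broker in replicas:
        if broker in live_brokers:
            return broker
    return None  # no replica is available, so the partition is offline


replicas = [1, 2, 3]  # brokers holding a copy of one partition
leader_before = elect_leader(replicas, live_brokers={1, 2, 3, 4})

# Broker 1 fails: a follower on a surviving broker takes over.
leader_after = elect_leader(replicas, live_brokers={2, 3, 4},
                            current_leader=leader_before)
```

Because the followers already hold copies of the messages, the new leader can keep serving producers and consumers without data being re-sent.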

Partition Leadership and Replication - node failure

[Diagram: the same layout of Topic1 partition1 to partition4, three replicas each across Broker 1 to Broker 4, shown for the node-failure case. Legend: Leader, Follower.]

The log is a type of durable messaging system

Similar to a traditional messaging system (ActiveMQ, RabbitMQ, etc.) but with:
(a) Far better scalability
(b) Built-in fault tolerance / HA
(c) Storage

Stop! Demo time!

Processing

Streaming is the toolset for dealing with events as they move!

authorization_attempts possible_fraud

What exactly is Stream Processing?

-- A windowed aggregation produces a table in KSQL
CREATE TABLE possible_fraud AS
  SELECT card_number, count(*)
  FROM authorization_attempts
  WINDOW TUMBLING (SIZE 5 MINUTE)
  GROUP BY card_number
  HAVING count(*) > 3;
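To make the query's logic concrete, here is a rough Python equivalent of the tumbling-window count. It is a sketch under assumptions: it processes a finite list in one pass, whereas KSQL updates the result continuously as events arrive, and the function name, tuple layout, and sample data are hypothetical.

```python
from collections import Counter

WINDOW_MS = 5 * 60 * 1000  # 5-minute tumbling windows, in milliseconds


def possible_fraud(attempts, threshold=3):
    """Count attempts per (card_number, window) and keep counts above threshold."""
    counts = Counter()
    for timestamp_ms, card_number in attempts:
        # Tumbling windows do not overlap: each event falls in exactly one.
        window_start = timestamp_ms - (timestamp_ms % WINDOW_MS)
        counts[(card_number, window_start)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


# (timestamp in ms, card number) pairs; made-up sample data.
attempts = [
    (0, "1111"), (60_000, "1111"), (120_000, "1111"), (180_000, "1111"),
    (200_000, "2222"),
    (310_000, "1111"),  # falls into the next 5-minute window
]
flagged = possible_fraud(attempts)
```

Only card "1111" in the first window exceeds three attempts; the later attempt at 310 seconds starts a fresh count because it belongs to a new window.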

Lower the bar to enter the world of streaming

User Population

Coding Sophistication

Core developers who use Java/Scala

Core developers who don’t use Java/Scala

Data engineers, architects, DevOps/SRE

BI analysts

streams

KSQL #FTW

1 UI

2 CLI

ksql>

3 REST

POST /query

4 Headless
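As a rough illustration of the REST option, the sketch below only builds a `POST /query` request and does not send it. The server address, the helper name `build_query_request`, and the sample statement are assumptions; the JSON body shape (`ksql`, `streamsProperties`) and the content type follow the KSQL REST API as I understand it and should be checked against the documentation for your server version.

```python
import json
import urllib.request

# Assumption: a KSQL server listening locally on port 8088.
KSQL_URL = "http://localhost:8088/query"


def build_query_request(statement: str, url: str = KSQL_URL):
    """Build (but do not send) an HTTP POST carrying one KSQL statement."""
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


request = build_query_request("SELECT * FROM possible_fraud;")
# urllib.request.urlopen(request) would send it; that step is left out so
# the sketch runs without a server.
```

Because the interface is plain HTTP and JSON, the same request can be issued from any language, including .NET.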

Interaction with Kafka

Kafka (data)

KSQL (processing): does not run on Kafka brokers

Application (processing) with Java/KStreams or .NET: does not run on Kafka brokers

Find your local Meetup Group https://cnfl.io/kafka-meetups

Join us in Slack http://cnfl.io/slack

Grab Stream Processing books https://cnfl.io/book-bundle

Thanks! @gamussa viktor@confluent.io