A Gentle Introduction To Storm And Kafka

Date post: 12-Apr-2017
Upload: mammoth-data
The Leader in Big Data Consulting
Transcript
Page 1: A Gentle Introduction To Storm And Kafka

The Leader in Big Data Consulting

Page 2: A Gentle Introduction To Storm And Kafka

www.mammothdata.com | @mammothdataco

A Gentle Introduction to Kafka and Storm

{Percona University | Raleigh}

Page 3: A Gentle Introduction To Storm And Kafka

Open Software Integrators

Open Software Integrators is a Big Data consulting and services company specializing in Hadoop, Cassandra, MongoDB and other NoSQL technologies. OSI focuses on executive strategy, initial install, design and implementation.

Founded January 2008 by Andrew C. Oliver

Based in downtown Durham, NC

Partnered with Hortonworks, MongoDB, DataStax, Cloudera, Couchbase, Cloudbees & Neo Technology

Page 4: A Gentle Introduction To Storm And Kafka

A Gentle Introduction

● What Kafka and Storm are
● What they can be used for
● What they excel at

Page 5: A Gentle Introduction To Storm And Kafka

Kafka

Kafka and Storm

Page 6: A Gentle Introduction To Storm And Kafka

What is Apache Kafka?

Kafka is a distributed, partitioned, replicated commit log service.

Page 7: A Gentle Introduction To Storm And Kafka

The Commit Log

An append-only, immutable sequence of records ordered by time.

[Diagram: the log, running from the first record to the next written record]
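
As a rough illustration of the definition above, the following Python sketch models an append-only log in memory. It is a toy model and not Kafka's API: the class name `CommitLog` and its methods are hypothetical, and a real log would be persisted to disk and replicated.

```python
from typing import List, Tuple


class CommitLog:
    """Toy in-memory commit log: records can be appended and read, never changed."""

    def __init__(self) -> None:
        self._records: List[bytes] = []

    def append(self, record: bytes) -> int:
        """Add a record at the end of the log and return its offset."""
        self._records.append(bytes(record))
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10) -> List[Tuple[int, bytes]]:
        """Return up to max_records (offset, record) pairs starting at offset."""
        chunk = self._records[offset:offset + max_records]
        return [(offset + i, rec) for i, rec in enumerate(chunk)]


log = CommitLog()
first = log.append(b"first record")
second = log.append(b"next written record")
```

Because records are only ever added at the end, the offset returned by `append` is a stable position that readers can use to resume where they left off.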

Page 8: A Gentle Introduction To Storm And Kafka

Kafka is:

● fast
● durable
● distributed
● scalable

Page 9: A Gentle Introduction To Storm And Kafka

Kafka Abstractions

● Topic: feeds of messages in categories
● Broker: a host running Kafka
● Producer: a process that publishes messages
● Consumer: a process that pulls messages
● Partition: portion of a topic’s stream of messages
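
To make these terms concrete, here is a minimal single-process sketch in Python. It does not use the Kafka client library: the classes `Topic`, `Producer` and `Consumer` are hypothetical stand-ins, brokers are left out entirely, and the CRC32-based choice of partition is an assumption made for this example only.

```python
import zlib
from typing import Dict, List, Tuple


class Topic:
    """Toy topic: a named feed split into a fixed number of partitions."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions: List[List[bytes]] = [[] for _ in range(num_partitions)]


class Producer:
    """Publishes messages; messages with the same key land in the same partition."""

    def send(self, topic: Topic, key: bytes, value: bytes) -> Tuple[int, int]:
        # CRC32 is a stand-in hash chosen for this sketch only.
        partition = zlib.crc32(key) % len(topic.partitions)
        topic.partitions[partition].append(value)
        offset = len(topic.partitions[partition]) - 1
        return partition, offset


class Consumer:
    """Pulls messages and remembers how far it has read in each partition."""

    def __init__(self) -> None:
        self.offsets: Dict[Tuple[str, int], int] = {}

    def poll(self, topic: Topic, partition: int) -> List[bytes]:
        start = self.offsets.get((topic.name, partition), 0)
        messages = topic.partitions[partition][start:]
        self.offsets[(topic.name, partition)] = start + len(messages)
        return messages


topic = Topic("page-views", num_partitions=3)
producer = Producer()
p1, o1 = producer.send(topic, b"user-1", b"viewed /home")
p2, o2 = producer.send(topic, b"user-1", b"viewed /pricing")

consumer = Consumer()
first_poll = consumer.poll(topic, p1)   # both messages for user-1
second_poll = consumer.poll(topic, p1)  # nothing new since the last poll
```

The consumer asks for data and tracks its own position, which is the "pull" in the consumer definition above; the producer never waits for a consumer.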

Page 10: A Gentle Introduction To Storm And Kafka

What Kafka is used for:

Enterprise-grade event streaming

Page 11: A Gentle Introduction To Storm And Kafka

What Kafka is not good at:

Doing anything other than being a commit log.

Page 12: A Gentle Introduction To Storm And Kafka

Storm

Kafka and Storm

Page 13: A Gentle Introduction To Storm And Kafka

What is Apache Storm?

Storm is a distributed, real-time computation system.

Page 14: A Gentle Introduction To Storm And Kafka

Stream Processing

Several processes fall into the domain of stream processing:

● Event Sourcing
● Command and Query Responsibility Segregation
● Complex Event Processing
● etc.

Page 15: A Gentle Introduction To Storm And Kafka

What Storm Does

● Simple API
● Guaranteed data processing
● Fault tolerant
● Scalable
● Usable with any language

Page 16: A Gentle Introduction To Storm And Kafka

Storm Abstractions

Three abstractions:
● Spouts
● Bolts
● Topology

[Diagram: a topology of two spouts and four bolts]
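
The three abstractions can be sketched in plain Python as a single-process word count. This is not the Storm API: the function names are hypothetical, a spout is modelled as a generator that produces the stream, each bolt as a function that consumes one stream and emits another, and the topology as the wiring between them. A real topology runs these components in parallel across a cluster.

```python
from collections import Counter
from typing import Iterable, Iterator


def sentence_spout() -> Iterator[str]:
    """Spout: the source of the stream (here, a fixed list of sentences)."""
    yield from ["storm reads from kafka", "kafka feeds storm"]


def split_bolt(sentences: Iterable[str]) -> Iterator[str]:
    """Bolt: consumes sentences and emits one word at a time."""
    for sentence in sentences:
        yield from sentence.split()


def count_bolt(words: Iterable[str]) -> Counter:
    """Bolt: consumes words and keeps a running count per word."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
    return counts


def run_topology() -> Counter:
    """Topology: the wiring of spout -> split bolt -> count bolt."""
    return count_bolt(split_bolt(sentence_spout()))


word_counts = run_topology()
```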

Page 17: A Gentle Introduction To Storm And Kafka

Storm Processes

Processes:
● UI
● Nimbus
● Supervisor
● Worker

[Diagram: Web UI and Nimbus; Zookeeper; two Supervisors, each with two Workers]

Page 18: A Gentle Introduction To Storm And Kafka

Storm Parallelism Model

● Worker process
● Executors
● Tasks
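
One way to picture how the three levels nest: tasks run inside executors (threads), and executors run inside worker processes. The Python sketch below spreads a hypothetical bolt's tasks over executors and those executors over workers. The counts and the round-robin policy are assumptions chosen for illustration; in a real cluster Storm's scheduler decides the actual placement.

```python
from typing import Dict, List


def assign_round_robin(items: List[str], num_slots: int) -> Dict[int, List[str]]:
    """Spread items over num_slots slots in round-robin order."""
    slots: Dict[int, List[str]] = {slot: [] for slot in range(num_slots)}
    for index, item in enumerate(items):
        slots[index % num_slots].append(item)
    return slots


# Hypothetical bolt: 4 tasks, run by 2 executors (threads), in 2 worker processes.
tasks = [f"task-{i}" for i in range(4)]
tasks_per_executor = assign_round_robin(tasks, num_slots=2)

executors = [f"executor-{e}" for e in tasks_per_executor]
executors_per_worker = assign_round_robin(executors, num_slots=2)
```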

Page 19: A Gentle Introduction To Storm And Kafka

Use Case: Security

Kafka and Storm

Page 20: A Gentle Introduction To Storm And Kafka

Use Case: Security

Security customer analytics platform:
● Pulling data from customer sites
● Placing data in a SQL database
● Performing analysis to spot anomalous traffic
● Pushing results back to the client to block traffic sources

Page 21: A Gentle Introduction To Storm And Kafka

Use Case: Security

Original system, mean turnaround time: 4.5 hours
Storm / Kafka solution, maximum processing time: 2.6 seconds

Page 22: A Gentle Introduction To Storm And Kafka

Thank You

Kafka and Storm

Page 23: A Gentle Introduction To Storm And Kafka

Links

Kafka: http://kafka.apache.org/
Storm: http://storm.apache.org/

