Kafka Connect: London Meetup 2016

Post on 06-Jan-2017

Stream All Things: Real-time Data Integration at Scale with Apache Kafka

By Gwen Shapira

[Architecture diagram]

Components: Clients ("Swipe here!"), Web App, Flume Agents, Kafka, Hadoop Cluster I, Hadoop Cluster II, HBase / Memory, Local Cache, Spark Streaming, HDFS, Hive/Impala, Map/Reduce, Spark, Solr, Search

Sinks: HDFSEventSink, Solr Sink

Labels: Storage, Processing; Fetching & Updating Profiles; Adjusting NRT Stats; Batch Time Adjustments; Automated & Manual Review of NRT Changes and Counters; Automated & Manual Analytical Adjustments and Pattern Detection

Data Integration: getting data to all the right places

Introducing Kafka Connect: Large-scale streaming data import/export for Kafka

Delivery Guarantees

Offsets automatically committed and restored

On restart: task checks offsets & rewinds

At-least-once delivery – flush data, then commit

Exactly-once for connectors that support it (e.g. HDFS)
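The "flush data, then commit" ordering can be illustrated with a small simulation. This is a minimal sketch, not Kafka Connect code: the class `SinkTaskSim`, its fields, and the `crash_before_commit_at` parameter are hypothetical names invented for the example, and an in-memory list stands in for both the Kafka partition and the external system. It shows why a crash between the flush and the commit leads to redelivery rather than data loss.

```python
from dataclasses import dataclass, field


@dataclass
class SinkTaskSim:
    """Toy sink task (hypothetical): flushes a batch, then commits its offset."""
    committed_offset: int = 0                          # where a restarted task resumes
    buffer: list = field(default_factory=list)         # records read but not yet flushed
    destination: list = field(default_factory=list)    # stands in for the external system

    def run(self, log, batch_size, crash_before_commit_at=None):
        """Consume `log` starting from the last committed offset.

        If `crash_before_commit_at` equals the offset about to be committed,
        simulate a crash after the flush but before the commit.
        Returns True if the run finished, False if it "crashed".
        """
        position = self.committed_offset   # on (re)start: rewind to committed offset
        self.buffer.clear()                # unflushed records are lost in a crash
        while position < len(log):
            self.buffer.append(log[position])
            position += 1
            if len(self.buffer) == batch_size or position == len(log):
                self.destination.extend(self.buffer)   # step 1: flush data
                self.buffer.clear()
                if crash_before_commit_at == position:
                    return False                       # crash: offset never committed
                self.committed_offset = position       # step 2: commit offset
        return True


log = ["r0", "r1", "r2", "r3"]
task = SinkTaskSim()
first_run_ok = task.run(log, batch_size=2, crash_before_commit_at=4)
second_run_ok = task.run(log, batch_size=2)   # restart: resumes from offset 2
delivered = task.destination
```

In this run the second batch is flushed twice, so `delivered` contains every record at least once, with `r2` and `r3` duplicated. A connector that wants exactly-once behaviour has to make the flush and the offset record effectively one atomic step, or make the write idempotent.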

Converters

Abstract serialization: 1 connector, many serialization formats

Convert between Kafka Connect Data API (Connectors) and serialized bytes (Kafka)

JSON and Avro are currently well supported
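The "1 connector, many serialization formats" idea can be sketched as follows. This is an illustration under stated assumptions, not the Kafka Connect API: the real converter interface is a Java interface that also receives the topic and a schema, whereas the classes below are simplified, hypothetical stand-ins. Avro needs a third-party library, so a Python-literal format is used as the second format purely for demonstration.

```python
import ast
import json
from abc import ABC, abstractmethod


class Converter(ABC):
    """Hypothetical, simplified converter: in-memory record <-> serialized bytes."""

    @abstractmethod
    def from_connect_data(self, value) -> bytes:
        """Serialize a record for writing to Kafka."""

    @abstractmethod
    def to_connect_data(self, payload: bytes):
        """Deserialize bytes read from Kafka back into a record."""


class JsonConverter(Converter):
    def from_connect_data(self, value) -> bytes:
        return json.dumps(value, sort_keys=True).encode("utf-8")

    def to_connect_data(self, payload: bytes):
        return json.loads(payload.decode("utf-8"))


class LiteralConverter(Converter):
    """Second toy format (Python literals), standing in for e.g. Avro."""

    def from_connect_data(self, value) -> bytes:
        return repr(value).encode("utf-8")

    def to_connect_data(self, payload: bytes):
        return ast.literal_eval(payload.decode("utf-8"))


def run_source(records, converter):
    """Connector code only produces records; the converter is picked by configuration."""
    return [converter.from_connect_data(r) for r in records]


records = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
json_bytes = run_source(records, JsonConverter())
literal_bytes = run_source(records, LiteralConverter())
json_round_trip = [JsonConverter().to_connect_data(b) for b in json_bytes]
literal_round_trip = [LiteralConverter().to_connect_data(b) for b in literal_bytes]
```

The connector function `run_source` is identical for both formats; only the converter object changes, which is the separation the slide describes.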

Connectors Today

Confluent Open Source – HDFS, JDBC

Connector Hub: connectors.confluent.io

Examples: MySQL, MongoDB, Twitter, Solr, S3, MQTT, Bloomberg, Apache Ignite, Attunity, Couchbase, Vertica, Cassandra, HBase, Kudu, Mixpanel, Syslog and more
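As a rough idea of what using one of these connectors looks like, the sketch below builds the JSON body that would be POSTed to a Connect worker's `/connectors` REST endpoint to start a JDBC source connector. The connector name, database host, column name, and topic prefix are made-up example values, and the helper `connector_request` is hypothetical. The configuration keys are taken from the Confluent JDBC source connector as best understood here and should be checked against the documentation for the version in use.

```python
import json


def connector_request(name, config):
    """Build the JSON body for creating a connector (hypothetical helper).

    Connector configuration values are strings, so everything is stringified.
    """
    body = {"name": name, "config": {key: str(value) for key, value in config.items()}}
    return json.dumps(body, indent=2, sort_keys=True)


jdbc_source = connector_request(
    "example-jdbc-source",  # example name, not from the talk
    {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "tasks.max": 1,
        # Example connection string; host, database and user are placeholders.
        "connection.url": "jdbc:mysql://db.example.com:3306/shop?user=connect",
        "mode": "incrementing",            # detect new rows via a growing column
        "incrementing.column.name": "id",  # assumed auto-increment key column
        "topic.prefix": "mysql-",          # each table goes to topic mysql-<table>
    },
)
print(jdbc_source)
```

The example only builds and prints the request body; sending it requires a running Connect worker.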

Connectors from the Hackathon

Jenkins connector – Aravind Yarram (Equifax)

Twitter semantic analysis and visualization – Ashish Singh (Cloudera)

Brain monitoring device connector – Silicon Valley Data Science

DynamoDB, Cassandra, Slack, Splunk, and many more

Coming soon…

Improved connector control via REST API, standardized configs, metrics

Single record transformations

Data pipelines in an app – embedded mode & Kafka Streams integration

Many more connectors
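The talk does not say how single record transformations will be exposed, so the following is only a conceptual sketch of the idea: small functions that each take one record and return a modified record, applied in order between the connector and Kafka. All function names here are hypothetical.

```python
def insert_field(name, value):
    """Return a transformation that adds a constant field to each record."""
    def transform(record):
        return {**record, name: value}
    return transform


def mask_field(name):
    """Return a transformation that hides the value of one field, if present."""
    def transform(record):
        return {**record, name: "****"} if name in record else record
    return transform


def apply_chain(record, transforms):
    """Apply each transformation to a single record, in order."""
    for transform in transforms:
        record = transform(record)
    return record


chain = [insert_field("source", "mysql"), mask_field("email")]
original = {"id": 7, "email": "a@example.com"}
transformed = apply_chain(original, chain)
```

Each transformation sees exactly one record and keeps no state, which is what separates this kind of lightweight change from stream processing that joins or aggregates across records.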

THANK YOU!

Gwen Shapira | gwen@confluent.io | @gwenshap

Visit us in the Confluent Booth (#217)

Kafka: The Definitive Guide – Book Giveaway and Signing

Making Sense of Stream Processing – Book Giveaway

Kafka Training with Confluent University

Kafka Developer and Operations Courses

Visit www.confluent.io/training

Want more Kafka?

Download Confluent Platform Enterprise at http://www.confluent.io/product

Apache Kafka 0.10 upgrade documentation at http://docs.confluent.io/3.0.0/upgrade.html

Kafka Summit recordings now available at http://kafka-summit.org/schedule/