Confluent: Streaming operational data with Kafka – Couchbase Connect 2016

Date post:15-Apr-2017
Transcript:
  • 1Confidential

    State of the Streaming Platform 2016
    What's new in Apache Kafka and the Confluent Platform

    David Tucker, Confluent
    David Ostrovsky, Couchbase

  • 3Confidential

    Who are we?

    David Tucker
    Director of Partner Engineering, Confluent

    Background: Architect and designer
        HP Alliances: 4 CEOs, 3 enterprise hardware platforms
        Saw the Hadoop light; led partner engineering at MapR
        Better living through data (bigger, faster, better)

    Expertise
        Data management solutions
        Cloud services and orchestration

    David Ostrovsky
    Senior Solutions Architect, Couchbase

    Background: Consultant and author
        Hadoop and data processing
        Wrote a couple of books about Couchbase
        Big data nerd

    Expertise
        Databases and administration
        Streaming data processing

  • 4Confidential

    What does Kafka do?

  • 5Confidential

    Kafka is much more than a pub-sub messaging system

  • 6Confidential

    Before: Many Ad Hoc Pipelines

    [Diagram: many separate point-to-point pipelines among the systems below]

    Search, Security

    Fraud Detection Application

    User Tracking, Operational Logs, Operational Metrics

    Hadoop App Data Warehouse

    Monitoring App

    Espresso, Cassandra, Oracle

    Layers: Databases, Storage, Interfaces

  • 7Confidential

    After: Streaming Platform with Kafka

    [Diagram: the systems below all connect through a central Kafka platform]

    Kafka: Distributed, Fault Tolerant, Stores Messages, Processes Streams (Kafka Streams)

    Search, Security

    Fraud Detection Application

    User Tracking, Operational Logs, Operational Metrics

    Espresso, Couchbase, Oracle

    Hadoop App Monitoring App Data Warehouse

  • 8Confidential

    Apache Kafka: A distributed streaming platform

  • 9Confidential

    From Big Data to Stream Data

    Big Data was: The More the Better

    Stream Data is: The Faster the Better

    Stream Data can be: Big or Fast (Lambda)

    Stream Data will be: Big AND Fast (Kappa)

    [Charts: Value of Data vs. Volume of Data; Value of Data vs. Age of Data]

    [Diagram labels, Lambda: Streams, Hadoop, DB, Speed table, Batch table]

    [Diagram labels, Kappa: Streams, DB, Table 1, Table 2, Job 1, Job 2]

    Apache Kafka is the Enabling Technology of this Transition

  • 10Confidential

    Confluent Platform, the Enterprise Streaming Platform

    [Diagram legend: Commercial, Open source, External. Labeled component: Auto-Data Balancing]

  • 11Confidential

    How do I get streams of data into and out of my apps?

    Connect, Clients, REST

  • 12Confidential

    Apache Kafka™ Connect: Streaming Data Capture

    Fault tolerant

    Manage hundreds of data sources and sinks

    Preserves data schema

    Part of Apache Kafka project

    Integrated within Confluent Platform's Control Center

    [Diagram: Sources and Sinks attached through Connectors to Kafka Connect and the Kafka Brokers. Systems shown: MySQL, Couchbase, JDBC, HDFS, Couchbase, Elastic]
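As a rough illustration of how a source is attached, the sketch below builds the JSON request that registers a connector with a Kafka Connect worker over its REST interface. The worker address, connector name, database URL, column name, and topic prefix are all assumptions made up for this example; the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: a Kafka Connect worker's REST API at this address
# (8083 is the usual default port).
CONNECT_URL = "http://localhost:8083/connectors"


def build_connector_request(name, config, url=CONNECT_URL):
    """Build (but do not send) the POST request that registers a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical JDBC source: database, credentials, column, and prefix are made up.
jdbc_source_config = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "connection.url": "jdbc:mysql://localhost:3306/demo?user=demo&password=demo",
    "mode": "incrementing",               # pick up new rows by a growing id
    "incrementing.column.name": "id",
    "topic.prefix": "mysql-",             # each table lands in topic mysql-<table>
}

request = build_connector_request("mysql-source-demo", jdbc_source_config)
# To submit against a running worker: urllib.request.urlopen(request)
```

The same request shape applies to a sink connector; only the connector class and its settings change.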

  • 13Confidential

    Kafka Connect Library of Connectors

    Databases: JDBC*, Couchbase, Datastax / Cassandra, GoldenGate, JustOne, DynamoDB, MongoDB, HBase, InfluxDB, Kudu, RethinkDB

    Datastore / File Store: HDFS*, Apache Ignite, FTP, Syslog, Hazelcast

    Analytics: Elasticsearch*, Vertica, Mixpanel

    Applications / Other: Attunity, AWS / S3, Bloomberg Ticker, Striim, Solr, Syncsort, Twitter

    * Denotes Connectors developed at Confluent and distributed with the Confluent Platform. Extensive validation and testing has been performed.

  • 14Confidential

    Kafka Clients

    Ruby, Proxy (http/REST), Stdin/stdout

    Apache Kafka Native Clients

    Confluent Native Clients

    Community Supported Clients

  • 15Confidential

    REST Proxy: Enable Any Application to Access Kafka Data

    [Diagram labels: REST/HTTP, REST Proxy, Schema Registry, Native Kafka Java Applications, Legacy Applications]

    Provides a RESTful interface to a Kafka cluster

    Simplifies message creation and consumption

    Simplifies administrative actions
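A minimal sketch of what "message creation" over HTTP can look like: the function below builds a produce request for the REST Proxy using only the Python standard library. The proxy address (localhost:8082) and the topic name are assumptions for illustration, and the v2 JSON content type is used here; older proxy releases use the corresponding v1 type. The request is built but not sent.

```python
import json
import urllib.request


def build_produce_request(topic, values, base_url="http://localhost:8082"):
    """Build (but do not send) a REST Proxy request that produces JSON records."""
    # The proxy expects a "records" array; each record carries a "value".
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/topics/{topic}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/vnd.kafka.json.v2+json",
            "Accept": "application/vnd.kafka.v2+json",
        },
    )


# Hypothetical topic and payload.
request = build_produce_request("tweets", [{"id": 1, "text": "hello"}])
# To send it to a running proxy: urllib.request.urlopen(request)
```

Any language with an HTTP client can do the same, which is the point of the proxy for applications that cannot use a native Kafka client.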

  • 16Confidential

    How do I maintain my data formats and ensure compatibility?

  • 17Confidential

    The Challenge of Data Compatibility at Scale

    Many sources without a policy cause mayhem in a centralized data pipeline

    Ensuring downstream systems can use the data is key to an operational stream pipeline

    Example: date formats

    Even within a single application, different formats can be presented

    [Diagram labels: App 1, App 2, App 3]
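To make the date-format example concrete, here is a small sketch of a consumer that has to cope with three producers writing the same date differently. The list of formats is an assumption chosen for illustration, not something taken from the talk.

```python
from datetime import datetime

# Assumption: the formats different producers might emit (illustrative only).
KNOWN_FORMATS = ["%Y-%m-%d", "%d.%m.%Y", "%m/%d/%Y"]


def normalize_date(text):
    """Return the date as ISO 8601 (YYYY-MM-DD), trying each known format in order."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")


for raw in ["2016-11-08", "08.11.2016", "11/08/2016"]:
    print(raw, "->", normalize_date(raw))
```

Note that "11/08/2016" is read as month/day here purely by convention; a producer that meant day/month would be silently misread, which is the kind of problem an agreed schema is meant to prevent.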

  • 18Confidential

    Confluent: Schema Registry

    Define the expected fields for each Kafka topic

    Automatically handle schema changes (e.g. new fields)

    Prevent backward-incompatible changes

    Support multi-datacenter environments

    [Diagram labels: App 1 and App 2, each with a Serializer (one marked "!"); Schema Registry; Kafka Topic; Example Consumers: HDFS, Couchbase, Elastic]
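The idea of preventing backward-incompatible changes can be sketched with a deliberately simplified check. The function below only looks at added fields and changed types on flat, Avro-style record schemas; it is not the Schema Registry's actual implementation, and the tweet schemas are hypothetical.

```python
def is_backward_compatible(old_schema, new_schema):
    """Simplified check: can a reader using new_schema decode data written with old_schema?

    Covers only flat record schemas. Real Avro schema resolution (type
    promotion, aliases, unions, nested records) is not modeled here.
    """
    old_fields = {f["name"]: f for f in old_schema["fields"]}
    for field in new_schema["fields"]:
        if field["name"] not in old_fields:
            # Old records lack this field, so the reader needs a default to fill in.
            if "default" not in field:
                return False
        elif field["type"] != old_fields[field["name"]]["type"]:
            # Conservative: treat any type change as incompatible.
            return False
    return True


# Hypothetical schemas for a "tweets" topic.
v1 = {"type": "record", "name": "Tweet", "fields": [
    {"name": "id", "type": "long"},
    {"name": "text", "type": "string"},
]}
v2 = {"type": "record", "name": "Tweet", "fields": v1["fields"] + [
    {"name": "lang", "type": "string", "default": "en"},
]}
v3 = {"type": "record", "name": "Tweet", "fields": v1["fields"] + [
    {"name": "sentiment", "type": "double"},  # no default
]}

print(is_backward_compatible(v1, v2))
print(is_backward_compatible(v1, v3))
```

Under these assumptions, adding `lang` with a default is accepted, while adding `sentiment` without one is rejected because existing records could no longer be read.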

  • 19Confidential

    How do I build stream processing apps?

  • 20Confidential

    Architecture of Kafka Streams, a Part of Apache Kafka™

    Key Benefits
        Available as high-level DSL and low-level API, delivering maximum flexibility for application design
        No additional cluster required
        Easy to run as a service
        Security and permissions fully integrated from Kafka

    Example Use Cases
        Microservices
        Continuous queries
        Continuous transformations
        Event-triggered processes

    [Diagram labels: Producer, Kafka Connect, Kafka Cluster with Topics, Kafka Streams, Consumers, Kafka Connect]

  • 22Confidential

    Kafka Streams simplifies your architecture, decouples your teams

    Before: Undue complexity, heavy footprint, many technologies, split ownership with conflicting priorities
        1. Capture business events in Kafka
        2. Must process events with separate cluster (e.g. Spark)
        3. Must share latest results through separate systems (e.g. MySQL)
        4. Other apps access latest results by querying these DBs

    With Kafka Streams: simplified, app-centric architecture, puts app owners in control
        1. Capture business events in Kafka
        2. Process events with standard Java apps that use Kafka Streams
        3. Now other apps can directly query the latest results

    [Diagram labels: App, Your Job, Your App, Kafka Streams]
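The "process events in the app, then query the latest results directly" pattern can be illustrated with a toy example. Kafka Streams itself is a Java library; the Python class below is only a language-neutral stand-in with hypothetical names, showing a processor that keeps its results in local state that other code can read without a separate database.

```python
from collections import defaultdict


class RunningCount:
    """Toy stand-in for a stream processor with a local, queryable state store."""

    def __init__(self):
        self._counts = defaultdict(int)

    def process(self, key, value):
        """Consume one event and return the updated (key, count) result record."""
        self._counts[key] += 1
        return key, self._counts[key]

    def query(self, key):
        """Read the latest result for a key straight from local state."""
        return self._counts.get(key, 0)


# Hypothetical business events keyed by user.
events = [("alice", "login"), ("bob", "login"), ("alice", "purchase")]

processor = RunningCount()
changelog = [processor.process(key, value) for key, value in events]

print(changelog)                 # every update, in arrival order
print(processor.query("alice"))  # latest count for one key
```

In a real deployment the events would arrive from a Kafka topic and the state would be fault tolerant; neither is modeled in this sketch.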

  • 26Confidential

    How do I manage and monitor my streaming platform at scale?

  • 27Confidential

    Confluent Control Center: End-to-end Monitoring

    See exactly where your messages are going in your Kafka cluster

  • 28Confidential

    Confluent Control Center: Connector Management

  • 29Confidential

    Control Center: Multi-Datacenter Management & Replication

    Manage multi-cluster deployments

    Centralized configuration & monitoring
    Replicate clusters or selected topics
    Replication of topic configuration
    Configurable topic re-names

    The Kafka Advantage

    Reliable, Highly available, Scalable, Cloud Ready

  • 30Confidential

    Confluent Control Center: Alerting

    Alerts

    Configure alerts on incomplete data delivery, high latency, Kafka connector status, and more

    Manage alerts for different users and applications from a web UI

    User authentication

    Control access to Confluent Control Center

    Integrates with existing enterprise authentication systems

  • 34Confidential

    Demo

  • 35Confidential

    Demo Scenario: Streaming Data Pipeline

    Twitter feed with sentiment data

    Twitter Source connector configured to publish data to Kafka topic

    Kafka Streams application augments Twitter records with sentiment analysis

    K-Streams output saved to Couchbase

    Couchbase Source Connector configured to pull data from Couchbase bucket back to Kafka topic

    2nd stage Kafka Streams app saves data to another Couchbase bucket and then on to Elasticsearch
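The enrichment step of the scenario can be sketched in a few lines. This is not the demo's code: the word lists, record fields, and scoring rule are hypothetical, and a real pipeline would read from and write to Kafka topics rather than Python lists.

```python
# Hypothetical word lists; a real pipeline would use a proper sentiment model.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "hate"}


def add_sentiment(tweet):
    """Return a copy of the tweet record with a 'sentiment' score added."""
    words = [w.strip(".,!?").lower() for w in tweet["text"].split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return {**tweet, "sentiment": score}


# Stand-in for records arriving from the Twitter source topic.
tweets = [
    {"id": 1, "text": "Love how fast Kafka is!"},
    {"id": 2, "text": "My pipeline is broken again."},
]

# Stand-in for the enriched records that would be written on to Couchbase.
enriched = [add_sentiment(t) for t in tweets]
for record in enriched:
    print(record)
```

Each input record is left untouched and a new, augmented record is emitted, which mirrors how the first-stage output is saved as a separate stream of documents.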

  • 36Confidential

    Couchbase Connect Demonstration

    [Diagram: Kafka Connect, Apache Kafka Brokers, and K-Streams app(s), with the data flow steps numbered 1 through 8]

  • 38Confidential

    Thank You
