You will learn how to
• Realize the value of streaming data ingest with Kafka
• Turn databases into live feeds for streaming ingest and processing
• Accelerate data delivery to enable real-time analytics
• Reduce skill and training requirements for data ingest
About Confluent
• Founded by the creators of Apache Kafka
• Founded September 2014
• Technology developed while at LinkedIn
• 73% of active Kafka committers
Leadership
• Jay Kreps, CEO
• Neha Narkhede, CTO, VP Engineering
• Cheryl Dalrymple, CFO
• Luanne Dauber, CMO
• Todd Barnett, VP WW Sales
• Jabari Norton, VP Business Dev
What does Kafka do?
[Diagram: producers write to and consumers read from Kafka topics; Kafka Connect links topics to external systems — your interfaces to the world, connected to your systems in real time]
Before: Many Ad Hoc Pipelines
[Diagram: point-to-point pipelines connect sources (user tracking, operational logs, operational metrics, Espresso, Cassandra, Oracle) directly to consumers (search, security, fraud detection applications, Hadoop, monitoring, data warehouse)]
After: Stream Data Platform with Kafka
[Diagram: the same sources and consumers connect through a single Kafka hub, which is distributed and fault tolerant, stores messages, and processes streams]
People Using Kafka Today
Financial Services • Entertainment & Media • Consumer Tech • Travel & Leisure • Enterprise Tech • Telecom • Retail
Common Kafka Use Cases

Data transport and integration
• Log data
• Database changes
• Sensors and device data
• Monitoring streams
• Call data records
• Stock ticker data

Real-time stream processing
• Monitoring
• Asynchronous applications
• Fraud and security
Data Integration Anti-Patterns
1. Ad-hoc pipelines
2. Extreme processing
3. Loss of metadata
The result: tight coupling between systems and lost agility.
Because at the heart of EVERY system… there is a LOG, and Kafka is a scalable and reliable system to manage LOGs.
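The log at the heart of Kafka can be sketched as an append-only sequence that producers append to and consumers read from by offset. A minimal illustration of the idea (not Kafka's actual implementation):

```python
class Log:
    """A minimal append-only log: records are appended at the tail and
    read back by offset, the way Kafka consumers track their position."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Read up to max_records starting at the given offset."""
        return self._records[offset:offset + max_records]


log = Log()
for event in ["user-signup", "page-view", "purchase"]:
    log.append(event)

# Two independent consumers can read the same log at their own pace.
print(log.read(0))   # ['user-signup', 'page-view', 'purchase']
print(log.read(2))   # ['purchase']
```

Because the log itself never changes once written, many consumers can share it without coordinating with the producer — the property the "one publisher, multiple consumers" slides below rely on.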
Why is Kafka such a great fit?
• Database data is available for any application
• No impact on production
• Database TABLES turned into a STREAM of events
• Ready for the next challenge? Stream processing applications
What’s next?
Confluent Platform with Attunity Connectivity
[Diagram: the Confluent Platform — Apache Kafka core, connectors, clients and developer tools, Kafka Streams, and Control Center (Confluent Enterprise) — sits between data sources (database changes, mobile devices, IoT, logs, website events, ERP, CRM, RDBMS, Hadoop, data warehouse) and real-time applications (alerting, monitoring, real-time analytics, custom applications, transformations), backed by Confluent support, services, and consulting]
Confluent Platform: It’s Kafka ++

Feature | Benefit | Apache Kafka | Confluent Platform 3.0 | Confluent Enterprise 3.0
Apache Kafka | High-throughput, low-latency, highly available, secure distributed message system | ✓ | ✓ | ✓
Kafka Connect | Advanced framework for connecting external sources and destinations into Kafka | ✓ | ✓ | ✓
Java Client | Provides easy integration into Java applications | ✓ | ✓ | ✓
Kafka Streams | Simple library that enables streaming application development within the Kafka framework | ✓ | ✓ | ✓
Additional Clients | Supports non-Java clients: C, C++, Python, etc. | | ✓ | ✓
REST Proxy | Provides universal access to Kafka from any network-connected device via HTTP | | ✓ | ✓
Schema Registry | Central registry for the format of Kafka data – guarantees all data is always consumable | | ✓ | ✓
Pre-Built Connectors | HDFS, JDBC, and other connectors fully certified and fully supported by Confluent | | ✓ | ✓
Confluent Control Center | Includes connector management and stream monitoring; command center provides advanced functionality and control | | | ✓
Support | | Community (free) | Community (free) | 24x7x365 (subscription)
Confluent Control Center
• Configures Kafka Connect data pipelines
• Monitors all pipelines end to end
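A Kafka Connect pipeline of the kind Control Center manages is defined by a connector configuration. A hypothetical JDBC source example — connector name, connection URL, and table/column names are placeholders, not values from this deck:

```json
{
  "name": "orders-jdbc-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://db-host:5432/sales",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "table.whitelist": "orders",
    "topic.prefix": "db-"
  }
}
```

With this configuration, new rows in the `orders` table (detected via the incrementing `id` column) are published to the `db-orders` topic.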
About Attunity

Overview
• Global operations, US headquarters
• 2,000 customers in 65 countries
• NASDAQ traded, fast growing
• Global footprint

Data Integration and Big Data Management
1. Accelerate data delivery and availability
2. Automate data readiness for analytics
3. Optimize data management with intelligence

Attunity Product Suite
• Attunity Replicate – universal data availability: move data to any platform
• Attunity Compose – data warehouse automation: automate ETL/EDW
• Attunity Visibility – data usage profiling & analytics: optimize performance and cost
[Diagram: the suite runs on premises and in the cloud across Hadoop, files, RDBMS, EDW, SAP, and mainframe]
Attunity Replicate for Kafka
Stream your databases to Kafka with Attunity Replicate:
• Easily – a configurable and automated solution; with a few clicks you can turn databases into live feeds for Kafka
• Continuously – capture and stream data changes efficiently, in real time, and with low impact
• Heterogeneously – using the same platform for many source database systems (Oracle, SQL Server, DB2, mainframe, and many more)
Attunity Replicate Architecture
[Diagram: Replicate captures data from on-prem and cloud sources (Hadoop, files, RDBMS, data warehouse, mainframe) via batch transfer and incremental CDC, applies filtering and transformation in memory or through a file channel, and delivers to on-prem and cloud targets (Hadoop, files, RDBMS, data warehouse, Kafka), with an optional persistent store]
Kafka and Real-Time Streaming

Demand
• Easy ingest and CDC
• Real-time processing
• Real-time monitoring
• Real-time Hadoop
• Scalable to 1000's of applications
• One publisher – multiple consumers

Attunity Replicate
• Direct integration using Kafka APIs
• In-memory optimized data streaming
• Support for multi-topic and multi-partitioned data publication
• Full load and CDC
• Integrated management and monitoring via GUI
Attunity Replicate for Kafka – Architecture
[Diagram: Replicate performs a bulk load of existing data, then reads transaction logs for CDC; in-memory optimized metadata management and data transport streams messages (MSG 1, 2, …, n) to the Kafka message brokers. Topics are split into partitions spread across brokers — e.g. Broker 1 holds T1/P0, T2/P1, T3/P0 while Broker 2 holds T1/P1, T2/P0 — and each partition is an ordered sequence of messages (M0, M1, M2, …)]
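The layout above — each topic split into partitions across brokers, each partition an ordered message sequence — follows from key-based partitioning: a message's key is hashed and taken modulo the partition count. A simplified sketch (Kafka's Java client actually uses murmur2; the hash below is only illustrative):

```python
def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index. Kafka's default partitioner
    uses a murmur2 hash of the key bytes; a simple deterministic hash
    stands in here for illustration."""
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0x7FFFFFFF
    return h % num_partitions


# Messages with the same key always land in the same partition,
# so ordering is preserved per key even across many brokers.
keys = ["order-1", "order-2", "order-1"]
parts = [partition_for(k, 2) for k in keys]
assert parts[0] == parts[2]
```

This is why Replicate can publish one table's changes to one key (or one topic/partition) and still guarantee that changes to a given row are consumed in order.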
Message structure:

{
  "table": "table-name",
  "schema": "schema-name",
  "op": "operation-type",
  "ts": "change-timestamp",
  "bu_data": [{"col1": "val1"}, {"col2": "val2"}, …, {"colN": "valN"}],
  "data": [{"col1": "val1"}, {"col2": "val2"}, …, {"colN": "valN"}]
}
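A consumer of these change messages can replay them into a materialized view of the source table. A sketch, assuming `op` values of INSERT/UPDATE/DELETE and the `data` column list shown above — the op names and key-extraction logic are assumptions for illustration, not the full Attunity envelope specification:

```python
def apply_change(table_state: dict, msg: dict, key_col: str) -> None:
    """Apply one change message to an in-memory view of the table,
    keyed by key_col. Op names INSERT/UPDATE/DELETE are assumed."""
    # Flatten the [{"col1": "v1"}, {"col2": "v2"}] column list into one row dict.
    row = {k: v for col in msg.get("data", []) for k, v in col.items()}
    key = row.get(key_col)
    op = msg["op"]
    if op in ("INSERT", "UPDATE"):
        table_state[key] = row
    elif op == "DELETE":
        table_state.pop(key, None)


state = {}
apply_change(state, {"table": "orders", "schema": "sales", "op": "INSERT",
                     "ts": "t0", "data": [{"id": "1"}, {"amount": "10"}]},
             key_col="id")
apply_change(state, {"table": "orders", "schema": "sales", "op": "UPDATE",
                     "ts": "t1", "data": [{"id": "1"}, {"amount": "12"}]},
             key_col="id")
# state now holds the latest version of row id=1.
```

Replaying the full stream from offset 0 rebuilds the table; this is the sense in which database TABLES become a STREAM of events.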
Easily Create and Manage Kafka Endpoints
Eliminate manual coding:
• Drag-and-drop interface for all sources and targets – no command line required
• Monitor and control data streams through a web console
• Bulk load or CDC
• Multi-topic and multi-partitioned data publication
[Screenshot: the Attunity Replicate web console replaces command-line configuration]
Zero-Footprint Architecture
Lower impact on IT:
• No software agents on sources and targets for mainstream databases
• Replicate data from 100's of source systems with easy configuration
• No software upgrades required at each database source or target
• Log-based capture with source-specific optimization
[Diagram: agentless replication from Hadoop, files, RDBMS, EDW, and mainframe sources to Hadoop, files, RDBMS, EDW, and Kafka targets]
Heterogeneous – Broad Support for Sources and Targets

Sources
• RDBMS: Oracle, SQL Server, DB2 LUW, DB2 iSeries, DB2 z/OS, MySQL, Sybase ASE, Informix
• Data Warehouse: Exadata, Teradata, Netezza, Vertica, Actian Vector, Actian Matrix
• Hadoop: Hortonworks, Cloudera, MapR, Pivotal
• Legacy: IMS/DB, SQL/MP, Enscribe, RMS, VSAM
• Cloud: AWS RDS, Salesforce

Targets
• RDBMS: Oracle, SQL Server, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix
• Data Warehouse: Exadata, Teradata, Netezza, Vertica, Pivotal DB (Greenplum), Pivotal HAWQ, Actian Vector, Actian Matrix, Sybase IQ
• Hadoop: Hortonworks, Cloudera, MapR, Pivotal
• NoSQL: MongoDB
• Cloud: AWS RDS/Redshift/EC2, Google Cloud SQL, Google Cloud Dataproc, Azure SQL Data Warehouse, Azure SQL Database
• Message Broker: Kafka