Streaming Data Integration - For Women in Big Data Meetup

Post on 16-Apr-2017

1Confidential

Streaming Data Integration with Apache Kafka

About Gwen

Gwen Shapira – System Architect @Confluent

PMC @ Apache Kafka

Moving data around since 2000

Previously:

• Software Engineer @ Cloudera

• Oracle Database Consultant

Find me:

• gwen@confluent.io

• @gwenshap

The Plan

1. What is Data Integration About?
2. How things changed?
3. What is difficult and important?
4. How we solve things in Kafka?

Data Integration

Making sure the right data gets to the right places

10 years ago…

 

• Informatica
• DataStage
• Manual Optimizations

5 years ago…

Today…

• Everything streaming
• Everything real-time
• Everything in-memory
• Everything containers
• Everything clouds

These Things Matter

• Reliability – Losing data is (usually) not OK.
    • Exactly Once vs At Least Once
• Timeliness
    • Push vs Pull
• High throughput, Varying throughput
    • Compression, Parallelism, Back Pressure
• Data Formats
    • Flexibility, Structure
• Security
• Error Handling
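
The "Exactly Once vs At Least Once" point can be illustrated with a small simulation. This is only a sketch and does not use Kafka: the function names, the message layout (an id plus a numeric payload), and the "lost acknowledgement" set are all assumptions made for illustration. It shows that retrying after a lost acknowledgement delivers a message twice, and that a receiver which remembers message ids can still count each message once.

```python
def deliver_at_least_once(messages, ack_lost_for):
    """Simulate a sender that resends any message whose ack was lost.

    messages is a list of (message_id, payload) tuples (hypothetical layout).
    ack_lost_for is the set of message ids whose acknowledgement never arrived.
    """
    delivered = []
    for message_id, payload in messages:
        delivered.append((message_id, payload))
        if message_id in ack_lost_for:
            # The sender cannot tell a lost ack from a lost message, so it retries.
            delivered.append((message_id, payload))
    return delivered


def apply_idempotently(delivered):
    """Sum payloads, processing each message id only once."""
    seen = set()
    total = 0
    for message_id, payload in delivered:
        if message_id in seen:
            continue  # duplicate caused by a retry, skip it
        seen.add(message_id)
        total += payload
    return total


messages = [(1, 10), (2, 20), (3, 30)]
delivered = deliver_at_least_once(messages, ack_lost_for={2})

naive_total = sum(payload for _, payload in delivered)  # counts the duplicate
deduplicated_total = apply_idempotently(delivered)      # counts each id once

print(naive_total, deduplicated_total)
```

With at-least-once delivery no data is lost, but the naive total is inflated by the duplicate. Deduplicating on a stable id is one common way to get the same result as exactly-once processing.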

After: Stream Data Platform with Kafka

[Diagram. Kafka: Distributed, Fault Tolerant, Stores Messages, Processes Streams. Other labels: Search, Security, Fraud Detection, Application, User Tracking, Operational Logs, Operational Metrics, Espresso, Cassandra, Oracle, Hadoop, Log Search, Monitoring, Data Warehouse]

Introducing Kafka Connect

Large-scale streaming data import/export for Kafka

Overview of Connect

1. Install a cluster of Workers
2. Download / Build and install Connector Plugins
3. Use REST API to Start and Configure Connectors
4. Connectors start Tasks. Tasks run inside Workers and copy data.
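
Step 3 can be sketched as a request to the Connect REST API, which creates a connector from a JSON body containing `name` and `config` sent as a POST to `/connectors`. The values below are assumptions for illustration, not taken from the slides: a worker reachable at `localhost:8083`, the file source connector that ships with Kafka, and the hypothetical connector name, file path, and topic. The sketch only builds the request; the line that would send it is left commented out so the example runs without a cluster.

```python
import json
import urllib.request

# Assumption: a Connect worker is listening on the default REST port.
CONNECT_URL = "http://localhost:8083"


def build_create_connector_request(name, config, base_url=CONNECT_URL):
    """Build (but do not send) a POST /connectors request."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical connector settings: read lines from a file into a topic.
file_source_config = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/input.txt",
    "topic": "connect-demo",
}

request = build_create_connector_request("demo-file-source", file_source_config)
print(request.get_method(), request.full_url)

# To actually create the connector against a running worker:
# urllib.request.urlopen(request)
```

Once the connector is created, the workers start its tasks (step 4), and `tasks.max` caps how many tasks the connector may split its work into.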

Questions?