Apache Flink and More @ MesosCon Asia 2017

Date post: 21-Jan-2018
Upload: till-rohrmann
Transcript
Page 1: Apache Flink and More @ MesosCon Asia 2017

Till Rohrmann @stsffap

Apache Flink® and More

Jörg Schad @joerg_schad

Page 2: Apache Flink and More @ MesosCon Asia 2017
Page 3: Apache Flink and More @ MesosCon Asia 2017
Page 4: Apache Flink and More @ MesosCon Asia 2017

MapReduce is crunching Data

Page 5: Apache Flink and More @ MesosCon Asia 2017

We need to turn faster!

Page 6: Apache Flink and More @ MesosCon Asia 2017

SMACK Stack

▪ EVENTS: ubiquitous data streams from connected devices (sensors, devices, clients)
▪ INGEST: Apache Kafka. Ingest millions of events per second
▪ STORE: Apache Cassandra. Distributed & highly scalable database
▪ ANALYZE: Apache Spark. Real-time and batch data processing
▪ ACT: Akka. Visualize data and build data-driven applications

All running on Mesos / DC/OS

Page 7: Apache Flink and More @ MesosCon Asia 2017

Evolution of Data Analytics

Batch, Micro-Batch, Event Processing

Days Hours Minutes Seconds Microseconds

Solves problems using predictive and prescriptive analytics

Reports what has happened using descriptive analytics

Predictive User Interface

Real-time Pricing and Routing

Real-time Advertising

Billing, Chargeback

Product recommendations

Page 8: Apache Flink and More @ MesosCon Asia 2017


Page 9: Apache Flink and More @ MesosCon Asia 2017

Original creators of Apache Flink®

Providers of the dA Platform, a supported Flink distribution

Page 10: Apache Flink and More @ MesosCon Asia 2017

Apache Flink In a Nutshell

Event-driven applications (event sourcing, CQRS)

Stateful, event-driven, event-time-aware processing

Batch Processing (data sets)

Stream Processing / Analytics (data streams, windows, …)

Page 11: Apache Flink and More @ MesosCon Asia 2017

Apache Flink Stack

▪ Libraries
▪ DataStream API (stream processing) and DataSet API (batch processing)
▪ Runtime: distributed streaming data flow

Streaming and batch as first-class citizens.

Page 12: Apache Flink and More @ MesosCon Asia 2017

Programming Model

[Diagram: a dataflow graph in which sources feed a chain of computations (transformations), each holding its own state, which write to sinks.]

Page 13: Apache Flink and More @ MesosCon Asia 2017

API & Execution

// Source
DataStream<String> lines = env.addSource(new FlinkKafkaConsumer010(…));

// Transformation
DataStream<Event> events = lines.map(line -> parse(line));

// Transformation: key by id, then aggregate over 5-second windows
DataStream<Statistic> stats = events
    .keyBy("id")
    .timeWindow(Time.seconds(5))
    .apply(new MyAggregationFunction());

// Sink
stats.addSink(new BucketingSink(path));

Streaming dataflow: Source, map(), keyBy()/window()/apply(), Sink
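
What the keyBy / 5-second window / aggregate step on this slide computes can be sketched in plain Java, without Flink. This is only an illustration under stated assumptions: the names `TumblingWindowSketch`, `Event` and `sumPerKeyAndWindow` are hypothetical, timestamps are non-negative milliseconds, and all events are available up front rather than arriving as a stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TumblingWindowSketch {

    /** Hypothetical event: a key, an event-time timestamp (ms) and a value. */
    static final class Event {
        final String id;
        final long timestampMs;
        final long value;

        Event(String id, long timestampMs, long value) {
            this.id = id;
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    /**
     * Groups events by id (the keyBy step), assigns each event to the
     * tumbling window containing its timestamp, and sums the values per
     * (id, window). Result keys have the form "id@windowStartMs".
     * Assumes non-negative timestamps.
     */
    static Map<String, Long> sumPerKeyAndWindow(List<Event> events, long windowSizeMs) {
        Map<String, Long> sums = new TreeMap<>();
        for (Event e : events) {
            long windowStart = e.timestampMs - (e.timestampMs % windowSizeMs);
            sums.merge(e.id + "@" + windowStart, e.value, Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("a", 1_000, 1),
            new Event("a", 4_999, 2),
            new Event("b", 2_000, 10),
            new Event("a", 5_000, 5));
        // Window size of 5 seconds, as on the slide.
        System.out.println(sumPerKeyAndWindow(events, 5_000));
    }
}
```

In Flink the same grouping happens incrementally and in parallel, with window state kept per key; the sketch only shows which events end up in which result.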

Page 14: Apache Flink and More @ MesosCon Asia 2017

Distributed Runtime


Page 15: Apache Flink and More @ MesosCon Asia 2017

Levels of Abstraction

▪ Process Function (events, state, time): low-level (stateful stream processing)
▪ DataStream API (streams, windows): stream processing & analytics
▪ Table API (dynamic tables): declarative DSL
▪ Stream SQL: high-level language

Page 16: Apache Flink and More @ MesosCon Asia 2017

What Is Flink Good For?


Page 17: Apache Flink and More @ MesosCon Asia 2017

Detecting fraud in real time

As fraudsters get better, models need to be updated without downtime

Live 24/7 service

Credit card transactions

Notifications and alerts

Evolving fraud models built by data scientists

Page 18: Apache Flink and More @ MesosCon Asia 2017

▪ Athena X
▪ SQL to define metrics
▪ Thresholds and actions to trigger
▪ Blends analytics and actions

[Diagram: streams from Hadoop, Kafka, etc. are processed with SQL, thresholds and actions, producing analytics, alerts and derived streams.]

Page 19: Apache Flink and More @ MesosCon Asia 2017

▪ Route events to Kafka, ES, Hive
▪ Complex interaction sessions rules
▪ Mix of stateless / small state / large state
▪ Stream Processing as a Service
  • Launching, monitoring, scaling, updating
  • DSL to define jobs

Page 20: Apache Flink and More @ MesosCon Asia 2017

▪ Blink based on Flink
▪ A core system in Alibaba Search
  • Machine learning, search, recommendations
  • A/B testing of search algorithms
  • Online feature updates to boost conversion rate
▪ Alibaba is a major contributor to Flink
▪ Contributing many changes back to open source

Page 21: Apache Flink and More @ MesosCon Asia 2017

Complete social network implemented using event sourcing and CQRS (Command Query Responsibility Segregation)
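
As a rough illustration of the pattern named on this slide (not of the application itself), event sourcing keeps an append-only log of events as the source of truth, and CQRS separates the write path (commands) from read models derived from that log. The sketch below is plain Java; `EventSourcingSketch`, `Followed`, `CommandSide` and `followerCounts` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class EventSourcingSketch {

    /** Hypothetical immutable event: one user started following another. */
    static final class Followed {
        final String follower;
        final String followee;

        Followed(String follower, String followee) {
            this.follower = follower;
            this.followee = followee;
        }
    }

    /** Command side: validates commands and appends events to the log. */
    static final class CommandSide {
        private final List<Followed> log = new ArrayList<>();

        void follow(String follower, String followee) {
            if (follower.equals(followee)) {
                throw new IllegalArgumentException("users cannot follow themselves");
            }
            log.add(new Followed(follower, followee));
        }

        /** Read-only view of the append-only event log. */
        List<Followed> events() {
            return Collections.unmodifiableList(log);
        }
    }

    /** Query side: a read model rebuilt purely by replaying the events. */
    static Map<String, Integer> followerCounts(List<Followed> events) {
        Map<String, Set<String>> followers = new HashMap<>();
        for (Followed e : events) {
            followers.computeIfAbsent(e.followee, k -> new HashSet<>()).add(e.follower);
        }
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : followers.entrySet()) {
            counts.put(entry.getKey(), entry.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        CommandSide commands = new CommandSide();
        commands.follow("alice", "carol");
        commands.follow("bob", "carol");
        System.out.println(followerCounts(commands.events()));
    }
}
```

Because the read model is a pure function of the log, it can be dropped and rebuilt at any time, which is what makes a stateful stream processor a natural fit for the query side.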


Page 22: Apache Flink and More @ MesosCon Asia 2017

Apache Flink & Apache Mesos


Page 23: Apache Flink and More @ MesosCon Asia 2017

Why Apache Mesos?

▪ Mesos offers full functionality to implement fault tolerant and elastic distributed applications

▪ 30% of survey respondents were running Flink on Mesos (prior to proper Mesos support, September 2016)


Page 24: Apache Flink and More @ MesosCon Asia 2017

Flink’s Mesos Integration

▪ Kudos to Eron Wright (EronWright) for this work

[Diagram: the Apache Flink framework on Mesos. The Mesos App Master hosts the Flink MesosResourceManager and the JobManager; each TaskManager runs inside a Mesos task. Edge labels: allocate resources (with the Mesos Master), launch Mesos tasks, register, execute job.]

Page 25: Apache Flink and More @ MesosCon Asia 2017

Resource Manager Components

Connection Monitor
▪ Monitors connection to Mesos

Launch Coordinator
▪ Resource offer processing and task scheduling
▪ Gathers offers and matches them to tasks using Fenzo

Task Monitor
▪ Monitors Mesos tasks
▪ Triggers reconciliation
▪ Makes sure tasks are properly killed

Reconciliation Coordinator
▪ Reconciles the view of tasks between ResourceManager and Mesos Master

Page 26: Apache Flink and More @ MesosCon Asia 2017

Component Interplay

[Diagram: the ResourceManager components (Connection Monitor, Launch Coordinator, Task Monitor, Reconciliation Coordinator) interacting with the Mesos Master and the Mesos tasks. Edge labels: resource offers, launch tasks, start TaskManagers, monitor tasks, status messages, trigger reconciliation, reconcile tasks, recover tasks, kill task.]
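
The reconciliation idea can be reduced to comparing two views of the running tasks. The plain-Java sketch below is a simplification under stated assumptions: in Mesos the exchange happens through status messages, whereas here both views are given as sets of task IDs, and `ReconciliationSketch`, `Plan` and `reconcile` are hypothetical names, not Flink or Mesos APIs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class ReconciliationSketch {

    /** Result of comparing the two views of the running tasks. */
    static final class Plan {
        /** Tasks the ResourceManager expects but the master does not report. */
        final Set<String> toRecover = new TreeSet<>();
        /** Tasks the master reports but the ResourceManager does not know. */
        final Set<String> toKill = new TreeSet<>();
    }

    /** Computes which tasks to recover and which to kill (two set differences). */
    static Plan reconcile(Set<String> expectedByResourceManager, Set<String> reportedByMaster) {
        Plan plan = new Plan();
        for (String taskId : expectedByResourceManager) {
            if (!reportedByMaster.contains(taskId)) {
                plan.toRecover.add(taskId);
            }
        }
        for (String taskId : reportedByMaster) {
            if (!expectedByResourceManager.contains(taskId)) {
                plan.toKill.add(taskId);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Set<String> expected = new HashSet<>(Arrays.asList("tm-1", "tm-2"));
        Set<String> reported = new HashSet<>(Arrays.asList("tm-2", "tm-3"));
        Plan plan = reconcile(expected, reported);
        System.out.println("recover=" + plan.toRecover + " kill=" + plan.toKill);
    }
}
```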

Page 27: Apache Flink and More @ MesosCon Asia 2017

Fenzo

▪ Developed by Netflix
▪ Generic task scheduler for frameworks
▪ Matching between tasks and resource offers
  • Pluggable fitness evaluator

[Diagram: Mesos sends periodic resource offers to the Launch Coordinator, which tells Fenzo the offered resources and the tasks. Fenzo returns resource-task matchings, which become the tasks to launch.]
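
Matching tasks to resource offers with a pluggable fitness evaluator can be sketched as follows. This is plain Java and deliberately does not use Fenzo's API; `OfferMatchingSketch`, `Offer`, `Task`, `FitnessEvaluator` and `CPU_BIN_PACKING` are hypothetical names, and only CPU and memory are considered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OfferMatchingSketch {

    /** Hypothetical resource offer; remaining resources shrink as tasks are placed. */
    static final class Offer {
        final String host;
        double cpus;
        double memMb;

        Offer(String host, double cpus, double memMb) {
            this.host = host;
            this.cpus = cpus;
            this.memMb = memMb;
        }
    }

    /** Hypothetical task request; cpus is assumed to be positive. */
    static final class Task {
        final String id;
        final double cpus;
        final double memMb;

        Task(String id, double cpus, double memMb) {
            this.id = id;
            this.cpus = cpus;
            this.memMb = memMb;
        }
    }

    /** Pluggable fitness: a higher score means a better placement. */
    interface FitnessEvaluator {
        double fitness(Task task, Offer offer);
    }

    /** Bin packing on CPU: score is the fraction of the offer's CPUs the task would use. */
    static final FitnessEvaluator CPU_BIN_PACKING = (task, offer) -> task.cpus / offer.cpus;

    /** Greedily assigns each task to the fitting offer with the highest fitness. */
    static Map<String, String> match(List<Task> tasks, List<Offer> offers, FitnessEvaluator evaluator) {
        Map<String, String> assignments = new LinkedHashMap<>();
        for (Task task : tasks) {
            Offer best = null;
            double bestFitness = -1;
            for (Offer offer : offers) {
                if (offer.cpus < task.cpus || offer.memMb < task.memMb) {
                    continue; // offer cannot hold the task
                }
                double f = evaluator.fitness(task, offer);
                if (f > bestFitness) {
                    best = offer;
                    bestFitness = f;
                }
            }
            if (best != null) {
                best.cpus -= task.cpus; // note: offers are mutated in place
                best.memMb -= task.memMb;
                assignments.put(task.id, best.host);
            }
        }
        return assignments; // tasks without a fitting offer are left out
    }

    public static void main(String[] args) {
        List<Offer> offers = new ArrayList<>();
        offers.add(new Offer("hostA", 4, 8192));
        offers.add(new Offer("hostB", 2, 4096));
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("tm-1", 2, 2048));
        tasks.add(new Task("tm-2", 2, 2048));
        tasks.add(new Task("tm-3", 4, 4096));
        System.out.println(match(tasks, offers, CPU_BIN_PACKING));
    }
}
```

Swapping `CPU_BIN_PACKING` for another evaluator (for example one that prefers the emptiest host) changes the placement policy without touching the matching loop, which is the point of making the fitness function pluggable.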

Page 28: Apache Flink and More @ MesosCon Asia 2017

Datacenter

Page 29: Apache Flink and More @ MesosCon Asia 2017

NAIVE APPROACH

Typical Datacenter: siloed, over-provisioned servers, low utilization

Industry average: 12-15% utilization

Silos: MySQL, microservice, Cassandra, Flink, Kafka

Page 30: Apache Flink and More @ MesosCon Asia 2017

© 2017 Mesosphere, Inc. All Rights Reserved.

Page 31: Apache Flink and More @ MesosCon Asia 2017

Apache Mesos

Mesos: automated schedulers, workload multiplexing onto the same machines

Page 32: Apache Flink and More @ MesosCon Asia 2017

Why Mesos?

● 2-level scheduling
● Fault-tolerant, battle-tested
● Scalable to 10,000+ nodes
● Created by Mesosphere founder @ UC Berkeley; used in production by 100+ web-scale companies [1]

[1] http://mesos.apache.org/documentation/latest/powered-by-mesos/
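
The "2-level scheduling" bullet can be illustrated with a toy model: at the first level the master decides which framework is offered free resources, and at the second level each framework's own scheduler decides how much of an offer to use. The sketch is plain Java with hypothetical names (`TwoLevelSchedulingSketch`, `FrameworkScheduler`, `offerRound`); it offers CPUs to frameworks in list order, which is a simplification and not how the Mesos allocator picks frameworks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TwoLevelSchedulingSketch {

    /** Level 2: a framework's scheduler decides what to do with an offer. */
    interface FrameworkScheduler {
        /** Returns how many of the offered CPUs to use; 0 declines the offer. */
        double onOffer(double offeredCpus);
    }

    /**
     * Level 1: the master offers the currently free CPUs to each framework
     * in turn and hands whatever is left to the next one.
     * Returns the CPUs used per framework, in list order.
     */
    static double[] offerRound(double freeCpus, List<FrameworkScheduler> frameworks) {
        double[] used = new double[frameworks.size()];
        for (int i = 0; i < frameworks.size() && freeCpus > 0; i++) {
            double wanted = frameworks.get(i).onOffer(freeCpus);
            // A framework can never use more than it was offered.
            double accepted = Math.max(0, Math.min(wanted, freeCpus));
            used[i] = accepted;
            freeCpus -= accepted;
        }
        return used;
    }

    public static void main(String[] args) {
        // Hypothetical frameworks: one needs exactly 2 CPUs, one takes all it gets.
        FrameworkScheduler needsTwo = offered -> offered >= 2 ? 2.0 : 0.0;
        FrameworkScheduler greedy = offered -> offered;
        List<FrameworkScheduler> frameworks = new ArrayList<>();
        frameworks.add(needsTwo);
        frameworks.add(greedy);
        System.out.println(Arrays.toString(offerRound(8, frameworks)));
    }
}
```

The master never needs to understand a framework's placement logic; it only tracks what was accepted, which is what lets very different workloads share the same machines.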


Page 33: Apache Flink and More @ MesosCon Asia 2017
Page 34: Apache Flink and More @ MesosCon Asia 2017

DC/OS

[Diagram: the DC/OS stack, top to bottom]

▪ Modern app components: big data + analytics engines and microservices (in containers), covering streaming, batch, machine learning, analytics, functions & logic, search, time series, SQL / NoSQL databases
▪ Datacenter Operating System (DC/OS)
▪ Distributed systems kernel (Mesos)
▪ Any infrastructure (physical, virtual, cloud)

Page 35: Apache Flink and More @ MesosCon Asia 2017


DEMO

Page 36: Apache Flink and More @ MesosCon Asia 2017

Conclusion


Page 37: Apache Flink and More @ MesosCon Asia 2017

Conclusion

▪ Apache Flink runs on Mesos using Fenzo

▪ DC/OS offers an easy-to-use Flink package

▪ Contributions welcome!

DC/OS Office Hour June 29th

Page 38: Apache Flink and More @ MesosCon Asia 2017

Thank you! @stsffap

@joerg_schad @ApacheFlink @dataArtisans

@dcos

Page 39: Apache Flink and More @ MesosCon Asia 2017


Page 40: Apache Flink and More @ MesosCon Asia 2017

We are hiring! data-artisans.com/careers

