A look at Flink 1.2 and beyond
Stefan Richter (@stefanrrichter)
29.10.2016
Agenda
▪ Flink 1.2 feature overview & walkthrough
▪ Taking a closer look at two features:
  ▪ Queryable state
  ▪ Dynamic scaling
Feature Overview - Flink Release 1.2
Flink 1.1+ ongoing development
[Diagram: features grouped into four areas: Operations, Ecosystem, Application Features, Broader Audience]
▪ Session Windows
▪ (Stream) SQL
▪ Library enhancements
▪ Metric System
▪ Metrics & Visualization
▪ Dynamic Scaling
▪ Savepoint compatibility
▪ Checkpoints to savepoints
▪ Connectors in Flink
▪ Stream SQL Windows
▪ Large state maintenance
▪ Fine grained recovery
▪ Side in-/outputs
▪ Window DSL
▪ Security
▪ Mesos & others
▪ Dynamic Resource Management
▪ Authentication
▪ Queryable State
▪ Apache Bahir connectors
Flink 1.2 Improvements
[Diagram: the same feature map as on the previous slide, with the areas Operations, Ecosystem, Application Features, Broader Audience]
Security / Authentication - Flink 1.2
▪ Authorized data access
▪ Secured clusters with Kerberos-based authentication
  • Kafka, ZooKeeper, HDFS, YARN, HBase, …
▪ Encrypted traffic between Flink processes
  • RPC, Data Exchange, Web UI, …
  • "SSL for all connections"
▪ Prevent malicious users from hooking into Flink jobs
Largely contributed by [logo not preserved]
Cluster Management - Flink 1.1
▪ Standalone
▪ Flink on YARN
Cluster Management - Flink 1.2
▪ Standalone
▪ Flink on YARN
▪ Flink on Mesos
Mesos integration contributed by [logo not preserved]
Cluster Management - Beyond 1.2
Efforts to seamlessly interoperate with various cluster managers.
Generalized abstraction (FLIP-6).
Driven by [logos not preserved]
Cluster Management - Beyond (ct’d)
[Diagram 1: TaskManager, JobManager]
(1) Register
(2) Deploy Tasks
[Diagram 2: Dispatcher, ResourceManager, JobManager, TaskManager]
(0) Start JobManager
(1) Request slots
(2) Start TaskManager
(3) Register
(4) Deploy Tasks
Metrics
▪ Rates
▪ Latency (operator)
▪ Visualization in WebUI
Savepoint / Checkpoint Robustness
▪ Resume job from checkpoints
▪ Use older checkpoint on failed recovery
▪ Skip failed checkpoints
▪ Backwards compatible: savepoints taken with Flink 1.1 can be restored in Flink 1.2
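
The "use older checkpoint on failed recovery" behaviour can be illustrated with a small, self-contained Java sketch. This is not Flink's recovery code: the class `CheckpointFallback`, the method `restoreLatestUsable`, and the use of plain strings as checkpoint handles are assumptions made purely for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Illustrative only: walks completed checkpoints from newest to oldest. */
final class CheckpointFallback {

    /**
     * Returns the newest checkpoint for which tryRestore succeeds.
     * Checkpoints that fail to restore are skipped in favour of older ones.
     */
    static Optional<String> restoreLatestUsable(List<String> completedOldestFirst,
                                                Predicate<String> tryRestore) {
        for (int i = completedOldestFirst.size() - 1; i >= 0; i--) {
            String checkpoint = completedOldestFirst.get(i);
            if (tryRestore.test(checkpoint)) {
                return Optional.of(checkpoint);
            }
        }
        return Optional.empty(); // nothing usable: the job would start from scratch
    }
}
```

With checkpoints C1, C2, C3 and a restore attempt that fails only for C3, the sketch falls back to C2.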
Processing Function
[Diagram: API stack with Stream SQL, Streaming API, Processing Function, Window Operator, Timer Handling]
Problem: Implement custom windowing?
Interface ProcessingFunction:
void flatMap(I value, Context ctx, Collector<O> out) throws Exception;
void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception;
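
To make the interplay of the two callbacks concrete, here is a minimal, self-contained sketch of custom "windowing" built from timers. It deliberately does not use Flink: `Collector`, `Context`, `OnTimerContext` and `ProcessingFunction` below are stand-ins that only mirror the signatures on the slide, and `registerTimer`, `InactivityCounter` and the driver `run` are hypothetical names. To my knowledge the released Flink 1.2 API calls this interface `ProcessFunction` and its first method `processElement`, so treat the names here as those of the talk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Stand-ins that mirror the slide's signatures; these are NOT the Flink classes. */
final class TimerSketch {

    interface Collector<O> { void collect(O record); }

    /** Hypothetical minimal context: element timestamp plus timer registration. */
    interface Context {
        long timestamp();
        void registerTimer(long time);
    }

    interface OnTimerContext extends Context { }

    interface ProcessingFunction<I, O> {
        void flatMap(I value, Context ctx, Collector<O> out) throws Exception;
        void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception;
    }

    /** Counts elements and emits the count once no element arrived for gapMillis. */
    static final class InactivityCounter implements ProcessingFunction<String, String> {
        private final long gapMillis;
        private long count;
        private long lastSeen;

        InactivityCounter(long gapMillis) { this.gapMillis = gapMillis; }

        @Override
        public void flatMap(String value, Context ctx, Collector<String> out) {
            count++;
            lastSeen = ctx.timestamp();
            ctx.registerTimer(lastSeen + gapMillis); // (re)arm the inactivity timer
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
            // Timers of earlier elements are stale; only the latest one closes the session.
            if (timestamp == lastSeen + gapMillis) {
                out.collect("count=" + count);
                count = 0;
            }
        }
    }

    /** Tiny driver: feeds elements at the given timestamps and fires timers in time order. */
    static List<String> run(long gapMillis, long... timestamps) {
        List<String> output = new ArrayList<>();
        TreeSet<Long> timers = new TreeSet<>();
        InactivityCounter fn = new InactivityCounter(gapMillis);
        for (long ts : timestamps) {
            fireDue(fn, timers, ts, output);
            fn.flatMap("event", contextAt(ts, timers), output::add);
        }
        fireDue(fn, timers, Long.MAX_VALUE, output); // end of input: flush remaining timers
        return output;
    }

    private static void fireDue(InactivityCounter fn, TreeSet<Long> timers,
                                long now, List<String> output) {
        while (!timers.isEmpty() && timers.first() <= now) {
            long time = timers.pollFirst();
            fn.onTimer(time, contextAt(time, timers), output::add);
        }
    }

    private static OnTimerContext contextAt(long ts, TreeSet<Long> timers) {
        return new OnTimerContext() {
            @Override public long timestamp() { return ts; }
            @Override public void registerTimer(long time) { timers.add(time); }
        };
    }
}
```

Calling `TimerSketch.run(10, 0, 5, 30)` groups the elements at 0 and 5 into one session and the element at 30 into a second one. In a real job the state (`count`, `lastSeen`) would live in keyed state so that it is checkpointed; plain fields are used here only to keep the sketch short.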
Table API & Stream SQL
▪ Group-windows
  Example: table.groupBy('user').window(Session withGap 10.minutes on 'rowtime').select('uid', 'product.count')
▪ More SQL operations
  Example: EXISTS, VALUES, LIMIT
▪ More built-in scalar functions
  Example: CURRENT_DATE, INITCAP, NULLIF
▪ More datatypes & better integration
  Example: pojo.get('field'), pojo.flatten()
▪ User-defined scalar functions
  Example: table.select('uid', parseName('userJson'))
Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different files w.r.t. user logic
▪ Detached execution: first step towards programmatically controlled jobs
▪ Async IO operator: non-blocking queries to external systems
▪ Improved scalability, robustness + bugfixes
Queryable State - Flink 1.2
Queryable State - Motivation
▪ Realtime queries: periodically (every second) flush new aggregates to Redis
[Chart; axis label: Number of Keys]
▪ Where is the bottleneck? Writes to the key/value store take too long
Queryable State - Idea
"Stream processor as a database"
▪ Realtime queries
▪ Archive database: optional + only at end of windows
Queryable State - Performance
[Chart; axis label: Number of Keys]
Queryable State - Implementation
[Diagram: Query Client; Job Manager with ExecutionGraph and State Location Server; Task Managers, each with a StateRegistry, local state and a window() / sum() operator; arrows labelled deploy, status, register]
Query: /job/operation/state-name/key
(1) Get location of "key-partition" for "operator" of "job"
(2) Look up location
(3) Respond location
(4) Query state-name and key
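
The four-step lookup can be mimicked with a toy, in-memory model. This sketch has nothing to do with Flink's actual client or network stack: `StateRegistry` and `StateLocationServer` borrow their names from the diagram, while `locate`, `register`, `query`, and the hash-based mapping of keys to task managers are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the lookup flow on the slide; all names are illustrative, not Flink's API. */
final class QueryableStateSketch {

    /** Stand-in for a task manager's StateRegistry: state name -> (key -> value). */
    static final class StateRegistry {
        private final Map<String, Map<String, Long>> states = new HashMap<>();

        void register(String stateName, String key, long value) {
            states.computeIfAbsent(stateName, name -> new HashMap<>()).put(key, value);
        }

        Long query(String stateName, String key) {
            Map<String, Long> state = states.get(stateName);
            return state == null ? null : state.get(key);
        }
    }

    /** Stand-in for the job manager's State Location Server: key partition -> task manager. */
    static final class StateLocationServer {
        private final StateRegistry[] taskManagers;

        StateLocationServer(int numTaskManagers) {
            taskManagers = new StateRegistry[numTaskManagers];
            for (int i = 0; i < numTaskManagers; i++) {
                taskManagers[i] = new StateRegistry();
            }
        }

        /** Steps (2) and (3): look up and return the task manager that owns the key. */
        StateRegistry locate(String key) {
            return taskManagers[Math.floorMod(key.hashCode(), taskManagers.length)];
        }
    }

    /** Client side: resolve the location first, then query the local state there. */
    static Long query(StateLocationServer server, String stateName, String key) {
        StateRegistry location = server.locate(key); // steps (1) to (3)
        return location.query(stateName, key);       // step (4)
    }
}
```

The point of the two-step design is that the job manager only answers "where", while the value itself is served directly by the task manager that holds the key's partition.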
Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the operators that create/update it
▪ State is continuous (not mini batched)
▪ State is scalable (e.g., embedded RocksDB state backend)
Dynamic Scaling - Flink 1.2
Motivation - Changing Workloads
Motivation - Resource Adaption
[Charts: workload and resources over time]
Basic Idea
• Spread work across more workers to decrease the workload per worker
Scaling Stateless Jobs
[Diagram: Source, Mapper, Sink pipeline; scale up and scale down]
• Scale up: Deploy new tasks
• Scale down: Cancel running tasks
Scaling Stateful Jobs
• Problem 1: Which state to assign to new task?
• Problem 2: Read + filter whole state?
Non-keyed vs Keyed State
Keyed:
• State bound to an operator + key
• E.g. keyed UDF and window state
• "SELECT count(*) FROM t GROUP BY t.key"
Non-keyed:
• State bound only to operator
• E.g. source state
Repartitioning Non-keyed State
[Diagram: state granules #1 to #4 before and after redistribution]
Flink 1.1:
T snapshot()
void restore(T)
Flink 1.2:
List<T> snapshot()
void restore(List<T>)
Idea: break up state into finer granules that can be redistributed independently
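
A rough sketch of what "redistributed independently" can mean in practice: collect the sub-state lists of all old tasks and hand them out again to the new number of tasks. The class `ListStateRedistribution`, the method `redistribute`, and the round-robin policy are assumptions for illustration, not Flink's implementation. In the released 1.2 API this list-style contract appears, to my knowledge, as the `ListCheckpointed` interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative redistribution of list-style operator state; not Flink's implementation. */
final class ListStateRedistribution {

    /** Collects the sub-states of all old tasks and deals them round-robin to the new tasks. */
    static <T> List<List<T>> redistribute(List<List<T>> snapshots, int newParallelism) {
        List<List<T>> restored = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            restored.add(new ArrayList<>());
        }
        int next = 0;
        for (List<T> taskSnapshot : snapshots) {
            for (T subState : taskSnapshot) {
                restored.get(next % newParallelism).add(subState);
                next++;
            }
        }
        return restored;
    }

    public static void main(String[] args) {
        // Two source tasks; each snapshot is a list of "partitionId:offset" entries.
        List<List<String>> snapshots =
                Arrays.asList(Arrays.asList("1:42", "6:27"), Arrays.asList("3:10"));
        System.out.println("scale out to 3: " + redistribute(snapshots, 3));
        System.out.println("scale in to 1:  " + redistribute(snapshots, 1));
    }
}
```

Because every list element is a self-contained unit (for a Kafka source, one partition with its offset), any assignment of elements to tasks yields a valid restored state; with the Flink 1.1 contract of a single opaque `T` per task, no such split is possible.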
Example: Kafka Source Flink 1.1
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
Operator state is a black box. How to repartition?
Example: Kafka Source Flink 1.2
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
Return a list of sub-states which can be freely repartitioned.
Example: Kafka Source Flink 1.2 - Scale Out
[Diagram: the sub-states are spread across more source tasks]
partitionId: 1, offset: 42
partitionId: 6, offset: 27
partitionId: 3, offset: 10
Example: Kafka Source Flink 1.2 - Scale In
[Diagram: the sub-states are merged onto fewer source tasks]
partitionId: 1, offset: 42
partitionId: 6, offset: 27
partitionId: 3, offset: 10
Repartitioning Keyed State
▪ Split key space into key groups
▪ Every key falls into exactly one key group
▪ Assign key groups to tasks
[Diagram: key space divided into key groups #1 to #4; each key lies in one key group]
Repartitioning Keyed State (ct’d)
▪ Rescaling changes key group assignment
▪ Maximum parallelism defined by #key groups
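
The key-group scheme can be sketched in a few lines of plain Java. The class and method names below are hypothetical. The range formula `keyGroup * parallelism / maxParallelism` is, to my understanding, the one Flink uses to map key groups to tasks, and Flink additionally scrambles `hashCode()` with a murmur hash before taking the modulo; treat both as assumptions rather than a description of the exact implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of key-group based partitioning; names are illustrative, not Flink's API. */
final class KeyGroupSketch {

    /** Maps a key to exactly one of maxParallelism key groups. */
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    /** Maps a key group to the parallel task that owns it; each task gets a contiguous range. */
    static int taskFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    /** Lists the key groups owned by one task at the given parallelism. */
    static List<Integer> keyGroupsOf(int task, int maxParallelism, int parallelism) {
        List<Integer> groups = new ArrayList<>();
        for (int keyGroup = 0; keyGroup < maxParallelism; keyGroup++) {
            if (taskFor(keyGroup, maxParallelism, parallelism) == task) {
                groups.add(keyGroup);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        int maxParallelism = 4; // number of key groups, fixed when the job is first started
        for (int parallelism : new int[] {2, 4}) {
            for (int task = 0; task < parallelism; task++) {
                System.out.println("parallelism=" + parallelism + " task=" + task
                        + " keyGroups=" + keyGroupsOf(task, maxParallelism, parallelism));
            }
        }
    }
}
```

Rescaling only changes the second mapping: a key never moves to a different key group, so state can be shipped per key group instead of being read and filtered key by key. It also shows why the number of key groups caps the parallelism: with more tasks than key groups, some tasks would own no key group at all.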
Current State in Flink 1.2
▪ Manual rescaling
  1. Take savepoint
  2. Restart job with adjusted parallelism and savepoint
Next Steps beyond Flink 1.2
▪ Rescaling individual operators w/o restart
▪ Refactor Flink deployment and process model (previously discussed)
▪ On-the-fly scaling
Autoscaling Policies
• Latency
• Throughput
• Resource utilization
• Kubernetes on GCE, EC2 and Mesos (marathon-autoscale) already support auto-scaling
Conclusion
▪ Many great features in Flink 1.2
  ▪ Walkthrough
  ▪ Queryable State & Dynamic Scaling
▪ Glimpse beyond the 1.2 release
Thank you!
@stefanrrichter @ApacheFlink @dataArtisans
Questions?