Date posted: 16-Apr-2017
Uploaded by: joe-percivall
The Avant-garde of Apache NiFi
Joe Percivall - @JPercivall
Hadoop Summit – Melbourne
31 August 2016
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
About Me
• Software Engineer at Hortonworks
• Apache NiFi committer and PMC member
• GitHub: github.com/JPercivall
Agenda
• Intro to NiFi
• What’s new in NiFi 1.0.0
• Intro to MiNiFi
• MiNiFi Architecture
• NiFi & MiNiFi Demo
Let’s Connect A to B
Producers, a.k.a. Things: Anything AND Everything
Internet!
Consumers
• User
• Storage
• System
• …More Things
Why is moving data effectively hard?
• Standards
• Formats
• “Exactly Once” Delivery
• Protocols
• Veracity of Information
• Validity of Information
• Ensuring Security
• Overcoming Security
• Compliance
• Schemas
• Consumers Change
• Credential Management
• “That [person|team|group]”
• Network
• “Exactly Once” Delivery
Apache NiFi
Dataflow
• Web-based User Interface for creating, monitoring, & controlling data flows
• Directed graphs of data routing and transformation
• Highly configurable - modify data flow at runtime, dynamically prioritize data
• Easily extensible through development of custom components
• Data Provenance tracks data through entire system
[1] https://nifi.apache.org/
Apache NiFi Key Features
• Guaranteed delivery
• Data buffering
  - Backpressure
  - Pressure release
• Prioritized queuing
• Flow specific QoS
  - Latency vs. throughput
  - Loss tolerance
• Data provenance
• Supports push and pull models
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
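The buffering and queuing features listed above can be illustrated with a small sketch. It is not NiFi's API: the `Connection` class, the `backpressure_threshold` parameter, and the `prioritizer` callback are hypothetical names chosen for this example. It only shows the general idea of a bounded queue that refuses new items once a threshold is reached (backpressure) and hands out queued items in priority order (prioritized queuing).

```python
import heapq
import itertools


class Connection:
    """Toy bounded, prioritized queue; a hypothetical stand-in, not NiFi's API."""

    def __init__(self, backpressure_threshold, prioritizer):
        self.backpressure_threshold = backpressure_threshold  # max queued items
        self.prioritizer = prioritizer       # lower key is dequeued first
        self._heap = []
        self._counter = itertools.count()    # tie-breaker keeps FIFO order

    def is_full(self):
        return len(self._heap) >= self.backpressure_threshold

    def offer(self, item):
        """Enqueue unless backpressure is engaged; return True on success."""
        if self.is_full():
            return False  # the upstream producer should hold off for now
        entry = (self.prioritizer(item), next(self._counter), item)
        heapq.heappush(self._heap, entry)
        return True

    def poll(self):
        """Dequeue the highest-priority item, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


conn = Connection(backpressure_threshold=2,
                  prioritizer=lambda item: item["priority"])
accepted = [conn.offer({"name": name, "priority": priority})
            for name, priority in [("bulk", 5), ("alert", 1), ("late", 3)]]
first = conn.poll()
```

In this run the third offer is rejected because the queue already holds two items, and the item with the lowest priority key is dequeued first even though it arrived second.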
Simplified Example
Let’s consider the needs of a courier service
[Diagram: Physical Store (Gateway Server, Mobile Devices, Registers), Distribution Center (Server Cluster), Core Data Center at HQ (Server Cluster), On Delivery Routes (Trucks, Deliverers)]
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
Great! I am collecting all this data! Let’s use it!
Finding our needles in the haystack
[Diagram: the courier sites (Physical Store, Distribution Center, Core Data Center at HQ, On Delivery Routes) with added components labeled Kafka, Storm / Spark / Flink / Apex, and Others]
Let’s revisit our courier service from the perspective of NiFi
[Diagram: the courier sites (Physical Store, Distribution Center, Core Data Center at HQ, On Delivery Routes) with Kafka, Storm / Spark / Flink / Apex, and Others, and six NiFi instances marked across the flow]
Fundamental Terminology
FlowFile
• Unit of data moving through the system
• Content + Attributes (key/value pairs)
Processor
• Performs the work, can access FlowFiles
Connection
• Links between processors
• Queues that can be dynamically prioritized
git clone https://github.com/JPercivall/nifi-developer-tutorial.git
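To make the three terms concrete, here is a minimal sketch of how they relate. It is a toy model written for this transcript, not NiFi's Java API: `FlowFile` is modeled as a plain data class, `UppercaseProcessor` is a hypothetical processor, and the two `deque` objects stand in for connections.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Unit of data: content plus key/value attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class UppercaseProcessor:
    """Hypothetical processor: transforms content and adds an attribute."""

    def process(self, flowfile):
        return FlowFile(
            content=flowfile.content.upper(),
            attributes={**flowfile.attributes, "uppercased": "true"},
        )


incoming = deque()  # connection feeding the processor
outgoing = deque()  # connection leaving the processor

incoming.append(FlowFile(b"parcel scanned", {"filename": "scan-1.txt"}))

processor = UppercaseProcessor()
while incoming:
    # The processor pulls from one connection and pushes to the next.
    outgoing.append(processor.process(incoming.popleft()))

result = outgoing[0]
```

The processor never talks to another processor directly; it only reads from and writes to the queues between them, which is what lets a connection buffer or reprioritize data independently of the work being done.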
Apache NiFi 1.0.0
• Zero Master Clustering
• UI Refresh
• Multi-tenant authorization and internal authorization/policy management
• 15+ new components
• Over 450 tickets closed!
Zero Master Clustering
UI Refresh & Multi-tenant Authorization
Revisit: Courier service from the perspective of NiFi
[Diagram: the courier sites (Physical Store, Distribution Center, Core Data Center at HQ, On Delivery Routes) with six NiFi instances marked across the flow]
Courier service from the perspective of NiFi & MiNiFi
[Diagram: the courier sites (Physical Store, Distribution Center, Core Data Center at HQ, On Delivery Routes) with NiFi instances, MiNiFi instances, and Client Libraries marked across the flow]
Apache NiFi MiNiFi Key Features
• Guaranteed delivery
• Data buffering
  - Backpressure
  - Pressure release
• Prioritized queuing
• Flow specific QoS
  - Latency vs. throughput
  - Loss tolerance
• Data provenance
• Recovery/recording a rolling log of fine-grained history
• Designed for extension
• Design and Deploy
• Warm re-deploys
Visual Command and Control vs. Design and Deploy
Created to more effectively collect data at the edge
NiFi vs MiNiFi Java Processes
• MiNiFi: NiFi Framework, Components
• NiFi: NiFi Framework, User Interface, Components
NiFi Java Processes
• Bootstrap reads bootstrap.conf and starts NiFi
• NiFi (including the UI) reads nifi.properties
• NiFi reads & modifies flow.xml.gz
MiNiFi Java Processes
• Bootstrap (with its Configuration Change Notifier(s)) reads bootstrap.conf
• Bootstrap reads config.yml and transforms it into nifi.properties and flow.xml.gz
• Bootstrap starts MiNiFi
• MiNiFi reads nifi.properties and flow.xml.gz
Same Extensible Framework (NARs)
• In minifi-0.0.1, the nifi-0.6.1 standard processors are bundled (~20 MB)
  – Tailing a log
  – UpdateAttribute
  – Routing by content or attributes
  – PutEmail
• Allows MiNiFi to use NiFi processors
Simple config.yml
Tail a rolling file -> Site to Site
Courier service from the perspective of NiFi & MiNiFi
[Diagram: the courier sites (Physical Store, Distribution Center, Core Data Center at HQ, On Delivery Routes) with NiFi instances, MiNiFi instances, and Client Libraries marked across the flow]
Questions?
Thank you!
Learn more and join us!
Apache NiFi site: http://nifi.apache.org
Subproject MiNiFi site: http://nifi.apache.org/minifi/
Subscribe to and collaborate: [email protected]@nifi.apache.org
Submit Ideas or Issues: https://issues.apache.org/jira/browse/NIFI
Follow us on Twitter: @apachenifi
Back-up
Brief history of the Apache NiFi Community
Timeline:
• 2006: Code developed at NSA
• 2006-2014: Matured at NSA
• November 2014: Code available open source (ASL v2)
• July 2015: Achieved TLP status in just 7 months
• Today
Community:
• Contributors from Government and several commercial industries
• Releases on a 6-8 week schedule
• Apache NiFi 1.0.0 release on the horizon
  – Zero-Master Clustering
A bit more complex config.yml
Tail a rolling file -> Secure Site to Site with Provenance
MiNiFi 0.0.1-Java
• Declarative configuration of processing flows through a YAML configuration file
• Exporting of provenance events to another NiFi instance via a Reporting Task over Site to Site
• Flow change configuration watcher implementations that provide reloading a NiFi instance when receiving an updated flow over REST or changes on a file system
• Providing a mechanism to query an instance's status
• <40 MB binary distribution
Release Notes
Change notifier update
[Diagram, built up over six slides: a user creates a new configuration and sends it to the Configuration Change Notifiers in Bootstrap; Bootstrap validates the new configuration, saves the new config.yml, transforms it into nifi.properties and flow.xml.gz, and attempts a restart; MiNiFi reads the new files]
1. Initial state
  – Both running (MiNiFi and Bootstrap)
2. User sends update through notifier
  – HTTP(S) POST request
  – Change watched file
3. Bootstrap validation
  – Basic validation
  – REST notifier will respond accordingly
  – Results logged
4. Bootstrap saves and transforms
  – Copy old config.yml to a swap file
  – Save the new config.yml and transform it into nifi.properties and flow.xml.gz
5. Bootstrap attempts restart
  – MiNiFi reads in the new nifi.properties and flow.xml.gz
6. Success or fail
  – Successful restart: continue processing
  – Failure: roll back to old config
  – Existing data is mapped or orphaned
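The validate, swap, restart, and rollback sequence described for the change notifier can be sketched as follows. This is a minimal illustration under stated assumptions, not MiNiFi's Bootstrap code: the function `apply_config_update`, its callback parameters (`validate`, `transform`, `restart`), the `.swap` file suffix, and the choice to delete the swap file afterwards are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def apply_config_update(config_path, new_text, validate, transform, restart):
    """Hypothetical helper; returns True if the new config ends up live."""
    config_path = Path(config_path)
    # Assumed swap-file name; the slides only say "a swap file".
    swap_path = config_path.with_name(config_path.name + ".swap")

    if not validate(new_text):               # basic validation
        return False

    shutil.copyfile(config_path, swap_path)  # keep the old config
    config_path.write_text(new_text)         # save the new config
    transform(config_path)                   # regenerate derived files

    if restart():                            # attempt restart
        swap_path.unlink()
        return True

    # Restart failed: restore the old config and bring it back up.
    shutil.copyfile(swap_path, config_path)
    swap_path.unlink()
    transform(config_path)
    restart()
    return False


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "config.yml"
    cfg.write_text("old flow")
    transformed = []  # records what each transform call saw
    ok = apply_config_update(
        cfg,
        "broken flow",
        validate=lambda text: bool(text.strip()),
        transform=lambda path: transformed.append(path.read_text()),
        # Stand-in restart that fails only for the "broken" config.
        restart=lambda: cfg.read_text() != "broken flow",
    )
    final_text = cfg.read_text()
```

In this run the new configuration passes validation but the simulated restart fails, so the old file is restored from the swap copy and transformed again. What happens to data already queued in the old flow (the "mapped or orphaned" point) is outside the scope of this sketch.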