Date posted: 11-Apr-2017
Spring with Apache NiFi
NiFi likes to move it… at least once
© Hortonworks Inc. 2011 – 2015. All Rights Reserved
Oleg Zhurakousky, Hortonworks; Twitter: @z_oleg; GitHub: @olegz
Agenda
• Vision
• The Key Concepts
• Demo!
• Proposed Roadmap
…what many think their architecture looks like
[Diagram: a Dataflow linking Acquire Data, Process and Analyze Data, and Store Data]
…what it really looks like
Modern data processing concerns
• Multiple sources of data
• Geo distribution
• Multiple protocols for data transport
• New technologies and products
• New data processing paradigms
  - Streaming, Event Sourcing
• Security and encryption
• New type of users
Modern data concerns summary
• Multiple sources of data
• Geo distribution
• Multiple protocols for data transport
• New technologies and products
• New data processing paradigms
  - Streaming, Event Sourcing
• Security and encryption
• New type of users
Modern applications are now data-centric rather than data-source-centric
So what is NiFi?
NiFi is a technology that consolidates heterogeneous sources of data into one cohesive data flow while addressing the concerns described above.
Product or Framework?
• NiFi is a product
  - IDE for data flow design
  - Data Flow control and management
  - Out-of-the-box support for key concepts and features of Flow-based Paradigm/Programming (FBP)
Three key concepts
• Manage the flow of information
• Data Provenance
• Secure the control plane and data plane
Apache NiFi – Key Features
• Guaranteed delivery
• Data buffering
  - Backpressure
  - Pressure release
• Prioritized queuing
• Flow specific QoS
  - Latency vs. throughput
  - Loss tolerance
• Data provenance
• Supports push and pull models
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
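To make the "Data buffering / Backpressure" item concrete, here is a minimal Python sketch of a bounded queue that refuses new items once a threshold is reached. It is a conceptual illustration only, not NiFi code: the `Connection` class, `backpressure_threshold` parameter, and `run_producer` helper are hypothetical names chosen for this example.

```python
from collections import deque


class Connection:
    """Hypothetical bounded queue between two processing steps (not the NiFi API)."""

    def __init__(self, backpressure_threshold):
        self.backpressure_threshold = backpressure_threshold
        self._queue = deque()

    def is_full(self):
        # Backpressure is engaged once the queue reaches the threshold.
        return len(self._queue) >= self.backpressure_threshold

    def offer(self, item):
        """Enqueue the item unless backpressure is engaged; return True on success."""
        if self.is_full():
            return False
        self._queue.append(item)
        return True

    def poll(self):
        """Dequeue the oldest item, or return None if the queue is empty."""
        return self._queue.popleft() if self._queue else None


def run_producer(connection, items):
    """Try to push every item; return the ones held back by backpressure."""
    held_back = []
    for item in items:
        if not connection.offer(item):
            held_back.append(item)
    return held_back
```

With a threshold of 2, pushing three items leaves the third held back until a consumer polls the queue and frees a slot. The producer keeps what it could not hand off rather than dropping it, which is the behavior a loss-intolerant flow would need.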
Product or Framework?
• NiFi is a product
  - UI for data flow design, control and management
  - Out-of-the-box support for key concepts of Flow-based Paradigm/Programming (FBP)
• NiFi is a framework
  - Flow-based Programming (FBP)
  - Extension model
  - API
Flow Based Programming (FBP)
FBP Term | NiFi Term | Description
Information Packet | FlowFile | Each object moving through the system.
Black Box | FlowFile Processor | Performs the work, doing some combination of data routing, transformation, or mediation between systems.
Bounded Buffer | Connection | The linkage between processors, acting as queues and allowing various processes to interact at differing rates.
Scheduler | Flow Controller | Maintains the knowledge of how processes are connected, and manages the threads and allocations thereof which all processes use.
Subnet | Process Group | A set of processes and their connections, which can receive and send data via ports. A process group allows the creation of an entirely new component simply by composition of its components.
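The FBP terms in the table can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the NiFi API: the class names loosely mirror the NiFi terms, the `trigger` and `run_until_idle` methods are hypothetical, and connections are plain unbounded deques for brevity (a real bounded buffer would cap their size).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Information Packet: content plus a map of attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class Processor:
    """Black Box: takes a FlowFile from one connection, emits to another."""

    def __init__(self, name, transform, inbound, outbound):
        self.name = name
        self.transform = transform
        self.inbound = inbound
        self.outbound = outbound

    def trigger(self):
        """Process one FlowFile if available; return True if work was done."""
        if not self.inbound:
            return False
        flow_file = self.inbound.popleft()
        self.outbound.append(self.transform(flow_file))
        return True


class FlowController:
    """Scheduler: knows the processors and triggers them until nothing is left to do."""

    def __init__(self, processors):
        self.processors = processors

    def run_until_idle(self):
        # The list forces every processor to be triggered on each pass.
        while any([p.trigger() for p in self.processors]):
            pass


def to_upper(ff):
    return FlowFile(ff.content.upper(), dict(ff.attributes))


def tag(ff):
    return FlowFile(ff.content, {**ff.attributes, "tagged": "true"})


# Connections: the queues that link processors.
source, middle, sink = deque(), deque(), deque()
controller = FlowController([
    Processor("upper", to_upper, source, middle),
    Processor("tag", tag, middle, sink),
])
source.append(FlowFile(b"hello"))
controller.run_until_idle()
```

After the run, `sink` holds a single FlowFile whose content has been upper-cased by the first processor and whose attributes were extended by the second. Because processors only see their inbound and outbound queues, they can be rearranged without knowing about each other.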
“All about that connectivity”
Could be viewed as an extension to the Application layer of the OSI model:
1. Asynchronous Processes
2. Reliable Process Connectivity
3. Provenance
[Diagram: NiFi, Flow-based programming (FBP), and Enterprise Integration Patterns (EIP) placed above the OSI layers]
Application
Presentation
Session
Transport
Network
Data Link
Physical
Basics of Connecting Systems
For every connection, these must agree:
1. Protocol
2. Format
3. Schema
4. Priority
5. Size of event
6. Frequency of event
7. Authorization access
8. Relevance
[Diagram: Producer P1 connected to Consumer C1]
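One way to picture the checklist above is as a contract that both ends of a connection declare, with a helper that reports where they differ. The Python sketch below is purely illustrative: `EndpointContract`, its field names, and `disagreements` are hypothetical, and "agree" is simplified to strict equality, which real systems would usually relax (for example, a consumer tolerating a lower event rate).

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class EndpointContract:
    """Hypothetical description of what one side of a connection expects."""
    protocol: str
    data_format: str
    schema: str
    priority: str
    max_event_size_bytes: int
    events_per_second: int
    authorization: str
    relevance: str


def disagreements(producer, consumer):
    """Return the names of contract fields on which the two sides differ."""
    return [
        f.name
        for f in fields(EndpointContract)
        if getattr(producer, f.name) != getattr(consumer, f.name)
    ]


producer = EndpointContract(
    protocol="https",
    data_format="json",
    schema="order-v1",
    priority="normal",
    max_event_size_bytes=4096,
    events_per_second=100,
    authorization="token",
    relevance="orders",
)
# The consumer expects a different format and schema version.
consumer = replace(producer, data_format="avro", schema="order-v2")
mismatched = disagreements(producer, consumer)
```

Here `mismatched` lists the two fields that differ, which is the kind of gap a dataflow layer sits between producer and consumer to bridge.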
Extension / Integration Points
NiFi Term | Description
Flow File Processor | Push/Pull behavior. Custom UI
Flow File Comparator | Used to establish priority of FlowFiles in a queue
Reporting Task | Used to push data from NiFi to some external service (metrics, provenance, etc.)
Controller Service | Used to enable reusable components / shared services throughout the flow
The REST API | Allows clients to connect to pull information, change behavior, etc.
Native Ports | Input/Output ports connect NiFi flows to the outside world without an explicit third-party messaging layer.
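The comparator extension point in the table can be illustrated with a small Python sketch of comparator-based ordering. It does not use the NiFi extension API: FlowFiles are modeled as plain attribute dictionaries, and the `priority` attribute, its default of 5, and both function names are assumptions made for this example.

```python
from functools import cmp_to_key


def by_priority_attribute(a, b):
    """Hypothetical comparator: a lower 'priority' attribute value goes first."""
    pa = int(a.get("priority", 5))
    pb = int(b.get("priority", 5))
    return (pa > pb) - (pa < pb)


def drain_in_priority_order(queued, comparator):
    """Return queued items ordered by the comparator; ties keep arrival order."""
    return sorted(queued, key=cmp_to_key(comparator))


queued = [
    {"id": "a", "priority": "3"},
    {"id": "b", "priority": "1"},
    {"id": "c"},                    # no priority attribute, falls back to 5
    {"id": "d", "priority": "1"},
]
ordered_ids = [ff["id"] for ff in drain_in_priority_order(queued, by_priority_attribute)]
```

Because Python's sort is stable, items with equal priority leave the queue in the order they arrived, so the comparator only needs to express the prioritization rule itself.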
Let’s write some code!
DEMO
Agenda
Introducing Apache NiFi
• A Simple Vision
• The Key Concepts
• Demo!
• Proposed roadmap
Proposed Roadmap (in progress within community)
NiFi
• Multi-tenant authorization
• HA Data & Control
• Enhanced user experience
• Registry for Templates and Extensions
• Version managed flows
• Zero downtime upgrades
“MiNiFi”
• Agent model
• C&C
• Secure data path
• Scale down
• Full data provenance