The Avant-garde of Apache NiFi

Post on 16-Apr-2017


Joe Percivall - @JPercivall
Hadoop Summit – Melbourne

31 August 2016

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

About Me

• Software Engineer at Hortonworks
• Apache NiFi committer and PMC member
• GitHub: github.com/JPercivall

Agenda

• Intro to NiFi
• What’s new in NiFi 1.0.0
• Intro to MiNiFi
• MiNiFi Architecture
• NiFi & MiNiFi Demo

Let’s Connect A to B

Producers, A.K.A. Things: Anything AND Everything

Internet!

Consumers
• User
• Storage
• System
• …More Things

Why is moving data effectively hard?

• Standards
• Formats
• “Exactly Once” Delivery
• Protocols
• Veracity of Information
• Validity of Information
• Ensuring Security
• Overcoming Security
• Compliance
• Schemas
• Consumers Change
• Credential Management
• “That [person|team|group]”
• Network
• “Exactly Once” Delivery

Apache NiFi

Dataflow

• Web-based user interface for creating, monitoring, & controlling data flows
• Directed graphs of data routing and transformation
• Highly configurable: modify data flow at runtime, dynamically prioritize data
• Easily extensible through development of custom components
• Data Provenance tracks data through entire system

[1] https://nifi.apache.org/

Apache NiFi Key Features

• Guaranteed delivery
• Data buffering
  - Backpressure
  - Pressure release
• Prioritized queuing
• Flow specific QoS
  - Latency vs. throughput
  - Loss tolerance
• Data provenance
• Supports push and pull models
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
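The buffering and queuing features listed above can be illustrated with a small sketch. This is not NiFi code: the class name ToyConnectionQueue and its methods are hypothetical, and modeling "pressure release" as dropping entries older than a cutoff is an assumption about what the slide means. It only shows the general idea of a bounded, prioritized buffer that refuses new data once a threshold is reached.

```python
import heapq
import itertools


class ToyConnectionQueue:
    """Hypothetical bounded, prioritized buffer (not the NiFi implementation)."""

    def __init__(self, backpressure_threshold):
        self.backpressure_threshold = backpressure_threshold
        self._heap = []
        # Unique counter breaks ties so equal priorities stay in arrival order.
        self._counter = itertools.count()

    def offer(self, item, priority, enqueued_at):
        """Accept the item, or return False to signal backpressure upstream."""
        if len(self._heap) >= self.backpressure_threshold:
            return False
        # heapq is a min-heap, so negate priority to pop the highest first.
        heapq.heappush(
            self._heap, (-priority, next(self._counter), enqueued_at, item)
        )
        return True

    def poll(self):
        """Return the highest-priority item, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[3]

    def release_older_than(self, cutoff):
        """Assumed 'pressure release': drop entries enqueued before cutoff."""
        kept = [entry for entry in self._heap if entry[2] >= cutoff]
        dropped = len(self._heap) - len(kept)
        heapq.heapify(kept)
        self._heap = kept
        return dropped
```

With a threshold of 2, a third offer is refused until an item is polled or aged off, which is the point where an upstream producer would slow down or hold its data.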


Simplified Example

Let’s consider the needs of a courier service

Diagram: Physical Store (Gateway Server, Mobile Devices, Registers); Distribution Center (Server Cluster); Core Data Center at HQ (Server Cluster); On Delivery Routes (Trucks, Deliverers)

Icon credits
• Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
• Deliverer: Rigo Peter, https://thenounproject.com/rigo/
• Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
• Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/

Great! I am collecting all this data! Let’s use it!

Finding our needles in the haystack

Diagram: Physical Store (Gateway Server, Mobile Devices, Registers); Distribution Center (Server Cluster); Core Data Center at HQ (Server Cluster); On Delivery Routes (Trucks, Deliverers). Added labels: Kafka (twice), Storm / Spark / Flink / Apex (twice), Others.

Let’s revisit our courier service from the perspective of NiFi

Diagram: Physical Store (Gateway Server, Mobile Devices, Registers); Distribution Center (Server Cluster); Core Data Center at HQ (Server Cluster); On Delivery Routes (Trucks, Deliverers). Added labels: Kafka (twice), Storm / Spark / Flink / Apex (twice), Others, NiFi (six instances).

Fundamental Terminology

FlowFile
• Unit of data moving through the system
• Content + Attributes (key/value pairs)

Processor
• Performs the work, can access FlowFiles

Connection
• Links between processors
• Queues that can be dynamically prioritized
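To make the three terms concrete, here is a minimal toy model. It is a sketch only and does not use the NiFi API: the class names mirror the slide, while UppercaseProcessor, on_trigger, set_prioritizer, and the "priority" and "uppercased" attributes are hypothetical names chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Unit of data: content plus key/value attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class Connection:
    """Queue linking two processors; its prioritizer can be swapped at runtime."""

    def __init__(self, prioritizer=None):
        self._queue = deque()
        self._prioritizer = prioritizer  # key function; lowest value goes first

    def set_prioritizer(self, prioritizer):
        self._prioritizer = prioritizer

    def put(self, flowfile):
        self._queue.append(flowfile)

    def take(self):
        if not self._queue:
            return None
        if self._prioritizer is None:
            return self._queue.popleft()  # plain first-in, first-out
        chosen = min(self._queue, key=self._prioritizer)
        self._queue.remove(chosen)
        return chosen


class UppercaseProcessor:
    """Hypothetical processor: upper-cases content and tags the FlowFile."""

    def __init__(self, incoming, outgoing):
        self.incoming = incoming
        self.outgoing = outgoing

    def on_trigger(self):
        """Process one FlowFile; return False when there is nothing to do."""
        flowfile = self.incoming.take()
        if flowfile is None:
            return False
        attributes = {**flowfile.attributes, "uppercased": "true"}
        self.outgoing.put(FlowFile(flowfile.content.upper(), attributes))
        return True
```

Calling set_prioritizer on a connection that already holds FlowFiles changes the order in which the next processor sees them, which is the "dynamically prioritized" behavior the slide describes.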

git clone https://github.com/JPercivall/nifi-developer-tutorial.git


Apache NiFi-1.0.0

• Zero Master Clustering
• UI Refresh
• Multi-tenant authorization and internal authorization/policy management
• 15+ new components
• Over 450 tickets closed!

Zero Master Clustering


UI Refresh & Multi-tenant Authorization


Revisit: Courier service from the perspective of NiFi

Diagram: Physical Store (Gateway Server, Mobile Devices, Registers); Distribution Center (Server Cluster); Core Data Center at HQ (Server Cluster); On Delivery Routes (Trucks, Deliverers). Added labels: NiFi (six instances).

Courier service from the perspective of NiFi & MiNiFi

Diagram: Physical Store (Gateway Server, Mobile Devices, Registers); Distribution Center (Server Cluster); Core Data Center at HQ (Server Cluster); On Delivery Routes (Trucks, Deliverers). Added labels: Client Libraries (three), MiNiFi (two), NiFi (six).

Apache NiFi MiNiFi Key Features

• Guaranteed delivery
• Data buffering
  - Backpressure
  - Pressure release
• Prioritized queuing
• Flow specific QoS
  - Latency vs. throughput
  - Loss tolerance
• Data provenance
• Recovery/recording a rolling log of fine-grained history
• Designed for extension
• Design and Deploy
• Warm re-deploys

Visual Command and Control vs. Design and Deploy

Created to more effectively collect data at the edge


NiFi vs MiNiFi Java Processes

• MiNiFi: NiFi Framework, Components
• NiFi: NiFi Framework, User Interface, Components

NiFi Java Processes

Diagram:
• Bootstrap reads bootstrap.conf and starts NiFi
• NiFi reads nifi.properties
• NiFi (with its UI) reads & modifies flow.xml.gz

MiNiFi Java Processes

Diagram:
• Bootstrap (with its Configuration Change Notifier(s)) reads bootstrap.conf and starts MiNiFi
• Bootstrap reads config.yml and transforms it into nifi.properties and flow.xml.gz
• MiNiFi reads nifi.properties and flow.xml.gz

Same Extensible Framework (NARs)

• In minifi-0.0.1, the nifi-0.6.1 standard processors are bundled (~20 MB)
  – Tailing a log
  – UpdateAttribute
  – Routing by content or attributes
  – PutEmail
• Allows MiNiFi to use NiFi processors

Simple config.yml

Tail a rolling file -> Site to Site

Courier service from the perspective of NiFi & MiNiFi

Diagram: Physical Store (Gateway Server, Mobile Devices, Registers); Distribution Center (Server Cluster); Core Data Center at HQ (Server Cluster); On Delivery Routes (Trucks, Deliverers). Added labels: Client Libraries (three), MiNiFi (two), NiFi (six).

Questions?


Thank you!


Learn more and join us!

Apache NiFi site: http://nifi.apache.org

Subproject MiNiFi site: http://nifi.apache.org/minifi/

Subscribe to and collaborate at: dev@nifi.apache.org, users@nifi.apache.org

Submit ideas or issues: https://issues.apache.org/jira/browse/NIFI

Follow us on Twitter: @apachenifi

Back-up


Brief history of the Apache NiFi Community

Timeline
• 2006: Code developed at NSA
• 2006-2014: Matured at NSA
• November 2014: Code available open source, ASL v2
• July 2015: Achieved TLP status in just 7 months
• Today

• Contributors from Government and several commercial industries
• Releases on a 6-8 week schedule
• Apache NiFi 1.0.0 release on the horizon
  - Zero-Master Clustering

A bit more complex config.yml

Tail a rolling file -> Secure Site to Site with Provenance

MiNiFi 0.0.1-Java

• Declarative configuration of processing flows through a YAML configuration file
• Exporting of provenance events to another NiFi instance via a Reporting Task over Site to Site
• Flow change configuration watcher implementations that reload a NiFi instance when receiving an updated flow over REST or on changes to a file system
• A mechanism to query an instance's status
• <40 MB binary distribution

Release Notes

Change notifier update

Diagram components: MiNiFi, Bootstrap, Configuration Change Notifiers, config.yml, nifi.properties, flow.xml.gz

1. Initial state
   – Both running
2. User creates new configuration and sends update through notifier
   – HTTP(S) POST request
   – Change watched file
3. Bootstrap validation (validate new configuration)
   – Basic validation
   – REST notifier will respond accordingly
   – Results logged
4. Bootstrap saves and transforms
   – Copy old config.yml to a swap file
   – Saves new config.yml and transforms it into nifi.properties and flow.xml.gz
5. Bootstrap attempts restart
   – MiNiFi reads in the new nifi.properties and flow.xml.gz
6. Success or fail
   – Successful restart: continue processing
   – Failure: roll back to old config
   – Existing data is mapped or orphaned
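The update cycle can be sketched roughly as follows. This is an illustration under stated assumptions, not MiNiFi's actual bootstrap code: the function apply_config_update, the swap file name config.yml.swap, and the injected validate, transform, and restart callables are all hypothetical, and the handling of existing data (mapped or orphaned) is not modeled.

```python
import shutil
from pathlib import Path


def apply_config_update(new_config, conf_dir, validate, transform, restart):
    """Return True if the new configuration ends up running, else False.

    validate(text) -> bool, transform(config_path) -> None and
    restart() -> bool are supplied by the caller (assumed interfaces).
    """
    conf_dir = Path(conf_dir)
    config = conf_dir / "config.yml"
    swap = conf_dir / "config.yml.swap"  # hypothetical swap file name

    # Step 3: basic validation before anything on disk changes.
    if not validate(new_config):
        return False

    # Step 4: keep the old config in a swap file, save the new one,
    # then regenerate the derived files from it.
    shutil.copyfile(config, swap)
    config.write_text(new_config)
    transform(config)

    # Step 5: attempt a restart on the regenerated files.
    if restart():
        swap.unlink()
        return True

    # Step 6 (failure): roll back to the old config and restart on it.
    shutil.copyfile(swap, config)
    swap.unlink()
    transform(config)
    restart()
    return False
```

Passing the three callables in keeps the sketch independent of how validation, the config.yml transformation, and process restarts are actually implemented, and makes the rollback path easy to exercise with a restart function that fails once.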