
Introduction to Apache NiFi - Seattle Scalability Meetup

Date post: 16-Apr-2017
Upload: saptak-sen
Page 1: Introduction to Apache NiFi - Seattle Scalability Meetup

Introducing #ApacheNiFi

Saptak Sen [@saptak], Technical Product Manager, Hortonworks

© Hortonworks Inc. 2011 – 2015. All Rights Reserved

#seascale

Page 2: Introduction to Apache NiFi - Seattle Scalability Meetup

Agenda

• New Data Sources and the Rise of the Internet of Anything
• Introducing: Hortonworks DataFlow powered by Apache NiFi
• Key concepts, architecture, and use cases
• Demo
• Q&A

Page 3: Introduction to Apache NiFi - Seattle Scalability Meetup

IoAT Data Grows Faster Than We Consume It

Much of the new data exists in-flight, between systems and devices, as part of the Internet of Anything.

[Chart labels: NEW, TRADITIONAL, Ability to consume data]

The Opportunity: Unlock transformational business value from a full fidelity of data and analytics for all data.

Geolocation

Server logs

Files & emails

ERP, CRM, SCM

Traditional Data Sources

Internet of Anything

Sensors and machines

Clickstream

Social media

Page 4: Introduction to Apache NiFi - Seattle Scalability Meetup

Interconnectedness Demands User Centricity
Changes Organizations into Data Companies

Hortonworks Data Platform: for rich historical insights from data-at-rest

NEW Hortonworks DataFlow: for securely collecting, conducting, and curating data-in-motion while ALSO driving value for data-at-rest analytics and use cases

Source: Gartner - Architecture Options for Big Data Analytics on Hadoop, July 2015

Page 5: Introduction to Apache NiFi - Seattle Scalability Meetup

Simplistic View of IoAT & Data Flow

The Data Flow Thing

Process and Analyze Data

Acquire Data

Store Data

Page 6: Introduction to Apache NiFi - Seattle Scalability Meetup

Realistic View of IoAT and Data Flow

Global interactions with customers, business partners, and things, spanning different volume, velocity, bandwidth, and latency needs

Page 7: Introduction to Apache NiFi - Seattle Scalability Meetup

Meeting IoAT Edge Requirements

GATHER

DELIVER

PRIORITIZE

Track from the edge through to the datacenter

• Small Footprints: operate with very little power
• Limited Bandwidth: can create high latency
• Data Availability: exceeds transmission bandwidth
• Data Must Be Secured throughout its journey

Page 8: Introduction to Apache NiFi - Seattle Scalability Meetup

Hortonworks Acquires Onyara

Turn Internet of Anything Data Into Actionable Insights

• Onyara is the creator of and key contributor to Apache NiFi, an open source solution for processing and distributing data.

• Over the past 8 years, Onyara engineers developed the U.S. government software project called “Niagara Files”, the precursor to Apache NiFi.

• Apache NiFi was made available as an Apache Incubator project through the NSA Technology Transfer Program in the Fall of 2014.

NEW Hortonworks DataFlow offering will securely and easily collect, conduct and curate any data, from anything, anywhere.

Page 9: Introduction to Apache NiFi - Seattle Scalability Meetup

The IoAT Data Flow

Hortonworks Data Platform powered by Apache Hadoop

Enrich Context

Store Data and Metadata

Internet of Anything

Hortonworks DataFlow powered by Apache NiFi

Perishable Insights

Historical Insights

Introducing Hortonworks DataFlow powered by Apache NiFi

Hortonworks DataFlow and the Hortonworks Data Platform deliver the industry’s most complete solution for management of Big Data.

Page 10: Introduction to Apache NiFi - Seattle Scalability Meetup

Apache NiFi: Three key concepts

• Manage the flow of information

• Data Provenance

• Secure the control plane and data plane

Page 11: Introduction to Apache NiFi - Seattle Scalability Meetup

Apache NiFi – Key Features

• Guaranteed delivery
• Data buffering
  - Backpressure
  - Pressure release
• Prioritized queuing
• Flow specific QoS
  - Latency vs. throughput
  - Loss tolerance
• Data provenance
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
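
To make the buffering and prioritized-queuing features above concrete, here is a minimal Python sketch of a bounded queue between two processing steps. It is an illustration only and not NiFi's implementation or API: the `Connection` class, its `backpressure_threshold` parameter, and the sample item names are all assumptions made for this example. When the queue is full, `offer` refuses new items (backpressure); `poll` hands out the most urgent item first.

```python
import heapq
import itertools


class Connection:
    """Hypothetical bounded, prioritized queue between two processing steps."""

    def __init__(self, backpressure_threshold):
        self.backpressure_threshold = backpressure_threshold
        self._heap = []
        # Monotonic counter so items with equal priority leave in arrival order.
        self._order = itertools.count()

    def is_full(self):
        return len(self._heap) >= self.backpressure_threshold

    def offer(self, item, priority):
        """Enqueue item; return False (backpressure) when the queue is full."""
        if self.is_full():
            return False
        heapq.heappush(self._heap, (priority, next(self._order), item))
        return True

    def poll(self):
        """Return the most urgent item (lowest priority number), or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


conn = Connection(backpressure_threshold=3)
arrivals = [("routine-a", 5), ("alert", 1), ("routine-b", 5), ("overflow", 5)]
accepted = [conn.offer(name, priority) for name, priority in arrivals]
drained = [conn.poll() for _ in range(3)]
print(accepted)  # the fourth offer is refused because the queue is full
print(drained)   # the alert leaves first, then the routine items in arrival order
```

A producer that sees `False` from `offer` would typically pause or slow down, which is the point of backpressure: the slow consumer throttles the fast producer instead of the buffer growing without bound.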

Page 12: Introduction to Apache NiFi - Seattle Scalability Meetup

Common Apache NiFi Use Cases

• Predictive Analytics: Ensure the highest value data is captured and available for analysis
• Compliance: Gain full transparency into provenance and flow of data
• IoT Optimization: Secure, prioritize, enrich, and trace data at the edge
• Fraud Detection: Move sales transaction data in real time to analyze on demand
• Big Data Ingest: Easily and efficiently ingest data into Hadoop
• Value Resources: Gain visibility into how data sources are used to determine value

Page 13: Introduction to Apache NiFi - Seattle Scalability Meetup

Architecture

[Diagram: a single NiFi node. OS/Host > JVM > Web Server and Flow Controller (Processor 1 ... Extension N), with FlowFile Repository, Content Repository, and Provenance Repository on Local Storage]

[Diagram: a NiFi cluster. Master: NiFi Cluster Manager (NCM), an OS/Host > JVM running a Web Server and the NiFi Cluster Manager – Request Replicator. Slaves: NiFi Nodes, each with the single-node layout above]

High Availability: Control plane vs Data plane…
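
The three repositories named in the architecture diagram can be pictured with a small Python sketch. It assumes the usual division of labor (the FlowFile repository tracks attributes and a pointer to the content, the content repository holds the bytes, the provenance repository logs events). The `Node` and `FlowFile` classes, the `claim-N` naming, and the in-memory storage are hypothetical simplifications for illustration, not NiFi code.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical metadata record: attributes plus a pointer to the content."""
    attributes: dict
    content_claim: str


@dataclass
class Node:
    """Hypothetical node holding the three repositories in memory."""
    flowfile_repo: list = field(default_factory=list)    # FlowFile records
    content_repo: dict = field(default_factory=dict)     # claim -> bytes
    provenance_repo: list = field(default_factory=list)  # (event type, claim)

    def receive(self, data, **attributes):
        # Store the bytes once; the FlowFile only carries a pointer to them.
        claim = f"claim-{len(self.content_repo)}"
        self.content_repo[claim] = data
        flowfile = FlowFile(attributes=dict(attributes), content_claim=claim)
        self.flowfile_repo.append(flowfile)
        self.provenance_repo.append(("RECEIVE", claim))
        return flowfile

    def update_attribute(self, flowfile, key, value):
        # Changing metadata does not touch the stored content.
        flowfile.attributes[key] = value
        self.provenance_repo.append(("ATTRIBUTES_MODIFIED", flowfile.content_claim))


node = Node()
ff = node.receive(b"hello", filename="a.txt")
node.update_attribute(ff, "priority", "high")
print(ff.attributes)
print(node.provenance_repo)
```

Keeping content separate from metadata means a step that only edits attributes never copies the payload, which is one plausible reason for splitting the repositories this way.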

Page 14: Introduction to Apache NiFi - Seattle Scalability Meetup

HDF – Powered by Apache NiFi

Page 15: Introduction to Apache NiFi - Seattle Scalability Meetup

Add processor for data intake

1 Drag and drop processor icon from the top menu

Page 16: Introduction to Apache NiFi - Seattle Scalability Meetup

Choose the specific processor

2 Choose one of the processors – currently 90 available – designed for extension

Page 17: Introduction to Apache NiFi - Seattle Scalability Meetup

Example: Pick Twitter Processor

Page 18: Introduction to Apache NiFi - Seattle Scalability Meetup

Configure the processor

3 Select processor and choose option to Configure

4 Adjust parameters as required

Page 19: Introduction to Apache NiFi - Seattle Scalability Meetup

Another processor for data output

5 Drag and drop processor icon from the top menu

6 Example: choose PutHDFS processor

Page 20: Introduction to Apache NiFi - Seattle Scalability Meetup

Configure second processor

7 Configure 2nd processor

Page 21: Introduction to Apache NiFi - Seattle Scalability Meetup

Connect processors, configure connection

8

Page 22: Introduction to Apache NiFi - Seattle Scalability Meetup

Click Start to begin processing

9

Page 23: Introduction to Apache NiFi - Seattle Scalability Meetup

See processors update with real time changes

10 As data flows, the GUI updates in real time.

Page 24: Introduction to Apache NiFi - Seattle Scalability Meetup

Dynamically adjust and tune data flow as needed

11 Dynamically adjust and tune dataflow as needed, in real time. Can also replicate data for testing and comparison.

Page 25: Introduction to Apache NiFi - Seattle Scalability Meetup

Understand the data path with Data Provenance

14 Select Data Provenance

Page 26: Introduction to Apache NiFi - Seattle Scalability Meetup

Trace lineage of a particular piece of data

15 Icon for Data Lineage

Page 27: Introduction to Apache NiFi - Seattle Scalability Meetup

Every change to data is tracked: processing, views

16 Provenance event is tracked
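
As a rough picture of what tracking provenance events and tracing lineage involves, the Python sketch below keeps an append-only event log and walks parent links to recover the history of one item. The `ProvenanceLog` class, its method names, and the sample item IDs are hypothetical and chosen for this example; the event type strings are used only as labels. This is not how NiFi stores or queries provenance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceEvent:
    event_id: int
    item_id: str
    event_type: str
    parent_id: Optional[str] = None


class ProvenanceLog:
    """Hypothetical append-only log of events, with a simple lineage query."""

    def __init__(self):
        self._events = []

    def record(self, item_id, event_type, parent_id=None):
        event = ProvenanceEvent(len(self._events), item_id, event_type, parent_id)
        self._events.append(event)
        return event

    def lineage(self, item_id):
        """Return all recorded events for item_id and its ancestors, in recording order."""
        ids = set()
        current = item_id
        while current is not None and current not in ids:
            ids.add(current)
            parents = [e.parent_id for e in self._events
                       if e.item_id == current and e.parent_id]
            current = parents[0] if parents else None
        return [e for e in self._events if e.item_id in ids]


log = ProvenanceLog()
log.record("tweet-1", "RECEIVE")
log.record("tweet-1", "CONTENT_MODIFIED")
log.record("tweet-1-copy", "CLONE", parent_id="tweet-1")
log.record("tweet-1-copy", "SEND")
log.record("tweet-2", "RECEIVE")

trail = [(e.item_id, e.event_type) for e in log.lineage("tweet-1-copy")]
print(trail)  # events for the copy and its parent; tweet-2 is not part of this lineage
```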

Page 28: Introduction to Apache NiFi - Seattle Scalability Meetup

Updates as changes happen

17 Updates as data flows

Page 29: Introduction to Apache NiFi - Seattle Scalability Meetup

Easily access and trace changes to dataflow

Page 30: Introduction to Apache NiFi - Seattle Scalability Meetup

Audit trail of Hortonworks DataFlow User Actions

Page 32: Introduction to Apache NiFi - Seattle Scalability Meetup

Operations: Planned

Page 35: Introduction to Apache NiFi - Seattle Scalability Meetup

Q & A
