Hortonworks Data In Motion Webinar Series Pt. 2

Date post: 09-Jan-2017
Category:
Upload: hortonworks
Make Your Big Data Ecosystem Work Better for You. @MarkLochbihler, Partner Engineering. October 12th, 2016. © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Transcript

Make Your Big Data Ecosystem Work Better for You

@MarkLochbihler, Partner Engineering

October 12th, 2016

© Hortonworks Inc. 2011 – 2015. All Rights Reserved

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Agenda
• Hortonworks Connected Data Platforms
• HDF 2.0 Platform
• Data Ingestion into Hadoop made EASY
• HDF 2.0 Platform Use Cases
• HDF 2.0 Product Integration Certification
• HDF Partner Ecosystem Solutions
• More Information

Actionable Intelligence from Connected Data Platforms

• Capturing perishable insights from data in motion
• Ensuring rich, historical insights on data at rest
• Necessary for modern data applications

[Diagram labels: DATA AT REST; DATA IN MOTION; ACTIONABLE INTELLIGENCE; Modern Data Applications; Hortonworks DataFlow; Hortonworks Data Platform]

Modern Data Apps: Custom or Off the Shelf

• Real-Time Cyber Security protects systems with superior threat detection
• Smart Manufacturing dramatically improves yields by managing more variables in greater detail
• Connected, Autonomous Cars drive themselves and improve road safety
• Future Farming optimizing soil, seeds and equipment to measured conditions on each square foot
• Automatic Recommendation Engines match products to preferences in milliseconds

Hortonworks DataFlow Manages Data in Motion

[Diagram tiers: Sources; Regional Infrastructure; Core Infrastructure]

• Constrained, High-latency, Localized context
• Hybrid – cloud / on-premises, Low-latency, Global context

Easy, Real-Time, Coding Free Data Movement

Dynamic data pipeline as not all data is equal

[Diagram labels: AWS, Azure, Google Cloud, Hadoop]

Expand To The Very Edge With MiNiFi

[Diagram labels: AWS, Azure, Google Cloud, Hadoop]

Capture new sources of data

Edge Intelligence with Apache MiNiFi

Key Features
• Guaranteed delivery
• Data buffering
  ‒ Backpressure
  ‒ Pressure release
• Prioritized queuing
• Flow specific QoS
  ‒ Latency vs. throughput
  ‒ Loss tolerance
• Data provenance
• Recovery / recording a rolling log of fine-grained history
• Designed for extension

Different from Apache NiFi
• Design and Deploy
• Warm re-deploys
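The buffering and queuing features listed above can be illustrated with a small sketch. The Python below is not MiNiFi or NiFi code: the class name `BoundedPriorityQueue`, the size limit and the priority numbers are hypothetical, chosen only to show how a bounded queue can signal backpressure to a producer while handing out higher-priority items first.

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """Toy connection queue: bounded size (backpressure) plus priority ordering."""

    def __init__(self, max_items):
        self.max_items = max_items          # hypothetical backpressure threshold
        self._heap = []
        self._counter = itertools.count()   # tie-breaker keeps FIFO order per priority

    def offer(self, priority, item):
        """Return False (backpressure) instead of accepting data when full."""
        if len(self._heap) >= self.max_items:
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self):
        """Return the item with the lowest priority number, or None when empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = BoundedPriorityQueue(max_items=3)
accepted = [
    queue.offer(5, "routine telemetry"),
    queue.offer(1, "critical alarm"),
    queue.offer(5, "more telemetry"),
    queue.offer(1, "late alarm"),   # queue is full, so the producer is told to wait
]
drained = [queue.poll() for _ in range(3)]
print(accepted)
print(drained)
```

In this sketch the producer decides what to do when `offer` returns False (retry later, drop, or buffer elsewhere), which is roughly where a "pressure release" policy would sit.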

Immediate Insights At Massive Scale

[Diagram labels: AWS, Azure, Google Cloud, Hadoop]

Adapt to differing rates of data creation & delivery (Kafka) with real-time streaming analytics (Storm)

Hortonworks DataFlow Management and Stream Processing

[Diagram tiers: Sources; Regional Infrastructure; Core Infrastructure]

• Constrained, High-latency, Localized context
• Hybrid – cloud / on-premises, Low-latency, Global context

HDF 2.0: Data-in-Motion Platform

Flow management + Stream Processing

[Diagram labels: DATA IN MOTION; DATA AT REST; IoT Data Sources; Flow management; MiNiFi (multiple instances); NiFi (multiple instances); Kafka; Storm; AWS; Azure; Google Cloud; Hadoop; Others…]

Enterprise Services: Ambari, Ranger, Other services

Problems Today: Timely Access to Data and Decisions

HDF helps us to streamline the flow of data and build models and visualisations quickly, so that my team can work iteratively with business colleagues on building solutions that work for the business.
- Royal Mail
http://diginomica.com/2016/04/22/royal-mail-starts-to-deliver-on-hortonworks-data-in-motion-promise/

Simplistic View of DataFlows: Easy, Definitive

[Diagram labels: Acquire Data; Store Data; Process and Analyze Data; Dataflow]

Realistic View of Dataflows: Complex, Convoluted

[Diagram labels: Acquire Data (×4); Store Data (×5); Process and Analyze Data; Dataflow]

HDF Makes Big Data Ingest Fast, Easy

• Complicated, messy, and takes weeks to months to move the right data into Hadoop
• Streamlined, Efficient, Easy

[Logo: HDP, Hortonworks Data Platform, Powered by Apache Hadoop]

Use Cases for Data in Motion

When Only DataFlow Management is Required
Use Cases for Data-in-Motion Using DataFlow Mgmt
• Data Ingestion
• Edge Intelligence
• First Mile Problem
• Physical Data Movement
• Simple event processing such as Route, Filter, Enrich, Transform, etc.

When Both DataFlow Management and Stream Processing
Use Cases for Data-in-Motion Using DataFlow Mgmt and Stream Processing
• Flow Management to deliver data for Stream Processing
• PLUS: Complex pattern matching on unbounded streams of data.

Data Ingestion: Optimize Log Analytics with Content Based Routing
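
As a rough illustration of content-based routing for log analytics, the sketch below inspects each log line and sends it to a destination chosen from its content. The log format, the `ROUTES` table and the destination names are assumptions made for this example; they do not come from the slides or from any NiFi processor.

```python
import re
from collections import defaultdict

# Hypothetical routing rules: the first matching pattern wins.
ROUTES = [
    (re.compile(r"\b(ERROR|FATAL)\b"), "alerting"),
    (re.compile(r"\bWARN\b"), "analytics"),
]
DEFAULT_ROUTE = "archive"


def route(line):
    """Pick a destination name based only on the content of the log line."""
    for pattern, destination in ROUTES:
        if pattern.search(line):
            return destination
    return DEFAULT_ROUTE


def route_all(lines):
    """Group log lines by the destination their content maps to."""
    routed = defaultdict(list)
    for line in lines:
        routed[route(line)].append(line)
    return dict(routed)


logs = [
    "2016-10-12 10:00:01 INFO  service started",
    "2016-10-12 10:00:02 WARN  disk usage at 85%",
    "2016-10-12 10:00:03 ERROR connection refused",
]
routed = route_all(logs)
for destination, lines in sorted(routed.items()):
    print(destination, len(lines))
```

Keeping the rules in a plain table means they can be swapped without touching the routing function, which loosely mirrors the idea of changing routing rules while a flow is running.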

Example: Company X provides alerting services when a user's resting heart rate is higher than a threshold

[Diagram labels]
• Acquire Data Across Cloud Instances: Acquire Data at Company X Cloud Instance 1, Company X Cloud Instance 2 and Company X Cloud Instance 3
• Parse, Filter, Validate, Enrich and Route
• Core Data Center
• Analytics/Pattern Match
• Data Store
• Real-Time Insights
• Alerts
• Dashboards/Visualization
• Legend: Flow Management, Stream Processing

Data in Motion Needs Dataflow Management and Stream Processing

Flow Management (NiFi, MiNiFi and Partner Integration)
• Acquire data from various wearable devices' cloud instances
• Move data from customer cloud instances to the on-premises instance
• Perform intelligent routing and filtering of data. The routing and filtering rules will often be changed at run-time.
• Deliver the data to various downstream systems. New downstream apps will keep appearing, and the data should be fed to each one when it comes online.
• Parse the device data into a standardized format that downstream systems can understand
• Enrich the data with contextual information including patient/customer info (age, sex, etc.)

Stream Processing (Storm, Kafka and Partner Integration)
• Recognize the pattern when the resting heart rate exceeds a certain threshold (the insight), and then create an alert/notification.
• Run an outlier detection model on the streaming heart rate data that comes in. If the score is above a certain threshold, alert on the heart rate.
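
The steps above can be sketched end to end in a few lines of Python. This is only a toy model of the flow, not NiFi, Storm or Kafka code: the record format (`device_id,bpm`), the `CUSTOMERS` lookup table, the 100 bpm threshold and the cutoff value are all hypothetical, and a simple z-score stands in for whatever outlier detection model would actually be used.

```python
import statistics

# Hypothetical enrichment source keyed by device id.
CUSTOMERS = {"dev-1": {"age": 54, "sex": "F"}}
RESTING_BPM_THRESHOLD = 100   # assumed alert threshold
Z_SCORE_CUTOFF = 2.0          # assumed outlier cutoff


def parse(raw):
    """Parse 'device_id,bpm' into a standardized dict; return None if invalid."""
    parts = raw.split(",")
    if len(parts) != 2 or not parts[1].strip().isdigit():
        return None
    return {"device_id": parts[0].strip(), "bpm": int(parts[1])}


def enrich(record):
    """Attach contextual customer information to a parsed record."""
    return {**record, **CUSTOMERS.get(record["device_id"], {})}


def detect(records):
    """Return alerts for threshold breaches and for z-score outliers."""
    if not records:
        return []
    bpms = [r["bpm"] for r in records]
    mean = statistics.mean(bpms)
    stdev = statistics.pstdev(bpms)
    alerts = []
    for r in records:
        if r["bpm"] > RESTING_BPM_THRESHOLD:
            alerts.append(("threshold", r))
        elif stdev > 0 and abs(r["bpm"] - mean) / stdev > Z_SCORE_CUTOFF:
            alerts.append(("outlier", r))
    return alerts


raw_feed = ["dev-1,62", "dev-1,64", "garbage", "dev-1,63", "dev-1,61", "dev-1,130"]
# Flow-management side: parse, drop invalid records, enrich.
records = [enrich(p) for p in (parse(r) for r in raw_feed) if p is not None]
# Stream-processing side: pattern match and outlier scoring.
alerts = detect(records)
print(alerts)
```

The split between the list comprehension and `detect` follows the division on the slide: parsing, filtering and enrichment sit on the flow management side, while threshold matching and outlier scoring sit on the stream processing side.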

The Future of Data: Architectural Transformation Enabled By Connected Data Platforms

[Diagram labels: Cloud; On Prem; Edge Data; Edge Analytics; Data in Motion; Data in Motion (Cloud); Data in Motion (on-premises); Data at Rest (Cloud); Data at Rest (on-premises); Closed Loop Analytics; Machine Learning; Deep Historical Analysis]

Product Integration Certification for HDF 2.0

http://hortonworks.com/partners/product-integration-certification/

• Announced August 10th, 2016

• As the adoption of HDF expands, enterprises are looking for proven, pre-tested and certified integrations that mitigate deployment risk. The HDF Certified badge is earned by partners whose integrations with HDF are certified.

• Email [email protected] to get started

Product Certification: 3 SIMPLE STEPS

Step 1: Join Partnerworks

Step 2: Complete HDF 2.0 Certification Kit
• Certification Report
• Reference Architecture
• Solutions Overview

Step 3: Joint Review with Partner Engineering

Hortonworks DataFlow: Connecting Data Between Ecosystems

Operations: Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, Fetch

Protocols and formats: HL7, FTP, UDP, XML, SFTP, HTTP, Syslog, Email, HTML, Image, AMQP, MQTT

All Apache project logos are trademarks of the ASF and the respective projects. All other trademarks are the property of their respective owners.

Connected Vehicle Case – Big Data Primary Data Flow

[Diagram labels]
• ~2 Billion real time transactions/day
• Driver Blood Pressure Sensor Data
• Other real-time data feeds – customer data from dealerships, manufacturers
• Existing data sets and 3rd party data, e.g. accident data
• Public data sources such as NHTSA
• Server or laptop log files
• Sensitive structured and unstructured data
• Sensitive structured sources
• Real time ingest
• Storm
• Kafka
• "Landing zone"
• "Integration Controls"
• Hadoop Edge Nodes
• HPE SecureData Hadoop Tools
• HPE SecureData Key Servers & WS APIs
• Hadoop Cluster
• Sqoop
• Hive UDFs
• Map Reduce
• UDFs
• IBM DataStage
• Teradata EDW
• Exadata
• TDE
• Cognos
• Analytics & Data Science

Analytics and Hortonworks

[Diagram labels: INTERNET OF ANYTHING; CLOUD; ON PREMISE; DATA IN MOTION; DATA AT REST; ESP; HDP; STORAGE; GROUP 1; GROUP 2; GROUP 3; GROUP 4; Deep Historical Analysis; Machine Learning; Edge Analytics, MLearning, Historical Analytics]

USAGE CASE EXAMPLES: Cyber Security; Fraud; Predictive Maintenance; Customer Experience; Stream Data Management

Data Streaming into Kafka, HDF, HDP

[Diagram labels: CDC; Transaction logs; Bulk Load; In memory optimized metadata management and data transport; Data Streaming; MSG (n, 2, 1); Message broker]

Filtered data flow from edge to central site

Perform basic operations like ingest, alert, filter, transform, etc.

SITE 1
• Data is read from multiple sensors
• The NiFi installation on the edge node ingests the data from the sensors

SITE 2

SITE 3

CENTRAL NODES

[Diagram labels]
• DATA IN MOTION; DATA AT REST
• Columns: Data Sources; Polling & Protocol Translation; Real Time Database; Long Term Storage
• DCS; PLC; Meters; Vehicles; Analyzers; RTU
• MiNiFi
• Data Access; Polling Engine; Protocol Proxy Service
• Time Series Storage; Archive Files; Varied Support
• Governance & Integration; Security; Operations; Data Access; Data Management

[Diagram labels: Edge; Gateway Server; MiNiFi; Mobile; Devices; IoT Devices; Client Libraries; Regional Center; Core Data Center; Server Cluster; NiFi; AWS; Azure; Google Cloud; DB; Data WH]

• eFabric® Data Plane: Software Defined Fabric (eCompute, eStorage, eNetwork)
• eFabric® Control Plane
• Zone based micro-segmentation for data security
• AppHub – Application gallery for building, deploying and managing data pipelines
• Seamless connectivity to public cloud services

www.midfinsystems.com

More Information, Resources

Hortonworks Community Connection, Data Ingestion and Streaming: https://community.hortonworks.com

Partnerworks: http://hortonworks.com/partners/

HDF Certification: http://hortonworks.com/partners/product-integration-certification/

Webinars: http://hortonworks.com/events-webcasts/

Sandbox: http://hortonworks.com/events-webcasts/

HDF: http://hortonworks.com/hdf/

HDP: http://hortonworks.com/hdp/

Thank You

