The Elephant in the Clouds

Posted on 16-Apr-2017

Sanjay Radia, Chief Architect, Founder, Hortonworks

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Why Hadoop in the Cloud?

Unlimited Elastic Scale

Ephemeral & Long-Running

IT & Business Agility

No Upfront HW Costs ($0)

Today’s Hadoop Cloud Solutions

The Forrester Wave™: Big Data Hadoop Cloud Solutions, Q2 2016. Get it at //aka.ms/forresterwave

[Figure: Forrester Wave chart. Axes: Current Offering (Weak to Strong) and Strategy (Weak to Strong); Market Presence is also indicated. Bands: Leaders, Strong Performers, Contenders, Challengers. Vendors shown: Rackspace, Oracle, Altiscale, Qubole, Google, IBM, Amazon Web Services, Microsoft.]

Key Architectural Considerations for Hadoop in the Cloud

Shared Data & Storage

On-Demand Ephemeral Workloads

Elastic Resource Management

Shared Metadata, Security & Governance

Prescriptive On-Demand Ephemeral Workloads

[Diagram: four on-demand ephemeral workload types (Data Science, ETL, Warehouse, Search), each labeled with R/W Tables and Compute Fabric.]

Shared Data and Storage

Understand and Leverage Unique Cloud Properties
– Shared data lake is cloud storage accessible by all apps
– Cloud storage segregated from compute
– Built-in geo-distribution and DR

Focus Areas
– Address cloud storage consistency and performance
– Enhance performance via memory and local storage

Enhance Performance via Caching

Tabular Data: LLAP Read + Write-thru Cache
– Cache only the needed columns
– Shared across jobs / apps and across engines
– Spills to SSD when memory is full (anti-caching)
– Read & Write-through cache
– Security: Column-level and row-level

HDFS Caching for Non-tabular Data
– Cache data from cloud storage as needed
– Write-through cache
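The caching behavior listed on this slide (cache only the needed columns, write through to storage, spill to SSD when memory is full) can be illustrated with a toy model. This is a minimal sketch, not how LLAP or HDFS caching is implemented: the class name `ColumnCache`, the `memory_slots` parameter, and the use of plain dictionaries as stand-ins for cloud storage and the SSD tier are all assumptions made for illustration.

```python
from collections import OrderedDict


class ColumnCache:
    """Toy write-through cache keyed by (table, column). Hypothetical names."""

    def __init__(self, store, memory_slots):
        self.store = store            # stand-in for cloud storage (a dict)
        self.memory = OrderedDict()   # hot tier, least recently used first
        self.ssd = {}                 # stand-in for the SSD spill tier
        self.memory_slots = memory_slots
        self.store_reads = 0          # counts trips to the backing store

    def _admit(self, key, values):
        # Place the column in memory; spill the coldest entries when full.
        self.memory[key] = values
        self.memory.move_to_end(key)
        while len(self.memory) > self.memory_slots:
            old_key, old_values = self.memory.popitem(last=False)
            self.ssd[old_key] = old_values  # spill instead of dropping

    def read(self, table, column):
        key = (table, column)
        if key in self.memory:              # memory hit
            self.memory.move_to_end(key)
            return self.memory[key]
        if key in self.ssd:                 # SSD hit: promote back to memory
            values = self.ssd.pop(key)
        else:                               # miss: fetch only this column
            values = self.store[key]
            self.store_reads += 1
        self._admit(key, values)
        return values

    def write(self, table, column, values):
        key = (table, column)
        self.store[key] = list(values)      # write-through: update the store first
        self.ssd.pop(key, None)             # discard any stale spilled copy
        self._admit(key, list(values))
```

Only columns that a workload actually reads enter the cache, and a write updates the backing store before the cached copy, so other readers of the store never see older data than the cache holds.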

[Diagram: Workloads, Cache (LLAP for R/W Tables, HDFS for Files), Cloud Storage.]

Shared Data Requires Shared Metadata, Security, and Governance

Shared Metadata Across All Workloads

Metadata considerations
– Tabular data metastore
– Lineage and provenance metadata
– Pipeline and job management metadata
– Add upon ingest
– Update as processing modifies data

Access / tag-based policies and audit logs

Centrally stored to facilitate use across apps
– Ex. backed by Cloud RDS (or shared DB)
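The idea of tag-based policies and audit logs kept in one central store can be sketched as follows. This is an illustrative sketch under stated assumptions, not the design of any Hortonworks component: an in-memory SQLite database stands in for the shared database, the table and function names are hypothetical, and access defaults to "allow" unless a tag on the resource denies the role.

```python
import sqlite3


def open_shared_store():
    # In-memory SQLite is a stand-in for a cloud-hosted shared database.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE tags (resource TEXT, tag TEXT);
        CREATE TABLE denies (tag TEXT, role TEXT);
        CREATE TABLE audit (role TEXT, resource TEXT, allowed INTEGER);
    """)
    return db


def tag_resource(db, resource, tag):
    # Attach a classification tag (for example "PII") to a table or column.
    db.execute("INSERT INTO tags VALUES (?, ?)", (resource, tag))


def deny(db, tag, role):
    # Policy: the given role may not access anything carrying this tag.
    db.execute("INSERT INTO denies VALUES (?, ?)", (tag, role))


def check_access(db, role, resource):
    # Deny if any tag on the resource is denied to the role; log every decision.
    row = db.execute(
        """SELECT COUNT(*) FROM tags t JOIN denies d ON d.tag = t.tag
           WHERE t.resource = ? AND d.role = ?""",
        (resource, role),
    ).fetchone()
    allowed = row[0] == 0
    db.execute("INSERT INTO audit VALUES (?, ?, ?)",
               (role, resource, int(allowed)))
    return allowed
```

Because tags, policies, and the audit trail live in the same store, every workload that calls `check_access` sees the same decision and contributes to the same log.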

[Diagram labels: Policies (Classification, Prohibition, Time, Location); Shared Metadata (Streams, Pipelines, Feeds, Tables, Files, Objects).]

Elastic Resource Management in Context of Workload

Workload Management vs. Cluster Management
– Understand resource needs of different workload types
– Add / remove resources to meet workload SLAs
– Manage compute power and high-performance data-access (ex., LLAP)
– Pricing-aware: instances (spot, reserved), data, bandwidth
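A pricing-aware sizing decision of the kind described above might look like the sketch below. It is only an illustration under simple assumptions: work divides evenly across nodes, reserved instances are already paid for and are used first, and any shortfall is filled with spot instances. The function name, parameters, and prices are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Plan:
    reserved: int      # reserved instances to use (already paid for)
    spot: int          # additional spot instances to request
    extra_cost: float  # spot spend for the duration of the SLA window


def plan_capacity(work_units, sla_hours, units_per_node_hour,
                  reserved_nodes, spot_price_per_hour):
    # Smallest node count that finishes the work inside the SLA window.
    needed = math.ceil(work_units / (units_per_node_hour * sla_hours))
    # Use reserved capacity first, then add spot instances for the shortfall.
    reserved = min(needed, reserved_nodes)
    spot = needed - reserved
    return Plan(reserved, spot, spot * spot_price_per_hour * sla_hours)
```

For example, `plan_capacity(1200, 2, 100, 4, 0.25)` needs six nodes to finish in two hours, so it uses the four reserved nodes and adds two spot nodes. A real workload manager would also weigh data and bandwidth charges, which this sketch leaves out.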

Ram Venkatesh, Senior Director of Engineering, Hortonworks

Demo of Cloud Tech Preview: Effectiveness of mobile ad spend (cross-device attribution)

Clickstream ETL BI & Reporting Data Science

Data, Metadata, Security

Cloud Control Plane

Vision: Connected Data Architecture Enables Enterprise Transformations

[Diagram labels: Cloud; Data Center; Data in Motion and Data at Rest (each shown twice); Edge Data (shown twice); Edge Analytics; Stream Analytics; Machine Learning; Deep Historical Analysis.]

Recommended Sessions… Thursday
– Hadoop & Cloud Storage: Object Store Integration in Production
– LLAP: Sub-Second Analytical Queries in Hive
– Zeppelin + Livy: Bringing multi tenancy to interactive data analysis

CHECK OUT HORTONWORKS CLOUD TECH PREVIEW! http://hortonworks.com/news-blogs/

Thank You