Next Generation Big Data Platform at Netflix 2014

Post on 09-Jul-2015

description

Next Generation Big Data Platform at Netflix - Presentation at re:Invent 2014

transcript

November 12, 2014 | Las Vegas, NV

Eva Tse, Netflix

Cloud

apps

Suro Ursula

Cassandra Aegisthus

Dimension Data

Event Data

15 min

Daily

S3

SS Tables

S3

Storage Compute Service Tools

S3 v2.0

Storage Compute Service Tools

• Works well on S3

YARN-1864

YARN-2026

YARN-2012

YARN-2214

YARN-2360

YARN-2540

S3

S3

Tez Plan

Tez Execution Engine

Logical Plan

Physical Plan

MR Plan

MR Execution Engine

MR Compiler Tez Compiler

A Distributed SQL Query Engine for Big Data

techblog.netflix.com

21 committed PRs and 14 PRs in review

S3

v2.0

techblog.netflix.com

S3 v2.0

Storage Compute Service Tools

YARN-1864

YARN-2026

YARN-2012

YARN-2214

YARN-2360

YARN-2540

HIVE-6783

HIVE-6785

HIVE-6938

HIVE-7800

PARQUET-100

PARQUET-106

PARQUET-2

PARQUET-22

PARQUET-70

PARQUET-75

PARQUET-92

PARQUET-99

PIG-3986

Talk     Time               Title
PFC-305  Wednesday, 1:15pm  Embracing Failure: Fault Injection and Service Reliability
BDT-403  Wednesday, 2:15pm  Next Generation Big Data Platform at Netflix
PFC-306  Wednesday, 3:30pm  Performance Tuning EC2
DEV-309  Wednesday, 3:30pm  From Asgard to Zuul, How Netflix’s proven Open Source Tools can accelerate and scale your services
ARC-317  Wednesday, 4:30pm  Maintaining a Resilient Front-Door at Massive Scale
PFC-304  Wednesday, 4:30pm  Effective Inter-process Communications in the Cloud: The Pros and Cons of Micro Services Architectures
ENT-209  Wednesday, 4:30pm  Cloud Migration, Dev-Ops and Distributed Systems
APP-310  Friday, 9:00am     Scheduling using Apache Mesos in the Cloud