MesosCon Europe 2017 Making and keeping Netflix highly available Katharina Probst Engineering Director

MesosCon Europe 2017

Making and keeping Netflix highly available

Katharina Probst, Engineering Director

100+ Million Customers

By the numbers.

100+ Million happy customers.

But sometimes, something goes wrong.

We want

One unhappy customer

25 unhappy customers

100 unhappy customers

Guess how many?

900, and they’re already faceless

By the numbers.

● 100+ million customers
● 125+ million hours watched per day
● 380 microservices in production
● 1000+ device types
● < 10 core SREs

What??

Insights are everything

Mantis overview

Micro-service Clusters Mantis

Stream processing

Cloud native service

● Configurable message delivery guarantees
● Heterogeneous workloads
  ○ Real-time dashboarding, alerting
  ○ Anomaly detection, metric generation
  ○ Interactive exploration of streaming data

Anomaly Detection

Core architectural components

AWS EC2

Apache Mesos

Mantis Framework

Fenzo Scheduler

Optimized for cloud

Scale underlying agent cluster

Fitness criteria, e.g.,
● bin packing
● spreading tasks across EC2 AZs for high availability
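
To make the two fitness criteria above concrete, here is a minimal sketch of how a scheduler might score candidate agents. It is not Fenzo's actual API (Fenzo is a Java library). The `Agent` class, the function names, and the `spread_weight` parameter are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical view of one Mesos agent: its zone and CPU usage."""
    name: str
    zone: str
    total_cpus: float
    used_cpus: float = 0.0


def bin_packing_fitness(agent, task_cpus):
    """Return 0.0 if the task does not fit, else CPU utilization after placement.

    Fuller agents score higher, so tasks get packed onto fewer machines.
    """
    after = agent.used_cpus + task_cpus
    if after > agent.total_cpus:
        return 0.0
    return after / agent.total_cpus


def zone_spread_fitness(agent, tasks_per_zone):
    """Score zones that run fewer tasks of this job higher (1.0 for an empty zone)."""
    return 1.0 / (1.0 + tasks_per_zone.get(agent.zone, 0))


def pick_agent(agents, task_cpus, tasks_per_zone, spread_weight=0.5):
    """Pick the agent with the best weighted mix of both criteria, or None."""
    best, best_score = None, 0.0
    for agent in agents:
        packing = bin_packing_fitness(agent, task_cpus)
        if packing == 0.0:
            continue  # the task does not fit on this agent
        spread = zone_spread_fitness(agent, tasks_per_zone)
        score = (1.0 - spread_weight) * packing + spread_weight * spread
        if score > best_score:
            best, best_score = agent, score
    return best


agents = [
    Agent("a1", "us-east-1a", total_cpus=8, used_cpus=6),
    Agent("a2", "us-east-1b", total_cpus=8, used_cpus=2),
    Agent("a3", "us-east-1a", total_cpus=8, used_cpus=7.5),
]
tasks_per_zone = {"us-east-1a": 2}  # this job already runs two tasks in 1a

packed = pick_agent(agents, 1.0, tasks_per_zone, spread_weight=0.0)
balanced = pick_agent(agents, 1.0, tasks_per_zone, spread_weight=0.5)
```

With `spread_weight=0.0` the sketch packs the task onto the busiest agent that still has room (`a1`). With `spread_weight=0.5` it prefers `a2`, the agent in the zone that runs none of the job's tasks yet. The two criteria pull in opposite directions, which is why the weight is exposed as a parameter.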

Fenzo

Real-time SPS (stream starts per second)

Know something is wrong in seconds, not minutes

Breakdown by region

Breakdown by device type
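
As an illustration of the idea above, the following sketch keeps a sliding window of playback-start events and reports a per-second rate broken down by region or device type. It assumes SPS means stream starts per second and that events arrive in timestamp order. The class, the `dropped` helper, and the 50% threshold are hypothetical and are not taken from Mantis.

```python
from collections import Counter, deque


class StartsPerSecond:
    """Sliding-window rate of start events, assuming in-order timestamps."""

    def __init__(self, window_seconds=10):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, region, device_type)

    def record(self, timestamp, region, device_type):
        self.events.append((timestamp, region, device_type))

    def _evict(self, now):
        # Keep only events in the half-open window (now - window, now].
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()

    def rate_by(self, now, dimension):
        """Return {value: starts per second} for 'region' or 'device_type'."""
        self._evict(now)
        index = {"region": 1, "device_type": 2}[dimension]
        counts = Counter(event[index] for event in self.events)
        return {key: n / self.window_seconds for key, n in counts.items()}


def dropped(current, baseline, threshold=0.5):
    """List keys whose current rate fell below threshold * baseline rate."""
    return sorted(
        key for key, base in baseline.items()
        if current.get(key, 0.0) < threshold * base
    )


sps = StartsPerSecond(window_seconds=10)
for t in range(10):
    sps.record(t, "us-east-1", "tv")
    sps.record(t, "eu-west-1", "phone")
baseline = sps.rate_by(now=9, dimension="region")

# In the next ten seconds only us-east-1 keeps producing starts.
for t in range(10, 20):
    sps.record(t, "us-east-1", "tv")
current = sps.rate_by(now=19, dimension="region")
alerts = dropped(current, baseline)
```

Here `alerts` flags `eu-west-1` within one window of the drop. A short window is what makes detection in seconds rather than minutes possible, at the cost of noisier rates.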

Real-time SPS

Real-time metrics

Autoscaling

18+ million / sec messages at peak
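
The slides do not say how Mantis decides when to scale, so the following is only a sketch of one common approach: size a job's worker count from the observed message rate plus some headroom. The function name, the per-worker capacity of 50,000 messages per second, and the 20% headroom are assumptions made up for the example.

```python
def desired_workers(messages_per_sec, per_worker_capacity,
                    headroom_percent=20, min_workers=1, max_workers=100):
    """Return a worker count that covers the rate plus headroom, within bounds.

    All arguments are integers so the ceiling division below is exact.
    """
    needed_capacity = messages_per_sec * (100 + headroom_percent)
    per_worker = per_worker_capacity * 100
    needed = -(-needed_capacity // per_worker)  # integer ceiling division
    return max(min_workers, min(max_workers, needed))


# Hypothetical capacity of 50,000 messages/sec per worker.
peak = desired_workers(18_000_000, 50_000, max_workers=1000)
idle = desired_workers(0, 50_000)
capped = desired_workers(1_000_000_000, 50_000, max_workers=100)
```

The `min_workers` and `max_workers` bounds keep a job from scaling to zero during a lull or growing without limit during a spike.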

Streaming on demand

Ad-hoc queries

Autoscaling

Almost 1,500 active jobs at peak

Faster detection

Faster insights into causes

Faster mitigation

What does this all mean?

Happier customers!

Contact

Katharina Probst
Director of Engineering, Netflix
www.linkedin.com/in/katharinaprobst

