
Tracking multi-tenant resource usage with "White Elephant"

Date posted: 13-Jan-2015
Upload: adam-faris
Description:
This is a 5-minute lightning talk on why one would want to use "White Elephant" for capacity planning on a Hadoop cluster. The talk was given to the LSPE group, hosted by Yahoo! in Sunnyvale on Sept 19, 2013. http://www.meetup.com/SF-Bay-Area-Large-Scale-Production-Engineering/events/129859402/
Transcript
Page 1: Tracking multi-tenant resource usage with "White Elephant"

Tracking multi-tenant resource usage with "White Elephant"

Adam Faris LinkedIn

Page 2

Why track usage?

Page 3

Job History Logs

– Use Hadoop to process logs
– Creates small file problem for HDFS
– WebHDFS + HAR = “Problem Solver”
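
The small file problem on this slide can be made concrete with a short sketch. It builds a WebHDFS `LISTSTATUS` request URL and counts how many files in the JSON response are smaller than one HDFS block. The host name, port, directory path, the 128 MB threshold, and the sample response are all assumptions for illustration; none of them come from the talk, and this is not how White Elephant itself collects logs.

```python
import json
from urllib.parse import quote


def liststatus_url(host, port, path):
    """Build a WebHDFS LISTSTATUS URL for an HDFS directory."""
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?op=LISTSTATUS"


def count_small_files(liststatus_json, threshold_bytes=128 * 1024 * 1024):
    """Return (small_files, total_files) for a LISTSTATUS JSON response.

    threshold_bytes is an assumed block size; adjust it to the cluster's setting.
    """
    statuses = json.loads(liststatus_json)["FileStatuses"]["FileStatus"]
    files = [s for s in statuses if s["type"] == "FILE"]
    small = [s for s in files if s["length"] < threshold_bytes]
    return len(small), len(files)


# Hypothetical response body in the shape WebHDFS returns for LISTSTATUS.
SAMPLE_RESPONSE = json.dumps({
    "FileStatuses": {
        "FileStatus": [
            {"pathSuffix": "job_0001_conf.xml", "type": "FILE", "length": 48213},
            {"pathSuffix": "job_0001_history", "type": "FILE", "length": 913402},
            {"pathSuffix": "logs.har", "type": "DIRECTORY", "length": 0},
            {"pathSuffix": "merged.log", "type": "FILE", "length": 512 * 1024 * 1024},
        ]
    }
})

small, total = count_small_files(SAMPLE_RESPONSE)
```

Job history directories tend to hold many files of a few kilobytes each, which is the pattern that packing them into a HAR archive is meant to relieve.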

Page 4

– Requirements
– Provides Data Aggregation
– Provides Dashboard
– Open Sourced by LinkedIn Engineering

http://en.wikipedia.org/wiki/White_elephant

Page 5

Failed Tasks

Page 6

Reduce Shuffle Bytes

Page 7

It can do more?

• Total task time
• Total speculative time
• CPU Hours
• Plus more

• Helps determine capacity
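
As a rough illustration of how per-user totals like these feed capacity planning, the sketch below sums task time per user and converts it to hours. The record layout (`user`, `task_seconds`) is hypothetical and chosen only for this example; it is not White Elephant's actual schema or implementation.

```python
from collections import defaultdict


def hours_by_user(job_records):
    """Sum task time per user and convert seconds to hours."""
    totals = defaultdict(float)
    for rec in job_records:
        totals[rec["user"]] += rec["task_seconds"] / 3600.0
    return dict(totals)


# Hypothetical per-job records, e.g. parsed from job history logs.
records = [
    {"user": "alice", "task_seconds": 7200},
    {"user": "bob", "task_seconds": 3600},
    {"user": "alice", "task_seconds": 1800},
]

usage = hours_by_user(records)
```

Comparing totals like these against the hours the cluster can supply over the same period gives a first estimate of which tenants drive demand.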

