Hadoop & Spark Performance tuning using Dr. Elephant

Date post: 06-Jan-2017
Upload: akshay-rai
Transcript
Page 1: Hadoop & Spark Performance tuning using Dr. Elephant

Dr. Elephant

github.com/linkedin/dr-elephant

Akshay Rai

Hadoop Dev Team

Introduction

Scaling Hadoop Infrastructure

Scale and Optimize Hardware

● More users, more jobs, more resources

● Large investment in hardware

● Can’t keep upgrading and adding machines forever to solve the problem

● Some tuning is needed to get things running

Users are more valuable than machines

What do we do?

Improve User Productivity

User Productivity

● Freedom to experiment and run jobs on the cluster

● Build tools to help developers (Hadoop DSL, resolvers for Pig/Hive)

○ Improve developer lifecycle

○ Also reduce unnecessary resource wastage

The Tuning Problem

How easy is it to tune a job?

● Problems are not obvious

● Critical information is scattered

● Inter-related settings

● Large parameter space

Here’s what we learned!

Expert Intervention

● Not enough support resources available

● Poor coverage

● Difficult to prioritize efforts

● Delays user development

Random Suggestions

Training is not at all easy

● Too many users

● Diverse backgrounds

● Scope is large and evolving

● Other responsibilities are more important

Scaling Productivity is Hard!

Dr. Elephant to the Rescue

What does Dr. Elephant do?

● Automated performance monitoring and tuning tool

● Helps every user get the best performance from their jobs

● Highlights common mistakes

● Indicates best practices and tuning tips

● Provides a platform for other performance related tools

● Analyzes a hundred thousand jobs every day

Architecture

Dashboard

Search

Job Page

MapReduce Report

Failed Job

Help Page

Tuning Tips

Awesome Features

Simplified analysis of a flow’s historical executions

● Monitoring performance, resource usage and other metrics

● Comparing flows against previous executions

● Seeing the impact of tuning a specific parameter or changing a line of code

Flow History

Job History

Heuristics

How does a Heuristic work?

● Fetch Counters and Task Data

● Some logic to compute a value

● Compare value against threshold levels
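
The three steps above can be sketched roughly as follows. This is a minimal Python illustration, not Dr. Elephant's own code: the function names, the dict-based task data, the choice of a GC-time ratio as the computed value, and the threshold numbers are all hypothetical.

```python
from enum import IntEnum
from statistics import mean


class Severity(IntEnum):
    NONE = 0
    LOW = 1
    MODERATE = 2
    SEVERE = 3
    CRITICAL = 4


def severity_ascending(value, thresholds):
    """Map a value to a severity, given four ascending limits
    for LOW, MODERATE, SEVERE and CRITICAL (in that order)."""
    severity = Severity.NONE
    levels = (Severity.LOW, Severity.MODERATE, Severity.SEVERE, Severity.CRITICAL)
    for level, limit in zip(levels, thresholds):
        if value >= limit:
            severity = level
    return severity


def gc_ratio_heuristic(tasks, thresholds=(0.01, 0.02, 0.03, 0.04)):
    """Hypothetical heuristic: share of CPU time that tasks spend in GC."""
    # Step 1: fetch counters and task data (here: plain dicts per task).
    gc_ms = mean(t["gc_ms"] for t in tasks)
    cpu_ms = mean(t["cpu_ms"] for t in tasks)
    # Step 2: some logic to compute a value.
    ratio = gc_ms / cpu_ms if cpu_ms else 0.0
    # Step 3: compare the value against the threshold levels.
    return ratio, severity_ascending(ratio, thresholds)


if __name__ == "__main__":
    sample = [{"gc_ms": 20, "cpu_ms": 1000}, {"gc_ms": 30, "cpu_ms": 1000}]
    ratio, severity = gc_ratio_heuristic(sample)
    print(f"ratio={ratio:.3f} severity={severity.name}")
```

Keeping the threshold comparison in its own function means every heuristic can share it and only has to supply its own "compute a value" step.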

Heuristic Severity

Severity   Description
CRITICAL   The job is in a critical state and must be tuned
SEVERE     There is scope for improvement
MODERATE   There is scope for further improvement
LOW        There is scope for a few minor improvements
NONE       The job is safe. No tuning necessary

Example | Mapper Data Skew

Mapper Skew Problem

● The number of Mappers depends on the number of splits

● Splits of varying size can skew the Mapper input
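
One way to put a number on this kind of skew, sketched in Python: sort the per-Mapper input sizes, split them into a smaller and a larger half, and compare the two averages. The function name and the halving rule are assumptions made for illustration and are not necessarily how Dr. Elephant computes its deviation.

```python
from statistics import mean


def skew_deviation(input_bytes):
    """Relative gap between the average input of the larger half of the
    Mappers and that of the smaller half (0.0 means perfectly balanced)."""
    sizes = sorted(input_bytes)
    if len(sizes) < 2:
        return 0.0
    mid = len(sizes) // 2
    small_avg = mean(sizes[:mid])
    large_avg = mean(sizes[mid:])
    if small_avg == 0:
        # Avoid dividing by zero when the smaller half read no data at all.
        return float("inf") if large_avg > 0 else 0.0
    return (large_avg - small_avg) / small_avg


if __name__ == "__main__":
    print(skew_deviation([100, 100, 100, 100]))  # balanced input
    print(skew_deviation([10, 10, 90, 90]))      # skewed input
```

A value like this can then be compared against severity limits in the same way as any other heuristic value.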

Solution to Mapper Skewness

● Each Mapper should process the same amount of data

● Combine the small chunks and feed them to a single Mapper
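
The idea of combining small chunks can be illustrated with a simple greedy packing sketch in Python. The function name and the target size are hypothetical. In a real Hadoop job this packing is normally left to a combining input format such as CombineFileInputFormat rather than written by hand.

```python
def combine_splits(chunk_sizes, target_size):
    """Greedily pack consecutive chunks into groups whose total size stays
    within target_size. A chunk larger than target_size gets its own group."""
    groups = []
    current = []
    current_total = 0
    for size in chunk_sizes:
        # Close the current group if adding this chunk would overflow it.
        if current and current_total + size > target_size:
            groups.append(current)
            current, current_total = [], 0
        current.append(size)
        current_total += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Each inner list is the set of chunks handed to one Mapper.
    print(combine_splits([10, 20, 30, 100, 5, 5], target_size=64))
```

With the sample input, six chunks end up on three Mappers instead of six, and the small chunks no longer each occupy a Mapper of their own.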

Example | Spark Executor Load Balance

[Figure: a Spark Driver with Executors 1 to 3, and an RDD divided into Partitions 1 to 3]

Custom Heuristics

Adding a New Heuristic

1. Create a new heuristic and test it.

2. Create a new view for the heuristic. For example, helpMapperSpill.scala.html

3. Add the details of the heuristic in the HeuristicConf.xml file.

<heuristic>
  <applicationtype>mapreduce</applicationtype>
  <heuristicname>Mapper GC</heuristicname>
  <classname>com.linkedin.dre.mapreduce.heuristics.MapperGC</classname>
  <viewname>views.html.help.mapreduce.helpGC</viewname>
</heuristic>

4. Run Dr. Elephant. It should now include the new heuristic.

Configuring Heuristics/Threshold levels

<heuristics>
  <heuristic>
    <applicationtype>mapreduce</applicationtype>
    <heuristicname>Mapper Data Skew</heuristicname>
    <classname>com.linkedin.dre.mapreduce.heuristics.MapperDataSkew</classname>
    <viewname>views.html.help.mapreduce.helpMapperDataSkew</viewname>
    <params>
      <num_tasks_severity>10, 50, 100, 200</num_tasks_severity>
      <deviation_severity>2, 4, 8, 16</deviation_severity>
      <files_severity>1/8, 1/4, 1/2, 1</files_severity>
    </params>
  </heuristic>
</heuristics>
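
To show how such a configuration might be consumed, here is a small Python sketch that parses the params above with the standard library. It assumes that each comma-separated list holds four ascending limits for LOW, MODERATE, SEVERE and CRITICAL; that reading, along with the function names, is an assumption made for illustration and is not taken from Dr. Elephant's source.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

# Trimmed copy of the configuration shown above.
CONF = """
<heuristics>
  <heuristic>
    <applicationtype>mapreduce</applicationtype>
    <heuristicname>Mapper Data Skew</heuristicname>
    <params>
      <num_tasks_severity>10, 50, 100, 200</num_tasks_severity>
      <deviation_severity>2, 4, 8, 16</deviation_severity>
      <files_severity>1/8, 1/4, 1/2, 1</files_severity>
    </params>
  </heuristic>
</heuristics>
"""

# Assumed order of the four limits in each parameter.
LEVELS = ("LOW", "MODERATE", "SEVERE", "CRITICAL")


def load_thresholds(xml_text):
    """Return {heuristic name: {param name: [four numeric limits]}}."""
    result = {}
    for heuristic in ET.fromstring(xml_text.strip()).iter("heuristic"):
        name = heuristic.findtext("heuristicname")
        params = heuristic.find("params")
        children = list(params) if params is not None else []
        result[name] = {
            # Fraction handles both plain numbers ("16") and ratios ("1/8").
            p.tag: [float(Fraction(v.strip())) for v in p.text.split(",")]
            for p in children
        }
    return result


def severity_for(value, limits):
    """Highest level whose limit the value reaches, or NONE."""
    reached = [level for level, limit in zip(LEVELS, limits) if value >= limit]
    return reached[-1] if reached else "NONE"


if __name__ == "__main__":
    limits = load_thresholds(CONF)["Mapper Data Skew"]["deviation_severity"]
    print(limits, severity_for(5, limits))
```

Keeping the limits in configuration rather than in code lets each cluster tune how strict a heuristic is without rebuilding anything.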

Elephagent

Workflow monitoring and reports

● Performance characteristics change

○ Data Growth

○ Data distribution change

○ Hardware change

○ Incremental software change

● Monitor performance on each execution

● Compare behaviour across revisions

● Cost to Serve analysis

Production Reviews | JIRA Bot

● Separate cluster for critical workloads

● Audit before deployment

● Improved accuracy

● Faster turnaround

● Higher throughput

Future Plans

Upcoming

● Job Resource Usage and Wastage

● Job Wait time

● Real time analysis of a job

● Workflow DAG visualization

● Improved Spark heuristics

Thank You

©2014 LinkedIn Corporation. All Rights Reserved.
