Page 1: HiBench - Home: SPEC Research Group...Jul 22, 2015  · HiBench. Lv, Qi (qi.lv@intel.com) July 22, 2015. the cross platforms micro-benchmark suite for big data

HiBench

Lv, Qi (qi.lv@intel.com), July 22, 2015

The cross-platform micro-benchmark suite for big data

Page 2: About Us

• Closely partnered with large web sites and ISVs on better user experiences
• Key contributions for better customer adoption, e.g., usability, scalability, and performance
• More utilities to improve stability and scalability
  • HiMeter: the lightweight, workflow-based big data performance analysis tool

Page 3: Agenda

• WHY: Why do we need big data benchmarking systems?
• WHAT: What is HiBench?
• HOW: How to use HiBench?

Page 4: Big data ecosystem is complex

• Hadoop: MR1, MR2
• Spark: Scala, Java, Python
• Deployment: Standalone, YARN
• Application: SQL, Machine Learning, GraphX

Page 5: Frequent Questions from our Partners

• Which framework is better?
  • Hadoop MR1 / MR2
  • Spark Scala / Java / Python
  • Standalone / YARN
• How many resources are needed?
  • CPU cores, memory, network bandwidth
• Is the cluster configured properly?
  • Executor number, partition number tuning

Page 6: Meet HiBench

• Micro-bench oriented
• Summarized from real applications
• Regression test
• Reputation: AMP Lab, Yahoo, IBM, Pivotal

Page 7: First Glance of HiBench

• Core: Sort, wordcount, terasort, Sleep
• MLLib: KMeans, Bayes
• Graphx: Pagerank
• SQL: Aggregation, Join, Scan
• Streaming: Identify, grep, wordcount, project, …

Page 8: HiBench RoadMap

• HiBench 1.0 (2012.6): initial release
• HiBench 2.0 (2013.9): CDH, Hadoop 2 support
• HiBench 3.0 (2014.10): YARN support, SparkBench
• HiBench 4.0 (2015.3): workload abstraction framework
• HiBench 5.0 (2015.8): StreamingBench

Page 9: Key Features

• Workload abstraction
  • Typical workloads in classic application domains
  • Micro-benchmark-oriented workloads
• Comparison between frameworks & configurations
  • MR1 / MR2, standalone / YARN
  • Sequence / text, compression options / disabled
• Scalable configuration
  • Global configuration for different scales
  • Dedicated configuration for individual workloads
• Metrics
  • Duration
  • Throughput, throughput per node
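The three metrics listed above can be related with a small worked example. This is only a sketch: the slide does not define how throughput is computed, so the formula here (input bytes divided by duration, then divided evenly across nodes) is an assumption, and the class and field names are illustrative rather than HiBench's own.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """One benchmark run; all names here are illustrative, not HiBench's."""
    workload: str
    input_bytes: int    # size of the input data set
    duration_s: float   # wall-clock duration of the run, in seconds
    nodes: int          # number of nodes in the cluster

    @property
    def throughput(self) -> float:
        """Assumed definition: bytes processed per second, cluster-wide."""
        return self.input_bytes / self.duration_s

    @property
    def throughput_per_node(self) -> float:
        """Cluster throughput divided evenly across the nodes."""
        return self.throughput / self.nodes


# Made-up numbers for a 4-node cluster.
run = RunResult("wordcount", input_bytes=4_000_000_000, duration_s=80.0, nodes=4)
print(f"{run.workload}: {run.throughput:.0f} B/s, "
      f"{run.throughput_per_node:.0f} B/s per node")
```

Reporting throughput per node alongside raw duration makes runs on clusters of different sizes easier to compare.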

Page 10: Showcasing how to explore the answer

• Cluster configuration
  • E5-2697 @ 2.7G, 24C48T
  • Memory: 192 GB
  • Disks: 8 SSDs
  • Network: 10 GbE
  • Node size: 4
• Software stack
  • Spark: master (1.3.0-SNAPSHOT)
  • Hadoop 1.0.4 (MR1) / CDH5.3 (MR2)
  • JDK: oracle-1.8.0_25

Page 11: Comparison of language APIs (Spark)

Page 12: MR1 vs MR2 (CDH5.3)

Page 13: Impact of network bandwidth

Page 14: Impact of network bandwidth

Page 15: Data volume scalability, Spark/Scala

Page 16: Data volume scalability, Spark/Java

Page 17: Data volume scalability, Spark/Python

Page 18: Q & A

Available at: https://github.com/intel-hadoop/HiBench

Page 19: Backup

Page 20: Data volume scalability – Hadoop 1

Page 21: Report configuration example

• All configurations are classified accordingly
• Some configurations are automatically probed and generated

Page 22: Troubleshooting

• Configuration issue
  • Check the configuration parsing sequence to confirm your configuration is parsed properly

Page 23: Troubleshooting (2)

• Pay attention to messages highlighted in yellow and red
  • Yellow: warning
  • Red: error
• If you suspect a configuration issue, check report/<workload>/<language api>/conf/sparkbench/sparkbench.conf to confirm it.
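The configuration check described above could be scripted roughly as follows. This is a sketch under stated assumptions: the file format (one whitespace-separated key and value per line, with `#` comments) is assumed rather than taken from the source, the values `wordcount` and `python` merely fill the `<workload>` and `<language api>` placeholders for illustration, and the keys in the sample text are hypothetical.

```python
from pathlib import Path


def conf_path(report_dir: str, workload: str, language_api: str) -> Path:
    """Fill in the path template quoted on the slide."""
    return (Path(report_dir) / workload / language_api
            / "conf" / "sparkbench" / "sparkbench.conf")


def parse_conf(text: str) -> dict:
    """Parse 'key value' lines, skipping blanks and '#' comments.

    The line format is an assumption, not a documented specification.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


# Hypothetical keys, for illustration only.
SAMPLE_CONF = """# generated configuration
example.executor.count    8
example.partition.count   96
"""

print(conf_path("report", "wordcount", "python").as_posix())
for key, value in sorted(parse_conf(SAMPLE_CONF).items()):
    print(f"{key} = {value}")
```

In practice the text would come from `conf_path(...).read_text()`, and the printed values would be compared against what was intended.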

Page 24: System utilization chart

Charts:

• CPU chart
  • Sys / User / IOwait / Others = nice + irq + softirq
• Network chart
  • Recv, send bytes
  • Recv, send packets
  • Errors = send_err + recv_err + send_drop + recv_drop
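The two formulas above can be illustrated with Linux counters. The sketch below assumes the values come from `/proc/stat` and `/proc/net/dev`, which the slide does not state; the function names are illustrative and the sample lines contain made-up numbers.

```python
def cpu_breakdown(stat_line: str) -> dict:
    """Split one 'cpu ...' line from /proc/stat into the chart's categories."""
    fields = stat_line.split()
    # Linux field order: user nice system idle iowait irq softirq ...
    user, nice, system, idle, iowait, irq, softirq = map(int, fields[1:8])
    return {
        "sys": system,
        "user": user,
        "iowait": iowait,
        "others": nice + irq + softirq,  # grouping taken from the slide
        "idle": idle,
    }


def net_errors(dev_line: str) -> int:
    """Sum error and drop counters for one interface line of /proc/net/dev."""
    _, counters = dev_line.split(":", 1)
    c = [int(x) for x in counters.split()]
    recv_err, recv_drop = c[2], c[3]    # receive: bytes packets errs drop ...
    send_err, send_drop = c[10], c[11]  # transmit: bytes packets errs drop ...
    return send_err + recv_err + send_drop + recv_drop


# Sample lines with made-up numbers.
SAMPLE_STAT = "cpu  1000 20 300 5000 40 5 15 0 0 0"
SAMPLE_DEV = "eth0: 9000 80 1 2 0 0 0 0 7000 60 3 4 0 0 0 0"

print(cpu_breakdown(SAMPLE_STAT))
print(net_errors(SAMPLE_DEV))
```

These kernel counters are cumulative since boot, so a chart over time would normally plot the difference between two successive samples.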

Page 25: System utilization chart (2)

Charts:

• Disk chart
  • Read, write bytes
  • Read, write IOPS
• Memory chart
  • Used, buffer/cache, free
• System load chart
  • Load5/10/15
  • Running processes
  • Total number of processes (including threads)
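The memory and system load series above could be derived on Linux roughly as follows. This is a sketch, not the tool's actual collector: it assumes the data comes from `/proc/meminfo` and `/proc/loadavg`, and it assumes "used" excludes buffers and page cache. Linux reports load averages over 1, 5, and 15 minutes; how these map onto the slide's Load5/10/15 labels is not stated in the source.

```python
def memory_breakdown(meminfo_text: str) -> dict:
    """Derive used / buffer+cache / free (in kB) from /proc/meminfo content."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            values[key.strip()] = int(rest.split()[0])
    buffer_cache = values["Buffers"] + values["Cached"]
    free = values["MemFree"]
    # Assumption: "used" excludes buffers and page cache.
    used = values["MemTotal"] - free - buffer_cache
    return {"used": used, "buffer_cache": buffer_cache, "free": free}


def load_breakdown(loadavg_text: str) -> dict:
    """Parse /proc/loadavg: three load averages, then running/total entities."""
    parts = loadavg_text.split()
    running, total = parts[3].split("/")
    return {
        "load_averages": tuple(float(x) for x in parts[:3]),
        "running": int(running),
        "total": int(total),  # the kernel counts threads as well as processes
    }


# Sample contents with made-up numbers.
SAMPLE_MEMINFO = """MemTotal:       16000 kB
MemFree:         4000 kB
Buffers:         1000 kB
Cached:          3000 kB"""
SAMPLE_LOADAVG = "0.50 0.40 0.30 2/345 12345"

print(memory_breakdown(SAMPLE_MEMINFO))
print(load_breakdown(SAMPLE_LOADAVG))
```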

