Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Date post: 12-Apr-2017
Upload: spark-summit
© 2017 – All Rights Reserved
Transcript
Page 1: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Page 2: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Acceleration of Generic SPARK Workloads via a “Sea of Cores”

Scalable Compute Fabric

Paul Master, CTO

Page 3: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Using “Linear Increases in Performance” to “Process Exponentially More Data”

Year to Year Trends

Page 4: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Growth of Data

Exponential

Page 5: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Transistors Per Chip

Exponential

Page 6: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

35 Years of Microprocessor Trend Data

Page 7: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

CPU Performance

Flat

Page 8: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

The Workloads Have Changed

• The run-time characteristics of these two types of workloads are NOT the same.

• So let’s look at a non-traditional processor architecture that is tuned for Big Data / ML workloads.

Page 9: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

So, the Architecture Has to Change

Intel Haswell CPU: 200M transistors per core

“Sea of Cores”: 150-200k transistors per core
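
As a back-of-the-envelope check on the two per-core figures above, the sketch below works out how many small cores fit in the transistor budget of one large core. It uses only the numbers quoted on the slide; the variable and function names are illustrative, and the calculation deliberately ignores interconnect, caches, and I/O, which also consume transistors.

```python
# Per-core transistor budgets as quoted on the slide.
HASWELL_TRANSISTORS_PER_CORE = 200_000_000
SMALL_CORE_TRANSISTORS_LOW = 150_000   # lower end of the quoted range
SMALL_CORE_TRANSISTORS_HIGH = 200_000  # upper end of the quoted range


def small_cores_per_large_core(large_core: int, small_core: int) -> int:
    """Whole small cores that fit in one large core's transistor budget."""
    return large_core // small_core


# Larger small cores -> fewer of them fit; smaller ones -> more fit.
fewest_cores = small_cores_per_large_core(
    HASWELL_TRANSISTORS_PER_CORE, SMALL_CORE_TRANSISTORS_HIGH
)
most_cores = small_cores_per_large_core(
    HASWELL_TRANSISTORS_PER_CORE, SMALL_CORE_TRANSISTORS_LOW
)

print(f"Roughly {fewest_cores} to {most_cores} small cores per large core")
```

On transistor count alone, the quoted figures put the ratio at about three orders of magnitude; how much of that survives once shared resources are added is a separate question.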

Page 10: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Hardware:

• At 22nm one can fit 52 ARM A7’s in the space of 1 Intel Haswell core

• What interconnect(s), what type/size of caches, what coherency, what I/O….

Software:

• We need a way of parallelizing workloads to run across many, many small cores instead of a few large cores

• It’s got to be software acceleration (<CR> and go)

• We need a large “software” base of real world applications that matter

Oh Wait! We have

Page 11: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Usage Model for Dense Computational Fabric

Page 12: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Performance comparison between a standard dual-socket 1U server (16 cores) and a 1U server with a TruStream Dense Computational Fabric (1000 cores)

Live Demonstration: Yahoo Streaming Benchmark

Page 13: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

What is it?

Yahoo Streaming Benchmark Measures Real-Time Mobile Advertising performance

Executive Summary… “Due to a lack of real-world streaming benchmarks, we developed one to compare Apache Flink, Apache Storm and Apache Spark Streaming.”
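
To give a feel for the kind of work such an advertising benchmark exercises, here is a minimal single-process sketch, assuming the commonly described pipeline: keep only "view" events, map each ad to its campaign, and count views per campaign in fixed time windows. The field names, the 10-second window, and the in-memory lookup table are assumptions made for illustration; the actual benchmark reads events from Kafka, uses Redis for the lookup and results, and its Spark version is written in Scala.

```python
from collections import Counter

WINDOW_SECONDS = 10  # assumed window length for this sketch


def count_views_per_campaign_window(events, ad_to_campaign):
    """Count 'view' events per (campaign, window start) pair."""
    counts = Counter()
    for event in events:
        # Step 1: filter out everything that is not an ad view.
        if event["event_type"] != "view":
            continue
        # Step 2: join the ad id against the ad -> campaign table.
        campaign = ad_to_campaign.get(event["ad_id"])
        if campaign is None:
            continue  # unknown ad: skipped in this sketch
        # Step 3: bucket the event time into a fixed-size window.
        window_start = (event["event_time"] // WINDOW_SECONDS) * WINDOW_SECONDS
        counts[(campaign, window_start)] += 1
    return counts


# Hypothetical sample data, not taken from the benchmark.
ad_to_campaign = {"ad-1": "campaign-A", "ad-2": "campaign-B"}
events = [
    {"ad_id": "ad-1", "event_type": "view", "event_time": 3},
    {"ad_id": "ad-1", "event_type": "click", "event_time": 4},
    {"ad_id": "ad-2", "event_type": "view", "event_time": 12},
    {"ad_id": "ad-1", "event_type": "view", "event_time": 9},
    {"ad_id": "ad-3", "event_type": "view", "event_time": 5},
]
sample_counts = count_views_per_campaign_window(events, ad_to_campaign)
print(dict(sample_counts))
```

Each event is handled independently until the final count, which is why this style of workload spreads naturally across many small cores.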

Page 14: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

“Mobile ad revenue was about 84% of total ad revenue, an increase from about 76% of total ad revenue in the year-ago quarter.

That means Facebook Inc. (NASDAQ:FB) had about $5.42 billion on mobile ad revenue, which was well ahead of the StreetAccount average of $4.84 billion.”

http://www.valuewalk.com/2016/07/facebook-inc-fb-earnings-beat/

Page 15: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

• Our TruStream™ Compute Fabric, a dense computational fabric consisting of a non-von Neumann “Sea of Cores”, produced a stepwise increase in Yahoo Streaming Benchmark performance.

• Greater than 40x speedup. This is the fastest Yahoo Streaming Benchmark result reported to date.

• This was done by transparently accelerating the generic Scala code within the Spark framework.
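
For a rough sense of scale, the sketch below relates the reported speedup to the core counts quoted earlier for the two 1U servers (16 cores versus 1000 cores). It assumes, without confirmation from the slides, that the greater-than-40x result was measured in that configuration; the speedup is treated as a lower bound and the names are illustrative.

```python
# Figures quoted in the talk; pairing them is an assumption of this sketch.
STANDARD_SERVER_CORES = 16
FABRIC_SERVER_CORES = 1000
REPORTED_SPEEDUP_LOWER_BOUND = 40.0  # "greater than 40x"

# How many more cores the fabric server has.
core_ratio = FABRIC_SERVER_CORES / STANDARD_SERVER_CORES

# Throughput of one fabric core relative to one conventional core,
# at the lower bound of the reported speedup.
relative_per_core_throughput = REPORTED_SPEEDUP_LOWER_BOUND / core_ratio

print(f"Core ratio: {core_ratio:.1f}x")
print(f"Per-core throughput ratio: at least {relative_per_core_throughput:.2f}")
```

Under that assumption, each small core would need to deliver at least about two thirds of a conventional core's throughput on this workload, despite a far smaller transistor budget.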

Stepwise Performance Increase

Page 16: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Growth of Data

Exponential 2017

Page 17: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Transistors Per Chip

Exponential 2017

Page 18: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

CORNAMI’s “Sea of Cores” Solution

Flat, Exponential 2017

Page 19: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

Thank You!

Paul Master

Page 20: Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master

RESULTS:

