
Spark Summit 2015 keynote: Making Big Data Simple with Spark

Date post: 06-Aug-2015
Upload: databricks
Transcript

Making Big Data Simple with Spark

Ion Stoica and Ali Ghodsi June 15, 2015

Alleviating Data Scientist Scarcity Challenge

More than 5,000 people trained over the past year

“Intro to Big Data with Apache Spark”
•  Anthony Joseph, UC Berkeley
•  Started June 1st, over 64K registered students

“Scalable Machine Learning”
•  Ameet Talwalkar, UCLA
•  To start July 5th, over 26K registered students

Spark Significantly Simplifies Big Data Processing

Fast • Expressive • General

•  Spark Core: Python, Java, Scala, R
•  Spark Streaming: real-time
•  Spark SQL: interactive
•  MLlib: machine learning
•  GraphX: graph
•  …

But Big Data Processing Remains Complex...

•  Still need to set up and manage your own Spark cluster
•  Still more complex to operate than existing single-node tools (R, Python)

Databricks Truly Makes Big Data Simple

A hosted end-to-end platform from ingest to production

Cluster Manager

Jobs • Notebooks • Third-Party Apps • Dashboards

Databricks: The Journey Thus Far

June 2014: Unveiling
•  Over 3,500 sign-ups

November 2014: Limited Availability

Today
•  Over 150 organizations using Databricks

What can Databricks and Spark do for organizations?

•  Better products: Update customers’ databases weekly instead of monthly
•  Faster time to market: Create new products in 3 weeks rather than 2 months
•  Democratize data access within enterprises: Increase number of data analysts by 4x and number of data projects by 6x

General Availability starting today!

www.databricks.com

Key Areas of Focus

1. Ease of use: Increase user productivity
2. Integration with existing (small and big) data tools: Make non-Spark experts instantly productive
3. Security: Enable mission-critical applications

Ease of Use

•  Cluster manager with multiple Spark versions
•  From notebooks to dashboards and jobs with just a few clicks
•  Launch and monitor jobs, including streaming

Notebooks • Dashboards • Jobs

Integration

•  Best-of-breed apps
•  Versioning
•  R Notebooks

Security

•  Run in your own Amazon account
•  Access Control Lists
•  Encryption at rest

Demo

