
Databricks: Exploring all the ways to analyze data with Spark – Couchbase Connect 2016

Date post: 15-Feb-2017
Upload: couchbase
Distributed Analytics with Apache Spark and Couchbase Jason Pohl (Databricks) Michael Nitschinger (Couchbase)
Transcript
Page 1: Databricks: Exploring all the ways to analyze data with Spark – Couchbase Connect 2016

Distributed Analytics with Apache Spark and Couchbase

Jason Pohl (Databricks)

Michael Nitschinger (Couchbase)

Page 2

Who is Databricks?

WHY US

• Creators of Apache Spark. Contribute 75% of the code, 10x more than others

• Trained 20K Spark users

• Largest number of customers deploying Spark (300+)

OUR PRODUCT

• Just-in-Time Data Platform

• Empower your organization to swiftly build and deploy advanced analytics

Page 3

Apache Spark: an open source data processing engine built around speed, ease of use, and sophisticated analytics

The largest open source data project, with 1000+ contributors

Page 4

UNIFIED ENGINE ACROSS DIVERSE WORKLOADS & ENVIRONMENTS

Scale out, fault tolerant

Python, Java, Scala, and R APIs

Standard libraries

APACHE SPARK ENGINE

Page 5

ANALOGY: EVOLUTION OF CONSUMER ELECTRONICS

First Cellular Phones

Specialized Devices

Unified Device

Page 6

HISTORY REPEATS: FASTER, EASIER TO USE, UNIFIED

First Distributed Processing Engine

Specialized Data Processing Engines

Unified Data Processing Engine

Page 7

Google Trends: Hadoop vs. Spark

Page 8

MAJOR FEATURES IN SPARK 2.0

• Performance: Tungsten Phase 2, speedups of 5-20x

• Structured Streaming Engine

• SQL 2003 & Machine Learning
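
The Structured Streaming model in Spark 2.0 treats a stream as a table that keeps growing, and updates query results incrementally as new rows arrive. A minimal plain-Python sketch of that idea follows. It does not use Spark's API, and `run_incremental_counts` and the event names are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def run_incremental_counts(micro_batches: Iterable[List[str]]) -> List[Dict[str, int]]:
    """Return a snapshot of the running count table after each micro-batch."""
    running: Counter = Counter()   # state carried across batches
    snapshots: List[Dict[str, int]] = []
    for batch in micro_batches:
        running.update(batch)      # fold only the newly arrived rows into the state
        snapshots.append(dict(running))
    return snapshots


# Hypothetical event stream, split into three micro-batches.
batches = [["click", "view"], ["click"], ["view", "view", "buy"]]
snapshots = run_incremental_counts(batches)
for i, snap in enumerate(snapshots):
    print(i, sorted(snap.items()))
```

Each snapshot is the full, up-to-date result, yet only the new batch is processed at each step. The older rows are never rescanned, which is the point of the incremental model.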

Page 9

Couchbase (Storage) + Apache Spark (Processing)

• Recommendations
• Next gen data warehousing
• Predictive analytics
• Fraud detection

• Catalog
• Customer 360 + IoT
• Personalization
• Mobile applications

Page 10

Couchbase (Operations) + Apache Spark (Analysis)

• Recommendations
• Next gen data warehousing
• Predictive analytics
• Fraud detection

• Catalog
• Customer 360 + IoT
• Personalization
• Mobile applications

Page 11

COUCHBASE SPARK CONNECTOR 2.0

Spark 2.0 Support

• Structured Streaming

Efficiency

• Improved DCP handling

• Memory allocation creates less garbage

Easier Management

• Tolerates Couchbase cluster topology changes (e.g. add nodes & rebalance)

• … except rollbacks
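
A rollback, roughly, is the case where a consumer has already seen changes that the server no longer has (for example after a failover) and must rewind to an earlier sequence number. The sketch below is a deliberately simplified plain-Python model of that situation, not the DCP protocol or the connector's code. Real DCP decides rollbacks from per-partition failover logs, and `plan_resume` and `apply_rollback` are hypothetical names.

```python
from typing import List, Tuple


def plan_resume(consumer_seqno: int, producer_high_seqno: int) -> Tuple[str, int]:
    """Decide how a consumer continues after reconnecting (simplified model)."""
    if consumer_seqno > producer_high_seqno:
        # The consumer saw mutations the producer no longer has: it must rewind.
        return ("rollback", producer_high_seqno)
    # Histories agree up to the consumer's position: continue from there.
    return ("resume", consumer_seqno)


def apply_rollback(applied: List[Tuple[int, str]], rollback_to: int) -> List[Tuple[int, str]]:
    """Drop every locally applied mutation whose seqno is above the rollback point."""
    return [(seq, key) for seq, key in applied if seq <= rollback_to]


# Hypothetical local state: (sequence number, document key) pairs already applied.
applied = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
action, seqno = plan_resume(consumer_seqno=4, producer_high_seqno=2)
if action == "rollback":
    applied = apply_rollback(applied, seqno)
print(action, seqno, applied)
```

Seen this way, a rollback forces the consumer to undo work it has already done, which is harder to handle transparently than a topology change where the stream simply continues.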

Page 12

Demo

Page 13

HADOOP / DATA LAKES

DATABRICKS JUST-IN-TIME DATA PLATFORM

Page 14

Build a PoC on Databricks today.

Professional services and training also available.

Contact [email protected]

Sign up for a trial at https://databricks.com/try-databricks

