Self-Service Data Science for Leveraging ML & AI on All of Your Data

Date post: 21-Jan-2018
Upload: mapr-data-technologies
Transcript
Page 1: Self-Service Data Science for Leveraging ML & AI on All of Your Data

© 2017 MapR Technologies, MapR Confidential

Self-Service Data Science for Leveraging ML & AI on All of Your Data: Introducing the MapR Data Science Refinery

Rachel Silver, Product Manager, Data Science & Analytics

11/16/17

Page 2: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Summary

• Why Companies Invest In ML/AI

• Winning With a Data First Approach

• Introducing the MapR Data Science Refinery

• Deep Dive & Demos

– Ease of Deployment

– Data Exploration

– Extensibility & Collaboration

Page 3: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Why Companies Invest In ML/AI

Page 4: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Where AI Creates Value In The Value Chain

• Project: Smarter R&D and forecasting
• Produce: Optimized production & maintenance
• Promote: Targeted sales & marketing
• Provide: Rich, personal, and convenient user experiences

Source: McKinsey Global Institute, Artificial Intelligence / The Next Digital Frontier? (2017)

Page 5: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Project Where The Next Threat Will Come From
Deep security analytics and advanced persistent threat (APT) detection

Retail Bank

• Centralization and visibility of all data from an information security perspective
• Reduced risk of data breaches from DDoS and APT attacks
• Real-time insights into what is happening within the environment

OBJECTIVE
• Early detection of data breaches and suspicious activity
• Aggregate and retain all security-related data in a single central store, then build statistical models to detect abnormal activity within the environment
• Get insights into what insiders are doing within the environment

CHALLENGES
• Existing SIEM solution could not scale
• Current solutions do not work well for “unknown” threats

SOLUTION
• Leverage MapR-DB for fast data ingestion and query performance
• MapR provided the deep storage and machine learning algorithms
• NFS enabled easy integration with the IT ecosystem

Page 6: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Produce More Efficiently
ML aggregation and processing at the edge optimizes production

Oil & Gas company

[Diagram: 1000s of oil & drill sources (Source 1, Source 2, ... Source 1000) feed the MapR core cluster in Houston]

• Before Edge: manual process; time to insight 48 hrs
• After Edge: automated process; pre-processing is done locally and at the core (custom app + down sampling); time to insight <2 hrs

Page 7: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Promote Personalized Offers in Real Time
Targeting credit card customers using a recommendation engine

Leading Credit Card Company

A global financial services company wanted to offer real-time, localized, and personalized recommendations to its credit card holders using ML/AI.

OBJECTIVE
• Increase revenue and customer loyalty through real-time personalized offers generated by a recommendation engine

CHALLENGES
• In order to be accurate, data had to be updated on a real-time basis
• Being a global company, their platform has to be consistent and 100% available 24x7, with no downtime
• Must be able to simultaneously ingest (stream) and update data in the same cluster

SOLUTION
• MapR was the only distribution that met the mission-critical needs of the customer and also provided the capability to ingest data continuously into the cluster
• Direct NFS allows data to be continuously ingested directly into their cluster
• MapR-XD’s self-healing capability allowed them to go into production safely

Page 8: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Provide Customers With a Customized Experience
Provide customers with a personalized and convenient experience

Using ML/AI to bring customer understanding to the center of business processes

OBJECTIVE
• Use full knowledge of the customer relationship to inform online interactions

CHALLENGES
• Need to store 20 trillion records
• Training sample size is 400 million records
• The decision trees contained 2 million possible pathways
• Every combination must be evaluated every time a model is used (~15 billion combinations)

SOLUTION
• The MapR Converged Data Platform centralizes analytics and operational apps on one platform, allowing Quantium to make one large infrastructure investment instead of many small siloed ones. The current cluster has 50 TB of memory and 5,000 CPUs to process and store 5 PB of data.

Page 9: Self-Service Data Science for Leveraging ML & AI on All of Your Data

A Winning Approach: Data First

Page 10: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Entry Points in the Data Science Journey

• Contemplators (40%): Uncertain about the benefits of Data Science. Desire easy entry.
• Experimenters (41%): Gartner estimates they solve between 3-20 business problems in three to five years.
• Adopters (20%): Gartner estimates they solve between 10-100 business problems in three to five years.

Source: McKinsey Global Institute, Artificial Intelligence / The Next Digital Frontier? (2017)
Source: Gartner, Magic Quadrant for Data Science Platforms (2017)

Page 11: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Entry Points in the Data Science Journey

• Contemplators: 40%
• Experimenters: 41%
• Adopters: 20%

80%!

Source: McKinsey Global Institute, Artificial Intelligence / The Next Digital Frontier? (2017)
Source: Gartner, Magic Quadrant for Data Science Platforms (2017)

Page 12: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Entry Points in the Data Science Journey

• Contemplators: 40%
• Experimenters: 41%
• Adopters: 20%

• Investment in AI is growing at a high rate, but adoption in 2017 remains low
• AI adoption outside of the tech sector is stuck here, and many firms report they are uncertain of the ROI
• AI is only deployed into production 12% of the time

Source: McKinsey Global Institute, Artificial Intelligence / The Next Digital Frontier? (2017)

Page 13: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Entry Points in the Data Science Journey

• Contemplators: 40%
• Experimenters: 41%
• Adopters: 20%

Key Traits Of A Successful Data Science Approach
• Seamless Data Access
• Technical Capabilities (a strong digital foundation)
• Leadership From The Top

Source: McKinsey Global Institute, Artificial Intelligence / The Next Digital Frontier? (2017)

Page 14: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Seamless Data Access

If it is ALL about the data, then it better be about ALL your data.

Page 15: Self-Service Data Science for Leveraging ML & AI on All of Your Data

ML Models Improve when Trained on Larger Datasets

Instead of relying on assumptions and weak correlations, the presence of more data results in better and more accurate models.

Source: A Survey of Applications of AI Algorithms in Eco-environmental Modelling (2009)

Page 16: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Data Growth Puts A Premium on Efficient Leverage

The amount of data is predicted to double every three years.

[Chart: data volume over time. Labels recovered from the slide: years 1986, 2011, 2016, 2019; volumes 3 Exabytes, 300 Exabytes, 2 Zettabytes, 4 Zettabytes of data]

Data Diversity
Emails, Call Detail Records, Clickstream, CSV Data, Documents, PDF, Billing Data, Metadata, JSON, Network Data, Mobile Data, XML, Product Catalog, Medical Records, Text Files, Video, Text Messages, Merchant Listings, Sensor Data, Server Logs, Set Top Box, Social Media, Audio

Source: McKinsey Global Institute: “The Age of Analytics”, Dec. 2016

Page 17: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Hadoop + Vendor Approach to Data Science
Requires yet another cluster

[Diagram: separate on-premises clusters: Batch Cluster, Streaming Cluster, NoSQL Cluster, and a Data Science cluster]

Page 19: Self-Service Data Science for Leveraging ML & AI on All of Your Data

A Capable Platform With a Strong Digital Foundation

[Diagram: MAPR CONVERGED DATA PLATFORM (on-premises, multi-cloud, IoT edge). Labels recovered from the slide:]
• Interfaces: NFS, POSIX, REST, HDFS, JSON, HBASE, KAFKA, SQL
• Apps: custom file apps, Hadoop & Spark apps, real-time BI apps, streaming apps, IoT/edge
• File store, container store, metadata management
• Operational data hub, CDC
• Contextual user experiences, core business apps, single view, IoT

Page 20: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Real-time Machine Learning Pipelines
A Robust Microservices Framework

Event Streams
• Persistent
• Infinitely replicable
• Re-playable

Compare model results live!

[Diagram: Model A and Model B; persistent client & application containers]

Page 21: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Advice For Leadership

Avoid
• Creating new silos
• Looking for a one-trick pony
• Adopting tools that have unwieldy install, integration, and configuration processes
• Tools that don’t scale to broader enterprise use

Important
• Ensure secure role-based access to all data
• Adopt tools that meet the needs of a broad range of Data Science Teams
• Encourage adoption by making things easy, secure, and complete

Page 22: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Data Science @ MapR

Page 23: Self-Service Data Science for Leveraging ML & AI on All of Your Data

The MapR Data Science Vision
A Holistic Approach To Self-Service Data Science

MAPR DATA SCIENCE REFINERY
An easy-to-deploy, secure, and extensible data science offering that leverages all existing platform assets

REFINERY DATA SCIENTISTS
Data Scientist-led product-and-services offerings including Quick Start Solutions (QSS) & Training

REFINERY PARTNERSHIPS
Expand on what we offer in-product to meet the needs of all data science teams

MAPR CONVERGED DATA PLATFORM

Page 24: Self-Service Data Science for Leveraging ML & AI on All of Your Data

MapR Data Science Refinery
An Enterprise-ready Data Science Notebook

Provides the ability to work across many engines in one visual space
• Apache Spark: Spark Streaming, SparkSQL, SparkR, and PySpark
• Apache Hive
• Apache Pig
• Apache Drill
• Python
• Shell access to MapR-FS
• Programmatic access to MapR-DB and MapR-ES in Spark

Pluggable Visualization Available via Helium!

[Diagram: MapR POSIX Client for Containers; MapR Converged Client for Containers]

Page 25: Self-Service Data Science for Leveraging ML & AI on All of Your Data

MapR Data Science Refinery Benefits

Easy to Deploy
• A Docker image includes all the necessary bits, no more and no less, required to leverage MapR as a persistent data store for your data science output.
• Available on DockerHub

Secure
• Authentication occurs at a container level to ensure containerized applications only have access to data for which they are authorized.
• Communications are encrypted to ensure privacy when accessing data in MapR.

Extensible
• A Dockerfile is also available on GitHub, allowing you to further customize the image as needed to support your specific application needs.
• The Helium Framework enables pluggable visualization

Leverage Locally, On-premises, or in Cloud

MAPR CONVERGED DATA PLATFORM
• MAPR-XD: cloud-scale data store
• MAPR-DB: operational database
• MAPR-ES: event streaming
• High Availability, Real-time, Unified Security, Multi-Tenancy, Disaster Recovery, Global Namespace

Page 26: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Partner Integration: An Example

We’re enabling our partners to integrate with and use this product

[Diagram labels: DataScience.com Platform; Services; MapR DSR; Zeppelin; Livy; JDBC; MapR Clients]

Page 28: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Ease of Deployment & Data Exploration

Page 29: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Ease of Deployment

What’s in the command

    docker run --rm -it \
      --cap-add SYS_ADMIN --cap-add SYS_RESOURCE \
      --device /dev/fuse \
      --memory 0 \
      -e MAPR_CLUSTER=my.cluster.com \
      -e MAPR_MEMORY=0 \
      -e MAPR_MOUNT_PATH=/mapr \
      -e MAPR_TZ=America/Los_Angeles \
      -e MAPR_CONTAINER_USER=mapr \
      -e MAPR_CONTAINER_UID=5000 \
      -e MAPR_CONTAINER_GROUP=mapr \
      -e MAPR_CONTAINER_GID=5000 \
      -e MAPR_CLDB_HOSTS=172.24.8.195,172.24.11.200,172.24.10.4 \
      -e MAPR_TICKETFILE_LOCATION=/tmp/maprticket_5000 \
      -e ZEPPELIN_SSL_PORT=9995 \
      -e HOST_IP=172.24.11.62 \
      -e MAPR_HS_HOST=172.24.8.195 \
      -p 9995:9995 -p 10000-10010:10000-10010 \
      -v /tmp/maprticket_5000:/tmp/maprticket_5000:ro \
      -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
      maprtech/data-science-refinery:v1.0_6.0.0_4.0.0_centos7
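
A command this long may be easier to review when the site-specific values are pulled into variables and the arguments are grouped by purpose. The following is a minimal sketch and not part of the original deck: the function name `dsr_command` and the variable names are hypothetical, the values are the ones shown on the slide, the grouping comments are an interpretation, and the argument order differs slightly from the slide. It only prints the command (a dry run) and never starts a container.

```shell
#!/bin/sh
# Dry-run sketch: prints a docker run command, never starts a container.
# Values are copied from the slide; dsr_command is a hypothetical helper name.

MAPR_CLUSTER="my.cluster.com"
MAPR_CLDB_HOSTS="172.24.8.195,172.24.11.200,172.24.10.4"
MAPR_HS_HOST="172.24.8.195"
HOST_IP="172.24.11.62"
TICKET="/tmp/maprticket_5000"
IMAGE="maprtech/data-science-refinery:v1.0_6.0.0_4.0.0_centos7"

dsr_command() {
    # POSIX sh has no arrays, so collect arguments in the positional parameters.
    set -- docker run --rm -it

    # Capabilities and device granted on the slide
    # (commonly needed for a FUSE mount inside a container).
    set -- "$@" --cap-add SYS_ADMIN --cap-add SYS_RESOURCE \
        --device /dev/fuse --memory 0

    # Cluster identity, cluster hosts, and mount point.
    set -- "$@" -e "MAPR_CLUSTER=$MAPR_CLUSTER" -e MAPR_MEMORY=0 \
        -e MAPR_MOUNT_PATH=/mapr -e MAPR_TZ=America/Los_Angeles \
        -e "MAPR_CLDB_HOSTS=$MAPR_CLDB_HOSTS" -e "MAPR_HS_HOST=$MAPR_HS_HOST"

    # User and group the container runs as.
    set -- "$@" -e MAPR_CONTAINER_USER=mapr -e MAPR_CONTAINER_UID=5000 \
        -e MAPR_CONTAINER_GROUP=mapr -e MAPR_CONTAINER_GID=5000

    # Security ticket, mounted read-only at the same path inside the container.
    set -- "$@" -e "MAPR_TICKETFILE_LOCATION=$TICKET" -v "$TICKET:$TICKET:ro"

    # Notebook port and the additional port range published on the slide.
    set -- "$@" -e ZEPPELIN_SSL_PORT=9995 -e "HOST_IP=$HOST_IP" \
        -p 9995:9995 -p 10000-10010:10000-10010

    # cgroup mount from the slide, then the image name.
    set -- "$@" -v /sys/fs/cgroup:/sys/fs/cgroup:ro "$IMAGE"

    # "$*" joins the arguments with single spaces.
    printf '%s\n' "$*"
}

dsr_command
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed line can be inspected before it is pasted into a terminal.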

Page 34: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Ease of Deployment

How is Security Handled?

    $ maprlogin password
    [Password for user 'jane' at cluster 'my.cluster.com': ]
    MapR credentials of user 'jane' for cluster 'my.cluster.com' are written to '/tmp/janes_ticket'

Job submits as ‘jane’
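
The ticket file written by `maprlogin` is what the `docker run` command mounts read-only into the container and names in `MAPR_TICKETFILE_LOCATION`. As a rough illustration of that link, the sketch below checks that a ticket file is readable and prints the matching `docker run` arguments. The helper name `ticket_args` is hypothetical, and the demo uses a temporary stand-in file so it runs without a MapR cluster.

```shell
#!/bin/sh
# Hypothetical helper: given a ticket file (for example one created by
# `maprlogin password`), print the docker run arguments that expose it
# to the container read-only at the same path.
ticket_args() {
    ticket="$1"
    if [ ! -r "$ticket" ]; then
        echo "ticket file not readable: $ticket" >&2
        return 1
    fi
    printf '%s\n' "-e MAPR_TICKETFILE_LOCATION=$ticket -v $ticket:$ticket:ro"
}

# Demo with a stand-in file; a real run would pass the path that
# maprlogin reported, such as /tmp/janes_ticket on the slide.
demo_ticket="$(mktemp)"
ticket_args "$demo_ticket"
rm -f "$demo_ticket"
```

Failing early when the ticket is missing avoids starting a container that cannot authenticate to the cluster.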

Page 35: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Ease of Deployment

Why Livy?

[Diagram: HTTP (RPC) connection to the MAPR CONVERGED DATA PLATFORM: MAPR-XD (cloud-scale data store), MAPR-DB (operational database), MAPR-ES (event streaming)]

Advantages over native Spark Interpreter:
• Jobs are submitted in YARN cluster mode
• Spark context can be shared
• Support for Spark Dynamic Resource Allocation
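
Livy exposes Spark through a REST API over HTTP, which is what lets a notebook container submit work to the cluster without hosting the Spark driver itself. The sketch below illustrates the shape of two such requests: creating a session and submitting a statement. It is a dry run that only prints `curl` commands. The host `livy.example.com` is a placeholder, port 8998 is Livy's default, session id `0` stands in for the id a real server would return, and the helper names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch of Livy REST calls; nothing is sent over the network.
LIVY_URL="http://livy.example.com:8998"   # assumption: placeholder host, default Livy port

# Body for POST /sessions: a PySpark session with Spark dynamic allocation
# requested through the session's Spark configuration.
session_payload() {
    printf '%s\n' '{"kind": "pyspark", "conf": {"spark.dynamicAllocation.enabled": "true"}}'
}

# Body for POST /sessions/{id}/statements.
# $1: code to run; assumed to contain no characters that need JSON escaping.
statement_payload() {
    printf '{"code": "%s"}\n' "$1"
}

printf '%s\n' "curl -s -X POST -H 'Content-Type: application/json' -d '$(session_payload)' $LIVY_URL/sessions"
printf '%s\n' "curl -s -X POST -H 'Content-Type: application/json' -d '$(statement_payload 'sc.version')' $LIVY_URL/sessions/0/statements"
```

In the Data Science Refinery these calls are made by the notebook's Livy interpreter, so the sketch is only meant to show what travels over the HTTP link in the diagram.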

Page 36: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Extensibility & Collaboration

Page 37: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Extensibility & Collaboration

Collaboration

[Diagram: MAPR CONVERGED DATA PLATFORM with MAPR-XD (cloud-scale data store), MAPR-DB (operational database), and MAPR-ES (event streaming)]

Page 38: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Extensibility & Collaboration

Collaboration

[Diagram: MapR POSIX Client for Containers connected to the MAPR CONVERGED DATA PLATFORM with MAPR-XD (cloud-scale data store), MAPR-DB (operational database), and MAPR-ES (event streaming)]

Page 39: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Extensibility & Collaboration

What’s in the command

    docker run --rm -it \
      --cap-add SYS_ADMIN --cap-add SYS_RESOURCE \
      --device /dev/fuse \
      --memory 0 \
      -e MAPR_CLUSTER=my.cluster.com \
      -e MAPR_MEMORY=0 \
      -e MAPR_MOUNT_PATH=/mapr \
      -e ZEPPELIN_NOTEBOOK_DIR=/mapr/my.cluster.com/user/mapr/zeppelin/shared-notebooks/ \
      -e MAPR_TZ=America/Los_Angeles \
      -e MAPR_CONTAINER_USER=mapr \
      -e MAPR_CONTAINER_UID=5000 \
      -e MAPR_CONTAINER_GROUP=mapr \
      -e MAPR_CONTAINER_GID=5000 \
      -e MAPR_CLDB_HOSTS=172.24.8.195,172.24.11.200,172.24.10.4 \
      -e MAPR_TICKETFILE_LOCATION=/tmp/maprticket_5000 \
      -e ZEPPELIN_SSL_PORT=9995 \
      -e HOST_IP=172.24.11.62 \
      -e MAPR_HS_HOST=172.24.8.195 \
      -p 9995:9995 -p 10000-10010:10000-10010 \
      -v /tmp/maprticket_5000:/tmp/maprticket_5000:ro \
      -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
      maprtech/data-science-refinery:v1.0_6.0.0_4.0.0_centos7
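
The `ZEPPELIN_NOTEBOOK_DIR` value in this command is a path under the MapR mount point, so notebooks saved there live in the cluster rather than inside the container. As a small illustration of how the path on the slide is composed, the sketch below builds it from the mount path, cluster name, and user. The helper name `notebook_dir` is hypothetical, and the `user/<name>/zeppelin/shared-notebooks/` layout is simply the one shown on the slide.

```shell
#!/bin/sh
# Hypothetical helper: compose a shared notebook directory under the MapR mount.
# $1: mount path, $2: cluster name, $3: user name
notebook_dir() {
    printf '%s/%s/user/%s/zeppelin/shared-notebooks/\n' "$1" "$2" "$3"
}

# Reproduces the value used on the slide:
# /mapr/my.cluster.com/user/mapr/zeppelin/shared-notebooks/
notebook_dir /mapr my.cluster.com mapr
```

Containers started with the same value would read and write the same notebook directory, which is the basis for the collaboration shown in the demo.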

Page 40: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Extensibility

Adding Deep Learning libraries to the container
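
One way to add libraries is to build a derived image on top of the published one. The sketch below is an illustration under stated assumptions, not the method shown in the demo: it writes a two-line Dockerfile and prints the build command without running it. The tag `my-dsr-deeplearning:latest` and the helper name `write_dockerfile` are hypothetical, `tensorflow` is only an example library, and the availability of `pip` in the base image is assumed. The deck also mentions a Dockerfile on GitHub, which could be edited directly instead.

```shell
#!/bin/sh
# Sketch: generate a Dockerfile that extends the Data Science Refinery image.
BASE_IMAGE="maprtech/data-science-refinery:v1.0_6.0.0_4.0.0_centos7"
CUSTOM_TAG="my-dsr-deeplearning:latest"   # hypothetical tag for the derived image

# $1: directory in which to write the Dockerfile
write_dockerfile() {
    cat > "$1/Dockerfile" <<EOF
FROM $BASE_IMAGE
# Assumption: pip is available in the base image; tensorflow is an example.
RUN pip install tensorflow
EOF
}

build_dir="$(mktemp -d)"
write_dockerfile "$build_dir"
cat "$build_dir/Dockerfile"

# Dry run: print the build command instead of executing it.
printf '%s\n' "docker build -t $CUSTOM_TAG $build_dir"
```

The derived image could then be started with the same `docker run` options as the original, substituting the new tag for the image name.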

Page 41: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Extensibility

Adding Deep Learning libraries to the container

[Diagram: Compute and Persistent Storage; the MAPR CONVERGED DATA PLATFORM provides MAPR-XD (cloud-scale data store), MAPR-DB (operational database), and MAPR-ES (event streaming)]

Page 42: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Demo: Extensibility

Adding Deep Learning libraries to the container

[Diagram: Compute and Persistent Storage; the MAPR CONVERGED DATA PLATFORM provides MAPR-XD (cloud-scale data store), MAPR-DB (operational database), and MAPR-ES (event streaming)]

What if this was a box of GPUs?

Page 43: Self-Service Data Science for Leveraging ML & AI on All of Your Data

A Final Comparison

Traditional Hadoop Vendor

[Diagram: separate on-premises clusters: Batch Cluster, Streaming Cluster, NoSQL Cluster, and a Data Science cluster]

Page 44: Self-Service Data Science for Leveraging ML & AI on All of Your Data

Q&A

ENGAGE WITH US

@mapr

[email protected]

