
Building Custom Machine Learning Algorithms With Apache SystemML

Date post: 21-Apr-2017
Upload: jen-aman
Building Custom Machine Learning Algorithms with Apache SystemML
Fred Reiss, Chief Architect, IBM Spark Technology Center; Member, IBM Academy of Technology
Transcript
Page 1: Building Custom Machine Learning Algorithms With Apache SystemML

Building Custom Machine Learning Algorithms with Apache SystemML

Fred Reiss
Chief Architect, IBM Spark Technology Center
Member, IBM Academy of Technology

Page 2: Building Custom Machine Learning Algorithms With Apache SystemML

Roadmap
• What is Apache SystemML?
• Demo!
• How to get SystemML

Page 3: Building Custom Machine Learning Algorithms With Apache SystemML

What is Apache SystemML?

Page 4: Building Custom Machine Learning Algorithms With Apache SystemML

Origins of the SystemML Project

2015 2016

You are here.

Page 5: Building Custom Machine Learning Algorithms With Apache SystemML

2011 2012 2013 2014

Page 6: Building Custom Machine Learning Algorithms With Apache SystemML

2007 2008 2009 2010

2007-2008: Multiple projects at IBM Research – Almaden involving machine learning on Hadoop.

2009: We form a dedicated team for scalable ML.

2009-2010: Through engagements with customers, we observe how data scientists create ML solutions.

Page 7: Building Custom Machine Learning Algorithms With Apache SystemML

Case Study: An Auto Manufacturer

Warranty Claims

Repair History

Diagnostic Readouts

Predict Reacquired Cars

Page 8: Building Custom Machine Learning Algorithms With Apache SystemML

Case Study: An Auto Manufacturer

Warranty Claims

Repair History

Features

Labels

Predict Reacquired Cars

Machine Learning Algorithm

Algorithm, Algorithm, Algorithm

False Positives

Result: 25x improvement in precision!

Diagnostic Readouts

Page 9: Building Custom Machine Learning Algorithms With Apache SystemML

The Iterative Development Process

Build a pipeline

Results good enough?
• Yes
• No: Customize part of the pipeline

Page 10: Building Custom Machine Learning Algorithms With Apache SystemML

State-of-the-Art: Small Data

R or Python

Data Scientist

Personal Computer

Data

Results

Page 11: Building Custom Machine Learning Algorithms With Apache SystemML

State-of-the-Art: Big Data

R or Python

Data Scientist

Results

Systems Programmer

Scala

Page 12: Building Custom Machine Learning Algorithms With Apache SystemML

State-of-the-Art: Big Data

R or Python

Data Scientist

Results

Systems Programmer

Scala

😞 Days or weeks per iteration
😞 Errors while translating algorithms

Page 13: Building Custom Machine Learning Algorithms With Apache SystemML

The SystemML Vision

R or Python

Data Scientist

Results

SystemML

Page 14: Building Custom Machine Learning Algorithms With Apache SystemML

The SystemML Vision

R or Python

Data Scientist

Results

SystemML

😃 Fast iteration
😃 Same answer

Page 15: Building Custom Machine Learning Algorithms With Apache SystemML

2007 2008 2009 2010

2007-2008: Multiple projects at IBM Research – Almaden involving machine learning on Hadoop.

2009: We form a dedicated team for scalable ML.

2009-2010: Through engagements with customers, we observe how data scientists create machine learning algorithms.

Page 16: Building Custom Machine Learning Algorithms With Apache SystemML

2011 2012 2013 2014

Research

Page 17: Building Custom Machine Learning Algorithms With Apache SystemML

2015 2016

Apache SystemML

June 2015: IBM announces open-source SystemML

September 2015: Code available on GitHub

November 2015: SystemML enters Apache incubation

February 2016: First release (0.9) of Apache SystemML

June 2016: Second Apache release (0.10)

Page 18: Building Custom Machine Learning Algorithms With Apache SystemML

SystemML at
• Built algorithms for predicting treatment outcomes
  – Substantial improvement in accuracy
• Moved from Hadoop MapReduce to Spark
  – SystemML supports both frameworks
  – Exact same code
  – 300X faster on 1/40th as many nodes

Page 19: Building Custom Machine Learning Algorithms With Apache SystemML

SystemML at Cadent Technology

“SystemML allows Cadent to implement advanced numerical programming methods in Apache Spark, empowering us to leverage specialized algorithms in our predictive analytics software.”

Michael Zargham
Chief Scientist

Cadent is a leading provider of TV advertising and data solutions, reaching over 140 million homes and trusted by the world’s largest service providers.

Page 20: Building Custom Machine Learning Algorithms With Apache SystemML

Demo!

Page 21: Building Custom Machine Learning Algorithms With Apache SystemML

Demo Scenario
• Application: Targeted ads using demographic information tied to cookies
• Problem: The information is incomplete
• Solution: Estimate the missing values
  – Treat the problem as a matrix completion problem

Page 22: Building Custom Machine Learning Algorithms With Apache SystemML

Data
• The U.S. Census Public Use Microdata Sample (PUMS) data set for 2010
• 10% sample of the U.S. population
  – We’ll use just California today
• Use this full data set to generate synthetic incomplete data
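
Generating synthetic incomplete data from a complete table can be sketched as follows. This is a minimal illustration in plain Python, not the demo's actual code: the function name `mask_entries`, the `keep_fraction` parameter, and the toy table are all assumptions made for this example.

```python
import random

def mask_entries(full, keep_fraction=0.7, seed=0):
    """Return {(i, j): value} holding a random subset of the cells in `full`.

    Each cell is kept independently with probability `keep_fraction`;
    cells that are not kept play the role of missing values.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = {}
    for i, row in enumerate(full):
        for j, value in enumerate(row):
            if rng.random() < keep_fraction:
                observed[(i, j)] = value
    return observed

# Hypothetical toy table: rows are people, columns are demographic fields.
full = [
    [34, 1, 2],
    [51, 0, 3],
    [27, 1, 1],
    [45, 0, 2],
]
observed = mask_entries(full, keep_fraction=0.7)

# Cells that were hidden; their true values stay in `full` for later checks.
hidden = [(i, j)
          for i in range(len(full))
          for j in range(len(full[0]))
          if (i, j) not in observed]
```

Because the complete table is retained, estimates for the hidden cells can later be compared against their true values.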

Page 23: Building Custom Machine Learning Algorithms With Apache SystemML

Demo Scenario
• Application: Identify products that are complementary (often purchased together)
• Problem: Customers are not currently buying the best complements at the same time
• Solution: Suggest new product pairings
  – Treat the problem as a matrix completion problem

Page 24: Building Custom Machine Learning Algorithms With Apache SystemML

Matrix Factorization

Users × Demographics matrix: entry (i, j) is the value of demographic field j for customer i

Left Factor × Top Factor

Multiply these two factors to produce a less-sparse matrix.

New nonzero values become interpolated demographic information
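
The idea on this slide can be sketched in a few lines of plain Python. This is an illustrative example only, not the algorithm shown in the demo and not SystemML code: it fits the two factors by stochastic gradient descent on the observed cells, and the function names, hyperparameters (`rank`, `epochs`, `lr`, `reg`), and toy data are all assumptions.

```python
import random

def factorize(observed, n_rows, n_cols, rank=2, epochs=500,
              lr=0.01, reg=0.01, seed=0):
    """Fit left (n_rows x rank) and top (rank x n_cols) factors to the
    observed cells, given as a dict {(i, j): value}."""
    rng = random.Random(seed)
    left = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    top = [[rng.uniform(0.1, 0.5) for _ in range(n_cols)] for _ in range(rank)]
    cells = sorted(observed)
    for _ in range(epochs):
        rng.shuffle(cells)
        for i, j in cells:
            pred = sum(left[i][k] * top[k][j] for k in range(rank))
            err = observed[(i, j)] - pred
            for k in range(rank):
                l_ik, t_kj = left[i][k], top[k][j]
                # Gradient step on squared error with L2 regularization.
                left[i][k] += lr * (err * t_kj - reg * l_ik)
                top[k][j] += lr * (err * l_ik - reg * t_kj)
    return left, top

def multiply(left, top):
    """Multiply the two factors to get the dense (less-sparse) matrix."""
    rank, n_cols = len(top), len(top[0])
    return [[sum(row[k] * top[k][j] for k in range(rank)) for j in range(n_cols)]
            for row in left]

def observed_error(observed, left, top):
    """Sum of squared errors over the observed cells only."""
    completed = multiply(left, top)
    return sum((v - completed[i][j]) ** 2 for (i, j), v in observed.items())

# Hypothetical 3 x 3 matrix with two cells missing: (1, 1) and (2, 0).
observed = {(0, 0): 1.0, (0, 1): 2.0, (0, 2): 3.0,
            (1, 0): 2.0, (1, 2): 6.0,
            (2, 1): 6.0, (2, 2): 9.0}
left, top = factorize(observed, n_rows=3, n_cols=3)
completed = multiply(left, top)  # cells (1, 1) and (2, 0) are now filled in
```

The product `completed` has a value in every cell, so the cells that were missing from `observed` receive interpolated estimates. In SystemML the same computation would be written as linear algebra over whole matrices rather than as Python loops.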

Page 25: Building Custom Machine Learning Algorithms With Apache SystemML

Demo Part 1: Data wrangling

Page 26: Building Custom Machine Learning Algorithms With Apache SystemML

Demo Part 2: Custom algorithm

Page 27: Building Custom Machine Learning Algorithms With Apache SystemML

Key Points
• SystemML, Spark, and Zeppelin work together
• Linear algebra is great for data science
• Customization is important

Page 28: Building Custom Machine Learning Algorithms With Apache SystemML

How to get Apache SystemML

Page 29: Building Custom Machine Learning Algorithms With Apache SystemML

The Apache SystemML Web Site
http://systemml.apache.org

Download the binary release!

Try out some tutorials!

Browse the source!

Contribute to the project!

Page 30: Building Custom Machine Learning Algorithms With Apache SystemML

THANK YOU.
Please try out Apache SystemML!
http://systemml.apache.org

Special thanks to Nakul Jindal and Mike Dusenberry for helping with the demo!

