Transcript
Page 1: The Past, Present, and Future of Machine Learning APIs

BigML Inc API days Mediterranea 1

The Past, Present, and Future of Machine Learning APIs

May 2015

Page 2: The Past, Present, and Future of Machine Learning APIs

Machine Learning APIs

1 Past

2 Present

3 Future

Page 3: The Past, Present, and Future of Machine Learning APIs

Machine Learning

“a field of study that gives computers the ability to learn without being explicitly programmed”

Professor Arthur Samuel

•The world's first self-learning program was a checkers-playing program developed for IBM by Professor Arthur Samuel in 1952.

•Thomas J. Watson Sr., the founder and President of IBM, predicted that Samuel’s checkers public demonstration would raise the price of IBM stock 15 points. It did.

Page 4: The Past, Present, and Future of Machine Learning APIs

Brief History

[Timeline, 1950 to 2020, with methods arranged along an interpretability axis running from + to -]

•Perceptron: Rosenblatt, 1957; Minsky, 1969
•Neural Networks: Fukushima, 1989 (ANN)
•Decision Trees: Quinlan, 1979 (ID3); Breiman, 1984 (CART)
•Support Vector Machines: Vapnik, 1963; Cortes & Vapnik, 1995
•Boosting: Schapire, 1989 (Boosting); Schapire, 1995 (AdaBoost)
•Ensembles: Breiman, 1994 (Bagging); Breiman, 2001 (Random Forests)
•Deep Learning: Hinton, 2006

Page 5: The Past, Present, and Future of Machine Learning APIs

[Timeline, 1950 to 2020: the focus of machine learning over time]

•New algorithms & Theory
•Parameter estimation & Scalability
•Automated Representation & Composability
•Applicability & Deployability

AUTOMATION

1st Machine Learning Workshop, Pittsburgh, PA, 1980

Page 6: The Past, Present, and Future of Machine Learning APIs

The Stages of an ML app

•State the problem
•Data Wrangling
•Feature Engineering
•Learning
•Deploying
•Predicting
•Measuring Impact

Machine Learning That Matters, Kiri Wagstaff, 2012

Machine Learning is only as good as the impact it makes on the real world

Page 7: The Past, Present, and Future of Machine Learning APIs

•Most tools have focused only on training models manually

•Consider having 1M users, needing to create a model for each one, and then running 10 predictions a day for each one (10M predictions a day)
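As a back-of-the-envelope check on the scenario above, the short sketch below works out the load it implies. The numbers are the slide's illustrative ones; the helper name `daily_load` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_load(users, predictions_per_user_per_day):
    """Return (models to train, predictions per day, average predictions per second)."""
    models = users  # one model per user
    predictions = users * predictions_per_user_per_day
    return models, predictions, predictions / SECONDS_PER_DAY


models, predictions, per_second = daily_load(1_000_000, 10)
print(models, predictions, round(per_second))
```

With 1M users and 10 predictions each, that is 1M models to train and 10M predictions a day, or a little over a hundred predictions per second on average, before any peaks.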

Learning (Training) Predicting (Scoring)

DATA MODEL NEW DATA PREDICTIONS

Machine Learning Tasks
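The two tasks in the diagram above can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the "model" is a majority-label lookup, and the field names (`plan`, `churned`) and functions (`train`, `predict`) are made up for the example rather than taken from any tool in the talk.

```python
from collections import Counter, defaultdict


def train(data, feature, objective):
    """Learning (training): DATA -> MODEL."""
    by_value = defaultdict(Counter)
    overall = Counter()
    for row in data:
        by_value[row[feature]][row[objective]] += 1
        overall[row[objective]] += 1
    return {
        "feature": feature,
        # most frequent label seen for each value of the input feature
        "rules": {value: c.most_common(1)[0][0] for value, c in by_value.items()},
        # fallback for feature values never seen during training
        "default": overall.most_common(1)[0][0],
    }


def predict(model, new_data):
    """Predicting (scoring): MODEL + NEW DATA -> PREDICTIONS."""
    return [
        model["rules"].get(row.get(model["feature"]), model["default"])
        for row in new_data
    ]


data = [
    {"plan": "free", "churned": "yes"},
    {"plan": "free", "churned": "yes"},
    {"plan": "free", "churned": "no"},
    {"plan": "paid", "churned": "no"},
    {"plan": "paid", "churned": "no"},
]
model = train(data, feature="plan", objective="churned")
predictions = predict(model, [{"plan": "free"}, {"plan": "paid"}, {"plan": "trial"}])
print(predictions)
```

Training happens once per model, while predicting runs every time new data arrives, which is why tools that only cover training leave most of the workload unaddressed.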

Page 8: The Past, Present, and Future of Machine Learning APIs

Legacy ML Tools

•By scientists (with a Ph.D.) for scientists (with a Ph.D.)
•Excess of algorithms
•Single-threaded, desktop apps for small datasets
•Overcomplicated for common people
•Oversimplified for real world problems
•Poorly engineered for real world use or high scale
•Commercial tools (SPSS, SAS) not only inherit the same issues but are also overpriced

[Timeline of tool releases, 1993 to 2013, split into PRE-HADOOP and POST-HADOOP]

Page 9: The Past, Present, and Future of Machine Learning APIs

The Paradox of Choice

Do we need hundreds of classifiers?

Page 10: The Past, Present, and Future of Machine Learning APIs

Smarter Apps?

•5 years after the data deluge, why don’t we see more smart apps?

•Real-world Machine Learning expertise is scarce and expensive

•Scaling Machine Learning is hard

•Current tools weren’t designed for developers: they require a Ph.D. and are complex, error-prone, expensive, etc.

Page 11: The Past, Present, and Future of Machine Learning APIs

History of APIs

[Timeline, 2000 to 2004. Labels: REST, Roy Fielding; XML, 2000; XML, 2000; XML, 2002; REST, 2004; REST APIs]

Page 12: The Past, Present, and Future of Machine Learning APIs

[Timeline, 2010 to 2015. Labels: Hadoop and Big Data Craziness; Watson wins Jeopardy; Machine Learning APIs]

Page 13: The Past, Present, and Future of Machine Learning APIs

Machine Learning APIs

1 Past

2 Present

3 Future

Page 14: The Past, Present, and Future of Machine Learning APIs

•Machine Learning (or Predictive) APIs can:

•Abstract the inherent complexity of ML algorithms

•Manage the heavy infrastructure needed to learn from data and make predictions at scale. No additional servers to provision or manage

•Easily close the gap between model training and scoring

•Be built for developers and provide full flow automation

•Add traceability and repeatability to ML tasks

Machine Learning APIs

Page 15: The Past, Present, and Future of Machine Learning APIs

•Did you know that anyone, even knowing nothing about ML, can predict in real time with a few lines of code:

•Which employee will leave in the next 6 months
•Which electric generator is likely to die in the next 2 weeks
•Which sales lead has the highest potential to close in the next 3 months
•What each new website visitor is likely to buy based on past visitors
•etc.

Machine Learning APIs

Page 16: The Past, Present, and Future of Machine Learning APIs

Example: BigML API

• Programmable Machine Learning
• Automated application workflows
• Repeatable and traceable
• Higher level algorithms
• Asynchronous resources

Resources: project, source, dataset, sample, model, ensemble, cluster, anomaly detector, (batch) prediction, (batch) centroid, (batch) anomaly score

Each machine learning element is a REST resource
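Because resources are created asynchronously, a client typically has to wait until a resource is ready before using it in the next step. The sketch below shows one generic way to do that by polling. It is not taken from the slides: the response layout (`status.code`), the numeric codes, and both function names are assumptions, and a fake fetch function stands in for the HTTP GET so the example runs offline.

```python
import time

FINISHED = 5  # assumed "finished" status code
FAULTY = -1   # assumed "failed" status code


def wait_until_finished(fetch, resource_id, interval=0.0, max_polls=50):
    """Poll fetch(resource_id) until the resource reports a terminal status."""
    for _ in range(max_polls):
        resource = fetch(resource_id)
        code = resource["status"]["code"]
        if code == FINISHED:
            return resource
        if code == FAULTY:
            raise RuntimeError(f"{resource_id} failed")
        time.sleep(interval)
    raise TimeoutError(f"{resource_id} not finished after {max_polls} polls")


def make_fake_fetch(codes):
    """Hypothetical stand-in for an HTTP GET: returns the given status codes in order."""
    remaining = iter(codes)

    def fetch(resource_id):
        return {"resource": resource_id, "status": {"code": next(remaining)}}

    return fetch


fetch = make_fake_fetch([1, 2, 3, 5])
dataset = wait_until_finished(fetch, "dataset/example")
print(dataset["resource"], dataset["status"]["code"])
```

In a real client, `fetch` would issue a GET against the resource URL and `interval` would be a second or more to avoid hammering the API.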

Page 17: The Past, Present, and Future of Machine Learning APIs

Anomaly Detection

[Workflow diagram: Source → Dataset → Anomaly Detector → Real-Time scores, or Batch anomaly score → Dataset with scores → Filter → Dataset filtered]

Page 18: The Past, Present, and Future of Machine Learning APIs

# Credentials and endpoint
export BIGML_USERNAME=apidays
export BIGML_API_KEY=aa3140519eacc1e9c034f8c973d976e35fff8b29
export BIGML_AUTH="username=$BIGML_USERNAME;api_key=$BIGML_API_KEY"
export BIGML_DOMAIN=bigml.io

export BIGML_URL=https://$BIGML_DOMAIN
export DEV_BIGML_URL=$BIGML_URL/dev

RESOURCES="source dataset sample model cluster anomaly ensemble evaluation prediction centroid anomalyscore batchprediction batchcentroid batchanomalyscore project"

# Export one URL variable per resource type, e.g. $SOURCE and $DEV_SOURCE
for RESOURCE in $RESOURCES; do
    VARIABLE=$(echo $RESOURCE | tr '[a-z]' '[A-Z]')
    export ${VARIABLE}="$BIGML_URL/$RESOURCE?$BIGML_AUTH"
    export DEV_${VARIABLE}="$DEV_BIGML_URL/$RESOURCE?$BIGML_AUTH"
done

Anomaly Detection at the prompt

https://github.com/jakubroztocil/httpie

http://stedolan.github.io/jq/

HTTPie: a CLI, cURL-like tool for humans

jq: sed for JSON data
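The shell setup above exports one URL per resource type. For readers who would rather script this in Python, the same URL scheme can be sketched as follows. The function name `resource_urls` and the placeholder key are hypothetical; only the URL pattern comes from the slide.

```python
def resource_urls(username, api_key, resources, domain="bigml.io", dev=False):
    """Build one authenticated URL per resource type, keyed by upper-cased name."""
    auth = f"username={username};api_key={api_key}"
    base = f"https://{domain}" + ("/dev" if dev else "")
    return {name.upper(): f"{base}/{name}?{auth}" for name in resources}


RESOURCES = ["source", "dataset", "anomaly", "anomalyscore"]

urls = resource_urls("apidays", "YOUR_API_KEY", RESOURCES)
dev_urls = resource_urls("apidays", "YOUR_API_KEY", RESOURCES, dev=True)
print(urls["ANOMALY"])
print(dev_urls["ANOMALY"])
```

Keeping the credentials out of the source file (for example by reading them from the environment) is usually preferable to hard-coding them.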

Page 19: The Past, Present, and Future of Machine Learning APIs

APPLE=https://s3.amazonaws.com/bigml-public/csv/nasdaq_aapl.csv

# Each call creates a resource and captures its id for the next step
source_id=$(http $SOURCE remote=$APPLE name=APIDays | jq -r .resource)
dataset_id=$(http $DATASET source=$source_id | jq -r .resource)
anomaly_id=$(http $ANOMALY dataset=$dataset_id | jq -r .resource)

# Score a single new data point
http $ANOMALYSCORE anomaly=$anomaly_id input_data:='{"open": 200}' | jq .score

Anomaly Detection at the prompt

Page 20: The Past, Present, and Future of Machine Learning APIs

Anomaly Detection in Python

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from bigml.api import BigML
from bigml.anomaly import Anomaly

# Credentials are read from BIGML_USERNAME and BIGML_API_KEY
api = BigML()

APPLE = "https://s3.amazonaws.com/bigml-public/csv/nasdaq_aapl.csv"

# Create each resource and wait until it is ready before the next step
source = api.create_source(APPLE, {'name': 'APIDays'})
api.ok(source)

dataset = api.create_dataset(source)
api.ok(dataset)

anomaly = api.create_anomaly(dataset)
api.ok(anomaly)

# Score locally, without calling the API for each prediction
local_anomaly = Anomaly(anomaly)
local_anomaly.anomaly_score({"Open": 275, "High": 300, "Low": 250})

• http://bigml.readthedocs.org/en/latest/#anomaly-detector
• http://bigml.readthedocs.org/en/latest/#local-anomaly-detector
• http://bigml.readthedocs.org/en/latest/#local-anomaly-scores

• https://github.com/bigmlcom/python

Page 21: The Past, Present, and Future of Machine Learning APIs

Anomaly Detection in BigMLer

APPLE=https://s3.amazonaws.com/bigml-public/csv/nasdaq_aapl.csv

bigmler anomaly --train $APPLE --name APIDays

• http://bigmler.readthedocs.org/en/latest/#anomaly-subcommand

• https://github.com/bigmlcom/bigmler

Page 22: The Past, Present, and Future of Machine Learning APIs

The Present of ML APIs

• # Algorithms
• Training speed
• Prediction speed
• Performance
• Ease-of-Use
• Deployability
• Scalability
• API-first?
• API design
• Documentation
• UI (Dashboard, Studio, Console)
• SDKs
• Automation
• Time-to-productivity
• Importability
• Exportability
• Transparency
• Dependency
• Price

Recent tools with too many aspects to compare and too few benchmarks so far

Page 23: The Past, Present, and Future of Machine Learning APIs

Democratization

Immediately available, anyone can try it for free!!!

Page 24: The Past, Present, and Future of Machine Learning APIs

Exportability vs Transparency

•Exportability, yes: models are exportable to predict outside the platform
•Exportability, no: predicting only available via the same platform
•Transparency, yes: white-box modeling
•Transparency, no: black-box modeling

[2×2 matrix of Exportability against Transparency; cell labels recovered from the figure: B>A, N/A]

Page 25: The Past, Present, and Future of Machine Learning APIs

Machine Learning APIs

1 Past

2 Present

3 Future

Page 26: The Past, Present, and Future of Machine Learning APIs

API-first

Page 27: The Past, Present, and Future of Machine Learning APIs

Simplicity

1. Select: classification or regression
2. Select: two-class or multi-class
3. Select: algorithm

vs

and infer the task based on the type and distribution of the objective field
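To make the contrast concrete, here is a minimal sketch of inferring the task from the objective field instead of asking the user to pick it. The slides do not give the actual rules any platform uses, so the heuristic below (numeric values mean regression, otherwise count the distinct classes) and the name `infer_task` are assumptions for illustration only.

```python
def infer_task(objective_values):
    """Guess the ML task from the values of the objective field."""
    present = [v for v in objective_values if v is not None]
    if not present:
        raise ValueError("objective field has no values to inspect")
    # bool is a subclass of int in Python, so exclude it from "numeric"
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        return "regression"
    classes = len(set(present))
    if classes <= 2:
        return "two-class classification"
    return "multi-class classification"


print(infer_task([12.5, 7, 30.1, None]))
print(infer_task(["yes", "no", "yes"]))
print(infer_task(["setosa", "versicolor", "virginica"]))
```

A production rule would likely also look at the distribution, for example treating an integer field with only a handful of distinct values as categorical.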

Page 28: The Past, Present, and Future of Machine Learning APIs

Simplicity?

Page 29: The Past, Present, and Future of Machine Learning APIs

Programmability

• Future: Remote Execution / Mobile Code

• Today: Cloud Client Computing

Page 30: The Past, Present, and Future of Machine Learning APIs

Freedom

• Importability

from bigml.api import BigML

ml = BigML()

# Load the same data from two different clouds
source_1 = ml.create_source("azure://csv/iris.csv?AccountName=bigmlpublic")
source_2 = ml.create_source("s3://bigml-public/csv/iris.csv")

dataset_1 = ml.create_dataset(source_1)
dataset_2 = ml.create_dataset(source_2)

# Train one model on both datasets
model = ml.create_model([dataset_1, dataset_2])

• Exportability

from bigml.model import Model

# Download the model once, then predict locally
model = Model("model/55428485af447f69e1001bab")
model.predict({"petal length": 3, "petal width": 2})

Page 31: The Past, Present, and Future of Machine Learning APIs

Composability

Enhancing your cloud applications with Artificial Intelligence

Page 32: The Past, Present, and Future of Machine Learning APIs

Specialization

Classification, Regression, Cluster Analysis, Anomaly Detection, Other…

Specialized API

•Specific Data
•Specific Data Transformations and Feature Engineering
•Specific Modeling Strategy
•Specific Predicting Strategy
•Specific Evaluations

Language Identification, Sentiment Analysis, Age Guessing, Mood Guessing, Many Others…

Page 33: The Past, Present, and Future of Machine Learning APIs

Age Guessing

Page 34: The Past, Present, and Future of Machine Learning APIs

Specialization

Classification, Regression, Cluster Analysis, Anomaly Detection, Other…

Specialized API

•Specific Data
•Specific Data Transformations and Feature Engineering
•Specific Modeling Strategy
•Specific Predicting Strategy
•Specific Evaluations

Lead Scoring, Lifetime Value Prediction, Fraud Detection, Intrusion Detection, Many Others…

Page 35: The Past, Present, and Future of Machine Learning APIs

Machine Learning Layer

•Machine Learning is becoming a new abstraction layer of the computing infrastructure.

•An application developer expects to have access to a machine learning platform.

Tushar Chandra, Google

Page 36: The Past, Present, and Future of Machine Learning APIs

Standardization?

Classification Regression Cluster Analysis

Anomaly Detection Other…

Standard ML API

The SQL of Machine Learning?

Page 37: The Past, Present, and Future of Machine Learning APIs

Born to learn

from django.db import models

class Customer(models.Model):
    name = models.CharField(max_length=30)
    age = models.PositiveIntegerField()
    monthly_income = models.FloatField(blank=True, null=True)
    dependents = models.PositiveIntegerField(default=0)
    open_credit_lines = models.PositiveIntegerField(default=0)
    # predictable=True is the option envisioned by the slide; Django does not provide it today
    delinquent = models.BooleanField(predictable=True)

•Predictions will be embedded into data models
•Development frameworks will increasingly abstract modeling and predicting strategies
•New applications designed and implemented from scratch will take advantage of machine learning from day 0
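The idea of marking a field as predictable can be approximated today without any framework support. The sketch below uses standard-library dataclasses and field metadata; the `predictable` metadata key and both helper functions are hypothetical names chosen for this example, not an existing API.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Customer:
    name: str
    age: int
    monthly_income: Optional[float] = None
    dependents: int = 0
    open_credit_lines: int = 0
    # Hypothetical marker: this value is meant to be filled in by a model
    delinquent: Optional[bool] = field(default=None, metadata={"predictable": True})


def predictable_fields(cls):
    """Names of the fields a model is expected to predict."""
    return [f.name for f in fields(cls) if f.metadata.get("predictable")]


def input_fields(cls):
    """Names of the fields available as model inputs."""
    return [f.name for f in fields(cls) if not f.metadata.get("predictable")]


print(predictable_fields(Customer))
print(input_fields(Customer))
```

A framework could use this split to train a model with `delinquent` as the objective field and to fill the value in automatically when a new `Customer` is saved without it.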

Page 38: The Past, Present, and Future of Machine Learning APIs

Conclusions

• Machine Learning APIs help abstract not only the inherent complexity of machine learning algorithms but also the complexity of the infrastructure needed to learn from data and make predictions at scale, while adding traceability and repeatability to machine learning tasks

• Once more powerful and easier-to-use general Machine Learning APIs are in place, API providers will switch their focus from more algorithms to specialization, composability, standardization, and complete automation

• Developing smart applications will become easier, faster, and cheaper, with the consequent impact on productivity realized in a multitude of sectors

Page 39: The Past, Present, and Future of Machine Learning APIs

“As machine learning leaves the lab and goes into practice, it will threaten white-collar, knowledge-worker jobs just as machines, automation and assembly lines destroyed factory jobs in the 19th and 20th centuries.”

The Economist, February 1, 2014

Leaving the lab

Page 40: The Past, Present, and Future of Machine Learning APIs

Want to know more?
