Date posted: 21-Apr-2017
Category: Data & Analytics
Uploaded by: joshua-bloom
Industrial Machine Learning
Applied Artificial Intelligence in the New Industrial Revolution, 13 April 2017 (SF)
Josh Bloom @profjsb
COPYRIGHT 2012-2017, WISE.IO INC.
Agenda
• Brief Background/Introduction: Me & Wise.io
• Industrial Machine Learning (IML) Opportunities
• ML as a Systems Engineering Challenge
• IML Applications at GE
Teaching
‣ Python Bootcamps 200+ undergrad/grad
‣ Python for Data Science graduate course
Industry
‣ ML Applications Company
Code / Repos
Q4’16
CTO, Co-founder
Professor, UC Berkeley
Research
Gordon & Betty Moore Foundation
Data-Driven Investigator
‣ Automated Data-driven Discovery & Inference in the Time Domain
‣ 300+ refereed articles
“Intelligent applications in Production”
Customer Support Product
○ Intelligent Routing/Triage
○ Response Recommendation
○ Auto-Response
○ Knowledge-base Deflection
○ Federated Search
○ Spam Filtering
○ Sentiment Prediction
○ IoT/proactive support
Enhancing Decisions in Human-centric Workflows
• Currently serving dozens of customers in production
• Our customers: mid-sized, 5k-5M interactions/month, charged on a per-ticket basis
Wise.io @ GE
Build & deploy SaaS-based production-grade scalable intelligent IIoT applications for end business users
Leveraging the data, horizontal edge-to-cloud platform (Predix), & industry relationships already at GE
IIoT: Beyond “Smart” Thermostats, Fitbits, and Self-driving cars…
 | Consumer Internet | Industrial Internet |
Data Management | Day’s worth of Twitter: 500 GB | Single flight: 1 TB |
Connectivity | Biggest cell phone complaint: dropped calls | Mission critical, rough & remote |
Device Support | Average wearables lifetime: 6 months | Lifetime of a turbine: 20+ years |
Security | Time to hack most devices: minutes | 24/7 mission critical |
Privacy | Privacy is no longer a “social norm” - Zuck | HIPAA, ITAR, … |
IIoT: The Internet of Really Important Things
Industrial Machine Learning as a Systems Challenge
What are we optimizing for?
Component | What |
Algorithm/Model | Learning rate, convexity, error bounds, scaling, … |
+ Software/Hardware | Accuracy, memory usage, disk usage, CPU needs, time to learn, time to predict |
+ Project Staff | Time to implement, people/resource costs, reliability, maintainability, experimentability |
+ Consumers | Direct value, usability, explainability, actionability, security, privacy |
+ Society | Indirect value, ethics |
- multi-axis optimizations in a given component
- highly coupled optimization considerations between components
- myopic view can be costly further up the stack
All ML in production is a Systems Challenge
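A minimal sketch of what "multi-axis optimization" can look like in practice: rather than ranking candidates by accuracy alone, keep every candidate that is not beaten on all axes at once (a Pareto front). The `Candidate` fields and all the numbers below are made up for illustration; they are not measurements from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    error: float           # model error, lower is better
    predict_ms: float      # time to predict, lower is better
    memory_mb: float       # memory usage, lower is better
    maintain_hours: float  # staff time per month, lower is better

    def costs(self):
        return (self.error, self.predict_ms, self.memory_mb, self.maintain_hours)


def dominates(a, b):
    """True if a is no worse than b on every axis and strictly better on one."""
    pairs = list(zip(a.costs(), b.costs()))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical numbers, for illustration only.
CANDIDATES = [
    Candidate("logistic regression", 0.12, 1.0, 5.0, 2.0),
    Candidate("decision forest", 0.09, 8.0, 200.0, 6.0),
    Candidate("deep net", 0.08, 30.0, 900.0, 20.0),
    Candidate("bloated forest", 0.10, 12.0, 400.0, 8.0),
]

front = pareto_front(CANDIDATES)
print([c.name for c in front])
```

Only the "bloated forest" drops out here, because the "decision forest" is at least as good on every axis. Choosing among the remaining three is a systems decision, not an algorithmic one.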
One ML Algorithmic Trade-Off
[Figure: methods placed on two axes, Interpretability (Low to High) versus Accuracy* (Low to High). Methods shown: Linear/Logistic Regression, Lasso, Naive Bayes, Decision Trees, Splines, Nearest Neighbors, Gaussian/Dirichlet Processes, SVMs, Bagging, Boosting, Decision Forests, Neural Nets, Deep Learning.]
* on real-world data sets
Warning: Unscientific & opinionated!
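Optimization Metric
[Figure: leaderboard results for competitions with >$50k prizes, competitions with <$50k prizes, and Netflix, comparing the winning metric with the best benchmark. Leaderboard data from Kaggle & Netflix.]
many teams get within ~few % of optimum
so which is easier to put into production?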
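One way to act on the question above is to pick the cheapest model whose score sits within a small tolerance of the best one, instead of the top scorer. The sketch below assumes a higher score is better; the `deploy_cost` field, the model names, and the numbers are hypothetical.

```python
def simplest_within(models, tolerance=0.02):
    """Return the cheapest model whose score is within `tolerance`
    (relative) of the best score. Higher score is better."""
    best = max(m["score"] for m in models)
    good_enough = [m for m in models if m["score"] >= best * (1 - tolerance)]
    return min(good_enough, key=lambda m: m["deploy_cost"])


# Hypothetical leaderboard, for illustration only.
MODELS = [
    {"name": "winning ensemble", "score": 0.900, "deploy_cost": 10},
    {"name": "single boosted model", "score": 0.890, "deploy_cost": 3},
    {"name": "linear baseline", "score": 0.850, "deploy_cost": 1},
]

choice = simplest_within(MODELS, tolerance=0.02)
print(choice["name"])
```

With a 2% tolerance the single boosted model wins over the ensemble; setting the tolerance to zero falls back to the leaderboard winner.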
“We evaluated some of the new methods offline but the additional accuracy gains that we measured did not seem to justify the engineering effort needed to bring them into a production environment.”
Xavier Amatriain and Justin Basilico (April 2012)
On the Prize
http://research.google.com/pubs/pub43146.html
• Complex models erode abstraction boundaries
• Data dependencies cost more than code dependencies: weak contracts
• System-level Spaghetti
• Changing External World
“It may be surprising to the academic community to know that only a fraction of the code … is actually doing ‘machine learning’. A mature system might end up being (at most) 5% machine learning code and (at least) 95% glue code.”
see also, Bottou (Facebook) ICML
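The "weak contracts" point about data dependencies can be made concrete with an explicit schema check at the boundary where data enters a model. This is only a sketch of one possible mitigation, not something the slides or the cited paper prescribe; the field names in `EXPECTED_SCHEMA` are hypothetical.

```python
EXPECTED_SCHEMA = {          # hypothetical feature contract
    "ticket_length": int,
    "customer_tier": str,
    "hours_open": float,
}


def contract_violations(record, schema=EXPECTED_SCHEMA):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


good = {"ticket_length": 120, "customer_tier": "gold", "hours_open": 3.5}
bad = {"ticket_length": "120", "customer_tier": "gold"}
print(contract_violations(good))
print(contract_violations(bad))
```

Failing loudly when an upstream producer changes a field is the kind of glue code the quote refers to: it does no learning, but it keeps a changing external world from silently degrading the model.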
[Figure: ML offerings placed on two axes, in-house versus as a service and experimental/sandbox versus production/scale ready. Labels shown: Prediction API, watsonAPI. Annotations: time & cost to implement, cost to maintain.]
Wise Architecture: Leveraging Cloud-based Services
Services Oriented, Leveraging PaaS Managed Services
Microscaling: Dockerized, templated workflows for CPU/GPU build/predict end-points
Macroscaling: load-balanced compute clusters
RESTful contracts between services
Built on the AWS stack; instantiated with Terraform
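A rough sketch of what a RESTful contract between two services might look like at the payload level: both sides agree on the JSON shape, and the receiving service rejects anything else. The `PredictRequest` and `PredictResponse` shapes and the `handle` function are hypothetical stand-ins, not the Wise.io API.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PredictRequest:       # hypothetical request body
    model_id: str
    features: dict


@dataclass(frozen=True)
class PredictResponse:      # hypothetical response body
    model_id: str
    label: str
    confidence: float


def parse_request(body: str) -> PredictRequest:
    """Decode a JSON body, rejecting payloads that break the contract."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or set(payload) != {"model_id", "features"}:
        raise ValueError("body must contain exactly model_id and features")
    if not isinstance(payload["model_id"], str):
        raise ValueError("model_id must be a string")
    if not isinstance(payload["features"], dict):
        raise ValueError("features must be an object")
    return PredictRequest(**payload)


def handle(body: str) -> str:
    """Stand-in for a predict end-point: always returns a fixed label."""
    request = parse_request(body)
    response = PredictResponse(request.model_id, label="routine", confidence=0.5)
    return json.dumps(asdict(response))


print(handle('{"model_id": "m1", "features": {"length": 120}}'))
```

Because the contract lives in the payload rather than in shared code, the build and predict end-points can be scaled or replaced independently as long as the shapes stay fixed.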
[Architecture diagram with three layers: Front end, Middleware, ML back end. Components shown: End-user Transactional Systems, Embedded UI, Wise App SDK, Use Case Specific Middleware, Auth, Monitoring/Alerting, Admin Dashboard, Reporting, Wise Factory, Wise Template (Learn/Prediction/Feedback), Transaction DB, Model Storage / Management.]
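The diagram names a template with three phases: Learn, Prediction, and Feedback. The toy class below sketches how those three phases could fit behind one interface. The class name, method names, and majority-vote logic are assumptions made for illustration; the slides do not describe how the Wise Template is implemented.

```python
from collections import Counter


class MajorityTemplate:
    """Toy template: predicts the most common label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, labels):
        """Build (or rebuild) the model from historical labels."""
        self.counts = Counter(labels)

    def predict(self):
        """Return the current majority label, or None before any learning."""
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def feedback(self, true_label):
        """Fold one corrected outcome back into the model."""
        self.counts[true_label] += 1


template = MajorityTemplate()
template.learn(["crack", "seam", "seam"])
first = template.predict()
template.feedback("crack")
template.feedback("crack")
second = template.predict()
print(first, second)
```

The point of the shape is that feedback from end users changes later predictions without a separate retraining project.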
Example Industrial Machine Learning Application
Inline Pipeline Inspection
Technology ▶ Action
[Figure: inspection image annotated "Crack" + "seam detected"]
Terabytes of inspection data
Aggregate historic data to enable learning from experience
Advanced machine learning generates more accurate insights
Surfaced to analysts to improve performance, drive consistency, & repeatability
Our Goal: drive Zero-Pipeline-Failure
The Power of a 1% Gain in Efficiency
Rail: $27B
Aviation: $30B
Healthcare: $63B
Power: $66B
Oil & Gas: $90B
Source: “Industrial Internet Pushing Boundaries of Minds & Machines” GE, 2012
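The slide's per-sector figures can be totalled directly. The snippet reads them as Rail $27B, Aviation $30B, Healthcare $63B, Power $66B, and Oil & Gas $90B, which is the pairing assumed here from the order in which the amounts and sector names appear.

```python
# Value of a 1% efficiency gain per sector, in billions of dollars.
GAINS_BILLIONS = {
    "Rail": 27,
    "Aviation": 30,
    "Healthcare": 63,
    "Power": 66,
    "Oil & Gas": 90,
}

total = sum(GAINS_BILLIONS.values())
largest = max(GAINS_BILLIONS, key=GAINS_BILLIONS.get)

print(f"Total: ${total}B")        # prints "Total: $276B"
print(f"Largest sector: {largest}")
```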
Thanks! (and yes, we’re hiring…)