Why data science is the new frontier in software development
And why every developer should care
Jeff Prosise
@jprosise
Assertion #1
Being a programmer is like being the god of your own universe
Assertion #2
Your universe is smaller than you think
Assertion #3
AI and ML let you do things that can't be done algorithmically
• Identify spam e-mails
• Identify objects in images
• Translate speech in real time
• Detect faces in images
• Identify people in images
• Analyze sentiment in tweets
• Convert handwriting to text
• Translate text between languages
• Detect credit-card fraud in real time
• Filter adult content from images
• Filter profanity from text
A Brief History of AI
1950 - 1990: Symbolic AI (e.g., chess)
1990 - 2010: AI winter
2010 - Present: Boom driven by affordable GPUs, more data, and algorithmic advances
AI Taxonomy
Artificial Intelligence
Deep Learning
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Machine Learning
Machine Learning
Traditional Programming: Data + Rules → Answers
Machine Learning: Data + Answers → Rules
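The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the spam feature (a count of exclamation marks), the sample data, and the function names are all made up for the example. In the first function a person writes the rule; in the second, the rule (a threshold) is derived from data plus known answers.

```python
# Traditional programming: a person writes the rule by hand.
def is_spam_handwritten(exclamation_count):
    return exclamation_count >= 3  # threshold picked by a human (illustrative)


# Machine learning: data + answers go in, the rule comes out.
def learn_threshold(values, labels):
    """Return the threshold that misclassifies the fewest training examples."""
    best_threshold, best_errors = None, len(values) + 1
    for candidate in sorted(set(values)):
        # Count examples where "value >= candidate" disagrees with the label.
        errors = sum((v >= candidate) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


# Hypothetical training data: exclamation marks per e-mail, and the known answer.
counts = [0, 1, 1, 2, 4, 5, 6, 7]
is_spam = [False, False, False, False, True, True, True, True]

threshold = learn_threshold(counts, is_spam)


def is_spam_learned(exclamation_count):
    return exclamation_count >= threshold


print("learned threshold:", threshold)
```

Real learning algorithms search far richer families of rules than a single threshold, but the inputs and outputs are arranged the same way.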
Building an ML Model
• Import Data
• Clean and Prepare Data
• Train Model (with a Learning Algorithm)
• Score and Evaluate Model
• Tune or replace algorithm
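The workflow above can be walked through end to end with a small, self-contained sketch. Everything here is an assumption made for illustration: the data is synthetic, the learning algorithm is a hand-rolled k-nearest-neighbors classifier, and "tuning" just means trying a few values of k. A real project would more likely use a library such as scikit-learn.

```python
import math
import random


def import_data(n_per_class=40, seed=0):
    """Import: synthetic (x, y, label) rows; None marks a missing reading."""
    rng = random.Random(seed)
    rows = []
    for label, center in (("a", 0.0), ("b", 3.0)):
        for i in range(n_per_class):
            x = rng.gauss(center, 1.0)
            y = rng.gauss(center, 1.0)
            if i % 10 == 0:
                y = None  # simulate dirty data
            rows.append((x, y, label))
    rng.shuffle(rows)
    return rows


def clean(rows):
    """Clean and prepare: drop rows with missing values."""
    return [r for r in rows if r[0] is not None and r[1] is not None]


def split(rows, test_fraction=0.25):
    """Hold back part of the data for scoring."""
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def predict(train, point, k):
    """k-NN: majority label among the k closest training rows."""
    nearest = sorted(train, key=lambda r: math.dist(r[:2], point))[:k]
    votes = [r[2] for r in nearest]
    return max(set(votes), key=votes.count)


def score(train, test, k):
    """Score and evaluate: fraction of held-out rows predicted correctly."""
    hits = sum(predict(train, r[:2], k) == r[2] for r in test)
    return hits / len(test)


rows = clean(import_data())
train, test = split(rows)

# Tune: try several settings of the algorithm and keep the best one.
results = {k: score(train, test, k) for k in (1, 3, 5)}
best_k = max(results, key=results.get)
print("accuracy by k:", results, "best k:", best_k)
```

If no setting scores well enough, the last step of the loop is to replace the algorithm itself and train again.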
• Regression
• Classification
• Support Vector Machines (SVM)
• Decision Trees and Random Forests
• Neural Networks
ML Tools and Libraries
• Scikit-learn and Spark MLlib
• Azure ML Studio and Amazon ML
• TensorFlow, Caffe, Keras, and CNTK*
• MATLAB, Torch, and many more
* Primarily used for deep learning
scikit-learn
Azure ML Studio
• Visual ML modeling (no code)
• Drag-and-drop modules for cleaning, learning, scoring, tuning, and more
• Customizable with R and Python
• Easy operationalization
Azure ML Studio
Deep Learning
Convolutional Neural Network
Generative Adversarial Network
Recurrent Neural Network
• The magic behind computer vision, speech translation, and much more
ConvNets
Input Image (26x26) → Convolution (three 24x24 feature maps) → Pooling (three 6x6 feature maps) → Fully Connected Layers
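The sizes in the diagram (26x26 input, 24x24 after convolution, 6x6 after pooling) are consistent with a 3x3 kernel applied without padding, followed by 4x4 max pooling. Those kernel and pool sizes are an assumption, since the slide only shows the resulting dimensions. A minimal pure-Python sketch that reproduces the shapes, using made-up kernels and a synthetic image:

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a square image (no padding, stride 1).

    As in most deep learning libraries, this is technically cross-correlation.
    """
    k = len(kernel)
    out = len(image) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out)
        ]
        for r in range(out)
    ]


def max_pool(fmap, size):
    """Keep the largest value in each non-overlapping size x size window."""
    steps = range(0, len(fmap) - size + 1, size)
    return [
        [max(fmap[r + i][c + j] for i in range(size) for j in range(size)) for c in steps]
        for r in steps
    ]


def shape(grid):
    return (len(grid), len(grid[0]))


# Synthetic 26x26 "image" and three illustrative 3x3 kernels.
image = [[(r * 26 + c) % 7 for c in range(26)] for r in range(26)]
kernels = [
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],   # responds to vertical edges
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],   # responds to horizontal edges
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],      # unnormalized blur
]

feature_maps = [conv2d_valid(image, k) for k in kernels]
pooled = [max_pool(f, 4) for f in feature_maps]

print([shape(f) for f in feature_maps])  # three 24x24 maps
print([shape(p) for p in pooled])        # three 6x6 maps
```

In a trained ConvNet the kernel values are learned rather than hand-written, and the pooled maps are flattened and fed to the fully connected layers.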
ConvNets in Action
Transfer Learning
• Leverages existing DNNs to achieve acceptable accuracy with far less data and training time
– Adds domain-specific layer(s) to the layers already present in the pretrained model
• Some libraries (e.g., Keras) now include popular pretrained DNNs
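The idea of keeping pretrained layers fixed and training only a new domain-specific layer can be shown with a toy sketch. This is not a real pretrained network: `FROZEN_BASE` is a made-up stand-in for pretrained weights, the data is invented, and the new "head" is a simple logistic-regression layer trained with batch gradient descent. In practice the base would be a pretrained DNN loaded from a library.

```python
import math

# Stand-in for a pretrained network: these weights are never updated.
FROZEN_BASE = ((1.0, -1.0), (-1.0, 1.0))


def base_features(x):
    """Run the frozen layer (linear + ReLU) to turn raw input into features."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_BASE]


def train_head(samples, labels, epochs=200, lr=0.1):
    """Train only the new layer on top of the frozen features."""
    feats = [base_features(x) for x in samples]
    weights, bias = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for f, y in zip(feats, labels):
            z = sum(w * v for w, v in zip(weights, f)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            for i, v in enumerate(f):
                grad_w[i] += (p - y) * v
            grad_b += p - y
        weights = [w - lr * g / len(feats) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(feats)
    return weights, bias


def predict(x, head):
    weights, bias = head
    z = sum(w * v for w, v in zip(weights, base_features(x))) + bias
    return 1 if z > 0 else 0


# A deliberately tiny, hypothetical domain-specific dataset.
samples = [(2, 0), (3, 1), (4, 1), (5, 2), (0, 2), (1, 3), (1, 4), (2, 5)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

head = train_head(samples, labels)
print([predict(x, head) for x in samples])
```

Because the base already produces useful features, the head has only three parameters to learn, which is why a handful of examples is enough here.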
Pretrained ConvNets
• ResNet
• VGGNet
• MobileNet
• Inception/Xception
• And many more…
"Imagine a deep CNN architecture. Take that, double the number of layers, add a couple more, and it still probably isn’t as deep as the ResNet architecture that Microsoft Research Asia came up with in late 2015. ResNet is a new 152 layer network architecture that set new records in classification, detection, and localization through one incredible architecture."
— Adit Deshpande, UCLA
https://github.com/tensorflow/models/tree/master/research/slim#Pretrained
https://github.com/Microsoft/CNTK/blob/master/PretrainedModels/Image.md
https://github.com/caffe2/models
ONNX
• Open Neural Network Exchange
– Format for interchangeable AI models developed by Microsoft, Amazon, and Facebook
– Backed by Intel, AMD, NVIDIA, and others
• https://github.com/onnx/models
TensorFlow
Intelligence as a Service
• Azure Cognitive Services
• AWS AI Services
• Google Cloud AI
• IBM Watson
Azure Cognitive Services
Vision: Computer Vision API | Content Moderator | Custom Vision Service | Emotion API | Face API | Video Indexer
Speech: Speech API | Custom Speech Service | Speaker Recognition API | Translator Speech API
Knowledge: QnA Maker | Custom Decision Service
Language: Bing Spell Check API | Language Understanding (LUIS) | Linguistic Analysis API | Text Analytics API | Translator Text API | Web Language Model API
Search: Bing Autosuggest API | Bing Custom Search API | Bing Image Search API | Bing News Search API | Bing Video Search API | Bing Web Search API
Computer Vision API
Captioning a Photo
POST /vision/v1.0/analyze?visualFeatures=Description HTTP/1.1
Content-Type: application/json
Content-Length: •••
Host: westus.api.cognitive.microsoft.com:443
Ocp-Apim-Subscription-Key: ••••••••••••••••••••••••••••••••

{"url":"https://mypics.com/photos/Dubai.jpg"}
JSON Result
{
  "description": {
    "tags": [ "man", "dune", "riding", "board", "hill", "sand" ],
    "captions": [
      { "text": "a man riding a skateboard in the sand", "confidence": 0.66107721083049154 }
    ]
  },
  "requestId": "03501f93-a0ba-4205-a778-6e02cce2b509",
  "metadata": { "width": 3072, "height": 2304, "format": "Jpeg" }
}
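The request and result above can be handled from Python with only the standard library. This is a sketch under assumptions: the endpoint and API version are copied from the slide and may since have changed, `YOUR_SUBSCRIPTION_KEY` is a placeholder, and the helper names `build_caption_request` and `best_caption` are made up for the example. The request is built but not sent, and the parsing step runs against the JSON shown on the slide.

```python
import json
import urllib.request

# Endpoint taken from the slide; region and version may differ for your account.
ENDPOINT = "https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze"


def build_caption_request(image_url, subscription_key):
    """Build (but do not send) the POST request that asks for a description."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT + "?visualFeatures=Description",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


def best_caption(result):
    """Return (text, confidence) for the highest-confidence caption."""
    top = max(result["description"]["captions"], key=lambda c: c["confidence"])
    return top["text"], top["confidence"]


request = build_caption_request(
    "https://mypics.com/photos/Dubai.jpg", "YOUR_SUBSCRIPTION_KEY"
)

# The JSON result from the slide, used here in place of a live response.
sample_result = json.loads("""
{
  "description": {
    "tags": ["man", "dune", "riding", "board", "hill", "sand"],
    "captions": [
      {"text": "a man riding a skateboard in the sand",
       "confidence": 0.66107721083049154}
    ]
  },
  "requestId": "03501f93-a0ba-4205-a778-6e02cce2b509",
  "metadata": {"width": 3072, "height": 2304, "format": "Jpeg"}
}
""")

text, confidence = best_caption(sample_result)
print(f"{text} ({confidence:.0%} confident)")
```

With a valid key, passing `request` to `urllib.request.urlopen` and decoding the response body with `json.load` would produce a dictionary of the same shape as `sample_result`.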
Custom Vision Service
• Build sophisticated image-classification models backed by deep neural networks
– Get acceptable accuracy with 50 to 100 images
• Build intelligent apps that invoke models through a REST API
• Or export to iOS (CoreML), Android (TensorFlow), or Windows (ONNX) and run locally
Custom Vision
Local Intelligence
• New libraries allow ML models to run on-device
– ML.NET (Microsoft)
– ML Kit (Google)
– Windows ML
• Faster, cheaper, and works without a connection
Are you a hot dog? Or not a hot dog?