Date posted: 07-Aug-2015
Category: Data & Analytics
Uploaded by: fahad-shah
Building a predictive analytics solution with Azure ML
Fidan Boylu Uz, Ph.D.
Syed Fahad Allam Shah, Ph.D.
Data Scientists, Microsoft
Advanced Analytics
Beyond business intelligence
Source: Gartner
[Figure: Gartner analytics chart plotting VALUE against DIFFICULTY, moving from HINDSIGHT through INSIGHT to FORESIGHT and from INFORMATION to OPTIMIZATION, spanning Traditional BI and Advanced Analytics.]
Descriptive Analytics: What happened?
Diagnostic Analytics: Why did it happen?
Predictive Analytics: What will happen?
Prescriptive Analytics: How can we make it happen?
Machine Learning
Computing systems that improve with experience
[Figure: training examples and training labels (digits such as 1 1 5 4 3) feed a machine learning system, which produces an accurate digit classifier; example output: 2.]
Bing Translator App
Predictive Analytics
Predicting future performance from historical data
• Recommendation engines
• Advertising analysis
• Weather forecasting for business planning
• Social network analysis
• IT infrastructure and web app optimization
• Legal discovery and document archiving
• Pricing analysis
• Fraud detection
• Churn analysis
• Equipment monitoring
• Location-based tracking and services
• Personalized insurance
Predictive analytics should address the likelihood of something happening in the future, even if it is just an instant later…
Transformational trends
• Cloud computing: 5x increase from 2011 to 2016
• Emerging data science talent: universities filling a 300,000 US talent gap
• Data explosion: 90% of the data in the world today has been created in the last two years alone
• Connected customers: 1B+, 200M, 10.4M, 160M
The old Advanced Analytics landscape
No improvement in generations
• Expensive: huge set-up costs of tools, expertise, and compute/storage capacity
• Siloed data: siloed and cumbersome data management restricts access to data
• Disconnected tools: complex and fragmented tools limit participation in exploring data and building models
• Deployment complexity: many models never achieve business value due to difficulties with deploying to production
Differentiation
Model Your Way [Data Scientist]
• All Skill Levels
• Business-tested Algorithms
• R & Python
Deploy in Minutes [Data Scientist, IT & Developers]
• One Click Deployment
• Manage via Cloud Portal
• Model accessed as a Web Service
Expand your Reach [Ecosystem & Developers]
• Azure Marketplace
• Global Scale
The Data Science “Inside”
Accessibility
Predictive Maintenance
Demo
Scenario
This is Karl. Karl owns a company that operates vending machines in Washington. His job is to make sure that his 100 vending machines are selling drinks and bringing in revenue. Karl wants revenue to always be high and his business to be profitable.
Sadly, a vending machine will occasionally break and may take up to 7 days to fix, which hurts sales. To limit these losses, Karl must keep operations running and figure out the best way to use his resources to optimize revenue.
Questions & Solutions
1. Which Machines Have Lost Sales?
2. Which Machines Have Failed?
3. Which Machines Will Soon Fail?
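As a rough illustration of the first question, the sketch below flags machines whose recent average sales fell well below their earlier average. The function name, the 7-day window, the 50% threshold, and the sample figures are all hypothetical and are not taken from the demo.

```python
from statistics import mean

def machines_with_lost_sales(daily_sales, recent_days=7, drop_ratio=0.5):
    """Return ids of machines whose recent average daily sales are below
    drop_ratio times their earlier average (thresholds are hypothetical)."""
    flagged = []
    for machine_id, sales in daily_sales.items():
        if len(sales) <= recent_days:
            continue  # not enough history to form a baseline
        baseline = mean(sales[:-recent_days])   # everything before the window
        recent = mean(sales[-recent_days:])     # the last recent_days values
        if baseline > 0 and recent < drop_ratio * baseline:
            flagged.append(machine_id)
    return sorted(flagged)

# Made-up example data: units sold per day, oldest first.
example_sales = {
    "VM-001": [40, 42, 38, 41, 39, 40, 43, 41, 40, 39, 42, 38, 41, 40],
    "VM-002": [35, 36, 34, 37, 35, 36, 34, 0, 0, 0, 0, 0, 12, 30],
}
lost = machines_with_lost_sales(example_sales)
```

A rule like this only answers what already happened; the third question (which machines will soon fail) is the one that calls for a trained predictive model.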
Demo architecture: Advanced Analytics
[Figure: architecture diagram of the demo in the cloud on Microsoft Azure. Stages: Data Collection, Analysis / Storage, Analysis / Management, Consumption. Components: Field Data, API Link, Event Hubs, Stream Analytics, Blob Storage, Data Factory, Azure Machine Learning, Microsoft Azure Portal, Power BI, Excel.]
Advanced analytics architecture
Data to model to web services in minutes
Stages: Data preparation, Modeling, Deployment, Business value
Data
Cloud
• HDFS
• RDBMS
• NoSQL stores
• Blobs and tables
Local
• Desktop files
• Spreadsheets
• Server stores
• Sensors
Storage space
ML Studio: integrated development environment for machine learning
http://studio.azureml.net
API: model is now a web service
Marketplace: monetize this API
Web: apps, dashboards and processes
Services by stage
• Data Factory
• Stream Analytics
• Machine Learning
• HDInsight
• Marketplace
• Azure portal
• Power BI
• Apps
Advanced Analytics Process & Technology (ADAPT)
http://aka.ms/adapt
Establish mechanisms to conduct data science activities end-to-end in the cloud or on premises, friction free.
• Set up a Data Science Environment in the cloud
• Move data from on premise to cloud
• Explore and understand your data
• Build a model with Azure Machine Learning
• Deploy model as web-service and consume it
• End-to-End walkthroughs with real datasets
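To make "deploy model as web-service and consume it" more concrete, here is a minimal sketch of how a client might prepare a scoring request in Python. The payload layout (`Inputs`, `input1`, `ColumnNames`, `Values`, `GlobalParameters`) and the bearer-token header follow the request-response sample code that ML Studio generated for published services, as best recalled, so treat them as assumptions and check the API help page of your own service. The endpoint URL, API key, and column names are placeholders.

```python
import json
import urllib.request

def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) an HTTP POST request for a scoring web service.

    The JSON layout is an assumption modelled on ML Studio sample code."""
    payload = {
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

request = build_scoring_request(
    url="https://example.invalid/execute",  # hypothetical placeholder endpoint
    api_key="YOUR_API_KEY",                 # placeholder, not a real key
    column_names=["passenger_count", "trip_distance"],  # illustrative columns
    rows=[["1", "2.5"]],
)
# To actually call the service, pass `request` to urllib.request.urlopen()
# and parse the response body with json.loads().
```

Keeping request construction separate from sending makes the payload easy to inspect before any real endpoint or key is involved.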
ADAPT Hands On Walkthroughs
Setup Cloud Environment
Load Data
Explore Data
Engineer Features
Sample Data
Build Model
Deploy Model
Consume Model
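The "Sample Data" step matters when the full dataset is too large to explore in memory. One simple approach, sketched here under assumptions, is to stream a CSV file and keep each row with a fixed probability. The function name, the fraction, and the in-memory demo data are hypothetical; the walkthroughs themselves may sample differently.

```python
import csv
import io
import random

def sample_csv(in_file, out_file, fraction, seed=0):
    """Copy the header plus a random subset of rows from in_file to out_file.

    Rows are streamed one at a time, so memory use stays constant.
    Returns the number of data rows kept."""
    rng = random.Random(seed)      # seeded so the sample is reproducible
    reader = csv.reader(in_file)
    writer = csv.writer(out_file)
    writer.writerow(next(reader))  # always keep the header row
    kept = 0
    for row in reader:
        if rng.random() < fraction:
            writer.writerow(row)
            kept += 1
    return kept

# Demo on made-up in-memory data: 1000 rows, keep roughly 10%.
source = io.StringIO("id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(1000)))
target = io.StringIO()
kept_rows = sample_csv(source, target, fraction=0.1)
sampled_lines = target.getvalue().splitlines()
```

With real files, open both with `newline=""` as the `csv` module documentation recommends.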
Today: Hands On
New York City Taxi Dataset
• All taxi trips and fares paid for trips in NYC from January 2013 to December 2013.
• ~20GB of compressed CSV data (~48GB uncompressed), more than 173 million rows.
• Each record includes pickup and drop-off location and time, an anonymized hack license number, and a medallion number (i.e. the taxi's unique id number).
• The trip_data CSV contains trip details, and the trip_fare CSV contains details of the fare paid.
• The unique key to join trip_data and trip_fare is: medallion, hack_license, and pickup_datetime.
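The three-column join key can be illustrated with a small in-memory join. This is a sketch for a sampled subset that fits in memory, not for the full 173 million rows. It assumes the CSV headers spell the second key column `hack_license`, and the sample rows and the `trip_distance` and `fare_amount` columns are made up for illustration.

```python
import csv
import io

# Composite key shared by both files (column spelling is an assumption).
JOIN_KEY = ("medallion", "hack_license", "pickup_datetime")

def join_trip_and_fare(trip_file, fare_file):
    """Inner-join trip rows with fare rows on the composite key."""
    # skipinitialspace=True ignores a space that follows a comma,
    # so a header like ", hack_license" is read as "hack_license".
    fare_reader = csv.DictReader(fare_file, skipinitialspace=True)
    fares = {tuple(row[k] for k in JOIN_KEY): row for row in fare_reader}
    joined = []
    for trip in csv.DictReader(trip_file, skipinitialspace=True):
        fare = fares.get(tuple(trip[k] for k in JOIN_KEY))
        if fare is not None:
            joined.append({**trip, **fare})  # merge columns of both rows
    return joined

# Made-up sample rows, only to show the mechanics of the join.
trip_csv = io.StringIO(
    "medallion,hack_license,pickup_datetime,trip_distance\n"
    "M1,H1,2013-01-01 00:00:00,1.2\n"
    "M2,H2,2013-01-01 00:05:00,3.4\n"
)
fare_csv = io.StringIO(
    "medallion, hack_license, pickup_datetime, fare_amount\n"
    "M1,H1,2013-01-01 00:00:00,6.5\n"
)
rows = join_trip_and_fare(trip_csv, fare_csv)
```

Only the trip with a matching fare record survives the inner join; at full scale the same join would normally run in a database or a distributed engine.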
Let’s get going!
Demo