Post on 01-Jan-2017
Copyright © 2012, SAS Institute Inc. All rights reserved.
INTRODUCTION TO DATA MINING
SAS® ENTERPRISE MINER™
Mary-Elizabeth (“M-E”) Eddlestone
Principal Systems Engineer, Analytics
SAS Customer Loyalty, SAS Institute, Inc.
AGENDA
• Overview/Introduction to Data Mining and Predictive Modeling
• Building Models Using SAS® Enterprise Miner™
  • Walk through example
  • Essential steps: Sample, Explore, Modify, Model, Assess, Score
  • Show selection of tools, how to change their properties and surface results
• Building Automated Models using Excel or SAS® Enterprise Guide (Rapid Predictive Modeler)
INTRODUCTION TO DATA MINING
DATA MINING GOALS
[Figure: "Better Decisions" at the center, surrounded by Agile or Dynamic, Improved Profitability, Insight, Precision, Speed, and Personalization]
ANALYTICS: INFERENTIAL
Inferential Statistics
Uses patterns in the sample data to draw inferences about the population represented, accounting for randomness:
• Answering yes/no questions about the data (hypothesis testing)
• Describing associations within the data (correlation)
• Modeling relationships within the data (regression)
Source: Wikipedia
ANALYTICS: PREDICTIVE
Predictive Analytics
Encompasses a variety of techniques from statistics, modeling, machine learning, and data mining that analyze current and historical facts to make predictions about future, or otherwise unknown, events. These include:
• Data Mining
• Forecasting
Source: Wikipedia
ANALYTICS: DATA MINING VERSUS FORECASTING
• DATA MINING
  • Time independent
  • Causal (relationship) focused
  • Categorical, Continuous, Discrete
  • Seldom weights more recent observations
• FORECASTING
  • Time dependent
  • Interval oriented
  • Continuity assumed
  • Frequently weights more recent phenomena
Both are predictive and both model past behavior.
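The difference in how the two approaches weight recent observations can be sketched as follows. This is a minimal illustration and not taken from the slides: the series, the smoothing factor `alpha`, and the function names are all hypothetical. A plain mean treats every observation equally, while simple exponential smoothing, one common forecasting technique, gives newer observations more weight.

```python
def equal_weight_mean(values):
    """Every observation counts the same, regardless of when it occurred."""
    return sum(values) / len(values)


def exponential_smoothing(values, alpha):
    """Simple exponential smoothing: each update blends the newest value
    with the running level, so older observations fade geometrically."""
    level = values[0]
    for value in values[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly sales, oldest first, with a recent upward shift.
sales = [100, 100, 100, 100, 140, 150]

flat = equal_weight_mean(sales)
smoothed = exponential_smoothing(sales, alpha=0.5)
print(f"equal-weight mean: {flat:.1f}")
print(f"exponentially smoothed level: {smoothed:.1f}")
```

With this made-up series the smoothed level sits closer to the latest values than the plain mean does, which is the behavior the forecasting column describes.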
DATA MINING
• Descriptive Data Mining
  • Clustering (Segmentation)
  • Associations and Sequences
• Predictive Data Mining
  • Classification Models to predict class membership
  • Regression Models to predict a number
THE GOAL? SCORING!
• Scoring is the act of applying what we’ve learned from data mining to new cases.
• Keep this goal in mind and use it to help formulate the questions and the data needed for data mining and scoring.
THE ULTIMATE GOAL? BETTER DECISIONS
• The ultimate goal of data mining is to improve decision making.
• As you formulate your problem, also keep in mind how and when model scores will be used.
EXAMPLE: DEVELOPING A CLASSIFICATION MODEL
• Models are developed using historical data in which the behavior is observed or known.
• Information about each subject (in this case an individual), such as age, gender, and previous behaviors, is used as input to the model to see how well the model can distinguish between the people who exhibit the behavior and those who do not.
Indicates the behavior was observed in this subject
EXAMPLE DATA
WHY?
• Consider a group of subjects whose relevant behavior is unknown.
• The same information is available for each of these subjects (age, gender, etc.) as is available for the individuals with known behavior.
• We would like to know which individuals are most likely to have the relevant behavior.
EXAMPLE NEW DATA
?
SCORING
• The output of a predictive classification model is typically an equation. Models are applied to new cases to calculate the predicted behavior through a process called scoring.
• Scoring, using the equation, calculates each subject’s likelihood of having the relevant behavior. (It also calculates the likelihood of not having the behavior.)
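As a rough sketch of what scoring with an equation can look like, the example below applies a logistic equation to two new cases. The slides do not say which kind of equation is used, so the logistic form is an assumption, and the intercept, coefficients, and input names (`age`, `prior_purchases`) are hypothetical values chosen for illustration rather than output from a real model.

```python
import math

# Hypothetical coefficients standing in for the equation that a fitted
# classification model would produce (assumed, not from the slides).
INTERCEPT = -3.0
COEFFICIENTS = {"age": 0.04, "prior_purchases": 0.5}


def score(subject):
    """Return (P(behavior), P(no behavior)) for one subject,
    assuming a logistic equation."""
    linear = INTERCEPT + sum(
        COEFFICIENTS[name] * subject[name] for name in COEFFICIENTS
    )
    p_yes = 1.0 / (1.0 + math.exp(-linear))
    return p_yes, 1.0 - p_yes


# New cases: the inputs are known, the behavior is not.
new_cases = [
    {"id": 1, "age": 25, "prior_purchases": 0},
    {"id": 2, "age": 50, "prior_purchases": 4},
]

scored = [
    dict(case, p_yes=score(case)[0], p_no=score(case)[1]) for case in new_cases
]

# List the subjects most likely to exhibit the behavior first.
for row in sorted(scored, key=lambda r: r["p_yes"], reverse=True):
    print(row["id"], round(row["p_yes"], 3), round(row["p_no"], 3))
```

Each scored row carries both likelihoods, which sum to one, so the subjects can be ranked by how likely they are to exhibit the behavior.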
EXAMPLE SCORED DATA
THE ANALYTICS LIFECYCLE
Steps in the cycle:
• Identify / Formulate Problem
• Data Preparation
• Data Exploration
• Transform & Select
• Build Model
• Validate Model
• Deploy Model
• Evaluate / Monitor Results
Roles:
• BUSINESS MANAGER: Domain Expert, Makes Decisions, Evaluates Processes and ROI
• IT SYSTEMS / MANAGEMENT: Data Preparation, Model Validation, Model Deployment, Model Monitoring
• BUSINESS ANALYST: Data Exploration, Data Visualization, Report Creation
• DATA MINER / STATISTICIAN: Exploratory Analysis, Descriptive Segmentation, Predictive Modeling
MAIN TYPES OF DATA MARTS
One-Row-per-Subject Data Mart
Multiple-Row-per-Subject Data Mart
Longitudinal Data Mart
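The slide only names these structures. As a hedged illustration of how the first two relate, the sketch below aggregates a multiple-row-per-subject table (one row per transaction) into a one-row-per-subject table (one summary row per customer). The field names, the data, and the function `to_one_row_per_subject` are hypothetical.

```python
from collections import defaultdict

# Hypothetical multiple-row-per-subject data: one row per transaction.
transactions = [
    {"customer_id": "A", "amount": 20.0},
    {"customer_id": "B", "amount": 5.0},
    {"customer_id": "A", "amount": 35.0},
    {"customer_id": "A", "amount": 10.0},
]


def to_one_row_per_subject(rows):
    """Aggregate transaction rows into one summary row per customer."""
    totals = defaultdict(lambda: {"n_transactions": 0, "total_amount": 0.0})
    for row in rows:
        summary = totals[row["customer_id"]]
        summary["n_transactions"] += 1
        summary["total_amount"] += row["amount"]
    # Sort by customer id so the output order is stable.
    return [
        {"customer_id": cid, **summary} for cid, summary in sorted(totals.items())
    ]


one_row_mart = to_one_row_per_subject(transactions)
for row in one_row_mart:
    print(row)
```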
THE ANALYTICS LIFECYCLE
DATA MINER / STATISTICIAN: Exploratory Analysis, Descriptive Segmentation, Predictive Modeling
SAS® Enterprise Miner™ focuses on these aspects of the process.
SAS® ENTERPRISE MINER™
• Organized and logical GUI for data mining success
• Unmatched suite of modeling techniques and methods
• Sophisticated set of data preparation, summarization and exploration tools
• Business-based model comparisons, reporting and management
SAS® ENTERPRISE MINER™
• Automated scoring process delivers faster results
• High-performance grid-enabled workbench
• Modern, distributable data mining system suited for large enterprises
• Open, extensible design for ultimate flexibility
WHAT IS SAS® ENTERPRISE MINER™?
• SAS Enterprise Miner is a sophisticated graphical user interface, designed with the specific needs of data miners in mind.
• SAS Enterprise Miner is a data miner’s workbench that manages the process and provides a comprehensive set of tools to aid the data miner throughout the essential steps, known by the acronym SEMMA: Sample, Explore, Modify, Model, Assess.
• SAS Enterprise Miner streamlines the data mining process to create highly accurate predictive and descriptive models based on analysis of vast amounts of data from across an enterprise.
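To give a feel for one activity that typically belongs to the Sample step, the sketch below randomly partitions a set of subjects into training and validation sets. In SAS Enterprise Miner this kind of step is configured through the GUI and is not coded this way; the function name `partition`, the 70/30 split, and the fixed seed are assumptions made purely for illustration.

```python
import random


def partition(rows, train_fraction=0.7, seed=42):
    """Randomly split rows into a training set and a validation set."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical subject identifiers.
subject_ids = list(range(1, 11))
train, validation = partition(subject_ids)
print(len(train), len(validation))
```

Holding back a validation set gives the later Assess step data that the model did not see while it was being built.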
DATA MINING WITH SAS® ENTERPRISE MINER™
SAS® ENTERPRISE MINER™ 7.1 AND 12.1
MODEL DEVELOPMENT PROCESS (SEMMA)
Sample Explore Modify Model Assess
Utility
SAS® ENTERPRISE MINER™
• Use the desired tools to define a logical process (SEMMA)
Sample Explore Modify Model Assess
• Modify settings (properties) for the tools.
• Run the flow and check results. Refine as needed.
DEMONSTRATION
AUTOMATED PREDICTIVE MODELING
SAS RAPID PREDICTIVE MODELER
KEY DRIVERS (BUSINESS USERS)
• Need to generate numerous models to solve a variety of business problems in a credible manner
• Models need to be developed in a quick time-frame using a self-service approach
• Do not always want to rely on analytic professionals (e.g., statisticians, modelers, or data miners)
SAS RAPID PREDICTIVE MODELER
KEY DRIVERS (ANALYTIC PROFESSIONALS)
• Solve the more complex issues at hand to gain incremental value
• Further customize or refine models for better results
RAPID PREDICTIVE MODELER
Open your data in SAS® Enterprise Guide or Microsoft Excel
Use the Rapid Predictive Modeler task and modify settings
Review results
Microsoft Excel
SAS Enterprise Guide
RAPID PREDICTIVE MODELER: BASIC
RAPID PREDICTIVE MODELER: INTERMEDIATE
RAPID PREDICTIVE MODELER: ADVANCED
RAPID PREDICTIVE MODELER: SAMPLE OUTPUT
DEMONSTRATION
IN CONCLUSION
SAS® ENTERPRISE MINER™ BENEFITS
• Support the entire data mining process with a broad set of tools.
• Build more models faster with an easy-to-use Graphical User Interface.
• Enhance accuracy of predictions.
• Surface business information and easily share results through the unique model repository.
RESOURCES
• SAS Rapid Predictive Modeler Website
  • Product brief, Press release, Brief product demo, etc.
• SAS Enterprise Miner Web Site
• SAS Enterprise Miner Technical Support Web Site
• SAS Enterprise Miner Technical Forum (Join Today!)
• SAS Enterprise Miner Training
• “Rapid Predictive Modeling for Customer Intelligence”
  • SAS Global Forum 2010 paper written by Wayne Thompson and David Duling, SAS Institute Inc., Cary, NC
POTENTIAL NEXT STEPS
• Work through the example in “Getting Started with SAS® Enterprise Miner™”. Both the data and the documentation are available on support.sas.com: http://support.sas.com/documentation/onlinedoc/miner/
• Contact SAS Technical Support if you get stuck
  • There is no charge for this; it is included in your SAS software license.
www.SAS.com
THANK YOU FOR USING SAS!