Date post: 07-Aug-2015
Category: Technology
Upload: hyderabad-scalability-meetup
What we will cover in the 60 mins
1. Introduction to business signals
2. Use Cases for Signals
3. Signal Extraction : Deep Dive
4. R Introduction & Common Commands
5. Q & A
What exactly is a Signal ?
• A Signal is a pattern
• It is indicative of an impending business outcome
• For example
• In Telecom, billing resolution errors are a signal for customer churn
• In Retail, search frequency is a signal of purchase intent
• In Healthcare, a decrease in the time between hospital visits is a signal of a medical condition
• Early warning sign - Detection of the signal gives time for the business to intervene and influence an outcome
4 Business Signals in Banking Industry
1. Large balance is a signal for mutual fund product
2. Frequent bottoming is a signal for a loan product
3. Repetitive transmission before 5th of every month is a signal for loan refinance
4. Increase in frequency of delayed loan payments is a signal for PD ( probability of default )
3 Business Signals in Retail Industry
1. Downloading a Digital/Mobile coupon request is a demand signal for product
2. Searching for a store location is a proxy for expected demand at store
3. Season change is a demand signal for certain types of products
4 Business Signals in Telecom industry
1. Identity management has no login event but the application log records one. This mismatch is a signal for a security event
2. 5 dropped calls in the last 3 weeks is a signal for churn
3. Data pack limit exceeded 40 % of the time in the last 6 months is a signal for upgrade
4. Billing related tweet frequency is a signal for churn
Is there a method to the madness ? Methodology for Signal extraction
1. Biz problem to solve
2. Data model
3. Analytical Model
4. Univariate
5. Correlations
6. Cross tab
7. Model building
8. Business narrative
9. Action and ROI
Step-1 : Business problem to solve
• I am an IT company focusing on services
• I have 300,000 employees globally
• My business model is dependent upon people
• How do I reduce attrition in my company ?
• Powerful Unanswered Questions
• Can I give an attrition score for every employee ?
• Which factors are the top 3 drivers of attrition in my company ?
Step-2 : Data Model
1. Employee id
2. Employee name
3. Tenure
4. Niche skill flag
5. Appraisal rating
6. Change in appraisal
7. Salary change
8. Relative Peer group benchmark index
9. Relative Market salary benchmark index
10. Manager
11. Project type ( support / dev / maint )
12. Technology ( mainframe
13. SMAC flag – yes/no ( Social / Mobile / Analytics / Cloud )
14. Has been abroad
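As a rough sketch, a data model like the one above could be held in an R data frame. The column names and every value below are hypothetical and only illustrate the shape of the data; only a few of the listed attributes are shown.

```r
# Hypothetical employee records: all names and values are made up for illustration.
employees <- data.frame(
  employee_id       = c(101, 102, 103, 104),
  employee_name     = c("A", "B", "C", "D"),
  tenure_years      = c(1.5, 4.0, 7.2, 2.3),
  niche_skill       = c(TRUE, FALSE, FALSE, TRUE),
  appraisal_rating  = c(3, 4, 2, 5),
  salary_change_pct = c(5, 8, 0, 12),
  project_type      = factor(c("support", "dev", "maint", "dev"),
                             levels = c("support", "dev", "maint")),
  smac_flag         = c("yes", "no", "no", "yes"),
  has_been_abroad   = c(FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors  = FALSE
)

# Show the columns, their types and a few sample values.
str(employees)
```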
Step-3 : Analytical Model
1. Scoring Model
2. Logistic regression
3. Decision tree
4. Support Vector Machine
Step-4 : Univariate Analysis
1. Trends
2. Seasonality
3. Distributions
4. Min/Max/Median/Average
5. Outlier detection
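A minimal sketch of the summary statistics and outlier detection items, using base R on one numeric column. The tenure values are hypothetical, and the 1.5 x IQR rule is only one common convention for flagging outliers, not necessarily the one used in the talk. Trends and seasonality need time-stamped data and are not covered here.

```r
# Hypothetical tenure values in years; the last one is deliberately extreme.
tenure <- c(1.5, 4.0, 7.2, 2.3, 3.1, 2.8, 25.0)

# Min, quartiles, median, mean and max in one call.
summary(tenure)
tenure_median <- median(tenure)

# Flag outliers with the 1.5 x IQR rule (an assumed convention).
q <- quantile(tenure, c(0.25, 0.75), names = FALSE)
iqr <- IQR(tenure)
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr
outliers <- tenure[tenure < lower_fence | tenure > upper_fence]
outliers
```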
Step-5 : Correlation Analysis
1. Between numeric outcome and numeric predictors
2. Examine correlation coefficient
3. If correlation coefficient > 0.6 consider it as a potential input to the model
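The screening step above might look like this in base R. The data frame, its column names and its values are hypothetical. The sketch compares the absolute value of the coefficient with 0.6 so that strong negative relationships are also kept; that is an assumption on top of the rule stated on the slide.

```r
# Hypothetical numeric outcome and predictors.
employees <- data.frame(
  attrition_score = c(0.9, 0.8, 0.75, 0.6, 0.5, 0.4, 0.3, 0.1),
  salary_change   = c(2, 4, 5, 7, 8, 10, 12, 15),
  tenure          = c(3, 1, 4, 1, 5, 9, 2, 6)
)

predictors <- c("salary_change", "tenure")

# Pearson correlation of each predictor with the outcome.
r <- sapply(employees[predictors],
            function(x) cor(x, employees$attrition_score))
r

# Keep predictors whose correlation is stronger than 0.6 in absolute value
# (the absolute value is an assumption, see the note above).
candidates <- names(r)[abs(r) > 0.6]
candidates
```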
Step-6 : Crosstab Analysis
1. Find anomalies in distribution
2. For example, is the number of churners in the 30-35 group in Bangalore higher than what is normally expected ?
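A small sketch of such a crosstab with base R's table and prop.table. The data frame and all of its values are hypothetical; the idea is to compare the churn rate inside one cell (30-35 in Bangalore) with the overall churn rate.

```r
# Hypothetical records: age group, city and whether the person churned.
employees <- data.frame(
  age_group = c("25-30", "30-35", "30-35", "30-35",
                "25-30", "30-35", "25-30", "30-35"),
  city      = c("Bangalore", "Bangalore", "Bangalore", "Hyderabad",
                "Hyderabad", "Bangalore", "Bangalore", "Hyderabad"),
  churned   = c("no", "yes", "yes", "no", "no", "yes", "no", "no")
)

# Crosstab of age group against churn, for Bangalore only.
bangalore <- subset(employees, city == "Bangalore")
counts <- table(bangalore$age_group, bangalore$churned)
counts

# Row-wise proportions: churn rate within each age group.
rates <- prop.table(counts, margin = 1)

# Baseline to compare against: churn rate across all records.
overall_rate <- mean(employees$churned == "yes")
rates["30-35", "yes"] > overall_rate
```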
Step-7 : Model Building
1. Identify a technique like logistic regression
2. Build the model
3. Examine the statistical significance / quality of the model
4. Examine the predictive power of the input variables
5. Iterate ! Iterate ! Iterate
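As an illustration of the build and examine steps, here is a minimal logistic regression sketch using glm from base R. The data is simulated, the variable names are hypothetical, and the simulated relationship (attrition falling as salary change rises) is an assumption made only so the example has something to find.

```r
set.seed(1)
n <- 300

# Simulated, hypothetical inputs.
salary_change <- runif(n, 0, 15)   # percent
tenure        <- runif(n, 0, 10)   # years

# Assumed ground truth: higher salary change lowers the chance of leaving.
p_leave <- plogis(1.5 - 0.3 * salary_change)
left    <- rbinom(n, 1, p_leave)

employees <- data.frame(left, salary_change, tenure)

# Build the model.
model <- glm(left ~ salary_change + tenure,
             data = employees, family = binomial)

# Examine coefficients, standard errors and p-values.
summary(model)

# Attrition score for every employee: predicted probability of leaving.
employees$attrition_score <- predict(model, type = "response")
head(employees)
```

The p-values in the summary table indicate which inputs carry predictive power; inputs that do not would be candidates to drop in the next iteration.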
Step-8 : Biz Narrative
1. Convert statistical model into a business narrative
Step-9 : Actions and ROI
1. Identify specific outbound actions to trigger in response to signal detected
2. Examine the ROI of the actions
3. Recalibrate
Intro to R
• R is a scripting language for statistical + graphical analysis
• R was initially written by Ross Ihaka and Robert Gentleman at the Department of Statistics of the University of Auckland, New Zealand, during the 1990s
Step-1 : Download R
• http://cran.r-project.org/bin/windows/base/
• Please create a directory called /dataproject
Step-2 : How do we find out where we are ? “getwd”
• Get the current working directory
• Create a separate folder to hold all training data on your machine
Step-3 : How do we set the working directory ? “setwd”
• Change current working directory using setwd command
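A short sketch of getwd and setwd together. To keep it runnable anywhere, it creates a "dataproject" folder under R's temporary directory; that location is an assumption made for the example, and on your machine you would point setwd at the data directory you created.

```r
# Where are we now?
original_dir <- getwd()
original_dir

# Create a project folder (under the temp directory for this sketch) and move into it.
project_dir <- file.path(tempdir(), "dataproject")
dir.create(project_dir, showWarnings = FALSE)
setwd(project_dir)
getwd()

# Go back to where we started.
setwd(original_dir)
```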
Step-3 : How do we load some data into R ? “read”
• You can also give the full path if required along with the file name
[Screenshot: a read command with callouts for the name of the R file handler, the read function, and the name of the data file to load]
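A minimal sketch of loading a CSV file with read.csv. The file name employees.csv, its columns and its values are hypothetical; the file is written to a temporary location first so that the example runs on its own.

```r
# Write a small hypothetical CSV so there is something to load.
csv_path <- file.path(tempdir(), "employees.csv")
writeLines(c("employee_id,tenure,city,churned",
             "101,1.5,Bangalore,yes",
             "102,4.0,Hyderabad,no",
             "103,7.2,Bangalore,no"),
           csv_path)

# employees is the file handler, read.csv is the read function,
# csv_path is the data file to load (full path along with the file name).
employees <- read.csv(csv_path)

# Typing the name of the file handler prints the loaded data.
employees
```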
Step-4 : How do we see data we just loaded into R ?
• Type in the name of the file pointer to see the loaded data
Step-2.1 : Analysing meta model
How many rows + columns are present in the imported data ?
• “dim” shows the number of rows and columns in the imported dataset
Step-2.2 : Analysing meta model ( cont …)
What are the columns in the imported dataset ?
• “names” shows the column names
Step-2.3 : Analysing meta model ( cont …)
What are the columns, data types + sample values in the imported dataset ?
• “str” shows the depth of observations and breadth of variables used along with sample values
Step-2.4 : Analysing meta model ( cont …)
What are the columns + class + # of observation rows in the imported dataset ?
• “str” shows the depth of observations and breadth of variables used along with sample values
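The three inspection commands can be tried on any data frame. The employees data frame below is a hypothetical stand-in for the imported dataset.

```r
# Hypothetical stand-in for the imported dataset.
employees <- data.frame(
  employee_id = c(101, 102, 103),
  tenure      = c(1.5, 4.0, 7.2),
  city        = c("Bangalore", "Hyderabad", "Bangalore")
)

# Number of rows and columns.
dim(employees)

# Column names.
names(employees)

# Columns, data types, number of observations and sample values.
str(employees)
```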
Step-3 : How to extract a subset of data based on conditions? “subset”
• 3 Key elements
1. subset command
2. file pointer
3. condition clause
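A sketch of the three elements in one call: the subset command, the file pointer and the condition clause. The employees data frame and its values are hypothetical.

```r
# Hypothetical data.
employees <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  tenure      = c(1.5, 4.0, 7.2, 2.3, 6.0),
  city        = c("Bangalore", "Hyderabad", "Bangalore", "Bangalore", "Hyderabad")
)

# subset(<file pointer>, <condition>)
bangalore <- subset(employees, city == "Bangalore")
bangalore
```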
Step-4 : How to extract a subset of data based on MULTIPLE conditions?
• Combine conditions with logical operators like & ( and ), | ( or ) etc
• 5 Key elements
1. subset command
2. file pointer
3. condition-1
4. separator
5. condition-2
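A sketch of two conditions joined by a separator, first with & and then with |. The employees data frame and its values are hypothetical.

```r
# Hypothetical data.
employees <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  tenure      = c(1.5, 4.0, 7.2, 2.3, 6.0),
  city        = c("Bangalore", "Hyderabad", "Bangalore", "Bangalore", "Hyderabad")
)

# Both conditions must hold ( & ).
bangalore_experienced <- subset(employees, city == "Bangalore" & tenure > 2)
bangalore_experienced

# Either condition may hold ( | ).
hyderabad_or_senior <- subset(employees, city == "Hyderabad" | tenure > 7)
hyderabad_or_senior
```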
Step-6 : How to apply conditional statements to get count of observations matching criteria ? “table”
• The condition passed to “table” works like a WHERE clause
• The output gives the number of observations which matched the condition
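A sketch of counting matches with table. Passing a condition to table returns how many observations evaluate to FALSE and how many to TRUE. The employees data frame, its values and the threshold of 3 years are hypothetical.

```r
# Hypothetical data.
employees <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  tenure      = c(1.5, 4.0, 7.2, 2.3, 6.0)
)

# Count observations by whether they match the condition.
counts <- table(employees$tenure > 3)
counts

# Number of observations which matched the condition.
matched <- counts[["TRUE"]]
matched
```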
Step-7 : How to find the median from a range of observations? “median”
• Pass the column on which to find the median, qualified by the file pointer
• The output is the median
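A sketch of median on one column, addressed as file pointer, then $, then the column name. The employees data frame and its values are hypothetical.

```r
# Hypothetical data.
employees <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  tenure      = c(1.5, 4.0, 7.2, 2.3, 6.0)
)

# median(<file pointer>$<column>)
tenure_median <- median(employees$tenure)
tenure_median
```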
Parting thought
• Industrial revolution : Oil powered machines
• Services revolution : Analytics powers processes
Big Data Resources
• datasciencecentral.com
• bigdatauniversity.com
• coursera.org