
Dissertation Presentation Bidyut Mondal

Date post: 30-Jan-2016
Page 1: Dissertation Presentation Bidyut Mondal

Decision Making using Big Data Analytics in International Business

Presenter: Bidyut Kumar Mondal, Roll – 5, MBA (IB) 2012 - 15

Under the supervision of Prof. Dr. P. K. Das

IIFT (Kolkata Campus)

Page 2: Dissertation Presentation Bidyut Mondal

Agenda

1 Background
2 Objectives
3 Perspective on Big Data
4 Repository of Analytical Tools
5 Repository of Big Data Techniques
6 An Application to Credit Risk Modeling
7 Binary Logistic Regression
8 Research Methodology
9 Results & Interpretation
10 Conclusion

Page 3: Dissertation Presentation Bidyut Mondal

Background

Recent trend towards a data-driven industry.

A huge volume of data is being generated every day.

The issue is how to store and analyze the data to get information.

So, big data analytics came into existence.

Organizations utilizing the power of big data are ahead of the competition.

Big data will change the way people live.

Page 4: Dissertation Presentation Bidyut Mondal

Objectives

To develop a repository of analytical tools appropriate for real-life problem solving in different sectors.

To study the use of big data analytics in different domains for taking international business decisions.

To apply appropriate classification techniques to build a model that classifies loan defaulters, using secondary big data.

Page 5: Dissertation Presentation Bidyut Mondal

Big data – a perspective

Page 6: Dissertation Presentation Bidyut Mondal

Repository of Analytical Techniques

Item | Area | Statistical Method | Data Requirement
1 | Market Segmentation | Cluster Analysis | Buy & sell data over a long period
2 | Purchase Intention | Factor Analysis | Survey to get a rating of each product attribute
3 | Churn Analysis | Binary Logistic Regression | Instances of customers who left and who stayed with the service/organization
4 | Credit Default Probability | Binary Logistic Regression | Data with instances of both default and non-default
5 | Group belongingness | Discriminant Analysis | Data with instances of persons who belong to a group and who do not
6 | Probability of Disease of a group | Binary Logistic Regression | Data with instances of persons who have the disease and who do not
7 | Calculate Price Elasticity | Regression Analysis | Price of a product at different times and sales of the product at those times
8 | Calculate Productivity of Employees | ANOVA | Data on employee output under different work conditions
9 | Find out brand positioning/product positioning | Multidimensional scaling | Customer rating of similarity for each pair of products or brands on a 7-point Likert scale
10 | Lost Sales Analysis | Binary Logistic Regression | Data with instances of bids that resulted in sales and bids that did not convert into sales
11 | Demand Forecasting | Time Series Forecasting | Historical demand data covering more than 10 previous years
12 | New Product Design | Conjoint Analysis | Preference rank data for each attribute of the product, taken from the respondents
13 | Quality Control | Hypotheses Testing | A random sample is drawn from the production floor and a Z test or t test is applied to the sample
14 | Customer Loyalty Analysis | Regression Analysis | Data collected from sample respondents on how satisfied they are with the product and how long they have been buying it

Page 7: Dissertation Presentation Bidyut Mondal

Repository of Big Data Techniques

Item | Data Pattern | Big Data Analysis | Business Area | Analysis Tool
1 | Customer activity-based data such as website tracking history, purchase data, call centre data, mobile data etc. | Predictive Analysis | Segmentation | Cluster Analysis
2 | User online profile data and their online purchase history and pattern | Predictive Analysis | Digital Marketing | Factor Analysis
3 | Customer's footprints in the network: clicks, browsing, comments, reviews etc. | Predictive Analysis | Purchase Intention | Binary Logistic Regression
4 | Customer product/service usage pattern data and customer demography data | Predictive Analysis | Churn Analysis | Binary Logistic Regression
5 | Bank and financial institution data about loans and their current status, along with customer demography | Predictive Analysis | Credit Default Modelling | Binary Logistic Regression
6 | Historical health parameter data of animals in a dairy firm | Predictive Analysis | Agriculture | Discriminant Analysis
7 | Historical data received from the GPS tracker of consignments in shipment about their location and condition | Predictive Analysis | Logistics | Discriminant Analysis
8 | Data on customer buying and clicking patterns during different cultural festivals, from an online retail website | Predictive Analysis | Retail | Classification Techniques
9 | Customer purchase data where the customers are provided with facilities like a bonus card | Predictive Analysis | Retail | Regression Analysis
10 | Patient health data and their track record of disease | Predictive Analysis | Healthcare | Binary Logistic Regression
11 | Historical data of marketing expenses and the demand in that period, for several years | Predictive Analysis | CRM | Multiple Regression
12 | Customers' spending, usage and other behaviour exhibited in a retail shop | Predictive Analysis | Marketing (Cross Sell) | Multiple Regression
13 | Historical demand data at store level and inventory level | Predictive Analysis | Retail (Inventory Requirement) | Time Series Forecasting
14 | Historical data of risk and return of a portfolio | Predictive Analysis | Finance | Regression Analysis
15 | Historical data of unemployment of a country | Predictive Analysis | Economics | Time Series Forecasting
16 | Different documents and their keywords, captured while uploading the documents to an online website | Predictive Analysis | Web Publishing | Discriminant Analysis

Page 8: Dissertation Presentation Bidyut Mondal

Big data – Case Studies

Agriculture: Texan Dairy (Case – Cattle Health)

Logistics: DHL (Case – Predictive Analysis)

Online Retail: Amazon (Case – Predictive Shipment)

Retail: Walmart (Case – Customer Loyalty)

Healthcare: CCHHS (Case – Disease Prediction)

Page 9: Dissertation Presentation Bidyut Mondal

An Application to Credit Risk Modelling

Is it possible to predict whether a customer is likely to default on the loan before it is sanctioned?

Lowering NPA

Increase Customer Base

Page 10: Dissertation Presentation Bidyut Mondal

Binary Logistic Regression - Variables

Dependent Variable - Dichotomous

Independent Variable – Categorical or numerical

Independent Variable – Categorical variables need coding
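
The coding step for categorical predictors can be sketched as dummy (indicator) coding. The slide does not say which coding scheme was used, so this is only an illustration under that assumption; the function name `dummy_code` and the sample values (named after the three loan segments on the Data slide) are hypothetical.

```python
def dummy_code(values, drop_first=True):
    """Turn a list of category labels into 0/1 indicator columns."""
    levels = sorted(set(values))
    if drop_first:
        # The first level (alphabetically) becomes the reference category.
        levels = levels[1:]
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Hypothetical loan-purpose values, not taken from the dissertation data.
purpose = ["credit_card", "home_loan", "debt_consolidation", "credit_card"]
coded = dummy_code(purpose)
for level, column in coded.items():
    print(level, column)
```

Dropping one level avoids perfect collinearity between the indicator columns and the intercept, which matters for the multicollinearity point on the Assumptions slide.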

Page 11: Dissertation Presentation Bidyut Mondal

Binary Logistic Regression - Assumptions

Logistic regression does not rely on distributional assumptions in the same sense that discriminant analysis does.

However, your solution may be more stable if your predictors have a multivariate normal distribution.

Additionally, as with other forms of regression, multicollinearity among the predictors can lead to inflated standard errors.

The procedure is most effective when group membership is a truly categorical variable.

Page 12: Dissertation Presentation Bidyut Mondal

Binary Logistic Regression - Odds

Odds Ratio

Log of Odds Ratio
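
The formulas on this slide did not survive in the transcript. The standard textbook definitions are given here as a reference, on the assumption that the slide showed the usual forms; the symbols p (probability of default), x_1 ... x_k (predictors) and beta_0 ... beta_k (coefficients) are notation chosen for this sketch.

```latex
\text{odds} = \frac{p}{1-p}
\qquad
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\qquad
\text{odds ratio for a one-unit increase in } x_j = e^{\beta_j}
```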

Page 13: Dissertation Presentation Bidyut Mondal

Research Methodology

Data Collection

Data Cleaning

Data Coding

Binary Logistic Regression

ROC Analysis & Model Selection

Page 14: Dissertation Presentation Bidyut Mondal

Data

Raw Data (3,91,000)

Debt Consolidation

Credit Card

Home Loan

Page 15: Dissertation Presentation Bidyut Mondal

Binary Logistic Regression - Variables

Page 16: Dissertation Presentation Bidyut Mondal

Results & Interpretation – Credit Card Segment

Model 1 (Step 1)

Observed is_defaulter | Selected cases: predicted 0 | Selected cases: predicted 1 | Selected cases: percentage correct | Unselected cases: predicted 0 | Unselected cases: predicted 1 | Unselected cases: percentage correct
0 | 46085 | 825 | 98.2 | 42247 | 796 | 98.2
1 | 2311 | 3053 | 56.9 | 2057 | 2626 | 56.1
Overall percentage | | | 94.00 | | | 94.00

a. The cut value is .500

Model 2 (Step 1)

Observed is_defaulter | Selected cases: predicted 0 | Selected cases: predicted 1 | Selected cases: percentage correct | Unselected cases: predicted 0 | Unselected cases: predicted 1 | Unselected cases: percentage correct
0 | 46012 | 898 | 98.1 | 42246 | 797 | 98.1
1 | 2658 | 2706 | 50.4 | 2342 | 2341 | 50
Overall percentage | | | 93.2 | | | 93.4

a. The cut value is .500
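
As a check on how the percentage-correct figures follow from the cell counts, here is a minimal sketch that recomputes them for the selected cases of both models. The function name `classification_summary` is hypothetical; the counts are copied from the classification tables on this slide.

```python
def classification_summary(n00, n01, n10, n11):
    """n<observed><predicted>: counts from a 2x2 classification table.

    Returns percentage correct for observed class 0, for observed class 1,
    and overall, each rounded to one decimal place.
    """
    pct_0 = 100 * n00 / (n00 + n01)
    pct_1 = 100 * n11 / (n10 + n11)
    overall = 100 * (n00 + n11) / (n00 + n01 + n10 + n11)
    return round(pct_0, 1), round(pct_1, 1), round(overall, 1)

# Selected cases at the 0.5 cut value, as reported on the slide.
model1_selected = classification_summary(46085, 825, 2311, 3053)
model2_selected = classification_summary(46012, 898, 2658, 2706)
print("Model 1:", model1_selected)
print("Model 2:", model2_selected)
```

Both models classify non-defaulters about equally well; the gap is in the defaulter row (56.9 versus 50.4 percent correct).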

Page 17: Dissertation Presentation Bidyut Mondal

ROC Curve - Model1

Model 1

Cutoff | TP | TN | FP | FN | Sensitivity | Specificity | 1 - Specificity
0.1 | 40473 | 4822 | 6437 | 542 | 0.99 | 0.43 | 0.57
0.2 | 43423 | 4370 | 3487 | 994 | 0.98 | 0.56 | 0.44
0.3 | 44843 | 3953 | 2067 | 1411 | 0.97 | 0.66 | 0.34
0.4 | 45624 | 3490 | 1286 | 1874 | 0.96 | 0.73 | 0.27
0.5 | 46085 | 3053 | 825 | 2311 | 0.95 | 0.79 | 0.21
0.6 | 46402 | 2631 | 508 | 2733 | 0.94 | 0.84 | 0.16
0.7 | 46609 | 2141 | 301 | 3223 | 0.94 | 0.88 | 0.12
0.8 | 46772 | 1651 | 138 | 3713 | 0.93 | 0.92 | 0.08
0.9 | 46864 | 1074 | 46 | 4290 | 0.92 | 0.96 | 0.04
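
The rate columns can be recomputed from the count columns. The sketch below applies the usual formulas, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), to the counts exactly as they are labelled on the slide; the function name `roc_rates` is hypothetical and only three of the nine cutoffs are shown.

```python
def roc_rates(tp, tn, fp, fn):
    """Return (sensitivity, specificity, 1 - specificity), rounded to 2 places."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return round(sensitivity, 2), round(specificity, 2), round(1 - specificity, 2)

# Three Model 1 rows copied from the slide: cutoff -> (TP, TN, FP, FN).
model1_counts = {
    0.1: (40473, 4822, 6437, 542),
    0.5: (46085, 3053, 825, 2311),
    0.9: (46864, 1074, 46, 4290),
}
model1_rates = {cutoff: roc_rates(*counts) for cutoff, counts in model1_counts.items()}
for cutoff, rates in model1_rates.items():
    print(cutoff, rates)
```

Plotting sensitivity against 1 - specificity across all cutoffs gives the ROC curve used for model selection.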

Page 18: Dissertation Presentation Bidyut Mondal

ROC Curve – Model2

Model 2

Cutoff | TP | TN | FP | FN | Sensitivity | Specificity | 1 - Specificity
0.1 | 39834 | 4771 | 7076 | 593 | 0.99 | 0.40 | 0.60
0.2 | 43216 | 4236 | 3694 | 1128 | 0.97 | 0.53 | 0.47
0.3 | 44667 | 3658 | 2243 | 1706 | 0.96 | 0.62 | 0.38
0.4 | 45466 | 3171 | 1444 | 2193 | 0.95 | 0.69 | 0.31
0.5 | 46012 | 2706 | 898 | 2658 | 0.95 | 0.75 | 0.25
0.6 | 46362 | 2246 | 548 | 3118 | 0.94 | 0.80 | 0.20
0.7 | 46586 | 1815 | 324 | 3549 | 0.93 | 0.85 | 0.15
0.8 | 46746 | 1374 | 164 | 3990 | 0.92 | 0.89 | 0.11
0.9 | 46869 | 881 | 41 | 4483 | 0.91 | 0.96 | 0.04

Page 19: Dissertation Presentation Bidyut Mondal

ROC Curve – Credit Card Segment

Model 1 wins

Page 20: Dissertation Presentation Bidyut Mondal

Model 1 – Credit Card Segment

Variable | B | S.E. | Wald | Sig. | Exp(B) | 95% C.I. for Exp(B): Lower | 95% C.I. for Exp(B): Upper
annual_inc | 0 | 0 | 3184.977 | 0.0 | 1 | 1 | 1
delinq_2yrs | 0.708 | 0.089 | 62.6 | 0.0 | 2.03 | 1.704 | 2.419
dti | 0.278 | 0.005 | 3643.476 | 0.0 | 1.32 | 1.15 | 1.564
emp_length_year | -0.027 | 0.007 | 14.967 | 0.0 | 0.973 | 0.96 | 0.987
funded_amnt | 0.004 | 0 | 6074.822 | 0.0 | 1.004 | 1.004 | 1.004
funded_amnt_inv | -0.004 | 0 | 6165.636 | 0.0 | 0.996 | 0.996 | 0.996
inq_last_6mths | -1.077 | 0.032 | 1128.505 | 0.0 | 0.341 | 0.32 | 0.363
int_rate | 1.597 | 0.719 | 901.239 | 0.0 | 4.9382 | 4.8201 | 5.101
mths_since_last_delinq | 0.031 | 0.001 | 492.118 | 0.0 | 1.031 | 1.029 | 1.034
term_months | -0.11 | 0.005 | 532.142 | 0.0 | 0.896 | 0.888 | 0.905
total_rec_late_fee | 5.256 | 0.104 | 2538.213 | 0.0 | 191.686 | 156.24 | 235.175
Constant | 10.511 | 0.238 | 1951.251 | 0.0 | 36735.161 | |
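
To read the Exp(B) column: it is the exponential of B, that is, the multiplicative change in the odds for a one-unit increase in the variable. A minimal sketch follows, assuming the usual Wald interval exp(B ± 1.96 × S.E.) for the 95% confidence limits; the function name `odds_ratio_with_ci` is hypothetical, and small differences from the slide are expected because B and S.E. are shown rounded.

```python
import math

def odds_ratio_with_ci(b, se, z=1.96):
    """Return (Exp(B), lower limit, upper limit) for a logistic coefficient."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# delinq_2yrs row from the slide: B = 0.708, S.E. = 0.089.
odds_ratio, lower, upper = odds_ratio_with_ci(0.708, 0.089)
print(round(odds_ratio, 3), round(lower, 3), round(upper, 3))

# emp_length_year row: a negative B gives an odds ratio below 1.
emp_odds_ratio, _, _ = odds_ratio_with_ci(-0.027, 0.007)
print(round(emp_odds_ratio, 3))
```

For delinq_2yrs this gives an odds ratio of about 2.03: each additional delinquency in the past two years roughly doubles the modelled odds.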

Page 21: Dissertation Presentation Bidyut Mondal

Conclusion

Organizations that interpret big data and convert it into actionable information will outperform their competitors.

Google, Amazon, Microsoft, IBM, DHL and P&G are some of the organizations leading big data analytics in the current market.

How big data analytics and its strengths are used in an organization depends on the organization's culture.

Challenges – Data Collection, Technical, Expertise

Threats – Individual Privacy; need for government regulation & monitoring
