
Estimation of the probability of default: Credit Risk

Date post: 15-Apr-2017
Upload: arsalan-qadri
Estimating the probability of default: Credit Risk Mohamed Arsalan Qadri Sarvesh Saurabh Mohit Ravi
Transcript
Page 1

Estimating the probability of default: Credit Risk

Mohamed Arsalan Qadri, Sarvesh Saurabh, Mohit Ravi

Page 2

Summary

• Credit risk: the probability of default
• Data Cleansing
• Logistic Regression
• Linear Discriminant Analysis
• Comparison of LR and LDA
• Factor Analysis

Page 3

Credit Risk

What is it?
• The risk of default on a debt that may arise from a borrower failing to make required payments.

Impact on the lender?
• Lost principal and interest, disruption to cash flows, and increased collection costs.

How to estimate it?
• Credit risk arises from the potential that a borrower or counterparty will fail to perform on an obligation.

Page 4

Credit Risk

Sources of risk?
• For most banks, loans are the largest and most obvious source of credit risk.
• There are other sources of credit risk both on and off the balance sheet, including letters of credit, unfunded loan commitments, and lines of credit.
• Other products, activities, and services that expose a bank to credit risk are credit derivatives, foreign exchange, and cash management services.

Page 5

Credit Scoring vs Risk

Estimation of risk?
• The risk posed by the borrower is inversely related to the credit score.
• A credit score is a statistically derived numeric expression of a person's creditworthiness that lenders use to assess the likelihood that the person will repay his or her debts.
• A credit score is based on, among other things, a person's past credit history, and ranges from 300 to 850.

Page 6

Credit Scoring

• Consumers can typically keep their credit scores high by maintaining a long history of always paying their bills on time and not having too much debt.

• A FICO score is the most widely used credit scoring system.

• A credit score is primarily based on credit report information, typically sourced from credit bureaus.

Page 7

Data Cleaning

Page 8

Data Cleaning
• Serious Delinquency in two years. (Make a pie chart for this)

Page 9

Data Cleaning
• Revolving Utilization Of Unsecured Lines

Page 10

Data Cleaning
• Age

Page 11

Data Cleaning
• Number Of Time 30-59 Days Past Due Not Worse

Page 12

Data Cleaning
• Number Of Time 60-89 Days Past Due Not Worse

Page 13

Data Cleaning
• Number Of Times 90 Days Late

Page 14

Data Cleaning
• Monthly Income
• Missing values replaced with the mean
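The mean replacement on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the income values and the helper name `impute_with_mean` are hypothetical, and missing values are assumed to be represented as `None`.

```python
from statistics import mean

# Hypothetical monthly incomes; None marks a missing value (assumption).
monthly_income = [5400.0, None, 3200.0, 9100.0, None, 2500.0]

def impute_with_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

imputed = impute_with_mean(monthly_income)
print(imputed)
```

Mean replacement keeps the overall average unchanged but piles all imputed records onto a single value, which is one reason a model-based imputation may be preferred.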

Page 15

Data Cleaning
• Monthly Income
• Ran multiple linear regression to predict the missing values
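A rough sketch of regression-based imputation is shown below. It is deliberately simplified: the slide refers to multiple linear regression, whereas this example fits a one-predictor least-squares line (income on age) so that it stays dependency-free. The records, the choice of `age` as the predictor, and the function names are all assumptions for illustration.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical records; None marks a missing income (assumption).
records = [
    {"age": 30, "income": 3000.0},
    {"age": 40, "income": 4000.0},
    {"age": 50, "income": 5000.0},
    {"age": 45, "income": None},
]

# Fit only on records where income is observed.
complete = [r for r in records if r["income"] is not None]
slope, intercept = fit_simple_ols(
    [r["age"] for r in complete], [r["income"] for r in complete]
)

# Predict income for the records where it is missing.
for r in records:
    if r["income"] is None:
        r["income"] = intercept + slope * r["age"]

print(records[-1])
```

With several predictors the same idea applies, with the single slope replaced by a coefficient vector estimated on the complete records.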

Page 16

Data Cleaning
• Monthly Income
• The histogram after running multiple linear regression on the missing values

Page 17

Data Cleaning
• Debt Ratio
• We found that the Debt Ratio was extremely high in many cases.
• Upon closer inspection, we found that the high debt ratios occurred in records whose Monthly Income was unknown.
• From this we inferred that, for these records, the Debt Ratio field most probably holds the debt itself.

Page 18

Data Cleaning
• Debt Ratio
• We replaced the high debt ratio values by dividing them by the predicted monthly income.
• The new mean after replacement was 0.67.
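The correction described here might look like the sketch below. It assumes that the records to fix are exactly those whose monthly income was originally missing, which is how the previous slide characterizes the high values; the function name and the sample numbers are hypothetical.

```python
def corrected_debt_ratio(debt_ratio, income_was_missing, predicted_income):
    """Rescale a debt ratio that is suspected to hold the raw debt amount.

    Assumption: when income was missing, the stored value is the debt
    itself, so dividing by the predicted income recovers a ratio.
    """
    if income_was_missing and predicted_income > 0:
        return debt_ratio / predicted_income
    return debt_ratio

# Hypothetical records: (stored debt ratio, income missing?, predicted income)
rows = [
    (0.35, False, 5200.0),   # income known, ratio left unchanged
    (1800.0, True, 4500.0),  # income missing, value treated as raw debt
]

fixed = [corrected_debt_ratio(*row) for row in rows]
print(fixed)
```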

Page 19

Data Cleaning
• Number of Dependents

Page 20

Data Modelling

• Split the dataset into training data (70%) and test data (30%).
• Computed the correlation matrix among the independent variables.
• The variables had very little correlation amongst themselves.
• Ran logistic regression using stepwise selection.
• Ran linear discriminant analysis.
• Compared both models by measuring their prediction accuracy.
• Ran both models on the significant factors obtained from factor analysis.
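The first two steps, the 70/30 split and the correlation check, can be sketched in plain Python as follows. This is an illustration under assumptions: the slides do not say how the split was drawn, so a seeded random shuffle is assumed, and the function names and toy columns are hypothetical.

```python
import random
from math import sqrt

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> list of values."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns}
            for a in columns}

# Hypothetical toy data standing in for the independent variables.
columns = {"age": [25, 40, 33, 58, 47], "debt_ratio": [0.8, 0.3, 0.5, 0.2, 0.6]}
train, test = train_test_split(list(range(10)))
matrix = correlation_matrix(columns)
print(len(train), len(test), matrix["age"]["debt_ratio"])
```

Off-diagonal entries close to zero are what the slide describes as very little correlation among the variables.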

Page 21

Logistic Regression

Page 22

Logistic Regression

• Ran logistic regression separately for each variable.
• Computed the ROC curve for each variable and compared the AUC values.
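A per-variable AUC comparison can be sketched without fitting any model, because the AUC depends only on how a score ranks defaulters against non-defaulters. The example below uses the raw variable as the score and computes the AUC as the fraction of (defaulter, non-defaulter) pairs ranked correctly, counting ties as half. The data and variable names are hypothetical, and the slides do not state how the authors computed their AUC values.

```python
def auc(scores, labels):
    """AUC as the share of positive/negative pairs where the positive
    has the higher score (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = serious delinquency, 0 = none.
labels = [0, 0, 1, 0, 1, 1]
variables = {
    "times_late": [0, 1, 2, 0, 3, 1],
    "age": [55, 40, 30, 62, 45, 28],
}

# An AUC below 0.5 means the variable ranks in the opposite direction,
# so max(a, 1 - a) is used to compare discriminating power.
ranking = sorted(
    ((name, max(auc(vals, labels), 1 - auc(vals, labels)))
     for name, vals in variables.items()),
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```

A single-variable logistic regression produces scores that are a monotone function of that variable, so its AUC equals the raw-variable AUC, or one minus it when the fitted coefficient is negative.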

Page 23

Stepwise Selection

• The overall model was significant.
• All the variables were included in the model.
• The model built on the training data was tested on the test data.
• Probability of default > 0.7 was coded as 1, and probability of default < 0.7 was coded as 0.
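The coding rule in the last bullet can be written as a short sketch. The slide does not say how a probability of exactly 0.7 was treated; this example assigns it to class 0, which is an assumption, and the function name and sample probabilities are hypothetical.

```python
def code_default(probabilities, threshold=0.7):
    """Map predicted default probabilities to 0/1 class labels.

    Values strictly above the threshold become 1; everything else,
    including a value exactly equal to the threshold, becomes 0.
    """
    return [1 if p > threshold else 0 for p in probabilities]

# Hypothetical predicted probabilities from a fitted model.
predicted = [0.10, 0.70, 0.71, 0.95]
coded = code_default(predicted)
print(coded)
```

A cutoff as high as 0.7 flags only borrowers the model is very sure about, so it tends to trade a lower true positive rate for a higher true negative rate.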

Page 24

Logistic Regression on Test Data

Confusion Matrix

            Predicted 0   Predicted 1
Actual 0        41374           175
Actual 1         2661           291

Overall Accuracy = (41374+291)/(41374+291+175+2661) = 93.6 %
True Positive Rate = TP / (TP+FN) = 9.85%
True Negative Rate = TN / (TN+FP) = 99.5%
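The three figures on this slide can be reproduced from the four counts in the accuracy formula. The slide text does not label which count is which, so the worked arithmetic below assumes TN = 41374, TP = 291, FP = 175 and FN = 2661; this assignment is inferred from the fact that it reproduces the stated rates.

```latex
\begin{align*}
\text{Accuracy} &= \frac{TN + TP}{TN + TP + FP + FN}
  = \frac{41374 + 291}{41374 + 291 + 175 + 2661}
  = \frac{41665}{44501} \approx 0.936 \\
\text{TPR} &= \frac{TP}{TP + FN}
  = \frac{291}{291 + 2661}
  = \frac{291}{2952} \approx 0.0986 \\
\text{TNR} &= \frac{TN}{TN + FP}
  = \frac{41374}{41374 + 175}
  = \frac{41374}{41549} \approx 0.9958
\end{align*}
```

The slide's 9.85% and 99.5% agree with these values if the percentages were truncated rather than rounded.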

Page 25

ROC curve for Test Data

• AUC Value = 0.8557

Page 26

Discriminant Analysis

Page 27

Discriminant Analysis

            Predicted 0   Predicted 1
Actual 0        38134          3415
Actual 1         1235          1717

Overall accuracy = (38134+1717)/Total = 89.5 %
True Positive Rate = TP / (TP+FN) = 58%
True Negative Rate = TN / (TN+FP) = 91.7%

Serious Deliquen

Page 28

Comparison of Models

Linear Discriminant Analysis

Overall accuracy = 89.5 %

            Predicted 0   Predicted 1
Actual 0        38134          3415
Actual 1         1235          1717

Serious Deliquen

Logistic Regression

Overall Accuracy = 93.6 %
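To put the two models side by side on more than overall accuracy, the sketch below recomputes accuracy, true positive rate and true negative rate from the confusion-matrix counts. The LDA counts come from the table above. The logistic regression counts are taken from the accuracy formula on the earlier slide, and assigning 175 to false positives and 2661 to false negatives is an inference from the rates stated there. The function name `rates` is hypothetical.

```python
def rates(tn, fp, fn, tp):
    """Accuracy, true positive rate and true negative rate from counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "tpr": tp / (tp + fn),
        "tnr": tn / (tn + fp),
    }

models = {
    # Counts from the LDA confusion matrix.
    "LDA": rates(tn=38134, fp=3415, fn=1235, tp=1717),
    # Counts inferred from the logistic regression slide (assumption).
    "Logistic regression": rates(tn=41374, fp=175, fn=2661, tp=291),
}

for name, m in models.items():
    print(f"{name}: accuracy={m['accuracy']:.3f} "
          f"TPR={m['tpr']:.3f} TNR={m['tnr']:.3f}")
```

On these counts, logistic regression at the 0.7 cutoff has the higher overall accuracy, while LDA identifies more of the actual defaulters (1717 versus 291), so the preferred model depends on which kind of error matters more to the lender.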

Page 29

Normality of variables

Page 30

Factor Analysis

Page 31

Factor Analysis

Factor Pattern

Group     Variable                          Factor1   Factor2   Factor3   Factor4
Factor 1  NumberOfTimes90DaysLate           0.54684   0.28062   0.26286   -0.0429
Factor 1  NumberOfTime60_89DaysPastDueNot   0.50016   0.3943    0.37949   -0.0015
Factor 1  RevolvingUtilizationOfUnsecured   0.60945   0.24942   -0.1861   -0.0285
Factor 2  NumberOfOpenCreditLinesAndLoans   -0.5203   0.5275    0.1922    0.15051
Factor 2  NumberRealEstateLoansOrLines      -0.4698   0.61529   -0.0292   0.09694
Factor 2  NumberOfDependents_num            0.03058   0.46357   -0.6034   -0.008
Factor 2  Monthlyincome_debt                -0.4298   0.5044    -0.09     -0.1628
Factor 2  NumberOfTime30_59DaysPastDueNot   0.40861   0.49901   0.31943   0.05977
Factor 3  age                               -0.4301   -0.1476   0.65733   -0.0396
Factor 4  DebtRatio                         0.05584   -0.0712   -0.0331   0.97112
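One common way to read a factor pattern is to assign each variable to the factor on which its loading is largest in absolute value. The sketch below applies that rule to the loadings in the table. The rule itself is an assumption; the slides do not say how the authors formed their groups.

```python
FACTORS = ["Factor1", "Factor2", "Factor3", "Factor4"]

# Loadings copied from the factor pattern table.
loadings = {
    "NumberOfTimes90DaysLate":         [0.54684, 0.28062, 0.26286, -0.0429],
    "NumberOfTime60_89DaysPastDueNot": [0.50016, 0.3943, 0.37949, -0.0015],
    "RevolvingUtilizationOfUnsecured": [0.60945, 0.24942, -0.1861, -0.0285],
    "NumberOfOpenCreditLinesAndLoans": [-0.5203, 0.5275, 0.1922, 0.15051],
    "NumberRealEstateLoansOrLines":    [-0.4698, 0.61529, -0.0292, 0.09694],
    "NumberOfDependents_num":          [0.03058, 0.46357, -0.6034, -0.008],
    "Monthlyincome_debt":              [-0.4298, 0.5044, -0.09, -0.1628],
    "NumberOfTime30_59DaysPastDueNot": [0.40861, 0.49901, 0.31943, 0.05977],
    "age":                             [-0.4301, -0.1476, 0.65733, -0.0396],
    "DebtRatio":                       [0.05584, -0.0712, -0.0331, 0.97112],
}

def dominant_factor(row):
    """Name of the factor with the largest absolute loading in a row."""
    best = max(range(len(row)), key=lambda i: abs(row[i]))
    return FACTORS[best]

dominant = {name: dominant_factor(row) for name, row in loadings.items()}
for name, factor in dominant.items():
    print(f"{name}: {factor}")
```

Under this rule most variables land in the same group as on the slide. The exception is NumberOfDependents_num, whose Factor3 loading (-0.6034) is larger in magnitude than its Factor2 loading (0.46357), although the slide appears to place it under Factor 2.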

Page 32

Conclusion

• About 80% of the time was spent on data cleaning.

• Logistic regression gives better results than LDA when the data are not normally distributed.

• Factors can be grouped for a logical understanding, with Debt Ratio and age explaining high variance.

Page 33

Thank you

