Predictive Credit Risk Scoring using SAS Enterprise Miner

CREDIT SCORING USING SAS ENTERPRISE MINER AMAL SHANKER DESHBANDHU PACHAURI

Upload: amal-shanker

Post on 18-Nov-2014


Page 1: Predictive Credit Risk Scoring using SAS Enterprise Miner

CREDIT SCORING USING SAS ENTERPRISE MINER

AMAL SHANKER
DESHBANDHU PACHAURI

Page 2: Predictive Credit Risk Scoring using SAS Enterprise Miner

LENDING CLUB: INTRODUCTION
Founded in 2007
An online financial community
Bringing together creditworthy borrowers and savvy investors

LATEST COMPANY STATISTICS
Loans funded to date: $2,595,182,275
Loans funded last month: $203,355,750
Interest paid to investors since inception: $229,080,795

Image from https://www.lendingclub.com/public/how-peer-lending-works.action

Page 3: Predictive Credit Risk Scoring using SAS Enterprise Miner

AIM
To build predictive decision models using SAS Enterprise Miner that are the best indicators of creditworthiness.

Compare regression analysis with a decision tree model and select the one that predicts more accurately.

DATA
Data on 42,539 customers
2007 to 2011
59 variables

Page 4: Predictive Credit Risk Scoring using SAS Enterprise Miner

VARIABLE SELECTION

• TARGET VARIABLE = Bad Flag (given)
• The remaining 58 were INPUT VARIABLES
• 58 was too large a number to work with directly.
• Reducing this number to the best 4 or 5 was the first target.
• METHODOLOGY USED:
  • INTUITIVE METHODS
  • VARIABLE CATEGORIZATION
  • SAS FUNCTIONS

Page 5: Predictive Credit Risk Scoring using SAS Enterprise Miner

VARIABLE SELECTION (Contd.)

• INTUITIVE METHODS
All the variables were checked for preliminary data inconsistencies.

VARIABLES DISCARDED

VARIABLES WITH “SIGNIFICANTLY HIGH” MISSING VALUES
Months since last delinquency (26929)
Months since last record (38887)

VARIABLE WITH DIFFICULTY IN ROLE ASSIGNMENT
Employment length

NON-USEFUL VARIABLES
State, Member ID etc.

VARIABLES WITH SAME VALUES FOR ALL ROWS OR ALMOST ALL ROWS
Tax liens
Charge off within 12 months
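The missing-value screen can be illustrated with a few lines of Python. This is a minimal sketch, assuming the figures in parentheses (26929 and 38887) are counts of missing rows out of the 42539 customers in the data set; the 50% cut-off is a hypothetical threshold chosen for illustration, since the slides only say "significantly high".

```python
# Total number of customer rows reported on the DATA slide.
TOTAL_ROWS = 42539

# Assumption: the parenthesized figures are missing-value counts.
missing_counts = {
    "Months since last delinquency": 26929,
    "Months since last record": 38887,
}

# Hypothetical cut-off; the slides do not state the threshold used.
MAX_MISSING_SHARE = 0.5


def missing_share(count, total=TOTAL_ROWS):
    """Return the fraction of rows with a missing value."""
    return count / total


shares = {name: missing_share(count) for name, count in missing_counts.items()}
discarded = [name for name, share in shares.items() if share > MAX_MISSING_SHARE]

for name, share in shares.items():
    print(f"{name}: {share:.1%} missing")
```

Under these assumptions both variables are missing for well over half of the rows, which is consistent with the decision to discard them.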

Page 6: Predictive Credit Risk Scoring using SAS Enterprise Miner

VARIABLE SELECTION (Contd.)
VARIABLES CATEGORIZATION

PRE-APPROVAL VARIABLES
FICO range high, Annual Income etc.

DERIVATIVE VARIABLES
Credit grade, Credit Sub-grade, Interest rate

POST-APPROVAL VARIABLES
Principal paid, Interest paid

FINAL OUTCOME
• 16 pre-approval variables

Page 7: Predictive Credit Risk Scoring using SAS Enterprise Miner

VARIABLE SELECTION (Contd.)
SAS FUNCTIONS

The 16 pre-approval variables were then assessed using the following SAS functions:

STATEXPLORE
To check the worthiness of the variables

VARIABLE CLUSTERING
To identify and group variables with a high degree of correlation

Diagram nodes: INPUT VARIABLES, STAT EXPLORE, VARIABLE CLUSTERING

Page 8: Predictive Credit Risk Scoring using SAS Enterprise Miner

STATEXPLORE: WORTH ANALYSIS

WORTH OF VARIABLES ANALYZED
1. SUBGRADE
2. FICO RANGE HIGH
3. FICO RANGE LOW
4. PURPOSE
5. REVOLVING UTILITY
6. PUBLIC RECORD BANKRUPTCIES
7. ANNUAL INCOME
8. PUBLIC RECORDS
9. OPEN ACCOUNT
10. TOTAL ACCOUNT
11. DTI
12. REVOLVING BALANCE
13. LOAN AMOUNT
14. HOME OWNERSHIP
15. DELINQUENCY IN 2 YEARS
16. DELINQUENCY AMOUNT

Page 9: Predictive Credit Risk Scoring using SAS Enterprise Miner

VARIABLE CLUSTERING: CLUSTER ANALYSIS 1

CLUSTER 0
1. DELINQUENCY AMOUNT

CLUSTER 1
2. FICO RANGE HIGH
3. FICO RANGE LOW
4. REVOLVING UTILITY BALANCE

CLUSTER 2
5. OPEN ACCOUNTS
6. TOTAL ACCOUNTS
7. DTI

CLUSTER 3
1. PUBLIC RECORDS BANKRUPTCY
2. PUBLIC RECORDS

CLUSTER 4
3. ANNUAL INCOME
4. LOAN AMOUNT
5. REVOLVING BALANCE

CLUSTER 5
6. DELINQUENCY SINCE 2 YEARS

PURPOSE AND HOME OWNERSHIP TOO!!!

Page 10: Predictive Credit Risk Scoring using SAS Enterprise Miner

VARIABLE CLUSTERING: CLUSTER ANALYSIS 2

CLUSTER 1
1. FICO RANGE HIGH
2. DELINQUENCY SINCE 2 YEARS
3. PUBLIC RECORDS BANKRUPTCY

CLUSTER 2
4. ANNUAL INCOME
5. OPEN ACCOUNTS

PURPOSE AND HOME OWNERSHIP STILL UNDER CONSIDERATION!!!

Page 11: Predictive Credit Risk Scoring using SAS Enterprise Miner

SUMMARY: VARIABLE SELECTION

STAGE                     VARIABLES
BEGINNING                 58
PRE-APPROVAL VARIABLES    16
WORTH ANALYSIS            15
CLUSTER ANALYSIS 1        7
CLUSTER ANALYSIS 2        4

Page 12: Predictive Credit Risk Scoring using SAS Enterprise Miner

FINAL 4 INPUT VARIABLES

• FICO RANGE HIGH
• PURPOSE
• ANNUAL INCOME
• HOME OWNERSHIP

Page 13: Predictive Credit Risk Scoring using SAS Enterprise Miner

SAS DIAGRAM

WORKSPACE DIAGRAM
Nodes shown: LOAN V3 (INPUT VARIABLES), IMPUTE, DATA PARTITION, DECISION TREE, REGRESSION, MODEL COMPARISON

Page 14: Predictive Credit Risk Scoring using SAS Enterprise Miner

DECISION TREE ANALYSIS

CUMULATIVE LIFT

MODEL PROFITABILITY CALCULATIONS

          Customers   Per-customer amount   Total
Good      6373        1500                  9,559,500.00
Bad       565         10000                 5,650,000.00
Total     6938                              3,909,500.00

EARNINGS PER CUSTOMER: 563.49
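The decision tree profitability figures can be reproduced with a short calculation. The sketch below assumes each row reads as a customer count, an amount per customer, and their product, with 1500 earned on each good customer and 10000 lost on each bad one; the function and parameter names are illustrative and do not come from the slides.

```python
def earnings(good, bad, gain_per_good=1500.0, loss_per_bad=10000.0):
    """Return (total earnings, earnings per customer).

    Assumption: total earnings = gains on good customers
    minus losses on bad customers.
    """
    total = good * gain_per_good - bad * loss_per_bad
    return total, total / (good + bad)


# Counts taken from the decision tree profitability table.
tree_total, tree_per_customer = earnings(good=6373, bad=565)

print(f"Total earnings: {tree_total:,.2f}")
print(f"Earnings per customer: {tree_per_customer:.2f}")
```

The same function can be applied to the regression counts to compare the two models on a like-for-like basis.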

Page 15: Predictive Credit Risk Scoring using SAS Enterprise Miner

DECISION TREE OUTPUT

VARIABLE IMPORTANCE

OBS  NAME                LABEL                NRULES  IMPORTANCE  VIMPORTANCE  RATIO
1    IMP_fico_rangehigh  Imp: fico_rangehigh  1       1           1            1
2    IMP_annual_inc      Imp: annual_inc      1       0.2694      0.2016       0.7484
3    IMP_purpose         Imp: purpose         1       0.193       0            0
4    IMP_homeownership   Imp: home_ownership  1       0.1075      0.1139       1.0601

Page 16: Predictive Credit Risk Scoring using SAS Enterprise Miner

REGRESSION ANALYSIS

CUMULATIVE LIFT

                         Cumulative  %         Cumulative  Number of     Mean Posterior
Depth  Gain     Lift     Lift        Response  % Response  Observations  Probability     PRODUCT
5      124.595  2.24595  2.24595     27.9671   27.9671     851           0.28826         245.3093
10     106.193  1.87791  2.06193     23.3843   25.6757     851           0.2125          180.8375
15     88.735   1.53819  1.88735     19.1539   23.5018     851           0.19381         164.9323
20     79.298   1.50988  1.79298     18.8014   22.3267     851           0.17844         151.8524
25     69.107   1.2834   1.69107     15.9812   21.0576     851           0.16588         141.1639
30     57.751   1.00973  1.57751     12.5734   19.6436     851           0.15506         131.9561
35     50.204   1.04871  1.50204     13.0588   18.7038     850           0.14529         123.4965
40     45.111   1.09466  1.45111     13.631    18.0696     851           0.13596         115.702
45     39.263   0.9248   1.39263     11.5159   17.3413     851           0.12699         108.0685
50     33.734   0.83987  1.33734     10.4583   16.653      851           0.11909         101.3456
55     31.099   1.04748  1.31099     13.0435   16.3248     851           0.11126         94.68226
60     27.33    0.85874  1.2733      10.6933   15.8555     851           0.10387         88.39337
65     23.85    0.821    1.2385      10.2233   15.4222     851           0.09661         82.21511
70     19.867   0.68025  1.19867     8.4706    14.9261     850           0.08954         76.109
75     17.286   0.81156  1.17286     10.1058   14.6047     851           0.0824          70.1224
80     12.746   0.44667  1.12746     5.5621    14.0395     851           0.0746          63.4846
85     9.481    0.5725   1.09481     7.1289    13.6329     851           0.06686         56.89786
90     6.754    0.60395  1.06754     7.5206    13.2933     851           0.05864         49.90264
95     3.42     0.43409  1.0342      5.4054    12.8781     851           0.04906         41.75006
100    0        0.34957  1           4.3529    12.4523     850           0.03406         28.951

Totals carried into the profitability calculation: 11911 customers, 1101.121 bad

MODEL PROFITABILITY CALCULATIONS

          Customers   Per-customer amount   Total
Good      10809.88    1500                  16214819
Bad       1101.121    10000                 11011210
Total     11911                             5203609

EARNINGS PER CUSTOMER: 436.8742
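The columns of the regression lift table are related by simple arithmetic, which the sketch below checks on the first three depth rows. It assumes the baseline response rate is the cumulative % response at depth 100 (12.4523), that lift is a response rate divided by that baseline, that gain is the cumulative lift minus one expressed as a percentage, and that PRODUCT is the number of observations multiplied by the mean posterior probability. These readings are inferred from the reported numbers, not stated on the slide.

```python
# Cumulative % response at depth 100, used here as the baseline rate.
BASELINE_RESPONSE = 12.4523

# (depth, % response, cumulative % response, observations, mean posterior)
rows = [
    (5, 27.9671, 27.9671, 851, 0.28826),
    (10, 23.3843, 25.6757, 851, 0.2125),
    (15, 19.1539, 23.5018, 851, 0.19381),
]


def lift(response_pct, baseline=BASELINE_RESPONSE):
    """Response rate relative to the overall (baseline) response rate."""
    return response_pct / baseline


def gain(cumulative_lift):
    """Percentage improvement over the baseline at a given depth."""
    return (cumulative_lift - 1.0) * 100.0


def expected_bad(observations, mean_posterior):
    """Expected number of bad customers in a depth band (the PRODUCT column)."""
    return observations * mean_posterior


for depth, pct, cum_pct, n_obs, posterior in rows:
    cum_lift = lift(cum_pct)
    print(
        f"depth {depth:>3}: lift={lift(pct):.5f} "
        f"cum_lift={cum_lift:.5f} gain={gain(cum_lift):.3f} "
        f"product={expected_bad(n_obs, posterior):.4f}"
    )
```

Small differences in the last decimal places are expected because the table values are already rounded.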

Page 17: Predictive Credit Risk Scoring using SAS Enterprise Miner

MODEL COMPARISON

Event Classification

Model selection based on Valid: Misclassification Rate (_VMISC_)

Model Node  Model Description  Data Role  Target    False Negative  True Negative  False Positive  True Positive
Tree2       Decision Tree      TRAIN      Bad_Flag  3176            22345          0               0
Tree2       Decision Tree      VALIDATE   Bad_Flag  2119            14898          0               0
Reg2        Regression         TRAIN      Bad_Flag  3176            22344          1               0
Reg2        Regression         VALIDATE   Bad_Flag  2119            14898          .               0
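The selection criterion, the validation misclassification rate, can be recomputed from the event classification counts. The sketch below uses the standard definition (false negatives plus false positives, divided by all cases) and the decision tree rows; the function names are illustrative.

```python
def misclassification_rate(false_neg, true_neg, false_pos, true_pos):
    """Share of cases assigned to the wrong class."""
    total = false_neg + true_neg + false_pos + true_pos
    return (false_neg + false_pos) / total


def sensitivity(false_neg, true_pos):
    """Share of actual bad customers that the model flags as bad."""
    return true_pos / (true_pos + false_neg)


# Decision tree (Tree2) counts from the event classification table.
tree_train = misclassification_rate(3176, 22345, 0, 0)
tree_valid = misclassification_rate(2119, 14898, 0, 0)
tree_valid_sensitivity = sensitivity(false_neg=2119, true_pos=0)

print(f"Tree2 train misclassification rate:    {tree_train:.4%}")
print(f"Tree2 validate misclassification rate: {tree_valid:.4%}")
print(f"Tree2 validate sensitivity:            {tree_valid_sensitivity:.1%}")
```

With zero true positives and zero false positives, the misclassification rate equals the share of bad customers in the partition, so this criterion alone does not separate the two models; that is one reason the profitability comparison is useful.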

Page 18: Predictive Credit Risk Scoring using SAS Enterprise Miner

MODEL COMPARISON

Page 19: Predictive Credit Risk Scoring using SAS Enterprise Miner

REGRESSION VS DECISION TREE MODEL

REGRESSION ANALYSIS

          Customers   Per-customer amount   Total
Good      10809.88    1500                  16214819
Bad       1101.121    10000                 11011210
Total     11911                             5203609

EARNINGS PER CUSTOMER: 436.8742

DECISION TREE ANALYSIS

          Customers   Per-customer amount   Total
Good      6373        1500                  9,559,500.00
Bad       565         10000                 5,650,000.00
Total     6938                              3,909,500.00

EARNINGS PER CUSTOMER: 563.49

Page 20: Predictive Credit Risk Scoring using SAS Enterprise Miner

CONCLUSION
• The decision tree model is the better indicator of creditworthiness
• The most significant factor to consider is credit score
• Regression analysis shows higher total earnings
• Decision tree analysis shows higher earnings per customer