The Journal of Credit Risk
Volume 14 Number 2 June 2018

Credit default prediction using a support vector machine and a probabilistic neural network
Mohammad Zoynul Abedin, Chi Guotai, Sisira Colombage and Fahmida-E-Moula

Modeling dependent risk factors with CreditRisk+
Xiaohang Zhang, SuBang Choe, Ji Zhu and Jill Bewick

Consumer risk appetite, the credit cycle and the housing bubble
Joseph L. Breeden and José J. Canals-Cerdá

Trial Copy

For all subscription queries, please call:
UK/Europe: +44 (0) 207 316 9300
USA: +1 646 736 1850
ROW: +852 3411 4828


PEFC Certified

This book has been produced entirely from sustainable papers that are accredited as PEFC compliant.

www.pefc.org



To subscribe to a Risk Journal visit subscriptions.risk.net/journals or email [email protected]


The Journal of Credit Risk

EDITORIAL BOARD

Editors-in-Chief

Ashish Dev, Federal Reserve Board
Michael Gordy, Federal Reserve Board

Associate Editors

Linda Allen, Baruch College, CUNY
Edward Altman, NYU Stern School of Business
Jennie Bai, Georgetown University
J. L. Breeden, Prescient Models LLC
Sudheer Chava, Scheller College of Business, Georgia Tech
Arturo Cifuentes, Columbia University and CLAPES UC
Jonathan Crook, The University of Edinburgh
Darrell Duffie, Stanford University
Kay Giesecke, Stanford University
Jens Hilscher, UC Davis
Jay Huang, Penn State University
John Hull, University of Toronto
Robert Jarrow, Cornell University
Nikunj Kapadia, University of Massachusetts Amherst
Ahmet Kocagil, Western Asset Management
Holger Kraft, Goethe University
Andre Lucas, VU Amsterdam
Dilip Madan, University of Maryland
Loriana Pelizzon, Goethe University
Dmitry Pugachevsky, Quantifi
Michael Pykhtin, Federal Reserve Board
Dan Rosen, Fields Institute
Peter Ritchken, Case Western Reserve University
Jorge R. Sobehart, Citi
Stuart Turnbull, University of Houston
Donald R. Van Deventer, Kamakura Corporation
Fan Yu, Claremont McKenna College
Jing Zhang, Moody’s Analytics

SUBSCRIPTIONS

The Journal of Credit Risk (Print ISSN 1744-6619 | Online ISSN 1755-9723) is published quarterly by Infopro Digital, Haymarket House, 28–29 Haymarket, London SW1Y 4RX, UK.

Subscriptions to The Journal of Credit Risk, and Risk.net Journals, are available on an annual basis. To find out about the different options, including our exclusive academic rates which start from £100, visit subscriptions.risk.net/journals-print or contact [email protected].

All subscription orders, single/back issues orders, and changes of address should be sent to:

UK & Europe Office: Infopro Digital, Haymarket House, 28–29 Haymarket, London SW1Y 4RX, UK. Tel: +44 (0) 207 316 9300

US & Canada Office: Infopro Digital, 55 Broad Street, Floor 22, New York, NY 10005, USA. Tel: +1 646 736 1850

Asia & Pacific Office: Infopro Digital, Unit 1704-05 Berkshire House, Taikoo Place, 25 Westlands Road, Hong Kong. Tel: +852 3411 4888

Website: www.risk.net/journals E-mail: [email protected]

Subscriptions to The Journal of Credit Risk, and Risk.net Journals, are available on an annual basis. To find out about the different subscriptions, including our exclusive academic package, visit subscriptions.risk.net/journals-print or contact [email protected] (EU/US) or [email protected] (ROW).


The Journal of Credit Risk

GENERAL SUBMISSION GUIDELINES

The Journal of Credit Risk welcomes submissions from practitioners as well as academics. Manuscripts and research papers submitted for consideration must be original work that is not simultaneously under review for publication in another journal or other publication outlets. All papers submitted for consideration should follow strict academic standards in both theoretical content and empirical results. Papers should be of interest to a broad audience of sophisticated practitioners and academics.

Submitted papers should follow Webster’s New Collegiate Dictionary for spelling, and The Chicago Manual of Style for punctuation and other points of style. Papers should be submitted electronically via our online submissions site:

https://editorialexpress.com/cgi-bin/e-editor/e-submit_v15.cgi?dbase=risk

Please clearly indicate which journal you are submitting to.

Papers should be submitted as either a LaTeX file or a Word file (“source file”). The source file must be accompanied by a PDF file created from the version of the source file that is submitted. LaTeX files need to have an explicitly coded bibliography included or be sent with a BBL file. All files must be clearly named and saved by author name and date of submission.

A concise and factual abstract of between 150 and 200 words is required and it should be included in the main document. Four to six keywords should be included after the abstract. Submitted papers must also include an Acknowledgements section and a Declaration of Interest section. Authors should declare any funding for the paper or conflicts of interest. In-text citations should follow the author-date system as outlined in The Chicago Manual of Style. Reference lists should be formatted in APA style.

The number of figures and tables included in a paper should be kept to a minimum. Figures and tables must be included in the main PDF document and also submitted as clearly numbered editable files (please see the online submission guidelines for guidance on editable figure files). Figures will appear in color online, but will be printed in black and white. Footnotes should be used sparingly. If footnotes are necessary then these should be included at the end of the page and should be no more than two sentences. Appendixes will be published online as supplementary material.

Before submitting a paper, authors should consult the full author guidelines at:

http://www.risk.net/static/risk-journals-submission-guidelines

Queries may also be sent to:

The Journal of Credit Risk, Infopro Digital, Haymarket House, 28–29 Haymarket, London SW1Y 4RX, UK
Tel: +44 1858 438 800 (UK/EU), +1 212 776 8075 (USA), +852 3411 4828 (Asia)
E-mail: [email protected]


The Journal of Credit Risk

The journal

With the rewriting of the Basel accords in international banking and their ensuing application, interest in credit risk has never been greater. The Journal of Credit Risk is at the forefront in tackling the many issues and challenges posed by the recent financial crisis, focusing on the measurement and management of credit risk, the valuation and hedging of credit products, and the promotion of greater understanding in the area of credit risk theory and practice.

The Journal of Credit Risk considers submissions in the form of research papers and technical reports on, but not limited to, the following topics.

■ Modeling and management of portfolio credit risk.

■ Recent advances in parameterizing credit risk models: default probability estimation, copulas and credit risk correlation, recoveries and loss given default, collateral valuation, loss distributions and extreme events.

■ The pricing and hedging of credit derivatives.

■ Structured credit products and securitizations, eg, collateralized debt obligations, synthetic securitizations, credit baskets, etc.

■ Measuring, managing and hedging counterparty credit risk.

■ Credit risk transfer techniques.

■ Liquidity risk and extreme credit events.

■ Regulatory issues, such as Basel II, internal ratings systems, credit-scoring techniques and credit risk capital adequacy.


The Journal of Credit Risk Volume 14/Number 2

CONTENTS

RESEARCH PAPERS

Credit default prediction using a support vector machine and a probabilistic neural network
Mohammad Zoynul Abedin, Chi Guotai, Sisira Colombage and Fahmida-E-Moula

Modeling dependent risk factors with CreditRisk+
Xiaohang Zhang, SuBang Choe, Ji Zhu and Jill Bewick

Consumer risk appetite, the credit cycle and the housing bubble
Joseph L. Breeden and José J. Canals-Cerdá

Editors-in-Chief: Ashish Dev, Michael Gordy
Publisher: Nick Carver
Journals Manager: Sarah Campbell
Editorial Assistant: Ciara Smith
Subscription Sales Manager: Aaraa Javed
Global Key Account Sales Director: Michelle Godwin
Composition and copyediting: T&T Productions Ltd
Printed in UK by Printondemand-Worldwide

© Infopro Digital Risk (IP) Limited, 2018. All rights reserved. No parts of this publication may be reproduced, stored in or introduced into any retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the copyright owners.


Journal of Credit Risk 14(2), 1–27
DOI: 10.21314/JCR.2017.233

Research Paper

Credit default prediction using a support vector machine and a probabilistic neural network

Mohammad Zoynul Abedin,1,2 Chi Guotai,1 Sisira Colombage3 and Fahmida-E-Moula1

1 Faculty of Management and Economics, Dalian University of Technology, 2 Linggong Road, Ganjingzi District, 116024 Dalian City, Liaoning Province, People’s Republic of China; emails: [email protected], [email protected], [email protected]
2 Department of Finance and Banking, Hajee Mohammad Danesh Science and Technology University, Dinajpur 5200, Bangladesh
3 Federation Business School, Berwick Campus, Federation University Australia, 100 Clyde Road, Berwick, VIC 3806, Australia; email: [email protected]

(Received April 24, 2017; revised July 11, 2017; accepted December 21, 2017)

ABSTRACT

The design of consistent classifiers to forecast credit-granting choices is critical for many financial decision-making practices. Although a number of artificial and statistical techniques have been developed to predict customer insolvency, how to provide an inclusive appraisal of prediction models and recommend adequate classifiers is still an imperative and understudied area in credit default prediction (CDP) modeling. Previous evidence demonstrates that the ranking of classifiers varies for different criteria with measures under different circumstances. In this study, we address this methodological flaw by proposing the simultaneous application of support vector machine and probabilistic neural network (PNN)-based CDP algorithms, together with frequently used high-performance models. We fill the gap by introducing a set of multidimensional evaluation measures combined with some novel metrics that are helpful in discovering unseen features of the model’s performance. For effectiveness and feasibility purposes, six real-world credit data sets have been applied. Our empirical study shows that the PNN model is more robust than its rivals, and traditional performance evaluations are more or less consistent with their original counterparts. With these contributions, therefore, our investigations offer several advantages to practitioners of financial risk management.

Corresponding author: M. Z. Abedin

Keywords: financial risk management; credit default prediction (CDP); support vector machine (SVM); probabilistic neural network (PNN); performance criteria; discovering unseen features.

1 INTRODUCTION

For many years, credit default prediction (CDP) has been a significant and challenging issue, and has served as motivation for the vast majority of academic investigations (Tong et al 2012; Butaru et al 2016; Xia et al 2017). “Credit prediction” is an analytical term that describes the procedure of distinguishing bank customers to grant them credit by using a set of predefined criteria (Thomas et al 2002). Credit failure may result from internal or external factors, or a combination of both, eg, managerial errors due to insufficient or inappropriate industry experience, risk-seeking managers, lack of commitment and motivation, autocratic control, economic climate, complexities in operating successfully in the market (see, for example, Rajan and Ramcharan 2016). Evaluating a customer’s credit risk is fundamental for financial institutions due to the high risks associated with inappropriate credit-granting results. Further, it is now a major issue that one of the leading causes of the last financial crisis was the substantial number of defaults surrounding it. Therefore, in aiming to satisfy timely signals for superior investments and government decisions, perfecting CDP models has gained rapt attention from researchers and stakeholders alike.

There are a number of models that can be used for credit evaluation in the financial industry, including the neural network (NN) (Son et al 2016), the support vector machine (SVM) (Huang et al 2007), data-envelopment analysis (Eskelinen 2017), discriminant analysis (DA) (Lee and Choi 2013), naive Bayes classifiers (Dabrowski et al 2016), logistic regression (LR) (Desai et al 1996), K-nearest neighbors (Wauters and Vanhoucke 2017), case-based reasoning (Sartori et al 2016), classification and regression trees (CARTs) (Kao et al 2012), genetic algorithms (Kozeny 2015) and Bayesian network models (Xia et al 2017).

The NN is one of the most actively researched and successfully applied classifiers for CDP problems, and was initially sketched by Frank Rosenblatt at the end of the 1950s (Alweshah 2014). Since then, many NN techniques have been designed, including feedforward neural networks (FFNNs), radial basis function neural networks (RBFNNs), the multilayer perceptron (MLP), modular networks, back propagation (BP) and probabilistic neural networks (PNNs). These neural classifiers vary from one another in terms of behavior, training algorithms and structural design. They are appropriate for solving diverse problems, eg, stock market prediction (Chong et al 2017), bankruptcy prediction (Lee and Choi 2013), CDP (Son et al 2016), time series forecasting (Pradeepkumar and Ravi 2017) and pattern recognition (Sever 2013).

An alternative piece of NN architecture, the PNN (Specht 1990), is comprised of a classification methodology that merges the learning speed and flexibility of artificial NNs (ANNs) while managing to remain transparent and simple. The key advantage of PNNs over ANNs is their simplified network, which reduces the complexity of specifying an appropriate ANN model, and their smooth execution during training and testing. In addition, SVMs, proposed by Vapnik (1995), have managed to gain popularity due to their many useful traits, such as their outstanding generalization capacity on an extensive range of problems, including financial applications. A few empirical studies have used SVMs and PNNs for credit and bankruptcy prediction in recent years. To start, Fan and Palaniswami (2000) examined SVMs with DA, MLP and learning vector quantization (LVQ) for a bankruptcy prediction problem using an Australian credit database. They demonstrated that the SVM was competitive and outperformed other classifiers in terms of generalization performance. Huang et al (2004) argued that the SVM outperformed not only BP but also LR, using Taiwan and US credit data sets. Huang et al (2007) also concluded, based on two credit data sets, that the SVM achieved an identical classificatory accuracy and was a promising addition to existing data-mining methods. Subsequently, Jones et al (2015) reviewed the predictive performance of binary learners based on a credit rating database, and claimed that a radial basis SVM consistently outperformed its more traditional counterparts.

The studies of Min and Lee (2005), Kim and Ahn (2012) and Shin et al (2005) applied SVMs to Korean bankruptcy prediction and found that the SVM is a better approach than ordinary DA, LR and MLP when it comes to learning data patterns in a small sample size. Similar results were observed in the studies of Hui and Sun (2006), Ding et al (2008) and Xie et al (2011) for Chinese listed companies’ financial distress prediction (FDP). However, recent work from Abellán and Castellano (2017), Xiao et al (2016) and Abellán and Mantas (2014) has concluded that LR, decision tree and SVM classifiers are promising choices as base classifiers in default-prediction modeling.

So far, only a small effort has been made to employ the PNN model for CDP modeling. Yang et al (1999), Bensic et al (2005), Abdou et al (2008), Pan (2008), Wu et al (2008) and Yang et al (2008) all investigated PNN architecture, and the results this method produced were compared with its statistical and NN counterparts. The authors concluded that the PNN was the most successful model for credit prediction and bankruptcy prediction. In contrast, the PNN had inferior clustering power to its SVM counterparts in Chaudhuri and De (2011). Further, Hájek (2011) explored four NN models – an FFNN, RBFNN, PNN and cascade correlation (CC) NN – on a US municipal credit data set and concluded that the PNN showed the best results for both four-class and nine-class municipal credit rating problems.

One major constraint of existing studies, however, is that they judge either SVMs or PNNs. The simultaneous use of SVMs and PNNs as a CDP technique has not been explored, and there is no standard database for CDP problems. Most of the existing literature (except for three studies) uses only one data set and a small number of instances for system validation. Moreover, most studies only examine the average prediction performance to assess their models. Besides, database features such as sample size, class distribution, noise and redundancy can shape the performance of prediction models. Therefore, assessing the performance of prediction algorithms using one or two performance criteria on one or two databases is often inadequate.

In light of the above findings, this study attempts to apply SVM and PNN classifiers to predict credit default and compare their performances with frequently used high-performance models, namely DA, LR, CARTs and an MLP. We observe that, as far as we know, SVM and PNN methods have not been used together for this purpose until now. The advantage of this particular application as it relates to the existing literature is twofold. First, we compare different state-of-the-art classifiers to each other, with six different data sets, in order to obtain the model with the highest accuracy and efficiency. Second, the literature indicates that although there is some evidence in favor of using one or two performance criteria, and for each criterion one or two measures to assess the performance of the competing prediction classifiers, the appraisal is usually restricted to the ranking of classifiers by way of a single measure of a solitary principle at a certain time. For example, in the area of building credit prediction models, West (2000) compared the performance of some NNs, logit models, linear discriminant analysis (LDA) and CARTs using type I and type II errors with misclassification costs. He found that, for type I errors, the logit model gave the best performance for a German credit database, while LDA gave the best performance for an Australian credit database. For type II errors, LDA outperformed its competitors for the German database, while the CART performed the best for the Australian one; the rankings of models also differ with respect to misclassification costs. Desai et al (1996) concluded that LDA methods outperformed NNs and LR on correctly classified samples, but the LR model outperformed LDA and NNs for type II errors. Moreover, Vega et al (2013) compared the performance of several NNs with some of the parametric models using area under the curve (AUC), test accuracy, type I and type II errors and misclassification costs. They found that classifier rankings vary depending on what criteria are used for measurement. The main imperfection of popular approaches to the comparative performance assessment of competing credit prediction models is that the rankings corresponding to different criteria/measures are often contradictory; this results in modelers being unable to build up-to-date conclusions that take all criteria and their measures into consideration as to which classifiers perform the best.
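The point that rankings can contradict one another across criteria can be illustrated with a minimal sketch. The two classifiers `A` and `B` and their confusion-matrix counts below are hypothetical and are not taken from any study cited here. The sketch also assumes that "default" is the positive class, that a type I error is a nondefault customer flagged as default, and that a type II error is a default customer predicted as nondefault; the cited studies may define these the other way round.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # default customers correctly predicted as default
    fn: int  # default customers predicted as nondefault
    tn: int  # nondefault customers correctly predicted as nondefault
    fp: int  # nondefault customers predicted as default


def accuracy(c):
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def type_i_error(c):
    # Assumed definition: share of nondefault customers wrongly flagged.
    return c.fp / (c.fp + c.tn)


def type_ii_error(c):
    # Assumed definition: share of default customers that are missed.
    return c.fn / (c.fn + c.tp)


def rank(models, metric, higher_is_better):
    """Return model names ordered from best to worst under one metric."""
    return sorted(models, key=lambda name: metric(models[name]),
                  reverse=higher_is_better)


# Hypothetical results on 1000 customers, 100 of whom default.
models = {
    "A": Confusion(tp=60, fn=40, tn=880, fp=20),
    "B": Confusion(tp=85, fn=15, tn=820, fp=80),
}

print(rank(models, accuracy, higher_is_better=True))        # ['A', 'B']
print(rank(models, type_i_error, higher_is_better=False))   # ['A', 'B']
print(rank(models, type_ii_error, higher_is_better=False))  # ['B', 'A']
```

Under these made-up counts, `A` wins on accuracy and type I error while `B` wins on type II error, so no single-criterion ranking settles which classifier is "best".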


Classifiers’ evaluation measures have received much attention from the pattern-recognition community. In particular, Ferri et al (2009), Sokolova and Lapalme (2009), Rodríguez et al (2016) and Carbonero-Ruz et al (2017) all assessed the performance of classifiers using an inclusive set of performance metrics (18, 24, 21 and 7 different performance metrics across numerous scenarios in the first, second, third and fourth studies, respectively). The first study used customary criteria, ie, F-measure, accuracy, different error rates, AUC, etc, to evaluate binary/multiclass forecasting measures using multidisciplinary data sets. The second study also utilized the traditional metrics supporting the multitopic, multiclass and hierarchical forecasting measures using several case studies in the area of text segmentation. The third study appraised the quality of multivariate forecasting methods. Finally, the fourth study proposed a two-dimensional framework for the comparison of learning classifiers composed by the accuracy and deviation of the hit rate among multiple classes. Nevertheless, these studies are not easy to compare with one another, as none of them specifically explores representative performance criteria, and therefore they cannot be advocated in specific areas such as CDP modeling. There is currently no benchmark performance metric that could achieve the best-fitted index for all study domains. The various performance metrics assess dissimilar trade-offs in the predictions made by models; it is possible for learning models to perform well for one criterion in one field but be suboptimal for another criterion in a different domain. For an exact problem domain, therefore, we need to discover the traits of specific instances of the problem that are likely to be associated with model performance.

Based on the above scenarios, the current study claims that the performance metrics in use at present do not fully meet the needs of credit prediction classifiers where the specific domains are highly significant and several classifiers are compared. Therefore, we address this methodological flaw by introducing a set of multidimensional evaluation measures; this is combined with some novel metrics that measure other properties (eg, the ability to avoid default instances, class association, discriminatory power, etc) and ensure the optimality of the credit prediction classifiers. Accordingly, our main objective regarding possible measures is to introduce new features that will enhance model performance. However, it is also vital to state that most of the performance criteria reported in the literature are from areas such as information retrieval, statistics or medicine. For instance, the F-measure was originally introduced in the field of information retrieval, but it is now regularly used as a performance criterion for investigating default prediction (Carbonero-Ruz et al 2017). Therefore, we remind the reader that this paper has borrowed these novel performance criteria from reviews of medical trials and behavioral research (Afina et al 2003), where they are intensively applied. These metrics are Youden’s index ($\gamma$) (Youden 1950), cross-customer ratio ($R$) and discriminatory power ($T$), which are helpful in discovering unseen features of the credit prediction model’s performance.
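As a brief illustration of why such borrowed metrics can reveal behavior that accuracy hides, the sketch below computes Youden's index under its standard definition (sensitivity plus specificity minus one). That definition is an assumption here, since this section names the metric without stating a formula; the function name `youden_index` and the counts are hypothetical, and "default" is again taken as the positive class. The cross-customer ratio and discriminatory power are not sketched because their definitions are not given in this section.

```python
def youden_index(tp, fn, tn, fp):
    """Youden's index under the standard definition (assumed here):
    sensitivity + specificity - 1, ranging from -1 to 1."""
    sensitivity = tp / (tp + fn)  # share of default customers detected
    specificity = tn / (tn + fp)  # share of nondefault customers cleared
    return sensitivity + specificity - 1.0


# Hypothetical portfolio: 100 default and 900 nondefault customers.
# A degenerate model that labels everyone "nondefault" is 90% accurate
# but has no ability to separate the two classes.
always_nondefault = youden_index(tp=0, fn=100, tn=900, fp=0)

# A model that catches 85 of the 100 defaults at the cost of 80 false alarms.
balanced = youden_index(tp=85, fn=15, tn=820, fp=80)

print(always_nondefault)  # 0.0
print(round(balanced, 4))
```

The degenerate model scores zero despite its high accuracy, which is the kind of "unseen feature" a single accuracy figure would not expose on an imbalanced credit data set.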


For the effectiveness and feasibility of the proposed investigation, alongside three frequently used data sets, ie, Australian, German and Japanese credit data sets, originating from the University of California, Irvine (UCI) machine-learning repository (Lichman 2013), the current study employs three new credit data sets. The first is assembled from a top Chinese state bank, while the second and third – PAKDD credit and Kaggle credit – have been supplied by two financial institutions. Our experimental results and statistically comparative analysis illustrate that the PNN is the most effective classifier in terms of credit default evaluation, and it has great potential as a prospective classification approach for other applications. It is also revealed that the findings gained from the scrutiny of traditional performance evaluations are more or less consistent with their new performance evaluation counterparts, except for some criteria. This paper recommends that these findings be regarded as a reliable baseline for future research into CDP analysis. In Section 2, a brief overview of an SVM with PNN classifiers for CDP analysis is described. The details of our experimental design are given in Section 3. In Section 4, the experimental results are presented. Finally, we conclude with possible directions for future research in Section 5.

2 BACKGROUND

2.1 Overview of classification techniques

The classification methods of DA, LR, MLP and the CART are well known enough for us to skip describing them here. Instead, we provide a brief overview of the SVM and PNN.

2.1.1 Support vector machine

The basic idea of applying an SVM to a CDP model can be stated briefly, as follows; for more details, please refer to Murty and Raghava (2016). For simplicity, throughout this paper we consider a two-class classification problem based on nondefault and default customers. Here, the training set is $T = \{z_i, d_i\}^{N}$, where $N$ is the number of training data points, with client feature vectors $z_i = (z_1, z_2, \ldots, z_n)$ and corresponding binary target variables $d_i = (d_1, d_2, \ldots, d_n)$. $d_i = 1$ and $d_i = -1$ stand for nondefault and default customers, respectively. Then, according to Vapnik’s original formulation, the SVM classifier satisfies the following conditions:

$$r^{T}\phi(z_i) + c \geq +1 \quad \text{for } d_i = +1, \tag{2.1}$$

$$r^{T}\phi(z_i) + c \leq -1 \quad \text{for } d_i = -1; \tag{2.2}$$

this can be restated such that

$$d_i[r^{T}\phi(z_i) + c] \geq 1, \quad i = 1, \ldots, N, \tag{2.3}$$


where $r$ and $c$ stand for the weight vector and bias term, respectively. The credit prediction input vectors are mapped into a high-dimensional feature space through the nonlinear function $\phi(z_i)$. Many hyperplanes may separate the example data points in (2.3); in general, it is best to choose the two bounding hyperplanes on opposite sides of the separating hyperplane $r^{T}\phi(z) + c = 0$ that give the largest margin, $2/\|r\|_2$. Details of the optimal separating hyperplane are given in Figure 1 in the online supplementary materials.
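The separability condition (2.3) and the margin width can be checked numerically with a minimal sketch. It assumes the identity feature map, $\phi(z) = z$, purely for illustration; the four two-dimensional customers, the weight vector `r` and the bias `c` are hypothetical values, and the function names are not from the paper.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def satisfies_margin(r, c, samples, labels, phi=lambda z: z):
    """Check condition (2.3): d_i * (r . phi(z_i) + c) >= 1 for every point."""
    return all(d * (dot(r, phi(z)) + c) >= 1 for z, d in zip(samples, labels))


def margin_width(r):
    """Distance between the two bounding hyperplanes, 2 / ||r||."""
    return 2 / math.sqrt(dot(r, r))


# Hypothetical customers: +1 = nondefault, -1 = default.
samples = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [+1, +1, -1, -1]

r, c = (0.5, 0.5), -1.0
print(satisfies_margin(r, c, samples, labels))            # True
print(margin_width(r))

# A shorter weight vector would give a wider margin, but here it
# violates (2.3), so it is not a feasible separating hyperplane.
print(satisfies_margin((0.1, 0.1), -1.0, samples, labels))  # False
```

This shows the trade-off the optimization resolves: among all `(r, c)` pairs that satisfy (2.3), the SVM prefers the one with the smallest norm of `r`, and hence the widest margin.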

Most of the classification problems, however, are linearly nonseparable cases.Therefore, it is common to introduce slack variables �i to permit misclassification.Thus, the optimization problem becomes

$$\min_{r,\,c,\,\xi} \left\{ \tfrac{1}{2}\, r^{T} r + S \sum_{i=1}^{N} \xi_i \right\}, \tag{2.4}$$

such that

$$d_i\,(r^{T}\phi(z_i) + c) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N, \tag{2.5}$$

where $S$ is the regularization parameter, which is used to balance the classification accuracy against the classifier’s complexity on the training set $T$. After constructing the Lagrangian, the solution of the primal is obtained, and this can then be converted into the following quadratic problem:

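$$\max_{\alpha} \left( v^{T}\alpha - \tfrac{1}{2}\, \alpha^{T} E \alpha \right), \tag{2.6}$$

subject to

$$\left. \begin{aligned} 0 \le \alpha_i \le S, \quad i = 1, \dots, N, \\ \sum_{i=1}^{N} \alpha_i d_i = 0, \end{aligned} \right\} \tag{2.7}$$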

where $\alpha_i$ are Lagrange multipliers, and $E_{ij} = d_i d_j\, \phi(z_i)^{T}\phi(z_j)$. Because computing the inner product directly requires a large amount of computation, it is replaced with the kernel function $K(z_i, z_j)$; this transforms the problem into a high-dimensional space in which the points are linearly separable. The linear function $z_i^{T} z_j$, polynomial function $(\sigma z_i^{T} z_j + p)^{q}$, RBF ($\exp(-\sigma \|z_i - z_j\|^{2})$) and sigmoid function ($\tanh(\sigma z_i^{T} z_j + p)$) are the four basic kernel functions that are provided by SVMs. The selection of a kernel function is highly problem-dependent and is a key issue in SVM applications. The CDP function as a final SVM classifier will be

$$D(z) = \operatorname{sign}\left( \sum_{i=1}^{N} d_i \alpha_i K(z, z_i) + c \right). \tag{2.8}$$

Once the training process has been executed to choose the hyperplane, the trained model parameters obtained from the training database are substituted into the SVM to determine the credit customer class of a test example; in other words, the nondefault credit class ($+1$) is forecasted if $D(z) > 0$, and the default credit class ($-1$) is forecasted otherwise.
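As an illustration of the kernels and the decision rule (2.8), the following minimal Python sketch may help. It is not the authors' implementation: the function names, the parameter name `scale` (standing in for the kernel scale parameter), the default values and the toy support vectors, multipliers and bias are all assumptions chosen only to make the example runnable.

```python
import math


def linear_kernel(u, v):
    """Inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, scale=1.0, p=1.0, q=2):
    """(scale * u^T v + p)^q; the defaults are illustrative assumptions."""
    return (scale * linear_kernel(u, v) + p) ** q


def rbf_kernel(u, v, scale=0.8):
    """exp(-scale * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-scale * squared_distance)


def sigmoid_kernel(u, v, scale=1.0, p=0.0):
    """tanh(scale * u^T v + p)."""
    return math.tanh(scale * linear_kernel(u, v) + p)


def svm_decision(z, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Return +1 (nondefault) if the kernel expansion is positive, else -1."""
    score = bias + sum(
        d * a * kernel(z, z_i)
        for z_i, d, a in zip(support_vectors, labels, alphas)
    )
    return 1 if score > 0 else -1


# Hypothetical trained model: one nondefault and one default support vector.
SUPPORT_VECTORS = [[0.0, 0.0], [1.0, 1.0]]
LABELS = [1, -1]
ALPHAS = [1.0, 1.0]  # satisfies sum(alpha_i * d_i) = 0
BIAS = 0.0
```

With these toy parameters, a point near the first support vector, such as `[0.1, 0.1]`, is assigned to the nondefault class, and a point near the second, such as `[0.9, 0.9]`, to the default class.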

2.1.2 Probabilistic neural network

The PNN model combines four layers: an input layer, a pattern layer, a summation layer and an output layer. The input layer distributes the credit prediction features to the pattern units, where the pattern layer usually employs the following function, as in (2.9):

$$P(k_j) = \exp\{(k_j - 1)/\sigma^{2}\}. \tag{2.9}$$

Here, $k_j$ is the dot product of the credit approval feature vector and the weight vector, while the scale parameter $\sigma^{2}$ describes the breadth of the area of influence, and normally decreases as the credit customer sample size rises. When a credit approval feature vector is presented, the pattern layer measures the distance from the input vector to the training input vectors, resulting in a new vector whose elements indicate how close the credit prediction input is to each training input. The summation layer has one neuron for every credit group, and each summation neuron sums the pattern layer neurons that belong to its own class. The activation of summation neuron $h$ is the estimated density function of population $N$. The credit prediction output neuron is a threshold discriminator that recognizes which of its credit inputs from the summation units is the highest; for more details, please refer to Cao et al (2015).
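The four layers described above can be sketched in a few lines of Python. This is only an illustrative reading, not the authors' code: it assumes the pattern activation has the common form $\exp\{(k_j - 1)/\sigma^2\}$ with input and weight vectors scaled to unit length, it averages the pattern activations per class in the summation layer, and the function names, the default `sigma` and the toy training data are all hypothetical.

```python
import math


def normalize(vector):
    """Scale a nonzero vector to unit length so dot products lie in [-1, 1]."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def pnn_predict(x, train_x, train_y, sigma=0.8):
    """Return (predicted class, per-class density estimates)."""
    x = normalize(x)                              # input layer
    sums, counts = {}, {}
    for weights, label in zip(train_x, train_y):  # pattern layer
        k = sum(a * b for a, b in zip(x, normalize(weights)))
        activation = math.exp((k - 1.0) / sigma ** 2)
        sums[label] = sums.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    # Summation layer: one unit per class (averaged here, an assumption).
    densities = {label: sums[label] / counts[label] for label in sums}
    # Output layer: pick the class with the highest summation unit.
    return max(densities, key=densities.get), densities


# Hypothetical training patterns: +1 = nondefault, -1 = default.
TRAIN_X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
TRAIN_Y = [1, 1, -1, -1]
```

For example, `pnn_predict([0.8, 0.2], TRAIN_X, TRAIN_Y)` selects class `1`, because the input lies closest to the two nondefault patterns.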

3 EXPERIMENTAL DESIGN

3.1 Real-world credit data set

In the experimentation, we focus on six real-world credit data sets to verify the feasibility and effectiveness of SVM and PNN classifiers. Australian credit, German credit and Japanese credit are three benchmarking data sets from the UCI machine-learning repository (Lichman 2013) that have been widely used to compare the performance of various classification tools. A real-life data set provided by a leading Chinese commercial bank (Chinese credit) and two other real-life data sets for 2010 supplied by two financial institutions (PAKDD and Kaggle credit) are also used. “PAKDD” denotes information from the Pacific-Asia Conference on Knowledge Discovery and Data Mining 2010 data-mining challenge (PAKDD 2010), while “Kaggle” denotes information from the “Give Me Some Credit” competition (Kaggle 2017).

TABLE 1 Description of the data sets used in the experiment.

| Data set | Total cases | Nondefault/default cases | Number of attributes |
|---|---|---|---|
| Australian credit | 690 | 307/383 | 14 |
| German credit | 1 000 | 700/300 | 20 |
| Japanese credit | 690 | 307/383 | 15 |
| Chinese credit | 3 111 | 3 040/71 | 81 |
| PAKDD credit | 50 000 | 36 959/13 041 | 34 |
| Kaggle credit | 150 000 | 139 974/10 026 | 11 |

TABLE 2 The confusion matrix for a classification problem.

| Actual observations | Predicted positive | Predicted negative |
|---|---|---|
| Actual positive | TP | FN |
| Actual negative | FP | TN |

The data sets include examples of good customers and bad customers with a binary response variable, characterized by a set of risk drivers that capture information from the customer application form (eg, financial, nonfinancial, macroeconomic, etc) and customer information (eg, demographic, sociographic, etc). A summary of the six data sets is presented in Table 1; additional facts are given in the online supplementary materials.
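The class mix in Table 1 varies widely across data sets, which matters when reading the accuracy figures later on. The short Python sketch below, with counts copied from Table 1, computes the share of default cases and the accuracy that a trivial rule would reach by always predicting the majority class. The dictionary and helper names are hypothetical, and the majority-class baseline is an illustration added here rather than something reported in the paper.

```python
# (total cases, nondefault cases, default cases), copied from Table 1.
DATA_SETS = {
    "Australian credit": (690, 307, 383),
    "German credit": (1000, 700, 300),
    "Japanese credit": (690, 307, 383),
    "Chinese credit": (3111, 3040, 71),
    "PAKDD credit": (50000, 36959, 13041),
    "Kaggle credit": (150000, 139974, 10026),
}


def class_balance(total, nondefault, default):
    """Return (default share, accuracy of always predicting the majority class)."""
    if nondefault + default != total:
        raise ValueError("class counts do not add up to the total")
    return default / total, max(nondefault, default) / total


for name, counts in DATA_SETS.items():
    default_share, baseline = class_balance(*counts)
    print(f"{name}: default share {default_share:.2%}, "
          f"majority-class accuracy {baseline:.2%}")
```

On the Chinese credit data, for instance, only 71 of 3 111 cases are defaults, so a rule that labels every customer as nondefault is already right for roughly 97.7% of cases while detecting no defaults at all.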

3.2 Performance evaluation

The evaluation of a credit approval model’s performance plays a significant role in the design of a classifier system; therefore, the choice of a suitable criterion becomes as vital as the choice of a good prediction algorithm to effectively tackle a given problem. For a binary classification problem, the decision made by an algorithm over a set of credit instances can be stated in the form of a $2 \times 2$ confusion matrix, such as that given in Table 2. Here, each cell $(i, j)$ holds the number of nondefault/default predictions, and there are four occupied cells: true positives (TPs), true negatives (TNs), false positives (FPs) and false negatives (FNs).

Traditionally, CDP modeling has used accuracy (Acc) (3.1) as the criterion in credit approval model assessment. This criterion denotes the percentage of credit instances predicted correctly:

$$\mathrm{Acc} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FN} + \mathrm{TN} + \mathrm{FP}}. \tag{3.1}$$

Apart from the accuracy criterion, many other performance measures have been used with the aim of discovering the unseen characteristics of the algorithm, as the best evaluation criterion depends on many factors. Evidently, the relative judgment then becomes a complex task, as there is currently no standard methodology for this in the credit prediction realm. Besides, the criterion mentioned above overlooks the cost of different error types (default candidates being predicted as nondefault, or vice versa). Numerous assessment metrics have therefore been developed to measure the different aspects of the problem. The most common are the true positive rate (TP rate) and true negative rate (TN rate), which have typically been employed to monitor the credit prediction performance on each credit class separately. Note that the TP rate (3.2) evaluates the proportion of nondefault credits that are predicted to be nondefault, whereas the TN rate (3.3) evaluates the proportion of default credits that are predicted to default. Normally, high TP and TN rates entail a high credit approval performance:

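$$\text{TP rate} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \tag{3.2}$$

$$\text{TN rate} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}. \tag{3.3}$$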
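To make (3.1)–(3.3) concrete, the following Python sketch counts the four confusion-matrix cells from lists of actual and predicted labels and then evaluates accuracy, TP rate and TN rate. The label coding ($+1$ for nondefault as the positive class, $-1$ for default) follows the convention used in this paper; the function names and the toy label lists are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FN, FP, TN) with `positive` as the nondefault class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def accuracy(tp, fn, fp, tn):
    """Equation (3.1)."""
    return (tp + tn) / (tp + fn + tn + fp)


def tp_rate(tp, fn):
    """Equation (3.2): share of nondefault cases predicted as nondefault."""
    return tp / (tp + fn)


def tn_rate(tn, fp):
    """Equation (3.3): share of default cases predicted as default."""
    return tn / (tn + fp)


# Hypothetical example: four nondefault and two default customers.
ACTUAL = [1, 1, 1, 1, -1, -1]
PREDICTED = [1, 1, 1, -1, 1, -1]
TP, FN, FP, TN = confusion_counts(ACTUAL, PREDICTED)
```

Here three of the four nondefault customers and one of the two default customers are classified correctly, so the TP rate is 0.75 and the TN rate is 0.5. The sketch does not guard against an empty class, which would make a denominator zero.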

Likewise, the positive predictive value (PPV) (3.4) of a model is defined as the proportion of customers predicted as nondefault that are actually nondefault, while the negative predictive value (NPV) (3.5) of a model is defined as the proportion of customers predicted as default that actually default. The more sensitive the model, the greater its NPV, and the more specific the model, the higher its PPV:

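$$\mathrm{PPV} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \tag{3.4}$$

$$\mathrm{NPV} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FN}}. \tag{3.5}$$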

In (3.6), the $F$-measure combines the PPV and TP rate into a single metric, representing the harmonic mean of these two criteria. Similarly, the $G$-mean (3.7) can be perceived as a balanced performance of the classifier between the two classes. At the credit approval forecasting stage, the Matthews correlation coefficient (MCC) (3.8) is desirable: this specifies the model quality for the two-class problem, particularly when the binary classes are of dissimilar sizes. MCC ranges from $-1$ to $+1$, where $-1$, $0$ and $+1$ refer to inverse classification, average classification and perfect classification performance, respectively:

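$$F\text{-measure} = \frac{2 \times \mathrm{PPV} \times \text{TP rate}}{\mathrm{PPV} + \text{TP rate}}, \tag{3.6}$$

$$G\text{-mean} = (\text{TP rate} \times \text{TN rate})^{1/2}, \tag{3.7}$$

$$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{[(\mathrm{TP} + \mathrm{FN})(\mathrm{TP} + \mathrm{FP})(\mathrm{FP} + \mathrm{TN})(\mathrm{TN} + \mathrm{FN})]^{1/2}}, \tag{3.8}$$

$$\mathrm{AUC} = \frac{\text{TP rate} + \text{TN rate}}{2}. \tag{3.9}$$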

Finally, the AUC statistic (3.9) is also an important index for measuring the discriminatory power of a CDP model based on the area under a receiver operating characteristic (ROC) curve, where a classifier is preferred if its ROC curve is closer to the upper-left corner, that is to say, it has a large AUC value. The ideal CDP model should present an AUC of 1.0, while an AUC of 0.5 denotes a random classifier.
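The measures in (3.4)–(3.9) can all be derived from the four confusion-matrix counts. The Python sketch below is one way to do so; the function name, the dictionary keys and the example counts are hypothetical, and the AUC value is the single-threshold form given in (3.9) rather than an integral over a full ROC curve.

```python
import math


def summary_metrics(tp, fn, fp, tn):
    """Evaluate (3.2)-(3.9) from confusion-matrix counts."""
    tp_rate = tp / (tp + fn)
    tn_rate = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "tp_rate": tp_rate,
        "tn_rate": tn_rate,
        "ppv": ppv,
        "npv": npv,
        "f_measure": 2 * ppv * tp_rate / (ppv + tp_rate),
        "g_mean": math.sqrt(tp_rate * tn_rate),
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fn) * (tp + fp) * (fp + tn) * (tn + fn)),
        "auc": (tp_rate + tn_rate) / 2,
    }


# Hypothetical counts: 50 nondefault and 50 default customers.
EXAMPLE = summary_metrics(tp=40, fn=10, fp=5, tn=45)
```

For these counts the TP rate is 0.8 and the TN rate is 0.9, so the AUC from (3.9) is 0.85, while the MCC is about 0.70. A perfect classifier (no FP or FN) gives an MCC of exactly 1.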

However, our principal objective regarding possible measures is to introduce new features in order to enhance model performance. As a preface to our arguments, we remind the reader that this study has borrowed its performance criteria from reviews of medical trials and behavioral research, where they are intensively applied. Accordingly, in (3.10) the cross-customer ratio ($R$) does exactly what its name implies and is useful in measuring how well the classifier distinguishes nondefault from default creditors. The value of $R$ ranges from 0 to infinity, with higher values indicating better distinguishing power, ie, the larger the $R$, the better a classifier can differentiate nondefault from default creditors:

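$$\text{Cross-customer ratio } (R) = \frac{\mathrm{PPV} \times \mathrm{NPV}}{(1 - \mathrm{PPV}) \times (1 - \mathrm{NPV})}. \tag{3.10}$$

$$\text{Youden's index } (\gamma) = \text{sensitivity} - (1 - \text{specificity}). \tag{3.11}$$

$$\text{Discriminatory index } (T) = (\text{sensitivity} + \text{specificity} - 1) \ln R. \tag{3.12}$$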

It is evident that the TP rate and TN rate have conflicting trends with regard to class preference between nondefault and default groups. In such cases, Youden’s index ($\gamma$) in (3.11) can be used to select an appropriate class. Youden’s index captures the performances of both classes by assigning equal weight to the classifier’s performance in nondefault and default instances. Since it is a combined measure of the TP rate and TN rate, one has to select a classifier with a high Youden’s index, which signifies a greater ability to avoid default instances (Deng and Han 2016). However, a default object should reflect the performance of the merged class with regard to the discriminatory index ($T$) (Rodríguez et al 2016), which is popularly known as “test effectiveness”. $T$ is described as the ability of an algorithm to classify nondefault and default objects, and is formally stated in (3.12). The cross-customer ratio $R$ plays the most significant role in the determination of the discriminatory power of a classifier, and therefore the larger the $R$, the better the classifier for distinguishing default instances, ie, the better its discriminatory power $T$. A credit prediction classifier is a poor discriminant if $T < 1$, a limited one if $T < 2$, a good one if $T < 3$, and an excellent one in other cases.
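The three borrowed criteria can be checked against the tabulated results with a short Python sketch. The function names are hypothetical. $T$ is computed here as Youden's index multiplied by $\ln R$; this reading of (3.12) is an assumption, adopted because it reproduces the values reported in Table 3 (for the SVM row, $0.3597 \times \ln 4.5152 \approx 0.5422$).

```python
import math


def cross_customer_ratio(ppv, npv):
    """Equation (3.10)."""
    return (ppv * npv) / ((1 - ppv) * (1 - npv))


def youden_index(tp_rate, tn_rate):
    """Equation (3.11): sensitivity - (1 - specificity)."""
    return tp_rate - (1 - tn_rate)


def discriminatory_index(tp_rate, tn_rate, ppv, npv):
    """Youden's index times ln R (assumed reading of (3.12))."""
    return youden_index(tp_rate, tn_rate) * math.log(
        cross_customer_ratio(ppv, npv)
    )


def discriminant_label(t):
    """Verbal scale used in the text for the discriminatory index T."""
    if t < 1:
        return "poor"
    if t < 2:
        return "limited"
    if t < 3:
        return "good"
    return "excellent"


# Rounded inputs copied from Table 3 (Australian data set).
MLP_ROW = dict(tp_rate=0.9475, tn_rate=0.9288, ppv=0.9426, npv=0.9348)
SVM_ROW = dict(tp_rate=0.6930, tn_rate=0.6667, ppv=0.4853, npv=0.8272)
```

Because the inputs are the four-decimal values printed in Table 3, the recomputed figures agree with the reported $R$, $\gamma$ and $T$ only up to rounding. On the verbal scale, the MLP row falls in the "excellent" band and the SVM row in the "poor" band.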

4 RESULTS AND DISCUSSION

In order to fully appreciate the implications and performance of our classifiers, reduce the variability and increase the consistency of the conclusions drawn, $t$-fold cross-validation (CV) was applied, ie, the original data set was partitioned into $t$ subsets/folds of equal size, each of which was used in turn for testing while the remaining folds were used for training. Accordingly, the final outcome is obtained by averaging the results over all of the subsets that have been tested. However, concerns might arise over the number of subsets/folds to be tested. A number of empirical studies (Liu and Liao 2017; Barrow and Crone 2016; Donate et al 2013) reported that between five and twenty subsets is a workable choice when using data sets of various sizes, with replications of the investigated methodology also being desirable in order to vary the assignment of instances to testing and training as much as possible, so that bias related to the initialization of the approach can be minimized. Moreover, Zhang and Yang (2015) recently reported that $t \le 5$ is clearly poor, ten-fold is considerably worse, and repeated twenty-fold CVs are the best for selecting classification algorithms. In addition, a larger number of folds implies a smaller variance of each subset (Liu and Liao 2017). Based on these findings, a twenty-fold CV was repeated ten times, ie, a $20 \times 10$ CV, generating 200 test outcomes; these were then averaged to produce the final output for the respective data set to achieve reliable conclusions.
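The $20 \times 10$ scheme can be sketched as an index generator in Python. This is a minimal illustration, not the authors' procedure: the function name and the seed are hypothetical, the shuffling is unstratified (class proportions are not preserved within folds), and fold sizes differ by at most one when the sample size is not a multiple of twenty.

```python
import random


def repeated_kfold(n_samples, n_folds=20, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) for each fold of each repetition."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a new random partition per repetition
        for fold in range(n_folds):
            test = indices[fold::n_folds]  # every n_folds-th shuffled index
            test_set = set(test)
            train = [i for i in indices if i not in test_set]
            yield train, test


# Example with the size of the Australian credit data set (690 cases).
SPLITS = list(repeated_kfold(690))
```

This yields 200 train/test splits, each holding out 34 or 35 of the 690 cases; a performance measure would then be evaluated on every test fold and the 200 values averaged.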

To date, there is no established guidance for selecting an SVM kernel function and its hyperparameters, and comparisons between various kernel functions and parameter selections are not the focus of the investigated methodology. The RBF is one of the most commonly used kernel functions for running an SVM algorithm in CDP analysis (Xia et al 2017). In order to improve our accuracy, therefore, the RBF was adopted. To simplify the setting of the parameter $\sigma$, Lu et al (2009) state that in multivariate $d$-dimensional problems the RBF width parameter $\sigma$ is set such that $\sigma^{n} \sim (0.1, 0.5)$, where $n$ is the number of input variables; $\sigma = 0.8$ is employed for all experiments in this methodology. In addition, for the PNN, the gene pool size is set to 300, and the smoothing factor is chosen as 0.8. As suggested by Tseng and Hu (2010), a single hidden layer network is adequate to design any neural system; therefore, we use just one hidden layer in the MLP architecture, with the number of hidden neurons chosen so that the architecture generates feasible credit prediction outcomes. In addition, to improve generalization, it is crucial to avoid overfitting and underfitting. With too few hidden units, the network might not have enough flexibility to capture the nonlinearities in the data. With too many hidden units, however, the network can achieve enough flexibility, but this may also result in overfitting. In our study, the number of hidden units is in the range of five to 100, which has been found empirically to address the overfitting/underfitting challenges (Adeodato et al 2011; Kim and Ahn 2012; Gajewski and Valis 2017). Moreover, the training parameters are randomly determined in order to train MLP architectures. Typically, the learning rate varies between 0.01 and 0.4, the momentum fluctuates between 0.8 and 0.99, and the training lengths range from 1000 to 10 000 epochs (Liu et al 2016). In addition, the CART methodology splits the points via the Gini index and entails growing a tree by employing a recursive partitioning process. To prune the tree, we utilize the minimal cost complexity measure as well as the one-standard-error rule in order to choose a feasible tree structure. Last, cutoff points of 0.50 are used for the LR and DA classifiers. The comprehensive CDP results employing the above-mentioned six modeling techniques are summarized in Tables 3–8.
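The Gini-based splitting step mentioned for the CART can be illustrated with a small Python sketch. It covers only the choice of a threshold on a single numeric attribute; recursive partitioning, cost-complexity pruning and the one-standard-error rule are not shown. The function names and the toy data are hypothetical.

```python
def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted Gini impurity) of the best binary split."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_impurity = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical attribute values with +1 = nondefault and -1 = default.
VALUES = [1.0, 2.0, 3.0, 4.0]
LABELS = [1, 1, -1, -1]
```

For this toy attribute, the impurity of the unsplit node is 0.5, and the split at 2.5 separates the two classes completely, giving a weighted impurity of 0.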

From the experimental results shown in Table 3 for the Australian credit data set, the MLP classifier shows the most competitive predictive performance in all criteria, for example, an accuracy of 93.91%, TP rate of 94.75%, TN rate of 92.88%, ..., and an AUC of 93.82%. By contrast, the PNN, LR, DA and CART models yielded accuracies of 86.47% to 85.51%, TP rates of 93.01% to 87.84%, TN rates of 85.42% to 78.67%, ..., and AUCs of 87.36% to 85.84%. As shown in Table 3, the MLP has performed much better than its rivals, as confirmed by the highest predictive performance on the traditional measures, and on the new measures too, producing the highest $R$ (235.47), $\gamma$ (0.8763) and $T$ (4.7860). However, the SVM shows the least potential in all performance measures.

For the German credit data set (see Table 4), the PNN classifier has the highest predictive performance in all experiments. Generally, we observe that the PNN produced the best results, with an overall prediction rate of 98.30%, TP rate of 98.17%, TN rate of 98.63%, ..., and AUC of 98.39%; it also exhibited the highest $R$ (3841.3846), $\gamma$ (0.9679) and $T$ (7.9888). In fact, the remaining five classifiers reveal “average performance” for both types of criteria. LR was the runner-up, with an accuracy of 76.80%, TN rate of 65.30%, PPV of 89.16%, ..., and a $T$ of 0.9155. However, it was less impressive in terms of its TP rate (80.03%) and its results for three other measures. A careful examination of these results reveals that the prediction performance for the nondefault credit class is significantly higher than the prediction performance for the default credit class.

TABLE 3 Classifier performance on the Australian data set.

| Credit prediction models | Accuracy | TP rate | TN rate | PPV | NPV | F-measure | MCC | G-mean | AUC | R | $\gamma$ | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.6749 | 0.6930 | 0.6667 | 0.4853 | 0.8272 | 0.5709 | 0.3353 | 0.6797 | 0.6798 | 4.5152 | 0.3597 | 0.5422 |
| PNN | 0.8647 | 0.8784 | 0.8542 | 0.8228 | 0.9011 | 0.8497 | 0.7282 | 0.8662 | 0.8663 | 42.3016 | 0.7326 | 2.7433 |
| MLP | 0.9391 | 0.9475 | 0.9288 | 0.9426 | 0.9348 | 0.9450 | 0.8769 | 0.9381 | 0.9382 | 235.4700 | 0.8763 | 4.7860 |
| CART | 0.8551 | 0.9301 | 0.7867 | 0.7990 | 0.9251 | 0.8596 | 0.7204 | 0.8554 | 0.8584 | 49.0706 | 0.7168 | 2.7907 |
| LR | 0.8754 | 0.9114 | 0.8359 | 0.8590 | 0.8958 | 0.8844 | 0.7510 | 0.8728 | 0.8736 | 52.3582 | 0.7472 | 2.9576 |
| DA | 0.8594 | 0.9333 | 0.7917 | 0.8042 | 0.9283 | 0.8640 | 0.7288 | 0.8596 | 0.8625 | 53.2000 | 0.7250 | 2.8812 |

TABLE 4 Classifier performance on the German data set.

| Credit prediction models | Accuracy | TP rate | TN rate | PPV | NPV | F-measure | MCC | G-mean | AUC | R | $\gamma$ | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.7450 | 0.7908 | 0.5957 | 0.8643 | 0.4667 | 0.8259 | 0.3577 | 0.6864 | 0.6933 | 5.5724 | 0.3866 | 0.6641 |
| PNN | 0.9830 | 0.9817 | 0.9863 | 0.9943 | 0.9567 | 0.9879 | 0.9594 | 0.9839 | 0.9839 | 3841.3846 | 0.9679 | 7.9888 |
| MLP | 0.7210 | 0.8006 | 0.5336 | 0.8017 | 0.5318 | 0.8011 | 0.3338 | 0.6536 | 0.6671 | 5.5919 | 0.3341 | 0.5093 |
| CART | 0.7440 | 0.7868 | 0.5973 | 0.8700 | 0.4500 | 0.8263 | 0.3506 | 0.6856 | 0.6921 | 5.4755 | 0.3842 | 0.6532 |
| LR | 0.7680 | 0.8003 | 0.6530 | 0.8916 | 0.4783 | 0.8435 | 0.4094 | 0.7229 | 0.7227 | 7.5384 | 0.4532 | 0.9155 |
| DA | 0.7400 | 0.8706 | 0.5481 | 0.7389 | 0.7425 | 0.7994 | 0.4494 | 0.6908 | 0.7094 | 8.1610 | 0.4187 | 0.8791 |

Similarly, as indicated in Table 5, the results for the Japanese credit data set exhibit patterns in all measurement aspects that are similar to those for the German database; that is to say, the PNN prediction classifier performs better than its counterparts, with the highest average accuracy of 90.59%, a TP rate of 88.42%, a TN rate of 93.33%, ..., and an AUC of 90.88%; it is thereby the preferred method, a statement that is supported by the fact that it also boasts the highest $R$ (106.9091), $\gamma$ (0.8175) and $T$ (3.8195). The predictive performance of DA ranks second in almost all criteria except three. Prediction results for all other models range from 85.80% to 82.61% for accuracy, 82.75% to 78.73% for TP rate, 90.37% to 85.83% for TN rate, ..., and 2.6334 to 2.0231 for $T$. As for the reported results, all of their predictive performance rates reach 75%, except MCC; this indicates that all of these models possess sufficient capabilities to predict the default and nondefault borrower, although SVM is the worst performer.

For the Chinese credit data set, the empirical results presented in Table 6 reveal that the PNN classifier produces the best results in nine out of twelve criteria, boasting the highest values for average accuracy (98.91%), TN rate (99.21%), ..., and AUC (91.91%) as well as $R$ (687.5000), $\gamma$ (0.8382) and $T$ (5.4761). However, the CART classifier possesses the highest capability of predicting the nondefault borrower, with a TP rate of 99.58%, followed closely by DA (99.49%). The highest PPV (99.97%) belongs to the LR classifier, the most specific model, closely followed by the SVM (99.83%).

For the PAKDD credit modeling database, it is surprising to see in Table 7 that all classifiers produce, on average, significantly lower results compared with other databases for all criteria. However, the LR classifier has better predictability based on some customary as well as novel measures. Alongside this, the SVM model shows a higher discriminatory power based on traditional measures, for example, an accuracy of 73.93%, a PPV of 99.97%, etc. From these outcomes, it is clear that when a database is larger with vastly incomplete features, it may contain noisy/outlier information with redundant and irrelevant features, resulting in a poor performance from the trained models. However, the Kaggle credit database results presented in Table 8 reveal that the MLP classifier produces the best results for both new and old performance criteria, with the exception of three measures; for instance, accuracy (93.60%), TN rate (57.20%), $F$-measure (0.9665) and AUC (75.78%); it also achieves the maximum $R$ (22.3268), $\gamma$ (0.5156) and $T$ (1.6012). The SVM classifier exhibits a good extrapolative performance with a PPV of 99.95%. Again, very good average results are shown by the PNN, MLP and CART for accuracy, TP rate and $F$-measure.

Further details can be found in “Experimental results: some extensions”, which could not be included in this manuscript due to space constraints but is available in Section 3 of the online supplementary materials. The following findings are drawn

TABLE 5 Classifier performance on the Japanese data set.

| Credit prediction models | Accuracy | TP rate | TN rate | PPV | NPV | F-measure | MCC | G-mean | AUC | R | $\gamma$ | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.8261 | 0.7895 | 0.8583 | 0.8306 | 0.8224 | 0.8095 | 0.6504 | 0.8232 | 0.8239 | 22.7163 | 0.6478 | 2.0231 |
| PNN | 0.9059 | 0.8842 | 0.9333 | 0.9438 | 0.8642 | 0.9130 | 0.8128 | 0.9084 | 0.9088 | 106.9091 | 0.8175 | 3.8195 |
| MLP | 0.8522 | 0.8275 | 0.8727 | 0.8436 | 0.8590 | 0.8355 | 0.7014 | 0.8498 | 0.8501 | 32.8746 | 0.7002 | 2.4454 |
| CART | 0.8493 | 0.8162 | 0.8780 | 0.8534 | 0.8459 | 0.8344 | 0.6968 | 0.8466 | 0.8471 | 31.9729 | 0.6942 | 2.4055 |
| LR | 0.8580 | 0.8101 | 0.9037 | 0.8893 | 0.8329 | 0.8478 | 0.7179 | 0.8556 | 0.8569 | 40.0216 | 0.7138 | 2.6334 |
| DA | 0.8565 | 0.7873 | 0.9329 | 0.9283 | 0.7990 | 0.8520 | 0.7237 | 0.8570 | 0.8601 | 51.4817 | 0.7202 | 2.8385 |

TABLE 6 Classifier performance on the Chinese data set.

| Credit prediction models | Accuracy | TP rate | TN rate | PPV | NPV | F-measure | MCC | G-mean | AUC | R | $\gamma$ | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.9785 | 0.9800 | 0.6429 | 0.9983 | 0.1268 | 0.9891 | 0.2792 | 0.7937 | 0.8114 | 88.1129 | 0.6228 | 2.7894 |
| PNN | 0.9891 | 0.8462 | 0.9921 | 0.6875 | 0.9968 | 0.7586 | 0.7574 | 0.9162 | 0.9191 | 687.5000 | 0.8382 | 5.4761 |
| MLP | 0.9817 | 0.9905 | 0.6000 | 0.9908 | 0.5915 | 0.9906 | 0.5864 | 0.7709 | 0.7952 | 155.7931 | 0.5905 | 2.9810 |
| CART | 0.9402 | 0.9958 | 0.2532 | 0.9428 | 0.8310 | 0.9686 | 0.4390 | 0.5022 | 0.6245 | 80.9837 | 0.2491 | 1.0944 |
| LR | 0.9781 | 0.9784 | 0.8000 | 0.9997 | 0.0563 | 0.9889 | 0.2088 | 0.8847 | 0.8892 | 181.1343 | 0.7784 | 4.0471 |
| DA | 0.9550 | 0.9949 | 0.3094 | 0.9589 | 0.7887 | 0.9765 | 0.4769 | 0.5548 | 0.6521 | 87.0613 | 0.3043 | 1.3591 |

TABLE 7 Classifier performance on the PAKDD credit modeling data set.

| Credit prediction models | Accuracy | TP rate | TN rate | PPV | NPV | F-measure | MCC | G-mean | AUC | R | $\gamma$ | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.7393 | 0.7394 | 0.1539 | 0.9997 | 0.0002 | 0.8501 | -0.0040 | 0.3373 | 0.4466 | 0.5160 | -0.1067 | 0.0706 |
| PNN | 0.7131 | 0.7389 | 0.2571 | 0.9461 | 0.0529 | 0.8298 | -0.0020 | 0.4359 | 0.4980 | 0.9795 | -0.0040 | 8.2E-5 |
| MLP | 0.6989 | 0.7456 | 0.3113 | 0.8999 | 0.1284 | 0.8155 | 0.0401 | 0.4818 | 0.5284 | 1.3246 | 0.0569 | 0.0160 |
| CART | 0.7392 | 0.7394 | 0.1333 | 0.9996 | 0.0002 | 0.8501 | -0.0051 | 0.3140 | 0.4364 | 0.4366 | -0.1272 | 0.1055 |
| LR | 0.7378 | 0.7401 | 0.3612 | 0.9947 | 0.0085 | 0.8487 | 0.0180 | 0.5170 | 0.5507 | 1.6101 | 0.1013 | 0.0482 |
| DA | 0.7189 | 0.7423 | 0.3105 | 0.9494 | 0.0647 | 0.8332 | 0.0273 | 0.4801 | 0.5264 | 1.2976 | 0.0529 | 0.0138 |

TABLE 8 Classifier performance on the Kaggle credit data set.

| Credit prediction models | Accuracy | TP rate | TN rate | PPV | NPV | F-measure | MCC | G-mean | AUC | R | $\gamma$ | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.9332 | 0.9336 | 0.5200 | 0.9995 | 0.0078 | 0.9654 | 0.0574 | 0.6968 | 0.7268 | 15.2353 | 0.4536 | 1.2355 |
| PNN | 0.9226 | 0.9331 | 0.0633 | 0.9878 | 0.0115 | 0.9597 | -0.0016 | 0.2430 | 0.4982 | 0.9425 | -0.0036 | 0.0002 |
| MLP | 0.9360 | 0.9435 | 0.5720 | 0.9908 | 0.1718 | 0.9665 | 0.2896 | 0.7347 | 0.7578 | 22.3268 | 0.5156 | 1.6012 |
| CART | 0.9077 | 0.9486 | 0.2972 | 0.9527 | 0.2793 | 0.9506 | 0.2388 | 0.5309 | 0.6229 | 7.8032 | 0.2458 | 0.5050 |
| LR | 0.9338 | 0.9356 | 0.5656 | 0.9976 | 0.0426 | 0.9656 | 0.1420 | 0.7274 | 0.7506 | 18.9390 | 0.5013 | 1.4743 |
| DA | 0.9336 | 0.9391 | 0.5181 | 0.9932 | 0.1012 | 0.9654 | 0.2077 | 0.6976 | 0.7286 | 16.5894 | 0.4573 | 1.2843 |

from the discussion above, the credit prediction databases, the experimental results in the manuscript and the supplementary information.

(1) The use of only one database is not enough to make a fair judgment or offer a consistent conclusion on CDP analysis. Database composition is currently of particular interest and relevance in credit risk prediction due to the introduction of compliance guidelines, such as Basel II and Basel III (Moges et al 2013). Therefore, it is much better to use several databases for validating the prediction classifier.

(2) The selection of performance criteria should reflect the business objectives of a real problem; a combined application of traditional and new measures should be employed. Under this assumption, the AUC, $F$-measure, $G$-mean, etc, along with some new criteria ($\gamma$, $T$, $R$), are suggested as more informative measures (see Section 3 of the online supplementary materials) and provide a valuable tool to fill the methodological gap and capture the true nature of the relative performance for problems such as the prediction of customer insolvency.

(3) Finally, for our experiments, the numerical results shown in Figures A2–A10 and Tables A1–A3 (see Section 3 of the online supplementary materials) seem to suggest that the PNN credit prediction model has brought significant improvements to credit admission decisions. Therefore, the empirical results obtained support the conclusion that the PNN classifier is more robust than the SVM classifier in CDP analysis.

5 CONCLUSIONS

Financial institutions have been experiencing serious challenges and competition in recent years due to financial crises. CDP has become an increasingly important issue for them because of the generalization and propagation of systematic risk in a global financial environment, the high social costs of customer failures and the growing demand for consumer credit. The design of reliable models to predict credit-granting decisions is therefore crucial for many decision-making processes. Although a large number of models have been designed to predict customer insolvency, providing an inclusive appraisal of prediction models and recommending adequate models is still an imperative and understudied area in credit risk management.

Consequently, this study is motivated by the need to develop a more accurate and flexible model for evaluating credit insolvency. Many methods have been applied to this problem, among which SVMs and PNNs have shown an excellent ability to handle credit prediction. The predictive ability of SVM and PNN classifiers is also compared with DA, LR, the CART and MLP, the four most commonly used credit prediction models. All of these models have been applied to six real-world credit databases: Australian, German and Japanese credit databases, which are from the UCI machine-learning repository; a Chinese credit database, which is from one of the leading Chinese commercial banks; and the databases PAKDD and Kaggle credit, which have been provided by two financial institutions. In addition, for an efficient evaluation of classifier performance in default prediction problems, a set of appraisal metrics along with some novel criteria have been used. These metrics include average accuracy, TP rate, TN rate, PPV, NPV, AUC, F-measure, MCC and G-mean, which are traditional measures, and our new evaluation criteria: Youden's index (γ), cross-customer ratio (R) and discriminatory power (T), which measure other characteristics of classifiers. To the best of our knowledge, none of the previous literature has attempted the simultaneous use of the above-mentioned criteria in CDP modeling.
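As a rough illustration of how several of the threshold-based measures listed above relate to one another, the following sketch computes them from a single 2×2 confusion matrix. It is not the authors' evaluation code: the function name `confusion_metrics` and the example counts are hypothetical, "accuracy" here is the plain overall accuracy, Youden's index is taken in its standard form (TP rate + TN rate − 1), and the AUC, the cross-customer ratio and the discriminatory power are omitted because their inputs or definitions are not given in this section.

```python
from math import sqrt


def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Threshold-based metrics from a 2x2 confusion matrix.

    Assumption: the "positive" class is the class of interest (e.g. default).
    """
    tpr = tp / (tp + fn)  # TP rate (sensitivity)
    tnr = tn / (tn + fp)  # TN rate (specificity)
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f_measure = 2 * ppv * tpr / (ppv + tpr)
    g_mean = sqrt(tpr * tnr)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    youden = tpr + tnr - 1  # standard definition of Youden's index
    return {
        "accuracy": accuracy,
        "tpr": tpr,
        "tnr": tnr,
        "ppv": ppv,
        "npv": npv,
        "f_measure": f_measure,
        "g_mean": g_mean,
        "mcc": mcc,
        "youden": youden,
    }


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=40, fn=10, fp=20, tn=30)
```

Computing the measures side by side in this way makes it easier to see where they agree and where, as noted for some criteria, they can rank classifiers differently.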

The experimental results and the comparative statistical analysis show that the PNN is a more effective classifier than the SVM for credit default evaluation, and it has great potential as a prospective classification approach for other applications. It is also highlighted that the findings gained from the scrutiny of traditional performance evaluations are more or less consistent with their new performance evaluation counterparts, except for some criteria. A diverse set of assessment metrics thus assists us in better comprehending the comparative performance of CDP algorithms. With these contributions, therefore, our investigations bring several new insights to default prediction and financial risk management. First, conventional CDP models depend on personal judgments, which are based on prior knowledge. As database volumes and business demands expand, the customary approach cannot assess credit prediction efficiently. Thanks to the development of computer power and data-storage technologies, our investigated algorithms can be used to rapidly predict credit customers' behavior and accordingly accelerate the analytical process. Second, the feasible algorithms used in this study employing the PNN classifier offer higher predictive accuracies than conventional models in all aspects, thus adding to profits and minimizing possible losses. Further, our investigated algorithms offer several indirect advantages, for example, the rapid processing of credit approval databases, efficient management of financial risk, fewer prerequisites for modelers of CDP and more optimal allotment of exploratory resources.

Although our proposed classifier performs better than the existing methods for credit prediction, some open issues still exist for the extension of the proposed methodology. Thus, as a further research step, this study could be considerably extended by incorporating additional features. For instance, it would be interesting to see what advances would emerge if a more complex model, such as a deep neural network or extreme learning machine, were employed, and if more relevant variables were collected to improve credit prediction ability. The investigations could be extended to include other financial products too, such as agricultural loans, house loans and mortgages.

DECLARATION OF INTEREST

The authors declare that there are no conflicts of interest regarding the publication of this paper. This work has been supported by the Key Projects of National Natural Science Foundation of China (71731003 and 71431002), the National Natural Science Foundation of China (71471027 and 71171031), the National Social Science Foundation of China (16BTJ017), the National Natural Science Foundation of China Youth Project (71503199 and 71601041), the Social Science Foundation of Liaoning Province of China (L16BJY016) and the Key Projects of Economic and Social Development Foundation of Liaoning Province (2015lslktzdian-05). We thank the organizations mentioned above.

ACKNOWLEDGEMENTS

We are extremely grateful to the editors-in-chief, Ashish Dev and Michael Gordy, and anonymous reviewers for their substantial contributions, because the manuscript has been significantly improved with their assistance. I (the first author) would also like to express the most profound sense of appreciation for my beloved teacher, Professor Dr. M. Shamim Uddin Khan from the University of Chittagong in Bangladesh, who shows a spirit of adventure with regard to my research works, and whose insightful suggestions and expertise have inspired me throughout my whole academic career.

REFERENCES

Abdou, H., Pointon, J., and Masry, A. E. (2008). Neural nets versus conventional techniques in credit scoring in Egyptian banking. Expert Systems with Applications 35, 1275–1292 (https://doi.org/10.1016/j.eswa.2007.08.030).

Abellán, J., and Castellano, J. (2017). A comparative study on base classifiers in ensemble methods for credit scoring. Expert Systems with Applications 73, 1–10 (https://doi.org/10.1016/j.eswa.2016.12.020).

Abellán, J., and Mantas, C. (2014). Improving experimental studies about ensembles of classifiers for bankruptcy prediction and credit scoring. Expert Systems with Applications 41, 3825–3830 (https://doi.org/10.1016/j.eswa.2013.12.003).

Adeodato, P., Arnaud, A., Vasconcelos, G., Cunha, R., and Monteiro, D. (2011). MLP ensembles improve long term prediction accuracy over single networks. International Journal of Forecasting 27, 661–671 (https://doi.org/10.1016/j.ijforecast.2009.05.029).

Afina, S. G., Jeroen, G. L., Martin, H. P., Gouke, J. B., and Patrick, B. (2003). The diagnostic odds ratio: a single indicator of test performance. Journal of Clinical Epidemiology 56, 1129–1135 (https://doi.org/10.1016/S0895-4356(03)00177-X).


Alweshah, M. (2014). Firefly algorithm with artificial neural network for time series problems. Research Journal of Applied Sciences, Engineering and Technology 7(19), 3978–3982 (https://doi.org/10.19026/rjaset.7.757).

Barrow, D. K., and Crone, S. F. (2016). Cross-validation aggregation for combining autoregressive neural network forecasts. International Journal of Forecasting 32, 1120–1137 (https://doi.org/10.1016/j.ijforecast.2015.12.011).

Bensic, M., Sarlija, N., and Susac, M. Z. (2005). Modeling small business credit scoring by using logistic regression, neural networks and decision trees. Intelligent Systems in Accounting, Finance and Management 13, 133–150 (https://doi.org/10.1002/isaf.261).

Butaru, F., Chen, Q., Clark, B., Das, S., Lo, A. W., and Siddique, A. (2016). Risk and risk management in the credit card industry. Journal of Banking & Finance 72, 218–239 (https://doi.org/10.1016/j.jbankfin.2016.07.015).

Cao, F., Ye, H., and Wang, D. (2015). A probabilistic learning algorithm for robust modeling using neural networks with random weights. Information Sciences 313, 62–78 (https://doi.org/10.1016/j.ins.2015.03.039).

Carbonero-Ruz, M., Martínez-Estudillo, F. J., Fernández-Navarro, F., Becerra-Alonso, D., and Martínez-Estudillo, A. C. (2017). A two dimensional accuracy-based measure for classification performance. Information Sciences: An International Journal 382(C), 60–80 (https://doi.org/10.1016/j.ins.2016.12.005).

Chaudhuri, A., and De, K. (2011). Fuzzy support vector machine for bankruptcy prediction. Applied Soft Computing 11, 2472–2486 (https://doi.org/10.1016/j.asoc.2010.10.003).

Chong, E., Han, C., and Park, F. (2017). Deep learning networks for stock market analysis and prediction: methodology, data representations, and case studies. Expert Systems with Applications 83, 187–205 (https://doi.org/10.1016/j.eswa.2017.04.030).

Dabrowski, J., Beyers, C., and Villiers, J. (2016). Systemic banking crisis early warning systems using dynamic Bayesian networks. Expert Systems with Applications 62, 225–242 (https://doi.org/10.1016/j.eswa.2016.06.024).

Deng, S. W., and Han, J. Q. (2016). Towards heart sound classification without segmentation via autocorrelation feature and diffusion maps. Future Generation Computer Systems 60, 13–21 (https://doi.org/10.1016/j.future.2016.01.010).

Desai, V. S., Crook, J. N., and Overstreet, G. A., Jr. (1996). A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research 95(1), 24–37 (https://doi.org/10.1016/0377-2217(95)00246-4).

Ding, Y., Song, X., and Zen, Y. (2008). Forecasting financial condition of Chinese listed companies based on support vector machine. Expert Systems with Applications 34(4), 3081–3089 (https://doi.org/10.1016/j.eswa.2007.06.037).

Donate, J. P., Cortez, P., Sánchez, G. G., and de Miguel, A. S. (2013). Time series forecasting using a weighted cross-validation evolutionary artificial neural network ensemble. Neurocomputing 109, 27–32 (https://doi.org/10.1016/j.neucom.2012.02.053).

Eskelinen, J. (2017). Comparison of variable selection techniques for data envelopment analysis in a retail bank. European Journal of Operational Research 259(2), 778–788 (https://doi.org/10.1016/j.ejor.2016.11.009).

Fan, A., and Palaniswami, M. (2000). Selecting bankruptcy predictors using a support vector machine approach. In Proceedings of the IEEE–INNS–ENNS International Joint Conference on Neural Networks, Volume 6, pp. 354–359. Institute of Electrical and Electronics Engineers, Piscataway, NJ.


Ferri, C., Hernández-Orallo, J., and Modroiu, R. (2009). An experimental comparison of performance measures for classification. Pattern Recognition Letters 30(1), 27–38 (https://doi.org/10.1016/j.patrec.2008.08.010).

Gajewski, J., and Valis, D. (2017). The determination of combustion engine condition and reliability using oil analysis by MLP and RBF neural networks. Tribology International 115, 557–572 (https://doi.org/10.1016/j.triboint.2017.06.032).

Hájek, P. (2011). Municipal credit rating modelling by neural networks. Decision Support Systems 51(1), 108–118 (https://doi.org/10.1016/j.dss.2010.11.033).

Huang, C. L., Chen, M. C., and Wang, C. J. (2007). Credit scoring with a data mining approach based on support vector machines. Expert Systems with Applications 33(4), 847–856 (https://doi.org/10.1016/j.eswa.2006.07.007).

Huang, Z., Chen, H., Hsu, C. J., Chen, W. H., and Wu, S. (2004). Credit rating analysis with support vector machines and neural networks: a market comparative study. Decision Support Systems 37, 543–558 (https://doi.org/10.1016/S0167-9236(03)00086-1).

Hui, X.-F., and Sun, J. (2006). An application of support vector machine to companies' financial distress prediction. In Modeling Decisions for Artificial Intelligence: MDAI 2006, Torra, V., Narukawa, Y., Valls, A., and Domingo-Ferrer, J. (eds), pp. 274–282. Lecture Notes in Artificial Intelligence, Volume 3885. Springer (https://doi.org/10.1007/11681960_27).

Jones, S., Johnstone, D., and Wilson, R. (2015). An empirical evaluation of the performance of binary classifiers in the prediction of credit ratings changes. Journal of Banking & Finance 56(C), 72–85 (https://doi.org/10.1016/j.jbankfin.2015.02.006).

Kaggle (2017). Kaggle, Inc. URL: www.kaggle.com/c/GiveMeSomeCredit.

Kao, L. J., Chiu, C.-C., and Chiu, F. Y. (2012). A Bayesian latent variable model with classification and regression tree approach for behavior and credit scoring. Knowledge-Based Systems 36, 245–252 (https://doi.org/10.1016/j.knosys.2012.07.004).

Kim, K. J., and Ahn, H. (2012). A corporate credit rating model using multi-class support vector machines with an ordinal pairwise partitioning approach. Computers & Operations Research 39(8), 1800–1811 (https://doi.org/10.1016/j.cor.2011.06.023).

Kozeny, V. (2015). Genetic algorithms for credit scoring: alternative fitness function performance comparison. Expert Systems with Applications 42, 2998–3004 (https://doi.org/10.1016/j.eswa.2014.11.028).

Lee, S., and Choi, W. S. (2013). A multi-industry bankruptcy prediction model using back-propagation neural network and multivariate discriminant analysis. Expert Systems with Applications 40(8), 2941–2946 (https://doi.org/10.1016/j.eswa.2012.12.009).

Lichman, M. (2013). UCI machine learning repository. URL: http://archive.ics.uci.edu/ml/.

Liu, Y., and Liao, S. (2017). Granularity selection for cross-validation of SVM. Information Sciences 378, 475–483 (https://doi.org/10.1016/j.ins.2016.06.051).

Liu, P., Choo, R., Wang, L., and Huang, F. (2016). SVM or deep learning? A comparative study on remote sensing image classification. Soft Computing 21(23), 7053–7065 (https://doi.org/10.1007/s00500-016-2247-2).

Lu, C. J., Lee, T. S., and Chiu, C. C. (2009). Financial time series forecasting using independent component analysis and support vector regression. Decision Support Systems 47, 115–125 (https://doi.org/10.1016/j.dss.2009.02.001).

Min, J. H., and Lee, Y. C. (2005). Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Systems with Applications 28, 603–614 (https://doi.org/10.1016/j.eswa.2004.12.008).


Moges, H. T., Dejaeger, K., Lemahieu, W., and Baesens, B. (2013). A multidimensional analysis of data quality for credit risk management: new insights and challenges. Information & Management 50, 43–58 (https://doi.org/10.1016/j.im.2012.10.001).

Murty, M. N., and Raghava, R. (2016). Support Vector Machines and Perceptrons. Springer.

PAKDD (2010). Pacific-Asia Knowledge Discovery and Data Mining conference (PAKDD 2010): data mining competition.

Pan, W. T. (2008). Use of probabilistic neural network to construct early warning model for business financial distress. Journal of Statistics and Management Systems 11(4), 749–760 (https://doi.org/10.1080/09720510.2008.10701340).

Pradeepkumar, D., and Ravi, V. (2017). Forecasting financial time series volatility using particle swarm optimization trained quantile regression neural network. Applied Soft Computing 58, 35–52 (https://doi.org/10.1016/j.asoc.2017.04.014).

Rajan, R., and Ramcharan, R. (2016). Local financial capacity and asset values: evidence from bank failures. Journal of Financial Economics 120, 229–251 (https://doi.org/10.1016/j.jfineco.2015.01.006).

Rodríguez, L. C., Castaño, E. P., and Samblás, C. R. (2016). Quality performance metrics in multivariate classification methods for qualitative analysis. TrAC: Trends in Analytical Chemistry 80, 612–624 (https://doi.org/10.1016/j.trac.2016.04.021).

Sartori, F., Mazzucchelli, A., and Gregorio, A. (2016). Bankruptcy forecasting using case-based reasoning: the CRePERIE approach. Expert Systems with Applications 64, 400–411 (https://doi.org/10.1016/j.eswa.2016.07.033).

Sever, A. (2013). A neural network algorithm to pattern recognition in inverse problems. Applied Mathematics and Computation 221, 484–490 (https://doi.org/10.1016/j.amc.2013.06.094).

Shin, K. S., Lee, T. S., and Kim, H. J. (2005). An application of support vector machines in bankruptcy prediction model. Expert Systems with Applications 28(1), 127–135 (https://doi.org/10.1016/j.eswa.2004.08.009).

Sokolova, M., and Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing and Management 45, 427–437 (https://doi.org/10.1016/j.ipm.2009.03.002).

Son, Y., Byun, H., and Lee, J. (2016). Nonparametric machine learning models for predicting the credit default swaps: an empirical study. Expert Systems with Applications 58, 210–220 (https://doi.org/10.1016/j.eswa.2016.03.049).

Specht, D. (1990). Probabilistic neural networks. Neural Networks 3, 109–118 (https://doi.org/10.1016/0893-6080(90)90049-Q).

Thomas, L. C., Edelman, D. B., and Crook, J. N. (2002). Credit Scoring and Its Applications. SIAM Monographs on Mathematical Modeling and Computation. SIAM, Philadelphia, PA (https://doi.org/10.1137/1.9780898718317).

Tong, E., Mues, C., and Thomas, L. C. (2012). Mixture cure models in credit scoring: if and when borrowers default. European Journal of Operational Research 218, 132–139 (https://doi.org/10.1016/j.ejor.2011.10.007).

Tseng, F. M., and Hu, Y. C. (2010). Comparing four bankruptcy prediction models: logit, quadratic interval logit, neural and fuzzy neural networks. Expert Systems with Applications 37, 1846–1853 (https://doi.org/10.1016/j.eswa.2009.07.081).

Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. Springer (https://doi.org/10.1007/978-1-4757-2440-0).


Vega, M., Oliver, A., Mejías, R., and Rubio, J. (2013). Improving the management of microfinance institutions by using credit scoring models based on statistical learning techniques. Expert Systems with Applications 40, 6910–6917 (https://doi.org/10.1016/j.eswa.2013.06.031).

Wauters, M., and Vanhoucke, M. (2017). A nearest neighbour extension to project duration forecasting with artificial intelligence. European Journal of Operational Research 259, 1097–1111 (https://doi.org/10.1016/j.ejor.2016.11.018).

West, D. (2000). Neural network credit scoring models. Computers & Operations Research 27, 1131–1152 (https://doi.org/10.1016/S0305-0548(99)00149-5).

Wu, D. D., Liang, L., and Yang, Z. (2008). Analyzing the financial distress of Chinese public companies using probabilistic neural networks and multivariate discriminate analysis. Socio-Economic Planning Sciences 42, 206–220 (https://doi.org/10.1016/j.seps.2006.11.002).

Xia, Y., Liu, C., Li, Y. Y., and Liu, N. (2017). A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring. Expert Systems with Applications 78, 225–241 (https://doi.org/10.1016/j.eswa.2017.02.017).

Xiao, H., Xiao, Z., and Wang, Y. (2016). Ensemble classification based on supervised clustering for credit scoring. Applied Soft Computing 43, 73–86 (https://doi.org/10.1016/j.asoc.2016.02.022).

Xie, C., Luo, C., and Yu, X. (2011). Financial distress prediction based on SVM and MDA methods: the case of Chinese listed companies. Quality & Quantity 45, 671–686 (https://doi.org/10.1007/s11135-010-9376-y).

Yang, Z. R., Platt, M. B., and Platt, H. D. (1999). Probabilistic neural networks in bankruptcy prediction. Journal of Business Research 44, 67–74 (https://doi.org/10.1016/S0148-2963(97)00242-7).

Yang, Z. R., Wu, D., Fu, G., and Luo, C. (2008). Credit risk evaluation using neural networks. In New Frontiers in Enterprise Risk Management, Olson, D. L., and Wu, D. (eds), pp. 163–180. Springer (https://doi.org/10.1007/978-3-540-78642-9_11).

Youden, W. (1950). Index for rating diagnostic tests. Cancer 3, 32–35 (https://doi.org/10.1002/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3).

Zhang, Y., and Yang, Y. (2015). Cross-validation for selecting a model selection procedure. Journal of Econometrics 187, 95–112 (https://doi.org/10.1016/j.jeconom.2015.02.006).


Journal of Credit Risk 14(2), 29–43
DOI: 10.21314/JCR.2017.235

Research Paper

Modeling dependent risk factors with CreditRisk+

Xiaohang Zhang,1,2 SuBang Choe,3 Ji Zhu2 and Jill Bewick3

1 School of Economics and Management, Beijing University of Posts and Telecommunications, 10 Xitucheng Road, Beijing 100876, People's Republic of China; email: [email protected]
2 Department of Statistics, University of Michigan, 455 West Hall, 1085 South University Avenue, Ann Arbor, MI 48019, USA; email: [email protected]
3 Ford Motor Credit Company, 1 American Road, Dearborn, MI 48126, USA; emails: [email protected], [email protected]

(Received December 3, 2016; revised September 21, 2017; accepted February 5, 2018)

ABSTRACT

The CreditRisk+ model has been widely used for calculating the loss distribution of a credit portfolio. However, its basic assumption of independent risk factors is not consistent with reality. Although the dependent structure can be mimicked by setting factor weights, a reasonable way to introduce correlated risk factors is needed. In this paper, an extension of the CreditRisk+ model, called the mixed vector model, is proposed. This model incorporates some common background factors with positive and negative correlations, so it can accommodate the complicated dependence structure of risk factors. The mixed vector model can rebuild the negative correlations better than other extended CreditRisk+ models. Moreover, it can be translated into the original CreditRisk+ framework with conditionally independent risk factors, so the numerical algorithm for calculating the loss distribution for the CreditRisk+ model can be reused with little modification.

Keywords: credit portfolio risk; CreditRisk+ model; dependent structure; risk factors; mixed vector model.

Corresponding author: X. Zhang. Print ISSN 1744-6619 | Online ISSN 1755-9723. © 2018 Infopro Digital Risk (IP) Limited

1 INTRODUCTION

One of the major tasks in modeling credit portfolio risk is capturing the underlying structure of interdependency between the obligors' default events. CreditRisk+ (Credit Suisse Financial Products 1997) has been widely employed as a portfolio model in business units such as treasury, credit treasury and portfolio management. CreditRisk+ assumes that the probability of default is affected by some systematic risk factors, which can be industries, geographical regions, etc, and that the systematic risk factors $S = (S_1, S_2, \ldots, S_K)$ are independent gamma-distributed random variables with

$$E[S_k] = 1, \qquad \operatorname{var}[S_k] = \sigma_k^2.$$

Conditional on the risk factors, the probability of default of obligor $A \in \mathcal{A}$ is given by

$$p_A^S = p_A \Bigl( w_{A,0} + \sum_{k=1}^{K} w_{A,k} S_k \Bigr), \tag{1.1}$$

where $p_A$ is the unconditional (expected) probability of default, and $w_{A,k}$ are the factor weights satisfying

$$\sum_{k=0}^{K} w_{A,k} = 1, \qquad 0 \le w_{A,k} \le 1, \qquad k = 0, \ldots, K.$$
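The conditional default probability in (1.1) is a weighted sum that is easy to evaluate once a realization of the risk factors is available. The sketch below is only an illustration under stated assumptions: the function name `conditional_pd`, its argument names and the example numbers are hypothetical and do not come from the paper.

```python
def conditional_pd(p_a: float, w0: float, weights: list, factors: list) -> float:
    """Evaluate equation (1.1): p_A * (w_{A,0} + sum_k w_{A,k} * S_k).

    p_a     -- unconditional default probability of obligor A
    w0      -- weight w_{A,0} on the idiosyncratic (constant) component
    weights -- factor weights w_{A,1..K}
    factors -- one realization of the risk factors S_1..S_K
    """
    if len(weights) != len(factors):
        raise ValueError("weights and factors must have the same length K")
    all_weights = [w0, *weights]
    if any(w < 0 or w > 1 for w in all_weights):
        raise ValueError("each weight must lie in [0, 1]")
    if abs(sum(all_weights) - 1.0) > 1e-9:
        raise ValueError("weights w_{A,0..K} must sum to 1")
    return p_a * (w0 + sum(w * s for w, s in zip(weights, factors)))


# Hypothetical obligor: 2% unconditional PD, two risk factors.
pd_stressed = conditional_pd(0.02, 0.2, [0.5, 0.3], [1.2, 0.8])
```

Because each $S_k$ has mean 1 and the weights sum to 1, setting every factor to 1 returns the unconditional probability $p_A$, which is a convenient sanity check on any implementation.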

In the CreditRisk+ model, the default event $D_A$ is assumed to follow a Poisson distribution with conditional intensity $p_A^S$ and, conditional on risk factors, any two default events are independent. Therefore, the correlation of default events can be obtained by

$$\operatorname{corr}[D_A, D_B] = \frac{E[\operatorname{cov}[D_A, D_B \mid S]] + \operatorname{cov}[E[D_A \mid S], E[D_B \mid S]]}{\sqrt{\operatorname{var}[D_A] \operatorname{var}[D_B]}} = \sqrt{\frac{p_A p_B}{(1 - p_A)(1 - p_B)}} \sum_{k,k'=1}^{K} w_{A,k} w_{B,k'} \operatorname{cov}[S_k, S_{k'}]. \tag{1.2}$$
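Equation (1.2) can be evaluated directly from the default probabilities, the factor weights and a covariance matrix of the risk factors. The following sketch is a minimal illustration, not the authors' implementation; the function name `default_correlation` and the numerical inputs are hypothetical.

```python
from math import sqrt


def default_correlation(p_a, p_b, w_a, w_b, cov_s):
    """Evaluate equation (1.2) for two obligors A and B.

    w_a, w_b -- factor weights w_{A,1..K} and w_{B,1..K} (w_{.,0} excluded)
    cov_s    -- K x K covariance matrix of the risk factors, as nested lists
    """
    scale = sqrt(p_a * p_b / ((1 - p_a) * (1 - p_b)))
    total = sum(
        w_a[k] * w_b[l] * cov_s[k][l]
        for k in range(len(w_a))
        for l in range(len(w_b))
    )
    return scale * total


# Independent factors (diagonal covariance): correlation only arises
# through factors that both obligors load on.
rho_indep = default_correlation(
    0.01, 0.01, [0.5, 0.5], [1.0, 0.0], [[0.25, 0.0], [0.0, 0.16]]
)

# Hypothetical negatively correlated factors: obligors loading on
# different factors can then have negatively correlated defaults.
rho_neg = default_correlation(
    0.01, 0.01, [1.0, 0.0], [0.0, 1.0], [[0.25, -0.1], [-0.1, 0.16]]
)
```

With a diagonal covariance matrix the double sum collapses to the shared factors only, which is why the original model can express dependence between defaults solely through the factor weights.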

Because of the assumption of independent risk factors in the original CreditRisk+ model, the only way to introduce dependence between obligors' defaults is by setting the corresponding factor weights so that the obligors share some common risk factors. Although assigning more than one risk factor to an obligor is often necessary, this method should not be misused in order to mimic the underlying dependence structure that often arises in the dependence between risk factors (Giese 2004).

To introduce the interdependent risk factors, some extensions to the original CreditRisk+ model have been proposed. Burgisser et al (1999) examined the correlations between industries and derived a formula for the unexpected loss and risk contributions. The compound gamma model (Giese 2004) introduced dependence by incorporating one or more common background variables into the shape parameters of the risk factor distributions. The hidden gamma model (Giese 2004), the common factor model (Han and Kang 2008) and the common vector model (Fischer and Dietz 2011) introduced dependence by directly adding one or more common background factors to the independent risk factors. Wang et al (2015) modeled the dependence using a class of extreme copulas. For all of the extended methods mentioned above, the loss distribution can be calculated using a numerical algorithm designed for the original CreditRisk+ model.

We have been inspired by previous research to propose a mixed vector model within the CreditRisk+ framework, which can accommodate a wide range of positive or negative dependent structures. The existing numerical algorithm for the loss distribution calculation can be reused with little modification.

2 EXTENDED CREDITRISK+ MODELS WITH DEPENDENT STRUCTURES

The compound gamma model (Giese 2004) introduces several background factors into the shape parameters of the risk factor distributions. The risk factors are defined as

$$S_k \sim \mathrm{Gamma}\Bigl( \sum_{i=1}^{M} \alpha_{k,i} T_i,\ \beta_k \Bigr)$$

(where $T_i \sim \mathrm{Gamma}(\sigma_i^{-1}, \sigma_i)$, $i = 1, \ldots, M$, are independent) and are independent conditional on $T_1, \ldots, T_M$. $\mathrm{Gamma}(\sigma_i^{-1}, \sigma_i)$ denotes the gamma distribution with shape parameter $\sigma_i^{-1}$ and scale parameter $\sigma_i$. Here, $\alpha_{k,i} > 0$ for $k = 1, \ldots, K$ and $i = 1, \ldots, M$. Because of the background factors, $S_k$, $k = 1, \ldots, K$, are dependent.

The hidden gamma model (Giese 2004) adds a common factor that affects all risk factors and therefore introduces the covariance structure of the risk factors. The dependence of the risk factors can then be interpreted from an economic point of view: an underlying macroeconomic factor is involved in all risk factors and results in correlation. The risk factors in the hidden gamma model are defined as

$$S_k = \alpha_k (Y_k + \hat{Y}),$$

where $Y_k \sim \mathrm{Gamma}(\theta_k, 1)$ and $\hat{Y} \sim \mathrm{Gamma}(\hat{\theta}, 1)$. The specific risk factors $Y_k$, $k = 1, \ldots, K$, and the common factor $\hat{Y}$ are independent. Here, $\alpha_k > 0$ for $k = 1, \ldots, K$.
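To make the induced covariance structure of the hidden gamma model concrete, the short derivation below may help. It is a sketch that uses only the definition above and the fact that a $\mathrm{Gamma}(\theta, 1)$ variable has variance $\theta$; the symbols $\theta_k$ and $\hat{\theta}$ denote the shape parameters of $Y_k$ and $\hat{Y}$. For $k \ne l$, only the common factor contributes to the covariance:

$$\operatorname{cov}[S_k, S_l] = \alpha_k \alpha_l \operatorname{var}[\hat{Y}] = \alpha_k \alpha_l \hat{\theta}, \qquad \operatorname{var}[S_k] = \alpha_k^2 (\theta_k + \hat{\theta}),$$

$$\operatorname{corr}[S_k, S_l] = \frac{\hat{\theta}}{\sqrt{(\theta_k + \hat{\theta})(\theta_l + \hat{\theta})}} \ge 0.$$

Since $\alpha_k > 0$, every pairwise correlation in this model is nonnegative, and the whole correlation matrix is driven by the single parameter $\hat{\theta}$ together with the specific shapes $\theta_k$.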

The common factor model (Han and Kang 2008) generalizes the restricted covariance structure of the hidden gamma model by relaxing the condition that the coefficient of $Y_k$ must be the same as that of $\hat{Y}$:

$$S_k = \alpha_k Y_k + \beta_k \hat{Y}$$

with $Y_k \sim \mathrm{Gamma}(\theta_k, 1)$, $\hat{Y} \sim \mathrm{Gamma}(\hat{\theta}, 1)$, $\alpha_k > 0$ for $k = 1, \ldots, K$, and $\sum_{k=1}^{K} w_{A,k} \beta_k > 0$ for $A \in \mathcal{A}$. In this generalized model, $S_k$ is no longer gamma distributed as long as $\alpha_k \ne \beta_k$.

The common vector model (Fischer and Dietz 2011) further generalizes the common factor model as follows:

$$S_k = \alpha_k Y_k + \sum_{i=1}^{M} \beta_{k,i} \hat{Y}_i$$

with $Y_k \sim \mathrm{Gamma}(\theta_k, 1)$ and $\hat{Y}_i \sim \mathrm{Gamma}(\hat{\theta}_i, 1)$, where $Y_k$ and $\hat{Y}_i$ are independent. Here, $\alpha_k > 0$ for $k = 1, \ldots, K$, and $\sum_{k=1}^{K} w_{A,k} \beta_{k,i} > 0$ for $A \in \mathcal{A}$ and $i = 1, \ldots, M$.

We may find that it is difficult to introduce a negative dependence covariance structure into the compound gamma model, the hidden gamma model, the common factor model and the common vector model. In reality, however, negatively correlated risk factors can appear. For example, if two industries or two geographical areas are in a competitive situation, they have opposing effects on their obligors' defaults. The $C_{A,B}$-copula-based model (Wang et al 2015) addresses this problem by introducing a more complicated dependent structure of the risk factors:

$$S_k = I_{A_k^{+}} S_k^{+} + I_{A_k^{\perp}} S_k^{\perp} + I_{A_k^{-}} S_k^{-},$$

with $S_k^{+} = F_k^{-1}(U)$, $S_k^{-} = F_k^{-1}(1 - U)$ and $S_k^{\perp} = F_k^{-1}(Y_k)$. Within the CreditRisk+ framework, $F_k^{-1}(\cdot)$ is the inverse of the gamma distribution function. The common latent variable $U$ is a uniformly distributed random factor on $[0, 1]$, and $Y_k$, $k = 1, \ldots, K$, are independent and identically distributed $\mathrm{Unif}[0, 1]$ random variables. For each $k = 1, \ldots, K$, $\{A_k^{+}, A_k^{\perp}, A_k^{-}\}$ is a partition of the probability space and $P[A_k^{+}] = a_k^{+}$, $P[A_k^{\perp}] = a_k^{\perp}$, $P[A_k^{-}] = a_k^{-}$. The random event vectors $(A_k^{+}, A_k^{\perp}, A_k^{-})$, $k = 1, \ldots, K$, are mutually independent, and are independent of the common latent variable $U$ and the random variables $Y_k$. Therefore, the values of $a_k^{\perp}$, $a_k^{+}$ and $a_k^{-}$ determine the independence, and the positive and negative relationships of the risk factors, respectively.
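The mixing mechanism of the copula-based model can be sketched at the level of the uniform arguments that are fed into $F_k^{-1}$. The code below is a hedged illustration rather than the construction of Wang et al (2015) as implemented by them: the function name `sample_copula_uniforms` and the parameter values are hypothetical, and the final quantile transform is deliberately left out because the Python standard library has no gamma quantile function.

```python
import random


def sample_copula_uniforms(a_plus, a_perp, a_minus, rng):
    """Draw one vector (V_1..V_K) such that S_k = F_k^{-1}(V_k).

    For each k, exactly one of three events occurs, with probabilities
    a_plus[k], a_perp[k], a_minus[k] (assumed to sum to 1):
      A_k^+    -> V_k = U        (comonotone with the common factor)
      A_k^perp -> V_k = Y_k      (independent uniform)
      A_k^-    -> V_k = 1 - U    (countermonotone with the common factor)
    """
    u = rng.random()  # common latent variable U ~ Unif[0, 1]
    values = []
    for ap, a0, am in zip(a_plus, a_perp, a_minus):
        if abs(ap + a0 + am - 1.0) > 1e-9:
            raise ValueError("event probabilities must sum to 1 for each k")
        r = rng.random()  # selects the event, independently for each k
        if r < ap:
            values.append(u)
        elif r < ap + a0:
            values.append(rng.random())  # specific uniform Y_k
        else:
            values.append(1.0 - u)
    return values


rng = random.Random(42)
# Hypothetical extreme case: factor 1 always follows U, factor 2 always 1 - U.
v = sample_copula_uniforms([1.0, 0.0], [0.0, 0.0], [0.0, 1.0], rng)
```

To obtain the risk factors themselves, each $V_k$ would be passed through a gamma quantile function, for example `scipy.stats.gamma.ppf` if SciPy is available. Because that transform is monotone, the sign of the dependence between two factors is already visible in the uniforms.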

3 THE MIXED VECTOR MODEL

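Inspired by the models mentioned in Section 2, we propose the following mixed vector model:

$$S_k = a_k^{\perp} Y_k^{\perp} + \sum_{i=1}^{M} a_{k,i}^{+} U Y_i^{+} + \sum_{i=1}^{M} a_{k,i}^{-} (1 - U) Y_i^{-} \tag{3.1}$$

with

$$\begin{gathered} Y_k^{\perp} \sim \mathrm{Gamma}(\theta_k^{\perp}, 1), \quad k = 1, \ldots, K, \\ Y_i^{+} \sim \mathrm{Gamma}(\theta_i^{+}, 1), \quad Y_i^{-} \sim \mathrm{Gamma}(\theta_i^{-}, 1), \quad i = 1, \ldots, M, \\ U \sim \mathrm{Unif}[0, 1], \\ a_k^{\perp},\ a_{k,i}^{+},\ a_{k,i}^{-} \ge 0. \end{gathered}$$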
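A direct way to get a feel for the mixed vector model (3.1) is to simulate it. The sketch below uses only the Python standard library (`random.gammavariate` takes a shape and a scale argument); the function name `sample_mixed_vector` and all parameter values are hypothetical and chosen purely for illustration.

```python
import random


def sample_mixed_vector(a_perp, a_plus, a_minus,
                        theta_perp, theta_plus, theta_minus, rng):
    """Draw one realization (S_1..S_K) from the mixed vector model (3.1).

    a_perp[k]     -- coefficient of the specific factor Y_k^perp
    a_plus[k][i]  -- coefficient of U * Y_i^+
    a_minus[k][i] -- coefficient of (1 - U) * Y_i^-
    theta_*       -- gamma shape parameters (scale fixed at 1)
    """
    u = rng.random()  # common uniform factor U
    y_plus = [rng.gammavariate(t, 1.0) for t in theta_plus]
    y_minus = [rng.gammavariate(t, 1.0) for t in theta_minus]
    factors = []
    for k in range(len(a_perp)):
        specific = a_perp[k] * rng.gammavariate(theta_perp[k], 1.0)
        positive = u * sum(a * y for a, y in zip(a_plus[k], y_plus))
        negative = (1.0 - u) * sum(a * y for a, y in zip(a_minus[k], y_minus))
        factors.append(specific + positive + negative)
    return factors


# Hypothetical setup: K = 2 risk factors, M = 1 pair of common factors.
# S_1 loads on U*Y^+ and S_2 on (1-U)*Y^-, so they should move in
# opposite directions on average.
params = dict(
    a_perp=[0.5, 0.5],
    a_plus=[[0.5], [0.0]],
    a_minus=[[0.0], [0.5]],
    theta_perp=[1.0, 1.0],
    theta_plus=[4.0],
    theta_minus=[4.0],
)
rng = random.Random(0)
draws = [sample_mixed_vector(rng=rng, **params) for _ in range(20000)]
mean_s1 = sum(d[0] for d in draws) / len(draws)
mean_s2 = sum(d[1] for d in draws) / len(draws)
sample_cov = sum((d[0] - mean_s1) * (d[1] - mean_s2) for d in draws) / len(draws)
```

In this configuration the sample covariance between the two simulated factors should come out negative, which is the behavior the model is designed to allow.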

Here, $Y_k^{\perp}$, $k = 1, \dots, K$, $Y_i^{+}$, $Y_i^{-}$, $i = 1, \dots, M$, and $U$ are independent. $Y_k^{\perp}$ are the specific influence factors, and $U Y_i^{+}$ and $(1 - U) Y_j^{-}$, $i, j = 1, \dots, M$, denote two common opposite influence vectors. $Y_k^{\perp}$ is independent of $U Y_i^{+}$ and $(1 - U) Y_j^{-}$, but $U Y_i^{+}$ and $(1 - U) Y_j^{-}$ are negatively correlated:

$$\operatorname{corr}[U Y_i^{+}, (1 - U) Y_j^{-}] = -\sqrt{\frac{\theta_i^{+}}{\theta_i^{+} + 4}\, \frac{\theta_j^{-}}{\theta_j^{-} + 4}}; \tag{3.2}$$

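$U Y_i^{+}$ and $U Y_j^{+}$ ($(1 - U) Y_i^{-}$ and $(1 - U) Y_j^{-}$) are positively correlated:

$$\operatorname{corr}[U Y_i^{+}, U Y_j^{+}] = \sqrt{\frac{\theta_i^{+}}{\theta_i^{+} + 4}\, \frac{\theta_j^{+}}{\theta_j^{+} + 4}}, \quad i \neq j,$$

$$\operatorname{corr}[(1 - U) Y_i^{-}, (1 - U) Y_j^{-}] = \sqrt{\frac{\theta_i^{-}}{\theta_i^{-} + 4}\, \frac{\theta_j^{-}}{\theta_j^{-} + 4}}, \quad i \neq j.$$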
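The closed-form correlation between the two opposite common components can be checked by simulation. The sketch below is only a sanity check under assumed shape parameters ($\theta^{+} = 2$, $\theta^{-} = 3$, chosen for illustration); the helper names are hypothetical.

```python
import math
import random


def corr_opposite(th_plus, th_minus):
    """Closed form for corr[U*Y+, (1-U)*Y-] with Gamma(theta, 1) components."""
    return -math.sqrt(th_plus / (th_plus + 4.0) * th_minus / (th_minus + 4.0))


def sample_corr(xs, ys):
    """Plain Pearson correlation of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
th_p, th_m = 2.0, 3.0  # illustrative shape parameters (assumption)
xs, ys = [], []
for _ in range(200_000):
    u = rng.random()
    xs.append(u * rng.gammavariate(th_p, 1.0))
    ys.append((1.0 - u) * rng.gammavariate(th_m, 1.0))

exact = corr_opposite(th_p, th_m)
empirical = sample_corr(xs, ys)
print(exact, empirical)
```

The same comparison could be repeated for the positively correlated pairs by dropping the minus sign and reusing the same $U$ for both components.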

If $a_{k,i}^{+} = a_{k,i}^{-} = 0$ and $a_k^{\perp} > 0$ for $k = 1, \dots, K$, $i = 1, \dots, M$, then the risk factors are independent and the model is exactly the original CreditRisk+ model. If $a_{k,i}^{-} = 0$ and $a_k^{\perp}, a_{k,i}^{+} > 0$, then the risk factors are positively dependent and the model is similar to the common factor model. If $a_k^{\perp} = a_{k'}^{\perp} = a_{k',i}^{-} = a_{k,i}^{+} = 0$ and $a_{k,i}^{-}, a_{k',i}^{+} > 0$, then the risk factors $S_k$ and $S_{k'}$ are negatively correlated. The advantage of our proposed model is that the parameters $a_k^{\perp}$, $a_{k,i}^{+}$ and $a_{k,i}^{-}$ can be adjusted to reflect the complicated dependence structure of risk factors.

3.1 Properties of the proposed model

The first and second moments of the risk factors can be derived as follows:

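$$E[S_k] = a_k^{\perp} \theta_k^{\perp} + \frac{1}{2} \sum_{i=1}^{M} (a_{k,i}^{+} \theta_i^{+} + a_{k,i}^{-} \theta_i^{-}),$$

$$
\begin{aligned}
\operatorname{var}[S_k] ={} & (a_k^{\perp})^2 \theta_k^{\perp}
+ \sum_{i=1}^{M} \Big\{ (a_{k,i}^{+})^2 \big( \tfrac{1}{12} (\theta_i^{+})^2 + \tfrac{1}{3} \theta_i^{+} \big) + (a_{k,i}^{-})^2 \big( \tfrac{1}{12} (\theta_i^{-})^2 + \tfrac{1}{3} \theta_i^{-} \big) \Big\} \\
& + \frac{1}{12} \sum_{\substack{i,j=1 \\ i \neq j}}^{M} (a_{k,i}^{+} a_{k,j}^{+} \theta_i^{+} \theta_j^{+} + a_{k,i}^{-} a_{k,j}^{-} \theta_i^{-} \theta_j^{-})
- \frac{1}{6} \sum_{i,j=1}^{M} a_{k,i}^{+} a_{k,j}^{-} \theta_i^{+} \theta_j^{-},
\end{aligned}
$$

$$
\begin{aligned}
\operatorname{cov}[S_k, S_{k'}] ={} & \sum_{i=1}^{M} \Big\{ a_{k,i}^{+} a_{k',i}^{+} \big( \tfrac{1}{12} (\theta_i^{+})^2 + \tfrac{1}{3} \theta_i^{+} \big) + a_{k,i}^{-} a_{k',i}^{-} \big( \tfrac{1}{12} (\theta_i^{-})^2 + \tfrac{1}{3} \theta_i^{-} \big) \Big\} \\
& + \frac{1}{12} \sum_{\substack{i,j=1 \\ i \neq j}}^{M} (a_{k,i}^{+} a_{k',j}^{+} \theta_i^{+} \theta_j^{+} + a_{k,i}^{-} a_{k',j}^{-} \theta_i^{-} \theta_j^{-}) \\
& - \frac{1}{12} \sum_{i,j=1}^{M} (a_{k,i}^{+} a_{k',j}^{-} \theta_i^{+} \theta_j^{-} + a_{k,i}^{-} a_{k',j}^{+} \theta_i^{-} \theta_j^{+}), \quad k \neq k'.
\end{aligned}
$$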

The variance of risk factors has four parts: the variance of specific factors $Y_k^{\perp}$, the variance of $U Y_i^{+}$ (and $(1 - U) Y_i^{-}$), the positive covariance between $U Y_i^{+}$ and $U Y_j^{+}$ (and between $(1 - U) Y_i^{-}$ and $(1 - U) Y_j^{-}$) and the negative covariance between $U Y_i^{+}$ and $(1 - U) Y_j^{-}$. The covariance of risk factors has a similar structure, except for the specific factors. Because the negative part of the risk factors’ covariance comes only from the covariance between $U Y_i^{+}$ and $(1 - U) Y_j^{-}$ (they also contribute to the positive part of covariance), the covariance has a lower bound.
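The moment formulas can be evaluated compactly by stacking the $2M$ common components $(U Y_1^{+}, \dots, U Y_M^{+}, (1 - U) Y_1^{-}, \dots, (1 - U) Y_M^{-})$, building their covariance matrix and applying the loadings. The sketch below follows that route; the function names, list-based parameter layout and example values are assumptions for illustration only.

```python
def factor_mean(a_perp, a_plus, a_minus, th_perp, th_plus, th_minus):
    """E[S_k] for each sector k (uses E[U] = 1/2)."""
    M = len(th_plus)
    return [a_perp[k] * th_perp[k]
            + 0.5 * sum(a_plus[k][i] * th_plus[i] + a_minus[k][i] * th_minus[i]
                        for i in range(M))
            for k in range(len(a_perp))]


def factor_covariance(a_perp, a_plus, a_minus, th_perp, th_plus, th_minus):
    """Covariance matrix of (S_1, ..., S_K) under the mixed vector model."""
    K, M = len(a_perp), len(th_plus)
    theta = list(th_plus) + list(th_minus)   # shapes of the stacked components
    sign = [1.0] * M + [-1.0] * M            # +1: U*Y+ block, -1: (1-U)*Y- block
    cz = [[0.0] * (2 * M) for _ in range(2 * M)]
    for i in range(2 * M):
        for j in range(2 * M):
            if i == j:
                cz[i][j] = theta[i] ** 2 / 12.0 + theta[i] / 3.0
            else:
                # +theta_i*theta_j/12 within a block, -theta_i*theta_j/12 across blocks
                cz[i][j] = sign[i] * sign[j] * theta[i] * theta[j] / 12.0
    load = [list(a_plus[k]) + list(a_minus[k]) for k in range(K)]
    cov = [[0.0] * K for _ in range(K)]
    for k in range(K):
        for l in range(K):
            cov[k][l] = sum(load[k][i] * cz[i][j] * load[l][j]
                            for i in range(2 * M) for j in range(2 * M))
        cov[k][k] += a_perp[k] ** 2 * th_perp[k]  # specific factor, variance only
    return cov


# Hypothetical parameters: K = 2 sectors, M = 1 common pair.
args = ([0.5, 0.5], [[1.0], [0.0]], [[0.0], [1.0]], [1.0, 1.0], [1.0], [1.0])
means = factor_mean(*args)
cov = factor_covariance(*args)
print(means, cov)
```

For this example both means equal 1, each variance is $0.25 + \tfrac{1}{12} + \tfrac{1}{3}$, and the covariance between the two sectors is $-\tfrac{1}{12}$.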

Proposition 3.1 For the mixed vector model, the lower bound of $\operatorname{cov}[S_k, S_{k'}]$ is $-\tfrac{1}{3}$.


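Proof of Proposition 3.1 The covariance $\operatorname{cov}[S_k, S_{k'}]$ can be rewritten as

$$
\begin{aligned}
\operatorname{cov}[S_k, S_{k'}] ={} & \frac{1}{12} \sum_{i,j=1}^{M} (a_{k,i}^{+} \theta_i^{+} - a_{k,i}^{-} \theta_i^{-})(a_{k',j}^{+} \theta_j^{+} - a_{k',j}^{-} \theta_j^{-}) \\
& + \frac{1}{3} \sum_{i=1}^{M} (a_{k,i}^{+} a_{k',i}^{+} \theta_i^{+} + a_{k,i}^{-} a_{k',i}^{-} \theta_i^{-}).
\end{aligned}
$$

According to the definition of $S_k$, the parameters satisfy

$$a_{k,i}^{+},\ a_{k',i}^{+},\ a_{k,i}^{-},\ a_{k',i}^{-} \geq 0; \quad \theta_i^{+},\ \theta_i^{-} > 0; \quad i = 1, \dots, M.$$

Thus, when $a_{k,i}^{+} = a_{k',i}^{-} = 0$, $a_{k,i}^{-}, a_{k',i}^{+} > 0$ or $a_{k,i}^{-} = a_{k',i}^{+} = 0$, $a_{k,i}^{+}, a_{k',i}^{-} > 0$, $\operatorname{cov}[S_k, S_{k'}]$ can reach the minimum. Without loss of generality, we assume that $a_{k,i}^{+} = a_{k',i}^{-} = 0$, $a_{k,i}^{-}, a_{k',i}^{+} > 0$, $i = 1, \dots, M$, and obtain

$$\operatorname{cov}[S_k, S_{k'}] = -\frac{1}{12} \sum_{i,j=1}^{M} a_{k,i}^{-} \theta_i^{-} a_{k',j}^{+} \theta_j^{+}.$$

Because of the original CreditRisk+ model’s basic assumption that $E[S_k] = 1$, $k = 1, \dots, K$, we get

$$a_k^{\perp} \theta_k^{\perp} + \frac{1}{2} \sum_{i=1}^{M} a_{k,i}^{-} \theta_i^{-} = 1, \qquad a_{k'}^{\perp} \theta_{k'}^{\perp} + \frac{1}{2} \sum_{i=1}^{M} a_{k',i}^{+} \theta_i^{+} = 1$$

and

$$\max \Big( \sum_{i=1}^{M} a_{k,i}^{-} \theta_i^{-} \Big) = \max \Big( \sum_{i=1}^{M} a_{k',i}^{+} \theta_i^{+} \Big) = 2.$$

Thus, we can obtain

$$\max \Big( \sum_{i,j=1}^{M} a_{k,i}^{-} \theta_i^{-} a_{k',j}^{+} \theta_j^{+} \Big) = 4.$$

Therefore, the minimum of $\operatorname{cov}[S_k, S_{k'}]$ is $-\tfrac{1}{3}$. $\square$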

Proposition 3.1 illustrates that the approximation of the negative covariance structure is limited and is not related to the value of $M$.
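A small numerical illustration of why the bound does not depend on $M$: in the extreme configuration used in the proof, the two loading sums are each pushed to 2, so the double sum factorizes to 4 regardless of how the loadings are spread over the $M$ components. The sketch below assumes equal loadings and a common shape parameter of 1.5, both chosen arbitrarily; it corresponds to the limiting case in which the specific factors carry no weight.

```python
def extreme_covariance(m, theta=1.5):
    """cov[S_k, S_k'] in the extreme configuration of the proof of Proposition 3.1.

    Sector k loads only on the (1-U)*Y- components and sector k' only on the
    U*Y+ components, with sum_i a_i * theta_i = 2 for both (assumed equal split).
    """
    a_minus_k = [2.0 / (m * theta)] * m
    a_plus_k2 = [2.0 / (m * theta)] * m
    double_sum = sum(a_minus_k[i] * theta * a_plus_k2[j] * theta
                     for i in range(m) for j in range(m))
    return -double_sum / 12.0


values = [extreme_covariance(m) for m in (1, 2, 5)]
print(values)
```

Each entry of `values` equals $-\tfrac{4}{12} = -\tfrac{1}{3}$ up to rounding.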


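The default covariance between obligors $A$ and $B$ can be stated as follows:

$$
\begin{aligned}
\operatorname{cov}[D_A, D_B] ={} & \operatorname{cov}[p_A^S, p_B^S] \\
={} & p_A p_B \Big( \sum_{k=1}^{K} w_{A,k} w_{B,k} (a_k^{\perp})^2 \theta_k^{\perp} \\
& + \sum_{k,k'=1}^{K} w_{A,k} w_{B,k'} \sum_{i=1}^{M} \Big\{ a_{k,i}^{+} a_{k',i}^{+} \big( \tfrac{1}{12} (\theta_i^{+})^2 + \tfrac{1}{3} \theta_i^{+} \big) + a_{k,i}^{-} a_{k',i}^{-} \big( \tfrac{1}{12} (\theta_i^{-})^2 + \tfrac{1}{3} \theta_i^{-} \big) \Big\} \\
& + \frac{1}{12} \sum_{k,k'=1}^{K} w_{A,k} w_{B,k'} \sum_{\substack{i,j=1 \\ i \neq j}}^{M} (a_{k,i}^{+} a_{k',j}^{+} \theta_i^{+} \theta_j^{+} + a_{k,i}^{-} a_{k',j}^{-} \theta_i^{-} \theta_j^{-}) \\
& - \frac{1}{12} \sum_{k,k'=1}^{K} w_{A,k} w_{B,k'} \sum_{i,j=1}^{M} (a_{k,i}^{+} a_{k',j}^{-} \theta_i^{+} \theta_j^{-} + a_{k,i}^{-} a_{k',j}^{+} \theta_i^{-} \theta_j^{+}) \Big).
\end{aligned}
\tag{3.3}
$$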

The first term corresponds to the covariance of the standard CreditRisk+ model, while the second and third terms are the adjustments due to the positive correlation of risk factors. The fourth term is the adjustment for negative correlation.

3.2 Calibration of model parameters

Following the method for the common factor model (Han and Kang 2008), the parameters of the proposed model can be calibrated by minimizing the distance between the observed covariance matrix and the estimated covariance matrix. Let $\Sigma$ denote the observed covariance matrix. The distance can be defined as

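$$
f(a_k^{\perp}, a_{k,i}^{+}, a_{k,i}^{-}, \theta_k^{\perp}, \theta_i^{+}, \theta_i^{-})
= \sum_{k=1}^{K} (\operatorname{var}[S_k] - \Sigma_{k,k})^2 + \sum_{k=1}^{K} \sum_{k'=1}^{k-1} (\operatorname{cov}[S_k, S_{k'}] - \Sigma_{k,k'})^2. \tag{3.4}
$$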

The minimization problem can be stated as

$$\min_{a_k^{\perp},\, a_{k,i}^{+},\, a_{k,i}^{-},\, \theta_k^{\perp},\, \theta_i^{+},\, \theta_i^{-}} f(a_k^{\perp}, a_{k,i}^{+}, a_{k,i}^{-}, \theta_k^{\perp}, \theta_i^{+}, \theta_i^{-})$$

subject to

$$
\begin{aligned}
& a_k^{\perp} \theta_k^{\perp} + \frac{1}{2} \sum_{i=1}^{M} (a_{k,i}^{+} \theta_i^{+} + a_{k,i}^{-} \theta_i^{-}) = 1, \quad k = 1, \dots, K, \\
& a_k^{\perp},\ a_{k,i}^{+},\ a_{k,i}^{-} \geq 0; \quad \theta_k^{\perp},\ \theta_i^{+},\ \theta_i^{-} > 0; \quad k = 1, \dots, K; \ i = 1, \dots, M.
\end{aligned}
$$

The first constraint ensures that $E[S_k] = 1$. If variances are considered to be more accurately measured than covariance, the variance can be matched exactly and the second term in the distance function is used to calibrate the parameters.

In the distance function, each element in the covariance matrix has equal loading. In practice, the risk factors may play different roles due to the difference in the number of obligors affected by each risk factor. Thus, we can adjust the distance function by adding loading weights:

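$$
\tilde{f}(a_k^{\perp}, a_{k,i}^{+}, a_{k,i}^{-}, \theta_k^{\perp}, \theta_i^{+}, \theta_i^{-})
= \sum_{k=1}^{K} \lambda_{k,k} (\operatorname{var}[S_k] - \Sigma_{k,k})^2 + \sum_{k=1}^{K} \sum_{k'=1}^{k-1} \lambda_{k,k'} (\operatorname{cov}[S_k, S_{k'}] - \Sigma_{k,k'})^2 \tag{3.5}
$$

with

$$
\lambda_{k,k'} = \frac{\sqrt{\sum_{A \in \mathcal{A}} w_{A,k} \sum_{A \in \mathcal{A}} w_{A,k'}}}{\sum_{l=1}^{K} \sum_{l'=1}^{l} \sqrt{\sum_{A \in \mathcal{A}} w_{A,l} \sum_{A \in \mathcal{A}} w_{A,l'}}}.
$$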
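The unweighted and weighted distance functions differ only in the loading applied to each squared deviation, so both can be expressed with one helper. The following sketch assumes covariance matrices stored as nested lists and a matrix `sector_weights[A][k]` holding the sector weights of obligor $A$ (sector columns only); these names and the toy numbers are illustrative assumptions. The resulting objective would still have to be passed to a constrained optimizer, which is not shown here.

```python
import math


def distance(model_cov, observed_cov, weights=None):
    """Sum of squared deviations over the lower triangle, diagonal included.

    With weights=None every element has equal loading; otherwise weights[k][l]
    multiplies the squared deviation of element (k, l).
    """
    K = len(model_cov)
    total = 0.0
    for k in range(K):
        for l in range(k + 1):  # l == k is the variance term
            w = 1.0 if weights is None else weights[k][l]
            total += w * (model_cov[k][l] - observed_cov[k][l]) ** 2
    return total


def loading_weights(sector_weights):
    """Loading weights built from the column sums of the obligor sector weights."""
    K = len(sector_weights[0])
    col = [sum(row[k] for row in sector_weights) for k in range(K)]
    raw = [[math.sqrt(col[k] * col[l]) for l in range(K)] for k in range(K)]
    norm = sum(raw[k][l] for k in range(K) for l in range(k + 1))
    return [[raw[k][l] / norm for l in range(K)] for k in range(K)]


# Toy example with K = 2 sectors and three obligors (values are made up).
model = [[1.0, 0.2], [0.2, 0.5]]
observed = [[1.0, 0.1], [0.1, 0.7]]
lam = loading_weights([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
d_plain = distance(model, observed)
d_weighted = distance(model, observed, lam)
print(d_plain, d_weighted)
```

Because the weights are normalized over the lower triangle, they sum to one there, so the weighted distance is a weighted average of the squared deviations.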

3.3 Calculation of loss distribution

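In order to fit the proposed model into the original CreditRisk+ framework, $p_A^S$ can be written as

$$
\begin{aligned}
p_A^S &= p_A \Big( w_{A,0} + \sum_{k=1}^{K} w_{A,k} S_k \Big) \\
&= p_A \Big( w_{A,0} + \sum_{k=1}^{K} w_{A,k} a_k^{\perp} Y_k^{\perp} + \sum_{i=1}^{M} U Y_i^{+} \sum_{k=1}^{K} w_{A,k} a_{k,i}^{+} + \sum_{i=1}^{M} (1 - U) Y_i^{-} \sum_{k=1}^{K} w_{A,k} a_{k,i}^{-} \Big) \\
&:= p_A \Big( \tilde{w}_{A,0} + \sum_{k=1}^{K+2M} \tilde{w}_{A,k} \tilde{S}_k \Big)
\end{aligned}
\tag{3.6}
$$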


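with

$$
\tilde{w}_{A,k} :=
\begin{cases}
w_{A,0}, & k = 0, \\
w_{A,k}\, a_k^{\perp}, & k = 1, \dots, K, \\
\sum_{l=1}^{K} w_{A,l}\, a_{l,k-K}^{+}, & k = K + 1, \dots, K + M, \\
\sum_{l=1}^{K} w_{A,l}\, a_{l,k-K-M}^{-}, & k = K + M + 1, \dots, K + 2M,
\end{cases}
$$

and, given $U$,

$$
\tilde{S}_k :=
\begin{cases}
Y_k^{\perp} \sim \mathrm{Gamma}(\theta_k^{\perp}, 1), & k = 1, \dots, K, \\
U Y_{k-K}^{+} \sim \mathrm{Gamma}(\theta_{k-K}^{+}, U), & k = K + 1, \dots, K + M, \\
(1 - U) Y_{k-K-M}^{-} \sim \mathrm{Gamma}(\theta_{k-K-M}^{-}, 1 - U), & k = K + M + 1, \dots, K + 2M.
\end{cases}
$$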
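The regrouping of weights amounts to a short bookkeeping step per obligor. A minimal sketch, assuming the original weights of one obligor are stored as a list `w = [w_0, w_1, ..., w_K]` and the loadings as nested lists (hypothetical layout and values):

```python
def regroup_weights(w, a_perp, a_plus, a_minus):
    """Map [w_0, w_1, ..., w_K] to the K + 2M + 1 regrouped weights of one obligor."""
    K, M = len(a_perp), len(a_plus[0])
    out = [w[0]]                                        # idiosyncratic weight, k = 0
    out += [w[k + 1] * a_perp[k] for k in range(K)]     # specific factors
    out += [sum(w[l + 1] * a_plus[l][i] for l in range(K)) for i in range(M)]
    out += [sum(w[l + 1] * a_minus[l][i] for l in range(K)) for i in range(M)]
    return out


# Hypothetical obligor with K = 2 sectors and M = 1 common pair.
w_tilde = regroup_weights([0.2, 0.5, 0.3],
                          a_perp=[0.5, 0.5],
                          a_plus=[[1.0], [0.0]],
                          a_minus=[[0.0], [1.0]])
print(w_tilde)
```

For this example the regrouped vector is `[0.2, 0.25, 0.15, 0.5, 0.3]`: one idiosyncratic weight, two specific-factor weights, and one weight each for the two opposite common components.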

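Therefore, conditional on $U$, $\tilde{S}_k$, $k = 1, \dots, K + 2M$, are independent. According to Gundlach (2004), the conditional probability generating function of the default loss $L = \sum_{A \in \mathcal{A}} D_A v_A$ is

$$
G_L(z \mid U) = \exp \Big( \sum_{A \in \mathcal{A}} \tilde{w}_{A0}\, p_A (z^{v_A} - 1) - \sum_{k=1}^{K+2M} \alpha_k \ln \Big( 1 - \beta_k \sum_{A \in \mathcal{A}} \tilde{w}_{Ak}\, p_A (z^{v_A} - 1) \Big) \Big), \tag{3.7}
$$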

where $\alpha_k$ and $\beta_k$ are the shape and scale parameters of $\tilde{S}_k$, respectively, and $v_A$ is the net exposure of obligor $A \in \mathcal{A}$. Given $U$, the numerical procedure for the loss distribution calculation in Haaf et al (2004) can be written as follows.

(1) The coefficients $a^{(k)}$ before the expansion of the logarithm are calculated as

$$a_0^{(k)} = 1 + \beta_k \sum_{A \in \mathcal{A}} \tilde{w}_{Ak}\, p_A, \qquad a_n^{(k)} = \beta_k \sum_{A \in \mathcal{A} :\, v_A = n} \tilde{w}_{Ak}\, p_A, \quad n \geq 1, \quad k = 1, \dots, K + 2M.$$

(2) The coefficients $b^{(k)}$ after the expansion of the logarithm are calculated as

$$b_0^{(k)} = -\ln(a_0^{(k)}), \qquad b_n^{(k)} = \frac{1}{a_0^{(k)}} \Big( a_n^{(k)} + \frac{1}{n} \sum_{j=1}^{n-1} j\, b_j^{(k)} a_{n-j}^{(k)} \Big), \quad n \geq 1, \quad k = 1, \dots, K + 2M.$$

(3) The coefficients $c$ before the exponential expansion are calculated as

$$c_0 = -\sum_{A \in \mathcal{A}} \tilde{w}_{A0}\, p_A + \sum_{k=1}^{K+2M} \alpha_k b_0^{(k)}, \qquad c_n = \sum_{A \in \mathcal{A} :\, v_A = n} \tilde{w}_{A0}\, p_A + \sum_{k=1}^{K+2M} \alpha_k b_n^{(k)}, \quad n \geq 1.$$

(4) The coefficients $g$ after the exponential expansion are calculated as

$$P[L = 0 \mid U] = g_0 = \exp(c_0), \qquad P[L = n \mid U] = g_n = \sum_{j=1}^{n} \frac{j}{n}\, g_{n-j}\, c_j, \quad n \geq 1.$$

The probability of default loss can be calculated by

$$P[L = n] = E[P[L = n \mid U]] = \int_0^1 P[L = n \mid U = u]\, \mathrm{d}u.$$

The above integral does not have a closed form and must be calculated numerically.

4 NUMERICAL EXAMPLES

In this section, we apply the proposed model to an example data set under three different scenarios. This example data set is generated randomly and contains 3000 obligors, each of which is affected by only one of eight risk factors. The three scenarios represent experiments without the negative dependence between risk factors, with negative covariance less than $-\tfrac{1}{3}$ and with negative covariance greater than $-\tfrac{1}{3}$, respectively. In each scenario, eight random variables ($x_1, x_2, \dots, x_8$) are generated according to some specified formulas, and used to generate the covariance, Spearman’s rho and Kendall’s tau of the risk factors. The parameters of the models are estimated with the constraint that the variance is matched exactly. The parameters of the $C_{A,B}$-copula-based model are estimated by approximating the Spearman’s rho and Kendall’s tau correlation matrixes of risk factors (see Appendix A online). The optimization problems of estimating model parameters are solved by setting thirty groups of initial values for each model, and the parameters with the optimal objective function value are adopted. For the $C_{A,B}$-copula-based model, Monte Carlo simulation is used to estimate the covariance matrix.


Following the work of Fischer and Dietz (2011), three normalized distance measures between the generated covariance matrix ($\Sigma_{ij}$) and the estimated covariance of the models ($\hat{\Sigma}_{ij}$) are calculated to quantify the goodness-of-fit:

$$D_1 = \sqrt{\frac{1}{K(K - 1)/2} \sum_{i < j} (\hat{\Sigma}_{ij} - \Sigma_{ij})^2},$$

$$D_2 = \frac{1}{K(K - 1)/2} \sum_{i < j} |\hat{\Sigma}_{ij} - \Sigma_{ij}|,$$

$$D_3 = \max_{i < j} |\hat{\Sigma}_{ij} - \Sigma_{ij}|.$$
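The three goodness-of-fit measures are a root-mean-square, a mean absolute and a maximum absolute deviation over the off-diagonal pairs. A minimal sketch with made-up $3 \times 3$ matrices (the function name and inputs are illustrative assumptions):

```python
import math


def fit_distances(estimated, reference):
    """Return (D1, D2, D3) computed over the strict upper triangle i < j."""
    K = len(reference)
    diffs = [estimated[i][j] - reference[i][j]
             for i in range(K) for j in range(i + 1, K)]
    n_pairs = K * (K - 1) / 2
    d1 = math.sqrt(sum(d * d for d in diffs) / n_pairs)
    d2 = sum(abs(d) for d in diffs) / n_pairs
    d3 = max(abs(d) for d in diffs)
    return d1, d2, d3


reference = [[1.0, 0.2, -0.1], [0.2, 1.0, 0.3], [-0.1, 0.3, 1.0]]
estimated = [[1.0, 0.1, 0.0], [0.1, 1.0, 0.3], [0.0, 0.3, 1.0]]
d1, d2, d3 = fit_distances(estimated, reference)
print(d1, d2, d3)
```

Note that a model that truncates a negative covariance to zero, as in the (1, 3) element of this toy example, is penalized by all three measures.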

4.1 Scenario 1: covariance with only positive dependence

In scenario 1, the eight random variables are generated by

$$
x_i =
\begin{cases}
(0.9 + 0.1 u_i)\, \xi + 0.8\, \varepsilon_i, & i = 1, \dots, 7, \\
\varepsilon_i, & i = 8,
\end{cases}
$$

where $u_i \sim \mathrm{Unif}[0, 1]$, $\xi \sim \mathrm{Gamma}(0.8, 1)$, $\varepsilon_i \sim \mathrm{Gamma}(v_i, 1)$ and $v_i \sim \mathrm{Unif}[0.3, 1.5]$; $u_i$, $\varepsilon_i$, $v_i$, $i = 1, \dots, 8$, and $\xi$ are independent. The parameters used for generating the random variables are chosen such that the first seven variables are positively correlated and are independent of the eighth. Examples of a covariance matrix of $x_i$ and its estimated matrixes are shown in Tables B.1 and B.2, respectively, of Appendix B online. The values-at-risk (VaRs) for different confidence levels are summarized in Table B.3.
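A sketch of how a scenario 1 covariance matrix might be generated by simulation. Two assumptions are made here that the text does not settle: the shape parameters $v_i$ are drawn once per experiment and then held fixed, and the covariance is estimated from 100,000 simulated draws. The function names and the seed are arbitrary.

```python
import random


def scenario1_sample(v, rng):
    """One draw of (x_1, ..., x_8): a shared gamma factor plus specific gamma terms."""
    common = rng.gammavariate(0.8, 1.0)
    x = [(0.9 + 0.1 * rng.random()) * common + 0.8 * rng.gammavariate(v[i], 1.0)
         for i in range(7)]
    x.append(rng.gammavariate(v[7], 1.0))  # x_8 has no common component
    return x


def sample_cov(rows, i, j):
    """Unbiased sample covariance between columns i and j."""
    n = len(rows)
    mi = sum(r[i] for r in rows) / n
    mj = sum(r[j] for r in rows) / n
    return sum((r[i] - mi) * (r[j] - mj) for r in rows) / (n - 1)


rng = random.Random(2018)
v = [rng.uniform(0.3, 1.5) for _ in range(8)]  # assumed fixed within an experiment
rows = [scenario1_sample(v, rng) for _ in range(100_000)]
cov_12 = sample_cov(rows, 0, 1)
cov_18 = sample_cov(rows, 0, 7)
print(cov_12, cov_18)
```

Under these assumptions the covariance between any two of the first seven variables is $0.95^2 \times 0.8 \approx 0.72$ in expectation, while the covariance with the eighth variable is zero.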

To evaluate the goodness-of-fit between the models, thirty experiments are executed. The mean values and standard deviations of $D_1$, $D_2$ and $D_3$ are shown in Table 1. The paired t-tests verify that the $D_1$, $D_2$ and $D_3$ measures of the mixed vector model ($M = 2$) are significantly less than those of the other models at level 0.01.

4.2 Scenario 2: covariance with negative dependence less than $-\tfrac{1}{3}$

The eight random variables in scenario 2 are generated by

$$
x_i =
\begin{cases}
-(0.5 + 0.3 u_i)\, \xi + 0.8\, \varepsilon_i, & i = 1, \dots, 3, \\
(0.9 + 0.1 u_i)\, \xi + 0.8\, \varepsilon_i, & i = 4, \dots, 7, \\
\varepsilon_i, & i = 8,
\end{cases}
$$

where $u_i \sim \mathrm{Unif}[0, 1]$, $\xi \sim \mathrm{Gamma}(0.8, 1)$, $\varepsilon_i \sim \mathrm{Gamma}(v_i, 1)$ and $v_i \sim \mathrm{Unif}[0.3, 1.5]$; $u_i$, $\varepsilon_i$, $v_i$, $i = 1, \dots, 8$, and $\xi$ are independent. The parameters used for generating the random variables are chosen such that the first three variables are positively correlated; these are negatively correlated with the fourth, fifth, sixth and seventh variables, and are independent of the eighth one, while some elements of the covariance matrix are less than $-\tfrac{1}{3}$. Appendix C online shows an example of the generated covariance matrix, its estimated covariance and the VaRs. From the example, we can see that the negative covariance is estimated to be zero by both the compound gamma model and the common vector model. The mixed vector model and the $C_{A,B}$-copula model can estimate the negative coefficients. The VaR results of the mixed vector model at high levels are much smaller than those of the other models because the mixed vector model rebuilds the negatively correlated structure.

TABLE 1 Goodness-of-fit measures of scenario 1 experiments.

| Model | $D_1$ | $D_2$ | $D_3$ |
|---|---|---|---|
| Hidden gamma | 0.248 (0.069) | 0.211 (0.062) | 0.481 (0.146) |
| Compound gamma, $M = 2$ | 0.017 (0.004) | 0.013 (0.003) | 0.044 (0.015) |
| Compound gamma, $M = 4$ | 0.011 (0.004) | 0.008 (0.002) | 0.032 (0.015) |
| Common vector, $M = 2$ | 0.012 (0.003) | 0.008 (0.002) | 0.033 (0.015) |
| Common vector, $M = 4$ | 0.009 (0.004) | 0.006 (0.002) | 0.028 (0.016) |
| Mixed vector, $M = 1$ | 0.010 (0.002) | 0.008 (0.002) | 0.026 (0.009) |
| Mixed vector, $M = 2$ | 0.002 (0.001) | 0.002 (0.001) | 0.007 (0.002) |
| $C_{A,B}$-copula | 0.117 (0.034) | 0.088 (0.023) | 0.255 (0.090) |

TABLE 2 Goodness-of-fit measures of scenario 2 experiments.

| Model | $D_1$ | $D_2$ | $D_3$ |
|---|---|---|---|
| Hidden gamma | 0.462 (0.047) | 0.397 (0.041) | 0.704 (0.075) |
| Compound gamma, $M = 2$ | 0.325 (0.037) | 0.248 (0.029) | 0.579 (0.075) |
| Compound gamma, $M = 4$ | 0.316 (0.035) | 0.211 (0.023) | 0.579 (0.075) |
| Common vector, $M = 2$ | 0.316 (0.035) | 0.212 (0.024) | 0.579 (0.075) |
| Common vector, $M = 4$ | 0.316 (0.035) | 0.211 (0.023) | 0.579 (0.075) |
| Mixed vector, $M = 1$ | 0.163 (0.033) | 0.126 (0.022) | 0.331 (0.088) |
| Mixed vector, $M = 2$ | 0.136 (0.035) | 0.100 (0.025) | 0.280 (0.077) |
| $C_{A,B}$-copula | 0.238 (0.05) | 0.198 (0.041) | 0.455 (0.102) |

The mean values and standard deviations of $D_1$, $D_2$ and $D_3$ in thirty experiments are shown in Table 2. The paired t-tests verify that the $D_1$, $D_2$ and $D_3$ measures of the mixed vector model ($M = 1, 2$) are significantly less than those of the other models at level 0.01.


TABLE 3 Goodness-of-fit measures of scenario 3 experiments.

| Model | $D_1$ | $D_2$ | $D_3$ |
|---|---|---|---|
| Hidden gamma | 0.320 (0.034) | 0.280 (0.029) | 0.482 (0.054) |
| Compound gamma, $M = 2$ | 0.206 (0.027) | 0.180 (0.023) | 0.319 (0.044) |
| Compound gamma, $M = 4$ | 0.179 (0.025) | 0.123 (0.017) | 0.319 (0.044) |
| Common vector, $M = 2$ | 0.179 (0.025) | 0.123 (0.017) | 0.319 (0.044) |
| Common vector, $M = 4$ | 0.179 (0.025) | 0.122 (0.017) | 0.319 (0.044) |
| Mixed vector, $M = 1$ | 0.062 (0.024) | 0.050 (0.018) | 0.136 (0.054) |
| Mixed vector, $M = 2$ | 0.030 (0.020) | 0.022 (0.015) | 0.075 (0.046) |
| $C_{A,B}$-copula | 0.160 (0.034) | 0.137 (0.028) | 0.299 (0.066) |

4.3 Scenario 3: covariance with negative dependence greater than $-\tfrac{1}{3}$

Scenario 3 generates the eight random variables by

$$
x_i =
\begin{cases}
-(0.54 + 0.06 u_i)\, \xi + 0.5\, \eta^{(1)} + 0.8\, \varepsilon_i, & i = 1, \dots, 3, \\
(0.54 + 0.06 u_i)\, \xi + 0.5\, \eta^{(2)} + 0.8\, \varepsilon_i, & i = 4, \dots, 7, \\
\varepsilon_i, & i = 8,
\end{cases}
$$

where $u_i \sim \mathrm{Unif}[0, 1]$, $\xi, \eta^{(1)}, \eta^{(2)} \sim \mathrm{Gamma}(0.8, 1)$, $\varepsilon_i \sim \mathrm{Gamma}(v_i, 1)$ and $v_i \sim \mathrm{Unif}[0.3, 1.5]$; $u_i$, $\varepsilon_i$, $v_i$, $i = 1, \dots, 8$, $\eta^{(1)}$, $\eta^{(2)}$ and $\xi$ are independent. The parameters used for generating the random variables are chosen such that the first three risk factors are positively correlated, are negatively correlated with the fourth, fifth, sixth and seventh risk factors, and are independent of the eighth risk factor, while all elements of the covariance matrix are greater than $-\tfrac{1}{3}$. Appendix D online shows an example of the generated covariance matrix, its estimated covariance and the VaRs.
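The scenario labels can be checked with a short calculation. Because the random coefficients are independent of the shared gamma factor, the covariance between a variable from the first group ($i \leq 3$) and one from the second group ($4 \leq i \leq 7$) is minus the product of the two mean coefficients times the variance of that factor, which is 0.8 for a Gamma(0.8, 1) variable. The sketch below assumes the sign reading of the generating formulas in which the first group loads negatively on the shared factor; the helper name is hypothetical.

```python
VAR_COMMON = 0.8  # variance of a Gamma(0.8, 1) random variable


def mean_coef(c0, c1):
    """E[c0 + c1 * u] for u ~ Unif[0, 1]."""
    return c0 + 0.5 * c1


# Cross-group covariance cov[x_i, x_j], i in {1, 2, 3}, j in {4, ..., 7}.
cross_s2 = -mean_coef(0.5, 0.3) * mean_coef(0.9, 0.1) * VAR_COMMON
cross_s3 = -mean_coef(0.54, 0.06) * mean_coef(0.54, 0.06) * VAR_COMMON
print(cross_s2, cross_s3)
```

This gives $-0.65 \times 0.95 \times 0.8 = -0.494$ for scenario 2, which lies below the bound of $-\tfrac{1}{3}$ and therefore cannot be matched exactly by the mixed vector model, and $-0.57^2 \times 0.8 \approx -0.260$ for scenario 3, which lies above it.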

The mean values and standard deviations of $D_1$, $D_2$ and $D_3$ in thirty experiments are shown in Table 3. The paired t-tests verify that the $D_1$, $D_2$ and $D_3$ measures of the mixed vector model ($M = 1, 2$) are significantly less than those of the other models at level 0.01.

5 SUMMARY

In this paper, we proposed a mixed vector model within the CreditRisk+ framework, which can accommodate a complicated dependence structure of risk factors. Compared with other extended CreditRisk+ models, our model can better rebuild the negative correlations of risk factors. Moreover, it can be translated into the CreditRisk+ framework with conditionally independent risk factors, so the numerical algorithm for loss calculation can be used directly with little modification. The limitation of the mixed vector model is that the approximation of negative covariance structure is limited.

DECLARATION OF INTEREST

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.

ACKNOWLEDGEMENTS

The work of X. Zhang and J. Zhu was partly supported by the NSFC (grant 71371034).

REFERENCES

Burgisser, P., Kurth, A., Wagner, A., and Wolf, M. (1999). Integrating correlations. Risk 12(7), 1–7.

Credit Suisse Financial Products (1997). CreditRisk+: a credit risk management framework. Technical paper. URL: www.csfb.com/institutional/research/assets/creditrisk.pdf.

Fischer, M., and Dietz, C. (2011). Modeling sector correlations with CreditRisk+: the common background vector model. The Journal of Credit Risk 7(4), 23–43 (https://doi.org/10.21314/JCR.2011.134).

Giese, G. (2004). Dependent risk factors. In CreditRisk+ in the Banking Industry, Gundlach, M., and Lehrbass, F. (eds), pp. 153–165. Springer (https://doi.org/10.1007/978-3-662-06427-6_10).

Gundlach, V. M. (2004). Basics of CreditRisk+. In CreditRisk+ in the Banking Industry, Gundlach, M., and Lehrbass, F. (eds), pp. 7–24. Springer (https://doi.org/10.1007/978-3-662-06427-6_2).

Haaf, H., Reiß, O., and Schoenmakers, J. (2004). Numerically stable computation of CreditRisk+. The Journal of Risk 6(4), 1–10 (https://doi.org/10.21314/JOR.2004.097).

Han, C., and Kang, J. (2008). An extended CreditRisk+ framework for portfolio credit risk management. The Journal of Credit Risk 4(4), 63–80 (https://doi.org/10.21314/JCR.2008.080).

Wang, R., Peng, L., and Yang, J. (2015). CreditRisk+ model with dependent risk factors. North American Actuarial Journal 19(1), 24–40 (https://doi.org/10.1080/10920277.2014.976311).


Journal of Credit Risk 14(2), 45–74
DOI: 10.21314/JCR.2017.236

Research Paper

Consumer risk appetite, the credit cycle and the housing bubble

Joseph L. Breeden¹ and José J. Canals-Cerdá²

¹ Prescient Models LLC, 300 Catron Street, Suite B, Santa Fe, NM 87501, USA
² Federal Reserve Bank of Philadelphia, Ten Independence Mall, Philadelphia, PA 19106, USA

(Received December 3, 2016; revised September 21, 2017; accepted February 15, 2018)

ABSTRACT

In this paper, we explore the role of consumer risk appetite in the initiation of credit cycles and as an early trigger of the US mortgage crisis. We analyze a panel data set of mortgages originated between 2000 and 2009 and follow their performance up to 2014. After controlling for all of the usual observable effects, we show that a strong residual vintage effect remains. This vintage effect correlates well with consumer mortgage demand, as measured by the Federal Reserve Board’s Senior Loan Officer Opinion Survey, and with changes in mortgage pricing at the time the loan was originated. Our findings are consistent with an economic environment in which the incentives of low-risk consumers to obtain a mortgage decrease when the cost of obtaining a loan rises. As a result, mortgage originators generate mortgages from a pool of consumers with changing risk profiles over the credit cycle. The unobservable component of the shift in credit risk, relative to the usual underwriting criteria, may be thought of as macroeconomic adverse selection.

Keywords: credit risk; credit cycle; mortgages; lending standards; financial crisis.

Corresponding author: J. L. Breeden
Print ISSN 1744-6619 | Online ISSN 1755-9723
© 2018 Infopro Digital Risk (IP) Limited


1 INTRODUCTION

In our experience developing models for forecasting and stress testing portfolio credit risk through the US mortgage crisis, we have often observed that standard underwriting measures perform suboptimally and are insufficient to explain observed variations in credit quality. In this paper, we explore the possible causes of this unexplained variation and conjecture that consumer risk appetite may be a root cause. We refer to this effect as “macroeconomic adverse selection” to emphasize that loans exhibit anomalous credit risk because of consumers’ perception of macroeconomic conditions.

Changes in default risk that cannot be observed via standard credit scores and are suspected of being caused by consumer behavior are generally referred to as adverse selection. The macroeconomic adverse selection mechanism we consider in this paper relates to anomalous credit risk associated with consumers’ perception of macroeconomic conditions. In this regard, the aim of our paper is similar to that of Breeden et al (2008) and Calem et al (2011), which will be discussed in some detail later in this section.

In contrast, in the standard example, adverse selection can impact a specific lender when it fails to respond to precautionary product or pricing changes made by its peers. Through the lender’s inaction, consumers with lower credit risk are drawn to other lenders, leaving only the riskier borrowers for the unresponsive lender. In this scenario, the credit risk faced by the lender for the originated pool of loans can be much worse than what could be expected using traditional measures of credit quality, such as borrower credit scores. In terms of nomenclature, we have chosen to relabel this standard form of adverse selection “competitive adverse selection” to differentiate it from the macroeconomic adverse selection mechanism that is the subject of analysis in this paper.

The adverse selection just described for retail lending is a specific example of the broader adverse selection problem that arises from asymmetric information (Akerlof 1970). Asymmetric information and the creation of adverse selection have been studied in employment (Bar-Isaac et al 2007) and insurance (Rothschild and Stiglitz 1976). In the context of the competitive adverse selection mentioned above, Stiglitz and Weiss (1981) explored loan pricing versus credit risk. Ausubel (1998) studied credit card default risk and observed that inferior product offers resulted in pools of inferior borrowers.

In general terms, the same result is sought in the current study. Rather than an offer being inferior because of pricing terms relative to other offers in the market, might borrowers view all offers to be inferior during certain economic conditions? The lender cannot know the personal motivations and value assessments of the individual borrowers, which creates an information asymmetry.


When the real estate bubble in the United States and across several European countries burst, it precipitated a deep financial crisis accompanied by an unsettling sovereign crisis in Europe. Understanding the mechanisms that led to the creation of the real estate bubble can prove extremely helpful, particularly for implementing appropriate policies to minimize the risks of asset bubbles in the future. Recognizing this, the analysis of the leading factors contributing to the real estate bubble has generated a growing body of research. In the following, we review some of the most plausible proposed explanations and highlight our contribution to this literature.

Researchers in the empirical macroeconomics field have pointed to the simultaneity of rising asset values and current account deficits in the United States as well as other countries affected by real estate bubbles (see Adam et al 2011; Bergin 2011; Gete 2014; In’t Veld et al 2014). Their analysis suggests that current account deficits need to be accompanied by mispricing risk and falling lending standards to generate bubbles. In a similar vein, some economists have pointed out that the unusually low interest rates in the years before the crisis may have exacerbated the housing boom and bust (Taylor 2014). Other authors, however, are critical of that view. Bernanke (2010) argues that monetary policy during that period was close to his preferred Taylor rule and was appropriate, given deflationary concerns at the time. Further, significant increases in house prices preceded the period of accommodative monetary policy. In addition, cross-country analysis does not support the view that monetary policy played a fundamental role in the housing bubble. MacGee (2010) points out that Canada followed a monetary policy similar to that of the United States but did not suffer from a housing bubble.

Existing empirical microeconomics research points to mispricing risk and falling lending standards as fundamental catalysts of the crisis. In particular, researchers have considered the impact of investors in the mortgage market, either through direct purchases of houses or through the purchase of mortgage-backed securities. Haughwout et al (2011) point to the increasing role played by investors during the bubble years. Specifically, they document that investors were responsible for almost half of purchase mortgage originations at the peak of the market bubble. Investors were also associated with higher rates of default after the bubble burst. The Financial Crisis Inquiry Commission (2011) report concluded that irresponsible (and in some cases even egregious and predatory) lending practices and failures of risk management, financial regulation and supervision were the main reasons for a financial crisis that could have been avoided.

Several authors have argued that securitized loans were originated using lower lending standards than loans held in bank portfolios. Elul (2016) calculates that, after controlling for observable risk factors, loans that are privately securitized have a 20% higher rate of becoming delinquent. His finding is consistent with research by Keys et al (2010), who point out that the securitization framework can reduce lenders’ incentives to monitor lending standards (see also Nadauld and Sherlund 2013). Securitization may also have contributed to lower lending standards more broadly through its effects in a competitive market. Levitin et al (2009) argue that private-label securitization was not just a contributor to the crisis, but was, in fact, at the root of it.¹ Ruckes (2004) describes theoretically a mechanism for the transmission of low screening activity resulting from intense price competition among lenders.²

Foote et al (2012) take a contrarian view and argue that investment decisions made during the bubble years were rational and logical, given investors’ beliefs about future house prices at the time.

Several authors have focused their attention on the way lending standards were lowered during the years before the bubble burst. Dell’Ariccia et al (2012), using mortgage origination information from the Home Mortgage Disclosure Act (HMDA), document the lowering of lending standards, particularly in areas that experienced faster growth in credit demand. Demyanyk and Hemert (2011) document that the quality of loans deteriorated for six years prior to the crisis. Palmer (2014), using data from privately securitized subprime mortgages, points out that mortgages originated in the two years before the cycle were about three times more likely to default within a three-year period than mortgages that originated around 2003. He argues that one-third of the increase in defaults can be attributed to changing borrower and loan characteristics, while the remaining two-thirds can be attributed to the price cycle.

Previous studies of the US mortgage crisis have suggested that factors beyond those visible to the lenders had a strong impact on credit quality. Breeden (2011) analyzed a fifteen-year data set of mortgage performance by employing a dual-time dynamics approach (Breeden et al 2008) and found that dramatic cycles in credit quality occurred three times during the observation period, even after segmenting by product type, credit score and loan-to-value (LTV). Further, Breeden found that these cycles correlate with macroeconomic factors, such as changes in housing prices and mortgage interest rates. Similarly, Calem et al (2011) used a combination of competing risk models and panel regression to show that riskier households tended to borrow more on their home equity loans when the expected unemployment risk increased.

In this paper, we quantify the impact of macroeconomic adverse selection on a data set of first-lien, installment, fixed-rate, conventional mortgages. We intentionally avoid option adjustable-rate mortgages and negative amortizing products, to focus specifically on the question of the impact of macroeconomic adverse selection effects in this core mortgage product. We create a complete loan-level probability of default model that includes all of the standard predictive factors (loss timing versus age, also known as the life cycle; credit risk scoring attributes, such as FICO score, LTV, etc; and macroeconomic drivers, such as unemployment and house prices) and, using this framework, we demonstrate that a strong vintage-based effect persists beyond these observables. In addition, we demonstrate that this residual credit risk is highly correlated with consumer mortgage demand based on the Federal Reserve Board’s (FRB) Senior Loan Officer Opinion Survey (SLOOS) and changes in mortgage pricing at the time of loan origination. To the best of our knowledge, this is the first paper to affirm the correlation of credit risk with consumer demand and macroeconomic factors for residential mortgages, after controlling for all available scoring attributes.

1 Levitin and Wachter (2013) argue that securitization was also responsible for the commercial real estate bubble in the United States.

2 See Berlin (2009) for a survey of alternative theories of the bank lending cycle.

In the next section, we present the data and provide descriptive statistics for some of the key variables in our sample. Section 3 contains the empirical methodology, and Section 4 presents the empirical model results. Section 5 concludes the paper.

2 DATA AND DESCRIPTIVE ANALYSIS

We analyze mortgage industry data from the McDash Analytics residential mortgage servicing database. This database is mainly composed of the servicing portfolios of the largest residential mortgage servicers in the United States, and covers about two-thirds of installment-type loans in the residential mortgage servicing market. The database includes mortgages from Fannie Mae, Freddie Mac, Ginnie Mae and private securitized portfolios, as well as banks’ portfolios. The original data set contains monthly loan performance data from mortgages originating from 1992. The data includes a broad range of loan attributes from the underwriting process (such as product type, documentation type, loan purpose, property type and zip code), borrower characteristics (such as credit score, debt-to-income ratio and owner occupancy) and dynamic loan-level attributes (such as delinquency status, loan balance, current interest rate and investor type).

Our sample of mortgage industry data includes the full performance history of a randomly selected sample of loans in the McDash residential mortgage servicing database. Much has been written about how negative amortizing loans and second liens caused exceptionally high loss rates. To focus our analysis on the question of macroeconomic adverse selection, we restrict our analysis to fixed-term, fixed-rate, first-lien mortgages. We also restrict the analysis sample to loan performance data from 2000–14 on mortgages that originated from 2000 through 2009.

We focus on modeling loans reaching a delinquency status of between sixty and eighty-nine days past due (DPD), as the later delinquency data was significantly thinner in the sample. Thus, we consider a loan in default if it reaches or exceeds this delinquency state, including foreclosure or real estate owned. Many lenders will fully or partially charge off a mortgage that reaches this level of delinquency. Further, this delinquency threshold is consistent with other relevant papers in the literature that have adopted this definition of default (see, for example, Gerardi et al 2008).

TABLE 1 Variable definitions.

| Variable | Definition |
|---|---|
| Risk score | Borrower’s FICO credit score |
| Risk score dummies | By ranges: 250–539, 540–619, 620–59, 660–99, 700–39, 740–79, 780–819 and 820+ |
| Debt-to-income (DTI) | Ratio of loan |
| Jumbo | Dummy variable for jumbo loan type |
| Private mortgage insurance (PMI) | PMI dummy |
| Term | Loan term (months) |
| Term dummies | Term dummies for ranges: up to 120, 120–80, 180–240, 240–360, 360+ |
| Documentation: | Loan documentation type; or unknown if type is not known |
| • full | Full documentation |
| • low | Low documentation |
| • no | No documentation |
| Loan-to-value (LTV) | Ratio of loan balance to current home value (ie, at each observation point in time) |
| LTV dummies | By ranges: 0–0.75, 0.75–0.80, 0.80–0.85, 0.85–0.90, 0.90–0.95, 0.95–1.00, 1.00+ |
| Loan purpose: | Purpose of the loan; or unknown if type is not known |
| • new | New loan |
| • refinance | Refinance loan |
| • other | Other (home improvement, debt consolidation, etc) |
| Loan source: | Loan origination source; or unknown if type is not known |
| • retail | New loan originated by client organization |
| • wholesale | Wholesale origination |
| • correspondent | Correspondent and flow/co-issue loans |
| • transfer | Servicing rights purchased or transferred |
| • other | Other loan source |
| Occupancy: | Occupancy type |
| • owner | Owner-occupied |
| • nonowner | Nonowner-occupied |
| • other/unknown | Other occupancy type |
| Vintage year dummies | Dummy variables specific to the origination date |

Table 1 lists the primary risk drivers used in our statistical analysis of credit risk. Relevant variables include loan-specific characteristics such as term, documentation, LTV (defined as the ratio of the loan amount to the appraisal value at origination, given as a percentage), loan purpose, loan source and occupancy.

Other borrower-specific characteristics include FICO scores at origination and debt-to-income (DTI) ratios at origination. Several variables included in our model specifications are represented as dummy variables, reflecting nonoverlapping ranges across the overall variable range. This approach allows us to estimate the potential nonlinear impact of particular variables without having to rely on specific functional form assumptions.
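As an illustration of the range-dummy encoding described above, the following is a minimal sketch rather than the authors' code. The band edges are taken from the risk score ranges in Table 1; the function name `range_dummies`, the label strings and the all-zeros treatment of out-of-range values are assumptions made for the example.

```python
from bisect import bisect_right

# Lower edges of the FICO bands listed in Table 1; the last band (820+) is open-ended.
FICO_EDGES = [250, 540, 620, 660, 700, 740, 780, 820]
FICO_LABELS = ["250-539", "540-619", "620-659", "660-699",
               "700-739", "740-779", "780-819", "820+"]


def range_dummies(value, edges=FICO_EDGES, labels=FICO_LABELS):
    """Return a dict of 0/1 indicators, one per nonoverlapping range.

    A missing value, or one below the first edge, yields all zeros
    (an assumption; a model might instead use an explicit 'unknown' dummy).
    """
    dummies = {label: 0 for label in labels}
    if value is None or value < edges[0]:
        return dummies
    # bisect_right counts how many lower edges are <= value.
    idx = bisect_right(edges, value) - 1
    dummies[labels[idx]] = 1
    return dummies


# Example: a score of 705 switches on only the 700-739 indicator.
example = range_dummies(705)
```

Because each band gets its own coefficient, the fitted effect of the score does not need to be linear or monotone in the score.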

Table 2 presents descriptive statistics across origination vintages for the representative sample used in our analysis.

Observed changes in loan characteristics at origination are consistent with our expectation. We first observe a decrease in origination FICO scores across the years prior to 2007, with the most significant decreases occurring in 2006 and 2007, and a reversal in this trend after that. The percentage of originated loans with full documentation increased significantly during the crisis years, although this variable includes a significant proportion of noncategorized loans. As expected, we also observe a decrease in nonowner-occupied loans during the crisis years. Overall, while we observe changes in the average characteristics of loans originated over the years, these changes are by no means dramatic. Thus, loan origination characteristics in the segment of the market composed of the fixed-term, fixed-rate, first-lien mortgages considered in our study remained relatively stable across the years and across observable risk dimensions.

3 THE MODELING APPROACH

We follow the lives of the loans in our sample from their origination to the time each loan is paid off or defaults. Our primary test for macroeconomic adverse selection is to create a loan-level model that includes all available origination scoring factors, macroeconomic factors and vintage fixed effects. The vintage fixed effects are intended to allow us to quantify the magnitude of adverse selection, if any, through time. It will be important to compensate for life cycle, as a function of months on books, and for changes in the macroeconomic environment that can contribute to higher losses across vintages.3 The comparison of estimation results from models with and without vintage effects will assist us in ascertaining the presence and relevance of a residual component that cannot be explained by standard scoring factors.

3 Also known as the loss timing, seasoning or credit loss hazard function. All of these refer to the changing probability of loss as a function of the age of the loan (months on books).

TABLE 2 Descriptive statistics at origination by vintage for fixed term and rate, first-lien mortgages.

| | 2000–3 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 |
|---|---|---|---|---|---|---|---|
| Risk score (mean): | 715 | 710 | 710 | 704 | 705 | 715 | 738 |
| • score in [250,540) | 0.80 | 1.09 | 1.37 | 1.81 | 1.47 | 0.67 | 0.15 |
| • score in [540,660) | 13.01 | 15.78 | 15.73 | 18.80 | 19.68 | 17.99 | 10.72 |
| • score in [660,700) | 12.12 | 14.61 | 14.67 | 15.49 | 15.52 | 14.10 | 11.97 |
| • score in [700,740) | 15.25 | 16.81 | 16.57 | 16.51 | 16.84 | 16.44 | 15.93 |
| • score in [740,900) | 30.52 | 30.52 | 30.96 | 29.32 | 30.95 | 37.13 | 51.02 |
| Jumbo | 3.59 | 4.38 | 5.52 | 3.90 | 3.61 | 1.98 | 3.30 |
| Term in months (mean): | 305 | 323 | 335 | 340 | 343 | 338 | 333 |
| • term up to 180m | 28.24 | 21.64 | 12.51 | 8.26 | 7.04 | 10.63 | 12.31 |
| • term 360+m | 66.03 | 71.93 | 82.01 | 87.41 | 89.25 | 86.32 | 84.32 |
| Documentation: | | | | | | | |
| • full | 25.54 | 25.97 | 28.03 | 30.35 | 39.26 | 48.72 | 54.01 |
| • low | 6.07 | 6.71 | 7.21 | 7.57 | 8.87 | 5.95 | 4.93 |
| • no documentation | 1.56 | 3.13 | 3.73 | 5.09 | 4.74 | 5.79 | 3.35 |
| • unknown | 66.83 | 64.19 | 61.03 | 56.98 | 47.12 | 39.54 | 37.71 |
| Loan-to-value (mean): | 71.80 | 70.50 | 68.10 | 66.20 | 69.80 | 76.30 | 75.10 |
| • LTV in [0,0.75) | 44.34 | 43.39 | 42.17 | 37.68 | 35.42 | 34.96 | 40.54 |
| • LTV in [0.75,0.90) | 31.05 | 34.26 | 37.32 | 38.28 | 35.07 | 29.83 | 26.65 |
| • LTV in [0.90,1.00) | 17.15 | 15.65 | 13.44 | 14.94 | 19.37 | 29.10 | 24.77 |
| • LTV in [1.00+) | 7.47 | 6.70 | 7.08 | 9.10 | 10.15 | 6.11 | 8.04 |
| Loan purpose: | | | | | | | |
| • new | 30.37 | 36.63 | 39.03 | 42.37 | 39.60 | 36.49 | 30.00 |
| • refinance | 4.33 | 6.88 | 16.30 | 16.09 | 17.74 | 15.52 | 15.85 |
| • other | 50.16 | 39.20 | 27.26 | 23.70 | 23.88 | 26.25 | 39.43 |
| • unknown | 15.14 | 17.30 | 17.41 | 17.84 | 18.78 | 21.75 | 14.73 |
| Loan source: | | | | | | | |
| • branch | 39.17 | 37.17 | 34.69 | 33.72 | 37.72 | 41.55 | 46.05 |
| • correspondent | 22.85 | 24.93 | 25.04 | 25.79 | 26.69 | 32.55 | 36.29 |
| • transfer | 16.84 | 16.85 | 15.87 | 14.80 | 10.01 | 7.65 | 4.96 |
| • other | 12.63 | 14.97 | 16.77 | 17.36 | 20.08 | 16.20 | 11.51 |
| • unknown | 8.51 | 6.08 | 7.63 | 8.33 | 5.50 | 2.06 | 1.18 |
| Occupancy: | | | | | | | |
| • owner | 91.66 | 89.32 | 83.72 | 83.06 | 86.19 | 87.92 | 92.67 |
| • nonowner | 5.94 | 7.28 | 7.79 | 8.69 | 8.09 | 5.60 | 2.68 |
| • other/unknown | 2.40 | 3.40 | 8.49 | 8.25 | 5.72 | 6.48 | 4.65 |
| Number of observations: | | | | | | | |
| • by origination | 3 269 020 | 1 030 870 | 1 150 190 | 1 016 150 | 837 210 | 623 840 | 736 460 |

Data source: the McDash Analytics residential mortgage servicing database.

The default probability is considered monthly, relative to the active accounts in the previous period. This compensates for the competing risk of loan payoff, also known as loan attrition. This is in the spirit of the popular nonparametric Kaplan–Meier estimator for survival models and compensates for a reduction in active accounts due to causes other than default. Developing a full attrition model was not necessary for our research, since the impact upon default rates was fully compensated in this approach. Instead, we focus on the probability of default, which compensates for early payoff indirectly by having active accounts as the denominator. For the historical data, the equivalent default rate is defined as

\[
\text{default rate}(a, v, t) = \frac{\text{defaults}(a, v, t)}{\text{active accounts}(a - 1, v, t - 1)},
\]

where a denotes the age of a loan (or months on book), v denotes the loan’s vintage by origination date, t denotes the calendar date and i denotes a loan-specific identifier. The aim of our research is explanation, not forecasting, so this adjustment in the historical estimator of default rates removes the effects of time-varying payoff rates, thereby avoiding contamination of the results from prepayment.
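The default rate definition above can be sketched in a few lines. This is an illustrative sketch only: it assumes the data have already been aggregated into counts per (age, vintage, calendar month) cell, with months stored as integer indices so that "one month earlier" is a subtraction by one. The function name `default_rates` and the dictionary layout are hypothetical.

```python
def default_rates(defaults, active):
    """Monthly default rate per (age, vintage, calendar month) cell.

    defaults[(a, v, t)] : number of loans entering default in the cell
    active[(a, v, t)]   : number of loans still active in the cell
    The denominator is the active count one month earlier, (a - 1, v, t - 1),
    so loans that paid off before then never enter the rate.
    """
    rates = {}
    for (a, v, t), n_default in defaults.items():
        at_risk = active.get((a - 1, v, t - 1), 0)
        if at_risk > 0:  # skip cells with no accounts at risk
            rates[(a, v, t)] = n_default / at_risk
    return rates


# Toy counts (made up for illustration): vintage 0 observed at ages 12 and 13.
active = {(11, 0, 11): 1000, (12, 0, 12): 990}
defaults = {(12, 0, 12): 5, (13, 0, 13): 0}
rates = default_rates(defaults, active)
```

Using the lagged active count as the denominator is what makes the estimate conditional on survival, in the same spirit as a Kaplan–Meier hazard.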

The monthly odds of a loan defaulting can be represented as a combination of the average population odds of default (ie, the average performance across all loans) and the idiosyncratic odds (ie, divergence of an individual loan from the mean of the population) (Thomas 2009):

\[
\log \text{odds of default}(a, v, t, i) = \log(\text{population odds}(a, v, t)) + \log(\text{idiosyncratic odds}(i)).
\]

Attempting to simultaneously estimate both the population odds and the idiosyncratic odds can lead to instability because of the potential collinearity of macroeconomic and scoring factors when modeled on short timescales relative to the economic cycle. Therefore, we first create a model of the population odds of default as a function of months on books, vintage origination date and calendar date. The population odds are used as a fixed input to a panel data model such that the idiosyncratic odds are measured relative to the calendar date and age-varying population mean.
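To make the additive log-odds decomposition concrete, here is a small numerical sketch. It assumes the population component is supplied as a probability and the idiosyncratic component as an odds multiplier; the helper names (`log_odds`, `prob_from_log_odds`, `loan_pd`) are hypothetical and chosen only for this example.

```python
import math


def log_odds(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def prob_from_log_odds(z):
    """Inverse of log_odds: the logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def loan_pd(population_pd, idiosyncratic_odds):
    """Combine population and idiosyncratic odds additively on the log scale."""
    z = log_odds(population_pd) + math.log(idiosyncratic_odds)
    return prob_from_log_odds(z)


# A loan with twice the population odds, when the population monthly PD is 1%.
pd_risky = loan_pd(0.01, 2.0)
```

An idiosyncratic odds multiplier of 1 returns the population probability unchanged, which is the sense in which the loan-level component measures divergence from the population mean.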

The two-stage approach of creating the population odds model and then the idiosyncratic odds model allows us to make explicit assumptions and tests around the linear trend specification error present in any model that includes age, vintage and time effects. We can solve this in the population odds forecast before computing the idiosyncratic odds, so that the results will be robust. In the following subsections, we describe our approach to modeling population odds and idiosyncratic odds.

3.1 Modeling population odds

When modeling population odds, we are focused on drivers affecting all loans rather than idiosyncratic effects. The most important systematic factors for modeling default rate are the life cycle versus the age of the loan and environmental effects versus calendar date.

Survival models have long been used to capture the risk of a terminal event (such as default) as a function of the age of the account. To add an environment function where the net impact versus calendar is estimated, a Cox proportional hazards model could be used with dummy variables for the observation month. However, since the data being modeled is monthly, the problem simplifies to a discrete time survival model.

In estimating the life cycle and the environment function, the analysis can be improved by including a fixed effect for vintage origination date. Once that is included in the discrete time survival model, it becomes an age–period–cohort (APC) model.

APC models have an extensive literature and publicly available estimation methods. To model the population odds in our research, a Bayesian APC model was used (Schmid and Held 2007). Each rate was decomposed into a life cycle function with age of the account, $F(a)$; vintage quality, $G(v)$; and environment function with time, $H(t)$. Specifically,

\[
\log\left(\frac{p(a, v, t)}{1 - p(a, v, t)}\right) = F(a) + G(v) + H(t), \tag{3.1}
\]

where a logistic link function has been chosen because the probability of default follows a binomial distribution. This formulation does not consider any idiosyncratic variation; it just captures the mean of the distribution through age, vintage and time.

A Bayesian APC algorithm was chosen because it creates a nonparametric estimate of the three functions, which provides the greatest possible resolution of changes. Relative to an initial mean-zero prior for each function, the values of the functions are adjusted to optimally predict the in-sample performance. A detailed description of the Bayesian APC algorithm is given in the online appendix.

The life cycle captures the fact that newly underwritten loans have much lower default rates than loans that are a few years old. Further, very old loans will have seasoned and are low risk. The precise shape of this life cycle function will depend on the specific product and is usually measured nonparametrically, as in survival models. The life cycle function is also referred to as a hazard function or loss timing function.

Environmental impacts are traditionally thought of as the macroeconomic environment experienced by all active loans. Changes in unemployment and house prices are the primary drivers of mortgage defaults by calendar date. However, other portfolio management drivers may be present. Because we are conducting an industry-wide study, these drivers would have to be industry-wide portfolio management trends, which may occur. By using the approach in which an environmental function is estimated directly from the data, we do not need to explicitly include macroeconomic factors in the model. In this way, we will capture the net effect of both macroeconomic drivers and portfolio management trends.

TABLE 3 High-level model description.

| Model design | Life cycle | Vintage | Environment |
|---|---|---|---|
| Primary model | Single function | Single function | By state |
| Score segmentation | By score | By score | By state |
| Purpose segmentation | By purpose | By purpose | By state |

As an example of this process, a refreshed LTV for a given loan may be estimated by comparing the house price index (HPI) for a particular geographic region between the origination date and the observation date. However, rather than updating LTV, the environment function captures the net impact of house prices and any other impacts on that calendar date without introducing the additional estimation error of approximating a refreshed LTV.
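For readers unfamiliar with the refreshed-LTV approximation mentioned above (which the environment function is used in place of), a minimal sketch follows. It assumes the property value moves in proportion to a regional HPI; the function name `refreshed_ltv` and its arguments are hypothetical, and the numbers are made up.

```python
def refreshed_ltv(balance, value_at_origination, hpi_origination, hpi_now):
    """Approximate current LTV by scaling the origination value with HPI growth.

    This is only an approximation: the regional index will not match the
    price path of any individual property, which is the extra estimation
    error referred to in the text.
    """
    if value_at_origination <= 0 or hpi_origination <= 0 or hpi_now <= 0:
        raise ValueError("property value and HPI levels must be positive")
    current_value = value_at_origination * hpi_now / hpi_origination
    return balance / current_value


# A loan at 90% LTV on a 200,000 home after a 20% regional price decline.
ltv_now = refreshed_ltv(180_000, 200_000, hpi_origination=100.0, hpi_now=80.0)
```

In this toy case the loan moves from 0.90 to 1.125 LTV without any change in balance, purely through the index.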

Similarly, no economic factors are used in estimating the vintage effects, $G(v)$. Although $G(v)$ may be correlated with economics, and that will be part of the later analysis, the initial estimate of the vintage effect is intended to capture the net impact of all possible drivers. Including only macroeconomic or observed factors would risk missing some of the variation in credit quality by vintage. Therefore, the fixed effects approach captures the maximum variability by vintage. At the same time, the use of a vintage function avoids overfitting relative to macroeconomic factors, since the span of the data is only one economic cycle and the risk of spurious correlations is significant.

Any model that includes factors related to the age of the loan, calendar date and vintage will have a linear specification error because of the simple relationship, $a = t - v$, where a is age, t is time and v is vintage (Breeden and Thomas 2016). This specification error is explained well in the APC literature (Mason and Fienberg 1985; Glenn 2005), and no general solution exists. In cases in which some of these dimensions are excluded, as with traditional credit scores that rely solely on information from the origination (vintage) date, a unique solution is obtained, but at the cost of being unable to predict probabilities in future time periods.
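The linear specification error can be demonstrated numerically. The sketch below is illustrative only: the three functions are toy lookup tables with made-up values, not estimates from the paper, and `apc_prob` and `shift_linear_trend` are hypothetical names. It shows that a linear trend can be moved between the age, vintage and time functions without changing any predicted probability.

```python
import math


def apc_prob(F, G, H, a, v, t):
    """Invert the logit of an additive age-vintage-time model."""
    return 1.0 / (1.0 + math.exp(-(F[a] + G[v] + H[t])))


def shift_linear_trend(F, G, H, k):
    """Move a linear trend of slope k between the three functions.

    Because a = t - v, adding k*a to F and k*v to G while subtracting
    k*t from H changes the sum by k*(a + v - t) = 0 for every loan-month.
    """
    F2 = {a: f + k * a for a, f in F.items()}
    G2 = {v: g + k * v for v, g in G.items()}
    H2 = {t: h - k * t for t, h in H.items()}
    return F2, G2, H2


# Toy functions on integer month indices (values are made up).
F = {a: -6.0 + 0.05 * a for a in range(5)}
G = {v: 0.1 * v for v in range(5)}
H = {t: 0.02 * t for t in range(9)}
F2, G2, H2 = shift_linear_trend(F, G, H, k=0.3)

# Largest change in predicted probability over all valid (a, v, t = v + a) cells.
max_diff = max(
    abs(apc_prob(F, G, H, a, v, v + a) - apc_prob(F2, G2, H2, a, v, v + a))
    for v in range(5) for a in range(5)
)
```

The two sets of functions look very different when plotted, yet `max_diff` is zero up to floating-point rounding, so the data alone cannot choose between them; an extra constraint is needed to pin down the trend.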

In some of the later analyses, the data was segmented so we could study various effects. For segmented data, we can choose to segment any or all of the previously defined functions. For example, segmenting the environment function, $H(t)$, at the state level allows us to estimate it separately by US state. Using this approach, we are able to include variations caused by the local economic environment in our estimates. Similarly, we will use segmentation to explore differences in the vintage function across segments. Note that, in all the segments tested, the life cycle function was unchanged across the segments.

FIGURE 1 Life cycle function estimated from the APC algorithm for the full data set. [Figure not reproduced. x-axis: months on books (age of loan); y-axis: probability of 60–89 DPD.] The expected probability of delinquency for the entire sample along with 5% and 95% confidence intervals. Results derived using the McDash Analytics residential mortgage servicing database.

FIGURE 2 Credit risk function estimated from the APC algorithm for the full data set. [Figure not reproduced. x-axis: vintage date, 2000–2010.] Results are derived using the McDash Analytics residential mortgage servicing database.

To test variations in the population odds by segment, we create a set of models, as listed in Table 3.

When the APC algorithm is applied to create the primary model, the algorithm provides point estimates for each value of the life cycle along with 5% and 95% confidence intervals (Figure 1). This represents the expected probability of delinquency for the entire sample. Because it is estimated via the APC algorithm, the life cycle is normalized for portfolio variations in credit quality and environment, but it is conceptually equivalent to a hazard function.

Figure 2 shows the credit risk function obtained from the APC algorithm. It shows that loans originated in 2002–4 had lower than average log odds of delinquency, whereas loans originated in 2006–8 had significantly higher than average log odds of delinquency. The rest of this paper will focus on testing the possible causes of this credit cycle.

To measure the environment function, we segmented by state. As seen in Figure 3, in a summary across all risk bands, the states were highly correlated through the aftermath of the 2001 and 2009 recessions. In Figure 4, the large outliers in 2005 were Louisiana and Mississippi following Hurricane Katrina. In the final analysis, we segmented the environment function by both state and risk bands.

The environment function shows the change in log odds of delinquency for all loans active on a given calendar date. The life cycle serves as the baseline against which the change is computed, so loans of different ages will be adjusted relative to their life cycle estimates.

The results shown for life cycle, credit quality and environment form a complete portfolio model in themselves but without causal explanation.

No macroeconomic model is needed for the macroeconomic adverse selection study. The environment function from the Bayesian APC algorithm will remove the maximum amount of temporal variability from the signal, most of which should be driven by the economy, but effects such as those from Hurricane Katrina are also obvious in the data. By using the environment function, any deviation as a function of calendar date will be removed, regardless of cause. That said, we created a panel data model of the environment functions measured by state segmentation. We built a single model to simultaneously predict the environment functions for all states, but we included fixed effects for states to allow for level shifts between them. The purpose of the panel data modeling of the environment functions with macroeconomic factors was to test for the necessity of a secular trend. If adding a $c_t$ term were statistically significant, where $c_t$ is an estimated constant for specific calendar date t, this would indicate that the environment functions are nonstationary with respect to macroeconomic effects. We designed the Bayesian APC to produce stationary environment functions, but the actual constraint we want is that the residuals be stationary when modeled against macroeconomic data. By showing that no time component is necessary, we can accept the decomposition as stable with respect to our design goals.

FIGURE 3 Environmental function by state segments estimated for sixty to eighty-nine DPD. [Figure not reproduced. x-axis: years 2000–2014; y-axis: change in log odds of 60–89 DPD.] Results derived using the McDash Analytics residential mortgage servicing database.

FIGURE 4 Environmental function for selected state segments estimated for sixty to eighty-nine DPD. [Figure not reproduced. x-axis: years 2000–2014; y-axis: change in log odds of 60–89 DPD; series: CA, FL, LA, MS, TX.] California and Florida show the biggest swings through the recessions, but Louisiana and Mississippi show greater impacts from Hurricane Katrina. Results derived using the McDash Analytics residential mortgage servicing database.

3.2 Modeling idiosyncratic odds

The final step in our analysis is to create loan-level models using first origination and then refreshed FICO and LTV attributes. The goal is to model the idiosyncratic odds separately from the population odds estimated via the APC algorithm. To create a score that incorporates the systematic effects (population odds) caused by life cycle and environmental impacts, we include the population odds as fixed offsets to a generalized linear model (GLM).4 This has the effect of adjusting the log odds on the left-hand side by the population odds as reflected by $F(a)$ and $H(t)$:

\[
\log\left(\frac{p_i(a, v, t, i)}{1 - p_i(a, v, t, i)}\right) = c_0 + \operatorname{offset}(F(a) + H(t)) + \sum_{j=1}^{n_s} c_j x_{ij} + \sum_{v=1}^{n_v} g_v, \tag{3.2}
\]

where $x_{ij}$ are the values of scoring attribute $j$ for account $i$, $c_j$ are the corresponding coefficients and $n_s$ is the number of scoring attributes. Again, $p_i(a, v, t)$ is the probability of a loan being sixty to eighty-nine days past due.

The vintage function, $G(v)$, is not included in the offset (population odds) because we want to explicitly test how much of the population odds shift by vintage can be explained by population shifts in the scoring factors. Therefore, rather than include $G(v)$ for the overall vintage function, we include fixed effects (dummy variables) for the vintages $g_v$ to capture the residual vintage performance not explained by the scoring variables.

The method described here is broadly equivalent to a discrete time survival model with the added nuance of carefully controlling for the linear trend ambiguity. With the $F(a)$ and $H(t)$ functions as fixed offsets, the linear trend cannot be changed by the inclusion of scoring factors. As soon as one function in a, v or t is held fixed, the other two will be uniquely determined, as explained in the APC literature.
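The role of a fixed offset in a logistic GLM can be sketched with a deliberately small example. This is not the authors' estimation code: it assumes a single scoring attribute and no vintage dummies for brevity, fits by plain Newton-Raphson, and uses the hypothetical name `fit_logit_with_offset`. The offset stands in for the precomputed sum of the life cycle and environment functions for each loan-month.

```python
import math


def fit_logit_with_offset(x, y, offset, n_iter=25):
    """Fit logit(p_i) = offset_i + c0 + c1 * x_i by Newton-Raphson.

    x      : one scoring attribute per observation
    y      : default indicator (0/1) or observed default rate per observation
    offset : fixed population log odds, entering with coefficient 1
    Only c0 and c1 are estimated; the offset is never rescaled.
    """
    c0, c1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, oi in zip(x, y, offset):
            p = 1.0 / (1.0 + math.exp(-(oi + c0 + c1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p           # score with respect to c0
            g1 += (yi - p) * xi    # score with respect to c1
            h00 += w               # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        c0 += (h11 * g0 - h01 * g1) / det
        c1 += (h00 * g1 - h01 * g0) / det
    return c0, c1


# Toy check with made-up values: generate expected rates from known
# coefficients and confirm that the fit recovers them given the offsets.
xs = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.2]
ys = [1.0 / (1.0 + math.exp(-(o - 0.5 + 0.8 * x))) for x, o in zip(xs, offsets)]
c0_hat, c1_hat = fit_logit_with_offset(xs, ys, offsets)
```

Because the offset is held at a coefficient of 1, adding or removing scoring attributes changes only the fitted coefficients and cannot reallocate the age or calendar-date structure already captured by the offset.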

4 AGE–PERIOD–COHORT MODEL RESULTS

To estimate the population odds, we estimated the Bayesian APC algorithm with sixty to eighty-nine DPD as our proxy for default. The life cycle functions were segmented as subprime, prime and superprime. In general, it also may be advisable to segment the life cycles by loan term. In our data, the loan terms were primarily for ten, fifteen, twenty and thirty years, and, even with our large panel, we could not distinguish differences in the life cycles with this additional level of segmentation.

4 In the language of the GLM, “fixed offsets” are factors that have a coefficient identically equal to 1.

FIGURE 5 Life cycle average expected monthly delinquency rate.

[Figure: y-axis, probability of 60–89 DPD (0 to 0.07); x-axis ticks from 0 to 120 in steps of 12; series: Superprime, Prime, Subprime.]

The y-axis represents the monthly conditional probability of default. Results derived using the McDash Analytics residential mortgage servicing database.

The y-axis of the life cycle graph (Figure 5) is the expected average monthly delinquency rate averaged across the full time range.

Figure 6 measures credit risk across vintages. Credit risk is measured as a relative scaling of the log odds. The values shown represent the relative risk of a given vintage for the entire life of those loans. The concept of subprime, prime and superprime mortgage loans is broadly utilized in the mortgage industry and the related academic literature, but it is not consistently defined. Chomsisengphet and Pennington-Cross (2006) define subprime lending as a “segment of the mortgage market that expands the pool of credit to borrowers who, for a variety of reasons, would otherwise be denied credit”. For the purposes of this paper, we define subprime as less than 660 FICO, prime as 660–780 and superprime as 780 and above.
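The segment definition can be expressed as a small helper. This is an illustrative sketch only: the function name is hypothetical, and assigning a score of exactly 780 to superprime follows the "780 and above" wording, which leaves prime as 660 up to but not including 780.

```python
def risk_band(fico):
    """Assign the paper's risk segment from the FICO score at origination."""
    if fico < 660:
        return "subprime"
    if fico < 780:
        return "prime"
    return "superprime"
```

For example, `risk_band(700)` falls in the prime segment, while `risk_band(780)` is treated as superprime under the boundary assumption stated above.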

We observe that subprime loans have a smaller dynamic range than prime and superprime loans. Thus, subprime loans tend to be less sensitive to the economic cycle in terms of underwriting (credit quality versus vintage) and the environment functions versus calendar date. This is a well-established result. However, in terms of total numbers of delinquent loans, the subprime segment will see the most growth for risky loans.⁵

5 These findings regarding the performance of subprime versus prime loans over the life cycle are not specific to mortgages. Specifically, Canals-Cerdá and Kerr (2015) report similar findings in credit cards.

FIGURE 6 Relative credit risk by vintage.

[Figure: y-axis, relative change in log odds (–1.5 to 2.0); x-axis, vintage date (1999 to 2009); series: Superprime, Prime, Subprime.]

Results derived using the McDash Analytics residential mortgage servicing database.

The results in Figure 6 disagree somewhat with those obtained by Demyanyk and Hemert (2011), who also attempted to adjust for observed underwriting factors and a set of vintage dummies. Their result showed a monotonic increase in risk from 2001 through 2007. The disagreement between 2001 and 2003 could be due to a lower volume of data in their sample during that time period, but the consequence is that it can change the interpretation of the results. What they saw as a trend due to securitization looks here to be a cycle requiring a broader explanation.

4.1 Idiosyncratic odds results

The estimated credit risk function by vintage date from the APC algorithm captures both the known changes due to observable shifts in underwriting and possible unobserved effects for which we are searching. To distinguish between these two effects, we specify a loan-level probability model where the life cycle versus age by risk band and the environment function versus date by state are used as fixed offsets (see (3.2)). In addition to these inputs, we also include the typical scoring attributes listed in Table 1. In particular, we estimate models with and without quarterly vintage effects and separately for subprime, prime and superprime segments. (Tables of parameter estimates are available from the authors.)

Applying this method to predicting the probability of being sixty to eighty-nine DPD for the first-lien mortgage data provides the scoring results reported in Table 4.

TABLE 4 Output coefficients from the GLM analysis of mortgage delinquency.

Variable                  Coefficient    t-value
Intercept                     2.268        67.83
Jumbo loan                   −0.128       −16.57
Documentation:
  full                      Control
  low                         0.103        25.48
  no                         −0.030        −5.16
  unknown                     0.135        40.61
FICO at origination:
  up to 540                 Control
  540–580                    −0.188       −17.33
  580–620                    −0.435       −44.66
  620–660                    −0.807       −85.86
  660–700                    −1.373      −145.22
  700–740                    −1.956      −202.66
  740–780                    −2.671      −264.46
  780–820                    −3.380      −283.96
  820+                       −3.623       −52.71
Loan-to-value:
  0.00–0.75                 Control
  0.75–0.80                   0.157        40.09
  0.80–0.85                   0.221        46.80
  0.85–0.90                   0.247        42.01
  0.90–0.95                   0.262        42.53
  0.95–1.00                   0.305        46.79
  1.00–1.13                   0.285        36.81
Debt-to-income                0.007        73.35
Source channel:
  retail                    Control
  wholesale                   0.217        63.27
  correspondent               0.100        30.25
  transfer                    0.240        38.63
  other                       0.437        32.19
Occupancy:
  owner                     Control
  nonowner                   −0.109       −17.75
  other or unknown           −0.187       −17.84
PMI:
  no                        Control
  yes                         0.119        32.06
  unknown                     0.170        37.92
Term:
  0–120                     Control
  120–180                     0.144        10.39
  180–240                     0.407        27.38
  240–360                     0.593        44.05
  360+                        0.640        36.25
Purpose:
  purchase                  Control
  refinance                  −0.001        −0.41
  purpose U                  −0.462       −64.29
  purpose Z                   0.090         5.08

The model specification also includes quarterly vintage dummies that are not reported in this table. Results derived using the McDash Analytics residential mortgage servicing database.

The table provides the GLM output for the full sample for all parameters except the vintage effects (graphed later), where the life cycle function and environment function by state are included in the model as fixed offsets. The coefficients shown are in line with industry intuition.
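One way to read these coefficients, assuming the GLM uses a logit link so that each coefficient is a shift in log odds relative to the control category, is to exponentiate them into odds ratios. The sketch below uses a few values copied from Table 4; the dictionary and function names are hypothetical.

```python
import math

# Selected coefficients copied from Table 4 (change in log odds relative
# to the control category of each variable).
TABLE_4_COEFFICIENTS = {
    "FICO 660-700": -1.373,
    "FICO 740-780": -2.671,
    "LTV 0.95-1.00": 0.305,
    "Source channel: wholesale": 0.217,
}


def odds_ratio(coefficient):
    """Convert a log-odds coefficient into a multiplicative odds ratio."""
    return math.exp(coefficient)


for name, coefficient in TABLE_4_COEFFICIENTS.items():
    print(f"{name}: odds x {odds_ratio(coefficient):.3f} versus control")
```

Under this reading, the 740–780 FICO band has odds of delinquency of roughly 7% of those of the control band (up to 540), all else being equal.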

Since a binary outcome is being modeled, pseudo R-squared was used to measure goodness-of-fit.⁶

6 The following definition for pseudo R² was employed:

$$\text{pseudo } R^2 = 1 - \frac{\text{residual deviance}}{\text{null deviance}}.$$

FIGURE 7 APC vintage function versus origination score vintage effects.

[Figure: y-axis, change in log-odds of 60–89 DPD (–1.5 to 1.5); x-axis, vintage date (2000 to 2010); series: AVT vintage function, Origination score vintage fixed effects.]

Results derived using the McDash Analytics residential mortgage servicing database.

FIGURE 8 Relative credit risk by vintage after controlling for scoring attributes.

[Figure: y-axis, relative change in log odds of 60–89 DPD (–0.8 to 0.8); x-axis, vintage date (2000 Q2 to 2009 Q4); series: Superprime, Prime, Subprime.]

Results derived using the McDash Analytics residential mortgage servicing database.

FIGURE 9 Comparison of vintage fixed effects by reason for obtaining a mortgage.

[Figure: y-axis, change in log odds (–0.8 to 1.2); x-axis, origination date (2000 Q2 to 2009 Q4); series: Purchase, Refinance, Other.]

Results derived using the McDash Analytics residential mortgage servicing database.

For the regression in Table 4 without vintage fixed effects, pseudo R² = 0.097. When vintage fixed effects were included in the model (shown in Figure 8), pseudo R² = 0.114. Including vintage fixed effects significantly improved the model. The overall values of pseudo R-squared are typical when creating monthly panel models of an event with a likelihood of roughly 1%.
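The pseudo R-squared used here is one minus the ratio of the residual deviance to the null deviance. A minimal sketch of that calculation for ungrouped binary outcomes follows; the function names and the toy probabilities are hypothetical, and the null model is assumed to predict the overall event rate for every record.

```python
import math


def binomial_deviance(outcomes, probabilities):
    """Return -2 times the Bernoulli log-likelihood of the outcomes."""
    log_likelihood = 0.0
    for y, p in zip(outcomes, probabilities):
        log_likelihood += math.log(p) if y == 1 else math.log(1.0 - p)
    return -2.0 * log_likelihood


def pseudo_r2(residual_deviance, null_deviance):
    """Pseudo R-squared = 1 - residual deviance / null deviance."""
    return 1.0 - residual_deviance / null_deviance


# Toy data (hypothetical): the null model predicts the overall event rate
# for every record; the fitted model separates the records somewhat.
outcomes = [0, 0, 0, 1]
null_probs = [0.25] * 4
model_probs = [0.1, 0.1, 0.2, 0.6]
r2 = pseudo_r2(binomial_deviance(outcomes, model_probs),
               binomial_deviance(outcomes, null_probs))
```

A value of 0.097 therefore means that the fitted model reduces the deviance by about 9.7% relative to the null model.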

Even though we included all the available scoring factors, the fixed effect in vintage is still significant. When the APC decomposition is compared with having fixed effects in the scores, the major variation is still present (Figure 7). However, by including the scoring factors, the dynamic range for the vintage fixed effects is less pronounced than for the original credit risk function by vintage. In addition, the transition in 2009 is less dramatic. Both measures are normalized for life cycle and environment, which suggests only half of the variation in credit risk observed with APC is explainable by observable underwriting changes.

To test the result seen in Figure 7, the analysis was rerun segmented by score band, with entirely separate models built for each segment. Figure 8 shows the vintage fixed effect functions extracted from these models (see (3.2)). The results are nearly identical, with the exception of the most recent history where the superprime function improves even more than the others.

Compare Figure 8, which resulted from the model with scoring factors, with Figure 6, which resulted purely from the APC analysis. There is a smaller range of variation in Figure 8 than in Figure 6, but the vintage effects are much more aligned after adjustment for scoring factors. Thus, the inclusion of scoring factors does not eliminate the structure observed in Figure 6. Rather, the scoring factors clarify the unobserved vintage effects.

The analysis by risk band in Figures 7 and 8 included loan purpose as a factor in the overall score. Rather than just including a scalar, we also tested to see if each loan-purpose segment exhibited the same dynamism with vintage. Figure 9 shows the separate estimates for the vintage fixed-effect functions when segmented by loan purpose. We observe that “other” as a loan purpose segment is more risky across all vintages relative to purchase or refinance, but it is just a level shift equivalent to the scaling observed in the original model of Table 4. Thus, we conclude that the same credit risk cycle is present across loan purpose segments as well as risk bands.

No matter how we segment the data, we continue to observe that typical scoring factors do not capture all of the variation in credit risk by vintage. Vintage fixed effects (dummies) add significantly to the analysis and show that risk rose steadily from a low in early 2003 to a peak in 2007.

4.2 Comparison with the Senior Loan Officer Opinion Survey

Given the similarity between risk bands in Figure 8, we continue now with a single credit risk function for all mortgages derived the same way as before but without segmentation. When looking for possible ways to explain the variation in credit risk after adjusting for available observed factors, we consider the FRB’s SLOOS (Board of Governors of the Federal Reserve System 2014). This quarterly survey asked questions of senior loan officers of up to eighty large domestic banks and twenty-four US branches and agencies of foreign banks regarding loan origination practices for several loan types.

Before 2007, a single question was asked regarding mortgage underwriting: “Over the past three months, how have your bank’s credit standards for approving applications from individuals for mortgage loans to purchase homes changed?” After 2007, this was separated into three questions for subprime, prime and superprime mortgage origination. To create a continuous history, we computed the survey average after 2007.
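The splice can be sketched as follows, assuming an unweighted mean of the three segment-level responses in each quarter (the text does not say whether any weighting was applied). The function name and the numbers are hypothetical and are not SLOOS data.

```python
def continuous_series(single_question, split_questions):
    """Append the per-quarter mean of the split questions to the earlier series.

    single_question -- net percentages while one question was asked.
    split_questions -- list of equally long series, one per later question.
    """
    averaged = [sum(values) / len(values) for values in zip(*split_questions)]
    return list(single_question) + averaged


# Hypothetical survey readings: two quarters of the single question,
# followed by two quarters with three segment-level questions.
history = continuous_series(
    [5.0, 10.0],
    [[60.0, 40.0], [30.0, 20.0], [90.0, 60.0]],
)
```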

Figure 10 shows the history with SLOOS (dashed line) for loosening and tightening of underwriting standards (left y-axis). The solid line is the vintage fixed effect for first-lien, fixed-rate installment loans. These two lines should be anticorrelated. As underwriting standards are tightened, credit risk should decrease. In fact, a small positive correlation of 0.41 ± 0.38 is observed between these two measures.
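A correlation with an uncertainty band of this kind can be computed in several ways, and the paper does not state which was used. The sketch below shows one common choice, a Pearson correlation with an approximate interval from the Fisher z-transform; the function names and the quarterly values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fisher_interval(r, n, z_crit=1.96):
    """Approximate 95% interval for a correlation via the Fisher z-transform.

    Assumes independent observations, which quarterly series only
    approximate; requires n > 3 and |r| < 1.
    """
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical quarterly values, not the paper's data.
tightening = [10.0, 5.0, -5.0, -10.0, 0.0, 20.0, 40.0, 60.0]
vintage_effect = [-0.2, -0.3, 0.1, 0.3, 0.4, 0.5, 0.2, -0.1]
r = pearson(tightening, vintage_effect)
low, high = fisher_interval(r, len(tightening))
```

Because both series are autocorrelated, an interval computed this way is likely to be too narrow; it is shown only to make the idea of a correlation with an error band concrete.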

FIGURE 10 SLOOS reported underwriting standards versus first-lien vintage effects.

[Figure: left y-axis, tighter standards versus looser standards (–40 to 80); right y-axis, change in log odds of 60–89 DPD (–0.8 to 0.8); x-axis, origination date (Jan 2000 to Jan 2010); series: Net percentage of domestic respondents tightening standards for mortgage loans: all; Net percentage of domestic respondents tightening standards for mortgage loans: average; Score vintage fixed effects.]

Results derived using SLOOS and the McDash Analytics residential mortgage servicing database.

The same survey asks the same senior loan officers a related question about consumer demand for loans:

Apart from normal seasonal variation, how has demand from individuals for mortgages to purchase homes changed over the past three months? (Please consider only new originations as opposed to the refinancing of existing mortgages.)

Figure 11 compares this measure of consumer demand (left y-axis, dashed line) with the same measure of credit risk. Again, we computed the average demand index for data after 2007. Unlike the previous graph, this one shows significant anticorrelation of −0.69 ± 0.30. When consumer demand is high, credit risk is low, or when consumer demand is low, credit risk is high.

In 2006 and 2007, as shown in Figure 7, the separation between the total credit risk assumed by lenders (solid line) and the share of that credit coming from adverse selection (dashed line) was significant. Of the total credit risk, half would have been explainable from the observed underwriting metrics, but half was unobserved.

In 2008, this gap disappeared. Lenders apparently tightened their underwriting standards, a reading reinforced by the SLOOS (Figure 10). However, this tightening did not affect the share of risk coming from consumer adverse selection. Lenders were being more selective but were still selecting from an inherently risky pool of consumers, ie, risky in ways not observable from the usual loan application and bureau information.

FIGURE 11 SLOOS reported mortgage demand versus first-lien vintage effects.

[Figure: left y-axis, more demand versus less demand (–100 to 80); right y-axis, –0.8 to 0.8; x-axis, origination date (Jan 2000 to Jan 2010); series: Net percentage of domestic respondents reporting stronger demand for mortgage loans: all; Net percentage of domestic respondents reporting stronger demand for mortgage loans: average; Score vintage fixed effects.]

Results derived using SLOOS and the McDash Analytics residential mortgage servicing database.

Only in 2009, when lenders dramatically curtailed mortgage lending, did the risk drop dramatically. At the same time, mortgage demand recovered, returning the adverse selection measure to normal levels.

The irony of these graphs is that the same senior loan officers answered both questions, and therefore had all of the information shown here available to them, yet their expectations on credit risk do not align with portfolio realities.

4.3 Comparison with economic drivers

Because consumer demand changes significantly through time, we want to understand what might cause these changes. Figures 12 and 13 compare the SLOOS mortgage demand index with the change in thirty-year mortgage rates and the change in HPI. The interest rate story is clear. We found that the optimal relationship was to the change over a twenty-four-month horizon with a correlation of −0.56. The interpretation is that consumer demand rises when interest rates have experienced a significant decline over an extended period of time, and conversely for rising rates.

FIGURE 12 SLOOS mortgage demand index versus the change in thirty-year mortgage rates.

[Figure: y-axes, more demand versus less demand (–100 to 80) and change in thirty-year mortgage interest rate (–0.20 to 0.10); x-axis, origination date (1990 to 2014); series: SLOOS mortgage demand; Thirty-year mortgage interest rate, twenty-four month log ratio.]

Data source: SLOOS.

FIGURE 13 SLOOS mortgage demand index versus the change in HPI.

[Figure: y-axes, more demand versus less demand (–100 to 80) and change in HPI (–0.04 to 0.06); x-axis, origination date (1990 to 2014); series: SLOOS mortgage demand; FHFA HPI, twelve-month log ratio.]

Data source: SLOOS.

FIGURE 14 Change in log odds of default versus change in interest rate.

[Figure: y-axes, change in interest rate (–0.20 to 0.10) and change in log odds of 60–89 DPD (–0.8 to 1.0); x-axis, origination date (2000 to 2010); series: Score vintage fixed effects; Thirty-year mortgage interest rate, twenty-four month log ratio.]

Results derived using the McDash Analytics residential mortgage servicing database.

Figure 13 shows the relationship between mortgage demand and changes in the HPI. In a regression including changes in the thirty-year interest rate and in HPI, we find that both are significant, and there is a positive relationship between demand and HPI. However, we really only have a single event in HPI against which to model. The results would be more reliable if we could conduct the analysis by geographic region, but demand is only available as a national measure.
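A regression of this form can be sketched with ordinary least squares on two regressors. The paper does not give its exact specification, so the code below is only an illustration: the function names are hypothetical, the data are constructed rather than observed, and only point estimates are produced (no significance tests).

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot.
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def ols_two_regressors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * target for r, target in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical series, constructed so that demand falls when rates rise
# and rises with house prices.
rate_change = [-0.10, -0.05, 0.00, 0.05, 0.10, -0.15]
hpi_change = [0.02, 0.05, 0.01, -0.02, 0.03, 0.04]
demand = [10.0 - 200.0 * r + 300.0 * h for r, h in zip(rate_change, hpi_change)]
b0, b1, b2 = ols_two_regressors(demand, rate_change, hpi_change)
```

In this constructed example the fit recovers a negative coefficient on the rate change and a positive coefficient on the HPI change, which is the sign pattern described in the text.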

Overall, the relationship between demand and interest rates is stronger and more intuitive. In Figure 14, we compare the vintage fixed effects for sixty to eighty-nine DPD directly with the twenty-four-month change in the thirty-year mortgage interest rate without the intermediate measure of mortgage demand. Again, the relationship is clear.
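The figure legends describe the interest rate change as a twenty-four month log ratio. A minimal sketch of that transformation, assuming a monthly series and using hypothetical rates, is:

```python
import math


def log_ratio(series, lag):
    """Return log(series[t] / series[t - lag]) for every t >= lag."""
    return [math.log(series[t] / series[t - lag]) for t in range(lag, len(series))]


# Hypothetical monthly mortgage rates (percent) falling steadily from 8% to
# 6% over 24 months, then held flat for a further 12 months.
rates = [8.0 - 2.0 * month / 24 for month in range(25)] + [6.0] * 12
change_24m = log_ratio(rates, 24)
```

A negative value indicates that rates are lower than they were two years earlier, which is the condition the text associates with rising demand. The same helper with `lag=12` would give the twelve-month log ratio used for HPI.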

Our best interpretation of these results is that consumer risk appetite changes with economic conditions. Credit risk for a loan is a function of the economic conditions at the time the loan was originated as well as conditions later on, should they worsen during the life of the loan.

As an industry, we tend to assume that credit risk is driven primarily by underwriting. However, it would be more accurate to say that underwriting is the process of selecting the best borrowers among those who are interested in getting loans. Thus, the primary drivers of risk may actually be the conditions that change the consumers’ perspectives on their financial risks. Therefore, consumer risk appetite determines the pool of interested borrowers. Underwriting selects from those. When interest rates are falling, homes are more affordable, and naturally conservative consumers come to the market to borrow. When interest rates rise, demand from conservative consumers dries up, and we are left with those consumers who are riskier in ways not always visible to bureau scores and other typical underwriting factors.

From the perspective of understanding the housing bubble, this suggests that the financially conservative consumers withdrew from the market in 2005 just as the problems with poor underwriting were taking hold. The pool of interested borrowers had a high proportion of risky consumers, and lenders went deeply into that pool. A disaster was in the making.

5 CONCLUSION

Although many explanations have been offered for the US mortgage crisis, our research indicates that shifts in consumer risk appetite were a major contributing factor. In our approach, we used an APC model to capture trends in the population odds. The age and period functions were then included in a generalized linear model of delinquency, which also included all available scoring factors. The original cohort function was thereby replaced with the scoring factors and a series of fixed effects to capture any residual structure. Although we had normalized for product life cycles, macroeconomic conditions by state and all available scoring factors, the remaining vintage fixed effects were both significant and persistent through multiple segments.

The residual vintage fixed effects demonstrated a strong credit risk cycle, and correlations with external information suggest possible causes. Using the SLOOS, we found that self-reported changes in underwriting standards did not correlate with the vintage fixed effects. This is reasonable because those changes in underwriting might already be captured in the scoring factors incorporated into the model. Surprisingly, the changes in consumer mortgage demand reported by the SLOOS correlated strongly with the vintage fixed effects, suggesting that periods of high demand correspond to low-risk vintages, and periods of low demand correspond to high-risk vintages.

Further investigation of the SLOOS-reported changes in demand showed that both demand and the vintage fixed effects correlate strongly with long-term changes in interest rates. This suggests that declining interest rates drive increased demand from a broad spectrum of consumers, including the important low-risk borrowers. When interest rates are rising, the low-risk consumers no longer want mortgages, so the resulting vintages are lower in volume but much higher in risk.

Modern risk management relies heavily on statistical models. Often models are estimated over a short time horizon that does not cover a full cycle or a cycle with a sufficiently severe downturn period. Our paper emphasizes the importance of estimating models over a full cycle whenever possible. It is also vital to pay special attention to the credit cycle when conducting model validation and for the analysis of model risk in particular. Regulatory guidance on model risk from the Board of Governors of the Federal Reserve System and Office of the Comptroller of the Currency (2011) highlights the increased awareness of regulatory agencies on this subject. Further, the Basel II framework and Comprehensive Capital Analysis and Review framework emphasize the use of models for effective supervision and surveillance. Our analysis stresses the importance of accounting for the credit cycle as an important element of model development, implementation, validation and control. It is of particular importance for bank supervisors to improve their understanding of the credit cycle as a catalyst of credit bubbles and its effects on the procyclicality of capital.

DECLARATION OF INTEREST

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper. Any remaining errors or omissions are their own. The views expressed in the paper are those of the authors and do not necessarily reflect the views of the Federal Reserve Bank of Philadelphia or the Federal Reserve System. An early draft of this paper, including the appendix, is available free of charge at www.philadelphiafed.org/research-and-data/publications/working-papers.

ACKNOWLEDGEMENTS

We thank Sharon Tang for outstanding research support and Amy Sill for outstanding logistic support. We are particularly grateful to William W. Lang for his support and assistance on this project. We also gratefully acknowledge the assistance of the Editorial Services team at the Federal Reserve Bank of Philadelphia.

REFERENCES

Adam, K., Kuang, P., and Marcet, A. (2011). House price booms and the current account. Working Paper 17224, National Bureau of Economic Research (https://doi.org/10.3386/w17224).

Akerlof, G. (1970). The market for “lemons”: quality uncertainty and the market mechanism. Quarterly Journal of Economics 84(3), 488–500 (https://doi.org/10.2307/1879431).

Ausubel, L. (1998). Adverse selection in the credit card market. Unpublished Manuscript, University of Maryland, College Park, MD.

Bar-Isaac, H., Jewitt, I., and Leaver, C. (2007). Information and human capital management. Economics Series Working Paper 367, Department of Economics, University of Oxford (https://doi.org/10.2139/ssrn.1026300).

Bergin, P. (2011). Asset price booms and current account deficits. Economic Letter 2011-37, December 5, Federal Reserve Bank of San Francisco.

Berlin, M. (2009). Bank credit standards. Federal Reserve Bank of Philadelphia Business Review Q2, 1–10.

Bernanke, B. S. (2010). Monetary policy and the housing bubble. Speech, January 3, Annual Meeting of the American Economic Association, Atlanta, GA. URL: https://bit.ly/2GKotYi.

Board of Governors of the Federal Reserve System (2014). Senior Loan Officer Opinion Survey on bank lending practices. URL: www.federalreserve.gov/BoardDocs/snloansurvey.

Board of Governors of the Federal Reserve System and Office of the Comptroller of the Currency (2011). Supervisory guidance on model risk management. OCC Bulletin 2011-12 and FRB SR Letter 11-7. URL: www.federalreserve.gov/bankinforeg/srletters/sr1107a1.pdf.

Breeden, J. L. (2011). Macroeconomic adverse selection: how consumer demand drives credit quality. In Proceedings of the Credit Scoring and Credit Control XII Conference, Edinburgh, Scotland. University of Edinburgh Business School.

Breeden, J. L., and Thomas, L. C. (2016). Solutions to specification errors in stress testing models. Journal of the Operational Research Society 67(6), 830–840 (https://doi.org/10.1057/jors.2015.97).

Breeden, J. L., Thomas, L. C., and McDonald III, J. (2008). Stress testing retail loan portfolios with dual-time dynamics. The Journal of Risk Model Validation 2(2), 43–62 (https://doi.org/10.21314/JRMV.2008.033).

Calem, P., Cannon, M., and Nakamura, L. I. (2011). Credit cycle and adverse selection effects in consumer credit markets: evidence from the HELOC market. Working Paper 11-13, Federal Reserve Bank of Philadelphia.

Canals-Cerdá, J., and Kerr, S. (2015). Forecasting credit card portfolio losses in the Great Recession: a study in model risk. The Journal of Credit Risk 11(1), 29–57 (https://doi.org/10.21314/JCR.2015.187).

Chomsisengphet, S., and Pennington-Cross, A. (2006). The evolution of the subprime mortgage market. Federal Reserve Bank of St. Louis Review 88(1), 31–56 (https://doi.org/10.20955/r.88.31-56).

Dell’Ariccia, G., Igan, D., and Laeven, L. (2012). Credit booms and lending standards: evidence from the subprime mortgage market. Journal of Money, Credit and Banking 44(2), 367–384 (https://doi.org/10.1111/j.1538-4616.2011.00491.x).

Demyanyk, Y., and Hemert, O. V. (2011). Understanding the subprime mortgage crisis. Review of Financial Studies 24(6), 1848–1880 (https://doi.org/10.1093/rfs/hhp033).

Elul, R. (2016). Securitization and mortgage default. Journal of Financial Services Research 49(2), 281–309 (https://doi.org/10.1007/s10693-015-0220-3).

Financial Crisis Inquiry Commission (2011). The Financial Crisis Inquiry Report: Final Report of the National Commission on the Causes of the Financial and Economic Crisis in the United States, January 25. Government Printing Office, Washington, DC.

Foote, C., Gerardi, K. S., and Willen, P. S. (2012). Why did so many people make so manyex post bad decisions? The causes of the foreclosure crisis. Public Policy DiscussionPaper 12-2, Federal Reserve Bank of Boston (https://doi.org/10.3386/w18082).

Gerardi, K., Lehnert, A., Sherlund, S. M., and Willen, P. (2008). Making sense of thesubprime crisis. Brookings Papers on Economic Activity, Fall, 69–145 (https://doi.org/10.2139/ssrn.1341853).

Gete, P. (2014). Housing markets and current account dynamics. Unpublished Manuscript,Georgetown University. URL: http://faculty.georgetown.edu/pg252/H&CAbyGete.pdf.

Glenn, N. D. (2005). Cohort Analysis, 2nd edn. Sage, London (https://doi.org/10.4135/9781412983662).

Haughwout, A., Lee, D., Tracy, J., and van der Klaauw, W. (2011). Real estate investors,the leverage cycle, and the housing market crisis. Staff Report 514, Federal ReserveBank of New York (https://doi.org/10.2139/ssrn.1926858).

In’t Veld, J., Kollmann, R., Pataracchia, B., and Ratto, M. (2014). International capital flows and the boom–bust cycle in Spain. Working Paper 181, Globalization and Monetary Policy Institute, Federal Reserve Bank of Dallas (https://doi.org/10.24149/gwp181).

Keys, B. J., Mukherjee, T., Seru, A., and Vig, V. (2010). Did securitization lead to lax screening? Evidence from subprime loans. Quarterly Journal of Economics 125(1), 307–362 (https://doi.org/10.1162/qjec.2010.125.1.307).

Levitin, A. J., and Wachter, S. M. (2013). The commercial real estate bubble. Harvard Business Law Review 3, 83–118.

Levitin, A. J., Pavlov, A. D., and Wachter, S. M. (2009). Securitization: cause or remedy of the financial crisis? Research Paper 09-31, Institute for Law and Economics, University of Pennsylvania Law School.

MacGee, J. (2010). Not here? Housing market policy and the risk of a housing bust. E-brief, August 31, C. D. Howe Institute, Toronto.

Mason, W. M., and Fienberg, S. (1985). Cohort Analysis in Social Research: Beyond the Identification Problem. Springer (https://doi.org/10.1007/978-1-4613-8536-3).

Nadauld, T. D., and Sherlund, S. M. (2013). The impact of securitization on the expansion of subprime credit. Journal of Financial Economics 107(2), 454–476 (https://doi.org/10.1016/j.jfineco.2012.09.002).

Palmer, C. (2014). Why did so many subprime borrowers default during the crisis: loose credit or plummeting prices? Working Paper, University of California, Berkeley, CA.

Rothschild, M., and Stiglitz, J. (1976). Equilibrium in competitive insurance markets: an essay on the economics of imperfect information. Quarterly Journal of Economics 90(4), 629–649 (https://doi.org/10.2307/1885326).

Ruckes, M. (2004). Bank competition and credit standards. Review of Financial Studies 17(4), 1073–1102 (https://doi.org/10.1093/rfs/hhh011).

Schmid, V. J., and Held, L. (2007). Bayesian age–period–cohort modeling and prediction: BAMP. Journal of Statistical Software 21(8), 1–15 (https://doi.org/10.18637/jss.v021.i08).

Stiglitz, J., and Weiss, A. (1981). Credit rationing in markets with imperfect information. American Economic Review 71(3), 393–410.

Taylor, J. B. (2014). The role of policy in the Great Recession and the weak recovery. American Economic Review 104(5), 61–66 (https://doi.org/10.1257/aer.104.5.61).

Thomas, L. C. (2009). Consumer Credit Models: Pricing, Profit, and Portfolios. Oxford University Press (https://doi.org/10.1093/acprof:oso/9780199232130.001.1).
