Credit Risk Prediction for Peer-To-Peer Lending Platforms: An Explainable Machine Learning Approach

Authors

  • Chong Pei Swee, Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia
  • Farid Meziane, School of Computing and Engineering, University of Derby, UK, https://orcid.org/0000-0001-9811-6914
  • Jane Labadin, Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia, https://orcid.org/0000-0003-0508-4277

DOI:

https://doi.org/10.33736/jcsi.4761.2022

Keywords:

Credit Risk Evaluation; Peer-to-Peer Lending; Logistic Regression; Explainable Machine Learning; Explainable AI

Abstract

Small and medium enterprises struggle to obtain start-up funding because of the strict rules and conditions set by banks and financial institutions. This difficulty has driven the growing popularity of online peer-to-peer lending platforms, which offer an easier way to obtain a loan because their rules are less rigid. However, this flexibility comes with a high probability of default on loans funded to high-risk start-ups. An efficient model for evaluating the credit risk of borrowers on peer-to-peer lending platforms is therefore important, both to encourage investors to fund loans and to justify the rejection of unsuccessful applications, which satisfies financial regulators and increases transparency. This paper presents a supervised machine learning model based on logistic regression that addresses this issue by predicting the probability of default of a loan funded to a borrower through a peer-to-peer lending platform. In addition, factors that affect the credit levels of borrowers are identified and discussed. The research shows that the most important features affecting the probability of default are the debt-to-income ratio, the number of mortgage accounts, and the Fair, Isaac and Company (FICO) score.
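
As a rough illustration of how a logistic regression model turns borrower features into a probability of default, the following minimal sketch may help. It is not the authors' implementation: the feature names (dti, mort_acc, fico), the coefficient values, the signs of the effects, and the intercept are all hypothetical and chosen only to show the mechanics. The features are assumed to be standardized. Because each coefficient maps directly to an odds ratio, a model of this kind can be read feature by feature, which is one reason logistic regression is often regarded as explainable.

```python
import math

# Hypothetical coefficients for illustration only; they are NOT taken from the paper.
# Features are assumed to be standardized (zero mean, unit variance).
COEFFICIENTS = {
    "dti": 0.40,        # debt-to-income ratio: assumed here to raise risk
    "mort_acc": -0.15,  # number of mortgage accounts: assumed here to lower risk
    "fico": -0.50,      # FICO score: assumed here to lower risk
}
INTERCEPT = -1.5  # hypothetical baseline log-odds of default


def probability_of_default(features):
    """Return the logistic-regression estimate P(default | features)."""
    log_odds = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio(name):
    """Multiplicative change in the odds of default per one-unit increase in a feature."""
    return math.exp(COEFFICIENTS[name])


# Two made-up borrowers, expressed in standardized units.
average_borrower = {"dti": 0.0, "mort_acc": 0.0, "fico": 0.0}
stretched_borrower = {"dti": 2.0, "mort_acc": 0.0, "fico": -1.0}

pd_average = probability_of_default(average_borrower)
pd_stretched = probability_of_default(stretched_borrower)

print(f"PD, average borrower:   {pd_average:.3f}")
print(f"PD, stretched borrower: {pd_stretched:.3f}")
```

Under these assumed coefficients, the borrower with a higher debt-to-income ratio and a lower FICO score receives a higher predicted probability of default than the average borrower. In practice the coefficients would be estimated from loan data rather than set by hand.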

Published

2022-09-19

How to Cite

Pei Swee, C., Meziane, F., & Labadin, J. (2022). Credit Risk Prediction for Peer-To-Peer Lending Platforms: An Explainable Machine Learning Approach. Journal of Computing and Social Informatics, 1(2), 1–16. https://doi.org/10.33736/jcsi.4761.2022