Credit Risk Prediction for Peer-To-Peer Lending Platforms: An Explainable Machine Learning Approach
DOI:
https://doi.org/10.33736/jcsi.4761.2022
Keywords:
Credit Risk Evaluation; Peer-to-Peer Lending; Logistic Regression; Explainable Machine Learning; Explainable AI
Abstract
Small and medium enterprises often struggle to obtain start-up funding because of the strict rules and conditions set by banks and financial institutions. This difficulty has driven the growing popularity of online peer-to-peer lending platforms, which offer an easier way to obtain a loan because their rules are less rigid. However, this flexibility in loan funding comes with a high probability of default on loans funded to high-risk start-ups. An efficient model for evaluating the credit risk of borrowers on peer-to-peer lending platforms is therefore important, both to encourage investors to fund loans and to justify the rejection of unsuccessful applications, which satisfies financial regulators and increases transparency. This paper presents a supervised machine learning model based on logistic regression that predicts the probability of default of a loan funded to a borrower through a peer-to-peer lending platform. In addition, factors that affect the credit levels of borrowers are identified and discussed. The results show that the features with the greatest effect on the probability of default are the debt-to-income ratio, the number of mortgage accounts, and the Fair, Isaac and Company (FICO) score.
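As a rough illustration of the kind of model the abstract describes, the sketch below fits a logistic regression to estimate probability of default from three features. Everything specific in it is an assumption made for illustration: the column names `dti`, `mort_acc`, and `fico`, the synthetic data (randomly generated, not the Lending Club data), and the plain gradient-descent solver, which stands in for whatever fitting procedure the paper actually used.

```python
import math
import random

# Hypothetical feature names; the paper's actual column names may differ.
FEATURES = ["dti", "mort_acc", "fico"]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_loans(n, seed=0):
    # Synthetic borrowers (an assumption, not real loan data): default risk
    # rises with debt-to-income and falls with mortgage accounts and FICO.
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        dti = rng.uniform(0, 40)
        mort_acc = rng.randint(0, 5)
        fico = rng.uniform(600, 820)
        logit = -1.0 + 0.08 * (dti - 20) - 0.3 * (mort_acc - 2) - 0.02 * (fico - 700)
        labels.append(1 if rng.random() < sigmoid(logit) else 0)
        rows.append([dti, mort_acc, fico])
    return rows, labels


def standardize(rows):
    # Zero mean and unit variance per column, so coefficient sizes are comparable.
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds


def fit_logistic(X, y, lr=0.5, epochs=300):
    # Batch gradient descent on the average log-loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w, b


def predict_pd(raw_row, w, b, means, stds):
    # Probability of default for one borrower given unscaled feature values.
    x = [(v - m) / s for v, m, s in zip(raw_row, means, stds)]
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


rows, labels = make_synthetic_loans(500)
X, means, stds = standardize(rows)
w, b = fit_logistic(X, labels)

# Rank features by the absolute size of their standardized coefficient.
for name, coef in sorted(zip(FEATURES, w), key=lambda p: -abs(p[1])):
    print(f"{name:9s} {coef:+.3f}")

risky = predict_pd([35.0, 0, 620.0], w, b, means, stds)
safe = predict_pd([5.0, 4, 800.0], w, b, means, stds)
print(f"PD risky borrower: {risky:.2f}, PD safer borrower: {safe:.2f}")
```

Because the features are standardized before fitting, the sign of each coefficient gives the direction of its effect on the log-odds of default and its magnitude gives a rough ranking of influence. This is one simple way to read a logistic regression as an explainable model; it is not necessarily the explanation method used in the paper.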
License
Copyright Transfer Statement for Journal
1) In signing this statement, the author(s) grant UNIMAS Publisher an exclusive license to publish their original research papers. The author(s) also grant UNIMAS Publisher permission to reproduce, recreate, translate, extract or summarise, and to distribute and display in any forms, formats, and media. The author(s) can reuse their papers in their future printed work without first requiring permission from UNIMAS Publisher, provided that the author(s) acknowledge and reference publication in the Journal.
2) For open access articles, the author(s) agree that their articles published under UNIMAS Publisher are distributed under the terms of the CC-BY-NC-SA (Creative Commons Attribution-Non Commercial-Share Alike 4.0 International License) which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original work of the author(s) is properly cited.
3) For subscription articles, the author(s) agree that UNIMAS Publisher holds copyright, or an exclusive license to publish. Readers or users may view, download, print, and copy the content, for academic purposes, subject to the following conditions of use: (a) any reuse of materials is subject to permission from UNIMAS Publisher; (b) archived materials may only be used for academic research; (c) archived materials may not be used for commercial purposes, which include but are not limited to monetary compensation by means of sale, resale, license, transfer of copyright, loan, etc.; and (d) archived materials may not be re-published in any part, either in print or online.
4) The author(s) is/are responsible for ensuring that the submitted work is original and does not infringe any existing copyright, trademark, patent, statutory right, or proprietary right of others. The corresponding author(s) has (have) obtained permission from all co-authors prior to submission to the journal. Upon submission of the manuscript, the author(s) agree that no similar work has been or will be submitted or published elsewhere in any language. If the submitted manuscript includes materials from others, the authors have obtained permission from the copyright owners.
5) In signing this statement, the author(s) declare(s) that the research they have conducted complies with the current laws of the respective country and the UNIMAS Journal Publication Ethics Policy. Any experimentation or research involving humans or the use of animal samples must obtain approval from the Human or Animal Ethics Committee of the respective institution. The author(s) agree and understand that UNIMAS Publisher is not responsible for any compensation claims or failure caused by the author(s) in fulfilling the above-mentioned requirements. The author(s) must accept the responsibility for releasing their materials upon request by the Chief Editor or UNIMAS Publisher.
6) The author(s) should have participated sufficiently in the work and ensured the appropriateness of the content of the article. The author(s) should also agree that he or she has no commercial attachments (e.g. patent or license arrangement, equity interest, consultancies, etc.) that might pose any conflict of interest with the submitted manuscript. The author(s) also agree to make any relevant materials and data available upon request by the editor or UNIMAS Publisher.