Loan Default Prediction with Machine Learning
Himberg, Tomi (2021)
This publication is subject to copyright. The work may be read and printed for personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe2021120359162
Abstract
Granting credit is one of the core businesses in banking, and the 2008 financial crisis highlighted the importance of credit risk management. An increased number of loan defaults was one of the reasons behind the crisis, which led to tighter regulation of loan granting. Predicting loan defaults has become important as banks try to comply with laws and regulations, grant credit to qualified customers, limit credit granted to unqualified customers, and make their application processes efficient. This research studies credit risk in banking, discusses the banking regulations that affect loan granting, and presents how machine learning is utilized in lending. In addition, the literature review explains machine learning and the steps involved in building machine learning models. The empirical study is conducted with a loan data set retrieved from Kaggle.com. Predictions are made with four machine learning algorithms: logistic regression, classification tree, random forest, and extreme gradient boosting (XGBoost). Predictive power is evaluated based on sensitivity, specificity, and the area under the ROC curve. The research questions are answered based on the literature review and the results of the empirical study. The results suggest that lenders have various reasons to utilize machine learning in their loan application processes and that machine learning enables the majority of qualified and unqualified applicants to be classified correctly.
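The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the thesis's own code: the function names, the 0.5 decision threshold, the coding of a default as label 1, and the toy labels and scores are all assumptions made for illustration. Sensitivity is the share of actual defaults that are flagged, specificity is the share of repaid loans that are correctly cleared, and the area under the ROC curve is computed here as the probability that a randomly chosen defaulting loan receives a higher score than a randomly chosen repaid loan.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity). Assumed coding: 1 = default, 0 = repaid."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defaults correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defaults missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # repaid loans correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # repaid loans wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison; tied scores count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted default probabilities, purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

# Hypothetical decision threshold of 0.5 turns scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} auc={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the ROC curve summarizes ranking quality across all thresholds, which is one reason the three are commonly reported together. The pairwise AUC computation is quadratic in the number of loans and is only meant to make the definition concrete.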