loan-default-prediction-model-using-python
This project builds a loan default prediction model using synthetic data inspired by SBI’s FY2025 Annual Report.
Data Source
- SBI Annual Report FY2025
- Key Credit Risk Metrics:
  - Gross NPA Ratio: 1.82%
  - Net NPA Ratio: 0.47%
  - Sector-wise Advances & NPAs (Agriculture, Industry, Services, Personal Loans)
Methodology
- Generated a synthetic loan-level dataset using SBI’s sector-wise NPA ratios.
- Simulated borrower attributes (income, credit score, loan amount, tenure).
- Built a Logistic Regression model to classify defaults.
- Evaluated the model using ROC-AUC and classification metrics.
Tech Stack
- Python (pandas, numpy, scikit-learn)
- Jupyter Notebook (EDA + Modelling)
Future Enhancements
- Try advanced ML models (Random Forest, XGBoost).
- Link model outputs to Expected Credit Loss (ECL) and Basel capital requirements.
- Extend with open banking datasets.
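Illustrative Sketch
The steps listed under Methodology could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The sector default rates in `SECTOR_DEFAULT_RATE` are placeholder values and are not figures from the SBI report. The function name `make_synthetic_loans`, the column names, the borrower distributions, and the way credit score and leverage shift default risk are likewise assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# PLACEHOLDER sector default rates: illustrative assumptions only,
# NOT taken from the SBI annual report. Swap in the reported ratios.
SECTOR_DEFAULT_RATE = {
    "Agriculture": 0.08,
    "Industry": 0.03,
    "Services": 0.02,
    "Personal Loans": 0.01,
}


def make_synthetic_loans(n_rows=20_000, seed=42):
    """Simulate loan-level rows with a binary `default` label (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "sector": rng.choice(list(SECTOR_DEFAULT_RATE), size=n_rows),
        # Assumed distributions for the borrower attributes.
        "income": rng.lognormal(mean=13.0, sigma=0.5, size=n_rows),
        "credit_score": rng.normal(700, 60, size=n_rows).clip(300, 900),
        "loan_amount": rng.lognormal(mean=13.5, sigma=0.7, size=n_rows),
        "tenure_months": rng.integers(12, 241, size=n_rows),
    })

    # Start from the sector base rate on the log-odds scale.
    base = df["sector"].map(SECTOR_DEFAULT_RATE).to_numpy()
    base_logit = np.log(base / (1 - base))

    # Assumed effects: lower credit score and higher loan-to-income raise risk.
    score_effect = -(df["credit_score"].to_numpy() - 700) / 60
    leverage = np.log(df["loan_amount"] / df["income"]).to_numpy()
    leverage_effect = 0.5 * (leverage - leverage.mean())

    p_default = 1 / (1 + np.exp(-(base_logit + score_effect + leverage_effect)))
    df["default"] = rng.binomial(1, p_default)
    return df


df = make_synthetic_loans()

# One-hot encode the sector and log-transform the skewed monetary columns.
X = pd.get_dummies(df.drop(columns="default"), columns=["sector"],
                   drop_first=True, dtype=float)
X[["income", "loan_amount"]] = np.log(X[["income", "loan_amount"]])
y = df["default"]

# Stratify so the rare default class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# class_weight="balanced" compensates for the low share of defaults.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"ROC-AUC: {auc:.3f}")
print(classification_report(y_test, model.predict(X_test), digits=3))
```

Because defaults are rare, accuracy alone says little here. ROC-AUC and the per-class precision and recall in the classification report give a better picture of how well the model separates defaulters from non-defaulters.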