PREDICTIVE MODELING OF CREDIT SCORES FOR UNBANKED CLIENTS: AN APPLICATION OF CATBOOST
-
DOI: https://doi.org/10.22533/at.ed.128112517036
-
Keywords: Credit score, CatBoost, categorical variables, default risk, SHAP values
-
Abstract: This work presents the design and evaluation of a credit admission scoring model for “No Hit” clients, that is, individuals without a formal credit history in the financial system. The model predicts credit risk by classifying applicants as “good” or “bad” according to the probability of a payment default exceeding 60 days within 12 months of credit issuance. It was developed with the CatBoost algorithm, implemented in Python, after extensive data preprocessing that included imputation of missing values and treatment of categorical variables. The model was trained on 54,337 records and validated on 17,940. The results show competitive performance as measured by the Gini coefficient (45.14% in training and 42.84% in validation), AUC, and KS. In addition, the efficiency table shows an inverse relationship between the score and the default rate. SHAP values provide model interpretability, allowing transparent identification of the most influential variables. Overall, this approach contributes to responsible financial inclusion by enabling credit risk assessment in traditionally underserved populations.
- Carlos Alberto Peña Miranda
- Jesús Adalberto Zelaya Contreras
- Elizabeth Cosi Cruz
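
The abstract reports Gini, AUC, and KS without defining them. The sketch below shows one common way to compute all three from binary labels and predicted default probabilities. It assumes the usual credit-scoring conventions, which the abstract does not confirm: Gini = 2·AUC − 1, and KS as the maximum gap between the cumulative score distributions of good and bad clients. The function name `auc_gini_ks` and the toy data are hypothetical and are not taken from the authors' code or dataset. Under the same assumption, the reported validation Gini of 42.84% would correspond to an AUC of about 0.714.

```python
from typing import Sequence


def auc_gini_ks(y_true: Sequence[int], p_bad: Sequence[float]) -> tuple[float, float, float]:
    """Return (AUC, Gini, KS) for labels (1 = bad, 0 = good) and predicted P(bad)."""
    n_bad = sum(1 for y in y_true if y == 1)
    n_good = len(y_true) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("both classes must be present")

    # Walk through the predictions in ascending order, one tied group at a time.
    pairs = sorted(zip(p_bad, y_true))
    cum_bad = cum_good = 0
    concordant = 0.0  # (bad, good) pairs where the bad client has the higher P(bad)
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        bad_here = good_here = 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                bad_here += 1
            else:
                good_here += 1
            j += 1
        # Each bad here outranks every good seen so far; ties count as half.
        concordant += bad_here * (cum_good + 0.5 * good_here)
        cum_bad += bad_here
        cum_good += good_here
        # KS: largest gap between the two cumulative distributions.
        ks = max(ks, abs(cum_good / n_good - cum_bad / n_bad))
        i = j

    auc = concordant / (n_bad * n_good)
    gini = 2.0 * auc - 1.0
    return auc, gini, ks


# Toy data, invented purely for illustration.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
auc, gini, ks = auc_gini_ks(labels, probs)
print(f"AUC={auc:.2f} Gini={gini:.2f} KS={ks:.2f}")  # AUC=0.75 Gini=0.50 KS=0.50
```

Computing the same metrics on both the training and validation sets, as the abstract does for Gini, gives a quick check on overfitting: a large drop from training to validation suggests the model does not generalize well.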