Research Article

Research on Credit Risk Prediction under Unbalanced Dataset Based on Ensemble Learning

Algorithm 3

RAdaBoost-FLLGBM algorithm.
Input: data set D = {(x1, y1), (x2, y2), (x3, y3), …, (xn, yn)}, number of iterations T, feature dimension of the random subspace;
Output: final integrated classifier $H(x)$
(1) Initialize sample weights $w_1(i) = 1/n$, i = 1, 2, …, n;
(2)  For t = 1 to T do
(3)   Use the random subspace method to generate a feature subspace St of the specified dimension from data set D
(4)   Train the base classifier FLLGBM with the current sample weights on feature subspace St to obtain Ht
(5)   Calculate the training error of the base classifier Ht: $\varepsilon_t = \sum_{i=1}^{n} w_t(i)\, I(H_t(x_i) \neq y_i)$, that is, εt is the sum of the weights of the misclassified samples
(6)   if εt > 0.5
    then break;
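(7)   Calculate the base classifier coefficient $\alpha_t = \frac{1}{2}\ln\frac{1 - \varepsilon_t}{\varepsilon_t}$ and update the training sample weights $w_{t+1}(i) = \frac{w_t(i)\exp(-\alpha_t y_i H_t(x_i))}{Z_t}$, i = 1, 2, …, n, where $Z_t$ is the normalization coefficient.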
(8) Fuse the output results of each classifier and output $H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t H_t(x)\right)$
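
The steps of Algorithm 3 can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the FLLGBM base learner is replaced by a weighted decision stump so that the example runs without external libraries, the labels are assumed to lie in {-1, +1}, and the standard AdaBoost formulas are assumed for the error, the coefficient, and the weight update. The names `train_stump`, `radaboost`, `d` (subspace dimension), and `seed` are hypothetical, and the clamp on the error is an added numerical guard rather than part of the algorithm.

```python
import math
import random


def train_stump(X, y, w, features):
    """Weighted decision stump restricted to `features` (stand-in for FLLGBM)."""
    best = None  # (weighted error, feature index, threshold, polarity)
    for j in features:
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                # Predict `pol` when row[j] >= thr, otherwise `-pol`.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[j] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    _, j, thr, pol = best
    return lambda row: pol if row[j] >= thr else -pol


def radaboost(X, y, T, d, seed=0):
    """AdaBoost over random feature subspaces; labels assumed in {-1, +1}."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    w = [1.0 / n] * n                              # step 1: uniform weights
    ensemble = []                                  # pairs (alpha_t, H_t)
    for _ in range(T):                             # step 2
        S_t = rng.sample(range(p), d)              # step 3: random subspace
        H_t = train_stump(X, y, w, S_t)            # step 4: weighted base learner
        preds = [H_t(row) for row in X]
        # Step 5: weighted training error = total weight of misclassified samples.
        eps = sum(wi for wi, pi, yi in zip(w, preds, y) if pi != yi)
        if eps > 0.5:                              # step 6: stop on a poor learner
            break
        eps = min(max(eps, 1e-10), 1 - 1e-10)      # guard against log(0); not in the source
        alpha = 0.5 * math.log((1 - eps) / eps)    # step 7: classifier coefficient
        w = [wi * math.exp(-alpha * yi * pi) for wi, yi, pi in zip(w, y, preds)]
        Z = sum(w)                                 # normalization coefficient
        w = [wi / Z for wi in w]
        ensemble.append((alpha, H_t))

    def H(row):                                    # step 8: weighted vote
        score = sum(a * h(row) for a, h in ensemble)
        return 1 if score >= 0 else -1

    return H


if __name__ == "__main__":
    X = [[0.1, 5, 1], [0.2, 3, 0], [0.8, 4, 1], [0.9, 2, 0]]
    y = [-1, -1, 1, 1]
    H = radaboost(X, y, T=5, d=3)
    print([H(row) for row in X])
```

In this sketch the subspace is drawn without replacement from the feature indices, and misclassified samples gain weight after each round because the exponent in the update is positive when the prediction and the label disagree.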