Research Article

Leukemia can be Effectively Early Predicted in Routine Physical Examination with the Assistance of Machine Learning Models

Table 3

Accuracy scores of the models for different scenarios, train-set/test-set ratios, and regularization strengths. Strain and Stest are the accuracy scores of a model on the training data subset and the test data subset, respectively, computed with the score() function of sklearn; Sauc is the area under the ROC curve, computed with the roc_auc_score() function of sklearn; Rtrain/total is the ratio of training data to total data. Reported for the random forest model and for the XGBoost model (lambda is the overfitting suppression factor).

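| lambda | Rtrain/total, Scenario | 0.75, A | 0.75, B | 0.75, C | 0.75, D | 0.75, E | 0.5, A | 0.5, B | 0.5, C | 0.5, D | 0.5, E | 0.25, A | 0.25, B | 0.25, C | 0.25, D | 0.25, E |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | Strain | 0.9985 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.9995 | 1 | 1 |
| 0.1 | Stest | 0.9917 | 0.9878 | 0.9844 | 0.9818 | 0.9873 | 0.992 | 0.9898 | 0.9855 | 0.9863 | 0.9899 | 0.9956 | 0.9946 | 0.9899 | 0.9881 | 0.9866 |
| 0.1 | Sauc | 0.9806 | 0.9814 | 0.9778 | 0.9801 | 0.9851 | 0.9753 | 0.9823 | 0.9737 | 0.9829 | 0.9869 | 0.9886 | 0.987 | 0.9836 | 0.9834 | 0.9812 |
| 1 | Strain | 0.9985 | 0.9986 | 0.9987 | 1 | 1 | 0.9992 | 0.9993 | 0.9987 | 0.9988 | 0.9988 | 0.9995 | 0.9991 | 0.9983 | 1 | 0.9992 |
| 1 | Stest | 0.9922 | 0.9869 | 0.9832 | 0.9822 | 0.9873 | 0.992 | 0.9892 | 0.9823 | 0.9851 | 0.9872 | 0.9956 | 0.9892 | 0.9886 | 0.9845 | 0.9811 |
| 1 | Sauc | 0.9823 | 0.9792 | 0.9764 | 0.98 | 0.9851 | 0.9753 | 0.9806 | 0.9708 | 0.982 | 0.9832 | 0.9886 | 0.9789 | 0.9811 | 0.9763 | 0.9738 |
| 10 | Strain | 0.9985 | 0.9946 | 0.9949 | 0.994 | 0.9955 | 0.9985 | 0.9959 | 0.9949 | 0.9952 | 0.9961 | 0.999 | 0.995 | 0.9953 | 0.996 | 0.9974 |
| 10 | Stest | 0.9922 | 0.9856 | 0.9828 | 0.9814 | 0.9836 | 0.992 | 0.9865 | 0.9823 | 0.9816 | 0.9861 | 0.9942 | 0.9865 | 0.9886 | 0.9834 | 0.9766 |
| 10 | Sauc | 0.9794 | 0.9724 | 0.9735 | 0.9772 | 0.9802 | 0.9775 | 0.9714 | 0.97 | 0.9755 | 0.9824 | 0.9833 | 0.9724 | 0.9827 | 0.9743 | 0.9689 |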
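
The three scores named in the caption can be illustrated with a minimal sketch. It is not the authors' pipeline. The data are synthetic placeholders from make_classification, the helper name evaluate_split is hypothetical, and the model is a scikit-learn random forest with arbitrary settings. Computing Sauc on the test subset from predicted class probabilities is also an assumption, since the caption does not say which inputs were passed to roc_auc_score(). For an XGBoost model, lambda would presumably map to the reg_lambda argument of xgboost.XGBClassifier, which this sketch does not use.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def evaluate_split(X, y, train_ratio, seed=0):
    """Return (S_train, S_test, S_auc) for one train/total ratio.

    Hypothetical helper for illustration only.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=train_ratio, stratify=y, random_state=seed
    )
    # Arbitrary model settings; the paper's hyperparameters are not given here.
    model = RandomForestClassifier(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)

    # score() on a classifier returns mean accuracy.
    s_train = model.score(X_train, y_train)
    s_test = model.score(X_test, y_test)

    # Assumption: AUC is computed on the test subset from the predicted
    # probability of the positive class.
    s_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    return s_train, s_test, s_auc


# Synthetic placeholder data, not the physical-examination records.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# The same three train/total ratios as the table columns.
results = {ratio: evaluate_split(X, y, ratio) for ratio in (0.75, 0.5, 0.25)}

for ratio, (s_train, s_test, s_auc) in results.items():
    print(f"Rtrain/total={ratio}: Strain={s_train:.4f} "
          f"Stest={s_test:.4f} Sauc={s_auc:.4f}")
```

A large gap between Strain and Stest in such a run would point to overfitting, which is what the regularization factor in the table is meant to suppress.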