Research Article

Cross-Project Defect Prediction Based on Two-Phase Feature Importance Amplification

Table 3

F1-measure, AUC, and MCC of models built with different classifiers: Naive Bayes (NB), Logistic Regression (LR), Decision Tree (DT), Support Vector Classification (SVC), and Random Forest (RF).

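| Experiment | F1 (LR) | F1 (DT) | F1 (NB) | F1 (SVC) | F1 (RF) | AUC (LR) | AUC (DT) | AUC (NB) | AUC (SVC) | AUC (RF) | MCC (LR) | MCC (DT) | MCC (NB) | MCC (SVC) | MCC (RF) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EQ-JDT | 0.594 | 0.729 | 0.560 | 0.571 | 0.790 | 0.758 | 0.852 | 0.713 | 0.727 | 0.873 | 0.479 | 0.655 | 0.467 | 0.465 | 0.735 |
| EQ-LC | 0.435 | 0.619 | 0.404 | 0.224 | 0.761 | 0.675 | 0.889 | 0.654 | 0.575 | 0.903 | 0.387 | 0.601 | 0.361 | 0.136 | 0.738 |
| EQ-ML | 0.375 | 0.672 | 0.333 | 0.376 | 0.742 | 0.647 | 0.866 | 0.611 | 0.637 | 0.875 | 0.271 | 0.627 | 0.253 | 0.288 | 0.702 |
| EQ-PDE | 0.379 | 0.660 | 0.361 | 0.386 | 0.776 | 0.664 | 0.851 | 0.621 | 0.667 | 0.864 | 0.259 | 0.607 | 0.283 | 0.269 | 0.740 |
| JDT-EQ | 0.595 | 0.829 | 0.492 | 0.459 | 0.833 | 0.680 | 0.857 | 0.644 | 0.632 | 0.862 | 0.379 | 0.727 | 0.358 | 0.347 | 0.719 |
| JDT-LC | 0.362 | 0.573 | 0.373 | 0.337 | 0.752 | 0.694 | 0.837 | 0.672 | 0.687 | 0.875 | 0.293 | 0.539 | 0.303 | 0.266 | 0.726 |
| JDT-ML | 0.376 | 0.637 | 0.361 | 0.418 | 0.745 | 0.644 | 0.854 | 0.632 | 0.679 | 0.883 | 0.276 | 0.589 | 0.265 | 0.320 | 0.707 |
| JDT-PDE | 0.373 | 0.651 | 0.356 | 0.381 | 0.761 | 0.668 | 0.841 | 0.619 | 0.679 | 0.866 | 0.253 | 0.595 | 0.279 | 0.265 | 0.721 |
| LC-EQ | 0.672 | 0.806 | 0.578 | 0.729 | 0.831 | 0.727 | 0.839 | 0.678 | 0.773 | 0.862 | 0.453 | 0.665 | 0.389 | 0.542 | 0.710 |
| LC-JDT | 0.575 | 0.721 | 0.585 | 0.525 | 0.816 | 0.755 | 0.860 | 0.743 | 0.730 | 0.911 | 0.451 | 0.648 | 0.473 | 0.384 | 0.768 |
| LC-ML | 0.393 | 0.655 | 0.350 | 0.406 | 0.765 | 0.647 | 0.843 | 0.620 | 0.659 | 0.855 | 0.307 | 0.604 | 0.272 | 0.314 | 0.732 |
| LC-PDE | 0.394 | 0.660 | 0.367 | 0.378 | 0.787 | 0.675 | 0.843 | 0.628 | 0.665 | 0.866 | 0.279 | 0.605 | 0.278 | 0.258 | 0.755 |
| ML-EQ | 0.696 | 0.826 | 0.560 | 0.632 | 0.855 | 0.730 | 0.857 | 0.674 | 0.687 | 0.886 | 0.455 | 0.703 | 0.397 | 0.369 | 0.755 |
| ML-JDT | 0.567 | 0.725 | 0.580 | 0.539 | 0.787 | 0.748 | 0.855 | 0.728 | 0.742 | 0.883 | 0.440 | 0.650 | 0.483 | 0.404 | 0.730 |
| ML-LC | 0.343 | 0.597 | 0.422 | 0.338 | 0.824 | 0.686 | 0.889 | 0.681 | 0.656 | 0.925 | 0.273 | 0.583 | 0.363 | 0.261 | 0.806 |
| ML-PDE | 0.401 | 0.653 | 0.381 | 0.395 | 0.763 | 0.684 | 0.837 | 0.635 | 0.680 | 0.865 | 0.288 | 0.597 | 0.295 | 0.281 | 0.724 |
| PDE-EQ | 0.640 | 0.791 | 0.497 | 0.525 | 0.873 | 0.704 | 0.826 | 0.645 | 0.638 | 0.897 | 0.410 | 0.643 | 0.357 | 0.299 | 0.785 |
| PDE-JDT | 0.585 | 0.701 | 0.602 | 0.521 | 0.806 | 0.746 | 0.834 | 0.752 | 0.733 | 0.881 | 0.470 | 0.619 | 0.495 | 0.382 | 0.754 |
| PDE-LC | 0.382 | 0.584 | 0.438 | 0.362 | 0.784 | 0.702 | 0.857 | 0.690 | 0.639 | 0.873 | 0.315 | 0.557 | 0.380 | 0.306 | 0.763 |
| PDE-ML | 0.385 | 0.629 | 0.312 | 0.406 | 0.754 | 0.646 | 0.831 | 0.600 | 0.666 | 0.853 | 0.291 | 0.573 | 0.232 | 0.308 | 0.718 |
| Average | 0.476 | 0.686 | 0.446 | 0.445 | 0.790 | 0.694 | 0.851 | 0.662 | 0.678 | 0.878 | 0.351 | 0.619 | 0.349 | 0.323 | 0.739 |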

Values in bold indicate the classifier with the best performance in each set of experiments.
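
For readers unfamiliar with the three metrics reported in Table 3, the following is a minimal sketch of how they are commonly computed. It assumes the standard textbook definitions (F1 as the harmonic mean of precision and recall, MCC from the four confusion-matrix counts, and AUC as the probability that a defective module is ranked above a clean one). The function names and the toy inputs are illustrative only and are not taken from the paper, whose evaluation code is not shown here.

```python
import math


def f1_measure(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(labels, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly.

    labels: 1 for defective, 0 for clean; scores: predicted defect scores.
    Ties between a defective and a clean module count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy confusion matrix (hypothetical counts, not from Table 3).
    print(f1_measure(tp=8, fp=2, fn=2))
    print(mcc(tp=8, tn=8, fp=2, fn=2))
    # Toy ranking example (hypothetical scores).
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

F1-measure and AUC lie in [0, 1], while MCC lies in [-1, 1], which is one reason the MCC columns in the table are numerically lower than the F1-measure and AUC columns for the same classifier. The pairwise AUC above is quadratic in the number of samples; it is meant for illustration rather than for large datasets.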