Research Article
Handling Imbalance Classification Virtual Screening Big Data Using Machine Learning Algorithms
Table 3
Sensitivity and specificity results for the three datasets with PaDEL numeric and fingerprint descriptors, under no sampling (No-sample), SMOTE, and KSMOTE.
| Dataset | Algorithm | Numeric / No-sample / Sensitivity | Numeric / No-sample / Specificity | Numeric / SMOTE / Sensitivity | Numeric / SMOTE / Specificity | Numeric / KSMOTE / Sensitivity | Numeric / KSMOTE / Specificity | Fingerprint / No-sample / Sensitivity | Fingerprint / No-sample / Specificity | Fingerprint / SMOTE / Sensitivity | Fingerprint / SMOTE / Specificity | Fingerprint / KSMOTE / Sensitivity | Fingerprint / KSMOTE / Specificity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AID 440 | RF | 0.028 | 0.998 | 0.32 | 0.997 | 0.91 | 0.997 | 0.09 | 0.985 | 0.2 | 0.998 | 0.948 | 0.99 |
| AID 440 | DT | 0.25 | 0.992 | 0.36 | 0.986 | 0.9 | 0.98 | 0.257 | 0.99 | 0.214 | 0.985 | 0.93 | 0.98 |
| AID 440 | MLP | 0.37 | 0.996 | 0.26 | 0.993 | 0.94 | 0.993 | 0.22 | 0.99 | 0.25 | 0.994 | 0.951 | 0.99 |
| AID 440 | LG | 0.314 | 0.998 | 0.46 | 0.98 | 0.94 | 0.98 | 0.17 | 0.998 | 0.267 | 0.976 | 0.94 | 0.98 |
| AID 440 | GBT | 0.05 | 0.997 | 0.32 | 0.993 | 0.94 | 0.991 | 0.228 | 0.99 | 0.17 | 0.995 | 0.95 | 0.98 |
| AID 624202 | RF | 0.2 | 0.993 | 0.39 | 0.965 | 0.905 | 0.993 | 0.25 | 0.997 | 0.4 | 0.987 | 0.93 | 0.99 |
| AID 624202 | DT | 0.351 | 0.958 | 0.4 | 0.957 | 0.92 | 0.956 | 0.305 | 0.946 | 0.33 | 0.934 | 0.924 | 0.958 |
| AID 624202 | MLP | 0.57 | 0.967 | 0.53 | 0.993 | 0.928 | 0.96 | 0.418 | 0.961 | 0.24 | 0.964 | 0.956 | 0.971 |
| AID 624202 | LG | 0.4 | 0.807 | 0.8 | 0.806 | 0.921 | 0.81 | 0.775 | 0.987 | 0.76 | 0.86 | 0.825 | 0.984 |
| AID 624202 | GBT | 0.25 | 0.985 | 0.38 | 0.957 | 0.91 | 0.985 | 0.249 | 0.994 | 0.23 | 0.9848 | 0.924 | 0.995 |
| AID 651820 | RF | 0.529 | 0.983 | 0.648 | 0.944 | 0.94 | 0.94 | 0.56 | 0.98 | 0.67 | 0.966 | 0.9 | 0.95 |
| AID 651820 | DT | 0.57 | 0.923 | 0.579 | 0.876 | 0.91 | 0.9 | 0.65 | 0.9 | 0.63 | 0.896 | 0.88 | 0.95 |
| AID 651820 | MLP | 0.728 | 0.95 | 0.716 | 0.943 | 0.9 | 0.913 | 0.58 | 0.93 | 0.673 | 0.933 | 0.9 | 0.93 |
| AID 651820 | LG | 0.62 | 0.964 | 0.79 | 0.856 | 0.94 | 0.9 | 0.59 | 0.97 | 0.69 | 0.876 | 0.88 | 0.95 |
| AID 651820 | GBT | 0.52 | 0.97 | 0.56 | 0.93 | 0.89 | 0.914 | 0.63 | 0.972 | 0.63 | 0.967 | 0.91 | 0.91 |
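For readers who want to reproduce the two metrics in Table 3, the following is a minimal sketch using the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The function name `sensitivity_specificity` and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn count the active (minority) compounds predicted correctly/incorrectly;
    tn/fp count the inactive (majority) compounds predicted correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical, highly imbalanced test set (not from the paper):
# 100 actives and 9,900 inactives, with a classifier biased toward "inactive".
sens, spec = sensitivity_specificity(tp=5, fn=95, tn=9890, fp=10)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
# prints: sensitivity=0.050, specificity=0.999
```

With counts like these, a classifier that labels almost everything as inactive still scores a specificity near 1 while missing most actives. This is the same low-sensitivity, high-specificity pattern visible in several of the No-sample columns.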