| Refs. | Models | Tools | Dataset size | Validation method | Advantages | Disadvantages |
|---|---|---|---|---|---|---|
| [27] | C5.0 | Microsoft .NET framework | 500 | Accuracy | Careful classification; proportion of positive and negative samples; quick update | Increases the depth of the tree |
| [28] | Decision tree, logistic regression | Statistical | 402 | Prediction efficiency (PE), examination effort (EF), strike rate (SR) | LR is easy to implement and interpret, and very efficient to train | Overfitting |
| [29] | Association rule | DB miner | — | Accuracy rate, error rate | Appropriate for low-transaction datasets | Needs multiple passes over the dataset |
| [30] | MLP, SVM, LR, HSA | Statistical | 4,504 | Accuracy, sensitivity, specificity, AUROC | Fast convergence; increased efficiency; increased detection accuracy | Overfitting; MLP is sensitive to feature scaling |
| [31] | Linear regression, SVM | Statistical | — | Accuracy | SVM is more effective in high-dimensional spaces; good memory usage of SVM | Kernel function is not easy; long training time for large datasets |
| [32] | Bayesian networks | Statistical | 10,028 | Speedup | A strong and mathematically coherent framework for the analysis | High memory utilization |
| [33] | Colored network-based model (CNBM) | Framework | 31,910,000 | Accuracy | Accurate detection of samples; segmentation of samples based on their weights | CNBM requires selecting optimal parameter values |
| [34] | Conditional maximum mean discrepancy (CMMD) | Coding | — | Accuracy | Accurate detection of samples | Increased complexity |
| [35] | LR, k-medoids | Statistical | — | Mean absolute deviation (MAD), root mean square error (RMSE) | Discovers the exact center of the samples | High prediction complexity for large datasets |
| [36] | MLP, SVM, logistic regression, random forest | Statistical | 700,000 | Accuracy, AUROC, precision, F1-score, recall | Fast convergence; increased efficiency; increased detection accuracy | Overfitting |
| [37] | Transaction network representation | LightGBM | 9,422,952 | True positives, true negatives, false negatives, false positives, error rate, precision, recall, F-measure, ROC | Quick calculation time; correct search space; inexpensive testing of each instance | Prediction stage might be slow on big datasets |
| [38] | Deep learning | LightGBM | 20,444 | Accuracy, F1-score, AUC | No need for data labeling; suitable for bulk data; layer-wise learning based on computation of individual neurons | Increased complexity |
| [39] | BP-ANN, CHAID tree | Intelligent Miner | 12,458 | Accuracy rate, error rate | Low prediction complexity for large datasets | High memory utilization |
| [40] | ANN-MLP | Statistical | 2,000,000 | Sensitivity | Fast convergence; increased efficiency | MLP may suffer from overfitting |
| [41] | Clustering | Statistical | — | Accuracy | Simple execution | High memory utilization |
| [42] | LR, SVM, KNN, MLP, DT, RF | Statistical | — | Accuracy, precision, recall | KNN finds the k nearest data points in the training set | Kernel function is not easy |
| [43] | CART, CHAID | Statistical | — | Accuracy rate, error rate | High accuracy; discovers important features | Gets stuck in local optima |
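Several validation methods recur across the table (accuracy, precision, recall, F1-score, AUROC). As a minimal sketch of how these metrics are obtained in practice, the following computes them with scikit-learn for one of the recurring models, logistic regression, on a synthetic imbalanced dataset standing in for transaction data. This is purely illustrative and is not the pipeline of any surveyed paper; the dataset, class weights, and hyperparameters are assumptions.

```python
# Illustrative sketch only: synthetic data and settings are assumptions,
# not taken from the surveyed studies.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a fraud dataset: roughly 1% positive class.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.99],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)
y_prob = clf.predict_proba(X_te)[:, 1]  # class-1 scores, needed for AUROC

metrics = {
    "accuracy":  accuracy_score(y_te, y_pred),
    "precision": precision_score(y_te, y_pred, zero_division=0),
    "recall":    recall_score(y_te, y_pred),
    "f1":        f1_score(y_te, y_pred),
    "auroc":     roc_auc_score(y_te, y_prob),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With heavy class imbalance, accuracy alone is misleading (predicting "not fraud" everywhere already scores ~99%), which is why most rows in the table pair it with precision, recall, F1, or AUROC.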