| Refs. | Models | Tools | Dataset size | Validation method | Advantages | Disadvantages |
| --- | --- | --- | --- | --- | --- | --- |
| [27] | C5.0 | Microsoft .NET Framework | 500 | Accuracy | Careful classification; handles the proportion of positive and negative samples; quick update | Increases the depth of the tree |
| [28] | Decision tree, logistic regression | Statistical | 402 | Prediction efficiency (PE), examination effort (EF), strike rate (SR) | LR is easy to implement and interpret, and very efficient to train | Overfitting |
| [29] | Association rules | DB Miner | — | Accuracy rate, error rate | Appropriate for low-transaction datasets | Needs multiple passes over the dataset |
| [30] | MLP, SVM, LR, HSA | Statistical | 4,504 | Accuracy, sensitivity, specificity, AUROC | Fast convergence; increased efficiency; increased detection accuracy | Overfitting; MLP is sensitive to feature scaling |
| [31] | Linear regression, SVM | Statistical | — | Accuracy | SVM is more effective in high-dimensional spaces and efficient in memory usage | Choosing the kernel function is not easy; long training time for large datasets |
| [32] | Bayesian networks | Statistical | 10,028 | Speedup | A strong, mathematically coherent framework for analysis | High memory utilization |
| [33] | Colored network-based model (CNBM) | Framework | 31,910,000 | Accuracy | Accurate detection of samples; segmentation of samples based on their weights | CNBM requires selection of optimal parameter values |
| [34] | Conditional maximum mean discrepancy (CMMD) | Coding | — | Accuracy | Accurate detection of samples | Increased complexity |
| [35] | LR, k-medoids | Statistical | — | Mean absolute deviation (MAD), root mean square error (RMSE) | Discovers the exact centers of the samples | High prediction complexity for large datasets |
| [36] | MLP, SVM, logistic regression, random forest | Statistical | 700,000 | Accuracy, AUROC, precision, F1-score, recall | Fast convergence; increased efficiency; increased detection accuracy | Overfitting |
| [37] | Transaction network representation | LightGBM | 9,422,952 | True positives, true negatives, false negatives, false positives, error rate, precision, recall, F-measure, ROC | Quick calculation time; correct search space; inexpensive testing of each instance | Prediction stage can be slow on big datasets |
| [38] | Deep learning | LightGBM | 20,444 | Accuracy, F1-score, AUC | No need for data labeling; suitable for bulk data; layer-wise learning based on the computation of individual neurons | Increased complexity |
| [39] | BP-ANN, CHAID tree | Intelligent Miner | 12,458 | Accuracy rate, error rate | Low prediction complexity for large datasets | High memory utilization |
| [40] | ANN-MLP | Statistical | 2,000,000 | Sensitivity | Fast convergence; increased efficiency | MLP may suffer from overfitting |
| [41] | Clustering | Statistical | — | Accuracy | Simple execution | High memory utilization |
| [42] | LR, SVM, KNN, MLP, DT, RF | Statistical | — | Accuracy, precision, recall | KNN finds the k nearest data points in the training set | Choosing the kernel function is not easy |
| [43] | CART, CHAID | Statistical | — | Accuracy rate, error rate | High accuracy; discovers important features | Gets stuck in local optima |
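Most rows in the table validate their models with the same family of classification metrics: accuracy, sensitivity (recall), specificity, precision, F1-score, and AUROC. The sketch below illustrates how such a comparison could be reproduced for several of the tabulated classifiers (LR, DT, RF, SVM, MLP) using scikit-learn; the synthetic dataset and all hyperparameters are placeholder assumptions, not the settings used in [27]–[43].

```python
# Minimal sketch: comparing several of the tabulated classifiers with the
# validation metrics named above. Dataset and hyperparameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

# Placeholder data: 5,000 samples with imbalanced classes, as in a typical
# fraud-detection setting.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)

# Feature scaling matters for SVM and MLP (cf. the "sensitive to feature
# scaling" disadvantage noted for MLP in [30]).
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True)),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=500, random_state=0)),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    y_pred = model.predict(X_te)
    y_prob = model.predict_proba(X_te)[:, 1]
    tn, fp, fn, tp = confusion_matrix(y_te, y_pred).ravel()
    print(f"{name}: acc={accuracy_score(y_te, y_pred):.3f} "
          f"prec={precision_score(y_te, y_pred):.3f} "
          f"rec={recall_score(y_te, y_pred):.3f} "   # sensitivity
          f"spec={tn / (tn + fp):.3f} "              # specificity
          f"F1={f1_score(y_te, y_pred):.3f} "
          f"AUROC={roc_auc_score(y_te, y_prob):.3f}")
```

The regression-style metrics reported in [35] (MAD, RMSE) correspond to `sklearn.metrics.mean_absolute_error` and the square root of `sklearn.metrics.mean_squared_error`, respectively.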