| Reference | Year | Techniques | Datasets | Accuracy | Limitations |
|---|---|---|---|---|---|
| Ghosh et al. [23] | 2021 | Relief, LASSO | Cleveland, Long Beach, Hungarian, Statlog | 99.05% | The classification performance depends on the chosen feature selection method. |
| Alirezanejad et al. [24] | 2020 | Heuristic methods | Colon, leukemia | 92% | Metaheuristics could be applied to remove unnecessary attributes prior to classification. |
| Khan et al. [25] | 2019 | Gabor filter bank + SVM | DDSM database | 92.48% | The computational cost of the proposed system is high; further research is proposed to optimize it. |
| Lü et al. [26] | 2019 | RBM + SVM | MNIST handwritten database | 81.87% | Due to the determinacy of the configuration parameters, the RBM feature extraction is inflexible and needs to be improved in further research. |
| Desai et al. [27] | 2019 | BPNN + LR | Cleveland dataset | 78.88% | The model cannot act as a clinical expert; it only complements the clinician's judgment for better diagnostic decisions. |
| Vijayashree et al. [28] | 2018 | PSO + SVM | Cleveland heart disease | — | The model could be further improved using ensemble classifiers. |
| Kalantari et al. [29] | 2018 | GA-SVM | University of California at Irvine (UCI) machine learning repository | 84.44% | The system needs further optimization to achieve high detection performance on medical datasets. |
| Dwivedi et al. [30] | 2018 | SVM | Statlog heart disease dataset | 90% | The system cannot be used to predict disease severity levels. |
| Chen et al. [31] | 2018 | Disease diagnosis and treatment recommendation system | PubMed dataset | 90% | The security perspective is not addressed, and feature selection is not considered. |
| Tayefi et al. [32] | 2017 | Decision tree, hs-CRP | UCI dataset | 94% | The model omits some key risk factors of diabetic patients, so it cannot be fully evaluated for high performance. |
| Hoque et al. [33] | 2016 | Decision tree; random forests; naïve Bayes; kNN; SVM | UCI dataset | 71.2%; 83.12%; 28.60%; 94.50%; 51.14% | An incremental fuzzy feature selection technique should be incorporated for classifying DDoS attack traffic. |
| Bennasar et al. [34] | 2015 | mRMR; JMIM; NJMIM | Sonar datasets | 88%; 87%; 86% | The methods disregard the interaction between the features and the classifier, as well as the higher-dimensional joint mutual information among more than two features, which can lead to a suboptimal choice of features. |
| Emary et al. [35] | 2015 | Gray wolf optimization | UCI dataset | — | Improvement could be made using an advanced feature selection method. |
| Veronica et al. [36] | 2015 | ReliefF; information gain; mRMR; CFS; FCBF; Chi-squared | Microarray datasets | 90.24%; 82.32%; 65.23%; 65.36%; 55.21%; 81.2% | The developed techniques should be tested in multi-platform, distributed-learning, and real-time processing settings; the rapid growth of datasets used in feature selection opens a new line of research. |
| Zhang et al. [37] | 2014 | Sequential forward floating selection (SFFS); SBS; MBPSO | Spam-based dataset | 94.07%; 95.28%; 91.97% | Slow processing; only a decision tree was used for classification. |
| Verónica et al. [38] | 2014 | Correlation-based feature selection (CFS); Chi-square; minimum redundancy maximum relevance; SVM | Brain, CNS, GLI (UCI) | 66.67%; 65.00%; 69.41%; 96.99% | The microarray data should be distributed vertically (i.e., by features) to reduce the heavy computational burden when applying wrapper methods. |
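Most of the surveyed works share the same two-stage pattern: a feature selection step followed by a classifier. The sketch below is a minimal, generic illustration of that pattern (not the exact pipeline of any cited work), using an L1-penalised (LASSO-style) selector and an SVM in scikit-learn; the synthetic data merely stands in for a tabular medical dataset such as Cleveland, which has 13 attributes.

```python
# Minimal sketch of the "feature selection, then classification" pattern
# summarised in the table above. Assumes scikit-learn is installed; the
# synthetic data is a placeholder, not the Cleveland dataset itself.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 300 samples, 13 features, binary outcome.
X, y = make_classification(n_samples=300, n_features=13,
                           n_informative=6, random_state=0)

pipeline = make_pipeline(
    StandardScaler(),                                   # scale features before selection and SVM
    SelectFromModel(                                    # keep features with non-zero L1 coefficients
        LogisticRegression(penalty="l1", solver="liblinear")
    ),
    SVC(kernel="rbf", C=1.0),                           # classify on the reduced feature set
)

scores = cross_val_score(pipeline, X, y, cv=5)
print(f"5-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Wrapping the selector and classifier in a single pipeline ensures the feature subset is re-learned inside each cross-validation fold, which avoids the selection bias that would arise from choosing features on the full dataset beforehand.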