Research Article
Network Traffic Classification Based on SD Sampling and Hierarchical Ensemble Learning
Table 3
Related work on dataset balancing methods.
| Method level | Literature | Description | Dataset | Best performance (%) |
| --- | --- | --- | --- | --- |
| Data | [26] | A variational autoencoder generates samples | UNSW-NB15 | 96.13 |
| | [27] | The WGAN-GP method generates samples | NSL-KDD/UNSW-NB15/CICIDS2017 | 86.69/94.90/99.84 |
| | [28] | A three-point domain sample generation method based on the SMOTE algorithm | NSL-KDD | 99.00 |
| | [29] | An SVR model predicts the SMOTE sampling proportion | KDD Cup 1999 | 98.10 |
| | [30] | A SMOTE variant combining clustering and instance hardness | DoHBrw2020/CIC_Bot/CIC_Inf/DOS2017/UNSW/Botnet2014 | AUC = 89.60/90.21/92.09/75.92/93.19/73.81 |
| | [31] | A sampling technique based on the classification difficulty of samples | NSL-KDD/CICIDS2018 | 82.84/96.99 |
| | [32] | A method combining TGAN and slow start | KDD Cup 1999 | 93.98 |
| | [33] | An encrypted traffic generation method based on GAN | ISCX | 99.10 |
| Algorithm | [34] | A cost-sensitive deep neural network | NSL-KDD/CIDDS-001/CICIDS2017 | 92.00/99.00/92.00 |
| | [35] | A weighted extreme learning machine | UNSW-NB15/KDD Cup 1999 | 96.12/99.71 |
| | [36] | The HM-loss cost method | Personal real data | F1 = 87.00 |
| | [37] | A batch-based dataset balancing method using deep learning | CHB-MIT/BonnEEG/FAHXJU | 95.96/100.00/87.93 |
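Several of the data-level entries above ([28]–[31]) build on SMOTE-style oversampling. As a minimal illustration of that general approach, and not any cited author's exact method, the sketch below applies the SMOTE implementation from the imbalanced-learn library to a synthetic imbalanced dataset; the class ratio and feature dimensions are hypothetical.

```python
# Minimal sketch of SMOTE-based dataset balancing, the data-level
# strategy surveyed in Table 3. Illustrative only; not a cited
# paper's exact method.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Hypothetical imbalanced traffic dataset: 95% majority, 5% minority.
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    weights=[0.95, 0.05],
    random_state=42,
)
print("Before SMOTE:", Counter(y))

# SMOTE synthesizes new minority samples by interpolating between a
# minority point and one of its k nearest minority-class neighbors.
smote = SMOTE(k_neighbors=5, random_state=42)
X_res, y_res = smote.fit_resample(X, y)
print("After SMOTE:", Counter(y_res))
```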
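The algorithm-level entries ([34]–[36]) instead modify the learning objective so that minority-class errors cost more. A simple stand-in for that idea, again illustrative rather than any cited author's method, is class weighting in a standard classifier, sketched below with scikit-learn.

```python
# Minimal sketch of algorithm-level balancing via class weighting,
# analogous in spirit to the cost-sensitive methods in Table 3.
# The weighting scheme here is illustrative only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.95, 0.05], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# 'balanced' reweights each class inversely to its frequency, so
# misclassifying a minority-class sample costs more during training.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```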