Research Article
Network Traffic Classification Based on SD Sampling and Hierarchical Ensemble Learning
| Line | Operation |
|---|---|
| | Input: imbalanced training set, scaling factor, instance hardness threshold, and sample threshold UB |
| | Output: new training set |
| (1) | Step 1: Distinguish between the easy set and the difficult set. For each sample do |
| (2) | Compute its nearest neighbors; if its instance hardness exceeds the threshold then |
| (3) | Put the sample into the difficult set |
| (4) | end |
| (5) | end |
| (6) | Obtain the difficult set and the easy set |
| (7) | Step 2: Compress the majority samples in the difficult set using cluster centroids |
| (8) | Take all the majority samples from the difficult set as the majority sample set |
| (9) | Cluster them with the K-means algorithm |
| (10) | Replace the majority samples with the coordinates of the cluster centroids |
| (11) | Obtain the compressed majority sample set |
| (12) | Step 3: Oversample the minority samples in the difficult set using the SMOTE algorithm |
| (13) | Take all the minority samples from the difficult set as the minority sample set |
| (14) | for each sample do |
| (15) | Apply SMOTE sampling, subject to the sampling threshold |
| (16) | Put the new samples into the minority sample set |
| (17) | end |
| (18) | Step 4: Merge the sample sets |
| (19) | Output the new training set |
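The four steps of the algorithm can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: labels are binary, instance hardness is taken as the fraction of a sample's k nearest neighbors that carry a different label, the scaling factor sets the number of K-means clusters as a fraction of the difficult majority samples, and UB is read as the number of synthetic minority samples to generate. All function and parameter names (`split_by_hardness`, `kmeans_centroids`, `smote`, `sd_sampling`, `scale`, `ub`) are hypothetical.

```python
import math
import random


def k_nearest(idx, X, k):
    """Indices of the k nearest neighbors of X[idx] (Euclidean), excluding itself."""
    others = (j for j in range(len(X)) if j != idx)
    return sorted(others, key=lambda j: math.dist(X[idx], X[j]))[:k]


def split_by_hardness(X, y, k=3, threshold=0.5):
    """Step 1: a sample is 'difficult' when the share of its k neighbors with a
    different label exceeds the threshold (assumed hardness measure)."""
    difficult, easy = [], []
    for i in range(len(X)):
        nbrs = k_nearest(i, X, k)
        hardness = sum(y[j] != y[i] for j in nbrs) / len(nbrs)
        (difficult if hardness > threshold else easy).append(i)
    return difficult, easy


def kmeans_centroids(points, n_clusters, rng, n_iter=20):
    """Step 2: plain Lloyd iterations; returns the cluster centroids."""
    centroids = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else centroids[c]
                     for c, g in enumerate(groups)]
    return centroids


def smote(points, n_new, rng, k=2):
    """Step 3: SMOTE-style interpolation between a sample and one of its neighbors."""
    if len(points) < 2:
        return []
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(points))
        j = rng.choice(k_nearest(i, points, min(k, len(points) - 1)))
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(points[i], points[j])])
    return synthetic


def sd_sampling(X, y, majority, minority, scale=0.5, threshold=0.5, ub=10, seed=0):
    """Steps 1 to 4 for a binary problem (assumed simplification)."""
    rng = random.Random(seed)
    difficult, easy = split_by_hardness(X, y, threshold=threshold)
    hard_major = [list(X[i]) for i in difficult if y[i] == majority]
    hard_minor = [list(X[i]) for i in difficult if y[i] == minority]
    # Assumption: the scaling factor fixes the number of centroids.
    compressed = []
    if hard_major:
        n_clusters = max(1, int(len(hard_major) * scale))
        compressed = kmeans_centroids(hard_major, n_clusters, rng)
    # Assumption: UB is the number of synthetic minority samples.
    synthetic = smote(hard_minor, ub, rng)
    # Step 4: merge the easy set with the resampled difficult set.
    new_X = [list(X[i]) for i in easy] + compressed + hard_minor + synthetic
    new_y = ([y[i] for i in easy] + [majority] * len(compressed)
             + [minority] * (len(hard_minor) + len(synthetic)))
    return new_X, new_y
```

Calling `sd_sampling(X, y, majority=0, minority=1)` on a list of feature vectors `X` and labels `y` returns the resampled pair. Easy samples pass through unchanged, so only the region where the classes overlap is compressed and oversampled.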