Research Article

Network Traffic Classification Based on SD Sampling and Hierarchical Ensemble Learning

Algorithm 1

SD Sampling Algorithm.
Input: Imbalanced train set, scaling factor, instance hardness threshold, and sample threshold UB
Output: New train set
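(1) Step 1: Distinguish between the easy set and the difficult set; for each sample do
(2)  Compute its nearest neighbors and its instance hardness; if the instance hardness exceeds the instance hardness threshold then
(3)   Put the sample into the difficult set
(4)  end
(5) end
(6) Obtain the difficult set and the easy set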
(7) Step 2: Compress the majority samples in the difficult set using cluster centroids
(8) Take all the majority samples from the difficult set as the majority sample set
(9) Run the K-means algorithm on this set with the chosen number of clusters
(10) Replace the majority samples in this set with the coordinates of the cluster centroids
(11) Obtain the compressed majority sample set
(12) Step 3: Oversample the minority samples in the difficult set using the SMOTE algorithm
(13) Take all the minority samples from the difficult set as the minority sample set
(14) for each sample do
(15)  Generate new samples with SMOTE, subject to the sampling threshold
(16)  Put the new samples into the minority sample set
(17) end
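(18) Step 4: Merge the sample sets
(19) Obtain the new train set by merging the easy set, the compressed majority sample set, and the augmented minority sample set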
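
The four steps of Algorithm 1 can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions because the corresponding symbols are not recoverable from the listing: instance hardness is taken as the fraction of the k nearest neighbors that carry a different label, a sample is "difficult" when that fraction is strictly greater than the threshold, the number of K-means clusters is the scaling factor times the number of difficult majority samples (rounded up), and the hypothetical parameter `n_per_sample` stands in for the sampling threshold. All function and parameter names are illustrative.

```python
import math
import random


def knn_indices(X, i, k):
    """Indices of the k nearest neighbors of X[i] (Euclidean), excluding i itself."""
    others = (j for j in range(len(X)) if j != i)
    return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]


def split_by_hardness(X, y, k, threshold):
    """Step 1 (assumed definition): hardness = share of neighbors with another label."""
    easy, hard = [], []
    for i in range(len(X)):
        nbrs = knn_indices(X, i, k)
        hardness = sum(y[j] != y[i] for j in nbrs) / len(nbrs) if nbrs else 0.0
        (hard if hardness > threshold else easy).append(i)
    return easy, hard


def kmeans_centroids(points, n_clusters, rng, n_iter=20):
    """Step 2: plain Lloyd's K-means; returns the list of cluster centroids."""
    centroids = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else centroids[c]
                     for c, g in enumerate(groups)]
    return centroids


def smote(points, n_per_sample, k, rng):
    """Step 3: SMOTE-style interpolation between a sample and one of its neighbors."""
    if len(points) < 2:
        return []  # no neighbor to interpolate towards
    k = min(k, len(points) - 1)
    synthetic = []
    for i, p in enumerate(points):
        nbrs = knn_indices(points, i, k)
        for _ in range(n_per_sample):
            q = points[rng.choice(nbrs)]
            t = rng.random()
            synthetic.append([a + t * (b - a) for a, b in zip(p, q)])
    return synthetic


def sd_sampling(X, y, majority_label, k=3, threshold=0.5, scale=0.5,
                n_per_sample=1, seed=0):
    """Sketch of Algorithm 1; returns the resampled (X, y)."""
    rng = random.Random(seed)
    easy, hard = split_by_hardness(X, y, k, threshold)
    new_X = [X[i] for i in easy]
    new_y = [y[i] for i in easy]

    # Step 2: replace difficult majority samples by K-means centroids.
    hard_major = [X[i] for i in hard if y[i] == majority_label]
    if hard_major:
        n_clusters = min(len(hard_major), max(1, math.ceil(scale * len(hard_major))))
        centroids = kmeans_centroids(hard_major, n_clusters, rng)
        new_X += centroids
        new_y += [majority_label] * len(centroids)

    # Step 3: oversample difficult minority samples, one class at a time.
    minority_labels = sorted({y[i] for i in hard if y[i] != majority_label})
    for label in minority_labels:
        pts = [X[i] for i in hard if y[i] == label]
        augmented = pts + smote(pts, n_per_sample, k, rng)
        new_X += augmented
        new_y += [label] * len(augmented)

    # Step 4: new_X / new_y now hold the merged train set.
    return new_X, new_y
```

Calling `sd_sampling(X, y, majority_label=0)` on lists of feature vectors and integer labels returns the merged train set; easy samples pass through unchanged, so only the overlapping region of the feature space is compressed or augmented.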