Research Article

A GAN and Feature Selection-Based Oversampling Technique for Intrusion Detection

Table 1: Design rationale of different oversampling techniques.

| Technique | Design rationale |
| --- | --- |
| Random oversampling | Minority samples are randomly selected and replicated to increase the number of samples |
| SMOTE | Samples are generated by interpolation between each minority sample and its surrounding minority samples |
| ADASYN | Like SMOTE, ADASYN generates new samples by interpolation; the difference is that the number of new samples synthesized for each minority sample is determined by the density of majority-class instances around it |
| K-means SMOTE | K-means SMOTE first partitions the data into multiple clusters; the number of samples generated for each cluster then depends on the cluster's density, with lower-density clusters receiving more samples |
| G-SMOTE | G-SMOTE generates synthetic samples in a geometric region of the input space around each selected minority sample |
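
To make the first three rationales in Table 1 concrete, the following is a minimal pure-Python sketch of random oversampling, SMOTE-style interpolation, and an ADASYN-style allocation of new samples. It is an illustration under stated assumptions, not the reference implementation of any of these methods: the function names, the neighbour count `k`, the Euclidean distance, and the rounding used in `adasyn_counts` are all choices made here for illustration and do not come from the article.

```python
import math
import random


def nearest_neighbors(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean), excluding `point` itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]


def random_oversample(minority, n_new, rng):
    """Random oversampling: replicate randomly selected minority samples."""
    return [list(rng.choice(minority)) for _ in range(n_new)]


def smote(minority, n_new, k, rng):
    """SMOTE-style step: interpolate between a minority sample and a minority neighbour."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(base, minority, k))
        gap = rng.random()  # position on the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def adasyn_counts(minority, majority, n_new, k):
    """ADASYN-style allocation: minority samples surrounded by more
    majority-class neighbours are assigned more synthetic samples."""
    everything = minority + majority
    ratios = []
    for p in minority:
        neighbors = nearest_neighbors(p, everything, k)
        n_majority = sum(1 for q in neighbors if any(q is m for m in majority))
        ratios.append(n_majority / k)
    total = sum(ratios)
    if total == 0:
        return [0] * len(minority)  # no minority sample has majority neighbours
    # Rounding is an assumption of this sketch; counts may not sum exactly to n_new.
    return [round(n_new * r / total) for r in ratios]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data, invented for illustration only.
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    majority = [[5.0, 6.0], [6.0, 5.0], [6.0, 6.0], [5.5, 5.5]]
    print(random_oversample(minority, 3, rng))
    print(smote(minority, 3, k=2, rng=rng))
    print(adasyn_counts(minority, majority, n_new=10, k=2))
```

In this toy data only the minority point at (5, 5) has majority-class neighbours, so the ADASYN-style allocation concentrates its budget there, whereas random oversampling and SMOTE draw from all minority samples uniformly. K-means SMOTE and G-SMOTE are omitted because they need a clustering step and a geometric sampling region, respectively, which would go beyond a short sketch.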