Research Article

A Hybrid Sampling SVM Approach to Imbalanced Data Classification

Algorithm 1: The hybrid sampling SVM method.
Input: Imbalanced training data set
Output: An SVM classifier
Step 1. Train an SVM classifier on the training data set and delete some negative samples using (5).
Step 2. Randomly divide T into k disjoint subsets of equal size.
Step 3. Select one subset and over-sample its positive samples using the SMOTE method; generate a
new training data set by merging the new synthetic samples into this subset; train an initial SVM
classifier on the new data set.
Step 4. For each of the remaining subsets do
 Step 4.1. Compute the distances between the negative samples and the hyperplane of the current
 classifier according to (5).
 Step 4.2. Select the negative samples with the smallest distances; generate synthetic instances
 using the SMOTE method.
 Step 4.3. Merge all positive samples and the new synthetic samples with the selected negative
 samples to obtain a new data set.
 Step 4.4. Train an SVM classifier on the new data set.
Step 5. Classify the data set using the SVM method and obtain the final classifier.
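
The three building blocks that the steps above rely on (the random split into k disjoint subsets, the selection of samples closest to the hyperplane, and SMOTE-style over-sampling) can be sketched as follows. This is a minimal illustration, not the authors' implementation. Equation (5) is not reproduced in this section, so the sketch assumes the standard point-to-hyperplane distance |w·x + b| / ||w|| for a linear SVM with weight vector w and bias b. All function names and parameters are hypothetical, and only the Python standard library is used.

```python
import math
import random


def split_disjoint(samples, k, rng):
    """Shuffle the samples and deal them into k disjoint subsets of near-equal size."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def hyperplane_distance(x, w, b):
    """Distance from x to the hyperplane w.x + b = 0.

    Assumption: a linear kernel; equation (5) in the article may differ.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def closest_to_hyperplane(samples, w, b, m):
    """Return the m samples with the smallest distance to the hyperplane."""
    return sorted(samples, key=lambda x: hyperplane_distance(x, w, b))[:m]


def smote(samples, n_new, n_neighbors, rng):
    """SMOTE-style over-sampling.

    Each synthetic point lies on the segment between a randomly chosen
    sample and one of its n_neighbors nearest neighbours.
    Requires at least two samples.
    """
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:n_neighbors]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(bi + gap * (mi - bi) for bi, mi in zip(base, mate)))
    return synthetic
```

A caller would typically pass a seeded `random.Random` instance as `rng` so that the split and the synthetic samples are reproducible. With a kernel SVM, the distance would have to be computed from the decision function rather than from an explicit w and b.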