Abstract and Applied Analysis

Research Article

A Hybrid Sampling SVM Approach to Imbalanced Data Classification

The hybrid sampling SVM method.

Input: Imbalanced training data set
Output: An SVM classifier
Step 1. Train an SVM classifier on the training data set and delete some negative samples using (5).
Step 2. Randomly divide T into k disjoint subsets of equal size.
Step 3. Select one subset and over-sample its positive samples using the SMOTE method; generate a
new training data set by merging the new synthetic samples into that subset; train an initial SVM
classifier on this data set.
Step 4. For each of the remaining subsets do
Step 4.1. Compute the distances between the negative samples and the hyperplane of the classifier
according to (5).
Step 4.2. Select the negative samples with the smallest distances; generate synthetic instances
using the SMOTE method.
Step 4.3. Merge all positive samples and the new synthetic samples with the selected negative
samples, and obtain a new data set.
Step 4.4. Train an SVM classifier on this data set.
Step 5. Classify the data set using the SVM method, and obtain the final classifier.
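
The two sampling operations inside the loop (computing distances to the hyperplane, keeping the nearest negative samples, and generating synthetic samples with SMOTE) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: equation (5) is not reproduced in this section, so the standard point-to-hyperplane distance |w·x + b| / ||w|| for a linear classifier is used in its place; SMOTE is applied to the positive samples, as in Step 3; and the names `hyperplane_distance`, `closest_negatives`, `smote`, and the parameters `m`, `n_new`, and `k` are hypothetical. Training the SVM itself is left out.

```python
import math
import random


def hyperplane_distance(x, w, b):
    """Distance |w.x + b| / ||w|| from point x to a linear hyperplane.

    Assumption: stands in for equation (5), which is not shown here.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def closest_negatives(negatives, w, b, m):
    """Return the m negative samples nearest to the hyperplane."""
    return sorted(negatives, key=lambda x: hyperplane_distance(x, w, b))[:m]


def smote(samples, n_new, k=5, rng=None):
    """Basic SMOTE: interpolate between a sample and one of its k nearest
    neighbours within the same class to create n_new synthetic points."""
    if len(samples) < 2:
        raise ValueError("SMOTE needs at least two samples")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Nearest neighbours of the base sample, excluding the sample itself.
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(bi + gap * (oi - bi) for bi, oi in zip(base, other))
        )
    return synthetic


# Toy data with a hypothetical hyperplane x1 = 0 (w = (1, 0), b = 0).
w, b = (1.0, 0.0), 0.0
negatives = [(-3.0, 0.0), (-0.5, 1.0), (-1.0, 2.0), (-4.0, 1.0)]
positives = [(1.0, 0.0), (2.0, 1.0), (1.5, 2.0)]

near = closest_negatives(negatives, w, b, m=2)
new_positives = smote(positives, n_new=3, k=2)
merged = positives + new_positives + near  # data set for the next SVM
```

In a fuller implementation, `w` and `b` would come from the classifier trained in the previous iteration, and a kernel SVM would need the decision-function value in place of the linear formula.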