Research Article
A Hybrid Sampling SVM Approach to Imbalanced Data Classification
Algorithm 1
The hybrid sampling SVM method.
Input: Imbalanced training data set
Output: An SVM classifier
Step 1. Train an SVM classifier on the training data set and delete some negative samples using (5).
Step 2. Randomly divide T into k disjoint, equal-sized subsets.
Step 3. Select one subset and over-sample its positive samples using the SMOTE method; generate a new training data set by merging the new synthetic samples into that subset; train an initial SVM classifier on this data set.
Step 4. For each of the remaining subsets do
    Step 4.1. Compute the distances between the negative samples and the hyperplane of the current classifier according to (5).
    Step 4.2. Select the negative samples with the smallest distances; generate synthetic instances using the SMOTE method.
    Step 4.3. Merge all positive samples and the new synthetic samples into the training data set, and obtain an updated data set.
    Step 4.4. Train an SVM classifier on the updated data set.
Step 5. Classify the data set using the SVM method, and obtain the final classifier.
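The distance computation and nearest-sample selection inside the loop of Step 4 can be illustrated with a minimal sketch. Equation (5) is not reproduced in this section, so the code assumes the standard point-to-hyperplane distance |w·x + b| / ||w|| for a linear SVM; the article's own formula may differ. The function names, the toy weights `w` and `b`, and the parameter `m` (how many samples to keep) are hypothetical and chosen only for illustration.

```python
import math

def hyperplane_distance(x, w, b):
    """Geometric distance from point x to the hyperplane w.x + b = 0.

    Assumption: linear decision function; this stands in for the
    article's equation (5), which is not shown here.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def nearest_negatives(negatives, w, b, m):
    """Return the m negative samples closest to the hyperplane."""
    return sorted(negatives, key=lambda x: hyperplane_distance(x, w, b))[:m]

# Toy linear classifier x1 + x2 - 1 = 0 (illustrative values only).
w, b = [1.0, 1.0], -1.0
negatives = [[0.0, 0.0], [0.4, 0.5], [-2.0, -2.0], [0.5, 0.2]]
closest = nearest_negatives(negatives, w, b, m=2)
print(closest)
```

Samples near the hyperplane are the ones most likely to influence the decision boundary, which is presumably why the algorithm singles them out before resampling.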
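The SMOTE over-sampling used in the algorithm can be sketched in the same spirit. The code below is a simplified, standard-library illustration of the core SMOTE idea (interpolating between a sample and one of its nearest neighbours of the same class); it is not the article's implementation, and the function name `smote_sketch`, the neighbour count `k`, and the toy data are assumptions.

```python
import math
import random

def smote_sketch(samples, n_synthetic, k=2, rng=None):
    """Generate n_synthetic points by SMOTE-style interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    sample and one of its k nearest neighbours within `samples`.
    Requires at least two samples.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(samples)
        # Candidate neighbours: every other sample, sorted by distance.
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            [bi + gap * (ni - bi) for bi, ni in zip(base, neighbour)]
        )
    return synthetic

# Toy minority-class samples (illustrative values only).
positives = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(positives, n_synthetic=5)
print(len(new_points))
```

Because every synthetic point is a convex combination of two existing samples, the new points stay inside the region already occupied by the class being over-sampled.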