Research Article
A Hybrid Sampling SVM Approach to Imbalanced Data Classification
Algorithm 1
The hybrid sampling SVM method.
Input: Imbalanced training data set
Output: An SVM classifier
Step 1. Train an SVM classifier on the training data set and delete some negative samples using (5).
Step 2. Randomly divide T into k disjoint subsets of equal size.
Step 3. Select one subset and over-sample its positive samples using the SMOTE method; generate a new training data set by merging the new synthetic samples into the selected subset; train an initial SVM classifier on the new data set.
Step 4. For each of the remaining subsets do
    Step 4.1. Compute the distances between the negative samples and the hyperplane of the classifier according to (5).
    Step 4.2. Select the negative samples with the smallest distances; generate synthetic instances using the SMOTE method.
    Step 4.3. Merge all positive samples and the new synthetic samples to obtain an updated data set.
    Step 4.4. Train an SVM classifier on the updated data set.
Step 5. Classify the data set using the SVM method, and obtain the final classifier.
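The two building blocks that Algorithm 1 relies on, SMOTE-style over-sampling and the selection of negative samples close to the separating hyperplane, can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: SVM training is omitted, the hyperplane is assumed to be linear and given as a hypothetical pair `(w, b)`, and the geometric distance |w·x + b| / ‖w‖ stands in for equation (5), which is not reproduced in this section. The function names `smote_like`, `hyperplane_distance`, and `closest_negatives` are hypothetical.

```python
import math
import random


def smote_like(samples, n_new, k=2, rng=None):
    """Create n_new synthetic points by interpolating between a sample and
    one of its k nearest neighbours (the core idea of SMOTE).
    Assumes `samples` holds at least two points."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by Euclidean distance to `base`.
        others = [s for s in samples if s is not base]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(bi + gap * (ni - bi) for bi, ni in zip(base, neighbour))
        )
    return synthetic


def hyperplane_distance(x, w, b):
    """Geometric distance |w.x + b| / ||w|| from x to a linear hyperplane.
    Assumption: this stands in for equation (5) of the paper."""
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / math.hypot(*w)


def closest_negatives(negatives, w, b, m):
    """Return the m negative samples nearest to the hyperplane."""
    return sorted(negatives, key=lambda x: hyperplane_distance(x, w, b))[:m]


# Toy data; the hyperplane x1 + x2 = 1 is a hypothetical stand-in for a
# trained SVM decision boundary.
positives = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)]
negatives = [(0.0, 0.0), (-1.0, 0.5), (3.0, 3.0), (0.5, 0.4)]
w, b = (1.0, 1.0), -1.0

new_pos = smote_like(positives, n_new=4)         # over-sampling as in Step 3
near = closest_negatives(negatives, w, b, m=2)   # selection as in Steps 4.1 and 4.2
# near == [(0.5, 0.4), (0.0, 0.0)]
```

In a fuller pipeline, `(w, b)` would presumably be replaced by the decision function of the classifier trained in the previous iteration, and the loop in Step 4 would repeat the selection and retraining for each remaining subset.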