A Novel Sentiment Analysis Model of Museum User Experience Evaluation Data Based on Unbalanced Data Analysis Technology
Algorithm 3
An undersampling method.
Input: training dataset X’tr, denoised MNSV area sample set XNSV-maj, denoised FNSV area sample set XNSV-min, support vector sample set X’sv, test data D;
Output: The G-mean of the classifier trained on the final sampled dataset.
Step 1: Train a new classifier with dataset X’tr;
Step 2: Input the test data D, and get the accuracy rate ACC1 of the minority class and the accuracy rate ACC2 of the majority class after the test.
Step 3: If ACC1 is greater than or equal to ACC2, end the operation; otherwise, go to step 4.
Step 4: Calculate the difference h between the numbers of majority class and minority class samples in X’tr;
Step 5: Randomly delete h majority class samples from XNSV-maj to obtain the reduced dataset X1NSV-maj.
Step 6: Train a new SVM classifier on X’sv, X1NSV-maj, and XNSV-min. Input the test data D into this classifier to obtain the final G-mean.
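
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the label encoding (1 for minority, 0 for majority), and the `fit` callable are hypothetical. G-mean is computed as the square root of ACC1 × ACC2. The algorithm does not state what is returned when Step 3 ends the operation, so the sketch assumes the G-mean of the current classifier; it also caps h at the size of XNSV-maj, which the algorithm leaves unspecified. A nearest-centroid classifier stands in for the SVM so the example needs only the standard library.

```python
import math
import random

MINORITY, MAJORITY = 1, 0  # label encoding assumed for this sketch


def class_accuracies(y_true, y_pred):
    """Return (ACC1, ACC2): accuracy on the minority and majority class."""
    def acc(label):
        idx = [i for i, y in enumerate(y_true) if y == label]
        if not idx:
            return 0.0
        return sum(y_pred[i] == label for i in idx) / len(idx)
    return acc(MINORITY), acc(MAJORITY)


def g_mean(y_true, y_pred):
    """Geometric mean of the two per-class accuracies."""
    acc1, acc2 = class_accuracies(y_true, y_pred)
    return math.sqrt(acc1 * acc2)


def nearest_centroid_fit(samples):
    """Stand-in for the SVM: predict the label of the closest class centroid."""
    centroids = {}
    for label in {y for _, y in samples}:
        pts = [x for x, y in samples if y == label]
        centroids[label] = [sum(c) / len(pts) for c in zip(*pts)]

    def predict(x):
        return min(
            centroids,
            key=lambda l: sum((a - b) ** 2 for a, b in zip(x, centroids[l])),
        )
    return predict


def undersample(train, nsv_maj, nsv_min, sv, test, fit, seed=0):
    """Sketch of Algorithm 3. Every dataset is a list of (features, label).

    fit is a hypothetical callable: fit(samples) -> predict(features).
    """
    rng = random.Random(seed)
    xs_test = [x for x, _ in test]
    y_test = [y for _, y in test]

    predict = fit(train)                                   # Step 1
    y_pred = [predict(x) for x in xs_test]                 # Step 2
    acc1, acc2 = class_accuracies(y_test, y_pred)
    if acc1 >= acc2:                                       # Step 3
        return math.sqrt(acc1 * acc2)  # assumed output on early exit

    n_maj = sum(1 for _, y in train if y == MAJORITY)      # Step 4
    n_min = len(train) - n_maj
    h = min(abs(n_maj - n_min), len(nsv_maj))  # cap is an assumption

    kept = rng.sample(nsv_maj, len(nsv_maj) - h)           # Step 5
    predict = fit(sv + kept + nsv_min)                     # Step 6
    return g_mean(y_test, [predict(x) for x in xs_test])


if __name__ == "__main__":
    # Toy one-dimensional data: six majority points, two minority points.
    train = [((float(i),), MAJORITY) for i in range(6)]
    train += [((10.0,), MINORITY), ((11.0,), MINORITY)]
    test = [((1.0,), MAJORITY), ((9.0,), MINORITY)]
    score = undersample(train, train[:5], [train[6]], [train[5], train[7]],
                        test, nearest_centroid_fit)
    print(score)
```

Passing the classifier as a callable keeps the sampling logic separate from the model, so an actual SVM trainer can be substituted for `nearest_centroid_fit` without changing `undersample`.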