A Novel Sentiment Analysis Model of Museum User Experience Evaluation Data Based on Unbalanced Data Analysis Technology
Algorithm 2
An oversampling method.
Input: SV area sample set Xsv, denoised MNSV area sample set XNSV-maj, denoised FNSV area sample set XNSV-min, test data D;
Output: training sample set X’tr, real support vector sample set X’sv.
Step 1: Initialize the sample set: let X4 = Xsv.
Step 2: Find all the minority-class samples in the sample set Xsv and divide them into correctly classified samples X1 and misclassified samples X2.
Step 3: The first δ samples in X1 closest to the decision boundary, together with the samples in X2, form a new sample set Xj; for each sample in Xj, use the SMOTE algorithm to oversample within the same class, generating X3, and add X3 to X4.
Step 4: Combine X4, XNSV-maj, and XNSV-min into a new training set X’tr, train the SVM classifier on this set, and find the corresponding support vector set SSV.
Step 5: Input the test data D, obtain the classification accuracy ACC1 on the minority class and ACC2 on the majority class, and calculate the corresponding G-mean.
Step 6: Compare ACC1 with ACC2: if ACC1 is greater than or equal to ACC2, terminate; otherwise, continue with the following steps.
Step 7: Repeat Steps 2 to 6 until ACC1 is greater than or equal to ACC2, and select the training set X’tr corresponding to the largest G-mean obtained during the iterations; this X’tr is the required optimal training data, and its corresponding support vector sample set X’sv is the closest support vector set.
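Two pieces of the procedure above can be made concrete in code: the within-class SMOTE interpolation used in Step 3, and the G-mean computed from ACC1 and ACC2 in Step 5 (G-mean = √(ACC1 × ACC2)). The sketch below is illustrative only; the function names, the neighbour count k, and the use of Euclidean distance are assumptions, not taken from the paper.

```python
import math
import random

def smote_oversample(samples, k=2, n_new=4, rng=None):
    """Minimal SMOTE sketch (Step 3): each synthetic point is a random
    interpolation between a minority-class sample and one of its k
    nearest same-class neighbours. Parameter names are hypothetical."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(samples)
        # k nearest same-class neighbours of x by squared Euclidean distance
        neighbours = sorted(
            (s for s in samples if s is not x),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(x, s)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

def g_mean(acc1, acc2):
    """G-mean of minority accuracy ACC1 and majority accuracy ACC2 (Step 5)."""
    return math.sqrt(acc1 * acc2)

# Toy minority-class samples in 2-D
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_oversample(minority, k=2, n_new=4)
print(len(new_points))            # → 4
print(round(g_mean(0.9, 0.8), 4))  # → 0.8485
```

Because each synthetic point lies on the segment between two existing minority samples, oversampling stays inside the minority-class region near the boundary, which is what Step 3 relies on when it enlarges X4.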