Research Article

Assessing the Influence Level of Food Safety Public Opinion with Unbalanced Samples Using Ensemble Machine Learning

Table 1

Model settings.

| Model | Settings |
| --- | --- |
| NB | Uniform prior probability of classes; other parameters follow the default settings of the sklearn MultinomialNB model. |
| SVM | Parameters follow the default settings of the sklearn LinearSVC model. |
| XGBoost | Early stopping rounds = 10; eval_metric = "logloss"; other parameters follow the default settings of the Python package XGBoost. |
| FastText | Minimum number of word occurrences = 2; other parameters follow the default settings of the Python package FastText. |
| TextCNN | Keras-based implementation of a TextCNN [11]-like CNN. A dropout layer (dropout rate = 0.2) follows the embedding layer; the 1D convolutional layer has 250 filters (kernel length = 3); it is followed by a 3-max pooling layer, a flatten layer, a 50-unit dense layer, and a 3-unit softmax layer; the convolutional and dense layers use ReLU activation; input length = 1000, batch size = 256, epochs = 5. |
| LSTM | Keras-based implementation of LSTM. The embedding layer is connected to an LSTM layer with 200 neurons, with a dropout rate of 0.2 applied to both the input and the recurrent state; the LSTM layer is followed by a dropout layer (dropout rate = 0.2), a 64-unit dense layer (ReLU activation), and a 3-unit softmax layer; input length = 1000, batch size = 128, epochs = 5, Adam optimizer, learning rate = 0.01. |
| BERT | Chinese pretrained model, L = 12, H = 768, A = 12; batch size = 32, epochs = 5, learning rate = 2e-5; input length = 128. |
| KNN | Parameters follow the default settings of the sklearn neighbors model. |
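
To make the TextCNN settings in Table 1 more concrete, the following minimal sketch traces the tensor shapes and dense-layer parameter counts implied by the listed layers. It is an illustration rather than the authors' implementation, and it rests on several assumptions that the table does not state: the convolution uses "valid" padding with stride 1, the "3-max pooling layer" is read as a max-pooling layer with pool size 3 and stride 3, and the embedding dimension (`embed_dim = 128`) is a hypothetical placeholder. The function names are likewise hypothetical.

```python
def conv1d_valid_len(length, kernel, stride=1):
    """Output length of a 1D convolution with 'valid' padding (assumed)."""
    return (length - kernel) // stride + 1


def maxpool1d_len(length, pool, stride=None):
    """Output length of 1D max pooling; stride defaults to the pool size (assumed)."""
    stride = stride or pool
    return (length - pool) // stride + 1


def trace_textcnn(input_len=1000, embed_dim=128, filters=250, kernel=3,
                  pool=3, dense_units=50, n_classes=3):
    """Trace layer output shapes (without the batch axis) and parameter counts.

    input_len, filters, kernel, dense_units and n_classes come from Table 1;
    embed_dim and the reading of pool are assumptions made for illustration.
    """
    conv_len = conv1d_valid_len(input_len, kernel)
    pool_len = maxpool1d_len(conv_len, pool)
    flat = pool_len * filters

    shapes = [
        ("embedding + dropout", (input_len, embed_dim)),
        ("conv1d (ReLU)", (conv_len, filters)),
        ("max pooling", (pool_len, filters)),
        ("flatten", (flat,)),
        ("dense (ReLU)", (dense_units,)),
        ("softmax", (n_classes,)),
    ]
    # Weights plus biases for each trainable layer after the embedding.
    params = {
        "conv1d": kernel * embed_dim * filters + filters,
        "dense": flat * dense_units + dense_units,
        "softmax": dense_units * n_classes + n_classes,
    }
    return shapes, params


if __name__ == "__main__":
    shapes, params = trace_textcnn()
    for name, shape in shapes:
        print(f"{name:22s} {shape}")
    for name, count in params.items():
        print(f"{name:22s} {count} parameters")
```

Under these assumptions, most of the trainable weights after the embedding sit in the 50-unit dense layer, because the flatten layer exposes every pooled position of every filter to it. A different padding mode or pooling interpretation would change the flattened size and therefore this count.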