Abstract

To meet the needs of online public opinion analysis and early warning of public opinion crises in colleges and universities, this paper studies semantic sentiment analysis methods. Most public opinion information comes from short text comments, whose language departs from written norms, has a simple structure, and lacks standardization, which makes text feature extraction difficult. Traditional sentiment analysis methods rely on emotion dictionaries and feature extraction, and as Internet culture keeps changing, these dictionaries must be updated continually to remain useful. Based on the study of attention mechanisms and deep learning techniques, an LSTM model is proposed to mine the deep semantic features of text so that its emotional tendency can be determined accurately. The main work is as follows. The model draws on both CNN and LSTM text processing: CNN extracts the local features of the text well, while LSTM retains the history of the text and effectively extracts the global features of the sequence. The CBOW model is optimized so that the calculation pays more attention to the feature vectors that affect the classification results. Finally, the improved model is compared with traditional models on accuracy, recall rate, loss rate, and F1 value to evaluate its performance.

1. Introduction

Based on an investigation of network usage in China, previous work has discussed public opinion guidance and analyzed its current situation and influencing factors [1]. With today’s large number of college students and the rapid flow of information, campus public opinion management is clearly necessary. Sound public opinion should be disseminated more widely, while the transmission of erroneous information and opinions should be blocked. Although the rapid dissemination enabled by the Internet helps adolescents understand the world more quickly and efficiently, it also exposes them to harmful content just as quickly. Schools therefore need a means of stopping unhealthy campus public opinion. To address this issue, this paper studies online campus posts by college students and uses deep learning models to process and manage student comments. This method not only helps manage students’ online public opinion but also contributes to a deeper understanding of students’ value development and student life. A literature review identified three directions that researchers can focus on in automated feedback processing: advanced entity extraction, multilingual sentiment analysis, and figurative language processing. Together, these three aspects give a fuller picture of sentiment analysis research [2]. For multicampus schools, student organization management poses great challenges, and how to develop the community management model and give full play to the educational role of student communities has attracted growing attention from schools. In an era of highly developed networks and fast information flow, it is essential to manage campus forums and to guide students to express themselves civilly [3]. Traditional public opinion management methods are largely ineffective in this setting.
Therefore, existing web forums should be treated as the main communication venue, and social network analysis (SNA) should be used to analyze the subjects and objects of public opinion, bring most topics under management, and track disseminated information in a timely and effective manner [4]. Properly managed, the Internet can help resolve public opinion crises, maintain public order, fairness, and justice, and realize the core value of the network, namely the dissemination of information. Because online public opinion is spontaneous, scattered, and sometimes irrational, it should be managed accordingly in order to cultivate the awareness and improve the conduct of netizens [5]. In practice, online public opinion plays an important role in the occurrence and subsequent development of student incidents. Starting from the evolution of public sentiment, public opinion on student group events in colleges and universities should be controlled in a timely manner, and an information supervision and management system for responding to group events should be established, covering information collection, information evaluation, and dissemination flow [6]. Public sentiment evolves in rich ways, yet existing emotion research lacks theoretical analysis. Building on the actual situation, this article studies the general trends, evolutionary patterns, and internal mechanisms of the evolution of public sentiment. Through this analysis, public opinion can be channeled toward sound values so that students’ online forums have greater discussion value [7]. Public opinion analysis is the process of analyzing text that carries subjective emotion. Depending on granularity, the analysis can be performed at the document, sentence, or aspect level. Three methods are commonly used: dictionary-based methods, traditional machine learning methods, and deep learning methods. 
In this article, we focus on deep learning methods [8]. Deep learning techniques based on convolutional neural networks (CNN) have been employed to process campus reviews. After campus hot topics were collected and preprocessed and word vectors were generated with the Word2vec model, emotional tendency classification with the CNN reached an accuracy of 89.76%, an improvement of 7.3% over the traditional support vector machine (SVM) and thus better classification performance [9]. The pretrained Word2vec model both mitigates overfitting and reduces the number of training parameters, which improves training efficiency. Sentence vectors are fed into machine learning and deep learning models to compare the experimental results and propose directions for optimization. A deep learning-based emotion analysis model has significantly higher accuracy than a single machine learning model [10]. From the perspective of semantic understanding, the OCC model has been used to establish emotional rules. That model relies only on an emotion dictionary of network users for emotional tendency analysis and labels the emotional tendency of campus forum texts on sudden public opinion events as a training set, which is used to train a deep neural model and obtain an emotion recognition model for online public opinion [11]. In related work, 237 universal identification videos of 79 kinds were selected as the benchmark database, amounting to 127980 frames in total. The algorithm was trained 90,000 times per run over 2,000 iterations, after which the optimal model was selected; it outperformed the classical methods in the literature, with high precision and high scalability [12]. 
To address imperfect student information management platforms and low mining accuracy, a combined model based on the decision tree, the neural network, and the naive Bayesian algorithm has been established to analyze and predict student behavior on Spark. Prediction analysis and case verification were conducted based on students’ consumption patterns, living habits, and learning habits in order to guide the healthy and comprehensive development of students’ behavior [9]. Most public opinion information comes from short text comments; such text lacks standardization, is simple in structure, and departs from written language, which makes text feature extraction difficult. Traditional sentiment analysis relies on emotion dictionaries and feature extraction, but as network culture and data volume keep changing and growing, a large number of emotion dictionaries need to be updated; otherwise, semantic features are lost and classification becomes unclear. In this paper, a han-clstm model is used to mine the deep semantic features of text so that the emotional tendencies of texts can be determined accurately [13]. With the development of Internet information technology, online public opinion management has come to occupy a prominent position in college student management. In view of the shortcomings of traditional sentiment analysis, this paper applies deep learning to online public opinion analysis, and the main research work is as follows: (1) In text data preprocessing, the Word2vec model is used to pretrain word vectors, yielding low-dimensional dense vectors after training. 
(2) A joint deep neural network model is proposed. It automatically extracts high-dimensional text features by feeding the trained word vectors into a convolutional neural network, compares model parameters to determine their optimal values, and is then compared with other sentiment analysis models to verify its effectiveness. (3) Because public opinion topics are diverse and change quickly, a selected public opinion topic is analyzed from multiple dimensions. (4) Comparative experiments with traditional sentiment analysis methods verify the superiority of the algorithm [14].

2. Deep Learning Model Building

2.1. Classification of Text Features

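Text consists of characters, words, phrases, and sentences, all of which can be used as text features. Selecting features requires consideration of how the selected features are weighted and what effect they have on the results of text analysis. For a feature item $t$ and a class $c$, suppose $A$ represents the number of documents that contain $t$ and belong to $c$, $B$ the number of documents that contain $t$ but do not belong to $c$, $C$ the number of documents that belong to $c$ but do not contain $t$, and $N$ the total number of documents [15]. The mutual information of features and categories can be calculated as

$$MI(t, c) = \log \frac{P(t \wedge c)}{P(t)\,P(c)} \approx \log \frac{A \times N}{(A + C)(A + B)}.$$

If the feature is not related to the category, then $P(t \wedge c) = P(t)\,P(c)$. So,

$$MI(t, c) = 0.$$

To select features that are useful for multiclass document identification, the maximum and average methods are used over the $m$ classes $c_1, \dots, c_m$:

$$MI_{\max}(t) = \max_{1 \le i \le m} MI(t, c_i), \qquad MI_{\mathrm{avg}}(t) = \sum_{i=1}^{m} P(c_i)\, MI(t, c_i).$$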
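
As a minimal sketch of this feature score, the following Python snippet computes the count-based estimate $\log\big(A N / ((A + C)(A + B))\big)$ together with the maximum and average combinations. The function names and the document counts are hypothetical and chosen only for illustration; the count definitions follow the common convention ($A$: in class with the term, $B$: outside the class with the term, $C$: in class without the term), which is an assumption here.

```python
import math

def mutual_information(a: int, b: int, c: int, n: int) -> float:
    """Count-based estimate MI(t, c) = log(A*N / ((A+C)*(A+B))).

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that do not contain the term
    n: total number of documents
    """
    if a == 0:
        return float("-inf")  # the term never co-occurs with the class
    return math.log((a * n) / ((a + c) * (a + b)))

def mi_max(scores):
    """Maximum method: best score of the term over all classes."""
    return max(scores)

def mi_avg(scores, priors):
    """Average method: scores weighted by the class priors P(c_i)."""
    return sum(p * s for p, s in zip(priors, scores))

# Hypothetical counts for one term over two classes of 50 documents each.
scores = [
    mutual_information(a=30, b=10, c=20, n=100),  # term favors class 1
    mutual_information(a=10, b=30, c=40, n=100),  # term avoids class 2
]
print(mi_max(scores), mi_avg(scores, priors=[0.5, 0.5]))
```

A term that is spread evenly across classes scores zero, so only terms whose presence shifts the class distribution are ranked highly.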

2.2. Long Short-Term Memory Networks

The long short-term memory network (LSTM) is an improvement on the RNN. Figure 1 shows the architecture of the long short-term memory network, which has three more control gates than the RNN, namely, the input gate, the forget gate, and the output gate [16]. The input gate determines how much of the input information is written into memory, the forget gate determines how much of the existing memory is retained, and the output gate determines how much of the memory is exposed as output, which alleviates the vanishing gradient problem of the original RNN.

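Each $j$-th LSTM cell maintains a memory $c_t^j$ at time $t$. The output $h_t^j$, or activation, of the LSTM cell is

$$h_t^j = o_t^j \tanh\left(c_t^j\right),$$

where $o_t^j$ is an output gate that adjusts the amount of memory content exposure. The output gate is calculated as

$$o_t^j = \sigma\left(W_o x_t + U_o h_{t-1} + V_o c_t\right)^j,$$

where $\sigma$ is a sigmoid function and $V_o$ is a diagonal matrix. The storage unit $c_t^j$ updates its content by partially forgetting the existing content and adding new content $\tilde{c}_t^j$:

$$c_t^j = f_t^j c_{t-1}^j + i_t^j \tilde{c}_t^j.$$

The new storage content is

$$\tilde{c}_t^j = \tanh\left(W_c x_t + U_c h_{t-1}\right)^j.$$

The degree of forgetting of the existing memory is controlled by the forget gate $f_t^j$, and the degree to which the new memory content is added to the memory cell is controlled by the input gate $i_t^j$. The gates are calculated as

$$f_t^j = \sigma\left(W_f x_t + U_f h_{t-1} + V_f c_{t-1}\right)^j, \qquad i_t^j = \sigma\left(W_i x_t + U_i h_{t-1} + V_i c_{t-1}\right)^j,$$

where $V_f$ and $V_i$ are diagonal matrices.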

If an LSTM cell detects an important feature from the input sequence at an early stage, it can easily carry that information (the presence of the feature) over long distances, capturing potential long-distance dependencies [17]. As shown in Figure 2, $i$, $f$, and $o$ are the input gate, forget gate, and output gate, respectively, and $c$ and $\tilde{c}$ represent the contents of the storage unit and the new storage content.
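
The gate computations described in this section can be sketched as a single time step in plain Python. This is an illustrative sketch rather than the implementation used in the experiments: it assumes gates of the form $\sigma(Wx_t + Uh_{t-1} + Vc)$ with diagonal $V$ stored as vectors, a candidate memory $\tanh(W_c x_t + U_c h_{t-1})$, and output $h_t = o_t \tanh(c_t)$. The parameter dictionary, its key names, and the random initialization are hypothetical.

```python
import math
import random

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, x, U, h):
    """Return W @ x + U @ h for matrices stored as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(w_row, x)) + sum(u * hi for u, hi in zip(u_row, h))
        for w_row, u_row in zip(W, U)
    ]

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step; the diagonal matrices V_f, V_i, V_o are stored as vectors."""
    n = len(h_prev)
    a_f = affine(p["W_f"], x, p["U_f"], h_prev)
    a_i = affine(p["W_i"], x, p["U_i"], h_prev)
    a_c = affine(p["W_c"], x, p["U_c"], h_prev)
    a_o = affine(p["W_o"], x, p["U_o"], h_prev)
    f = [sigmoid(a_f[j] + p["V_f"][j] * c_prev[j]) for j in range(n)]  # forget gate
    i = [sigmoid(a_i[j] + p["V_i"][j] * c_prev[j]) for j in range(n)]  # input gate
    c_new = [math.tanh(a_c[j]) for j in range(n)]                      # new content
    c = [f[j] * c_prev[j] + i[j] * c_new[j] for j in range(n)]         # memory update
    o = [sigmoid(a_o[j] + p["V_o"][j] * c[j]) for j in range(n)]       # output gate
    h = [o[j] * math.tanh(c[j]) for j in range(n)]                     # activation
    return h, c

def init_params(n_in, n_hid, seed=0):
    """Hypothetical small random initialization, for illustration only."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for g in "fico":
        p[f"W_{g}"] = mat(n_hid, n_in)
        p[f"U_{g}"] = mat(n_hid, n_hid)
    for g in "fio":
        p[f"V_{g}"] = [rng.uniform(-0.1, 0.1) for _ in range(n_hid)]
    return p

params = init_params(n_in=3, n_hid=2)
h, c = [0.0, 0.0], [0.0, 0.0]
for x in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
    h, c = lstm_step(x, h, c, params)
```

Because the memory update is additive, a forget gate close to 1 lets the stored content pass through many steps almost unchanged, which is how the cell carries early features over long distances.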

2.3. Word2vec Model

The CBOW model predicts the current central word from the context of that word, and the CBOW model structure is shown in Figure 3.

The CBOW model is broadly divided into three layers, the input layer, the projection layer, and the output layer, and does not contain a nonlinear hidden layer [18]. As a result, the amount of calculation of the model is greatly reduced, and the calculation efficiency is further improved.

For the input layer, the $c$ words before the center word $w$ and the $c$ words after it are selected, and these $2c$ word vectors serve as the input vectors of the input layer, from which the conditional probability of occurrence of the center word $w$ is predicted. As shown in Figure 4, the central word is “learning”, and the four words before and after it are the input vectors of the input layer of the model, all of which are real-valued vectors of the same dimension.

The function of the projection layer is to project the vectors entered by the input layer: the real-valued vectors from the input layer are copied to the projection layer and then summed:

$$x_w = \sum_{i=1}^{2c} v\left(\mathrm{Context}(w)_i\right).$$
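
A minimal sketch of the CBOW forward pass described above is given below. It assumes that the context vectors are summed element-wise and that the output layer is a plain softmax over the vocabulary; the vocabulary size, vector values, and function names are hypothetical.

```python
import math

def project(context_vectors):
    """Projection layer: element-wise sum of the 2c context word vectors."""
    return [sum(col) for col in zip(*context_vectors)]

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def cbow_predict(context_ids, in_vecs, out_vecs):
    """Return P(center word | context) for every word in the vocabulary."""
    x_w = project([in_vecs[i] for i in context_ids])
    scores = [sum(a * b for a, b in zip(u, x_w)) for u in out_vecs]
    return softmax(scores)

# Hypothetical 5-word vocabulary with 2-dimensional vectors and window c = 2.
in_vecs = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1], [-0.2, 0.4], [0.2, 0.2]]
out_vecs = [[0.2, 0.1], [0.1, 0.3], [-0.1, 0.2], [0.4, 0.0], [0.0, 0.1]]
probs = cbow_predict([0, 1, 3, 4], in_vecs, out_vecs)  # predict word 2
```

The softmax in the last step touches every vocabulary entry, which is why the output layer dominates the cost even though there is no hidden layer.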

The skip-gram model predicts the context words of the current central word based on that central word. The model structure of skip-gram is shown in Figure 5.

Given a central word $w_c$, this model gives the conditional probability of occurrence of a context word $w_o$ around that central word:

$$P(w_o \mid w_c) = \frac{\exp\left(u_o^{\top} v_c\right)}{\sum_{i \in V} \exp\left(u_i^{\top} v_c\right)},$$

where $v_c$ is the central word vector, $u_o$ is the background word vector, and $V$ is the vocabulary [7].
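
The skip-gram conditional probability can be illustrated with a short sketch, assuming the usual softmax over dot products between the central word vector and every background (context) word vector. The vectors and the function name are hypothetical.

```python
import math

def skipgram_prob(center_id, context_id, center_vecs, context_vecs):
    """P(context word | central word) as a softmax over dot products."""
    v_c = center_vecs[center_id]
    scores = [sum(a * b for a, b in zip(u, v_c)) for u in context_vecs]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    return exps[context_id] / sum(exps)

# Hypothetical 4-word vocabulary with 2-dimensional vectors.
center_vecs = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2], [0.0, 0.6]]
context_vecs = [[0.3, 0.1], [-0.2, 0.4], [0.2, 0.2], [0.1, -0.3]]
dist = [skipgram_prob(1, o, center_vecs, context_vecs) for o in range(4)]
```

For a fixed central word, the probabilities over all candidate context words sum to one.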

3. Multidimensional Public Opinion Analysis Algorithm Based on Deep Learning

3.1. Methods of Improvement

Although the CBOW model selected in this article improves the running speed by removing the hidden layer, the output layer still retains softmax, so the actual time consumption remains heavy. To address this, the model is optimized with high-frequency word subsampling and negative sampling.

When CBOW is trained on high-frequency words, such words appear in the context of almost every word because of their high frequency, yet they carry little additional semantic information, and the number of their samples in the text far exceeds the number of training samples required for training their word vectors, which greatly reduces the efficiency of the model. In view of this, high-frequency words with little discriminative value are subsampled to reduce the number of training samples, and the sampling rate is calculated as follows:

$$P(w_i) = \left(\sqrt{\frac{z(w_i)}{t}} + 1\right) \cdot \frac{t}{z(w_i)},$$

where $w_i$ represents a word present in the sample data, $z(w_i)$ is the probability that the word appears in the entire sample data set, $P(w_i)$ is the probability that the word is retained, and $t$ is the threshold for the sampling of high-frequency words, whose general default value is 0.001. The value of $t$ controls how strongly the retention probabilities of words with different frequencies differ, and the more frequent a word is, the greater the probability that it is deleted [19].

The sampling rate is shown in Figure 6, where the horizontal axis represents the probability with which a word appears in the data set and the vertical axis represents the probability that the word is retained.
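
The effect of subsampling can be sketched as follows. The retention formula $\big(\sqrt{z/t} + 1\big) \cdot t / z$, clipped to 1, is the one used in the original word2vec C implementation with a default threshold of 0.001; treating it as the formula meant in this section is an assumption. The toy token list and function names are hypothetical.

```python
import math
import random
from collections import Counter

def keep_probability(z: float, t: float = 0.001) -> float:
    """Probability of keeping a word whose relative frequency is z."""
    if z <= 0.0:
        return 1.0
    return min(1.0, (math.sqrt(z / t) + 1.0) * t / z)

def subsample(tokens, t=0.001, seed=0):
    """Randomly drop occurrences of frequent words from a token list."""
    rng = random.Random(seed)
    counts = Counter(tokens)
    total = len(tokens)
    return [w for w in tokens if rng.random() < keep_probability(counts[w] / total, t)]

# Hypothetical corpus dominated by one very frequent word.
tokens = ["the"] * 900 + ["campus"] * 60 + ["forum"] * 40
kept = subsample(tokens)
print(len(tokens), len(kept))
```

Words at or below the threshold frequency are always kept, while the retention probability falls steadily as the frequency grows.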

3.2. Network Training Process

Training adjusts the network parameters according to the input features so that the predicted labels agree with the true labels. This paper takes the multilayer perceptron (MLP) model, shown in Figure 7, as an example.

The training can be divided into two major steps: the forward propagation phase and the back propagation phase [20].

The forward propagation phase is as follows: the MLP network computes the predicted output $\hat{y}$ from the input training sample $x$ and the current network parameters (weights $W$ and biases $b$).

The back propagation phase is as follows: the MLP network calculates the error between the predicted output and the label $y$ of the input sample, $e = \hat{y} - y$, and evaluates the corresponding cost function based on this:

$$J(W, b) = \frac{1}{2} \lVert e \rVert^{2}.$$

The cost function is designed according to the norm of the error, where

$$\lVert e \rVert^{2} = \sum_{k} \left(\hat{y}_k - y_k\right)^{2}.$$

Then, the parameters are updated by optimization so that the cost decreases. Gradient descent is generally used for the optimization, and the corresponding mathematical relationship is

$$W \leftarrow W - \eta \frac{\partial J}{\partial W}, \qquad b \leftarrow b - \eta \frac{\partial J}{\partial b}.$$

Within this relationship, $\eta$ refers to the learning rate parameter, the main function of which is to control the intensity of the error propagation [21]. The specific training process is shown in Figure 8.
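
The two phases can be sketched with a very small MLP trained by full-batch gradient descent. This is an illustrative sketch under stated assumptions, not the network used in the experiments: one tanh hidden layer, a linear output, the squared-error cost $\tfrac{1}{2}(\hat{y} - y)^2$ averaged over the samples, and a hypothetical toy data set. All names are chosen for illustration.

```python
import math
import random

def forward(x, p):
    """Forward propagation: hidden activations and predicted output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(p["W1"], p["b1"])]
    y_hat = sum(w * hj for w, hj in zip(p["W2"], h)) + p["b2"]
    return h, y_hat

def cost(data, p):
    """Mean of 0.5 * (y_hat - y)^2 over the training samples."""
    return sum(0.5 * (forward(x, p)[1] - y) ** 2 for x, y in data) / len(data)

def train_step(data, p, eta):
    """Back propagation plus one update W <- W - eta * dJ/dW."""
    m, d = len(p["W1"]), len(p["W1"][0])
    g_w1 = [[0.0] * d for _ in range(m)]
    g_b1 = [0.0] * m
    g_w2 = [0.0] * m
    g_b2 = 0.0
    for x, y in data:
        h, y_hat = forward(x, p)
        err = (y_hat - y) / len(data)  # dJ/dy_hat for this sample
        g_b2 += err
        for j in range(m):
            g_w2[j] += err * h[j]
            dz = err * p["W2"][j] * (1.0 - h[j] ** 2)  # back through tanh
            g_b1[j] += dz
            for k in range(d):
                g_w1[j][k] += dz * x[k]
    for j in range(m):
        p["W2"][j] -= eta * g_w2[j]
        p["b1"][j] -= eta * g_b1[j]
        for k in range(d):
            p["W1"][j][k] -= eta * g_w1[j][k]
    p["b2"] -= eta * g_b2

rng = random.Random(0)
params = {
    "W1": [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)],
    "b1": [0.0] * 3,
    "W2": [rng.uniform(-0.5, 0.5) for _ in range(3)],
    "b2": 0.0,
}
# Hypothetical toy data: two input features with labels in {0, 1}.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
initial_cost = cost(data, params)
for _ in range(200):
    train_step(data, params, eta=0.1)
final_cost = cost(data, params)
print(initial_cost, final_cost)
```

A larger learning rate moves the parameters further per step but can overshoot, so in practice it is tuned together with the number of iterations.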

3.3. Evaluation Model

Evaluation criteria mainly include accuracy, precision, recall, and F1 value. Accuracy refers to the ratio of correctly predicted samples to the total number of samples; precision refers to the proportion of samples judged to be positive that are actually positive; and recall refers to the proportion of actually positive samples that are correctly judged to be positive [22].

First, binary classification is analyzed, and the confusion table of the binary classification problem is constructed as shown in Table 1.

TP represents the number of texts that are actually positive samples and are correctly identified as positive.

FP represents the number of texts that are actually negative samples but are incorrectly judged as positive.

TN represents the number of texts that are actually negative samples and are correctly identified as negative.

FN represents the number of texts that are actually positive samples but are incorrectly judged as negative.

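Thus, we get the calculation formulas for the classification evaluation criteria.

Accuracy is as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}.$$

Positive-example precision is as follows:

$$P_{\mathrm{pos}} = \frac{TP}{TP + FP}.$$

Negative-example precision is as follows:

$$P_{\mathrm{neg}} = \frac{TN}{TN + FN}.$$

Positive-example recall rate is as follows:

$$R_{\mathrm{pos}} = \frac{TP}{TP + FN}.$$

Negative-example recall rate is as follows:

$$R_{\mathrm{neg}} = \frac{TN}{TN + FP}.$$

Positive-example F1 value is as follows:

$$F1_{\mathrm{pos}} = \frac{2\, P_{\mathrm{pos}}\, R_{\mathrm{pos}}}{P_{\mathrm{pos}} + R_{\mathrm{pos}}}.$$

Negative-example F1 value is as follows:

$$F1_{\mathrm{neg}} = \frac{2\, P_{\mathrm{neg}}\, R_{\mathrm{neg}}}{P_{\mathrm{neg}} + R_{\mathrm{neg}}}.$$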

The calculation for ternary or multiclass classification is similar to the binary classification problem: the evaluation indexes are calculated separately for each category, with the category under consideration regarded as the positive example and all other categories regarded as negative examples, and the precision, recall rate, and F1 value are then calculated for that category [23].
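
The evaluation criteria can be sketched directly from the four counts TP, FP, TN, and FN, and the one-vs-rest treatment of multiclass problems reduces to computing those counts per category. The function names, dictionary keys, and example labels below are hypothetical; ratios with a zero denominator are reported as 0.0 by assumption.

```python
def _ratio(num, den):
    """Safe division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def binary_metrics(tp, fp, tn, fn):
    """Accuracy plus precision, recall, and F1 for both classes."""
    p_pos, r_pos = _ratio(tp, tp + fp), _ratio(tp, tp + fn)
    p_neg, r_neg = _ratio(tn, tn + fn), _ratio(tn, tn + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision_pos": p_pos,
        "recall_pos": r_pos,
        "f1_pos": _ratio(2 * p_pos * r_pos, p_pos + r_pos),
        "precision_neg": p_neg,
        "recall_neg": r_neg,
        "f1_neg": _ratio(2 * p_neg * r_neg, p_neg + r_neg),
    }

def one_vs_rest_counts(y_true, y_pred, label):
    """Treat `label` as positive and every other category as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    return tp, fp, tn, fn

# Hypothetical three-class example.
y_true = ["pos", "neg", "neu", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "pos", "pos", "neu", "neu"]
pos_metrics = binary_metrics(*one_vs_rest_counts(y_true, y_pred, "pos"))
```

Repeating the last line for each category gives the per-category precision, recall rate, and F1 value.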

4. Experimental Simulation

4.1. Experimental Data

The training data in this paper come from existing campus network public opinion data that have been manually classified, consisting of 49584 positive and 49584 negative samples, for a total of 99168 training texts. Following the common machine learning practice of dividing the data into a training set and a validation set at a ratio of 7 : 3, the training set has 69417 samples and the validation set has 29751 samples.
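
As a quick check of the split sizes, the following sketch divides a corpus of 99168 samples at a ratio of 7 : 3 using integer arithmetic. The function names and the shuffling seed are hypothetical, and rounding the training share down is an assumption.

```python
import random

def split_sizes(total: int, train_parts: int = 7, val_parts: int = 3):
    """Sizes of the training and validation sets, rounding the first down."""
    train = total * train_parts // (train_parts + val_parts)
    return train, total - train

def train_val_split(samples, train_parts=7, val_parts=3, seed=0):
    """Shuffle the samples and cut them at the training-set size."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train, _ = split_sizes(len(shuffled), train_parts, val_parts)
    return shuffled[:n_train], shuffled[n_train:]

print(split_sizes(99168))  # (69417, 29751)
```

Shuffling before the cut keeps the positive and negative samples mixed in both sets.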

4.2. Experimental Comparison Results

Table 2 summarizes the experiments performed on five different models, from which comparative plots of the five evaluation indicators are produced. On the two comprehensive indicators, accuracy and F1-measure, the BGRU-Attention model reached an accuracy of 93.51% and an F1-measure of 93.54%, both better than the other models. BGRU can effectively learn the contextual features of the text, while CapsNet can extract richer text information and improve the expressive power of the text representation, and the combination of the two greatly improves the accuracy of emotion classification.

As can be seen from Figure 9, the accuracy of the five models increases steadily and exceeds 88% after the third iteration; the BGRU-CapsNet model exceeds 90%, and its trend is stable, whereas the other models fluctuate considerably.

As can be seen from Figure 10, the loss rate of BGRU-CapsNet decreases fastest, indicating that the model converges very quickly. The loss rate of the BGRU-CNN-attention model was 0.2640 at the first iteration and reached its lowest value of 0.2064 at the 5th iteration, with a stable trend, indicating that the model is relatively stable.

Figure 11 shows an analysis of the time cost of the models. BGRU-CapsNet achieved good text classification results, but its time cost was the largest of these models, reaching 139 s/epoch.

As seen from Figure 12, the classification method using the BGRU-CapsNet model performs better than the comparison models: the experimental results show that BGRU-CapsNet achieved better accuracy, precision, recall, and F1 values.

4.3. Public Opinion Analysis of Campus Comments

Using the optimization method based on deep learning, public opinion was analyzed for comments made by college students of grade 2019.

4.3.1. College Students Have a Strong Sense of Justice

In the survey of college students’ online comments, the vast majority of students were found to show a sense of justice and rationality, to value equal participation and freedom of speech, and to hold independent views of events; herd mentality is not obvious. Only a few students post speculative comments in order to attract attention, as shown in Figure 13.

4.3.2. College Students Pay Wide Attention to Online Public Opinion

In the survey of the types of online public events that college students follow, the students surveyed showed high attention to domestic hot events, as well as to major international events, campus focus issues, and matters related to themselves, which fully shows the diversified perspectives and broad scope of attention of college students in China, as shown in Figure 14.

4.3.3. Falsehood and Emotionality Are Evident in College Students’ Online Comments

Because online information is abundant and updated quickly, college students always want to obtain the maximum amount of information in the shortest time and lack sufficient time and space for careful thought. With immature thinking, or by blindly following others’ points of view, they make immature judgments, which leads to false information and emotional reactions, as shown in Table 3.

4.3.4. A Large Amount of Bad Information and More Negative Content

The virtual, permeable, arbitrary, open, and concealed nature of Internet communication platforms allows them to accommodate views of uneven quality. In addition, college students are in a critical period of physical and mental development, and their ideas are easily affected by their surroundings, so most college students are disturbed and affected by the bad information they encounter in daily online communication and information search, as shown in Figure 15.

4.3.5. Online Public Opinion Groups in Colleges and Universities Are Prone to Polarization

The development of college students’ online public opinion is an evolving process. Under the stimulus of external emergencies, college students’ sense of justice, compassion, social responsibility, and national consciousness is fully aroused. To uphold social morality and national honor, they often express their views emotionally, and these emotional expressions, amplified according to their own influence and that of the organizations or groups they belong to, give rise to the polarization of online public opinion. As shown in Figure 16, the biggest hidden danger of public opinion among college students is that it can cause mass events.

Using the optimization method based on deep learning proposed in this paper, we analyzed the public opinion in comments made by the students of grade 2019 in a university and drew analysis charts covering five aspects: college students’ strong sense of justice in commenting, their wide attention to online public opinion, the evident falsehood and emotionality of their online comments, the large amount of bad information and negative content, and the tendency of college online public opinion groups to polarize.

5. Conclusion

With the popularization of the Internet, the network not only brings convenience to students but also poses potential harm to social stability.

This paper focuses on text emotion analysis, the main module of public opinion analysis. The deep learning model and method are improved and verified experimentally on the data set from different experimental perspectives and compared with multiple traditional methods on four indicators, accuracy, recall, F1 value, and loss rate, to verify the effectiveness and feasibility of the proposed method. The experiments achieved good results. However, with the development of the times and the network, society will place more demands on online public opinion systems. This article also has the following shortcomings: (1) More datasets are needed to verify the performance of the model. (2) Introducing BERT word vectors, which capture word order relationships, into the model should be considered in future work. (3) Unsupervised or supervised methods can be adopted in follow-up research to improve the performance of public opinion management. (4) The configuration of the computer and the computing environment should be appropriately improved in order to improve the accuracy of public opinion analysis.

Data Availability

The experimental data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding this work.

Acknowledgments

This work was sponsored in part by the Social Administration Department of Jiangsu Provincial Department of Education (2021SJB0992).