Abstract

Sina Microblog, China’s most popular social media platform, has a massive amount of data and users, and the prediction of microblog forwarding has become a hot research topic. Most current research on microblog forwarding relies on traditional models that train only on selected data properties or statistical features such as term frequency-inverse document frequency (TF-IDF) and fails to extract semantic-level information from microblogs. Likewise, existing work on popularity evolution analysis and forecasting predicts only the popularity at a particular moment or the final popularity, and analyzes the influence of different factors only on that final value, without considering the roles those factors play at different evolutionary stages; the evolution of popularity therefore remains insufficiently understood. This article proposes a three-dimensional feature model, using average, trend, and cycle to fit and predict popularity, together with a popularity prediction method based on deep learning. We analyzed the popularity evolution of hot topics, defined a 3D feature model over average, trend, and cycle, and built a time-series model on this 3D feature model to fit the popularity evolution of hot topics and forecast their short-term popularity; compared with the SpikeM and SH models, the proposed model achieves a better fit and greater accuracy. By analyzing and quantifying the influencing factors in the three key stages of popularity evolution, outbreak, peak, and decline, a popularity prediction model based on a deep neural network is proposed, the different effects of each influencing factor during the evolution process are analyzed in detail, and the active period of popularity is predicted. Compared with the SpikeM and SVR models, the proposed technique performs better in both predictability and timeliness.

1. Introduction

With the rapid development of the Internet, microblog has become an important way to spread and obtain information. As a product of modern science and technology, it is changing people’s thinking and lifestyle every moment. A large number of microblogs are generated every day, along with forwarding, likes, comments, and other interaction data, so the amount of data produced daily is enormous [1]. Microblog has grown in importance as a platform for entertainment, social engagement, and information dissemination thanks to its open information platform and ease of use. Used reasonably, microblog data make it possible to build a sound social networking system and effectively predict user interaction; this not only supports a reasonable analysis of how information propagates on microblog but can also be applied to personalized recommendation and user sentiment analysis [2–5], benefiting people’s lives, business development, and government decision-making.

In addition, with the massive growth of Internet users, the main arena of public opinion monitoring has gradually shifted to the Internet. The spread of information in the network has an increasing influence on the security of the network ecological environment and the direction of public opinion. As a very active form of information dissemination, microblog has penetrated people’s daily life: every day, a large number of netizens express their opinions and spread their ideas through it. Hot topics easily form public opinion, and that opinion carries considerable power. Microblogs invite interaction, so an event readily attracts netizens’ attention and forms a large circle of communication; such “resonance” can produce public opinion and a “voice of the people” with strong social influence, which the relevant regulatory authorities cannot ignore. However, a microblog post is an everyday Internet event that appears without warning, and it is impossible, and exhausting, for a supervision department to inspect every microblog. At the same time, the free publication of comments and the fast transmission of information in the network pose a great challenge to monitoring overall public opinion online. Although government departments can take regulatory measures to supervise the spread of online information, for such a huge amount of information these measures face great challenges in both implementation and effectiveness. Therefore, a model that can accurately predict the future spreading scale of news in social media would greatly help protect network ecological security and implement regulatory measures.

Microblog platforms provide a wealth of data. According to the different purposes of prediction, different data can be selected and applied. For the prediction model of microblog forwarding, there are many research teams at home and abroad.

Li analyzed the factors a blogger considers when forwarding a microblog and summarized them as follows: the relationship between the blogger and the user, the influence of the microblog content itself, and the user’s own content-topic preference [6]. These three factors are analyzed to predict microblog forwarding; Latent Dirichlet Allocation (LDA), a statistical topic model, is used for feature extraction, and a Support Vector Machine (SVM) is used for the final prediction. The experimental results show that this method reaches a final accuracy of 84.6%.

Liu et al. proposed a recognition method based on the user’s active period and time window, together with a user interest calculation model based on event attenuation, for predicting microblog forwarding. In addition to the original features, user behavior characteristics such as forwarding rate and interaction frequency are introduced during feature extraction. Finally, by combining these features, a microblog forwarding prediction model based on user behavior characteristics is developed.

With the rapid development of neural networks in recent years, more and more models use neural networks for feature extraction and regression prediction [7–10]. In 2015, Deng et al. proposed a model for predicting microblog forwarding volume using a backpropagation (BP) neural network. They analyzed the factors affecting forwarding from the perspectives of the poster and the microblog content and transformed the regression problem of forwarding prediction into a classification problem. The BP neural network was used to predict microblog forwarding volume under emergencies, and some valuable results were obtained.

The artificial neural network is a machine learning model that imitates the structure of biological neural networks [11]. Compared with traditional machine learning models, it has learning ability and can model nonlinear, complex relationships, so it has been widely used. However, most real-world data exist in serialized form, and because transmission between the layers of an artificial neural network is one-way, it cannot exploit the contextual information before and after a data point and therefore hits a bottleneck in sequence prediction. It was not until 1986 that Elman et al. proposed the recurrent neural network (RNN), which adds a feedback mechanism to the hidden layer and can process sequence data [12–15]. In 1998, Williams and Zipser proposed the backpropagation through time (BPTT) training algorithm for recurrent neural networks, which updates the network parameters by backward error propagation [16–20]. The recurrent neural network broke the limitations of traditional artificial intelligence methods on serialization problems, so it has attracted many researchers’ attention and achieved great results in speech recognition, computer vision, natural language processing, and other areas.

Foreign researchers started their research on recurrent neural networks earlier and made great achievements. In 1997, Hochreiter and Schmidhuber proposed the Long Short-Term Memory (LSTM) model, which adds memory units to the recurrent neural network to solve the vanishing gradient problem. In 2013, Google released the Word2vec tool, which transforms words into spatial vector representations. Mikolov et al. used the log-bilinear model to construct a text sentiment analysis model based on a recurrent neural network, which can use context information and achieves good analysis results. Socher et al. proposed the matrix-vector RNN model, which learns the meaning of operators by attaching matrices that record how a central word combines with its neighbors. Mohamad et al. studied the stability of recurrent neural networks and obtained sufficient conditions for exponential stability. The structure of the recurrent neural network has also been improved with further research: Schuster proposed the bidirectional RNN structure, Graves et al. proposed a bidirectional LSTM structure in 2005, and Tai et al. proposed a tree-structured LSTM for text emotion judgment in 2015.

This paper summarizes the advantages and disadvantages of current microblog forwarding prediction models, compares various methods, and establishes the main task of this paper. By analyzing the problems and shortcomings of current models, a prediction method combining a traditional model with a depth model is proposed. Finally, a prediction model for the numbers of likes, retweets, and comments of a microblog is established.

Through consulting materials and reading papers, the basic influencing factors of microblog forwarding prediction were identified. Through visual analysis of the microblog data set, this paper compares the numbers of retweets, comments, and likes across different users and puts forward the concept of “hot users.” In the research and training of traditional models, the effects of each model are compared, and measures such as random forest are used for feature screening. LSTM is used as the depth model to extract semantic features of microblogs, and an LSTM model with a self-attention mechanism is proposed to improve the effectiveness of the depth model. Finally, in the fusion stage, the output of the traditional model and the features of the depth model are fitted again by a neural network, so that the final model takes into account both the user and the basic features of the microblog while also understanding the microblog content. The experimental results show that this design is very helpful to the final prediction result.

The arrangement of this paper is as follows: Section 2 presents the deep learning model; Section 3 discusses the RNN feature extraction network with an attention mechanism; Section 4 presents the experimental analysis and introduces the experiment in detail; Section 5 concludes the paper.

2. Deep Learning Model

The evolution of popularity is not a process of simple accumulation and superposition of social users’ attention to online information, but a composite process of group attention focus, emergencies, and information dissemination. The evolution process of popularity reflects the latest development of hot social events and the spread trend in the network, which can tell the attention level of the network group to the event. The research on the popularity of hot topics can promote the understanding of hot topic propagation and provide important enlightenment and help to the trend analysis and tracking of hot topic propagation.

Based on the Tianya hot topics data set and the Twitter Hashtag data set, this chapter studies the evolution analysis and prediction of the popularity of hot topics. First, an empirical analysis of popularity evolution is made on the Tianya hot topics data set, and the characteristics and laws of that evolution are identified. Then, the influencing factors in the evolution process are analyzed stage by stage and quantified, and average, trend, and cycle are proposed to build the evolution model with which the popularity of hot topics is analyzed and predicted. Finally, a popularity prediction method based on deep learning is proposed to predict popularity and to analyze the influence of each factor at different stages of the evolution.

2.1. Popularity Analysis of Hot Topics

This paper divides hot topics into nonemergent hot topics and emergent hot topics. Nonemergent hot topics tend to be long-term and periodic, such as the Spring Festival and Christmas, repeating at intervals of a year or several years. Emergent hot topics are usually triggered by breaking news, real events, malicious rumors, and so on. For nonemergent hot topics, we often have reasonable expectations and can deal with them better, whereas emergent hot topics often have a great impact on society in the short term. Therefore, this paper takes emergent hot topics as the research object.

2.2. Depth Model

The traditional model has modeled the extracted features and expresses the likes, retweets, and comments of a microblog in a low dimension. However, if only the traditional model is used, the accuracy of prediction drops greatly. In addition, the numbers of likes, retweets, and comments of a microblog are related not only to the characteristics of the microblog owner but also to the content itself: if the content @-mentions others or is clearly emotive, the likelihood of transmission increases accordingly. Therefore, if the content can be input into the model, the model can treat different microblog content differently, improving the prediction effect. In previous prediction models, the Latent Dirichlet Allocation (LDA) topic model, TF-IDF, and similar methods can extract the main features of the text, but these features either carry the emotional coloring of their developers to some extent or are extracted purely by statistical methods. Fundamentally, such a model does not understand the text content of the microblog; it only learns from high-dimensional features. In recent years, with the development of deep neural networks, machine learning has greatly improved the processing of images, speech, and text. Therefore, to enable the model to predict by understanding the microblog content, a deep neural network is established for feature extraction of microblog content. In this module, the text of each microblog is segmented into words and then input into the neural network. To obtain a good representation, dropout and batch normalization layers are added to the deep neural network to prevent overfitting and speed up learning.

2.2.1. Word Segmentation and Word Vector Training

Natural language processing is an intersection of computer science, artificial intelligence, and linguistics. Its goal is to process or understand natural language so as to help us do something useful. At present, natural language processing covers the most basic word segmentation, named entity recognition, word vector training, and other fields. As can be seen from the above, there are many statements in the microblog data set, and in the modeling process we need the computer to “parse” each sentence, which requires modeling the data set. In traditional models, natural language modeling must rely on statistical knowledge; commonly used methods include TF-IDF and the hidden Markov model. These methods add a great deal of artificial knowledge during modeling, and the quality of feature extraction directly affects the final result. Deep learning has long led the field of abstract representation, and we can use its power for natural language modeling. The most important step in the modeling process is the training of word vectors. Before constructing the word vector model, each statement needs to be processed by word segmentation, the most basic operation in text processing; its purpose is to divide long text into meaningful, easy-to-understand words and phrases. In this paper, jieba, a Python library with good performance on Chinese word segmentation, is used to perform initial segmentation of microblog content. After the initial segmentation, the results need to be screened to remove useless stop words such as “de,” “de,” and “tai,” finally yielding the required results.
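The segmentation-and-filtering step described above can be sketched as follows. A minimal sketch: in practice jieba performs the initial Chinese segmentation (e.g. with jieba.lcut), so here the segmented tokens are supplied directly, and the stop-word set and example sentence are illustrative.

```python
# Sketch of the word segmentation + stop-word filtering pipeline.
# In the paper, jieba.lcut(text) would produce the token list; here the
# tokens are given directly so the filtering logic is self-contained.

STOP_WORDS = {"的", "地", "太"}  # illustrative stop words ("de", "de", "tai")

def filter_tokens(tokens):
    """Remove stop words and empty tokens from a segmentation result."""
    return [t for t in tokens if t.strip() and t not in STOP_WORDS]

# tokens as jieba might return them for a short microblog sentence
tokens = ["今天", "的", "天气", "太", "好", "了"]
print(filter_tokens(tokens))  # → ['今天', '天气', '好', '了']
```

The filtered token list is what feeds the word vector training described next.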

After word segmentation, word vector training is performed on the obtained results. In this paper, each word is expressed as a 300-dimensional vector, which measures the distance between words and is obtained through statistical, unsupervised methods. The word vectors are also used to initialize the embedding layer of the subsequent neural network, where they are further optimized during the iterative training of the network.
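The idea of word vectors “measuring the distance between words” can be illustrated with cosine similarity. A hedged sketch: in the paper the 300-dimensional vectors come from Word2vec-style unsupervised training, whereas the toy 3-dimensional vectors below are illustrative, not trained values.

```python
import math

# Cosine similarity between word vectors: vectors pointing the same way
# score close to 1 (similar words), orthogonal vectors score 0.

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# illustrative stand-ins for two trained word vectors
print(cosine_similarity([1.0, 2.0, 0.0], [2.0, 4.0, 0.0]))  # → 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # → 0.0
```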

2.2.2. Embedding Layer

If text content is input directly into a machine learning model, the model cannot process it: the computer recognizes only numeric data, so pure string data must be converted into numeric input. Word vector training is realized through the embedding layer of a deep neural network. The embedding layer is a special layer whose main function is to extend dimensional expression. For text processing, the neural network takes the numeric representation of the text as input, which is mapped into one-hot encoding and fed into the embedding layer; every position encoded as 0 in the one-hot vector contributes 0 to the output, so by the one-hot property the final result is the word vector stored in the embedding layer at the row given by the word’s identity (ID).

First of all, the word-level representation of the text obtained in the previous step is converted into a numerical representation. Here, we count all the words and number them; that is, each word has a unique ID. We then map the word-level representation of the text to the numerical representation.

The numerical representation of the text content is the input of the neural network, and each word is expressed as a 300-dimensional vector by the embedding mapping. Random initialization can be used for the embedding weights, but then the model iterates slowly; therefore, we take the pretrained word vectors as the initialization weights to accelerate the convergence of the model. Note that each row of the embedding matrix represents the weight of one word, as shown in Figure 1, so during initialization it suffices to map each word’s ID to the corresponding row of the embedding weights to achieve the pretraining effect.

After representing each sentence with numbers, we input each number into the embedding layer of the neural network in one-hot form. The initial weight of the embedding layer is a random matrix whose number of rows equals the number of words; in other words, each word corresponds to one row vector. The number of columns is a hyperparameter and gives the dimension in which the embedding expresses words: if the dimension is too large, word vector training is hard to converge, while if it is too small, the network cannot accurately express each word; generally, 30 or 50 dimensions are selected. When one-hot encoded data enter the network, the network finds the corresponding row vector according to the position of the “1” in the code; for example, if the “1” is in the fourth position, the word vector in the fourth row expresses the word. If the input is a sentence, the whole sentence is expressed as the corresponding word vectors in the same way, and the resulting vectors are passed into the subsequent neural network for training.
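The ID-to-row lookup and pretrained initialization described above can be sketched as follows. The vocabulary, dimension, and the pretrained vector are illustrative assumptions, not the paper’s actual values.

```python
import numpy as np

# Sketch of embedding lookup: each word ID indexes one row of the
# embedding matrix, and copying a pretrained vector into a row gives
# the pretraining effect described in the text.

DIM = 4  # small for illustration; the paper uses 300
word_to_id = {"微博": 0, "转发": 1, "评论": 2}

rng = np.random.default_rng(0)
embedding = rng.normal(size=(len(word_to_id), DIM))  # random initialization

# copy a (hypothetical) pretrained vector into the row for "转发"
pretrained_vec = np.ones(DIM)
embedding[word_to_id["转发"]] = pretrained_vec

def lookup(word):
    # a one-hot input with "1" at position i selects row i of the matrix
    return embedding[word_to_id[word]]

print(lookup("转发"))  # → [1. 1. 1. 1.]
```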

To some extent, word vector training is a by-product of the neural network: in many end-to-end tasks, the word vectors are not the target the model needs to obtain. In addition, to accelerate training, pretrained word vector matrices are often used to initialize the embedding weights and are then fine-tuned on one’s own data set, so that accurate word vector expressions can still be obtained even when the data set is relatively small.

2.2.3. Feature Extraction by RNN

After the embedding mapping is completed, the input is expressed as word vectors, and the features of these word vectors then need to be extracted by a deep neural network. The actual structure of the network is shown in Figure 1.

Here, LSTM units in the RNN are used for feature extraction, mainly to extract semantic features. Because the network needs to be combined with the traditional model at the end of training, the microblog content from the same data set as the traditional model is used for training, and errors need to be calculated during backpropagation. Because this is a regression problem, we use the most common measure, mean square error: the average of the squared differences between the predicted data and the original data at corresponding points, calculated as MSE = (1/n) * sum over i of (y_hat_i − y_i)^2.

Because the model needs to predict the numbers of retweets, likes, and comments of a microblog, the mean square error of each of the three indicators needs to be computed when calculating the error.

3. RNN Feature Extraction Network with an Attention Mechanism

The traditional model and the depth model are still two independent modules, so the depth model cannot yet adjust the traditional model. In addition, although the depth model can extract microblog semantic features, it has no “attention” when processing long sequence data. Aiming at these two problems, this chapter first designs a deep network with a self-attention mechanism and then uses a neural network to fit the traditional model and the deep model together, completing the construction of the final model.

3.1. Deep Network Design with a Self-Attention Mechanism

In the prediction research of microblog forwarding, deep learning is needed to extract the semantic features of microblog. However, in the process of extraction, the RNN network cannot assign different weights to different data in the face of long sequence data. This is a bottleneck for deep model extraction, so it is necessary to design a deep network with an “attention” mechanism for feature extraction.

3.2. Limitations of the Attentional Mechanism

In order to make the model better extract the semantic features of microblogs, we need to add the attention mechanism to the RNN of the deep neural network. As mentioned in Chapter 2, attention is the mechanism by which similar inputs to a model receive different weights. The standard attention mechanism is generally defined as follows: (1) given a set of values and a query vector, the attention mechanism computes a weighted sum of the values according to the query; (2) the focus of attention is how to calculate the “weight” of each value in the set; (3) this mechanism is sometimes described as the query attending to the values, focusing on different parts of the text.

Variants of attention mainly take two forms: innovation in how the weighted sum of the attention vector is computed, and innovation in how the attention score (matching degree or weight value) is computed. Note that the basic attention mechanism requires a task to have both “source” and “target” concepts. In the Seq2Seq network, “source” and “target” correspond to the input and output sequences; with both concepts available, attention can calculate the similarity between inputs and outputs and proceed from there. In this research, however, there is no “target”: the content of the microblog can serve as the “source” input of the model, but the similarity between input and output cannot be calculated, so the standard attention mechanism cannot be used directly and its calculation mode needs to be changed.
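One way around the missing “target,” sketched below, is to derive the query from the inputs themselves (self-attention). This is a minimal illustration, not the paper’s exact scoring function: the mean of the hidden states stands in for a learned query vector.

```python
import math

# Self-attention pooling without a target sequence: each hidden state is
# scored against a query derived from the sequence itself, the scores are
# softmax-normalized into weights, and the weighted sum is returned.

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention_pool(hidden_states):
    dim = len(hidden_states[0])
    # simple stand-in for a learned query: the mean hidden state
    query = [sum(h[d] for h in hidden_states) / len(hidden_states) for d in range(dim)]
    scores = [sum(q * hd for q, hd in zip(query, h)) for h in hidden_states]
    weights = softmax(scores)
    return [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]

print(self_attention_pool([[1.0, 0.0], [0.0, 1.0]]))  # → [0.5, 0.5]
```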

3.3. Prevalence Prediction Based on Deep Learning

Most existing popularity prediction work predicts the value of future popularity, whether the information will become popular, or the likelihood of an outbreak. In this paper, a prediction method based on a deep neural network is proposed to predict the duration of the active period, and the different effects of each influencing factor on the evolution of popularity are analyzed in detail. This paper divides the influencing factors into two categories: dynamic factors and static factors. As shown in Figure 2, the dynamic and static factors are first embedded by a Long Short-Term Memory (LSTM) network and a Convolutional Neural Network (CNN), respectively; then the embedding vector VD of the dynamic factors and the embedding vector VS of the static factors are concatenated as the input vector of a fully connected neural network.
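The overall architecture just described can be sketched as follows. The dimensions, layer sizes, and pooling choices are illustrative assumptions, not the paper’s hyperparameters; only the wiring (dynamic factors → LSTM, static factors → CNN, concatenation → fully connected network) follows the text.

```python
import torch
import torch.nn as nn

class PopularityPredictor(nn.Module):
    """Dynamic factors -> LSTM -> VD; static factors -> CNN -> VS;
    [VD ; VS] -> fully connected network -> active-period prediction."""

    def __init__(self, dyn_dim=16, static_channels=1, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(dyn_dim, hidden, batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv1d(static_channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(4),  # -> 8 * 4 = 32 static features
        )
        self.fc = nn.Sequential(nn.Linear(hidden + 32, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, dynamic_seq, static_feats):
        _, (h_n, _) = self.lstm(dynamic_seq)      # VD: last hidden state
        v_d = h_n[-1]
        v_s = self.cnn(static_feats).flatten(1)   # VS: CNN embedding
        return self.fc(torch.cat([v_d, v_s], dim=1))

model = PopularityPredictor()
out = model(torch.randn(2, 10, 16), torch.randn(2, 1, 20))
print(out.shape)  # torch.Size([2, 1])
```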

3.4. Dynamic Embedding Based on LSTM

Dynamic factors include Accumulation Time (AT), User Statistics (US), and Topological Network (TN). The time at which the cumulative popularity reaches a certain amount is expressed as VA; the set of network topology vectors at different time points and the set of statistical information vectors are collected over those time points.

In this paper, the network topology vector is obtained by the DeepWalk algorithm. Assuming N is the network topology at time t, the k nodes with the highest degree in N are selected, and the topology vector is the concatenation of the feature representations of these k nodes, as shown in equation (2), where the feature representation of a node is given by the mapping function in DeepWalk, representing the latent social representation of that node; the feature representations are computed by the Skip-Gram and Random Walk algorithms. The embedding vector of user statistics is shown in equation (3), where the components represent, in order, the number of influential (“big V”) users participating in the discussion and the total, maximum, median, and average numbers of fans of those users.
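A minimal sketch of the user-statistics vector of equation (3), assuming the five components listed above; the fan counts are illustrative.

```python
from statistics import median, mean

# User-statistics embedding: for the influential ("big V") accounts in a
# discussion, collect count, total fans, max fans, median fans, mean fans.

def user_stats_vector(big_v_fan_counts):
    return [
        len(big_v_fan_counts),     # number of big-V participants
        sum(big_v_fan_counts),     # total fans of all big-V users
        max(big_v_fan_counts),     # maximum fan count
        median(big_v_fan_counts),  # median fan count
        mean(big_v_fan_counts),    # average fan count
    ]

print(user_stats_vector([100, 300, 500]))  # → [3, 900, 500, 300, 300]
```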

The embedding process of the dynamic factors is shown in Figure 3. First, the network topology vector and the user statistics vector at the corresponding time are concatenated, as shown in equation (5). Then the concatenated vectors are fed into the LSTM in turn; after model learning, the feature at the final moment and the cumulative time vector are obtained. Finally, this feature is concatenated with the cumulative time vector to form the embedding vector VD of the dynamic factors, as shown in equation (6).

The forget gate f of the LSTM can be calculated by equation (7), where σ is the sigmoid function, W is the weight parameter matrix, b is the bias term, and h_{t−1} is the output of the previous step. The input gate i of the LSTM can be calculated by equation (8), and the cell state update is shown in equations (9) and (10), where the candidate value to be written has its own weight parameter matrix and bias term.

The output gate o of the LSTM can be calculated according to equation (11), which determines how much of the cell state information is output. The final output h of the LSTM is obtained from equation (12).
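The gate computations referenced as equations (7)–(12) are the standard LSTM equations, which the following scalar sketch makes concrete. For clarity, a single scalar z stands in for the concatenation [h_{t−1}, x_t] and the weights are scalars; real layers operate on vectors and matrices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, W, b):
    """One LSTM step: forget/input/output gates and candidate state."""
    z = h_prev + x  # scalar stand-in for the concatenation [h_{t-1}, x_t]
    f = sigmoid(W["f"] * z + b["f"])          # forget gate, eq. (7)
    i = sigmoid(W["i"] * z + b["i"])          # input gate, eq. (8)
    c_tilde = math.tanh(W["c"] * z + b["c"])  # candidate state, eq. (9)
    c = f * c_prev + i * c_tilde              # cell state update, eq. (10)
    o = sigmoid(W["o"] * z + b["o"])          # output gate, eq. (11)
    h = o * math.tanh(c)                      # final output, eq. (12)
    return h, c
```

With all weights and biases zero, every gate opens halfway and the cell passes half of its previous state through, which is one quick sanity check on the equations.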

Finally, the final output vector of the LSTM and the cumulative time embedding vector VA are concatenated to form the embedding vector VD of the dynamic factors, as shown in equation (6).

4. Analysis of Experimental Results

Experimental data are data generated by a measurement, test procedure, experimental design, or quasi-experimental design in science and engineering. Experimental data can be replicated by other researchers and subjected to mathematical analysis.

4.1. Introduction to the Experiment

This chapter completes the training of the traditional model and the deep network, the enhancement with the attention mechanism, and the construction of the final fitting network. It mainly compares the effect of the model designed in this paper against single linear regression, a support vector machine, and a single XGBoost model, and also compares the effects obtained with different RNN cell units.

Table 1 examines the accuracy of several models in predicting microblog forwarding.

It should be noted that the accuracy in the table is the regression accuracy derived using the above technique, not classification accuracy in the usual sense. As can be seen from the table, the accuracy of the ensemble learning model XGBoost is higher than that of the linear regression and support vector machine models, but if the LSTM depth model is used alone, its effect is not particularly ideal. This is easy to understand: using LSTM alone amounts to modeling the microblog content directly, which ignores many user features and produces results far from reality. At this point, the traditional model has not yet been combined with the depth model, nor has the self-attention mechanism been added to the LSTM; only the training of the traditional model and the feature extraction of the depth model have been completed.

Comparing the experimental results, it can be inferred that combining the traditional model with the depth model considerably improves prediction accuracy and that the self-attention mechanism further improves the prediction effect of the depth model. This shows that the semantic features extracted by the depth model can adjust the prediction of the traditional model to a certain extent and improve the effect.

The model compares different effects obtained by using different RNN-cells, as shown in Figure 4.

This microblog topic has 7618 users participating in forwarding and comments. Because of the large number of users, the key users of the topic are set here as 0.5% of the total, 38 in all. The user nicknames and calculated importance values are as follows: National Geographic of the United States, 1095.93; Morning News, 761.323; Sina Video, 570.045; Shanghai Hot News, 364.504; Xiao Xing, 16.3505; Dai Jin and Alex, 16.0959; Lawyer Gan Yuanchun, 14.6323; Harold d roth, 14.1375; Weijinqiao Automobile Corporation, 13.1375; etc.

Table 2 shows the results of a test of 40 hot topics and 40 nonhot topics in 9 cases.

According to the test results obtained by adjusting the parameters, when r = 0.5 and Nsmooth = 10, the algorithm distinguishes hot topics with 90% accuracy; for 77.8% of topics, the prediction point precedes their appearance on the Weibo hot topic list, with an average lead time of 1.54 hours.

5. Conclusion

As the mainstream platform of network information interaction, microblog exerts an increasing influence on people’s daily life. Countless microblogs are generated every day, accompanied by likes, comments, and retweets, forming a huge social circle. If these data can be put to use, they will be of great help in social network modeling, personalized system design, public opinion monitoring, and other fields. This paper introduces the concepts and foundations of neural networks and deep learning; on this basis, it introduces the RNN, with its strong ability to process sequential data, and presents different RNN models for different application backgrounds. To enable the model to process time-series data efficiently, an attention mechanism was added to the depth model; to improve the detection of different key points in microblog data, a self-attention mechanism was proposed to strengthen the model’s ability to extract microblog semantic features.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the project of the Ministry of Education of China (21YJCZH020), the project of Anhui Provincial Department of Education (SK2020A0121 and SK2020A0127), and the project of Anhui Agricultural University (2020zs13zd).