Abstract
Industrial Internet security is a prerequisite for the high-quality development of the Industrial Internet. An important way to curb Industrial Internet security accidents and to prevent cyber threats proactively is to keep changes in the network situation under effective control. In this paper, we propose a new prediction model based on Long Short-Term Memory (LSTM), the minimum mean square variance criterion (MMSVC), and empirical mode decomposition (EMD), with the aim of effective noise reduction and high prediction accuracy. To minimize the disturbance of random noise, we first delete outliers in the high-frequency, noisy Intrinsic Mode Functions (IMFs) produced by EMD; MMSVC identifies the noisy IMFs without requiring thresholds. We then refill the resulting blanks with values weighted from neighbouring valid data. After that, the LSTM model is applied to predict the denoised signal. Preliminary experimental analysis shows that noise reduction with the EMD method can provide a significant boost in forecasting performance.
1. Introduction
With the rapid development of the Industrial Internet, a growing number of production services are integrated with the Internet. This means that many industrial activities, such as R&D, production, and management, are exposed to the Internet. The data covered by the Industrial Internet is diverse and widely distributed. Once the network is attacked, the production and development of enterprises will be seriously affected. Therefore, effective network security protection is extremely important to ensure the high-quality development of the Industrial Internet.
Network Security Situation Awareness (NSSA) [1] is one of the most popular technologies in cybersecurity. Compared with traditional methods, the essential components of NSSA are evaluating the network security situation and predicting the trend of network characteristics. Faced with cyberthreats, it helps network administrators make decisions efficiently and nip problems in the bud. The network security situation can be abstracted as a multidimensional time series, as in Equation (1), in which each dimension denotes a different network characteristic. Network security situation prediction is therefore a forecast of this multidimensional time series: it applies statistical or other models to analyze sets of historical network characteristics.
Nowadays, the Artificial Neural Network (ANN) [2] is the most common method used in network security situation prediction, with the distinct advantages of self-learning and computing power. To find the optimal parameters and construct the training model, the historical data sets need to be trained on repeatedly. The accuracy of prediction relies heavily on the quality of the collected data. Nonetheless, a large number of noise points inevitably appear during data extraction, leading to random errors. The more noise points there are, the less reliable the prediction is. EMD filtering is an adaptive technique for reducing the impact of random noise. In conventional EMD filtering, however, the high-frequency IMFs are discarded entirely because they contain noise points, even though they also contain valid signal. This results in severe signal distortion.
To address these problems, we propose a prediction algorithm that combines EMD and MMSVC with LSTM. First, we decompose the signal with EMD into IMFs of different frequency bands in order to separate the high-frequency, noisy IMFs. Second, we remove the noise points while keeping the valid signal. Finally, we use LSTM to predict the denoised data.
While most conventional EMD-based prediction methods predict individual IMFs directly, the proposed approach performs noise reduction on the IMFs, which can significantly reduce the impact of noise on prediction. Identifying noisy IMFs with MMSVC avoids having to select thresholds for different permutation entropies. To locate and delete noise points, the signal is divided into groups, and the outliers in each group are regarded as noise points. The resulting blanks are then refilled with values weighted from relevant neighbouring data. Compared with other methods, this approach places no special requirements on the signal itself and is easy to operate.
2. Related Work
The typical methods for network security situation prediction include Regression Analysis [3], the Grey Theory prediction model [4], and the Artificial Neural Network. These algorithms perform well in some particular applications, but they also have limitations. For instance, Regression Analysis lacks real-time capability because it describes regularities with fixed mathematical formulas, and the various contingencies that frequently occur in a network make it even less effective. The Grey Theory prediction model works remarkably well on small data sets, but its forecasting accuracy is considerably lower than that of ANN. The suitability of ANN for network security situation prediction has been verified by many studies. For example, Liu et al. [5] proposed a method that applies the GM (1,1) model and a BP neural network model. In article [6], a network security situation prediction method based on a BP neural network optimized by the Seeker Optimization Algorithm (SOA) was proposed. These studies confirm the effectiveness of the Artificial Neural Network for network situation prediction. LSTM, which performs better than the Recurrent Neural Network (RNN) on long sequences [7], is applied in this experiment. LSTM avoids the attenuation of information over long time spans. However, data acquisition inevitably produces random noise, which makes the learned parameters deviate and decreases accuracy [8–10].
Therefore, it is vital to provide a significant boost in forecasting performance by filtering noise. In recent years, there has been an increasing interest in exploring methods for noise reduction, such as using filters, wavelet transform (WT) [11], EMD [12], and Fourier transform [13].
In article [14], a signal-filtering method based on empirical mode decomposition is proposed. Compared to well-known filtering methods, this method is a fully data-driven approach without too much human intervention.
In article [15], the authors present a comprehensive study of a high-G calibration denoising method based on the combination of empirical mode decomposition (EMD) and wavelet thresholding. Wavelet thresholding is used to process the IMFs, so the filtering depends on the selection of the decomposition number and the basis function.
In article [16], a novel noise reduction technique for underwater acoustic signals based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), minimum mean square variance criterion (MMSVC), and least mean square adaptive filter (LMSAF) is proposed by Li and Wang. In their scheme, LMSAF was utilized for high-frequency IMF noise reduction, which addresses the problem of selection of decomposition number and basis function for wavelet noise reduction. However, LMSAF requires linear independence of input vectors at different times. The correlation of input signal will cause repeated error propagation, slow convergence speed, and poor tracking performance.
Here, we aim to address the issue of noise with a technique that combines EMD-based noise reduction with LSTM.
3. Proposed Model
3.1. Principle of EMD
The greatest merit of EMD is that it does not require much human intervention. EMD is a simple and convenient tool for adaptive signal decomposition [17, 18].
Any signal can be decomposed into IMFs in different frequency bands. Features of the historical sequence at different time scales are captured by different IMFs. The detailed procedure of EMD is described as follows:
Step 1. Determine all the local maxima and minima of the original signal $x(t)$ and join them into the upper envelope $u(t)$ and the lower envelope $l(t)$ by cubic spline interpolation.
Step 2. Calculate the average of $u(t)$ and $l(t)$, which is $m(t) = \frac{u(t) + l(t)}{2}$.
Step 3. Subtract the mean value $m(t)$ from the original sequence to obtain the intermediate signal $h(t)$, $h(t) = x(t) - m(t)$.
Step 4. Judge whether $h(t)$ meets the two conditions for being an IMF: the number of zero crossings and the number of extreme points differ by at most one; and the mean of the upper envelope and the lower envelope is zero at any time. If both conditions are satisfied, $h(t)$ is considered an IMF and Step 5 is performed; otherwise, repeat Step 1 to Step 4 with $h(t)$ as the input.
Step 5. Calculate the residual $r(t) = x(t) - h(t)$ and regard $r(t)$ as the new $x(t)$. Then, repeat Step 1 to Step 4 until the signal can no longer be decomposed.
Finally, the decomposition of the original signal can be expressed as Equation (2), $x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r_n(t)$, where $\mathrm{IMF}_i(t)$ is the $i$-th intermediate signal $h(t)$ accepted in Step 4 and $r_n(t)$ is the residual signal.
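As a rough illustration of Steps 1 to 3, the following minimal sketch performs a single sifting pass. It is not the implementation used in this paper: to stay dependency-free it swaps the cubic spline for piecewise-linear envelopes and holds each envelope constant beyond the first and last extremum, and the function names are hypothetical.

```python
def local_extrema(x):
    """Return the indices of strict local maxima and minima of a sequence."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption: linear interpolation stands in for the cubic spline, and the
    envelope is held constant before the first and after the last extremum.
    """
    env = []
    for t in range(len(x)):
        if t <= idx[0]:
            env.append(x[idx[0]])
        elif t >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            left = max(j for j in idx if j <= t)
            right = min(j for j in idx if j >= t)
            if left == right:
                env.append(x[left])
            else:
                w = (t - left) / (right - left)
                env.append((1 - w) * x[left] + w * x[right])
    return env


def sift_once(x):
    """One pass of Steps 1 to 3: return the intermediate signal h = x - m.

    Returns None when the signal has no interior maximum or minimum, i.e.
    when it cannot be decomposed any further.
    """
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        return None
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    mean = [(u + l) / 2 for u, l in zip(upper, lower)]
    return [xi - mi for xi, mi in zip(x, mean)]
```

A full decomposition would repeat this pass until the IMF conditions of Step 4 hold, subtract the accepted IMF, and continue on the residual as in Step 5.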
3.2. Principle of MMSVC
MMSVC is utilized to identify noisy IMFs decomposed by EMD [16]. The detailed procedure of MMSVC consists of three steps, which are described as follows:
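Step 1. Remove the first $k$ IMFs from the original signal $x(t)$ to construct $x_k(t)$, $x_k(t) = x(t) - \sum_{i=1}^{k} \mathrm{IMF}_i(t)$, $k = 1, 2, \ldots, n$, where $n$ is the total number of IMFs.
Step 2. Calculate the square variance of $x_k(t)$ and $x_{k+1}(t)$.
Step 3. Repeat Step 2 for every pair of adjacent indices $k$ and $k+1$, and find the minimum mean square variance.
The first $k$ IMFs are considered noisy IMFs when the value obtained for index $k$ is the least.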
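The selection rule can be sketched as follows. This is only one reading of the criterion: it assumes that the "square variance" of two adjacent partial reconstructions is their mean squared difference, and the function name `mmsvc_noisy_count` is hypothetical.

```python
def mmsvc_noisy_count(signal, imfs):
    """Return (k, scores), where the first k IMFs are treated as noisy.

    Assumption: the score for index k is the mean squared difference between
    x_k and x_{k+1}, where x_k is the signal with its first k IMFs removed.
    """
    # Build the partial reconstructions x_1, x_2, ..., x_n.
    partial = []
    current = list(signal)
    for imf in imfs:
        current = [c - v for c, v in zip(current, imf)]
        partial.append(current)

    # scores[k - 1] compares x_k with x_{k+1}.
    scores = [
        sum((a - b) ** 2 for a, b in zip(partial[k], partial[k + 1])) / len(signal)
        for k in range(len(imfs) - 1)
    ]
    best_k = scores.index(min(scores)) + 1
    return best_k, scores
```

Because no threshold appears anywhere in the function, the only inputs are the signal and its IMFs, which matches the threshold-free property described above.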
3.3. Principle of LSTM
LSTM was first proposed by Hochreiter and Schmidhuber [19] in 1997. Owing to its internal structure, LSTM is adept at dealing with time-series problems [20, 21]. The specific structure of LSTM is depicted in Figure 1, where the cell is composed of the updated state value $C_t$ at time $t$, the output signal (hidden state) $h_t$, the sigmoid neural network layer $\sigma$, the tanh function, and the tanh neural network layer.

The forget gate simulates the forgetting process by controlling, through a weight matrix $W_f$, how much of the previous memory cell $C_{t-1}$ is forgotten. The inputs $h_{t-1}$ and $x_t$ are fed into the sigmoid function, which outputs $f_t$ with a value between 0 and 1. A result of 0 indicates that the information is discarded completely, whereas a result of 1 means that all of it is kept. The specific calculation is as follows, where $b_f$ is a bias term:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \qquad (3)$$
The input gate determines how much information is saved to the cell state $C_t$, which effectively avoids memorizing irrelevant content. The sigmoid and tanh functions, both activation functions, appear in Equations (4)–(6), where $W_i$ and $W_C$ are weight matrices and $b_i$ and $b_C$ are bias terms. The cell state is updated by adding the part of the old state that survives the forget gate to the new information that needs to be remembered.
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \qquad (4)$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \qquad (5)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \qquad (6)$$
The output gate controls how much information from the cell state is output to $h_t$, which is part of the input at the next time step. Equations (7) and (8) summarize the operations performed by the output gate, where $W_o$ and $b_o$ are its weight matrix and bias term.
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \qquad (7)$$
$$h_t = o_t \odot \tanh(C_t) \qquad (8)$$
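To make the gate interactions concrete, here is a minimal sketch of one forward step of a single-unit LSTM cell. It assumes the standard LSTM formulation (forget, input, and output gates plus a tanh candidate state); the scalar state, the `lstm_step` name, and the layout of the `W` and `b` dictionaries are illustrative choices rather than details taken from this paper.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step of a single-unit LSTM cell with scalar state.

    Assumed layout: W[g] = (weight on h_prev, weight on x_t) and b[g] is the
    bias, for g in {"f", "i", "c", "o"}.
    """
    def pre(g):
        return W[g][0] * h_prev + W[g][1] * x_t + b[g]

    f_t = sigmoid(pre("f"))            # forget gate: share of old state kept
    i_t = sigmoid(pre("i"))            # input gate: share of candidate stored
    c_hat = math.tanh(pre("c"))        # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat   # updated cell state
    o_t = sigmoid(pre("o"))            # output gate
    h_t = o_t * math.tanh(c_t)         # hidden state passed to the next step
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell simply halves its previous state; this is a convenient sanity check before plugging in trained parameters.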
3.4. EMD-MMSVC-LSTM Model
This algorithm fully exploits the properties of EMD, MMSVC, and LSTM. First, we identify the noisy IMFs using MMSVC and delete the outliers in them. Second, we fill each blank with a weighted value, where the weight is determined by the distance between the blank and the adjacent reference data. Finally, the LSTM model is applied to predict the denoised signal. The specific process of the algorithm is shown in Figure 2, where the input is the original time series and the output of the noise reduction stage is a new sequence.

The signal is decomposed into IMFs with different frequencies, and the randomness of the IMFs decreases gradually from the high-frequency to the low-frequency components. Deleting outliers in the high-frequency, noisy IMFs therefore filters the noise and makes precise prediction possible. The specific procedures are described as follows:
Step 1. Signal decomposition. Apply EMD to decompose the network characteristic to obtain IMFs in different frequency domains and one residual.
Step 2. Utilize MMSVC to obtain noisy IMFs.
Step 3. Filter outliers in noisy IMFs. Taking IMF1 as an example, first divide IMF1 into groups according to the period $T$. Next, sort the sample points of each group in ascending order. Then, find the factor $u$ at the ranking percentage $\alpha$% and the factor $l$ ($l < u$) at the ranking percentage $\beta$%. Finally, filter out the data less than $l$ and greater than $u$ in each group. (The selection of $\alpha$ and $\beta$ is determined by the data itself. Repeated experiments showed that the effect on this group of data is best when $\alpha$ is 90 and $\beta$ is 10.)
Step 4. Fill in the blank places. The vacant parts are filled with weighted values of relevant data, and only nonnoise points can be used as reference data. For a noise point $x_i$, the neighbours $x_{i-1}$ and $x_{i+1}$ must be valid data; otherwise, we search for the nearest valid data on each side. If the distance between $x_i$ and its left reference point is $d_1$ and the distance between $x_i$ and its right reference point is $d_2$, then $x_i$ is set to a combination of the two reference values weighted according to $d_1$ and $d_2$. The pseudocode of Step 2 and Step 3 is given in Algorithm 1.
Step 5. Reconstruct signal. Add IMFs and residual item by item to be a new time series .
Step 6. Signal prediction. LSTM is utilized to predict the new time series.
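The outlier filtering, blank filling, and reconstruction steps above can be sketched as follows. This is a hedged illustration rather than the paper's Algorithm 1: the nearest-rank rule for locating the percentile factors and the inverse-distance weights used for filling are assumptions, and all function names are hypothetical.

```python
def filter_outliers(imf, period, upper_pct=90, lower_pct=10):
    """Replace points outside the rank range of their group with None."""
    out = list(imf)
    for start in range(0, len(imf), period):
        group = sorted(imf[start:start + period])
        n = len(group)
        # Assumption: nearest-rank factors, symmetric at both ends.
        low = group[(n * lower_pct) // 100]
        high = group[n - 1 - (n * (100 - upper_pct)) // 100]
        for i in range(start, start + n):
            if imf[i] < low or imf[i] > high:
                out[i] = None
    return out


def fill_blanks(values):
    """Fill None entries from the nearest valid point on each side.

    Assumption: the closer reference point gets the larger weight
    (inverse-distance weighting, i.e. linear interpolation).
    """
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        raise ValueError("no valid reference data")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            d1, d2 = i - left, right - i
            filled[i] = (d2 * values[left] + d1 * values[right]) / (d1 + d2)
    return filled


def reconstruct(imfs, residual):
    """Add the IMFs and the residual point by point into a new series."""
    return [sum(col) for col in zip(*imfs, residual)]
```

Only points that survived filtering are used as references in `fill_blanks`, so a run of consecutive blanks is filled from the valid data on either side of the run and never from values that were themselves just filled in.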
4. Experiment and Verification
4.1. Data Set
The data in this paper is extracted from the traffic statistics logs collected from the CERNET campus network and comprises the indexes ONLINE_USERS, IP_INBPS, IP_OUTBPS, TCP_INBPS, and TCP_OUTBPS. The time range is from 14:00 on March 26, 2016, to 18:00 on March 29, 2016.
4.2. Noise Filtering Based on EMD
It is well known that noise in the original data inevitably reduces the accuracy of the prediction model. As Figure 3 shows, ONLINE_USERS and IP_INBPS are both decomposed into nine IMFs and one residual by EMD. Spectrum analysis of the IMFs shows that their frequency decreases step by step with increasing order.

Figure 3: EMD decomposition of ONLINE_USERS and IP_INBPS, panels (a) and (b).
MMSVC is used to identify the noisy IMFs of the two indexes. The result is given in Table 1. As Table 1 shows, the minimum value is reached at the fifth index for ONLINE_USERS and at the third index for IP_INBPS. Therefore, for ONLINE_USERS, the first five IMFs are noisy IMFs, and for IP_INBPS, the first three IMFs require noise reduction.
The noisy IMFs are then processed. First, the data are grouped according to the period. Second, outliers with a ranking percentile higher than 90% or lower than 10% are filtered out of each group in each noisy IMF. Third, the blanks are filled with weighted values of relevant data. As Figure 4 shows, the noise is suppressed to a certain extent and the main features of the original data are retained.

Figure 4: Data before and after noise filtering, panels (a) to (d).
4.3. Experimental Results of EMD-MMSVC-LSTM Model
The two characteristics are predicted using the LSTM model. The data from March 26 to March 28 were used as training data, and the remaining data were used as test data. In the conventional LSTM model, the prediction at the current time depends to a certain degree on the value predicted for the previous time, which readily degrades the forecast. Making each forecast rely on real data instead effectively improves the prediction.
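One way to read this is as one-step-ahead forecasting in which every input window is taken from observed values rather than from earlier predictions. The sketch below builds such windows; the `make_windows` name and the window length `lookback` are hypothetical, since the paper does not state its input length.

```python
def make_windows(series, lookback):
    """Build (inputs, targets) pairs for one-step-ahead forecasting.

    Each input holds the `lookback` observed values that precede the target,
    so no prediction is ever fed back in as an input.
    """
    inputs, targets = [], []
    for t in range(lookback, len(series)):
        inputs.append(series[t - lookback:t])
        targets.append(series[t])
    return inputs, targets
```

The same function can be applied to the training span and to the test span separately, which keeps the test targets out of the training windows.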
To verify the validity of the experiment, the features IP_OUTBPS and TCP_INBPS were also included. Figure 5 compares the predictions obtained from raw data with those obtained from denoised data; the blue polyline shows the observed values and the orange polyline shows the predicted values. MSE, RMSE, and MAE were used as evaluation indexes in this experiment. See Table 2 for specific data.
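For reference, the three evaluation indexes can be computed as in the following minimal sketch, which uses their standard definitions; the function names are illustrative.

```python
import math


def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the data."""
    return math.sqrt(mse(observed, predicted))


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
```

MSE and RMSE penalize large individual errors more heavily than MAE does, so reporting all three gives a fuller picture of how noise reduction changes the error distribution.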

Figure 5: Prediction results on raw data and on denoised data, panels (a) to (h).
The results of this experiment show that EMD filtering suppresses the noise to a certain extent while the main features of the data are largely retained. Filtering noise with EMD can provide a significant boost in LSTM forecasting performance.
5. Conclusion
In this study, we propose a new forecasting model, EMD-MMSVC-LSTM. Compared with most EMD-based prediction models, the proposed method can significantly reduce the impact of noise on prediction. MMSVC is employed to identify noisy IMFs without selecting thresholds. The outliers in each group of the noisy IMFs are then deleted. This approach to noise reduction places no special requirements on the signal itself and is easy to operate. Finally, predictions are made on the denoised signal.
Although noise reduction has been realized in this paper, the filtering conditions are relatively simple and the filtering process is nonadaptive, which may lead to insufficient filtering. Improving the effectiveness and adaptability of noise filtering is the direction of further research.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Acknowledgments
This work was supported in part by the National Key Research and Development Program of China, No. 2020YFB1711000 and No. 2018YFB08040505.