Abstract

With the expansion of digital business lines, the network flow behind the digital power grid is also exploding. To prevent network congestion, this article proposes a novel network flow forecasting model composed of variational mode decomposition (VMD), a GRU-xgboost block, and a forecasting adjustment block, which grasps the changing patterns and trends of network flow in advance so that reasonable and effective flow management strategies can be formulated and the requirements of users for network service quality can be met. The network flow series in power grid enterprises always contain complex patterns and outliers, and VMD is applied to adaptively process the complex network flow time series into several subseries with simpler patterns. A GRU-xgboost block is designed to reconstruct the features of the historical series, and the xgboost model is then applied to generate predictions for all decomposed subsignals. For the final predictions, we design a forecasting adjustment block to further remove the influence of random noise. Finally, the empirical results show the superior performance of the proposed model on the network flow forecasting task.

1. Introduction

The in-depth integration of digital technology and power enterprises not only improves the operational efficiency of the traditional power supply mode but also brings great challenges to the platform-based operation of power enterprises. Intelligent operation and maintenance, which establishes a new power business mode and monitors the digital IT infrastructure, will become the inevitable choice of enterprise digital transformation. Such an operation system can automatically trace failures and analyze their causes. When the network faces attacks, the system can maintain stable operation with existing control methods [1, 2]. To realize a high level of informatization and intelligence, enterprises need to establish an accurate monitoring system for supply and demand data and an effective big data analysis system. However, the number of wireless network users has increased dramatically with the development of communication technology, and with it the network flow of power grid enterprises has increased sharply. Therefore, properly modelling and predicting network flow has become an important part of improving the utilization rate of network resources and the user experience, and it plays an important role in the big data analysis system.

However, forecasting the network flow of a power enterprise differs from other common forecasting tasks, such as wind speed forecasting, load forecasting, and air quality forecasting, because the network flow series of power enterprises have unique characteristics compared with other types of time series. Specifically, the network flow series always fluctuate because of unexpected incidents in the network (e.g., manual maintenance) or abnormal data acquisition. As shown in Figure 1, the flow series is complex and difficult to model.

These mutations in time series are usually defined as outliers, which can greatly influence the performance of modelling and forecasting. Commonly defined outliers are points that deviate so much from other observations as to arouse suspicion; such outliers can be thought of as observations generated by a different mechanism [3]. Therefore, the outliers in the time series above may be irregular arbitrary values. However, different from commonly defined outliers, the outliers in the network flow time series of power grid enterprises always last for a period because the service interruption or some abnormal situation cannot be resolved quickly. Additionally, these outliers are always close to or even equal to zero. This phenomenon can be observed in Figure 1. As a matter of course, a general model may not achieve an appropriate forecasting performance on the network flow of power grid enterprises, and common outlier detection methods or robust regression methods may not achieve good effects either. For this kind of complex nonlinear time series, researchers have proposed various approaches to improve the quality of datasets in the data preprocessing stage, such as ranked set sampling (RSS) [4], subsampling, and upsampling. A widespread practice is to process the original time series [5]. Widely used methods include variational mode decomposition (VMD) [6], empirical wavelet transform (EWT) [7], empirical mode decomposition (EMD) [8, 9], and its improved versions: ensemble EMD (EEMD) [10, 11], complementary EEMD (CEEMD) [12], and complete EEMD with adaptive noise (CEEMDAN) [13]. VMD is an adaptive and completely nonrecursive signal processing method. This technique has the advantage of determining the number of mode components; it can realize the effective separation of intrinsic mode functions (IMFs) and the frequency domain division of signals or time series and can finally obtain the optimal solution of the variational problem. Hu et al. [14] also proposed a VMD-based hybrid model for time series forecasting, where VMD is used to decompose daily streamflow series into multiple components. Although the selection of the number of expected decomposed IMFs is a difficult problem, the method has a solid mathematical theoretical foundation and can reduce the nonstationarity of time series with high complexity and strong nonlinearity. Furthermore, the EMD algorithm suffers from mode mixing problems, whereas EWT is useful in analyzing instantaneous time-varying signals: it can effectively extract information from signals and carries out multiscale detailed analysis on functions or signals through operations such as scaling and translation, although the effect of EWT depends on the selection of the wavelet basis. In this study, the VMD algorithm is selected for input data preprocessing because it is substantially more robust to sampling and noise than existing decomposition methods such as EMD and EEMD [14].

In the past few years, although there has been relatively little literature on the network flow prediction of power grid enterprises, studies on similar time series forecasting tasks are extensive [15], such as load demand forecasting [16–18], wind power forecasting [19, 20], solar power forecasting [21, 22], stock price forecasting [23], and short-term performance of airlines [24]. Traditional linear models such as autoregression (AR), moving average (MA), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA) can achieve good performance on small-scale sparse datasets, but they are weak on datasets with mutation and nonlinear patterns. With the development of artificial neural network (ANN) technology, more and more ANN-based methods have been proposed to fit and represent historical nonlinear data [25], such as the multilayer perceptron (MLP) [26], extreme learning machine (ELM) [26], and least squares SVM (LSSVM) [27]. All these methods can fit historical data without any assumptions about the data, which is why they are also called nonparametric models. When datasets are abundant, nonparametric models can achieve better performance, and when datasets are massive, deep learning models can be trained into better forecasting models. For time series data, the recurrent neural network (RNN) [28] and its improved variants, the long short-term memory network (LSTM) [29] and the gated recurrent unit (GRU) [30], can learn the hidden patterns of a long time series and selectively forget useless information. More advanced methods are encoder-decoder-related methods and transformer methods. Azencot et al. [31] proposed a novel consistent Koopman autoencoder, which explores the interplay between consistent dynamics and their associated Koopman operators. Wu et al. [32] proposed a new time series forecasting model based on the transformer structure and generative adversarial networks (GANs). Another approach commonly used in industry is boosting, which composes multiple weak estimators. Commonly used boosting-based algorithms are adaboost [33], xgboost [34], and the gradient boosting decision tree (GBDT) [35]. These algorithms build multiple weak estimators to predict the dataset and then integrate the predictions of the multiple estimators with a certain strategy as the final predictions.

In this study, we mainly focus on the problem of generating accurate network flow predictions for power grid enterprises. Firstly, traditional statistical models may not achieve a satisfactory forecasting performance because the flow series obtained in a power grid enterprise contain complex nonlinear patterns at the huge scale of the digital power grid. For this reason, we apply the VMD algorithm to decompose the original network flow series into simplified subseries, each of which has its own optimal centre frequency and limited bandwidth. To generate accurate predictions for the network flow series, we must encode the historical information of the flow series in a better way. Among currently popular neural networks, RNN and its variants (LSTM and GRU) may have better applicability because they can capture the information in historical data. We select the GRU model to encode the flow series since it has a simpler structure. Current research suggests that relying on a single model brings risks in the modelling process; therefore, we believe that merely using the GRU model would create a risk for generating accurate predictions. The xgboost model, which has been proved effective in time series forecasting, is applied to the information encoded by GRU to generate flow predictions and avoid the risk caused by using a single model. We refer to this process as a GRU-xgboost block. This block is used for each subseries to capture the hidden patterns in historical data and generate future predictions. Aggregating the predictions of all subseries, we obtain a prediction series. For the residual between the prediction series and the original network flow series, we design a residual adjustment block to compensate for the error caused by random noise.

The contributions can be summarized as follows:
(1) We propose a decomposition-based adjusted GRU-xgboost forecasting approach for the network flow forecasting task of power grid enterprises to generate accurate flow predictions.
(2) In the proposed algorithm, we design a GRU-xgboost block to capture the hidden patterns in time series and reconstruct input features for the xgboost model. This block uses a GRU unit to convert historical data into a reconstructed state vector, which can adaptively discard useless information from the investigated series.
(3) A residual adjustment block is designed to compensate for the error caused by random noise in the time series. Because of its irregular property, the random noise may influence the accuracy of the aggregated predictions of all subseries, so this block is designed to compensate for the resulting forecasting error.

2. Basic Methods

In this section, we describe three key blocks in the proposed algorithm in detail.

2.1. VMD

VMD is an adaptive and completely nonrecursive method of mode variation and signal processing. This technique has the advantage of determining the number of mode decompositions, and its adaptability lies in determining the number of modes of a given sequence according to the actual situation. In the subsequent searching and solving process, it can adaptively match the optimal centre frequency and limited bandwidth of each mode, realize the effective separation of intrinsic mode functions (IMFs) and the frequency domain division of the signal, and finally obtain the optimal solution of the variational problem. It overcomes the end-point effect and mode aliasing problems of the EMD method and has a solid mathematical theoretical foundation. It can reduce the nonstationarity of time series with high complexity and strong nonlinearity and decompose them into relatively stable subsequences at multiple frequency scales, which makes it suitable for nonstationary sequences.

Firstly, a variational problem is constructed with the following assumptions: (1) the original signal is decomposed into $K$ components; (2) each decomposed sequence is a mode component with a limited bandwidth around a centre frequency; and (3) the sum of the estimated bandwidths of all modes is the smallest. The constraint condition is that the sum of all modes is equal to the original signal, and the corresponding constrained variational expression is

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \; \sum_{k=1}^{K} u_k(t) = f(t), \tag{1}$$

where $f(t)$ represents the original signal, $*$ denotes convolution, and $u_k(t)$ and $\omega_k$ represent the decomposed $k$-th mode component and its corresponding centre frequency, respectively.

Then, we introduce a Lagrange multiplier $\lambda$ and a quadratic penalty term to transform the constrained variational problem into an unconstrained one. The augmented Lagrange expression is formulated as

$$L\big(\{u_k\},\{\omega_k\},\lambda\big) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle, \tag{2}$$

where $\alpha$ is the penalty parameter.

Then, the alternate direction method of multipliers (ADMM) is applied to calculate the optimal $u_k$, $\omega_k$, and $\lambda$. This process can be formulated as

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha\left(\omega - \omega_k\right)^2},$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left|\hat{u}_k^{n+1}(\omega)\right|^2 \mathrm{d}\omega}{\int_0^{\infty} \left|\hat{u}_k^{n+1}(\omega)\right|^2 \mathrm{d}\omega},$$
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right), \tag{3}$$

where $\hat{\cdot}$ denotes the Fourier transform, $n$ is the iteration index, and $\tau$ is the update step of the Lagrange multiplier.
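As a concrete illustration of this decomposition step, the following minimal sketch uses the third-party vmdpy package (the paper does not name an implementation); the parameter values alpha, tau, and tol are illustrative defaults, not values reported in the paper.

```python
# Sketch of the VMD step, assuming the third-party `vmdpy` package;
# alpha, tau, and tol are illustrative, not values reported in the paper.
import numpy as np
from vmdpy import VMD

def decompose_flow(series, K=20, alpha=2000.0, tau=0.0, tol=1e-7):
    """Decompose a 1-D flow series into K band-limited modes u_1, ..., u_K."""
    series = np.asarray(series, dtype=float)
    # vmdpy drops the last sample of odd-length inputs, so trim the oldest
    # point instead and keep the most recent observation.
    series = series[len(series) % 2:]
    u, u_hat, omega = VMD(series, alpha, tau, K, DC=0, init=1, tol=tol)
    return u  # shape (K, len(series)); rows sum approximately back to the input
```

Summing the returned rows and comparing them with the (trimmed) input is a quick sanity check that the decomposition is nearly lossless.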

2.2. GRU-xgboost Block

For the decomposed subsequences $u_k\,(k = 1, 2, \ldots, K)$, we have to model them and generate predictions. In this section, we design a GRU-xgboost block to extract features from the historical data more effectively and use them to train the models.

If we want to obtain a prediction of a value in a time series, we should take the values before it into account because of the potential correlation between values in a time series. A simple way is to take several preceding values before the target value as the input feature vector. However, this kind of feature vector may contain an abundance of useless information, which may affect the training performance. Meanwhile, such a feature vector treats every value as equally important, which is unreasonable. GRU is a variation of the recurrent neural network (RNN) that can capture important patterns in historical data and transfer them into a hidden state vector. We use GRU to further improve the quality of the input feature vectors.

Given a time series with $N$ samples, defined as $\{x_1, x_2, \ldots, x_N\}$, we construct the input feature vector for the target $x_t$ by

$$X_t = \left[x_{t-p}, x_{t-p+1}, \ldots, x_{t-1}\right], \tag{4}$$

where $p$ is the number of preceding data points that affect the $t$-th data point.

The corresponding output is given as

$$y_t = x_t. \tag{5}$$
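A minimal sketch of this windowing step in Python (assuming NumPy arrays; the lag p is the one selected later via PACF):

```python
import numpy as np

def make_supervised(series, p):
    """Build (X_t, y_t) pairs: X_t = [x_{t-p}, ..., x_{t-1}], y_t = x_t, as in (4)-(5)."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[t - p:t] for t in range(p, len(series))])
    y = series[p:]
    return X, y
```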

With the input vectors $X_t$ and labels $y_t$, we can use GRU to improve the quality of the input feature vectors. The GRU combines the forget gate and input gate of the LSTM into a single update gate, can save long-term memory, and alleviates the long-term dependency problem of the traditional recurrent neural network. The specific updating formulas of GRU are given as

$$z_i = \sigma\left(W_z x_i + U_z h_{i-1}\right),$$
$$r_i = \sigma\left(W_r x_i + U_r h_{i-1}\right),$$
$$\tilde{h}_i = \tanh\left(W_h x_i + U_h \left(r_i \odot h_{i-1}\right)\right),$$
$$h_i = \left(1 - z_i\right) \odot h_{i-1} + z_i \odot \tilde{h}_i, \tag{6}$$

where $x_i$ represents the samples in $X_t$; $h_{i-1}$ and $h_i$ are the previous and current hidden states, respectively; $W_z$, $U_z$, $W_r$, $U_r$, $W_h$, and $U_h$ are the learned weight matrices; and $\sigma$ is the logistic sigmoid function. For each group of input vector $X_t$ and label $y_t$, we can obtain an improved feature vector $h_t$, which is the final hidden state of the GRU.
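A sketch of this feature-reconstruction step, assuming PyTorch. The paper does not spell out how the GRU is trained, so a small one-step prediction head is assumed here purely to give the encoder a training signal; the hidden size of 40 follows the experimental setup in Section 4.

```python
import torch
import torch.nn as nn

class GRUEncoder(nn.Module):
    """Encode a lag window X_t = [x_{t-p}, ..., x_{t-1}] into a hidden state h_t."""
    def __init__(self, hidden_size=40):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden_size, batch_first=True)
        # Hypothetical one-step prediction head, used only to train the encoder.
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                  # x: (batch, p)
        _, h = self.gru(x.unsqueeze(-1))   # h: (1, batch, hidden_size)
        h = h.squeeze(0)                   # h_t for each window in the batch
        return self.head(h), h
```

After training, the hidden states h (rather than the raw lag windows) serve as the improved feature vectors.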

As introduced by Chen and Guestrin [36], the xgboost framework can achieve outstanding performance in various fields, such as essential protein prediction [37], prediction of the disease progression of breast cancer [38], and stock price prediction [39]. In this article, we apply this framework to the network flow forecasting task and generate predictions for each subsequence $u_k$.
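To make the hand-off concrete, the following hedged sketch feeds the GRU hidden states into the xgboost Python package; encoder, X_train, and y_train refer to the hypothetical objects sketched above, and the hyperparameters are illustrative only.

```python
import torch
import xgboost as xgb

# `encoder` is an already trained GRUEncoder; (X_train, y_train) come from
# make_supervised(). Hyperparameters below are illustrative only.
with torch.no_grad():
    _, H_train = encoder(torch.tensor(X_train, dtype=torch.float32))

booster = xgb.XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
booster.fit(H_train.numpy(), y_train)

# One-step-ahead forecast from the most recent window X_new:
# _, h_new = encoder(torch.tensor(X_new, dtype=torch.float32))
# y_hat = booster.predict(h_new.numpy())
```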

2.3. Residual Adjustment Block

The network flow predictions are obtained by aggregating the predictions of all decomposed subsequences. Although VMD can adaptively determine the centre frequency of each mode component and each mode is smooth after demodulation to baseband, which gives it a certain robustness, the impact of random noise in the time series can be further reduced. In this section, we design a residual adjustment block to compensate for the influence of noise.

In the training set, a residual series can be obtained by taking the difference between the aggregated predictions and the real observations:

$$r_t = y_t - \sum_{k=1}^{K} G\!\left(u_{k,t}\right), \quad t \in \mathcal{D}_{\text{train}}, \tag{7}$$

where $y_t$ is the label of the training set, $G(\cdot)$ represents the operation of the GRU-xgboost block, and $G(u_{k,t})$ is the modelling output of each subsequence $u_k$. Then, support vector regression (SVR) [40] is applied to model this training residual series. For each aggregated prediction value, the SVR generates an adjustment prediction, and the sum of the two predictions is the final network flow prediction. New residuals are obtained by taking the difference between the current aggregated prediction value and the current real observation:

$$r_t = y_t - \sum_{k=1}^{K} G\!\left(u_{k,t}\right), \quad t \in \mathcal{D}_{\text{test}}, \tag{8}$$

where the samples in the set $\mathcal{D}_{\text{test}}$ are those in the test set.
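A brief sketch of this adjustment step with scikit-learn's SVR, reusing the hypothetical make_supervised helper from above; the kernel and hyperparameters are illustrative, as the paper does not report them.

```python
import numpy as np
from sklearn.svm import SVR

# r_train: residual series from equation (7) on the training set.
# The residuals are windowed with the same lag p as the flow series.
R_X, R_y = make_supervised(r_train, p=3)
svr = SVR(kernel="rbf", C=1.0, epsilon=0.1)   # illustrative hyperparameters
svr.fit(R_X, R_y)

# Final forecast = aggregated sub-series prediction + predicted residual.
# adjustment = svr.predict(np.asarray(r_train[-3:]).reshape(1, -1))[0]
# y_final = y_aggregated + adjustment
```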

3. The Proposed Hybrid Deep Learning Framework

For the network flow time series in power grid enterprises, we propose a novel algorithmic framework to deal with the period outliers and random noise and finally generate accurate predictions. The flowchart of the proposed approach is shown in Figure 2. This framework can be divided into three stages:
(1) Use VMD to decompose the network flow time series into subsequences $u_1, u_2, \ldots, u_K$.
(2) Use the GRU-xgboost block to train and generate predictions for each subsequence, and then aggregate the predictions of all subsequences to obtain a prediction series for the original flow series.
(3) According to the residual series of the training set, use the residual adjustment block to generate adjustment predictions for the aggregated prediction series.

In specific experiments, there is always the problem that the parameter $p$ in (4) is difficult to determine. Since the partial autocorrelation function (PACF) can describe the direct relationship between an observed value and its lag terms, we first use PACF to select a candidate value for $p$ and then determine the best choice for $p$ by trial and error. For each subsequence, since we have used GRU to reconstruct the input feature vectors, we set the same $p$ for convenience. To keep the original flow series and the residual series consistent, we also set the same $p$ for the residual series.
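A small sketch of this lag-selection step, assuming statsmodels; the significance bound is the usual approximate 95% confidence band, and the final choice of p remains a trial-and-error decision as described above.

```python
import numpy as np
from statsmodels.tsa.stattools import pacf

def significant_lags(series, nlags=24):
    """Return lags whose partial autocorrelation exceeds the ~95% bound."""
    values = pacf(np.asarray(series, dtype=float), nlags=nlags)
    bound = 1.96 / np.sqrt(len(series))
    return [lag for lag in range(1, nlags + 1) if abs(values[lag]) > bound]

# p is then fixed by trial and error among the significant lags (p = 3 in this study).
```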

(i) Require: time series $\{x_1, x_2, \ldots, x_N\}$.
(ii) Ensure: final predictions $\hat{y}$.
(1) Use VMD to decompose the time series as in equation (1) and obtain the subsequences $u_1, u_2, \ldots, u_K$.
(2) for $k = 1$ to $K$ do
(3)   Use the GRU-xgboost block to improve the quality of the input feature vectors and obtain the reconstructed hidden states $h_t$;
(4)   Generate the predictions and the modelling output of the $k$-th subsequence.
(5) end for
(6)  Aggregate the predictions of all subsequences to obtain a prediction series for the original flow series.
(7)  Aggregate the modelling outputs of all subsequences and calculate a residual series as in equation (7).
(8)  Use the residual adjustment block to generate compensation values according to the residual series.
(9)  Sum the prediction series and the compensation values to obtain the final predictions.
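The listing above can be glued together from the sketches in Section 2; the following compact, hedged end-to-end sketch reuses the hypothetical decompose_flow, make_supervised, and GRUEncoder helpers introduced earlier and is not the authors' reference implementation.

```python
import numpy as np
import torch
import xgboost as xgb
from sklearn.svm import SVR

def fit_mode(u_k, p=3, epochs=200):
    """GRU-xgboost block for one mode: returns in-sample fit and one-step forecast."""
    X, y = make_supervised(u_k, p)
    Xt = torch.tensor(X, dtype=torch.float32)
    yt = torch.tensor(y, dtype=torch.float32).unsqueeze(-1)
    enc = GRUEncoder()
    opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
    for _ in range(epochs):                                  # train the encoder
        opt.zero_grad()
        pred, _ = enc(Xt)
        torch.nn.functional.mse_loss(pred, yt).backward()
        opt.step()
    with torch.no_grad():
        _, H = enc(Xt)                                       # reconstructed features h_t
        _, h_next = enc(torch.tensor(u_k[-p:], dtype=torch.float32).reshape(1, p))
    booster = xgb.XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
    booster.fit(H.numpy(), y)
    return booster.predict(H.numpy()), float(booster.predict(h_next.numpy())[0])

def forecast_next(series, K=20, p=3):
    """Steps (1)-(9): VMD -> per-mode GRU-xgboost -> SVR residual adjustment."""
    series = np.asarray(series, dtype=float)
    series = series[len(series) % 2:]            # keep alignment with decompose_flow
    modes = decompose_flow(series, K=K)          # step (1)
    fit_sum, pred_sum = np.zeros(len(series) - p), 0.0
    for u_k in modes:                            # steps (2)-(5)
        fitted, nxt = fit_mode(u_k, p)
        fit_sum += fitted
        pred_sum += nxt
    residuals = series[p:] - fit_sum             # step (7)
    R_X, R_y = make_supervised(residuals, p)
    svr = SVR(kernel="rbf").fit(R_X, R_y)        # step (8)
    adjustment = svr.predict(residuals[-p:].reshape(1, -1))[0]
    return pred_sum + adjustment                 # step (9)
```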

4. The Case Study

In this section, we display the experimental results of our framework and make a comprehensive analysis to demonstrate its performance.

4.1. Network Flow Time Series

In our study, we collect an hourly inflow time series and an hourly outflow time series. The two datasets are from the NARI Group in Nanjing, China. Each time series starts on 12/28/2020 and ends on 03/09/2021 with a total of 1728 samples. We set the data from 12/28/2020 to 02/20/2021 as the training set, the data from 02/20/2021 to 03/01/2021 as the validation set, and the rest as the test set. The validation set is used to decide when to stop training, and the test set is used to evaluate the performance of the trained models. Table 1 shows the statistical properties of the time series. As shown in Figures 3 and 4, the data in the red boxes can be regarded as period outliers, which pose a difficult problem when training models.

4.1.1. Experimental Configuration

To evaluate the performance of the proposed approach and the benchmark models, four regression error criteria are applied in this study: the mean absolute error (MAE), mean square error (MSE), $R^2$, and adjusted $R^2$ (adj$R^2$):

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|,$$
$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2},$$
$$\text{adj}R^2 = 1 - \frac{\left(1 - R^2\right)\left(n - 1\right)}{n - m - 1},$$

where $n$ is the sample size, $\hat{y}_i$ is the prediction, $\bar{y}$ is the mean value of the observations, and $m$ is the number of features. The closer $R^2$ and adj$R^2$ are to 1, the better the model fits. Since $R^2$ alone cannot fully quantify the accuracy, adj$R^2$ is used to offset the influence of the sample size on $R^2$.
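The four criteria above translate directly into code; a minimal NumPy sketch is given below, where m is the number of input features.

```python
import numpy as np

def evaluate(y_true, y_pred, m):
    """MAE, MSE, R^2 and adjusted R^2 as defined above (m = number of features)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    n = len(y_true)
    mae = np.mean(np.abs(y_true - y_pred))
    mse = np.mean((y_true - y_pred) ** 2)
    r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)
    return mae, mse, r2, adj_r2
```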

To demonstrate the performance of the proposed approach, several benchmark models are designed for comparison: GRU, VMD-GRU, xgboost, VMD-xgboost, VMD-VMD-xgboost, VMD-xgboost with residual adjustment, and EMD-xgboost. In VMD-VMD-xgboost, the highest-frequency subsequence obtained after decomposing the original flow time series is decomposed by VMD again. VMD-xgboost with residual adjustment is denoted as VMD-xgboost-adjustment. The parameter $p$ in (4) is finally selected as 3, and for a fair comparison, all benchmark models share this parameter. For the number of modes to be recovered by VMD, we select 20 after trial and error for both network flow time series. The number of hidden nodes of the GRU is set to 40. We perform single-step forecasting (i.e., one hour ahead) for all benchmark models, which means the preceding observations are used to generate the next prediction. For each group of experiments, the predictions are generated "circularly," which means that the window of preceding observations moves forward by one step to forecast the next observation. Each group of experiments is repeated 30 times, and the results are used to plot the boxplots in Figure 5.
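The "circular" single-step evaluation can be sketched as a rolling window over the test segment; predict_one stands in for whichever trained pipeline is being benchmarked (a hypothetical callable, since the paper does not provide code).

```python
def rolling_forecast(history, test, p, predict_one):
    """One-hour-ahead rolling forecasts: the window of the p most recent
    observations slides forward by one step after each prediction."""
    window = list(history[-p:])
    preds = []
    for actual in test:
        preds.append(predict_one(window))   # forecast the next hour
        window = window[1:] + [actual]      # move the window forward by one
    return preds
```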

4.1.2. Empirical Analysis

The specific experimental results of all benchmark models are listed in Table 2. We first focus on the effectiveness of VMD. The decomposition results of the inflow series and outflow series are shown in Figures 6 and 7. According to the adjR2 criterion of VMD-xgboost (inflow: 0.9914, outflow: 0.9885) and EMD-xgboost (inflow: 0.9184, outflow: 0.8779), we can conclude that VMD achieves better performance than EMD, mainly because VMD overcomes the end-point effect and mode aliasing problems of the EMD method. Therefore, VMD is an appropriate choice for our method. Meanwhile, we further explore whether decomposing complex subsequences with VMD a second time can improve the prediction effect. According to the results of VMD-VMD-xgboost on the inflow dataset (MAE: 9034632, MSE: 1.36e14, R2: 0.9915, adjR2: 0.9914) and the outflow dataset (MAE: 11229317, MSE: 2.09e14, R2: 0.9886, adjR2: 0.9884), the prediction performance is almost consistent with that of VMD-xgboost. Therefore, we can conclude that one decomposition already achieves excellent effectiveness and further decomposition cannot bring better performance. We set two groups of models to compare the performance of GRU and xgboost. For the first group (GRU and xgboost), GRU is better than xgboost in terms of the four criteria according to the results in the 1st and 3rd rows and Figure 8, while for the second group (VMD-GRU and VMD-xgboost), VMD-xgboost performs better than VMD-GRU according to Figure 5. This illustrates that GRU may have a stronger ability than xgboost to handle complex nonlinear signals, whereas for the decomposed high-quality subsequences, xgboost achieves better fitting performance and generalization ability. Comparing the 4th and 6th rows and Figure 5, VMD-xgboost-adjustment performs slightly better than VMD-xgboost on both the inflow and outflow datasets, which suggests that the residual adjustment can slightly improve the prediction effects. Comparing VMD-xgboost-adjustment and the proposed approach, the proposed approach achieves better performance on both the inflow and outflow datasets, which proves the effect of the GRU-xgboost block in improving the quality of the input feature vectors. Therefore, the proposed approach can generate accurate flow predictions in the network flow forecasting task of power grid enterprises.

In short, according to the analysis, four main points are listed as follows:
(1) Compared with the EMD algorithm, the VMD algorithm is a better method to deal with the original complex and nonlinear network flow time series in this study.
(2) When the VMD algorithm is applied, the xgboost algorithm achieves better performance than GRU, although GRU achieves better performance than xgboost without the VMD algorithm.
(3) The residual adjustment designed in the proposed approach can further improve the forecasting performance slightly.
(4) The proposed GRU-xgboost block can improve the quality of the input feature vectors and generate more accurate network flow predictions.

5. Conclusion

This study proposes a novel approach for the network flow forecasting task of power grid enterprises, which achieves satisfactory results. The proposed network flow time series forecasting framework is composed of VMD, a GRU-xgboost block, and a residual adjustment block. In this framework, VMD is used to decompose the complex and nonlinear network flow time series with period outliers into smooth and high-quality subsequences; modelling these subsequences generates more accurate predictions than directly forecasting the original flow series. For each subsequence, to better capture the hidden state vectors of the historical series, a GRU-xgboost block is proposed to capture the effective information in historical data and remove useless patterns. The improved input feature vectors produced by the GRU are then fed into xgboost to generate a series of predictions. For a single subsequence, this block combines the advantage of GRU in capturing the hidden patterns of historical series with the advantage of xgboost in generating accurate predictions. For the predictions aggregated over all subsequences, a residual adjustment block is designed to reduce the influence of random noise in the original flow series and further improve the forecasting accuracy. We test the forecasting performance of the proposed approach on two network flow time series from the NARI Group in Nanjing, China. The experimental results demonstrate the effectiveness of the three blocks and of our approach in generating accurate network flow predictions.

The proposed network flow time series forecasting framework can be extended to other complex time series forecasting fields in the future, for example, stock price, load demand, and wind speed time series. Additionally, our approach focuses on providing specific point predictions. However, probabilistic prediction and interval prediction may have more practical significance because point predictions can only provide limited information and cannot describe the future probabilistic situation well. In the future, our approach can be improved to generate probabilistic and interval predictions.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This project was supported by the Science and Technology Project of State Grid Corporation of China (Research and application of key technologies for intelligent operation, maintenance and testing of power dispatching data network, 5100-202040329A-0-0-00).