Abstract

Traditional power load forecasting methods struggle to extract the latent high-dimensional features of historical load sequences, and they do not consider the coupling among electricity, heat, and gas. This paper therefore takes the correlation among electric, heat, and gas loads into account and proposes a short-term power load forecasting method for integrated energy systems based on Attention-CNN-DBILSTM, which combines an Attention mechanism, a CNN (Convolutional Neural Network), and a DBILSTM (Deep Bidirectional Long-Short-Term Memory) network. First, the Pearson coefficient is used to quantify the correlation among the multiple loads and their influencing factors. Second, a CNN built from one-dimensional convolutional and pooling layers extracts high-dimensional features that reflect the dynamic changes of the load; the extracted feature vectors are arranged as a time series and fed to the DBILSTM network, which models and learns the dynamic behavior of the time series data. Then, the Attention mechanism assigns different weights to the hidden states of the DBILSTM through mapping weights and a learned parameter matrix, which reduces the loss of historical information and strengthens the influence of key information, and a Dense layer outputs the load prediction. Finally, based on historical load data from an integrated energy system in a certain area of Northeast China, the influence of the multiple load correlation and its influencing factors on the power load forecasting results is analyzed. The simulation results show that the prediction accuracy of the method reaches 97.99% and that using the correlation coefficients of the electric, heat, and gas loads as input parameters of the Attention-CNN-DBILSTM network reduces the average prediction error by 0.37%∼1.93%. Comparison with the CNN-LSTM, CNN-BILSTM, and CNN-DBILSTM networks verifies that the proposed method effectively improves the prediction accuracy.

1. Introduction

Against the background of the strategic goal of “carbon peak and carbon neutrality,” the coupled operation of electricity, heat, and gas is the key to constructing a new low-carbon, safe, and efficient power system [1, 2]. As an important part of the new power system, the integrated energy system takes the power distribution system as its core and couples the complementary multienergy flows of electricity, heat, and gas. It is an important material basis and means of realization for a low-carbon, high-efficiency, multienergy power system and energy Internet with new energy as the main body, and it is the key to promoting the energy revolution in the new era [3–5]. However, the coupled multienergy-flow operation of electricity, heat, and gas poses a severe test for power system load forecasting [6]. Considering the correlation among electric, heat, and gas loads is therefore of great significance for the operation and dispatch of the integrated energy system. Unlike traditional power, thermal, and natural gas systems, which operate independently, the integrated energy system meets multiple load demands through the coordination and complementarity of electric energy, heat energy, and natural gas [3]. How important the correlation between multiple loads is relative to other factors, and how its influence should be represented in the model, are the key issues for integrated energy system power load forecasting. Some research has been carried out at home and abroad on load forecasting that considers multiple load correlations.
Literature [7] used historical time series of cooling, heating, and power loads and weather factors to form a multivariable time series and proposed a multivariable phase space reconstruction and Kalman filter for combined cooling, heating, and power system load forecasting methods; the correlation of weather factors with a large connection between cooling, heating, and power loads has been initially considered. Literature [8] analyzed the load characteristics of the integrated energy system and studied the effect of the integrated energy system load pattern and energy consumption data types on the predictability of multiple loads. Literature [9] used Copula theory to analyze the coupling characteristics of the cooling, heating, and electric loads in the integrated energy system, and a multiple load forecasting model for the integrated energy system was established. The common influencing factors of multiple loads are considered by the above research to improve the accuracy of integrated energy system load forecasting. However, limited to the common influencing factors of each energy form, the correlation of coupling factors among multiple loads has not been considered.

In terms of short-term load forecasting algorithms, traditional methods and classical machine learning methods forecast quickly, but they do not account for the temporal ordering of the data. In recent years, as the volume of training data has grown substantially, deep learning methods have been able to fully mine the features of historical data sequences and have shown better robustness [10, 11]. Much research has been carried out at home and abroad on short-term power load forecasting based on deep learning. Literature [12] applied a DNN (deep neural network) to short-term power load forecasting. Literature [13] and literature [14] used CNN models to extract load features and capture seasonal cycles to improve load forecasting accuracy. Literature [15] and literature [16] considered the long-term temporal dependence of the data and introduced the LSTM (Long-Short-Term Memory) network and the GRU (gated recurrent unit) network, respectively, into short-term power load forecasting to address the vanishing-gradient problem of RNNs (recurrent neural networks). Literature [17] proposed a BILSTM network based on feature screening to further mine the time series features of multidimensional load data. Literature [18] introduced the Attention mechanism into short-term load forecasting, assigning different weights to the hidden-layer states to strengthen the influence of important information in the load data. These studies achieve high accuracy on highly nonlinear sequences, but the power load data of an integrated energy system with a high proportion of renewable energy is strongly volatile and uncertain, and a single, unidirectional neural network model has difficulty learning the dynamic changes of the load.

The above analysis shows that current domestic and foreign research on power load forecasting for the integrated energy system mainly considers the common influencing factors of multiple loads, whereas little research addresses load forecasting that considers the correlation of the coupling factors among multiple loads. Therefore, given the strong coupling among the multiple loads of the integrated energy system, deep learning methods need to be introduced into its power load forecasting on the basis of the multiple load correlation. At the same time, relevant influencing factors such as season, weather, and date should be considered, and the multiple load time series data should be processed and analyzed appropriately.

Based on the above considerations, this article considers the correlation among electric, heat, and gas loads and proposes a CNN-DBILSTM short-term power load forecasting model for the integrated energy system based on the Attention mechanism. First, the Pearson coefficient is used to quantify the correlation among the multiple loads of the integrated energy system, a significance test is applied to the resulting coefficients, and the strongly correlated factors are selected to support load forecasting. Then, the CNN extracts effective feature vectors from the historical load sequence as the input of the DBILSTM network, which models the dynamic changes of the extracted time series features. The Attention mechanism assigns different probability weights to the hidden states of the DBILSTM network to strengthen the influence of important information. Finally, using historical load data of the integrated energy system in a certain area of Northeast China, MAPE (mean absolute percentage error) and RMSE (root mean square error) are used as evaluation indicators. The proposed model is compared with the CNN-LSTM, CNN-BILSTM, and CNN-DBILSTM networks to verify that the Attention-CNN-DBILSTM network proposed in this paper effectively improves the power load prediction accuracy of the integrated energy system.

2. Analysis of Load Correlation of Integrated Energy System

The integrated energy system is composed of energy input equipment, energy conversion equipment, and multiple loads. Among them, power generation equipment mainly includes photovoltaic power generation, wind power generation, and power purchase from external power grids. Electric energy, heat energy, and gas energy are coupled and converted by multienergy flow through energy conversion equipment. Energy conversion equipment includes electric boilers, micro gas turbines, and P2G equipment. The structure of the integrated energy system is shown in Figure 1.

Considering that the Pearson coefficient reflects both the direction and the degree of the joint variation of two variables, it is used to quantitatively analyze the correlation between electricity, heat, and gas in the integrated energy system [19]. For variables $x$ and $y$, the correlation coefficient can be expressed as

$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^{2}}} \tag{1}$$

In the formula, $\bar{x}$ and $\bar{y}$ are the mean values of the $n$ eigenvalues $x$ and $y$, respectively. The closer $|r|$ is to 1, the stronger the correlation between the eigenvalues $x$ and $y$; the closer $|r|$ is to 0, the weaker the correlation between them.

Then, a significance test is used to check the reliability of the correlation coefficient $r$. Let the null hypothesis $H_0$ state that there is no correlation between the two variables. The $t$-distribution test is used, and the statistic is calculated as

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} \tag{2}$$

Finally, for the given significance level $\alpha$ and $n-2$ degrees of freedom, the $t$ distribution table is used to find the critical value $t_{\alpha/2}(n-2)$. If $|t| > t_{\alpha/2}(n-2)$, the null hypothesis $H_0$ is rejected, indicating that the two variables are correlated. In this article, which considers the correlation between multiple loads and their influencing factors, the significance level is not fixed in advance. Instead, the statistic is calculated according to formula (2), and the significance level at which $H_0$ is rejected is then obtained from the $t$ distribution table.
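The correlation coefficient and the test statistic described above can be sketched in a few lines of Python. The load values below are hypothetical and are not data from the study; the critical value 2.776 is the two-sided 5% value for 4 degrees of freedom taken from a standard t table, chosen here only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for H0 (no correlation) with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

# Hypothetical electric and heat load samples (illustrative only).
electric = [310.0, 325.0, 340.0, 360.0, 355.0, 330.0]
heat = [120.0, 128.0, 131.0, 145.0, 140.0, 126.0]

r = pearson_r(electric, heat)
t = t_statistic(r, len(electric))

# Assumed two-sided 5% critical value for n - 2 = 4 degrees of freedom.
T_CRIT = 2.776
correlated = abs(t) > T_CRIT
print(f"r = {r:.3f}, t = {t:.2f}, reject H0: {correlated}")
```

In practice a statistics library would usually supply the p-value directly; the table lookup is kept here to mirror the procedure in the text.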

In the short-term power load forecasting process of the integrated energy system, the power load forecast results are not only related to the types of electricity, heat, and gas but also affected by weather factors and economic factors. However, the increase in the factors considered in the forecasting model will also increase the uncertainty. The Pearson coefficient is used in this paper to carry out quantitative correlation analysis, and the more relevant influencing factors are selected for the short-term power load forecasting of the integrated energy system to improve the accuracy of the forecasting model.

3. Principles of Deep Learning Models

3.1. CNN Principle Structure

CNN is a feed-forward neural network, and it is also a learning algorithm with a multilayer network structure [20]. It consists of a convolutional layer, a pooling layer, and a fully connected layer. CNN uses local connections and weight sharing to process data information. The alternate use of multiple convolutional layers and pooling layers to extract data feature vectors can effectively reduce data complexity, reduce the number of weights, and improve the quality of data features and the generalization ability of the prediction model. The CNN model structure is shown in Figure 2.
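To make the convolution and pooling steps concrete, the following is a minimal pure-Python sketch of one one-dimensional convolution with ReLU followed by max pooling. The kernel, the input values, and the function names are illustrative assumptions; a real model would learn the kernel weights and use many filters.

```python
def conv1d_relu(signal, kernel, bias=0.0):
    """Valid 1-D convolution (sliding dot product) followed by ReLU."""
    k = len(kernel)
    out = []
    for start in range(len(signal) - k + 1):
        window = signal[start:start + k]
        value = sum(w * s for w, s in zip(kernel, window)) + bias
        out.append(max(0.0, value))  # ReLU keeps only positive responses
    return out

def max_pool1d(features, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]

# Hypothetical normalized load values and a simple "rising edge" kernel.
load = [0.2, 0.4, 0.9, 0.7, 0.3, 0.1, 0.5, 0.8]
kernel = [-1.0, 0.0, 1.0]

features = conv1d_relu(load, kernel)    # 6 values from 8 inputs
pooled = max_pool1d(features, size=2)   # 3 values after pooling
print(features, pooled)
```

The same kernel is applied at every position (weight sharing), and pooling halves the sequence length while keeping the strongest local response, which is how the layer reduces the number of weights and the data complexity.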

3.2. BILSTM Principle Structure

The LSTM network is an improved RNN. Its gating structure mitigates the gradient explosion and gradient vanishing problems that arise during training, which improves the prediction accuracy [21]. The BILSTM (Bidirectional Long-Short-Term Memory) network was proposed to address the low data utilization and poor data relevance caused by the forward-only time series propagation used to train the LSTM network [22]. The BILSTM network is a bidirectional recurrent network over the time series: the input data are processed in both temporal directions, so the output contains information from the entire sequence, which gives better time series processing capability.

The RNN obtains memory through parameter sharing between neurons and is often used to process the nonlinear characteristics of sequence data, but its storage capacity is poor: as the time interval increases, earlier hidden-layer information is overwritten and the gradient vanishes. LSTM introduces long-term and short-term memory through gating units, which alleviates gradient explosion and gradient vanishing to a certain extent. Its network structure is shown in Figure 3.

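LSTM controls the output of the memory unit through three gates: the input gate $i_t$, the forget gate $f_t$, and the output gate $o_t$.

$$\begin{aligned}
i_t &= \sigma\left(W_i[h_{t-1}, x_t] + b_i\right),\\
f_t &= \sigma\left(W_f[h_{t-1}, x_t] + b_f\right),\\
o_t &= \sigma\left(W_o[h_{t-1}, x_t] + b_o\right),\\
\tilde{c}_t &= \tanh\left(W_c[h_{t-1}, x_t] + b_c\right),\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,\\
h_t &= o_t \odot \tanh(c_t).
\end{aligned}$$

In the above formula, $W_i$, $W_f$, $W_o$, $b_i$, $b_f$, and $b_o$ are gate training parameters; each gate is jointly determined by the input $x_t$, the previous hidden layer output $h_{t-1}$, and the activation function $\sigma$; $\tanh$ is the activation function applied to the cell state; $c_t$ is the cell state; $\tilde{c}_t$ is the candidate value of the new cell state $c_t$, with parameters $W_c$ and $b_c$.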
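As an illustration of how the three gates interact, here is a minimal sketch of a single LSTM step in the standard formulation, written for scalar inputs and states to keep it readable. The parameter names (`w_*` for input weights, `u_*` for recurrent weights, `b_*` for biases) are hypothetical and simply split the usual concatenated weight into two parts.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps names to weights and biases."""
    i_t = sigmoid(p["w_i"] * x_t + p["u_i"] * h_prev + p["b_i"])  # input gate
    f_t = sigmoid(p["w_f"] * x_t + p["u_f"] * h_prev + p["b_f"])  # forget gate
    o_t = sigmoid(p["w_o"] * x_t + p["u_o"] * h_prev + p["b_o"])  # output gate
    c_hat = math.tanh(p["w_c"] * x_t + p["u_c"] * h_prev + p["b_c"])  # candidate
    c_t = f_t * c_prev + i_t * c_hat   # keep part of the old state, add new
    h_t = o_t * math.tanh(c_t)         # expose a gated view of the cell state
    return h_t, c_t

# Hypothetical parameters and a short normalized load sequence.
names = ["w_i", "u_i", "b_i", "w_f", "u_f", "b_f",
         "w_o", "u_o", "b_o", "w_c", "u_c", "b_c"]
params = {name: 0.5 for name in names}

h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.9]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

Because the cell state is updated additively through the forget and input gates, information from earlier steps can persist longer than in a plain RNN.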

LSTM network prediction data depends on the output of the hidden layer at the previous moment, so the data usage rate is low and the relevance is poor. BILSTM is composed of a combination of a forward LSTM of a forward input sequence and a backward LSTM of a reverse input sequence; each generates output data and then is connected to the output node to synthesize the final output data. The forward and backward relationships of the input data are effectively extracted, without relying on data timing and predefined parameters, and the network structure of BILSTM is shown in Figure 4.

3.3. DBILSTM Principle Structure

Different from traditional power load forecasting, the power load forecasting process of the integrated energy system needs to consider factors such as the correlation among electricity, heat, and gas loads. Because a single-layer BILSTM network handles the complex time series data in integrated energy system power load forecasting poorly, this article uses a DBILSTM network model composed of multiple BILSTM networks to forecast the power load of the integrated energy system. The network structure is shown in Figure 5.

The DBILSTM network is composed of an input layer, a hidden layer, a Dense layer, and an output layer. The hidden layer is composed of n BILSTM networks. The BILSTM network of each layer obtains the information in the front and back directions through the forward LSTM network and the reverse LSTM network [23].

Through the information fusion of the first n – 1 layer BILSTM network, the output of the n-th layer is used as the bidirectional time series feature vector of the load at time t, and the prediction result is output through the Dense layer. The calculation process is as follows.

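Suppose that the i-th input sequence is $x_i$; then the output sequence of the first layer can be expressed as

$$\overrightarrow{h}_i^{1} = \overrightarrow{\mathrm{LSTM}}\left(x_i, \overrightarrow{h}_{i-1}^{1}\right),\qquad
\overleftarrow{h}_i^{1} = \overleftarrow{\mathrm{LSTM}}\left(x_i, \overleftarrow{h}_{i+1}^{1}\right),\qquad
h_i^{1} = f\left(\overrightarrow{W}^{1}\overrightarrow{h}_i^{1} \oplus \overleftarrow{W}^{1}\overleftarrow{h}_i^{1}\right)$$

In the above formula, $f$ is the activation function; $\overrightarrow{W}^{1}$ and $\overleftarrow{W}^{1}$ are the forward and backward weight matrices, respectively, with the superscript representing the current number of layers; $\oplus$ represents the addition calculation, keeping the original data dimension unchanged.

The output sequence of the n-th layer can be expressed as

$$\overrightarrow{h}_i^{n} = \overrightarrow{\mathrm{LSTM}}\left(h_i^{n-1}, \overrightarrow{h}_{i-1}^{n}\right),\qquad
\overleftarrow{h}_i^{n} = \overleftarrow{\mathrm{LSTM}}\left(h_i^{n-1}, \overleftarrow{h}_{i+1}^{n}\right),\qquad
h_i^{n} = f\left(\overrightarrow{W}^{n}\overrightarrow{h}_i^{n} \oplus \overleftarrow{W}^{n}\overleftarrow{h}_i^{n}\right)$$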
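The stacking idea can be illustrated with a small sketch. To keep it short, a plain tanh recurrent cell stands in for the LSTM cell (this is a deliberate simplification, not the network used in the paper), the forward and backward states are combined by element-wise addition so that the dimension is unchanged, and each layer's output sequence feeds the next layer. All parameter names and values are hypothetical.

```python
import math

def run_direction(seq, w_in, w_rec, reverse=False):
    """Run a simple tanh recurrent cell over seq (a stand-in for an LSTM)."""
    order = reversed(range(len(seq))) if reverse else range(len(seq))
    states = [0.0] * len(seq)
    h = 0.0
    for t in order:
        h = math.tanh(w_in * seq[t] + w_rec * h)
        states[t] = h  # store the state at its original time index
    return states

def bidirectional_layer(seq, params):
    """Combine forward and backward states by element-wise addition."""
    fwd = run_direction(seq, params["w_in_f"], params["w_rec_f"])
    bwd = run_direction(seq, params["w_in_b"], params["w_rec_b"], reverse=True)
    return [f + b for f, b in zip(fwd, bwd)]

def deep_bidirectional(seq, layers):
    """Feed the output sequence of each layer into the next one."""
    out = list(seq)
    for params in layers:
        out = bidirectional_layer(out, params)
    return out

# Two stacked layers with hypothetical weights.
layer = {"w_in_f": 0.8, "w_rec_f": 0.3, "w_in_b": 0.8, "w_rec_b": 0.3}
features = deep_bidirectional([0.1, 0.5, 0.9, 0.4], [layer, layer])
print(features)
```

Each output element depends on both earlier and later inputs, which is the property the bidirectional structure is meant to provide.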

3.4. Attention Mechanism Principle Structure

The Attention mechanism is modeled on the way the human brain allocates attention. In essence, it ignores low-relevance information and highlights the required information through a probability allocation mechanism. According to the influence of the input feature vectors on the output, different weights are given to the hidden-layer states, which effectively improves the prediction accuracy of the model [24]. In this article, the Attention mechanism is introduced into the CNN-DBILSTM power load forecasting model for the integrated energy system. It attends selectively to the influence of inputs at different time steps on the load forecasting result and assigns higher weights to key information, which improves the load forecasting accuracy. The Attention mechanism structure is shown in Figure 6, where $a_t$ is the attention probability distribution value of the CNN-DBILSTM hidden layer under the Attention mechanism and $y$ is the CNN-DBILSTM output value after the Attention mechanism is introduced.

4. Attention-CNN-DBILSTM Model

In power load forecasting for the integrated energy system, the historical time series data contain important characteristic information that reflects the trend of power load changes. Traditional methods such as DBM and DNN require manual extraction of fixed temporal features and do not fully consider the temporal ordering and correlation of historical load data. In this article, the CNN is first used to extract the periodic features of the historical load, and the extracted time series features are then fed into the DBILSTM network, which models and learns their dynamic changes. The DBILSTM network performs well in modeling the highly volatile and uncertain time series load data of the integrated energy system. However, short-term power load forecasting for the integrated energy system involves long time series, for which the DBILSTM network may suffer from information loss and modeling difficulties. The Attention mechanism is therefore introduced to assign different probability weights to the hidden states of the DBILSTM network and strengthen the influence of important information. The resulting short-term power load forecasting model based on Attention-CNN-DBILSTM combines multiple network structures to learn the dynamic characteristics of the time series data of the integrated energy system effectively.

5. Attention-CNN-DBILSTM Model Structure

The power load forecasting framework based on Attention-CNN-DBILSTM proposed in this paper is shown in Figure 7. It is composed of an input layer, a CNN layer, a DBILSTM layer, an Attention layer, a Dense layer, and an output layer. The specific description of the model is as follows:

(1) Input layer: First, the Pearson coefficient is used to perform a correlation analysis on the relevant data, and factors with a correlation coefficient greater than 0.3 are selected as effective input factors. Then, abnormal values in the relevant data are handled and the data are normalized. The input layer uses the preprocessed historical time series data as the input of the prediction model, which can be expressed as $X = [x_1, x_2, \ldots, x_n]^{T}$.

(2) CNN layer: The CNN layer is used to extract features from the input data. A CNN consisting of 2 one-dimensional convolutional layers, 2 pooling layers, and a Dense layer is built, and the ReLU activation function is used in the convolutional layers. In order to retain the load fluctuation information, pooling layer 1 and pooling layer 2 use max pooling. After the convolutional and pooling layers, the data pass through the Dense layer, which uses the Sigmoid activation function, to produce the feature vector. The output feature vector of the CNN layer can be expressed as

$$\begin{aligned}
C_1 &= \mathrm{ReLU}(X \otimes W_1 + b_1),\\
P_1 &= \max(C_1) + b_2,\\
C_2 &= \mathrm{ReLU}(P_1 \otimes W_2 + b_3),\\
P_2 &= \max(C_2) + b_4,\\
H_C &= \mathrm{Sigmoid}(P_2 \times W_3 + b_5).
\end{aligned}$$

In the above formula, $C_1$ and $C_2$ are the outputs of convolutional layer 1 and convolutional layer 2, respectively; $P_1$ and $P_2$ are the outputs of pooling layer 1 and pooling layer 2, respectively; $W_1$, $W_2$, and $W_3$ are weight matrices; $b_1$, $b_2$, $b_3$, $b_4$, and $b_5$ are deviations; $\otimes$ is the convolution operation function; the output length of the CNN layer is $I$; $H_C = [h_{c1}, h_{c2}, \ldots, h_{cI}]^{T}$.

(3) DBILSTM layer: The DBILSTM network is composed of multiple BILSTM networks, each including a forward LSTM and a reverse LSTM. Compared with the traditional LSTM network, the DBILSTM network has better time series data learning capabilities. The final output sequence of the DBILSTM layer is

$$h_t = w_o\, f\left(w_d\, h_t^{n} + b_d\right)$$

In the above formula, $f$ is the ReLU activation function of the Dense layer; $w_d$ and $w_o$ are the weight parameters of the Dense layer and the output layer, respectively; $b_d$ is the offset of the Dense layer; $h_t^{n}$ is the output of the n-th BILSTM layer at time $t$.

(4) Attention layer: The output of the DBILSTM layer, after the activation function, is used as the input of the Attention layer. The probabilities corresponding to the different feature vectors are calculated according to the weight distribution principle and are updated iteratively to obtain a better weight parameter matrix. The weight coefficient calculation formula can be expressed as

$$e_t = u \tanh(w h_t + b),\qquad a_t = \frac{\exp(e_t)}{\sum_{j=1}^{t}\exp(e_j)},\qquad s_t = \sum_{j=1}^{t} a_j h_j$$

In the above formula, $e_t$ represents the attention probability distribution value determined by the output vector $h_t$ of the DBILSTM layer at time $t$; $u$ and $w$ are weight coefficients; $b$ is the bias coefficient; $a_t$ is the normalized attention weight; $s_t$ is the output of the Attention layer at time $t$.

(5) Output layer: The output layer calculates the output with a prediction step of $m$ through the fully connected layer, $Y = [y_1, y_2, \ldots, y_m]^{T}$. The prediction formula is

$$y_t = w_y s_t + b_y$$

In the above formula, $y_t$ is the predicted output value at time $t$; $w_y$ is the weight matrix; $b_y$ is the bias vector.
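The weighting performed by the Attention layer can be sketched as follows: each hidden state is scored with a tanh mapping, the scores are normalized with a softmax so that they sum to 1, and the hidden states are combined with those weights. Scalar hidden states and the parameter values `w`, `u`, and `b` are illustrative assumptions; in the model they are learned during training.

```python
import math

def attention_pool(hidden_states, w, u, b):
    """Score each hidden state, softmax the scores, return weights and sum."""
    scores = [u * math.tanh(w * h + b) for h in hidden_states]
    peak = max(scores)                           # subtract max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]          # attention probabilities
    context = sum(a * h for a, h in zip(weights, hidden_states))
    return weights, context

# Hypothetical hidden states from the recurrent layer at four time steps.
hidden = [0.1, 0.7, 0.3, 0.9]
weights, context = attention_pool(hidden, w=1.0, u=2.0, b=0.0)
print(weights, context)
```

With a positive `u`, larger hidden states receive larger weights, so the weighted sum is pulled toward the time steps that the scoring function considers most informative.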

5.1. Loss Function

In the training of the Attention-CNN-DBILSTM model, the Adam (adaptive moment estimation) optimization algorithm is selected to optimize the model parameters [25]. Adam is a first-order optimization algorithm that minimizes the loss function by iteratively updating the network weights. The root mean square error is used as the loss function of the model, which can be expressed as

$$L = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}$$

In the above formula, $n$ is the number of load forecast output moments; $y_i$ is the actual value and $\hat{y}_i$ is the load forecast value at time $i$.
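For readers unfamiliar with Adam, the update rule can be sketched on a one-parameter toy problem. The decay rates and epsilon below are the commonly used defaults; the learning rate, the step count, and the quadratic objective are illustrative assumptions and are not settings reported in the paper.

```python
import math

def adam_minimize(grad, theta, steps=500, lr=0.05,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a scalar function with Adam, given its gradient."""
    m = 0.0  # running mean of gradients (first moment)
    v = 0.0  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

# Toy objective (theta - 3)^2 with gradient 2 * (theta - 3).
theta_star = adam_minimize(lambda th: 2.0 * (th - 3.0), theta=0.0)
print(theta_star)
```

In the forecasting model the same update is applied to every network weight, with the gradient of the loss obtained by backpropagation.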

5.2. Case Analysis

In order to verify the accuracy of the Attention-CNN-DBILSTM model considering the multiple load correlation of the integrated energy system, the MATLAB platform was used to simulate and analyze the original electricity, heat, and gas load data of an integrated energy system in a certain area of northern China from January 1, 2016, to December 31, 2017. The data are sampled every 30 minutes, giving 48 points per day, and are divided into training, validation, and test sets in the ratio 8:1:1. In addition to the electricity, heat, and gas load data, the integrated energy system relies on information acquisition devices to obtain hourly temperature, radiation, wind speed, wind direction, and other related data, as well as holiday information.
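A minimal sketch of how such a dataset might be split and turned into supervised samples is shown below. It assumes the 8:1:1 split is taken in chronological order and uses a hypothetical look-back of 48 points (one day) to predict the next point; neither choice is stated in the paper.

```python
def chronological_split(series, ratios=(8, 1, 1)):
    """Split a series into train/validation/test parts without shuffling."""
    total = sum(ratios)
    n = len(series)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return series[:train_end], series[train_end:val_end], series[val_end:]

def make_windows(series, lookback, horizon):
    """Pair each window of past values with the values that follow it."""
    samples = []
    for start in range(len(series) - lookback - horizon + 1):
        x = series[start:start + lookback]
        y = series[start + lookback:start + lookback + horizon]
        samples.append((x, y))
    return samples

# Placeholder series: 10 days of 48 half-hourly points (illustrative only).
series = [float(i % 48) for i in range(480)]
train, val, test = chronological_split(series)
train_samples = make_windows(train, lookback=48, horizon=1)
print(len(train), len(val), len(test), len(train_samples))
```

Keeping the split chronological avoids leaking future observations into the training set, which matters for time series evaluation.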

The distribution of power load data of the integrated energy system is shown in Figure 8. It can be seen that the load data is cyclical when measured in years, but it fluctuates sharply when measured in days.

6. Data Preprocessing and Model Evaluation Index

The mean square method is used to deal with the abnormal amount in the load data, which can be expressed as

In the above formula, is the number of pieces of daily load data; is the load on the i-th day; the abnormal point criterion is .
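One common way to flag abnormal points with a mean-and-deviation rule is sketched below. The threshold multiplier `k` is an assumption introduced for illustration and should not be read as the criterion used in the paper, and the load values are hypothetical.

```python
import math

def flag_outliers(values, k=3.0):
    """Flag points that lie more than k standard deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [abs(v - mean) > k * std for v in values]

# Hypothetical daily load readings with one obvious spike.
readings = [10.0] * 9 + [100.0]
flags = flag_outliers(readings, k=2.0)
print(flags)
```

Flagged points would then be replaced or interpolated before normalization; how that is done depends on the chosen preprocessing policy.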

In order to facilitate the training of the model, the min-max normalization method is used to linearly transform the original data and map it to the interval [0, 1], which can be expressed as

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

In the above formula, $x^{*}$ is the normalized data; $x$ is the original load data; $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the sample data, respectively.
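The normalization and its inverse, which is needed to convert predictions back to load units, can be sketched as follows. The function names and sample values are illustrative.

```python
def min_max_normalize(values):
    """Scale values linearly to [0, 1]; also return the min and max used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values], lo, hi

def denormalize(scaled, lo, hi):
    """Map normalized values back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]

# Hypothetical load values.
scaled, lo, hi = min_max_normalize([10.0, 20.0, 30.0])
restored = denormalize(scaled, lo, hi)
print(scaled, restored)
```

The minimum and maximum should be taken from the training data and reused for the validation and test sets so that all three are scaled consistently.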

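The mean absolute percentage error (MAPE) and the root mean square error (RMSE) are selected as evaluation indicators, which can be expressed as

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%,\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}$$

In the above formula, $n$ is the number of predicted results; $y_i$ and $\hat{y}_i$ are the actual value and the predicted value, respectively.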
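Both indicators are straightforward to compute. The sketch below uses hypothetical actual and predicted values and assumes that no actual value is zero, since MAPE is undefined in that case.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical actual and predicted loads.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE is scale-free and therefore comparable across loads of different magnitude, while RMSE penalizes large individual errors more heavily.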

7. Correlation Coefficient of Influencing Factors

Correlation analysis is performed on the electricity, heat, and gas load data in the integrated energy system. After the t distribution test is performed on the obtained Pearson coefficient, the influencing factors with a correlation coefficient greater than 0.3 are normalized and then used as model input for training. The Pearson correlation coefficient values of typical months are shown in Table 1.

The influencing factors of the effective correlation of multiple loads are shown in Figure 9. It can be seen that the thermal load correlation coefficients are all greater than 0.4, which has a great impact on the power load forecasting, and then the thermal load data in the period is used as an influencing factor to participate in the power load forecasting. The gas load affects the power load forecast at a certain time, and then the cases where the correlation coefficient is greater than 0.3 are screened out and the gas load data at this time is used as an influencing factor to participate in the power load forecast. Therefore, it is necessary to consider the heat load and gas load data in the power load forecasting process of the integrated energy system.

8. Result Analysis

In order to verify the accuracy and stability of the proposed model, CNN-LSTM, CNN-BILSTM, CNN-DBILSTM, Attention-CNN-DBILSTM, and Attention-CNN-DBILSTM models considering the correlation of multiple loads are introduced for comparative analysis. The input data are all preprocessed time series data. The data of the first 16 months are used to train the model to predict the daily load for the next 8 months. One week in each month of the forecast sample is taken for daily power load forecasting. The evaluation indexes of each model are shown in Table 2.

According to the trend of the average evaluation indexes of each model in Figures 10 and 11, the average MAPE of the Attention-CNN-DBILSTM model that considers the multiple load correlation proposed in this article is 1.93%, 1.39%, 0.71%, and 0.37% lower than that of the CNN-LSTM, CNN-BILSTM, CNN-DBILSTM, and Attention-CNN-DBILSTM (without load correlation) models, respectively. The average RMSE is correspondingly lower by 41.73%, 29.93%, 23.61%, and 15.79%. Therefore, after the multiple load correlation is considered, the proposed Attention-CNN-DBILSTM model effectively improves the prediction accuracy, and the average prediction accuracy reaches 97.99%.

In order to more intuitively reflect the load forecasting effect of different models, Figure 12 shows the comparison between the load forecasting value and the actual value on June 28, 2016, which is a randomly selected working day.

It can be seen that the load curve of the selected forecast day is approximately bimodal and that it fluctuates greatly at sampling points 20–40 (corresponding to the 10:00–20:00 time period). Compared with other models, the proposed model has the smallest fluctuation in the peak and trough regions of the load curve. In addition, in the rising and falling phases of the morning and evening load curves, the proposed model also captures the law of load changes well.

Figure 13 shows the comparison between the predicted load value and the actual value on January 1, 2017, which is a randomly selected holiday.

The prediction results of other models have fluctuated greatly. The proposed model has better prediction accuracy, and it has been verified that the proposed model is also suitable for holidays.

It can be seen that the predictions of the Attention-CNN-DBILSTM model proposed in this article are close to the actual values and that its prediction accuracy is higher. Compared with other models, the proposed model performs better during load peak and valley periods, accurately captures the law of load change during these periods, and is also applicable to holidays. The accuracy of load forecasting is thus effectively improved.

9. Conclusions

Aiming at the influence of the multiple load correlation factors of the integrated energy system on power load forecasting, the correlation between electricity, heat, and gas load is analyzed, and the CNN-DBILSTM short-term power load forecasting model based on the Attention mechanism is established. The effective features of load changes are extracted through CNN, and the CNN-DBILSTM network combined with the Attention mechanism is used to train the model to mine the internal timing characteristics of the load data. Then the Dense layer is used to output the short-term power load forecast value. Finally, the validity of the model is simulated and verified based on the historical load data of the integrated energy system in a certain area of northern China. The main conclusions are as follows:

(1) The coupling characteristics of the electricity, heat, and gas multiple loads of the integrated energy system are considered, and the Pearson coefficient is used to quantitatively calculate the correlation of multiple loads. The results show that the thermal load, natural gas load, and electrical load are highly correlated. Among them, the correlation coefficient between electrical load and thermal load at some moments is stronger than that of weather.

(2) After the multiple load correlation is considered, the average prediction accuracy reaches 97.99%. The average MAPE dropped by 0.37%∼1.93%, and the average RMSE dropped by 15.79%∼41.73%. The accuracy of power load forecasting was effectively improved.

(3) Compared with traditional deep learning models, the CNN-DBILSTM short-term power load forecasting model based on the Attention mechanism proposed in this article can fully mine the time series characteristics of load data under multidimensional input feature parameters and has higher prediction accuracy in short-term power load forecasting.

Data Availability

The datasets generated for this study are available upon request to the corresponding author.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

Acknowledgments

This work was supported by Key R&D Program of Liaoning Province (2020JH2/10300101), Liaoning Revitalization Talents Program (XLYC1907138), and Key R&D Program of Shenyang (GG200252).