Abstract

Wind energy is a renewable energy source with great development potential, and a reliable and accurate prediction of wind speed is the basis for the effective utilization of wind energy. Aiming at hyperparameter optimization in a combined forecasting method, a wind speed prediction model based on the long short-term memory (LSTM) neural network optimized by the firework algorithm (FWA) is proposed. Focusing on the real-time sudden change and dependence of wind speed data, a wind speed prediction model based on LSTM is established, and FWA is used to optimize the hyperparameters of the model so that the model can set parameters adaptively. Then, the optimized model is compared with the wind speed prediction based on other deep neural architectures and regression models in experiments, and the results show that the wind speed model based on FWA-improved LSTM reduces the prediction error when compared with other wind speed prediction-based regression methods and obtains higher prediction accuracy than other deep neural architectures.

1. Introduction

As a green renewable energy source, wind power has broad commercial development prospects, and research on related forecasting technologies is increasingly important. However, the randomness, volatility, and intermittency of wind resources pose great challenges to the stable operation of the power system. Traditional wind power forecasting technologies are no longer sufficient to solve these problems, so it is urgent to introduce cutting-edge artificial intelligence technology. Artificial intelligence is a branch of computer science dedicated to the research and development of theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. In recent years, the rapid development of machine learning, deep learning, and other artificial intelligence technologies has provided new ideas and new opportunities for the research and implementation of high-precision wind power prediction.

Wind power prediction relies on wind speed estimation. Because wind speed follows cyclical and daily patterns yet remains highly stochastic, accurate prediction of wind power is difficult. Efficient conversion and use of wind energy resources therefore require accurate and complete information on the wind characteristics of the region; local and regional climate, topography, and obstacles such as buildings all affect wind energy. In recent decades, scholars have proposed various prediction methods based on historical wind speed time series. These models can generally be divided into four types: physical models, statistical models, intelligent learning models, and hybrid models.

Physical approaches, which are based on a detailed physical description of the atmosphere, use meteorological data such as air temperature, topography, and pressure to predict wind speed, which leads to intricate calculations and high costs [1]. Statistical methods, such as the Autoregressive Integrated Moving Average (ARIMA) model, Seasonal ARIMA (SARIMA), the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, and Monte Carlo simulation [2], predict wind speed under a linearity assumption and are more accurate than physical methods [3, 4]. However, wind speed variation has significant nonlinear and chaotic characteristics, so it is usually difficult to predict future wind speed accurately by applying these methods or models alone. In addition, statistical methods require a large amount of data for learning and modeling and are more suitable for ultrashort-term wind power prediction. Intelligent learning methods, such as Support Vector Regressor (SVR), Decision Tree Regressor (DTR), Multivariate Linear Regression (MLR), and Artificial Neural Network (ANN), are trained on wind speed data and fit the nonlinear changes of wind speed better [5–7]. SVR, MLR, and DTR have advantages in sparsity, generalization, and solving nonlinear prediction problems, but their key parameters mainly rely on manual selection [8–11]. ANN [12, 13] offers good nonlinear fitting and strong self-learning ability, but it is unstable, converges slowly, easily falls into local optima, and its network structure, including the number of hidden layers, is difficult to determine. Wind speed prediction models based on Convolutional Neural Networks (CNN) can take the temporal and spatial correlation of wind speed into account to make ultrashort-term predictions of the spatial distribution of wind speed [14–16].

Wind speed is affected by many factors, and a single prediction model cannot fully capture all of them. Particularly in extreme weather, a single model does not learn sufficiently, which may lead to large prediction deviations. The combined prediction method takes the respective advantages of different models into account: optimally combining several single models and exploiting the strengths of each can significantly improve prediction accuracy [17–20]. Combination prediction methods mainly include weighted combination prediction and fusion combination prediction [21]. For weighted combination, the key lies in determining the weight coefficients. The fixed weight coefficient method [22] is simple and easy to implement, whereas the variable weight coefficient method [17] offers strong adaptability and high accuracy. Fusion combination applies other prediction methods to optimize different prediction stages, including input data stabilization, model parameter optimization, and output error correction. For input data stabilization, the wind speed sequence is preprocessed with empirical mode decomposition (EMD) [22–25], variational mode decomposition (VMD) [26–29], analytical mode decomposition (AMD) [30, 31], wavelet decomposition [14, 25, 32], and so on to make the data stable, which yields better prediction results. In addition, the Hilbert–Huang transform (HHT) [33], fast correlation filter [34], principal component analysis (PCA) [35], and similar techniques have been used to extract input features from wind speed data, giving good results when combined with other prediction methods in short-term wind speed prediction models. Optimizing the model parameters with an intelligent algorithm is another important approach. According to the characteristics of the wind speed data, the intelligent algorithm determines the parameters adaptively during training to improve the learning ability and generalization ability of the model. The genetic algorithm [36], particle swarm optimization algorithm [27], and cuckoo algorithm [37] have been used to optimize the parameters and threshold values of BPNN, LSTM, SVM, and other intelligent learning models in hybrid models, which overcomes the low prediction accuracy of a single model and improves the accuracy of wind speed prediction. For output error correction, the prediction results of the traditional method are fed into an error model, and the predicted error is superimposed on them to correct the result; this approach is broadly applicable and is not limited to a specific prediction process [38–41].

In this paper, based on the measured data of a wind turbine in a power plant and the analysis of wind power time series, the combined prediction method is proposed. Firstly, a wind speed prediction model based on LSTM is established. Then, from the perspective of model hyperparameter optimization, the fireworks algorithm (FWA) is used to automatically search for the best hyperparameter combination suitable for wind speed data. Finally, the optimized FWA-LSTM is used to predict and analyse the wind speed data, and its feasibility and effectiveness are verified.

This paper is organized as follows: Section 2 constructs a wind speed prediction model based on LSTM; Section 3 presents the firework algorithm, the hyperparameter optimization of LSTM by the firework algorithm, and the resulting optimized LSTM wind speed prediction algorithm; Section 4 discusses the experimental environment configuration and parameter settings, the wind speed prediction results of the proposed method, and the comparison with other methods; finally, Section 5 draws the main conclusions.

2. Wind Speed Prediction Model Based on LSTM

2.1. LSTM Neural Network Model

Traditional neural network models lose distant information and have difficulty learning long-range dependencies. LSTM is an improvement of the recurrent neural network that aims to overcome its shortcomings in handling long-term memory. LSTM introduces the concept of the cell state, which determines which information should be preserved and which should be forgotten. The basic principle of LSTM is shown in Figure 1.

As shown in Figure 1, xt is the input at time t, ht−1 is the output of the hidden layer at time t − 1, and Ct−1 is the historical information output at time t − 1; f, i, and o are, respectively, the forgetting gate, input gate, and output gate at time t, and e is the internal hidden state, namely, the transformed new information. LSTM learns the parameters of these components during training. Ct is the updated historical information at time t, and ht is the output of the hidden layer at time t.

Firstly, the input xt at time t and the output ht−1 of the hidden layer are copied four times, and different weights are randomly initialized for each copy, so as to calculate the forgetting gate f, input gate i, and output gate o, as well as the internal hidden state e. Their calculation methods are shown in formulas (1)–(4), where W is the parameter matrix from the input layer to the hidden layer, U is the self-recurrent parameter matrix from the hidden layer to the hidden layer, r is the bias parameter, tanh is the hyperbolic tangent function, and σ is the sigmoid function, which keeps the output of the three gates between 0 and 1:

$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + r_f\right) \tag{1}$$

$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + r_i\right) \tag{2}$$

$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + r_o\right) \tag{3}$$

$$e_t = \tanh\left(W_e x_t + U_e h_{t-1} + r_e\right) \tag{4}$$

Secondly, the forgetting gate f and input gate i control how much of the historical information Ct−1 is forgotten and how much of the new information e is saved, which updates the internal memory cell state Ct. The calculation method is as follows, where ⊙ denotes element-wise multiplication:

$$C_t = f_t \odot C_{t-1} + i_t \odot e_t \tag{5}$$

Finally, the output gate o controls how much of the information Ct in the internal memory unit is output to the hidden state ht. Its calculation method is as follows, with ⊙ again the element-wise product:

$$h_t = o_t \odot \tanh\left(C_t\right) \tag{6}$$
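The gate computations and state updates described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it assumes the standard LSTM form in which the three gates use the sigmoid function and the internal hidden state e uses tanh, and the helper `random_params`, the hidden size of 4, and the input values are made up for the example rather than taken from the paper.

```python
import math
import random


def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, U, r, x, h_prev):
    """Compute W x + U h_prev + r for one gate, one output row at a time."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h_prev))
        + r_j
        for W_row, U_row, r_j in zip(W, U, r)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step: gates f, i, o, candidate e, then C_t and h_t."""
    f = [sigmoid(z) for z in affine(*params["f"], x, h_prev)]    # forgetting gate
    i = [sigmoid(z) for z in affine(*params["i"], x, h_prev)]    # input gate
    o = [sigmoid(z) for z in affine(*params["o"], x, h_prev)]    # output gate
    e = [math.tanh(z) for z in affine(*params["e"], x, h_prev)]  # new information
    # C_t = f * C_{t-1} + i * e, element-wise
    c = [fj * cj + ij * ej for fj, cj, ij, ej in zip(f, c_prev, i, e)]
    # h_t = o * tanh(C_t), element-wise
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def random_params(input_dim, hidden_dim, seed=0):
    """Hypothetical helper: random (W, U, r) triples for the four branches."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        name: (mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim), [0.0] * hidden_dim)
        for name in ("f", "i", "o", "e")
    }


params = random_params(input_dim=1, hidden_dim=4)
h, c = [0.0] * 4, [0.0] * 4
for speed in [0.31, 0.35, 0.42, 0.40]:  # made-up normalized wind speeds
    h, c = lstm_step([speed], h, c, params)
```

Because the output gate lies in (0, 1) and tanh lies in (−1, 1), every component of the hidden state stays strictly inside (−1, 1), which is one reason the inputs are normalized before training.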

2.2. Wind Speed Prediction Model Based on LSTM

The process of using LSTM to predict wind speed data is shown in Figure 2. It mainly includes wind speed data preparation and preprocessing (data resampling and null filling), data normalization, data division, prediction model establishment and evaluation, and data prediction.

First, the wind speed data is modeled as an N × T nonnegative matrix X, where N represents the number of wind speed monitoring points, T represents the number of time slots sampled, and each column in the wind speed data matrix represents the wind speed values at the different points in a specific time interval.

Wind speed prediction obtains the predicted value at a future time from the historical time series. X(i, j) denotes the N × T wind speed matrix, and xn,t denotes the wind speed value in row n and column t. Wind speed prediction is defined as using a series of historical wind speed data (xn,t−1, xn,t−2, xn,t−3, …, xn,t−k), where k is the number of past time slots used (the time step), to predict the wind speed at time t. In the wind speed prediction model based on LSTM (Figure 2), to predict the wind speed at a certain point in time slot t, the input of the model is (xn,t−1, xn,t−2, xn,t−3, …, xn,t−k), and the output is the predicted value of the wind speed at that point in time slot t. The steps are as follows:

(1) Wind speed data preparation and preprocessing: to meet the time-frequency (seconds, minutes, hours, days, etc.) requirements of wind speed data prediction, it is necessary to resample the original data, that is, to convert the time series from one frequency to another through downsampling or upsampling. In addition, if there are null values in the resampled data sequence, they need to be filled. Here, a machine learning method, K-Nearest Neighbours (KNN), is used to fill the null values of the wind speed data.

(2) Data normalization: the range standardization method is used to process the wind speed data so that the sample data values lie between 0 and 1. The calculation method of the range standardization method is as follows:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{7}$$

In formula (7), $x_{\max}$ represents the maximum value of the wind speed data and $x_{\min}$ represents the minimum value of the wind speed data.

(3) Data division: the preprocessed and normalized wind speed data is divided into a training set and a test set according to a simple cross-validation method. While keeping the order of the wind speed data sequence unchanged, fivefold cross-validation is used to divide the data into the training set and the test set, which are used for the training and prediction of the LSTM wind speed prediction model, respectively.

(4) Construct an LSTM wind speed prediction model: define an LSTM neural network and set the parameters, including the time step, number of network layers, number of neurons in each layer, dropout, activation function, return value type and number, hidden layer dimension, learning rate, batch size, and number of iterations.

(5) Compile the network: set the optimizer, error measurement indicators, and training record parameters and compile the constructed LSTM wind speed prediction model.

(6) Evaluate the network: the training set data is fed into the model for training, the error of the established prediction model is evaluated, and the parameter settings of the model are fine-tuned according to the result to obtain a better prediction effect.

(7) Forecast and evaluation: use the optimized wind speed prediction model to make predictions, compare the prediction results with the real data, and calculate the error.
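The normalization, windowing, and order-preserving data division steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the sample values, and the window length of 2 are assumptions chosen for the example, and the resampling and KNN null-filling steps are omitted.

```python
def min_max_normalize(values):
    """Range standardization: map the values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no range information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(series, look_back):
    """Build (input window, target) pairs: look_back past values -> next value."""
    inputs, targets = [], []
    for t in range(look_back, len(series)):
        inputs.append(series[t - look_back:t])
        targets.append(series[t])
    return inputs, targets


def chronological_split(inputs, targets, train_fraction=0.8):
    """Split without shuffling, so the test set lies after the training set."""
    cut = int(len(inputs) * train_fraction)
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])


# Made-up hourly wind speeds in m/s, for illustration only.
raw = [3.2, 4.1, 5.0, 4.6, 6.3, 7.1, 6.8, 5.5, 4.9, 5.2, 6.0, 6.6]
scaled = min_max_normalize(raw)
X, y = make_windows(scaled, look_back=2)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Keeping the split chronological matters for time series: shuffling before the split would let the model see values from the future of the test period during training.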

3. The LSTM Wind Speed Prediction Model Optimized by the Firework Algorithm

3.1. The Firework Algorithm

The fireworks algorithm (FWA) [42–44] is a swarm intelligence optimization algorithm with simple rules and fast convergence. It searches the solution space mainly through the sparks generated by firework explosions; the fireworks and the sparks from the explosions form the whole population. In this algorithm, a firework is regarded as a feasible solution in the solution space of the optimization problem, and the explosion that generates sparks is the way of searching its neighbourhood. FWA includes the following steps: initialization, calculating the fitness, generating sparks by firework explosions, and calculating the optimal solution.

Firstly, FWA sets a series of initial parameter values, including the firework population size N, the explosion range control parameter $\hat{A}$, the maximum number of sparks m, the number of variant sparks $\hat{m}$, the parameters a and b that limit the number of sparks produced by an explosion, the smallest positive machine constant ε, which tends to zero, and the solution space boundaries Bu and Bi, where Bu is the upper boundary and Bi is the lower boundary. The firework algorithm mainly uses random initialization to generate N initial fireworks in the solution space.

Secondly, calculate the fitness value of each firework, and generate sparks based on the fitness value. The number of sparks in FWA is calculated as follows:

$$S_i = m \cdot \frac{Y_{\max} - f(x_i) + \varepsilon}{\sum_{j=1}^{N} \left(Y_{\max} - f(x_j)\right) + \varepsilon} \tag{8}$$

In formula (8), Si is the number of sparks produced by the ith firework, m is a constant that limits the total number of sparks produced, Ymax is the objective function value of the firework with the worst fitness in the current population, $f(x_i)$ is the fitness value of the firework xi, and ε is the smallest positive machine constant, which avoids division by zero.

The explosion amplitude in FWA is calculated as follows:

$$A_i = \hat{A} \cdot \frac{f(x_i) - Y_{\min} + \varepsilon}{\sum_{j=1}^{N} \left(f(x_j) - Y_{\min}\right) + \varepsilon} \tag{9}$$

In formula (9), Ai is the explosion amplitude of the ith firework, that is, the explosion radius, $\hat{A}$ is a constant that represents the maximum explosion amplitude, and Ymin is the fitness value of the firework with the best fitness in the current population.
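As a worked illustration of formulas (8) and (9), the sketch below computes spark counts and explosion amplitudes for a small population. It assumes the standard FWA form of both formulas and the usual rule of bounding the spark count between round(a·m) and round(b·m); the function names and the five fitness values are made up for the example.

```python
def spark_counts(fitness, m, a, b, eps=1e-12):
    """Sparks per firework: a lower (better) fitness yields more sparks."""
    y_max = max(fitness)  # worst fitness when minimizing
    denom = sum(y_max - f for f in fitness) + eps
    counts = []
    for f in fitness:
        s = m * (y_max - f + eps) / denom
        # Bound the count so no firework produces too few or too many sparks.
        if s < a * m:
            n = round(a * m)
        elif s > b * m:
            n = round(b * m)
        else:
            n = round(s)
        counts.append(n)
    return counts


def explosion_amplitudes(fitness, a_hat, eps=1e-12):
    """Amplitude per firework: a lower (better) fitness yields a smaller radius."""
    y_min = min(fitness)  # best fitness when minimizing
    denom = sum(f - y_min for f in fitness) + eps
    return [a_hat * (f - y_min + eps) / denom for f in fitness]


# Made-up RMSE values for five fireworks; the second one is the best.
fitness = [0.9, 0.5, 0.7, 0.6, 0.8]
counts = spark_counts(fitness, m=20, a=0.04, b=0.8)
amps = explosion_amplitudes(fitness, a_hat=40.0)
```

The two formulas pull in opposite directions on purpose: good fireworks search densely in a small neighbourhood, while poor fireworks throw a few sparks far away to keep exploring.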

Thirdly, according to the actual firework attributes and the actual situation of the search problem, sparks are generated in the radiation space of the firework. To ensure the diversity of the population, the fireworks need to be appropriately mutated, such as Gaussian mutation.

The Gaussian mutation in FWA is calculated as follows:

$$x_i^k = x_i^k \cdot g \tag{10}$$

In formula (10), $x_i^k$ is the position of the ith individual on the kth dimension and g is a value drawn from the Gaussian distribution, where $g \sim N(1, 1)$.
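A Gaussian mutation spark can be sketched as below. This is an illustration under assumptions: it takes the coefficient g from a Gaussian distribution with mean 1 and standard deviation 1, mutates a random subset of dimensions, and maps out-of-range values back into the search box with a modulo rule that is common in FWA implementations; the function name and the bounds are hypothetical.

```python
import random


def gaussian_mutation(position, lower, upper, rng):
    """Scale randomly chosen dimensions of one firework by g ~ N(1, 1)."""
    mutated = list(position)
    n_dims = rng.randint(1, len(position))  # how many dimensions to mutate
    g = rng.gauss(1.0, 1.0)                 # one coefficient per mutation spark
    for k in rng.sample(range(len(position)), n_dims):
        mutated[k] *= g
        # Map out-of-range values back into [lower[k], upper[k]].
        if not lower[k] <= mutated[k] <= upper[k]:
            span = upper[k] - lower[k]
            mutated[k] = lower[k] + abs(mutated[k]) % span
    return mutated


rng = random.Random(1)
lower, upper = [1.0, 1.0], [128.0, 20.0]  # hypothetical search box
spark = gaussian_mutation([64.0, 10.0], lower, upper, rng)
```

Since g is centred on 1, most mutation sparks stay near the parent firework, while occasional large or negative draws move a spark to a very different region, which helps preserve population diversity.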

Finally, calculate the optimal solution of the population and decide whether the termination condition is met. If it satisfies the requirements, stop the search; else, continue iterating.

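In the entire population, the spark with the best fitness value is selected and retained as a next-generation firework, and the remaining fireworks are selected from the other sparks by roulette. The probability of each spark being selected is calculated as follows:

$$p(x_i) = \frac{R(x_i)}{\sum_{x_j \in K} R(x_j)}, \qquad R(x_i) = \sum_{x_j \in K} \lVert x_i - x_j \rVert \tag{11}$$

In formula (11), $p(x_i)$ is the selection probability of the ith spark, $R(x_i)$ is the sum of the distances between $x_i$ and the candidate fireworks other than $x_i$, and K is the set of all candidates.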
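The distance-based roulette selection can be sketched as follows. The sketch assumes Euclidean distance and sampling with replacement, neither of which is stated in the text; the function names and the four candidate points are made up for the example.

```python
import math
import random


def selection_probabilities(candidates):
    """p(x_i) = R(x_i) / sum_j R(x_j), where R sums distances to all others."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    r = [sum(dist(x, other) for other in candidates) for x in candidates]
    total = sum(r)
    if total == 0:
        # All candidates coincide; fall back to a uniform distribution.
        return [1.0 / len(candidates)] * len(candidates)
    return [ri / total for ri in r]


def select_next_generation(candidates, fitness, n, rng):
    """Keep the best candidate, then fill n - 1 slots by distance roulette."""
    best = min(range(len(candidates)), key=lambda i: fitness[i])
    rest = [i for i in range(len(candidates)) if i != best]
    probs = selection_probabilities(candidates)
    # random.choices samples with replacement (an assumption of this sketch).
    chosen = rng.choices(rest, weights=[probs[i] for i in rest], k=n - 1)
    return [candidates[best]] + [candidates[i] for i in chosen]


candidates = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
fitness = [0.4, 0.2, 0.9, 0.5]  # made-up values; lower is better
p = selection_probabilities(candidates)
next_gen = select_next_generation(candidates, fitness, n=3, rng=random.Random(0))
```

Candidates that lie far from the rest of the population receive a higher probability, so the next generation is biased toward sparsely explored regions rather than toward crowded ones.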

Compared with particle swarm optimization (PSO) and the genetic algorithm (GA), the fireworks algorithm has faster convergence and higher solving accuracy and has been applied to many practical optimization problems, of which parameter optimization is an important aspect [45–47].

3.2. Hyperparameter Optimization of LSTM by the Firework Algorithm

The hyperparameter selection of the LSTM model has an important influence on the prediction accuracy of the model. The existing hyperparameter selection generally adopts the empirical method. The empirical method is arbitrary and blind in the choice of parameters without universality. Therefore, combining multiple hyperparameters into a multidimensional solution space and traversing the solution space to obtain the optimal parameter combination can reduce the randomness and blindness of parameter selection. The selection of multiple hyperparameters is often carried out in a larger solution space, and a better performance optimization algorithm is needed to quickly obtain the global optimal solution. Therefore, the firework algorithm with global optimization and fast convergence speed is adopted to optimize the LSTM model’s hyperparameters to improve the scientificity of model parameter selection and thus improve the prediction accuracy of the model.

Suppose that n hyperparameters of the LSTM wind speed prediction model need to be optimized, and each firework represents a set of hyperparameters in the solution space. Assuming that there are q hyperparameter combinations in the n-dimensional continuous search space, for the ith combination (i = 1, 2, …, q), the n-dimensional current position vector $x_i(k) = (x_{i1}, x_{i2}, \ldots, x_{in})$ represents the current value of the ith group of hyperparameters in the solution space, and the n-dimensional velocity vector $v_i(k) = (v_{i1}, v_{i2}, \ldots, v_{in})$ represents the search direction of that group of hyperparameters.

The goal of wind speed prediction is to make the predicted value close to the actual value, that is, to make the error between the predicted value and the actual value as small as possible, so the Root Mean Square Error (RMSE) on the training data of the wind speed prediction model is selected as the objective function. Let fitness = RMSE; then, the objective is to minimize RMSE. RMSE is calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2} \tag{12}$$

In formula (12), $\hat{y}$ is the predicted value, $\hat{y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n)$; y is the true value, $y = (y_1, y_2, \ldots, y_n)$; and n is the number of samples.
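As a small worked example of the objective function, RMSE can be computed as below. The function name and the two-sample inputs are chosen for illustration only.

```python
import math


def rmse(y_pred, y_true):
    """Root mean square error between predictions and true values."""
    if len(y_pred) != len(y_true) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
    return math.sqrt(sum(squared) / len(squared))


# Each prediction is off by exactly 1 m/s, so the RMSE is 1.0.
example_rmse = rmse([2.0, 4.0], [1.0, 5.0])
```

Because the errors are squared before averaging, RMSE penalizes a few large deviations more heavily than many small ones, which suits wind speed series with sudden changes.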

According to the firework algorithm, two important hyperparameters of the LSTM wind speed prediction model are optimized: the time step and the number of neurons in each layer. Two LSTM models, single-layer and double-layer LSTM, are used as the research objects to optimize the hyperparameters. Use node to represent the number of neurons and look_back to represent the time step. For a single-layer LSTM model, fitness = RMSE (node, look_back); for a two-layer LSTM model, fitness = RMSE (node 1, node 2, look_back).
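Because fireworks move in a continuous space while node and look_back are integers, each firework position has to be decoded before the model is trained. The sketch below shows one way to do this; the bounds, the rounding rule, and the stand-in scoring function `toy_score` are assumptions for illustration, and in practice the scoring function would train the LSTM and return its RMSE.

```python
def decode(position, bounds):
    """Round a continuous position to integers and clamp them to the bounds."""
    return tuple(
        int(min(max(round(p), lo), hi)) for p, (lo, hi) in zip(position, bounds)
    )


def make_fitness(train_and_score, bounds):
    """Wrap a scoring function so it can be called on a firework position."""
    def fitness(position):
        return train_and_score(*decode(position, bounds))
    return fitness


def toy_score(node, look_back):
    """Stand-in for 'train the LSTM and return RMSE'; the values are made up."""
    return abs(node - 32) / 100 + abs(look_back - 8) / 10


# Hypothetical search ranges for a single-layer model: (node, look_back).
bounds = [(1, 128), (1, 20)]
fitness = make_fitness(toy_score, bounds)
decoded = decode([63.6, 0.2], bounds)
```

For the two-layer model, the same wrapper works with three bounds, one each for node 1, node 2, and look_back, provided the scoring function accepts three arguments.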

According to the FWA process (as shown in Figure 3), the hyperparameter optimization of the LSTM wind speed prediction model mainly includes six steps:

Step 1: initialize the parameters of FWA: set the initial firework population size, namely, the number of hyperparameter combinations N, the explosion range control parameter $\hat{A}$, the maximum number of sparks m, the number of variant sparks $\hat{m}$, the parameters a and b that limit the number of sparks produced by an explosion, the small positive constant ε that tends to zero, and the solution space boundaries Bu and Bi, where Bu is the upper boundary and Bi is the lower boundary. Using random initialization, N initial fireworks are generated in the solution space. Set the maximum number of iterations iter_max and the preset error Pre_error.

Step 2: calculate the fitness of each firework; that is, calculate the value of the objective function for each group of hyperparameters. According to the fitness value, the explosion operator, the number of sparks, the explosion amplitude, and the offset value are calculated. Each firework explosion generates sparks of the hyperparameter group, and sparks beyond the boundary are mapped back into the solution space according to the mapping rule. At the same time, a certain number of Gaussian mutation sparks of the hyperparameter group are generated by Gaussian mutation.

Step 3: set the optimal objective function value Fi of each group of hyperparameters. For the ith group of hyperparameters, compare its current objective function value current_fitness with Fi. If current_fitness is less than Fi, use it as the best objective function value of the ith group of hyperparameters; that is, let Fi = current_fitness.

Step 4: set the global optimal value Fg. For the ith group of hyperparameters, compare Fi with Fg. If Fi is less than Fg, use Fi as the optimal value of the current population; that is, let Fg = Fi.

Step 5: update the explosion range and spark number of each group of hyperparameters according to formulas (8) and (9).

Step 6: check the termination conditions. If the set conditions (preset error or maximum number of iterations) are not reached, return to Step 2 to continue execution.

3.3. Optimized LSTM Wind Speed Prediction Algorithm Based on the Firework Algorithm

According to the wind speed prediction steps based on LSTM and the process of the FWA hyperparameter optimization, the call relationship between them can be obtained as in Figure 4.

From this, the wind speed prediction algorithm based on LSTM optimized by FWA, namely, the FWA-LSTM wind speed prediction algorithm, is derived. The pseudocode of the algorithm is shown in Algorithm 1.

(1)Wind speed data preparation and preprocessing;
(2)Normalize the raw data;
(3)Divide training set and test set;
(4)Construct LSTM wind speed prediction model. Set partial parameters and fix the number n of optimized parameter;
(5)FWA parameter initialization (fireworks population size P, solving space dimension d, maximum number of iterations iter_max, explosion amplitude range control parameter $\hat{A}$, the maximum number of sparks m, the number of variation sparks $\hat{m}$, the parameters a and b that limit the number of sparks produced by the explosion, the small positive constant ε that tends to zero, the solution space boundaries Bu and Bi);
(6)Initialize the values of n-dimensional parameter combinations of P groups randomly in the solution space;
(7)Initialize the global optimal parameter combination gbest_parameters, the partial optimal parameter combination pbest_parameters, and the best fitness function value Fg;
(8)While the end condition is false:
(9) Apply the n-dimensional parameter combinations of P groups, respectively, to the LSTM wind speed prediction model for training, and calculate the current fitness function value;
(10) Get the current best fitness value Fi and the corresponding parameter combination pbest_parameters;
(11) if Fi < Fg
(12)  Fg = Fi;
(13)  gbest_parameters = pbest_parameters;
(14) end if
(15) for each parameter combination
(16)  Calculate the search direction and position of the new parameter combination according to formulas (8) and (9);
(17)  Fix the updated parameter in the selected values;
(18) end for
(19) The number of iterations+1;
(20)end while
(21)Return the gbest_parameters;
(22)gbest_parameters is introduced into LSTM wind speed prediction model to predict test data and calculate prediction error.

Algorithm 1 firstly preprocesses the wind speed data, normalizes and divides the data to obtain a training set and a test set, then establishes the LSTM wind speed prediction model, and uses FWA to optimize the LSTM hyperparameters to obtain the optimal parameter combination; finally, the parameters are substituted into the model to complete the prediction and error calculation of wind speed data.
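The optimization loop of Algorithm 1 can be sketched end to end in plain Python. This is a simplified illustration under several assumptions, not the authors' implementation: a made-up surrogate function stands in for training the LSTM and returning RMSE, sparks are bounded between a·m and b·m, out-of-range sparks are mapped back with a modulo rule, and the non-best survivors are drawn uniformly at random instead of by the distance-based roulette. All names, bounds, and the surrogate's optimum are hypothetical.

```python
import random


def fwa_minimize(fitness, bounds, n_fireworks=5, m=20, a=0.04, b=0.8,
                 a_hat=40.0, n_gauss=5, max_iter=30, seed=0):
    """Minimal fireworks algorithm sketch; returns (best_position, best_fitness)."""
    rng = random.Random(seed)
    eps = 1e-12

    def clip(x):
        # Map out-of-range coordinates back into the search box.
        return [v if lo <= v <= hi else lo + abs(v) % (hi - lo)
                for v, (lo, hi) in zip(x, bounds)]

    fireworks = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_fireworks)]
    best_x, best_f = None, float("inf")
    for _ in range(max_iter):
        scores = [fitness(x) for x in fireworks]
        y_max, y_min = max(scores), min(scores)
        s_den = sum(y_max - f for f in scores) + eps
        a_den = sum(f - y_min for f in scores) + eps
        sparks = []
        for x, f in zip(fireworks, scores):
            s = m * (y_max - f + eps) / s_den          # spark count
            n_sparks = round(min(max(s, a * m), b * m))
            amp = a_hat * (f - y_min + eps) / a_den    # explosion amplitude
            for _ in range(n_sparks):
                spark = list(x)
                shift = amp * rng.uniform(-1.0, 1.0)
                for k in rng.sample(range(len(x)), rng.randint(1, len(x))):
                    spark[k] += shift
                sparks.append(clip(spark))
        for _ in range(n_gauss):                       # Gaussian mutation sparks
            spark = list(rng.choice(fireworks))
            g = rng.gauss(1.0, 1.0)
            for k in rng.sample(range(len(spark)), rng.randint(1, len(spark))):
                spark[k] *= g
            sparks.append(clip(spark))
        candidates = fireworks + sparks
        cand_scores = scores + [fitness(x) for x in sparks]
        order = sorted(range(len(candidates)), key=cand_scores.__getitem__)
        if cand_scores[order[0]] < best_f:
            best_x, best_f = candidates[order[0]], cand_scores[order[0]]
        # Keep the best candidate; draw the others at random (simplified).
        rest = rng.sample(order[1:], n_fireworks - 1)
        fireworks = [candidates[order[0]]] + [candidates[i] for i in rest]
    return best_x, best_f


def surrogate_rmse(position):
    """Made-up stand-in for 'train the LSTM with (node, look_back), return RMSE'."""
    node, look_back = round(position[0]), round(position[1])
    return 0.05 + 0.002 * abs(node - 32) + 0.01 * abs(look_back - 8)


search_box = [(1.0, 128.0), (1.0, 20.0)]  # hypothetical (node, look_back) ranges
best_x, best_f = fwa_minimize(surrogate_rmse, search_box)
```

In a real run, each fitness evaluation trains a network, so the number of fireworks, sparks, and iterations directly determines the training budget; caching the fitness of already evaluated integer combinations is a natural refinement.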

4. Experimental Evaluations

4.1. Experimental Environment Configuration and Parameter Settings

This study uses measured wind speed data from a wind farm for 2015, from January 1, 2015, to December 31, 2015, sampled at 1-hour intervals and containing 8759 data points. Some data segments are selected for model analysis.

For the prediction results of the network model, three error analysis indicators are used to verify the prediction accuracy, namely, Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). The calculation methods of MAE and MAPE are shown in equations (13) and (14):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left|\hat{y}_i - y_i\right| \tag{13}$$

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left|\frac{\hat{y}_i - y_i}{y_i}\right| \tag{14}$$

It can be seen from formula (12) that the smaller the value of RMSE, the smaller the average error between the prediction result and the actual data, the higher the prediction accuracy of the model, and the better its prediction performance. Similarly, it can be seen from formulas (13) and (14) that the closer the MAE and MAPE values are to 0, the better the prediction effect of the model. On the contrary, the larger the values, the greater the error and the worse the prediction effect of the model.
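The two remaining indicators can be computed as in the sketch below. It assumes that MAPE is reported as a percentage and that true values of zero are rejected, since the ratio is undefined there; the function names and sample values are made up for the example.

```python
def mae(y_pred, y_true):
    """Mean absolute error between predictions and true values."""
    if len(y_pred) != len(y_true) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)


def mape(y_pred, y_true):
    """Mean absolute percentage error, in percent."""
    if len(y_pred) != len(y_true) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        # A true wind speed of zero makes the relative error undefined.
        raise ValueError("MAPE is undefined when a true value is zero")
    ratios = [abs((p - t) / t) for p, t in zip(y_pred, y_true)]
    return 100.0 * sum(ratios) / len(ratios)


y_true = [1.0, 5.0]  # made-up true wind speeds in m/s
y_pred = [2.0, 4.0]  # made-up predictions
example_mae = mae(y_pred, y_true)
example_mape = mape(y_pred, y_true)
```

In this example both predictions miss by 1 m/s, so MAE treats them equally, whereas MAPE weights the miss at the low wind speed (100%) far more than the miss at the high one (20%).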

To fully verify the prediction effect of the FWA-optimized LSTM wind speed prediction model on the wind speed data, the prediction results of the optimized model were compared with those of the typical LSTM, other neural network models, and regression prediction methods. Following the firework algorithm settings in [45, 46], the initial number of fireworks is N = 5, the size of the fireworks population is P = 50, the preset maximum explosion amplitude is $\hat{A}$ = 40, the maximum number of sparks is m = 20, the number of mutant sparks is $\hat{m}$ = 5, the constants are a = 0.04 and b = 0.8, and the maximum number of generations is set to 100.

4.2. Wind Speed Prediction Results Based on FWA-LSTM
4.2.1. Data Processing

(1) Data resampling: Figure 5 shows the wind speed data after null filling by the KNN algorithm.

(2) Data normalization: the range standardization method (equation (7)) is used to process the wind speed data so that the sample data values lie between 0 and 1. The processing result is shown in Figure 6.

(3) Data division: the normalized data is divided into a training set and a test set according to the simple cross-validation method. The first 80% of the data is used as training data for the LSTM network model. The remaining 20% of the data is used as prediction data to verify the efficiency of the model.

4.2.2. Wind Speed Prediction Based on Basic LSTM
(1) Define the network: this prediction uses a four-layer LSTM model with one input layer, two hidden layers, and one output layer. The layers are connected as follows: the first LSTM layer receives input with a time step of 1 and data_dim = 3 and has 64 neurons; the second layer takes the output of the first layer as its input for training, has the same number of neurons as the first layer, and passes its output to the next layer; the output layer takes the output of the second hidden layer as its input and is connected to a fully connected (Dense) layer. The one-dimensional vector of length 200 output by the fully connected layer is the final output, which represents the predicted values of 200 future data points. To prevent the LSTM from overfitting, a dropout layer is added between the first layer and the hidden layer for regularization. After repeated testing, the accuracy on the training set was found to be highest when dropout = 0.25.

(2) Compile the network: the LSTM network is compiled with the adaptive moment estimation (Adam) algorithm as the optimizer and the mean square error loss function as the objective function.

(3) Fit the network: the LSTM network is trained on 1600 training samples, and 200 validation samples are used for validation. The number of iterations is epochs = 50, look_back = {1, 5, 10}, and batch_size = 128.

(4) Network evaluation: when look_back takes 1, 5, and 10 and the number of hidden layers (LN) is 1 and 2, the loss of the model during training is shown in Figure 7. According to Figure 7, the loss of each training run shows a downward trend, indicating that the model is effective.

(5) Wind speed forecasting: 200 test samples are predicted, and the results are shown in Figure 8. TestOriginal_result represents the original data; testPredict_result_101, testPredict_result_105, and testpredict_110 represent the prediction results when LN = 1 and look_back takes 1, 5, and 10, respectively. TestPredict_result_201, testPredict_result_205, and testpredict_210 represent the prediction results when LN = 2 and look_back takes 1, 5, and 10, respectively.

(6) Error of the prediction model: LSTM models corresponding to different parameter combinations were used for wind speed prediction, and the errors (RMSE, MAE, MAPE) of each model on the validation set were compared (see Table 1).

It can be concluded that, for the wind speed data, the prediction effect of the parameter combination set by the empirical method is unstable and cannot achieve the optimal prediction performance. Therefore, the fireworks algorithm (FWA) is adopted to optimize the model; that is, an intelligent algorithm is used to efficiently obtain the parameter combination with the optimal prediction effect.

4.2.3. Hyperparameter Optimization according to FWA

To show the process of the optimal parameter value of the LSTM wind speed model determined by the firework algorithm, Figure 9 shows the changes in the number of nodes and the time step during the optimization process of the FWA-LSTM12 model.

Figure 9 shows that, for the prediction of wind speed data, the fitness value becomes stable from the 9th iteration onward; that is, FWA converges readily when optimizing the LSTM-based wind speed prediction model.

The changes in the number of nodes and time steps in the optimization process of the FWA-LSTM23 model are shown in Figure 10.

It can be seen from Figure 10 that the optimal parameters of the FWA-LSTM model are set to node 1 = 5, node 2 = 2, and look_back = 5. Therefore, in the prediction of wind speed data used in this article, the best configuration of the LSTM model is that the number of neurons in the first layer is set to 5, the number of neurons in the second layer is set to 2, and the time step is set to 5.

4.3. Result Analysis

To evaluate the prediction performance of the LSTM model after parameter optimization by FWA, wind speed data samples at 200 time points are used for verification. First, the basic LSTM and PSO-LSTM (LSTM optimized by particle swarm optimization) are tested for comparison with FWA-LSTM. Figure 11 shows the prediction results of these three methods.

It can be seen from Figure 11 that the prediction effects of FWA-LSTM and PSO-LSTM models are similar and both better than the basic LSTM method. To compare the performance of the three methods more clearly, their prediction performance evaluation index values are calculated and shown in Table 2.

It can be seen from Table 2 that compared with the PSO-LSTM and the basic LSTM, the FWA-LSTM has slightly smaller prediction errors of RMSE and MAPE, while the MAE is close to PSO-LSTM. On the whole, FWA-LSTM is considered to be superior to the PSO-LSTM and the basic LSTM.

To further verify the prediction effect of the improved FWA-LSTM prediction model, it is compared with other neural network prediction methods, such as the Gated Recurrent Unit (GRU), Simple Recurrent Neural Network (SimpleRNN), and Bidirectional Recurrent Neural Network (BiRNN), and with other predictive models such as SVR and ARIMA. Again, wind speed data samples at 200 time points are used for verification; the prediction performance evaluation index values of each method are shown in Table 3.

It can be seen from Table 3 that the RMSE, MAE, and MAPE of the FWA-LSTM model are all lower than those of the other tested prediction methods, namely, GRU, SimpleRNN, BiRNN, SVR, and ARIMA, so the FWA-LSTM prediction model has a better prediction effect in wind speed prediction. This suggests that FWA-LSTM is more suitable for dealing with the real-time sudden changes of wind speed data.

In summary, the proposed FWA-LSTM method has a better prediction effect and higher reliability for the future prediction of wind speed.

5. Conclusions

Wind speed prediction can be applied to wind energy optimization and has important reference significance for wind power planning and the stable operation of the power system. This paper first established a wind speed prediction model based on the nonparametric model LSTM neural network, optimized the hyperparameters of the established LSTM prediction model with the firework algorithm, and reduced the prediction Root Mean Square Error compared to the empirical method of obtaining parameters, and the FWA-LSTM is better than the double-layer LSTM in the wind speed data prediction.

The improved model FWA-LSTM is applied to wind speed prediction and compared with other neural network prediction methods and regression methods. The experimental results show that, compared with the other prediction models and the traditional LSTM model, the FWA-LSTM method reduces the prediction error and improves the accuracy of wind speed prediction. Future work will continue to combine a variety of prediction methods to further improve the accuracy of wind speed prediction.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant no. 62072363.