Abstract

This paper presents an efficient implementation of a new hybrid approach to forecasting short-term PV power production for four PV plants in Algeria. The developed model combines time-varying filter-empirical mode decomposition (TVF-EMD) with an extreme learning machine (ELM) as the core regressor. The TVF-EMD technique handles the fluctuation of PV power data by splitting it into a series of more stable subseries. The resulting features, the intrinsic mode functions (IMFs), are used to train and improve the ELM forecasting model, and the tuned ELM model is then used to evaluate prediction efficiency. The suggested TVF-EMD-ELM model is assessed and verified at several Algerian locations with varying climate conditions. In all examined regions, the TVF-EMD-ELM model yields a normalized root mean square error (nRMSE) of less than 4%.

1. Introduction

Countries around the world aim to build a sustainable and environmentally friendly economy by investing in green and renewable energies, notably solar energy.

Installed PV capacity is expected to reach 4,815 GW by 2040, according to the IEA 2019 Sustainable Development Scenario [1]. In this regard, Algeria, like other countries, has begun investing in photovoltaic energy in order to diversify its energy sources and reduce its reliance on fossil energy. The Algerian government has set a target of 22,000 megawatts of electricity production from renewable sources over the period 2011–2030 [2]. To achieve this goal, the installation of photovoltaic stations was entrusted to the national company Sonelgaz, which has experience in renewable energies and has installed 23 grid-connected photovoltaic stations and wind farms throughout the country. However, the output of most grid-connected solar plants is affected by several factors, including the photovoltaic panels, inverters, meteorological conditions, and dust accumulation on the panels. Therefore, it becomes necessary to analyze and forecast the PV generation capacity [3, 4].

Decomposition algorithms are statistical methods for analyzing time series data, such as solar photovoltaic (PV) power generation data. Their main benefit in PV power forecasting is that they identify and separate different components of the time series, such as trend, seasonality, and noise. This makes it easier to understand the underlying patterns in the data and to forecast PV power generation more accurately. Decomposition algorithms can also be used to remove the effects of these components, which improves forecast accuracy by reducing the amount of noise in the data. In summary, decomposition algorithms can improve the accuracy and reliability of PV power forecasts, which is useful for applications such as grid management and renewable energy planning. Many studies have investigated the use of decomposition algorithms for PV power forecasting. Some of the most commonly used techniques include seasonal-trend decomposition using LOESS (STL), moving average (MA), autoregressive integrated moving average (ARIMA), empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), a newer variant of the basic EMD [5], and the iterative filtering (IF) decomposition method. Das et al. [6] reviewed the usage of various adaptive decomposition algorithms for time series analysis. In their work, they described the computational stages of several adaptive decomposition strategies in detail, which can be very useful for researchers working on time series forecasting.

In the literature, there are many methods for predicting PV production [7, 8]. These methods can be grouped into three main families: statistical methods, physical methods, and hybrid methods [9]. They provide either irradiation forecasts or direct production forecasts. The choice of a forecasting method is guided by several parameters: the quantity to be forecast, the forecast horizon envisaged, and the type of data available. Various sources of data can be used for PV production forecasting, namely, production measurements and meteorological variables such as solar irradiation, weather forecasts, and camera or satellite images. An interesting approach is to group the forecast models by increasing horizon, from a few minutes to several days. Intra-hour and very short-term forecasts, which cover horizons ranging from a few minutes to a few hours, are essential for variability treatment, production monitoring, load adjustment, and storage management.

The medium-term forecast is used in the context of energy management and trading. Long-term forecasting allows for better planning and optimization of resources. The literature contains comparisons of forecasting methods for short and very short-term horizons and detailed analyses of these methods according to the type of input data. In this study, we group the hybrid decomposition models for PV forecasting into four classes according to the adopted decomposition algorithm.

The first class is based on EMD. Reference [10] proposed a forecasting method based on a hybrid of empirical mode decomposition (EMD) and an extreme learning machine (ELM). Reference [11] proposed a combined EMD-CNN forecasting method in which voltage time series data are decomposed by EMD. Reference [12] contributed to short-term PV power forecasting with an approach called EMD-SCA-ELM, in which the ELM parameters are optimized by the sine cosine algorithm (SCA), EMD is used for signal filtering, and training and prediction are based on a single-hidden-layer feed-forward network (SLFN). The EMD-BPNN method proposed in [13] was evaluated on a PV power dataset collected from a 100 kW rooftop grid-connected solar plant.

The second class is based on variational mode decomposition (VMD). Reference [14] presented a hybrid method combining VMD and a deep CNN with multiple input factors, which improves the accuracy of short-term PV power predictions. Reference [15] applied VMD to decompose PV power into different fluctuating components; a deep belief network and an autoregressive moving average model were then used to predict these components. However, VMD has the disadvantage that the mode number and the penalty factor must be set by experience. Reference [16] proposed a model combining VMD, maximum relevance minimum redundancy (mRMR), and a deep belief network (DBN) to predict photovoltaic output, which effectively improved the prediction accuracy. Reference [17] used VMD to decompose the historical PV power and then combined it with an LSTM optimized by the improved particle swarm optimization (IPSO) algorithm. However, the residual of the VMD, which is also important to the prediction results, was not predicted or analyzed.

The third class is based on wavelet decomposition (WD). Reference [18] forecast the power output of a photovoltaic system located in Puglia, southeast Italy, at different forecast horizons, using historical power output data and hybrid statistical models based on least squares support vector machines (LS-SVM) with WD. Reference [19] proposed an improved deep learning model to increase the accuracy of day-ahead solar irradiance prediction; the DWT-CNN-LSTM model is established separately for four general weather types because of the strong dependence of solar irradiance on the meteorological state. Reference [20] presented a method combining an artificial neural network (ANN) and WD for power prediction of a PV system, with solar irradiance and six other parameters chosen as inputs to the hybrid model. Reference [21] compared wavelet-ANFIS, ANFIS, and ANN based on various performance indices, including RMSE, nRMSE, MAE, MAPE, and standard deviation.

The fourth class is based on CEEMDAN. Reference [5] applied a hybrid deep learning model that combines two popular deep neural networks to extract spatiotemporal information from the solar radiation subseries produced by CEEMDAN. Such empirical mode decomposition variants divide noisy and fluctuating time series into several subseries called intrinsic mode functions. Reference [22] proposed a deep learning model for solar radiation prediction based on bidirectional long short-term memory (BiLSTM), the sine cosine algorithm (SCA), and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN). The prediction results show that the proposed methodology provides higher prediction accuracy than the stand-alone BiLSTM and SCA-BiLSTM models. Reference [23] proposed a gated recurrent unit (GRU) neural network prediction model based on CEEMDAN, which uses approximate entropy (AE) to rearrange the subsequences into low-frequency, mid-frequency, and high-frequency signals and then feeds them into the model for prediction. Reference [24] developed a hybrid interval prediction model for PV output power by combining fuzzy information granulation (FIG), an improved long short-term memory network (ILSTM), and an autoregressive integrated moving average (ARIMA) model. A summary of articles on PV power forecasting is reported in Table 1.

However, decomposition algorithms do have some limitations. One weakness is that they may not forecast PV power generation accurately when the underlying patterns in the data are complex or nonlinear. In addition, decomposition algorithms can be sensitive to the choice of parameters, and poorly chosen parameters can lead to poor forecasts. Finally, decomposition algorithms may fail to capture unexpected events or changes in the data, such as sudden changes in weather conditions or equipment failures, which can affect PV power generation. To address this gap in decomposition techniques for PV power forecasting, we suggest the use of a new decomposition technique called TVF-EMD, which stands for time-varying filter-empirical mode decomposition. Our approach combines time-varying filtering with empirical mode decomposition and couples it with the ELM model to decompose the PV power signal into its underlying components, allowing for more accurate forecasting of PV system output. Through the use of this decomposition technique, we aim to contribute to the field of PV power forecasting.

This paper is organized as follows. Section 2 describes the four studied PV plant systems. Section 3 presents the key elements of our proposed model. Section 4 describes the main components of our hybridization strategy. Section 5 outlines the model evaluation process. Results and discussion are presented in Section 6. Finally, in Section 7, we summarize the main findings of this work and suggest potential areas for future research.

2. Overview of the Four Solar Photovoltaic Plants

The study area included the areas of photovoltaic power plants in Algeria, and four solar plants were selected from among 22 photovoltaic plants connected to the grid in different climatic regions to validate the models [25, 26]. The first area is the Laghouat photovoltaic station, which is characterized by a semi-continental climate with geographical coordinates located at 33°48′10N 2°52′30E; the second region is the region of Ghardaia, which is characterized by a semi-desert climate with geographical coordinates 32°29′N 3°40′E. The third region is the Sidi Bel Abbes region which has a dry climate with geographic coordinates 35°11′38N 0°38′29W; the fourth area is the Djelfa region which has a cold climate with geographic coordinates 34°40′30N 3°15′30E [27, 28]. The geographical coordinates of the study sites are shown on the map of Algeria (see Figure 1).

The solar photovoltaic plants in Laghouat, Djelfa, and Sidi Bel Abbes were commissioned in 2016, whereas the pilot plant in Ghardaia was commissioned in 2014. They belong to the National Renewable Energy Program and are among 23 similar plants built across the highlands and the south of the country to produce 400 megawatts [29]. The locations of the studied PV plants are shown in Figure 1.

The modules used in these four solar power plants combine different technologies. The total capacity of the plants is 135.1 MW. Four different technologies with several power ratings were used in the Ghardaia solar power plant: thin-film cadmium telluride (Cd-Te), amorphous silicon (a-Si), polycrystalline silicon, and monocrystalline silicon. For the remaining three solar power plants, polycrystalline technology was used, with modules from different manufacturers (Table 2).

3. Methodology

3.1. Principle of TVF-EMD

EMD decomposes a given signal $x(t)$ into a limited number of single-component IMFs and a residual $r(t)$ with nonzero mean, namely,

$$x(t) = \sum_{i=1}^{N} \mathrm{imf}_i(t) + r(t),$$

where $\mathrm{imf}_i(t)$ is the $i$-th IMF and $N$ is the number of IMFs. To obtain each IMF, an iterative procedure called the sifting process is used. The sifting process of EMD is carried out in two main steps: (1) estimate the "local mean" and (2) recursively subtract the local mean from the input signal until the resulting signal becomes an IMF.
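The two sifting steps can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `local_extrema` and `sift_once` are hypothetical, and the envelopes are piecewise linear (via `np.interp`) as a simplifying assumption, whereas classical EMD interpolates the extrema with cubic splines.

```python
import numpy as np


def local_extrema(x):
    """Return indices of strict local maxima and minima of a 1-D array."""
    i = np.arange(1, len(x) - 1)
    maxima = i[(x[i] > x[i - 1]) & (x[i] > x[i + 1])]
    minima = i[(x[i] < x[i - 1]) & (x[i] < x[i + 1])]
    return maxima, minima


def sift_once(x):
    """One simplified sifting step: estimate the local mean and subtract it.

    Assumption: envelopes are piecewise linear; classical EMD uses
    cubic splines through the extrema instead.
    """
    x = np.asarray(x, dtype=float)
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        # Too few extrema: nothing to remove, x is treated as the residual.
        return x.copy(), np.zeros_like(x)
    t = np.arange(len(x))
    upper = np.interp(t, maxima, x[maxima])  # upper envelope
    lower = np.interp(t, minima, x[minima])  # lower envelope
    local_mean = 0.5 * (upper + lower)       # step (1): estimate the local mean
    return x - local_mean, local_mean        # step (2): subtract it


# Toy signal (not PV data): a fast oscillation riding on a slow one.
t = np.linspace(0.0, 1.0, 1001)
signal = np.sin(2 * np.pi * 20 * t) + 0.5 * np.sin(2 * np.pi * 2 * t)
candidate, local_mean = sift_once(signal)
```

In a full EMD, `sift_once` would be repeated on `candidate` until a stopping criterion is met, and the extracted IMF would then be subtracted from the signal before the next IMF is sought.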

To improve the effectiveness of the empirical mode decomposition (EMD) approach, the time-varying filter-empirical mode decomposition (TVF-EMD) method replaces monocomponents with local narrow-band signals, which have properties similar to those of IMFs but can produce a more pronounced Hilbert spectrum. Local narrow-band signals are defined by their instantaneous bandwidth: if the local instantaneous bandwidth of a signal is less than a certain threshold value, the signal is classified as a local narrow-band signal. The basic idea behind this approach is to determine the local cutoff frequency and then apply time-varying filtering. The sifting process of TVF-EMD is performed with a time-varying filter and is carried out in three main phases.

Phase 1. Estimate the local cutoff frequency.
The purpose of determining the local cutoff frequency is to handle the separation and intermittency issues. Using the signal $x(t)$ as an example, the following steps are performed [30]:
(i) Step 1. Find the timings of the maxima of $x(t)$, expressed as $u_i$, $i = 1, 2, 3, \ldots$.
(ii) Step 2. Find all intermittences, expressed as $e_j$, $j = 1, 2, 3, \ldots$, which satisfy

$$\frac{\max\left(\varphi'_{\mathrm{bis}}(u_i : u_{i+1})\right) - \min\left(\varphi'_{\mathrm{bis}}(u_i : u_{i+1})\right)}{\min\left(\varphi'_{\mathrm{bis}}(u_i : u_{i+1})\right)} > \rho,$$

where $\varphi'_{\mathrm{bis}}(t)$ stands for the bisecting frequency and $\rho$ is the preset threshold on the frequency change rate between two consecutive maxima. Subsequently, the timing $u_i$ is taken as an intermittence, namely, $e_j = u_i$.
(iii) Step 3. For each $e_j$, if it is located on a rising edge of $\varphi'_{\mathrm{bis}}(t)$, $\varphi'_{\mathrm{bis}}(u_i)$ is regarded as a floor. If it is located on a falling edge, $\varphi'_{\mathrm{bis}}(u_{i+1})$ is regarded as a floor. The remaining parts of $\varphi'_{\mathrm{bis}}(t)$ are regarded as peaks.
(iv) Step 4. Obtain the final local cutoff frequency by interpolating between the peaks.

Phase 2. Filter the input signal using a time-varying filter to obtain the local mean.
B-spline approximation is used to filter the signal $x(t)$, taking the extrema timings of $h(t)$ as knots. By this means, the filter cutoff frequency is in accordance with the local cutoff frequency estimated in Phase 1. The input signal $x(t)$ is then filtered using the constructed B-spline approximation filter, and the approximated result is denoted as $m(t)$.

Phase 3. Check whether the residual signal meets the stopping criterion.
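A narrow-band signal is defined by its instantaneous bandwidth. In this approach, a relative criterion is used, namely,

$$\theta(t) = \frac{B_{\mathrm{Loughlin}}(t)}{\varphi_{\mathrm{avg}}(t)},$$

where $B_{\mathrm{Loughlin}}(t)$ is the Loughlin instantaneous bandwidth and $\varphi_{\mathrm{avg}}(t)$ denotes the weighted average of the instantaneous frequency of the individual components.
For a given bandwidth threshold $\varepsilon$, the signal can be viewed as a local narrow-band signal if $\theta(t) \le \varepsilon$ [30].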

3.2. Extreme Learning Machines

Extreme learning machines are feed-forward neural networks with a single layer or multiple layers of hidden nodes, used for classification, regression, clustering, sparse approximation, compression, and feature learning. The hidden node parameters may be assigned at random and never updated, or they may be inherited from their predecessors and never changed. In most cases, the output weights are learned in a single step, resulting in a fast learning scheme [32, 33]. According to their inventors, these models can achieve good generalization performance and learn faster than networks trained with backpropagation. According to the research, these models can also outperform support vector machines in classification and regression applications [34].
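The single-step learning scheme can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: the class name `ELMRegressor`, the tanh activation, the number of hidden nodes, and the toy data are all assumptions. The hidden weights are drawn once at random, and only the output weights are computed, in one least-squares step.

```python
import numpy as np


class ELMRegressor:
    """Single-hidden-layer ELM: random hidden layer, least-squares output."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer output matrix H (assumed tanh activation).
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Hidden weights and biases are drawn once and never updated.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta


# Toy regression problem (not PV data): learn y = sin(x).
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X).ravel()
model = ELMRegressor(n_hidden=50, seed=0).fit(X, y)
train_rmse = float(np.sqrt(np.mean((model.predict(X) - y) ** 2)))
```

Because no iterative weight updates are involved, training reduces to one matrix pseudoinverse, which is the source of the fast learning noted above.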

4. The Hybrid Forecasting Model

Figure 2 depicts the fundamental structure of the proposed model. The essential stages in the construction of the combined TVF-EMD-ELM forecasting model are as follows:
(i) PV power data are collected and processed to generate training and testing samples. The training samples are used for hyperparameter tuning, while the rest are used for model assessment.
(ii) The TVF-EMD technique is employed to decompose the PV power data into K distinct frequency components. The nonstationary characteristics of the data can be addressed adequately using this technique.
(iii) The IMF sequences generated by the TVF-EMD algorithm are used as inputs to the forecasting model.
(iv) The forecasting quality on the test set is then evaluated using the fully trained ELM model for the four studied regions.
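The way decomposed subseries can be turned into regression samples is sketched below. It assumes that the IMFs are already available as a K × T array (the decomposition itself is not implemented here) and that the lagged values of every IMF are stacked into a single input vector; the function name `build_lagged_dataset` and this particular input layout are illustrative assumptions rather than details confirmed by the text.

```python
import numpy as np


def build_lagged_dataset(imfs, target, n_lags):
    """Stack the last n_lags values of each IMF to predict the next target.

    imfs   : array of shape (K, T), one row per subseries (assumed given)
    target : array of shape (T,), the PV power series
    Returns X of shape (T - n_lags, K * n_lags) and y of shape (T - n_lags,).
    """
    imfs = np.asarray(imfs, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = imfs.shape[1]
    rows = [imfs[:, t - n_lags:t].ravel() for t in range(n_lags, n_samples)]
    return np.array(rows), target[n_lags:]


# Toy example with K = 2 subseries whose sum is the target series.
imfs = np.array([[0.0, 1.0, 2.0, 3.0, 4.0],
                 [10.0, 11.0, 12.0, 13.0, 14.0]])
target = imfs.sum(axis=0)
X, y = build_lagged_dataset(imfs, target, n_lags=2)

# Chronological split: first half for training, the rest for testing.
half = len(y) // 2
X_train, y_train, X_test, y_test = X[:half], y[:half], X[half:], y[half:]
```

The split is chronological rather than shuffled so that the test samples always lie after the training samples in time.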

5. Evaluation Metrics

Several quality measures were employed to quantify the impact of the proposed combination technique, as defined in [35–37].
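As an illustration, commonly used forms of such measures can be computed as in the sketch below. The exact definitions used in the study are those of the cited references; here the nRMSE is assumed to be the RMSE divided by the mean measured value and expressed in percent, which is one of several normalizations in use, and the function name `forecast_metrics` is hypothetical.

```python
import numpy as np


def forecast_metrics(measured, forecast):
    """Return RMSE, nRMSE (%), MAE, and correlation coefficient r."""
    measured = np.asarray(measured, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    error = forecast - measured
    rmse = float(np.sqrt(np.mean(error ** 2)))
    mae = float(np.mean(np.abs(error)))
    # Assumption: normalization by the mean of the measured values.
    nrmse = 100.0 * rmse / float(np.mean(measured))
    r = float(np.corrcoef(measured, forecast)[0, 1])
    return {"RMSE": rmse, "nRMSE": nrmse, "MAE": mae, "r": r}


# Toy values (not PV data): every forecast is off by exactly one unit.
scores = forecast_metrics([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

In this toy case the constant offset leaves the correlation coefficient at 1 while the RMSE is nonzero, which shows why r alone is not a sufficient accuracy measure.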

The flowchart of the proposed method is shown in Figure 3.

6. Results and Discussion

Accurate short-term PV power forecasting is essential for ensuring the availability of the required grid capacity and storage. This section evaluates the effectiveness of the developed TVF-EMD-ELM approach for half-hour-ahead PV output power forecasting using PV power outputs measured at four different PV systems in Algeria. The suggested TVF-EMD-ELM approach is established for a maximum horizon of 30 minutes. In the initial phase, TVF-EMD is used to extract meaningful information and to manage the nonstationary characteristics of the PV power time series. This study split the original data into thirty IMFs (IMF1, IMF2, ..., IMF30). The resulting subseries appear to exhibit less nonstationary behavior than the original data. The developed TVF-EMD-ELM model is tested on four separate PV power datasets, with half of each dataset used for training and the rest used for model evaluation. In the current study, the PV power is the desired output of the proposed TVF-EMD-ELM model, and its inputs are the previously decomposed data with an optimally selected delay.

Several factors can affect the amount of power generated by a photovoltaic (PV) system, including the amount of solar irradiation, the temperature, and the angle at which the PV array is installed. In this study, we focused on the relationship between previously measured PV power and the desired output of the forecasting model. To do this, we used a trial-and-error approach to evaluate the contribution of various time lags and determine the optimal number of delays.

During the initial phase of our testing, we employed a stand-alone extreme learning machine (ELM) model to identify the most effective delay for our specific application. We evaluated the performance of the forecasting algorithm by analyzing the total PV power generation across four different datasets. The results of all experiments were analyzed using commonly used metrics. As shown in Tables 3–6, the impact of different delays of endogenous variables on the target output was found to be significant for all of the regions under study.

As demonstrated by the numerical results of our trial-and-error approach, each region had its own optimal lag for forecasting 30-minute PV power. In the Ghardaia region, using ten previous PV inputs was found to be the most suitable lag. For the Laghouat, Djelfa, and Sidi Bel Abbes PV plants, the optimal lags were ten, thirteen, and eleven, respectively. These differences in the selected lags for each region can be attributed to variations in climate conditions and PV plant capacity. The forecasting errors for different delays are depicted in Figures 4–6.
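A trial-and-error lag search of this kind can be sketched as follows. To keep the example short and self-contained, an ordinary least-squares linear model stands in for the ELM, the series is synthetic, and the candidate range of one to five lags is arbitrary; the function names `make_lagged` and `rmse_for_lag` are hypothetical.

```python
import numpy as np


def make_lagged(series, n_lags):
    """Inputs are the n_lags previous values; the output is the next value."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[t - n_lags:t] for t in range(n_lags, len(series))])
    return X, series[n_lags:]


def rmse_for_lag(series, n_lags):
    """Fit on the first half, return the RMSE on the second half."""
    X, y = make_lagged(series, n_lags)
    X = np.column_stack([np.ones(len(X)), X])  # intercept column
    half = len(y) // 2
    # Stand-in regressor (assumption): linear least squares, not an ELM.
    coef, *_ = np.linalg.lstsq(X[:half], y[:half], rcond=None)
    return float(np.sqrt(np.mean((X[half:] @ coef - y[half:]) ** 2)))


# Synthetic series (not PV data): an offset sinusoid, which is exactly
# predictable from its two previous values plus a constant.
series = 2.0 + np.sin(0.3 * np.arange(300))
errors = {lag: rmse_for_lag(series, lag) for lag in range(1, 6)}
best_lag = min(errors, key=errors.get)
```

For this synthetic series one lag is insufficient and any lag of two or more gives a near-zero test error; with measured PV data the error curve is less clear-cut, which is why the candidate lags are compared region by region.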

In the second part of our experiment, we used the selected endogenous PV variables to forecast PV power 30 minutes ahead using the proposed combination methodology. We compared the performance of this methodology, the TVF-EMD-ELM model, with that of the conventional ELM model for the four PV plants. The best results for each case are shown in bold in Table 6. We evaluated the performance of the forecasting algorithms on different types of days. The proposed TVF-EMD-ELM model demonstrated superior forecasting performance for 30-minute-ahead PV power across all of the studied PV plants in the database. As shown in Table 7, the TVF-EMD decomposition technique significantly improves the forecasting performance of the conventional ELM model. For the Ghardaia region, the nRMSE value was reduced by 17.72% from 21.8% to 3.64%. For the Laghouat, Djelfa, and Sidi Bel Abbes regions, the proposed integration scheme reduced the forecasting error in terms of nRMSE by 19.6%, 23.297%, and 25.796%, respectively. The correlation coefficient of the TVF-EMD-ELM model exceeded 99% in all regions, while that of the stand-alone ELM model ranged from 91.6% to 94.37%.

As can be seen from Figures 7–10, the dispersion between the measured and forecasted PV power is very large for the stand-alone model, whereas it is low for the TVF-EMD-ELM model in all studied regions. The lower the spread, the higher the accuracy and the smaller the forecasting errors. A comparison of the models in terms of statistical metrics shows that the conventional model cannot provide sufficient forecasting performance for PV plant systems. However, the decomposition technique considerably improves the forecasting ability of the stand-alone model.

7. Conclusion

In this paper, a novel integrated model based on the decomposition approach was introduced for forecasting PV power 30 minutes ahead. The historical PV power was divided into multiple IMF components, from high- to low-frequency bands, through the TVF-EMD algorithm, and the obtained IMF series were supplied to the ELM regression to build the TVF-EMD-ELM model for PV power forecasting. Based on the results, the suggested TVF-EMD-ELM model can estimate the intra-hour variation of PV power with high precision in different regions of Algeria. The performance of the proposed hybridization methodology is validated on four PV power plant systems. The developed forecasting model is easy to build, fast to converge, and uses only endogenous PV power data.

This paper focused primarily on assessing how the TVF-EMD decomposition method improves the PV power forecasting accuracy of the ELM model, without considering other meteorological or electrical parameters such as irradiation, temperature, and wind speed. These factors will be considered in future research to obtain more accurate predictions.

Nomenclature

AI:Artificial intelligence
ANN:Artificial neural network
ARIMA:Autoregressive integrated moving average
ELM:Extreme learning machine
EMD:Empirical mode decomposition
GB:Gradient boosting
IEA:International Energy Agency
IMF:Intrinsic mode function
RMSE:Root mean square error
KNN:K-nearest neighbor
LS-SVR:Least squares support vector regression
MABE:Mean absolute bias error
MAPE:Mean absolute percentage error
MQR:Multiple quantile regression
NMAE:Normalized mean absolute error
nRMSE:Normalized root mean square error
NWP:Numerical weather prediction
QRF:Quantile regression forest
r:Correlation coefficient
RF:Random forest
SARIMA:Seasonal autoregressive integrated moving average
SD:Seasonal decomposition
SKTM:Shariket Kahraba wa Taket Moutadjadida
SVM:Support vector machine
SVR:Support vector regression
NARX:Nonlinear autoregressive with exogenous inputs
TVF-EMD:Time-varying filter-empirical mode decomposition.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Reski Khelifi was responsible for figures, literature search, and state of the art. Mawloud Guermoui was responsible for programming in MATLAB software, manuscript preparation, methodology, and data curation. Abdelaziz Rabehi was responsible for manuscript preparation, conceptualization, methodology, and data curation. Ayoub Taallah and Abdelhalim Zoukel were responsible for review and editing. Sherif S. M. Ghoneim was responsible for conceptualization and methodology. Mohit Bajaj was responsible for conceptualization, methodology, and data curation. Kareem M. AboRas was responsible for review and editing and literature search. Ievgen Zaitsev was responsible for data curation, literature search, and state of the art.