Abstract
Accurate prediction of time series is difficult because of their nonlinear characteristics, yet it plays a significant role in practical problems. In this paper, a novel varying-coefficient hybrid model is proposed to predict nonlinear time series accurately. A set of fuzzy radial basis function (FRBF) neural networks is used to approximate the varying functional coefficients of the state-dependent autoregressive model with exogenous variables (SD-ARX). The resulting model is called the fuzzy radial basis function network-based autoregressive model with exogenous variables (FRBF-ARX); it combines the advantages of the FRBF in function approximation and of the SD-ARX model in describing nonlinear dynamics. A structured nonlinear parameter optimization method (SNPOM) and the modified multifold cross-validation criterion are then used to estimate the parameters of the proposed varying-coefficient FRBF-ARX model. The FRBF-ARX model is applied to predict the PM2.5 concentration and a simulated SISO nonlinear process, and its performance is compared with that of other models. The experimental results show that the FRBF-ARX model achieves higher accuracy in nonlinear time series forecasting than the other models considered.
1. Introduction
Time series forecasting has become a hot topic in recent years. It plays an important role in economic services, weather services, and individual decision-making under uncertainty. Owing to the importance of prediction in various applications, many prediction methods and techniques have been proposed [1]. For example, transductive long short-term memory (LSTM) has been used to predict weather conditions [2]. In [3], a seasonal autoregressive model is used to predict hourly water demand. In [4], financial data series are predicted by using clustering methods and support vector regression (SVR). However, time series data do not always have the same characteristics [5]. They are often high-dimensional and complex, with properties that make them challenging to analyze and model. Accurately predicting future values is therefore a difficult task, and the main goal is to improve the accuracy of the prediction model. To address this problem, a wide variety of prediction models have been proposed. Forecasting methods can be classified into three categories: statistical models, artificial intelligence (AI) models, and hybrid models [6–8].
In the first category, statistical models are the common time series models. The linear autoregressive (AR) model, the autoregressive moving average (ARMA) model, and the autoregressive integrated moving average (ARIMA) model are traditional and effective models for time series forecasting. In [9], the ARIMA model is used to predict groundwater level, and the experiment showed that it is effective and feasible for this task. In [10], tourism demand was forecast with the ARMA method, and the experimental results demonstrated that the ARMA model achieves a certain prediction accuracy. However, although the AR-type models mentioned above can model the linear part of a time series well, they are not suitable for nonlinear time series modeling [11].
In the second category, many AI models have been proposed in various forecasting fields in recent years, such as the artificial neural network (ANN), support vector machine (SVM), and extreme learning machine (ELM). For example, in [12], neural networks were used to predict Indonesian electricity load, and the experimental results show that the neural networks obtain better prediction accuracy. In [13], SVM, a nonlinear machine learning technique, is used to analyze shallow water bathymetry data; according to the experimental results, SVM performs better for shallow depth ranges. In [14], the online sequential extreme learning machine (OSELM) was applied to forecast daily streamflow at two small watersheds in British Columbia, Canada, at lead times of 1–3 days. The experimental results show that OSELM is an effective method in terms of forecast accuracy.
However, a traditional single prediction model applied to the original data series cannot capture the complicated relations existing in nonlinear data series [6]. Therefore, many researchers have proposed hybrid models that combine neural networks with other models [8]. Because a hybrid model combines the advantages of multiple single models or multiple algorithms, it can achieve better prediction accuracy [15, 16]. For example, in [17], a least squares support vector machine, singular spectrum analysis, a deep belief network (DBN), and locality-sensitive hashing constitute a hybrid model for wind power prediction. The simulation results show that the proposed hybrid model outperforms all the other models considered. In [18], single-linear, hybrid-linear, and nonlinear prediction techniques are used to predict energy demand in China and India, and the experimental results show that the proposed techniques obtain a high fitting precision. A linear regression (LR) model and a DBN model are combined for time series forecasting, and the experimental results show that the proposed hybrid model may be a useful tool for time series forecasting [19]. A transferred recurrent neural network- (RNN-) based framework and a generative adversarial network-based (GAN-based) model are proposed to predict battery calendar ageing [20, 21]. In [22], a hybrid grey double exponential smoothing model is proposed for the prediction of PM2.5 and PM10, and the experimental results show that the proposed hybrid model has a higher prediction accuracy.
Although combining different individual models in a hybrid model can improve predictive ability to a certain extent, there is still room for further improvement. The state-dependent AR model with functional coefficients is a general class of nonlinear time series models. Building on the aforementioned research, a novel hybrid prediction model is developed in this paper. When a set of fuzzy RBF neural networks is used to approximate the coefficients of a state-dependent AR (SD-AR) model, the FRBF-AR model is obtained [23]. The FRBF-AR model has the advantages of both the SD-AR model in describing nonlinear dynamics and FRBFs in function approximation. The complexity of the FRBF-AR model is decomposed into the AR part. Therefore, the FRBF-AR model is locally linear and globally nonlinear.
In this paper, the function-type coefficients of a state-dependent autoregressive model with exogenous variables (SD-ARX) are approximated by a set of FRBFs, which yields the FRBF-ARX model for nonlinear time series forecasting. The motivation for the proposed FRBF-ARX model is to extend the FRBF-AR model with several exogenous variables as input signals. The FRBF-ARX model has the advantages of both the SD-ARX model in describing nonlinear dynamics and FRBFs in function approximation. The complexity of the FRBF-ARX model is decomposed into the linear ARX part, so the model can be considered locally linear and globally nonlinear. It is well known that the prediction accuracy of a time series model depends on the estimation method. As shown in Section 3, the parameters of the FRBF-ARX model can be divided into linear and nonlinear parts, and the number of linear parameters is larger than the number of nonlinear parameters. In this paper, a structured nonlinear parameter optimization method (SNPOM) [24], which combines the Levenberg–Marquardt method (LMM) for nonlinear parameter estimation and the least-squares method (LSM) for linear parameter estimation at each iteration, is used to identify the parameters of the proposed FRBF-ARX model. The linear weights are updated many times while the search direction for the nonlinear parameters is being determined, which makes the SNPOM an effective method for estimating the parameters of the FRBF-ARX model. The proposed FRBF-ARX model and the SNPOM are combined to predict different time series. The simulation results show that the proposed FRBF-ARX model exhibits better prediction accuracy than the other prediction methods and models considered.
The remainder of this paper is organized as follows. Section 2 describes in detail the FRBF network-type models in the paper. Section 3 describes the proposed hybrid algorithm. The comparative experimental investigation is given in Section 4. Finally, Section 5 gives the concluding remarks.
2. Nonlinear System Modeling
2.1. State-Dependent ARX Model
Without loss of generality, the following ARX model can be used to describe single-input and single-output (SISO) nonlinear time series systems:
$$y(t) = f\big(y(t-1), \ldots, y(t-n_y),\, u(t-1), \ldots, u(t-n_u)\big) + e(t), \tag{1}$$
where $y(t)$ denotes the output of the nonlinear system, $u(t)$ denotes the input of the system, and $e(t)$ represents the white noise. $f(\cdot)$ denotes the nonlinear mapping, and $n_y$ and $n_u$ represent the orders of the output and input, respectively. $\mathbf{w}(t-1)$ is the state vector at time $t$, and it may contain the input series and/or the output series.
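Many different types of functions can be used to approximate the unknown nonlinear map $f(\cdot)$. The state-dependent ARX (SD-ARX) model is a general version and can be described as follows:
$$y(t) = \phi_0(\mathbf{w}(t-1)) + \sum_{i=1}^{n_y} \phi_{y,i}(\mathbf{w}(t-1))\, y(t-i) + \sum_{j=1}^{n_u} \phi_{u,j}(\mathbf{w}(t-1))\, u(t-j) + e(t), \tag{2}$$
where $\phi_0(\cdot)$, $\phi_{y,i}(\cdot)$, and $\phi_{u,j}(\cdot)$ are the state-dependent coefficients of model (2), evaluated at the state vector $\mathbf{w}(t-1)$. Model (2) has an autoregressive structure that can be regarded as a linear ARX model at each time $t$. That is to say, a nonlinear process can be split into a large number of small segments, and model (2) can be regarded as locally linear within each segment.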
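The locally linear reading of model (2) can be illustrated with a minimal sketch. It assumes the usual SD-ARX form, in which an offset, the autoregressive coefficients, and the input coefficients are all functions of a state vector. The names `sd_arx_predict`, `phi0`, `phi_y`, and `phi_u`, and the toy coefficient functions, are hypothetical and are not taken from the paper.

```python
import numpy as np

def sd_arx_predict(y_lags, u_lags, w, phi0, phi_y, phi_u):
    """One-step prediction of a state-dependent ARX model (assumed form).

    y_lags: [y(t-1), ..., y(t-ny)]; u_lags: [u(t-1), ..., u(t-nu)]
    w: state vector; phi0, phi_y, phi_u: callables returning the
    offset and the coefficient vectors at state w.
    """
    a = np.asarray(phi_y(w), dtype=float)   # output (AR) coefficients at w
    b = np.asarray(phi_u(w), dtype=float)   # input (X) coefficients at w
    return float(phi0(w) + a @ np.asarray(y_lags, float)
                 + b @ np.asarray(u_lags, float))

# Toy coefficient functions (illustrative only): the first AR
# coefficient shrinks as the state moves away from zero.
phi0 = lambda w: 0.1
phi_y = lambda w: np.array([0.8 * np.exp(-w[0] ** 2), -0.2])
phi_u = lambda w: np.array([0.5])

# At a fixed state the model is an ordinary linear ARX model.
y_hat = sd_arx_predict([1.0, 0.5], [2.0], np.array([0.0]), phi0, phi_y, phi_u)
```

For a fixed state the prediction is linear in the lagged outputs and inputs; the nonlinearity enters only through the dependence of the coefficients on the state.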
2.2. FRBF-ARX Model
The FRBF neural network is an efficient tool for solving nonlinear approximation problems. When FRBF neural networks are used to approximate the state-dependent coefficients of model (2), the resulting model is called the FRBF-AR model [23]. The structure of the single FRBF neural network is shown in Figure 1, and the input-output relationship of the single FRBF neural network can be described as follows:
where , , , and are the output values of the input layer, fuzzification layer, fuzzy inference layer, and output layer, respectively; is the number of the neural nodes in the fuzzification layer; is the number of nodes in the fuzzy inference layer; is a constant bias for the output layer; represent the connection weights between the output layer node and the fuzzy inference layer node ; is the input vector for the FRBF neural network; is the center vector; is the scaling parameter; and denotes the Euclidean norm.

If the state-dependent coefficients of model (2) are approximated by a set of FRBF neural networks, the model derived is called the FRBF-ARX model, which is given by
where and represent the order of the model; are the centers of FRBF networks; are the scaling parameters; and is the dimension of the state vector . are the state-dependent coefficients in model (4). represents the structure of the FRBF-ARX model, and represents the structure of the ARX model in this paper. It can be seen from equation (4) that the single FRBF network (3) is an integral part of the FRBF-ARX model. Therefore, the FRBF-ARX model can be regarded as a more general nonlinear model than the single FRBF network.
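As a rough illustration of how network outputs can act as ARX coefficients, the following sketch uses a simplified fuzzy RBF layer. It is an assumption-laden reading, not the paper's exact network: memberships are taken to be Gaussian, each rule's firing strength is the product of its memberships, and all coefficients share the same centers and scaling parameters while having their own linear weights. The names `rule_firing`, `frbf_arx_predict`, `W`, and `c0` are hypothetical.

```python
import numpy as np

def rule_firing(w, centers, lam):
    """Firing strength of each fuzzy rule for state vector w.

    w: (d,), centers: (m, d), lam: (m,) positive scaling parameters.
    """
    w = np.asarray(w, float)
    # Fuzzification: Gaussian membership of every state component in every rule.
    memberships = np.exp(-lam[:, None] * (w[None, :] - centers) ** 2)  # (m, d)
    # Fuzzy inference: the product over components equals
    # exp(-lam * ||w - c||^2) for each rule.
    return memberships.prod(axis=1)

def frbf_arx_predict(y_lags, u_lags, w, centers, lam, W, c0):
    """One-step prediction with network-valued ARX coefficients.

    W: (1 + ny + nu, m) linear weights; c0: (1 + ny + nu,) constant biases.
    Row 0 gives the offset, the next ny rows the output coefficients,
    and the remaining rows the input coefficients.
    """
    coeffs = c0 + W @ rule_firing(w, centers, lam)
    ny = len(y_lags)
    return float(coeffs[0]
                 + coeffs[1:1 + ny] @ np.asarray(y_lags, float)
                 + coeffs[1 + ny:] @ np.asarray(u_lags, float))

# Illustrative parameters only (two rules, two-dimensional state).
centers = np.array([[0.0, 0.0], [1.0, -1.0]])
lam = np.array([0.5, 2.0])
w = np.array([0.3, -0.2])
c0 = np.array([0.1, 0.8, -0.2, 0.5])
W = 0.1 * np.ones((4, 2))
y_hat = frbf_arx_predict([1.0, 0.5], [2.0], w, centers, lam, W, c0)
```

With all entries of `W` set to zero the coefficients no longer depend on the state and the sketch reduces to a linear ARX model, which is one way to see the linear model as a special case.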
3. Estimation of the FRBF-ARX Model
The parameters of the FRBF-ARX model to be identified include both the input order and all the parameters in the FRBF-ARX model. If all the parameters are treated in the same way, for example, if they are all identified by the Levenberg–Marquardt method (LMM), a large amount of computation is needed, and it is difficult to obtain good results. In [24], Peng et al. proposed a structured nonlinear parameter optimization method (SNPOM) for parameter estimation. The SNPOM divides the search space into a linear parameter space and a nonlinear parameter space. The linear parameters and nonlinear parameters are identified by LSM and LMM, respectively. The linear parameters are updated many times during each update of the nonlinear parameters. This method greatly improves the convergence speed and prediction accuracy. Therefore, SNPOM is also used to identify the parameters of the proposed FRBF-ARX model in this paper.
According to the SNPOM, the optimization process for the proposed FRBF-ARX model proceeds in the following steps [23].
Step 1. Parameter classification: the linear parameters for the FRBF-ARX model are given as follows: and the nonlinear parameters for the FRBF-ARX model are also given as follows: To make the SNPOM more suitable for estimation of the FRBF-ARX model, the proposed FRBF-ARX model is rewritten in two forms as follows: or where represents the vector for all of the linear parameters and represents the vector for all of the nonlinear parameters. Equation (8) is the regression form of equation (7).
Step 2. Initialization: a subspace is chosen randomly as the initial value of from the vector space , and the initial value of the scaling factor is calculated according to the following equation: After the initial value of the scaling factor is selected, the initial value of the nonlinear parameter is selected and fixed, and LSM is used to calculate the initial values of the linear parameters . where represent the measured data set, represent the largest time lag in model (7) or model (8), and represent the number of data series.
Step 3. Parameter optimization: all the parameters are optimized by minimizing the sum of squares of residuals as the objective function. The optimization of the objective function is given as follows:
The following equation is a parameter optimization problem:
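The parameters are optimized by a cross-iteration process, and the nonlinear parameters are updated by the following equation:
$$\theta_N^{k+1} = \theta_N^{k} + \beta_k d_k, \tag{14}$$
where $\theta_N^{k}$ is the vector of nonlinear parameters at iteration $k$, $d_k$ is the search direction, and $\beta_k$ is a scalar step length parameter. The value of $d_k$ can be obtained from the following equation:
$$\left[J(\theta_N^{k})^{T} J(\theta_N^{k}) + \gamma_k I\right] d_k = -J(\theta_N^{k})^{T} F(\theta_N^{k}, \theta_L^{k}), \tag{15}$$
where $F$ is the residual vector, $J$ is its Jacobian with respect to the nonlinear parameters, $\theta_L^{k}$ is the vector of linear parameters, and $\gamma_k$ is used to control the magnitude and direction of $d_k$. When $\gamma_k$ tends to zero, $d_k$ tends to the Gauss–Newton direction. When $\gamma_k$ tends to infinity, $d_k$ tends to the steepest descent direction. In equation (14), a step of length $\beta_k$ is taken in the direction $d_k$; $\beta_k$ is determined by the mixed quadratic and cubic polynomial interpolation and extrapolation method.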
For the linear parameters, LSM can be used to calculate according to the following equation:
The step length is determined by the line search process in equation (14) to ensure that the objective function decreases at each iteration, and the nonlinear and linear parameters are updated according to equations (14) and (16), respectively. Finally, the best nonlinear and linear parameters are those obtained at the iteration at which the objective function (11) stops decreasing. The procedures of the proposed FRBF-ARX model for parameter estimation are summarized in Table 1.
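The alternation between a Levenberg–Marquardt step on the nonlinear parameters and a least-squares solve for the linear parameters can be sketched on a toy problem. This is a simplified illustration in the spirit of the SNPOM, not the method of [24]: the model is a single Gaussian bump plus an offset, the Jacobian is a forward difference, the step length is found by simple halving instead of polynomial interpolation, and the damping update rule is an assumption. All function names are hypothetical.

```python
import numpy as np

def regressors(theta_n, x):
    """Regression matrix for nonlinear parameters theta_n = (center, scaling)."""
    c, lam = theta_n
    return np.column_stack([np.ones_like(x), np.exp(-lam * (x - c) ** 2)])

def solve_linear(theta_n, x, y):
    """Least-squares estimate of the linear parameters for fixed theta_n."""
    theta_l, *_ = np.linalg.lstsq(regressors(theta_n, x), y, rcond=None)
    return theta_l

def residual(theta_n, theta_l, x, y):
    return regressors(theta_n, x) @ theta_l - y

def objective(theta_n, theta_l, x, y):
    r = residual(theta_n, theta_l, x, y)
    return 0.5 * float(r @ r)

def jacobian(theta_n, theta_l, x, y, eps=1e-6):
    """Forward-difference Jacobian of the residual w.r.t. theta_n (theta_l fixed)."""
    f0 = residual(theta_n, theta_l, x, y)
    J = np.empty((f0.size, theta_n.size))
    for i in range(theta_n.size):
        step = np.zeros_like(theta_n)
        step[i] = eps
        J[:, i] = (residual(theta_n + step, theta_l, x, y) - f0) / eps
    return J

def snpom_like_fit(x, y, theta_n0, iters=50, gamma=1e-2):
    theta_n = np.asarray(theta_n0, float)
    theta_l = solve_linear(theta_n, x, y)          # initial linear parameters
    history = [objective(theta_n, theta_l, x, y)]
    for _ in range(iters):
        F = residual(theta_n, theta_l, x, y)
        J = jacobian(theta_n, theta_l, x, y)
        # Damped Gauss-Newton (Levenberg-Marquardt) search direction.
        d = np.linalg.solve(J.T @ J + gamma * np.eye(theta_n.size), -J.T @ F)
        beta, accepted = 1.0, False
        while beta > 1e-8:
            cand_n = theta_n + beta * d
            if cand_n[1] > 0:                      # keep the scaling parameter positive
                cand_l = solve_linear(cand_n, x, y)  # re-solve linear part per trial step
                v = objective(cand_n, cand_l, x, y)
                if v < history[-1]:                # accept only a strict decrease
                    theta_n, theta_l, accepted = cand_n, cand_l, True
                    history.append(v)
                    break
            beta *= 0.5
        gamma = max(gamma * 0.5, 1e-10) if accepted else gamma * 10.0
    return theta_n, theta_l, history

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 80)
y = 0.3 + 2.0 * np.exp(-1.5 * (x - 0.5) ** 2) + 0.01 * rng.standard_normal(x.size)
theta_n, theta_l, history = snpom_like_fit(x, y, theta_n0=[0.0, 1.0])
```

Because the linear parameters are re-solved for every trial step of the line search, each accepted step already uses the best linear fit for the new nonlinear parameters, which is the structural idea the section describes.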
4. Simulations and Applications
In order to illustrate the effectiveness of the proposed FRBF-ARX model, the root mean square error (RMSE) and the normalized mean squared error (NMSE) are used to evaluate its performance. They are given as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}, \qquad \mathrm{NMSE} = \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}, \tag{18}$$
where $y_i$ represents the actual value, $\hat{y}_i$ represents the predicted value, $\bar{y}$ represents the mean of the actual values, and $N$ represents the length of the data series. The smaller the values of RMSE and NMSE, the higher the prediction accuracy of the model.
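The two error measures can be computed as in the following minimal sketch. It assumes that the NMSE is normalized by the sum of squared deviations of the actual values from their mean, which is one common convention; the function names are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def nmse(y_true, y_pred):
    """Squared error normalized by the spread of the actual values (assumed convention)."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    return float(np.sum((y_true - y_pred) ** 2)
                 / np.sum((y_true - y_true.mean()) ** 2))

# Small worked example: only the last prediction is wrong, by 2.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(rmse(actual, predicted))  # 1.0
print(nmse(actual, predicted))  # 0.8
```

Under this convention, an NMSE of 1 corresponds to always predicting the mean of the actual values, so values well below 1 indicate that the model explains most of the variation.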
4.1. PM2.5 Concentration Prediction
The dataset used in our experiment covers the period from 1/1/2010 to 3/26/2010 and contains the PM2.5 concentration (μg/m³) and the cumulated wind speed (lws, m/s) [25, 26]. Both are time series. To facilitate the analysis of the PM2.5 and lws data, some abnormal values are removed, which leaves a total of 2023 data points, shown in Figure 2. The first 1500 data points are used as the training dataset, and the remaining 523 data points are used as the testing dataset. In this section, the historical data of PM2.5 and lws are used to predict the future value of PM2.5 with the proposed FRBF-ARX model.
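A minimal sketch of how lagged regressors and a chronological split of this size might be set up is given below. The arrays are random placeholders, not the PM2.5 data, and the lag order of 5, the helper name `make_lagged`, and the choice to let the first test rows use lags from the end of the training period are all assumptions.

```python
import numpy as np

def make_lagged(y, u, ny, nu):
    """Build rows [y(t-1..t-ny), u(t-1..t-nu)] with target y(t)."""
    y, u = np.asarray(y, float), np.asarray(u, float)
    p = max(ny, nu)  # largest time lag
    rows = [np.concatenate([y[t - ny:t][::-1], u[t - nu:t][::-1]])
            for t in range(p, len(y))]
    return np.array(rows), y[p:]

# Placeholder series of the same length as the cleaned data set.
rng = np.random.default_rng(0)
pm25 = rng.uniform(10.0, 300.0, size=2023)
wind = rng.uniform(0.0, 50.0, size=2023)

ny = nu = 5
X, target = make_lagged(pm25, wind, ny, nu)

# Chronological split: targets among the first 1500 points train the model,
# targets among the remaining 523 points test it.
n_train_rows = 1500 - max(ny, nu)
X_train, y_train = X[:n_train_rows], target[:n_train_rows]
X_test, y_test = X[n_train_rows:], target[n_train_rows:]
```

Keeping the split chronological, rather than shuffling rows, means the test set only contains observations that come after everything used for training.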

The performance of a prediction model is usually influenced by the choice of input variables. In order to find a reasonable model structure, different input combinations are compared in Table 2. For a fixed set of input variables, the number of neural nodes in the fuzzification layer and the number of nodes in the fuzzy inference layer are selected according to the minimum MSE value. In this way, up to 5 prediction models are obtained for each model type, one for each choice of input variables. The comparison results of the different models under different input variables are given in Table 2. To show the comparison more intuitively, two histograms based on the RMSE and NMSE values of the different models are given in Figures 3 and 4, respectively. It can be seen from Table 2 and these two figures that the proposed FRBF-ARX model has the best prediction accuracy among the compared models.


In order to illustrate the computational complexity of the proposed method relative to other modeling methods, Table 2 also gives the computing time on the training data for the different models. Table 2 shows that the prediction result of the FRBF-ARX model is better than that of the linear ARX model for the different input values. This is because the FRBF-ARX model has both the superiority of the FRBF in function approximation and the ability of the SD-ARX model to describe nonlinear behavior. However, because of the complexity of training the FRBF modules, the training time may be longer than that of the linear ARX model. In the time series analysis considered here, prediction accuracy is more important than the computational complexity of the method. Moreover, the FRBF-ARX model parameters are trained offline, so in most cases long offline training does not affect the use of the model in practice. The computing times in Table 2 were measured in MATLAB on a PC with an Intel(R) Core(TM) i7-9700 CPU @ 3.00 GHz.
In order to illustrate the advantage of the proposed FRBF-ARX model, the prediction process of FRBF-ARX (5, 4, 9, 9, 2) is given in this section. In the process of PM2.5 prediction, the PM2.5 concentration and the lws sequence values of the past five hours are used to predict the PM2.5 concentration one hour ahead. The optimization method described in Section 3 is applied to this PM2.5 data series to illustrate the superiority of the proposed FRBF-ARX model. In terms of FRBF neural network construction, we use the FRBF-ARX model, where , , , , and , to model this complex PM2.5 data series. Then, we obtained model (19), given as follows:
The results obtained from the proposed FRBF-ARX model are shown in Table 2, together with the prediction results obtained from the single ARX model for comparison. It can be seen from Table 2 that the proposed FRBF-ARX model produces much smaller prediction errors than the single ARX model. Figure 5 compares the original data and the predicted values of PM2.5 for the testing data. Figure 6 shows the prediction errors and their histogram for the proposed FRBF-ARX model on the testing data. Figure 6 shows that the prediction errors are approximately Gaussian distributed, which supports the use of the proposed FRBF-ARX model for predicting nonlinear time series.


Figure 7 shows the parameter search process of the proposed method for estimating the FRBF-ARX model, including the value of the objective function and the search paths of the linear and nonlinear parameters. Figure 7 shows that the objective function converges and that the estimated linear and nonlinear parameters also converge. Therefore, the parameter search process proposed in this paper is feasible.

4.2. Simulated SISO Nonlinear Process
The following nonlinear input-output system is used to evaluate prediction performance for the FRBF-ARX model in this section.
A system input is simulated by the following form:
A total of 1000 input-output data pairs are obtained from equations (20) and (21), as shown in Figure 8. The first 500 data points are used as the training data, and the remaining 500 data points are used as the testing data. The proposed FRBF-ARX model is used to identify this SISO nonlinear process. The parameters of the FRBF-ARX model are selected as , , , and , which gives a smaller MSE on the testing data (3.0995) than the single linear ARX model (5.0385). The obtained FRBF-ARX model (22) is used to describe the complex nonlinear system given by equations (20) and (21). A comparison of the original outputs and the predicted outputs of the FRBF-ARX model for the testing data is given in Figure 9, which shows that the proposed FRBF-ARX model obtains good prediction accuracy for the data series.
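The linear ARX baseline used for comparison can be sketched as an ordinary least-squares fit on lagged outputs and inputs. The simulated system below is a hypothetical stand-in and is not the system of equations (20) and (21); the orders (2, 1), the input signal, and the function names are likewise assumptions.

```python
import numpy as np

def arx_design(y, u, ny, nu):
    """Rows [1, y(t-1..t-ny), u(t-1..t-nu)] with target y(t)."""
    p = max(ny, nu)
    rows = [np.concatenate(([1.0], y[t - ny:t][::-1], u[t - nu:t][::-1]))
            for t in range(p, len(y))]
    return np.array(rows), y[p:]

def fit_arx(y, u, ny, nu):
    """Least-squares estimate of the linear ARX coefficients."""
    X, target = arx_design(y, u, ny, nu)
    theta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return theta

def arx_mse(theta, y, u, ny, nu):
    """Mean squared one-step prediction error of a fitted ARX model."""
    X, target = arx_design(y, u, ny, nu)
    return float(np.mean((X @ theta - target) ** 2))

# Hypothetical mildly nonlinear SISO process (illustrative only).
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=1000)
y = np.zeros(1000)
for t in range(2, 1000):
    y[t] = (0.6 * y[t - 1] - 0.1 * y[t - 2] + 0.5 * u[t - 1]
            + 0.3 * np.sin(y[t - 1]) + 0.05 * rng.standard_normal())

# First 500 points for training, remaining 500 for testing.
theta = fit_arx(y[:500], u[:500], ny=2, nu=1)
mse_test = arx_mse(theta, y[500:], u[500:], ny=2, nu=1)
```

A state-dependent model is then judged by how much it lowers the test MSE relative to this fixed-coefficient fit on the same split.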


In order to evaluate the performance of the FRBF-ARX model in a robust manner, a repetition experiment is carried out on this data series. The data from Figure 8 are randomly shuffled to obtain a new set of 1000 input-output data pairs. The first 500 input-output data pairs are used as training data, and the remaining 500 data pairs are used as testing data. In this way, 10 new sets of input-output data pairs are obtained, and equation (22) is used to model each new data set. The prediction results of the ARX and FRBF-ARX models are given in Table 3, which lists the MSE value of each model on the testing data. It can be seen from Table 3 that the proposed FRBF-ARX model obtains the smallest modeling error among the compared models. Therefore, the FRBF-ARX model is an effective prediction model for time series forecasting.
5. Conclusions
In recent years, many researchers have devoted themselves to designing feasible prediction tools that improve the prediction accuracy of time series. In order to improve prediction precision, this paper proposed a novel varying-coefficient FRBF-based nonlinear state-dependent ARX (FRBF-ARX) model, which combines the function approximation capability of the FRBF with the SD-ARX framework, to predict nonlinear time series data. A structured nonlinear parameter optimization method and the modified multifold cross-validation criterion are used to optimize the parameters of the FRBF-ARX model. The proposed FRBF-ARX model is then compared with other models on two time series data sets. Based on several performance indexes, the experimental results demonstrate that the proposed FRBF-ARX model has better prediction accuracy. The results indicate that it is feasible to use the hybrid model to describe the nonlinearity of some time series data. In future work, the FRBF-ARX model will be used to design a nonlinear model predictive controller (MPC) for nonlinear control systems. The application of the proposed FRBF-ARX model to modeling multivariable systems is another main direction of our future research.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This research was supported by the Key Projects of Natural Science Research in Colleges and Universities of Anhui Province (grant nos. KJ2020A0508, 2022AH051054, and 2022AH051038), Humanities and Social Sciences Research Project of Anhui Province (grant no. SK2021A0374), and Key Research and Development Plan of Anhui Province (grant no. 202104a05020050).