Abstract
To predict the daily air pollutants, the fractional multivariable model is established. The hybrid model of the grey multivariable regression model with fractional order accumulation model (FGM(0, m)) and support vector regression model (SVR) is used to predict the air pollutants (PM10, PM2.5, and NO2) from December 31, 2018, to January 3, 2019, in Shijiazhuang and Chongqing. The absolute percentage errors (APEs) are used to determine the weights of the FGM(0, m) and SVR. Meanwhile, the Holt–Winters model is used to predict the air quality pollutants for the same location and period. When the mean absolute percent error (MAPE) is 0%–20%, it indicates that the model has good accuracy of fitting and prediction. The MAPE of the hybrid model is less than 20%. It is shown that except for the PM2.5 concentration prediction in Shijiazhuang (13.7%), the MAPE between the forecasting and actual values of the three air pollutants in Shijiazhuang and Chongqing was less than 10%.
1. Introduction
According to the statistical data for China in 2018 [1], the 338 cities had an average of 79.3% of days with good air quality (meeting the air quality standard), an increase of 1.3 percentage points compared with 2017. Days with heavy pollution accounted for 2.2% of the total, a decrease of 0.3 percentage points compared with 2017. The PM2.5 concentration was 39 μg/m3, a decrease of 9.3% compared with 2017. The concentration of PM10 was 71 μg/m3, a decrease of 5.3% compared with 2017. On the whole, the air quality in China improved in 2018, but only 121 of the 338 cities met the air quality standards, as shown in Table 1. When the concentrations of the air pollutants (PM10, PM2.5, and NO2) meet the standard, the air quality is regarded as good. Otherwise, the air quality is regarded as poor. The 24-hour air pollutant standard implemented in China since 2016 is shown in Table 1 [2]. In addition, there were 822 days of severe pollution, 20 more than in 2017. This indicates that the governance of air quality is still a problem that cannot be ignored.
In recent years, air quality has attracted increasing attention, and a growing number of studies have addressed it. The impact of foreign direct investment and research and development on China’s industrial CO2 emission reduction has been studied, and its trend has been predicted [3]. A seasonal stacked autoencoder model combining seasonal analysis and deep feature learning was proposed for forecasting the hourly PM2.5 concentration in Beijing [4]. An integrated short-duration memory neural network was proposed for the prediction of hourly PM2.5 concentration in Beijing [5]. The trend of the observational PM10 concentrations in Shimla city, India, was analyzed [6]. Predictive models can be divided into two categories: single models and hybrid models. Some scholars used a single model to study air quality. Multigene genetic programming was used to predict the concentrations of PM10 [7]. The grey Markov model was used to predict the concentration of air pollutants in Pingdingshan [8]. A single dependent variable partial least squares regression was used to predict the real-time PM2.5 concentration in Beijing [9]. The grey Holt–Winters model was used to predict the air quality indexes of Shijiazhuang and Handan [10]. A microscale land use regression model was used to predict NO2 concentrations at a heavily trafficked suburban area in Auckland, NZ [11]. The seasonal grey one variable model with fractional order accumulation was used to predict the air quality indexes of Xingtai and Handan [12]. The optimized particle swarm was used to predict the concentration of air pollutants in Aburrá Valley, Colombia [13]. The empirical mode decomposition based on the multifractal detrended fluctuation analysis method was used to study the daily PM2.5 concentration in Hong Kong [14]. A land use regression model was used to estimate annual and seasonal PM1, PM2.5, and PM10 concentrations [15].
Many scholars combined two different models to study air quality. A hybrid of regression models and feedforward backpropagation models with principal component analysis was used to predict the daily PM10 concentrations [16]. The mixed model of information gain and least absolute shrinkage was used to predict the air quality index [17]. A linear model and an artificial neural network statistical model were developed and validated to forecast the short-term hourly PM10 concentrations in the city of Brescia (Italy) [18]. The mixed air quality assessment model was designed and applied to analyze the pollution sources of PM2.5 [19]. A mixed forecasting model of the daily air quality index considering air pollution factors in Beijing and Guilin was proposed [20]. A hybrid particle swarm optimization-support vector machine model based on a clustering algorithm was used to forecast the short-term atmospheric pollutant concentration in Beijing [21]. A hybrid multiresolution multiobjective ensemble model was used to forecast the daily PM2.5 concentrations [22]. The hybrid model of an autoencoder with bidirectional long short-term memory neural networks was used to predict the PM2.5 concentration [23].
In recent years, more and more hybrid models have been used by scholars for prediction. However, few of them combine an artificial intelligence algorithm with a statistical algorithm, and few scholars have used such hybrid models to predict air pollutants. The M4 competition showed that the hybrid models with good prediction performance are combinations of an artificial intelligence algorithm and a statistical algorithm. In order to improve the prediction accuracy, a hybrid of the grey multivariable regression model with fractional order accumulation (FGM(0, m)) [24] and the support vector regression (SVR) model [25], denoted FGM(0, m)-SVR, is proposed in this paper to predict air pollutants (PM2.5, PM10, and NO2).
The remainder of this paper is organized as follows. In Section 2, the situation in Shijiazhuang and Chongqing is introduced. In Section 3, the hybrid model is introduced. In Sections 4 and 5, the calculation process and the results for Shijiazhuang and Chongqing are shown, respectively. Based on the analysis of the calculation results, some suggestions for the air quality in Shijiazhuang and Chongqing are given in Section 6. The conclusions are summarized in Section 7.
2. Location and Air Quality in Shijiazhuang and Chongqing
2.1. Location in Shijiazhuang and Chongqing
Shijiazhuang is the capital city of Hebei Province. It is located in the North China Plain, adjacent to Beijing and Tianjin in the north, Bohai in the east, Taihang Mountain in the west, and the economic zone in the south. Shijiazhuang is the gate of the capital city, 273 kilometers away from Beijing. It is located between latitude 37°27′∼38°47′ and longitude 113°30′∼115°20′ (as shown in Figure 1). Shijiazhuang is one of the most polluted cities in China. According to the statistics for 2018, Shijiazhuang ranked 168th out of 169 cities in China for air quality.

Chongqing is a center of economy, finance, scientific and technological innovation, shipping, and commercial logistics in the upper reaches of the Yangtze River. It is located in inland southwest China, bordering Hubei and Hunan in the east, Guizhou in the south, Sichuan in the west, and Shaanxi in the north. Chongqing is located between longitude 105°17′∼110°11′ and latitude 28°10′∼32°13′ (as shown in Figure 1). Chongqing is also a heavily polluted city. Compared with previous years, the air quality in Chongqing improved significantly in 2019. However, it is still a long way from the goal set by the Chongqing Ecology and Environment Bureau of keeping the number of days with good air quality in 2019 above 300. Chongqing has been known as “the city of fog,” and its air quality ranks behind that of other cities in China.
2.2. Air Quality in Shijiazhuang and Chongqing
The number of days with good air quality (meeting the air quality standard), as shown in Table 2, is obtained from the websites of the Shijiazhuang Environmental Protection Bureau (http://www.sjzhb.gov.cn) and the Chongqing Environmental Protection Bureau (http://www.cepb.gov.cn/), respectively. The days with good air quality in Shijiazhuang and Chongqing from 2014 to 2018 are shown in Figure 2.

As shown in Figure 2, despite the increasing governance efforts of the government, the number of days with bad air quality has been increasing since 2015. According to the statistical data of the Hebei Province Environment Protection Hall in 2018, the air quality of Shijiazhuang is the worst in Hebei Province. In addition, according to the statistics of China Environment Network, the air quality of Shijiazhuang was the worst among the 11 cities ranked by the air quality composite index. However, an air quality target has been proposed in the “Three-year Action Plan for Shijiazhuang City to Win the Blue Sky Defense War (2018–2020)”: the days with good air quality in Shijiazhuang should exceed 176 in 2019. The plan states that, by 2020, Shijiazhuang will meet the air environmental quality targets of “The 13th Five-year for Economic and Social Development of the People’s obligatory,” cut the emissions of the main atmospheric pollutants, and strive to move out of the last 10 of the 169 cities in China in the air quality ranking. Therefore, effective prediction of the air quality is particularly important.
The number of days with heavy pollution in Chongqing is increasing year by year. The number of days with good air quality in Chongqing reached 316 in 2018. At the same time, the arrangements for the environmental protection work of 2019 had been made in the teleconference of Chongqing environmental protection on January 21, 2019: ensuring that the number of days with good air quality remains above 300 and that the average annual concentration of fine particulate matter is kept within 40 micrograms per cubic meter. From 2016 to 2018, the number of days with good air quality in Chongqing was 301, 303, and 316, which is inseparable from the government’s governance measures. If the air pollutant can be predicted more accurately, the air pollution control will be more effective.
3. The Construction of Model
By using accumulating generation operators, the FGM(0, m) model can transform the data from nonlinear to linear. Similarly, the SVR model can map the data from nonlinear to linear. Data from the same period in different years show historical parallelism. By processing the data through FGM(0, m) and SVR, better prediction accuracy can be achieved, so the FGM(0, m) model and the SVR model are used to forecast the air pollutants in this paper. In China, a large number of people choose to travel during the New Year’s Day holiday, and their travel is affected by the environment, so it is particularly important to predict the air pollutants effectively for this period. Meanwhile, taking the air pollutant PM10 in Shijiazhuang as an example, the Holt–Winters model is used to calculate the fitting and predictive values. The basis of the Holt–Winters model is that a time series with a linear trend, seasonal change, and random fluctuation can be decomposed, and the decomposition is combined with the exponential smoothing method to establish a forecasting model. In recent years, the Holt–Winters model has often been used to predict seasonal data, and it was used to predict the air pollutants in Shijiazhuang and Handan [10]. In this paper, the Holt–Winters predictions of the PM10 concentration in Shijiazhuang are contrasted with the results of the hybrid model.
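The decomposition into level, trend, and seasonal components described above can be sketched in a few lines of Python. This is only an illustrative additive variant with a simple initialisation; the function name `holt_winters_additive`, the smoothing parameters `alpha`, `beta`, and `gamma`, and the initialisation scheme are assumptions for the sketch, not the configuration used in this paper.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive Holt-Winters smoothing (illustrative sketch).

    Returns (fitted, forecasts): one-step-ahead fitted values for the
    observed series and `horizon` out-of-sample forecasts.
    """
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons to initialise")

    # Initial level: mean of the first season.
    level = sum(series[:season_len]) / season_len
    # Initial trend: average per-step change between the first two seasons.
    second_mean = sum(series[season_len:2 * season_len]) / season_len
    trend = (second_mean - level) / season_len
    # Initial seasonal indices: deviation of the first season from its mean.
    seasonal = [x - level for x in series[:season_len]]

    fitted = []
    for t, x in enumerate(series):
        s = seasonal[t % season_len]
        fitted.append(level + trend + s)  # one-step-ahead fit
        prev_level = level
        level = alpha * (x - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (x - level) + (1 - gamma) * s

    n = len(series)
    forecasts = [
        level + h * trend + seasonal[(n - 1 + h) % season_len]
        for h in range(1, horizon + 1)
    ]
    return fitted, forecasts


# Toy series with a repeating three-step pattern (hypothetical values).
fitted, forecasts = holt_winters_additive([10, 20, 30] * 3, 3, 0.5, 0.5, 0.5, 3)
```

In practice the three smoothing parameters would be tuned, for example by minimising the MAPE over the fitting period.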
3.1. The Model of FGM(0, m)
The GM(0, m) model has been widely used in recent years. It was used to analyze the influence factors for the construction of the model in vocational high school [26] and to analyze the consumer experience of small- and medium-sized enterprises in the creative living industry [27]. In this paper, the hybrid model of FGM(0, m) and SVR is applied to predict air pollutants. The FGM(0, m) model is introduced as follows:
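$X_1^{(0)} = \left(x_1^{(0)}(1), x_1^{(0)}(2), \ldots, x_1^{(0)}(n)\right)$ is the system characteristics sequence. $X_i^{(0)} = \left(x_i^{(0)}(1), x_i^{(0)}(2), \ldots, x_i^{(0)}(n)\right)$, $i = 2, 3, \ldots, m$, are the sequences of related factors.
Then, the $r$-order accumulation generating operator is
$$x_i^{(r)}(k) = \sum_{j=1}^{k} C_{k-j+r-1}^{k-j}\, x_i^{(0)}(j), \quad k = 1, 2, \ldots, n,$$
with $C_{r-1}^{0} = 1$, $C_{k}^{k+1} = 0$, and $C_{k-j+r-1}^{k-j} = \dfrac{(k-j+r-1)(k-j+r-2)\cdots(r+1)r}{(k-j)!}$. $X_i^{(r)} = \left(x_i^{(r)}(1), x_i^{(r)}(2), \ldots, x_i^{(r)}(n)\right)$ is the $r$-order accumulation sequence.
Therefore, the equation of the FGM(0, m) model is given by
$$x_1^{(r)}(k) = b_2 x_2^{(r)}(k) + b_3 x_3^{(r)}(k) + \cdots + b_m x_m^{(r)}(k) + a,$$
where $a, b_2, b_3, \ldots, b_m$ are model parameters.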
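As a rough illustration of how such a model might be fitted, the sketch below assumes the fractional accumulation commonly used in grey models, x_r(k) = sum over j <= k of C(k-j+r-1, k-j) x(j), and estimates the parameters by ordinary least squares on the accumulated data. The helper names (`frac_accumulate`, `fit_fgm0m`, `predict_fgm0m`) and the use of least squares via normal equations are assumptions for the sketch, not details taken from this paper.

```python
def frac_coeff(lag, r):
    """C(lag + r - 1, lag) = r (r + 1) ... (r + lag - 1) / lag!."""
    c = 1.0
    for i in range(1, lag + 1):
        c *= (r + i - 1) / i
    return c


def frac_accumulate(x, r):
    """r-order accumulation; r = 1 gives the ordinary cumulative sum."""
    return [sum(frac_coeff(k - j, r) * x[j] for j in range(k + 1))
            for k in range(len(x))]


def solve_linear(a, b):
    """Solve a p = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    p = [0.0] * n
    for i in reversed(range(n)):
        p[i] = (m[i][n] - sum(m[i][j] * p[j] for j in range(i + 1, n))) / m[i][i]
    return p


def fit_fgm0m(y, factors, r):
    """Least-squares estimate of [b_2, ..., b_m, a] on accumulated data."""
    y_r = frac_accumulate(y, r)
    cols = [frac_accumulate(f[:len(y)], r) for f in factors]
    rows = [[c[k] for c in cols] + [1.0] for k in range(len(y))]
    p = len(rows[0])
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(p)]
           for i in range(p)]
    aty = [sum(row[i] * y_r[k] for k, row in enumerate(rows)) for i in range(p)]
    return solve_linear(ata, aty)


def predict_fgm0m(params, factors, r):
    """Evaluate the model in the accumulated domain, then undo the accumulation."""
    cols = [frac_accumulate(f, r) for f in factors]
    y_r = [sum(b * c[k] for b, c in zip(params[:-1], cols)) + params[-1]
           for k in range(len(factors[0]))]
    return frac_accumulate(y_r, -r)  # order -r inverts order r


# Hypothetical factor sequences; the first four points are used for fitting.
x2 = [30, 42, 35, 50, 44, 39]
x3 = [80, 60, 95, 70, 88, 65]
y = [2 * u + 0.5 * v for u, v in zip(x2, x3)]
params = fit_fgm0m(y[:4], [x2, x3], 0.2)
pred = predict_fgm0m(params, [x2, x3], 0.2)
```

The last two entries of `pred` are out-of-sample values, which mirrors the fit-then-forecast split used later in the paper.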
3.2. The Model of SVR
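The SVR model uses kernel functions to map data that are not linearly separable into a high-dimensional feature space, in which a linear regression function is fitted. In this paper, the SVR model is used to learn from the historical data, and the model makes predictions according to the learning results. The training sample set is $\{(x_i, y_i)\}_{i=1}^{n}$.
$x_i$ represents the input data of SVR, and $y_i$ represents the output data of SVR. The function model of SVR is given by
$$f(x) = w^{T}\varphi(x) + b.$$
Among them, $f(x)$ represents the predicted value, $\varphi(x)$ represents a nonlinear mapping function, $w$ represents the weight vector, and $b$ is the bias. $w$ and $b$ are obtained from the following formula:
$$\min_{w,\, b,\, \xi,\, \xi^{*}} \ \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^{*}\right) \quad \text{s.t.} \quad \begin{cases} y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \\ w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\ \xi_i \ge 0, \ \xi_i^{*} \ge 0, \end{cases}$$
where $C$ and $\varepsilon$ represent the penalty coefficient and the maximum error of the insensitive loss function, respectively. $\xi_i$ and $\xi_i^{*}$ represent the relaxation coefficients. $n$ represents the sample size of the input data.
The weight vector can be expressed as follows:
$$w = \sum_{i=1}^{n}\left(\alpha_i - \alpha_i^{*}\right)\varphi(x_i),$$
where $\alpha_i$ and $\alpha_i^{*}$ represent the Lagrange coefficients. The mathematical model equation of SVR is given by
$$f(x) = \sum_{i=1}^{n}\left(\alpha_i - \alpha_i^{*}\right)K(x_i, x) + b,$$
where $K(x_i, x) = \varphi(x_i)^{T}\varphi(x)$ represents the kernel function, which calculates the inner product of two input vectors in the high-dimensional feature space. The sigmoid kernel function is used in this paper.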
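To make the prediction step of a kernel SVR concrete, the following sketch evaluates a trained model with a sigmoid kernel. It assumes the usual form K(x, z) = tanh(gamma * <x, z> + coef0), as used in LIBSVM-style toolkits; the names `gamma`, `coef0`, `dual_coefs` (the differences of the two Lagrange coefficients), and all numeric values are illustrative assumptions, and no training is performed here.

```python
import math


def sigmoid_kernel(x, z, gamma, coef0):
    """Sigmoid kernel K(x, z) = tanh(gamma * <x, z> + coef0)."""
    return math.tanh(gamma * sum(a * b for a, b in zip(x, z)) + coef0)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma, coef0):
    """f(x) = sum_i dual_coefs[i] * K(x_i, x) + bias."""
    return bias + sum(
        c * sigmoid_kernel(sv, x, gamma, coef0)
        for sv, c in zip(support_vectors, dual_coefs)
    )


def eps_insensitive_loss(y_true, y_pred, eps):
    """Errors smaller than eps are ignored; larger errors grow linearly."""
    return max(0.0, abs(y_true - y_pred) - eps)


# Hypothetical trained model with two support vectors.
support_vectors = [[0.2, 0.4], [0.9, 0.1]]
dual_coefs = [1.5, -0.7]
y_hat = svr_predict([0.5, 0.5], support_vectors, dual_coefs,
                    bias=0.3, gamma=0.02, coef0=0.0)
```

The epsilon-insensitive loss is what makes the fitted function depend only on a subset of the training points (the support vectors).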
3.3. The Basis of Model Weights
A robust short-term forecast of wind power generation under uncertainty was proposed through the statistical interpretation of multiple prediction models, together with a new method to determine the weights of the hybrid model [28]. In that method, the root mean square errors (RMSEs) of the models are used to determine the weights of the models. The RMSE is used as a performance index of the accuracy of the forecasting models because it is sensitive to large errors in prediction. In this paper, the APEs are used to determine the weights according to the same principle.
3.4. The Hybrid Model of FGM(0, m) and SVR
In this paper, the fitting values and predictive values obtained by the FGM(0, m) model and the SVR model are given different weights, and the weighted results are then summed to give the final result. The calculation process is as follows:
$$\mathrm{APE}_{F}(k) = \frac{\left|\hat{x}_{F}(k) - x(k)\right|}{x(k)} \times 100\%, \qquad \mathrm{APE}_{S}(k) = \frac{\left|\hat{x}_{S}(k) - x(k)\right|}{x(k)} \times 100\%,$$
where $\hat{x}_{F}(k)$ is the forecasting value of the FGM(0, m) model, $\hat{x}_{S}(k)$ is the forecasting result of the SVR model, and $x(k)$ is the actual value.
The weights of the two calculation results are divided, and then, the results are summed up:
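The APE is calculated between the final results and the actual values:
$$\mathrm{APE}(k) = \frac{\left|\hat{x}(k) - x(k)\right|}{x(k)} \times 100\%,$$
where $\hat{x}(k)$ is the final result of the hybrid model and $x(k)$ is the actual value.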
The MAPE from December 20 to 30 in 2018 is taken as the fitting error. The MAPE from December 31, 2018, to January 3, 2019, is taken as the predictive error. The process of modeling is shown in Figure 3.
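One common way to turn errors into combination weights is to weight each model in inverse proportion to its error, so that the model with the smaller APE contributes more. The sketch below illustrates that idea only; the exact weighting formula of this paper is not reproduced here, and the function names and the fallback to equal weights are assumptions.

```python
def ape(forecast, actual):
    """Absolute percentage error, in percent."""
    return abs(forecast - actual) / abs(actual) * 100.0


def inverse_ape_weights(ape_a, ape_b):
    """Weights inversely proportional to each model's APE (assumed scheme)."""
    total = ape_a + ape_b
    if total == 0:
        return 0.5, 0.5  # both models exact: fall back to equal weights
    return ape_b / total, ape_a / total


def combine(forecast_a, forecast_b, actual):
    """Weighted sum of two forecasts using inverse-APE weights."""
    w_a, w_b = inverse_ape_weights(ape(forecast_a, actual),
                                   ape(forecast_b, actual))
    return w_a * forecast_a + w_b * forecast_b


# Hypothetical values: model A overshoots, model B undershoots.
combined = combine(110.0, 80.0, 100.0)
```

Under this assumed scheme, errors of opposite sign largely cancel in the weighted sum, which is one reason a combination can be more accurate than either single model. Because the weights depend on the actual value, a practical forecast would have to take them from a period for which observations are available.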

4. The Calculation Process and Results in Shijiazhuang
In order to verify the accuracy of the hybrid model, the concentrations of three air pollutants (PM10, PM2.5, and NO2) in Shijiazhuang for the same period (December 20 to January 3) from 2014 to 2017, and from December 20 to December 30, 2018, are used as the samples for forecasting. The FGM(0, m) model and the SVR model are used to predict the values from December 31, 2018, to January 3, 2019. According to equations (11)-(12), the weights of the FGM(0, m) model and the SVR model are determined. Then, the calculated results of the FGM(0, m) model and the SVR model are multiplied by these weights, respectively. According to equation (13), the fitting and forecasting values of the hybrid model can be obtained. The accuracy of the hybrid model is assessed by calculating the MAPE between the final results and the actual values for the three models. The original data of the air pollutants are obtained from the website (http://www.airitilibrary.com/Publication/alDetailedMesh?docid=10289488-201903-201903060025-201903060025-s19-28). Eight and seventeen monitoring stations have been set up in Shijiazhuang and Chongqing, respectively. The 24-hour air pollutant concentration is obtained as the mean of the pollutant concentrations at the monitoring stations over a natural day (24 hours).
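The averaging of station readings into a single 24-hour city value can be sketched as follows. The input layout (a dictionary mapping a station name to its hourly readings), the station names, and the handling of missing readings are assumptions made for illustration; the paper does not describe them.

```python
def daily_city_mean(station_hourly):
    """Average each station over the day, then average across stations.

    station_hourly maps a station name to a list of hourly readings for
    one natural day; None marks a missing reading (assumed convention).
    """
    station_means = []
    for readings in station_hourly.values():
        valid = [v for v in readings if v is not None]
        if valid:  # skip stations with no valid readings that day
            station_means.append(sum(valid) / len(valid))
    if not station_means:
        raise ValueError("no valid readings for this day")
    return sum(station_means) / len(station_means)


# Two hypothetical stations with constant readings.
city_value = daily_city_mean({"station_a": [10.0] * 24,
                              "station_b": [30.0] * 24})
```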
4.1. Calculation Results of PM10
4.1.1. Calculation Result of PM10 by Using FGM(0, m) Model
Air quality has attracted more and more attention from the government in recent years. Meteorological conditions (favourable or unfavourable to pollutant dispersion) affect air quality, and because the meteorological indicators in the same historical period are similar, similar meteorological conditions generally affect air quality in a similar way. The governance measures and efforts of the government in the same region also differ little from year to year, which explains why the air quality in the same region is similar. The data for the same period (December 20 to January 3) from 2014 to 2017 are used as the independent variables (the sequences of related factors), and the data from December 20, 2018, to January 3, 2019, are used as the dependent variable (the system characteristics sequence). The data from December 20 to December 30, 2018, are fitted, and the data from December 31, 2018, to January 3, 2019, are forecasted. The calculation process of PM10 in Shijiazhuang is taken as an example, and the original data of PM10 in Shijiazhuang are shown in Table 3.
FGM(0, m) model is used to calculate the fitting values from December 20 to December 30, 2018, and the values from December 31, 2018, to January 3, 2019, are predicted. The calculation process is shown as follows.
A FGM(0, 4) model is established:is the system characteristics sequence, andare the sequences of related factors.
When $r = 0.2$, the MAPE of the FGM(0, m) model is the smallest after repeated experiments. Then, 0.2-order accumulation is applied to the system characteristic sequence and the relevant factor sequences, respectively; then
Then,
Thus,
Therefore, the estimation equation of FGM(0, 4) is given by
Therefore, the fitting values for PM10 from December 20 to December 30, 2018, and the forecasting values from December 31, 2018, to January 3, 2019, are shown in Table 4.
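To make the 0.2-order accumulation used in this subsection concrete, the worked example below lists the first few accumulation coefficients. It assumes the fractional accumulation commonly used in grey models, in which the value at lag $k-j$ is weighted by $C_{k-j+r-1}^{k-j} = r(r+1)\cdots(r+k-j-1)/(k-j)!$; the symbols are introduced here for illustration.

```latex
% Coefficients C_{k-j+r-1}^{k-j} for r = 0.2, indexed by the lag k - j
\begin{aligned}
\text{lag } 0 &: \ 1, \\
\text{lag } 1 &: \ r = 0.2, \\
\text{lag } 2 &: \ \frac{r(r+1)}{2!} = \frac{0.2 \times 1.2}{2} = 0.12, \\
\text{lag } 3 &: \ \frac{r(r+1)(r+2)}{3!} = \frac{0.2 \times 1.2 \times 2.2}{6} = 0.088,
\end{aligned}
\qquad
x^{(0.2)}(4) = x^{(0)}(4) + 0.2\,x^{(0)}(3) + 0.12\,x^{(0)}(2) + 0.088\,x^{(0)}(1).
```

The coefficients decay with the lag, so a small order such as 0.2 gives the most recent observations far more weight than older ones, in contrast to first-order accumulation, which weights all past observations equally.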
4.1.2. Calculation Result of PM10 by Using SVR Model
In this study, the firefly optimization algorithm and the support vector regression model are implemented by using the toolkit in Matlab. Two parameters of the SVR method are optimized. In the selection process, the optimal result is the one for which the MAPE between the actual values and the fitting values is the smallest. In this paper, the optimal values of the two parameters are 3.2103 and 0.023268, respectively. The sigmoid kernel function is used for learning. The optimal calculation results are obtained after the procedure is run 50 times.
SVR is used to calculate the fitting values from December 20, 2018, to December 30, 2018, and the values from December 31, 2018, to January 3, 2019, are forecasted. In this paper, the data from December 20 to December 30 in 2014–2017 were used as the training set. Meanwhile, the data from December 31 to January 3 of the following year in 2014–2017 were used as the testing set. The MAPE was used as the standard to measure the effect of fitting. Taking the concentration of PM10 in Shijiazhuang as an example, the MAPE between the fitting values and the original data is 0.48%. It can be seen that SVR has a good fitting effect and can be used for the forecasting of air pollutants. The calculation results are shown in Table 5.
4.1.3. Calculation Result of PM10 by Using the Hybrid Model
According to equations (11)-(12), the weights are given to the calculation results obtained by the FGM(0, m) model and SVR model, respectively:
Thus,
Then, the predicted value of the hybrid model is given by
The APE of the hybrid model is given by
Similarly, the fitting values of the hybrid model from December 20 to December 30, 2018, and the predicted values from December 31, 2018, to January 3, 2019, are shown in Table 5. Meanwhile, the comparison results between the hybrid model and the Holt–Winters model are shown in Table 6.
The fitting and prediction MAPEs of FGM(0, m) are 38.4% and 17.7%, respectively, and those of SVR are 13.2% and 26.6%, respectively. In contrast, the fitting and prediction MAPEs of the hybrid model are 12.3% and 4.4%, respectively. Although the fitting accuracy is improved only slightly, the predictive accuracy is improved clearly, so the prediction effect of the hybrid model is better than that of either single model. The fitting and prediction accuracies of the hybrid model are 22.1% and 37.1% higher than those of the Holt–Winters model, respectively. This indicates that the fitting and prediction accuracies of the hybrid model are significantly higher than those of the Holt–Winters model.
4.2. Calculation Result of Hybrid Model for PM2.5
It can be seen from Table 7 that the fitting and prediction MAPEs of the hybrid model for PM2.5 are 14.1% and 13.7%, respectively. Both values are below 20%, which indicates that the fitting accuracy and prediction accuracy are good.
4.3. Calculation Results of NO2 by Using the Hybrid Model
According to the calculation results in Table 8, the fitting and prediction MAPEs of the hybrid model for NO2 are 12.4% and 2.3%, respectively. Both MAPEs are small, which indicates that both the fitting and prediction accuracies of the hybrid model are high.
5. Calculation Results of the Air Pollutants in Chongqing
Similarly, three pollutants of Chongqing (PM10, PM2.5, and NO2) from December 20 to January 1 between 2014 and 2017, and December 20 to December 30, 2018, are used as samples for forecasting.
5.1. Calculation Result of PM10 by Using Hybrid Model
In this paper, three pollutants (PM10, PM2.5, and NO2) are predicted by using the hybrid model. The same method is used to verify that the hybrid model has higher accuracy than the FGM(0, m) model and SVR model. The FGM(0, m) model and SVR model are used to calculate the fitting values from December 20 to December 30, 2018, and the predictive values from December 31, 2018, to January 1, 2019, respectively. The calculation results of the two models are multiplied by their weights, respectively. The results are shown in Table 9.
The MAPE of fitting and forecasting is 22.6% and 1.9%, respectively. The MAPE of the forecasting values is small, which indicates that the prediction accuracy of the hybrid model is high.
5.2. Calculation Result of PM2.5 by Using Hybrid Model
It can be seen from Table 10 that, for the hybrid model of FGM(0, m) model and SVR model, the MAPE of fitting and forecasting is 9.5% and 3.4%, respectively. The MAPE of the fitting and forecasting values is small, and it can be seen that the precision of the hybrid model has been significantly improved.
5.3. Calculation Result of NO2 by Using Hybrid Model
The hybrid model of the FGM(0, m) model and the SVR model is used to calculate the fitting values from December 20 to December 30, 2018, and the forecasting values from December 31, 2018, to January 3, 2019, respectively. As shown in Table 11, the fitting and prediction MAPEs are 4.7% and 8.4%, respectively. These small errors indicate that the fitting and prediction accuracies of the hybrid model are high. The error levels of fitting and prediction are shown in Table 12.
According to the criteria mentioned above, the prediction MAPEs of two air pollutants (PM10 and NO2) in Shijiazhuang are 4.4% and 2.3%, respectively. The prediction MAPEs of the three air pollutants (PM10, PM2.5, and NO2) in Chongqing are 1.9%, 3.4%, and 8.4%, respectively. Although the prediction MAPE of PM2.5 in Shijiazhuang is 13.7%, it is still within the range of good accuracy (below 20%). This indicates that the prediction accuracy of the hybrid model is relatively high.
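The MAPE criterion applied throughout can be written as a short helper. The 20% threshold is the one stated in this paper (a MAPE of 0%–20% indicates good accuracy); the function names and the sample values are illustrative assumptions.

```python
def mape(forecasts, actuals):
    """Mean absolute percentage error, in percent."""
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(f - a) / abs(a) for f, a in zip(forecasts, actuals))
    return total / len(actuals) * 100.0


def is_good_accuracy(mape_percent, threshold=20.0):
    """True when the MAPE lies in the 0%-20% range regarded as good."""
    return 0.0 <= mape_percent <= threshold


# Hypothetical forecasts and observations for two days.
error = mape([110.0, 90.0], [100.0, 100.0])
```

For instance, the prediction MAPE of 13.7% reported for PM2.5 in Shijiazhuang passes this check, whereas the fitting MAPE of 22.6% reported for PM10 in Chongqing does not.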
6. Conclusions and Suggestions
6.1. Suggestion for Shijiazhuang
In 2018, Shijiazhuang ranked 168th among 169 key cities with poor air quality in China, but this year, it plans to move out of the last 10. The plan represents an effort to improve the atmosphere. In view of the specific situation of Shijiazhuang, the following suggestions are given:
(1) Firstly, as a city with a relatively dense population, Shijiazhuang should actively reduce the burden on the city and guide the transfer of population to the surrounding counties and cities. This will not only reduce the pressure on the city but also reduce the domestic, traffic, and environmental pollution of Shijiazhuang.
(2) Secondly, the transformation of the industrial structure should be accelerated, and the shift of urban development from heavy to light industry should be promoted. In addition, “reducing the weight” of the city should be taken seriously. The government should actively guide the transformation from high-emission industries to emerging industries and environmental protection industries.
6.2. Suggestion for Chongqing
Chongqing is a city with developed tourism and economy, and the annual pollution brought by tourism cannot be underestimated. Therefore, it is necessary to reduce the content of the pollutants in the air. The following suggestions are given:
(1) Firstly, the main sources of pollution, such as dust pollution, coal pollution, and industrial pollution, should be cleaned up, because they increase the pollutant content in the air and eventually lead to a decline in air quality.
(2) Secondly, because of the developed tourism in Chongqing, the traffic pollution caused by the huge population flow cannot be ignored. In order to reduce air pollution, citizens and visitors should be encouraged to take public transportation instead of private cars.
7. Conclusions
The M4 competition in 2019 demonstrated that hybrid models are of great research significance and practicability, and the hybrid models with good prediction performance in M4 are combinations of an artificial intelligence algorithm and a statistical algorithm. In this paper, it is shown that the forecasting effect of the hybrid model (FGM(0, m)-SVR) is better. The hybrid model is used to predict three air pollutants (PM10, PM2.5, and NO2) in Shijiazhuang and Chongqing, and it is shown that the prediction accuracy of the hybrid model is significantly higher than that of the single models. The hybrid model can also be used to predict other air pollutants in other cities.
Data Availability
The data used in this study can be accessed via the website of Shijiazhuang Environmental Protection Bureau (http://www.sjzhb.gov.cn) and Chongqing Environmental Protection Bureau (http://www.cepb.gov.cn/).
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The research in this paper is supported by the National Natural Science Foundation of China (71871084), the Excellent Young Scientist Foundation of Hebei Education Department (SLRC2019001), and the Project of High-Level Talent in Hebei Province.