Abstract
A regional ecological risk model is built to predict the regional ecological risk level more accurately by combining principal component analysis with an improved version of the standard BP neural network. Taking Xiangxi Tujia and Miao Autonomous Prefecture as an example, twelve primary factors affecting regional risk are selected. The sample data are processed by principal component analysis. The resulting principal components are then used as the input factors of the improved BP neural network, and the level of ecological risk is used as the output factor. The results indicate that the error between the expected output and the actual output is 4.36% in 2016, 1.08% in 2017, and 5.18% in 2018, all within 6%. Compared with the standard BP neural network without principal component analysis, the improved BP neural network with principal component analysis predicts with markedly higher accuracy. This combined prediction model provides a better method for predicting the ecological risk level.
1. Introduction
Like political, economic, and military security, regional ecological security is an important part of national security. Accurate prediction of regional ecological risk is the key to maintaining regional ecological security. Before the ecological environment deteriorates, the ecological risk level should be predicted accurately, effective measures should be taken to control the risk, and the regional ecological system should be guided back into a virtuous circle. Regional ecological risk prediction is a complex systematic project. The prediction methods are various, and the evaluation indexes differ dramatically. In recent years, scholars have put forward many prediction methods, including the fuzzy matter element method [1, 2], the artificial neural network method [3, 4], the grey sequence model [5], and the probabilistic method [6]. These methods mainly focus on a subset of the evaluation indexes in the process of ecosystem evolution. Their precision is not very high, so the results cannot precisely reflect the actual situation. When the standard backpropagation (BP) neural network method is applied to the prediction of regional ecological risk level, it ignores the correlation among the input variables, which may lead to large prediction errors. Besides, because of the excessive input data, the efficiency of the standard BP neural network method is also noticeably reduced [7]. In view of these disadvantages, to predict the regional ecological risk level more accurately, this paper builds a model that combines the principal component analysis method with an improved BP neural network. The principal components of the original sample data are extracted with the SPSS software. These independent principal components summarize most of the information in the raw data and are used as the input factors for the improved BP neural network. In this way, the efficiency of the model is greatly improved, which in turn increases the prediction accuracy of regional ecological risk.
2. Prediction Model of Regional Ecological Risk
2.1. Basic Principle of the Principal Component Analysis Method
Principal component analysis (PCA) is a dimensionality reduction method [8]. It transforms multiple indexes into a small number of representative indexes with little loss of information. The mathematical model of PCA is as follows [9–12].
Suppose a set of variables X = {X1, X2, …, Xn} is used to describe the research subjects. If there are m evaluation subjects, the sample matrix can be built as follows:
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}, \tag{1}$$
where $x_{ij}$ is the value of the j-th index for the i-th evaluation subject.
The original index data should be standardized owing to the differences in dimensions and orders of magnitude. The standardized matrix can thereafter be built from
$$x^{*}_{ij} = \frac{x_{ij} - \bar{x}_{j}}{s_{j}}, \tag{2}$$
where $\bar{x}_{j}$ and $s_{j}$ are the mean and the standard deviation of the j-th index.
According to formula (2), the correlation coefficient Rij between variables i and j can be calculated as $R_{ij} = \frac{1}{m-1}\sum_{k=1}^{m} x^{*}_{ki}\, x^{*}_{kj}$, and the correlation matrix R (the covariance matrix of the standardized variables) can be established.
If the absolute value of Rij is large, the correlation between the two variables is high and PCA should be conducted.
Based on the matrix R, the eigenvalues, the contribution rate of each principal component, and the cumulative variance contribution rate can be calculated, and the number of principal components can be determined. The loading matrix of the initial factors is then established and used to interpret the principal components. Let μ denote the mean of the random vector X. The principal components are obtained by a linear transformation of X and are mutually uncorrelated linear combinations of the initial variables:
$$F_{i} = a_{i1}X_{1} + a_{i2}X_{2} + \cdots + a_{in}X_{n}, \quad i = 1, 2, \ldots, k, \tag{3}$$
where $X_{j}$ is the standardized j-th variable, $a_{ij}$ is the coefficient of the j-th variable in the i-th principal component, and k ≤ n is the number of principal components retained.
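The standardization and correlation steps described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the SPSS procedure used in the paper: the function names and the small sample matrix are hypothetical, and the sample standard deviation (divisor m − 1) is assumed.

```python
from math import sqrt

def standardize(samples):
    """Column-wise z-scores: (x - column mean) / sample standard deviation."""
    m = len(samples)
    n = len(samples[0])
    means = [sum(row[j] for row in samples) / m for j in range(n)]
    stds = [
        sqrt(sum((row[j] - means[j]) ** 2 for row in samples) / (m - 1))
        for j in range(n)
    ]
    return [[(row[j] - means[j]) / stds[j] for j in range(n)] for row in samples]

def correlation_matrix(z):
    """R[i][j] = sum over samples of z[k][i] * z[k][j], divided by (m - 1)."""
    m = len(z)
    n = len(z[0])
    return [
        [sum(row[i] * row[j] for row in z) / (m - 1) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical matrix: 4 evaluation subjects (rows) and 3 indexes (columns).
# These numbers are illustrative only and are not taken from the paper.
samples = [
    [1.0, 10.0, 8.0],
    [2.0, 20.0, 6.0],
    [3.0, 30.0, 5.0],
    [4.0, 40.0, 1.0],
]
R = correlation_matrix(standardize(samples))
for row in R:
    print([round(value, 3) for value in row])
```

In this toy matrix the second column is an exact multiple of the first, so their correlation coefficient is 1. Redundancy of that kind is what PCA is meant to remove before the data reach the network.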
2.2. Improved Backpropagation (BP) Neural Network
The backpropagation (BP) neural network is a multilayer feedforward network trained by the error backpropagation algorithm [13]. In the forward propagation process, the input information is processed by the input layer and the hidden layer, and the actual output of each neuron is calculated. If the actual output in the output layer does not match the expected output, the output error is propagated backward through the hidden layer. At the same time, the error is apportioned among all the units in the hidden layer, which yields the error signal of each layer. Based on this error signal, the weight of each unit is corrected. Forward propagation of information and backpropagation of error alternate in a continuous cycle, which stops when the squared error of the network reaches its minimum [14]. The standard backpropagation algorithm is widely used [15–17]. However, it has some shortcomings, such as long training time and slow convergence.
The Levenberg–Marquardt (L-M) algorithm is specifically designed to minimize the squared error [18]. Essentially, it combines the gradient descent method with the Newton method. It can shorten the training time of the neural network, accelerate the convergence of the network, and obtain accurate prediction results. The squared error is
$$E = \frac{1}{2}\sum_{p}\left(\varepsilon^{p}\right)^{2} = \frac{1}{2}\lVert\varepsilon\rVert^{2}, \tag{4}$$
where p indexes the samples and ε is the vector whose elements are $\varepsilon^{p}$. Suppose the current location in weight space is ω0 and it moves to a new location ω1. If the movement ω1 − ω0 is small, ε can be expanded into a first-order Taylor series:
$$\varepsilon(\omega_{1}) = \varepsilon(\omega_{0}) + Z(\omega_{1} - \omega_{0}), \tag{5}$$
where the elements of Z are
$$Z_{pi} = \frac{\partial \varepsilon^{p}}{\partial (\omega)_{i}}, \tag{6}$$
with $(\omega)_{i}$ the i-th weight, and the error function can be written in the following form:
$$E = \frac{1}{2}\left\lVert \varepsilon(\omega_{0}) + Z(\omega_{1} - \omega_{0}) \right\rVert^{2}. \tag{7}$$
To find the minimum of E, its derivative with respect to ω1 is set to zero, which gives
$$\omega_{1} = \omega_{0} - \left(Z^{T}Z\right)^{-1}Z^{T}\varepsilon(\omega_{0}). \tag{8}$$
Since the step given by formula (8) may be too long for the linear approximation to remain valid, the squared error is corrected by adding a penalty on the step length:
$$E = \frac{1}{2}\left\lVert \varepsilon(\omega_{0}) + Z(\omega_{1} - \omega_{0}) \right\rVert^{2} + \frac{\lambda}{2}\left\lVert \omega_{1} - \omega_{0} \right\rVert^{2}. \tag{9}$$
The value of ω1 that minimizes the corrected error is
$$\omega_{1} = \omega_{0} - \left(Z^{T}Z + \lambda I\right)^{-1}Z^{T}\varepsilon(\omega_{0}). \tag{10}$$
When λ is very small, formula (10) becomes the Newton method (in its Gauss–Newton form, formula (8)). When λ is very large, it becomes the gradient descent method with step length 1/λ. In the process of calculation, λ should be adjusted according to the actual situation. A frequently used method is as follows. In the beginning, λ is selected arbitrarily, and the change of E is examined at each step. If the error declines after using formula (10), ω1 is retained, λ is reduced (for example, tenfold), and the step is repeated. If the error increases, ω0 is kept, λ is increased tenfold, and ω1 is recalculated. This process repeats until E reaches the required precision [19].
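The damping schedule described above (keep the new weights and shrink the damping parameter when the error falls, otherwise restore the old weights and enlarge it tenfold) can be sketched as follows. This is a toy illustration under stated assumptions, not the network training used in the paper. It fits a hypothetical two-parameter curve y = a·exp(b·x) instead of a neural network, takes the residual as model output minus target, calls the damping parameter `lam`, and assumes a tenfold reduction on success.

```python
from math import exp

def residuals(w, xs, ys):
    """Residual vector: model output minus target for every sample."""
    a, b = w
    return [a * exp(b * x) - y for x, y in zip(xs, ys)]

def jacobian(w, xs):
    """Rows hold the derivatives of each residual with respect to a and b."""
    a, b = w
    return [[exp(b * x), a * x * exp(b * x)] for x in xs]

def squared_error(eps):
    return 0.5 * sum(e * e for e in eps)

def lm_step(w, lam, xs, ys):
    """Solve (Z^T Z + lam * I) d = -Z^T eps for two parameters (Cramer's rule)."""
    eps = residuals(w, xs, ys)
    Z = jacobian(w, xs)
    a11 = sum(r[0] * r[0] for r in Z) + lam
    a12 = sum(r[0] * r[1] for r in Z)
    a22 = sum(r[1] * r[1] for r in Z) + lam
    g1 = sum(r[0] * e for r, e in zip(Z, eps))
    g2 = sum(r[1] * e for r, e in zip(Z, eps))
    det = a11 * a22 - a12 * a12
    d1 = (-g1 * a22 + g2 * a12) / det
    d2 = (-g2 * a11 + g1 * a12) / det
    return [w[0] + d1, w[1] + d2]

def fit(xs, ys, w0, lam=0.01, tol=1e-12, max_iter=500):
    w = list(w0)
    err = squared_error(residuals(w, xs, ys))
    for _ in range(max_iter):
        if err < tol:
            break
        trial = lm_step(w, lam, xs, ys)
        trial_err = squared_error(residuals(trial, xs, ys))
        if trial_err < err:
            # Error declined: keep the new weights and reduce the damping.
            w, err = trial, trial_err
            lam /= 10.0
        else:
            # Error increased: keep the old weights and raise the damping tenfold.
            lam *= 10.0
    return w, err

# Hypothetical noise-free data generated from a = 2.0, b = 0.8.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * exp(0.8 * x) for x in xs]
w_fit, final_err = fit(xs, ys, w0=[1.0, 0.1])
print(w_fit, final_err)
```

From this starting point the first few trial steps overshoot and are rejected, so the damping grows until a short, gradient-like step is accepted. After that it shrinks again and the iteration behaves like the Newton-type update.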
2.3. Prediction Model Based on the PCA Method and Improved BP Neural Network
The prediction model of regional ecological risk is built by combining the PCA method with the improved BP neural network. Firstly, the original data related to ecological risk are collected and subjected to correlation analysis in the SPSS software. Secondly, after the original data are standardized in SPSS, the principal components (X1, X2, …, Xk), which contain the vast majority of the information in the raw data, are extracted by PCA. Lastly, the principal components (X1, X2, …, Xk) are used as the input factors of the improved BP neural network, and Y is used as the output factor. During this process, the PCA method transforms correlated input variables into uncorrelated ones. In this way, the model reduces the dimensions of the data and the number of input factors for the improved BP neural network. Compared with the standard BP neural network, the improved BP neural network uses a different training algorithm, which shortens the training time, accelerates convergence, and increases the prediction accuracy. In summary, this prediction model makes full use of the advantages of the two methods and can effectively solve the classification problems in regional ecological risk assessment. Its structure is shown in Figure 1.

3. Case Study
Taking Xiangxi Tujia and Miao Autonomous Prefecture as an example, the ecological risk level in this area is predicted by the PCA method and the improved BP neural network. Twelve factors affecting regional ecological risk are selected [20–24]: population density (I1), pesticide usage per hectare of cultivated land (I2), fertilizer usage per hectare of cultivated land (I3), volume of wastewater discharged per ten thousand yuan of industrial output (I4), volume of solid waste produced per ten thousand yuan of industrial output (I5), domestic sewage discharged per capita (I6), energy consumption per ten thousand yuan of GDP (I7), water consumption per ten thousand yuan of industrial output (I8), the proportion of environmental investment in gross fixed asset formation (I9), the standard discharge rate of industrial wastewater (I10), the comprehensive utilization of solid waste (I11), and the repeated utilization rate of industrial water (I12). The data come from statistical materials on Xiangxi Tujia and Miao Autonomous Prefecture, which include the Xiangxi Statistical Yearbook (2009–2018), the Twelfth Five-Year Plan of Xiangxi, and the Xiangxi statistical information network. Specific data are shown in Table 1. Based on the twelve evaluation indexes, the regional ecological risk level is calculated by using the variable weight method and the grey correlation theory [24]. The evaluation results are also shown in Table 1. The numbers 1, 2, 3, 4, and 5 represent ecological risk levels I, II, III, IV, and V, which indicate great risk, large risk, normal risk, small risk, and no risk, respectively. The characteristics of each ecological risk level are presented in Table 2.
3.1. Correlation Analysis
To prevent collinearity among the factors, which may cause errors in the grading results, the data shown in Table 1 are subjected to correlation analysis in the SPSS software. The Pearson correlation coefficient is calculated for each pair of factors, and significance is tested with a two-tailed test. Based on these results, the Pearson correlation coefficient matrix is established (Table 3). The results show that there is obvious collinearity among population density, pesticide usage per hectare of cultivated land, domestic sewage discharged per capita, energy consumption per ten thousand yuan of GDP, the standard discharge rate of industrial wastewater, and the repeated utilization rate of industrial water. Therefore, it is necessary to conduct PCA.
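A Pearson coefficient with a two-tailed significance check can be sketched as below. The sketch assumes the usual t test with n − 2 degrees of freedom and a 5% significance level. The two ten-year series are hypothetical stand-ins for a pair of indexes (ten annual observations, matching 2009–2018) and are not values from Table 1.

```python
from math import sqrt

# Two-tailed 5% critical value of Student's t with 8 degrees of freedom
# (n = 10 annual observations, so n - 2 = 8).
T_CRIT_DF8 = 2.306

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def t_statistic(r, n):
    """Test statistic for H0: no linear correlation (requires |r| < 1)."""
    return r * sqrt((n - 2) / (1.0 - r * r))

# Hypothetical series for two indexes over ten years.
population_density = [100, 102, 104, 107, 109, 112, 114, 117, 120, 123]
sewage_per_capita = [30, 31, 31, 33, 34, 34, 36, 37, 38, 40]

r = pearson_r(population_density, sewage_per_capita)
t = t_statistic(r, len(population_density))
significant = abs(t) > T_CRIT_DF8
print(round(r, 3), round(t, 2), significant)
```

With ten observations, a coefficient needs an absolute value of roughly 0.63 or more to be significant at this level. Pairs that clear the threshold by a wide margin are the ones that motivate the PCA step.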
3.2. Principal Component Analysis
The original data are standardized by SPSS software, and the results are shown in Table 4.
The data shown in Table 4 are analyzed by the PCA procedure in the SPSS software, which yields the scree plot (Figure 2), the list of principal components (Table 5), and the loading matrix of the principal components (Table 6). Figure 2 indicates that the difference in eigenvalue between Component 1 and Component 2 is relatively large, whereas the differences among the other components are small. It can be preliminarily determined that the first two components capture the vast majority of the information.

Table 5 shows that the eigenvalues of the first two components are both greater than 1 and that together they explain 85.678% of the total variation. This satisfies the usual requirement that the retained principal components account for at least 75%–85% of the total variance. Therefore, the first two components are selected as the principal components, which can replace the original variables.
Table 6 shows the correlation coefficients between the original variables and the principal components, that is, the loadings of the two components F1 and F2 on each original variable. According to formula (3), the factor expressions for the principal components can be described as follows:
F1 = 0.830X1 − 0.986X2 − 0.853X3 + 0.209X4 + 0.925X5 + 0.714X6 − 0.913X7 + 0.730X8 − 0.591X9 + 0.774X10 − 0.655X11 + 0.522X12
F2 = 0.461X1 + 0.055X2 + 0.356X3 − 0.871X4 − 0.273X5 + 0.536X6 − 0.366X7 − 0.513X8 + 0.512X9 + 0.521X10 + 0.676X11 + 0.785X12
Based on the above factor expressions, the principal components of the standardized data can be calculated, which should be used as the input data for the improved BP neural network. The results are shown in Table 7.
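As a consistency check on the coefficients in the factor expressions above, the eigenvalue of each component can be recovered as the sum of its squared loadings. The contribution rates then follow by dividing by the number of standardized variables (twelve). The sketch below assumes that the printed coefficients are the Table 6 loadings, meaning correlations between variables and components. The variable and function names are illustrative.

```python
# Coefficients of F1 and F2 as printed in the factor expressions (X1..X12).
loadings_f1 = [0.830, -0.986, -0.853, 0.209, 0.925, 0.714,
               -0.913, 0.730, -0.591, 0.774, -0.655, 0.522]
loadings_f2 = [0.461, 0.055, 0.356, -0.871, -0.273, 0.536,
               -0.366, -0.513, 0.512, 0.521, 0.676, 0.785]

def eigenvalue_from_loadings(loadings):
    """Sum of squared loadings = variance explained by the component."""
    return sum(a * a for a in loadings)

n_vars = len(loadings_f1)  # total variance of 12 standardized variables is 12
eig1 = eigenvalue_from_loadings(loadings_f1)
eig2 = eigenvalue_from_loadings(loadings_f2)

contribution = [eig1 / n_vars, eig2 / n_vars]
cumulative = sum(contribution)

print("eigenvalues:", round(eig1, 3), round(eig2, 3))
print("contribution rates (%):", [round(100 * c, 2) for c in contribution])
print("cumulative contribution (%):", round(100 * cumulative, 2))
```

Both recovered eigenvalues exceed 1, and the cumulative contribution comes to about 85.67%. That agrees with the 85.678% reported from Table 5 to within the rounding of the three-decimal loadings.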
3.3. Training and Prediction of Improved BP Neural Network
In the improved BP neural network, the principal components F1 and F2 are used as the input factors, and the regional ecological risk level R is used as the output factor. The model is established in the Matlab software. The data in Table 7 are divided into two subsets: the training samples (2009–2015) and the prediction samples (2016–2018). In constructing the improved BP neural network, the learning rate is set to 0.9 and the momentum factor to 0.7. The network structure finally obtained through training has two input nodes, ten hidden layer nodes, and one output node. The training process of the standard BP neural network without PCA is shown in Figure 3, while the training process of the improved BP neural network with PCA is shown in Figure 4. These two figures show that the improved BP neural network with PCA needs clearly fewer learning steps and trains significantly faster.
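The 2-10-1 structure and the year-based split can be sketched as follows. The paper does not state the activation functions or the weight initialization, so the tanh hidden units, the linear output unit, the uniform initial weights, and the sample input pair are all assumptions made for illustration. The sketch shows only the forward pass, not the training in Matlab.

```python
import math
import random

TRAIN_YEARS = list(range(2009, 2016))    # 2009-2015, training samples
PREDICT_YEARS = list(range(2016, 2019))  # 2016-2018, prediction samples

def init_network(n_in=2, n_hidden=10, n_out=1, seed=0):
    """Randomly initialised weights; the ranges are an arbitrary assumption."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                  for _ in range(n_out)],
        "b_out": [0.0] * n_out,
    }

def forward(net, inputs):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(net["w_hidden"], net["b_hidden"])
    ]
    return [
        sum(w * h for w, h in zip(weights, hidden)) + bias
        for weights, bias in zip(net["w_out"], net["b_out"])
    ]

def count_parameters(net):
    """Total number of adjustable weights and biases."""
    return (sum(len(ws) for ws in net["w_hidden"]) + len(net["b_hidden"])
            + sum(len(ws) for ws in net["w_out"]) + len(net["b_out"]))

net = init_network()
prediction = forward(net, [0.5, -1.2])  # hypothetical (F1, F2) pair
print(count_parameters(net), len(TRAIN_YEARS), len(PREDICT_YEARS), prediction)
```

A network of this shape has 41 adjustable weights and biases, which are fitted on the seven training years and evaluated on the three prediction years.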


The predictions are shown in Table 8. From 2016 to 2018, the ecological risk levels of Xiangxi Tujia and Miao Autonomous Prefecture are III, III, and IV, respectively. The relative error between the actual output and the desired output of the improved BP neural network with PCA is less than 6%, whereas that of the standard BP neural network without PCA is greater than 9%. Compared with the standard BP neural network without PCA, the improved BP neural network with PCA therefore predicts with markedly higher accuracy.
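The relative error used in this comparison can be computed as sketched below. It is assumed here that the error is taken relative to the desired (expected) output. The three pairs are hypothetical and are not the values of Table 8.

```python
def relative_error_percent(expected, actual):
    """Absolute deviation from the expected output, as a percentage."""
    return abs(actual - expected) / abs(expected) * 100.0

def within_tolerance(pairs, limit_percent):
    """True if every (expected, actual) pair stays within the error limit."""
    return all(relative_error_percent(e, a) <= limit_percent for e, a in pairs)

# Hypothetical (expected risk level, network output) pairs for three years.
pairs = [(3.0, 3.12), (3.0, 2.97), (4.0, 4.2)]
errors = [relative_error_percent(e, a) for e, a in pairs]

print([round(err, 2) for err in errors])
print(within_tolerance(pairs, 6.0))
```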
4. Conclusions
In this paper, twelve factors affecting regional ecological risk are selected. The principal components of the original sample data are extracted with the SPSS software. In this way, the correlation between different indexes is eliminated, and the number of input variables of the neural network is reduced. The improved BP neural network is then used to predict the regional ecological risk level, which speeds up training and improves the prediction accuracy.
The relative error between the actual output and the desired output brought by improved BP neural network with PCA is 4.36%, 1.08%, and 5.18%, respectively, all controlled within 6%. Compared with the prediction accuracy of standard BP neural network without PCA, the prediction accuracy of improved BP neural network with PCA is obviously improved.
Based on the prediction model combining the principal components analysis method with improved BP neural network, the ecological risk level in Xiangxi Tujia and Miao Autonomous Prefecture can be predicted. The predicted results are consistent with the expected output of the network. It shows that the prediction model is reasonable and feasible and is a better solution for regional ecological risk prediction.
Data Availability
The data come from statistical materials on Xiangxi Tujia and Miao Autonomous Prefecture, which include the Xiangxi Statistical Yearbook (2009–2018), the Twelfth Five-Year Plan of Xiangxi, and the Xiangxi statistical information network.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This research was funded by the National Social Science Foundation of China (18BJY057).