Abstract
The BP neural network (BPNN) is widely used because of its good generalization and robustness, but it cannot automatically optimize its input variables. To address this problem, this study uses grey relational analysis to rank the importance of the input variables, obtains the key variables and the best BPNN structure through repeated training of BPNN models, and proposes a variable optimization selection algorithm that combines grey relational analysis with the BP neural network. The values of the key variables predicted by the metabolic GM (1, 1) model are then used as inputs to the best BPNN model, which yields a grey BP neural network prediction model (GR-BPNN). The long short-term memory neural network (LSTM), convolutional neural network (CNN), traditional BP neural network (BP), GM (1, N) model, and stepwise regression (SR) are implemented as benchmark models to assess the superiority and applicability of the new model. Finally, the GR-BPNN model is applied to forecast grain yield for Henan Province as a whole and for its subregions. The forecasts indicate that the growth rate of grain production in Henan Province is slowing and that the center of gravity of grain production is shifting northward.
1. Introduction
Grain is not only an important strategic material for the national economy and people’s livelihood but also the most basic means of subsistence. Scientific analysis and prediction of grain yield are of great significance to the harmonious and stable development of society and to national food security. Henan Province is China's main grain-exporting province and its leading wheat-exporting province, and its output of all major agricultural products consistently ranks among the highest in China. As one of the country's important grain production areas, Henan Province has made great contributions to national food security. Therefore, sound statistics on Henan's grain yield and a reasonable prediction of its development trend help to stabilize grain production and guarantee food security.
There are various methods for forecasting grain yield. Traditional methods mainly include the empirical method, exponential smoothing, linear regression, and time series analysis. These methods are simple and easy to implement, but they are only applicable to short-term grain yield prediction and are limited in their ability to mine complex data [1]. In addition, grain production is influenced by a variety of complex factors such as meteorology, land, human activities, and institutions, which makes accurate forecasting of grain yield very difficult. In recent years, neural network models have become a research hotspot at home and abroad. For example, LSTM, CNN, and BPNN have shown high accuracy and timeliness when dealing with multivariate, multitemporal heterogeneous data and when mining nonlinear data [2, 3]. Neural network models have proven their power in data mining and agricultural analysis, including crop type classification and yield prediction [4].
Scholars have carried out a great deal of in-depth research into grain production systems using a variety of methods, among which the BPNN is one of the most widely used [5–9]. BPNN has received increasing attention due to its fast convergence, high accuracy, and strong nonlinear mapping capability. Gu et al. [10] integrated the particle swarm optimization algorithm with BPNN to construct the NCPSO-BP prediction algorithm for the complex problem of grain yield prediction. Li et al. [11] and Zhang and Pan [12] compared the simulation ability and out-of-sample prediction ability of the multiple linear regression model and the BPNN model, and the results showed that the BPNN model was better than the linear regression model in accuracy, stability, generalization, and theoretical basis. Some scholars have tried to combine BPNN with methods such as rough sets, genetic algorithms, and linear regression to improve prediction speed and accuracy [13, 14]. Guo [15] and Zheng [16] both established a combined prediction model based on principal component analysis and BPNN, which improved the convergence speed and prediction accuracy of the neural network model. Wang and Zhu [17] combined the BPNN model with other forecasting models to construct a multiscale combined forecasting model. However, because these studies did not consider whether the combination method suited the application object, overfitting occurs occasionally. Some scholars have also used different algorithms to optimize the initial weights and thresholds of BPNN, thus building prediction models suitable for small samples [18]. Hu et al. [19] established the IPSO-BP grain yield prediction model by introducing reproduction and mutation mechanisms and using improved particle swarm optimization (IPSO) to optimize the connection weights and thresholds of the BPNN model. Rong et al. [20] carried out multiple rounds of screening and comparison on the internal nodes and thresholds of the BPNN layers and obtained the optimal network structure. The application results showed that the prediction accuracy of grain output improved significantly. However, they did not identify the importance of the input variables, so the improvement in prediction accuracy is limited. On the whole, scholars have improved the prediction accuracy of BPNN models from different perspectives, and the results of this research are remarkable and provide a good reference.
Although these optimization measures have improved model accuracy to a certain extent, they ignore the screening and prioritization of the indicator variables. They therefore cannot fundamentally remove the defect that the BPNN model cannot automatically optimize multiple variables, which causes learning to converge slowly and to fall easily into local minima. The correct choice of input variables determines the validity of the prediction results, yet few scholars have considered improving the prediction accuracy of BPNN from the perspective of variable optimization. In addition, grain production results from multiple factors acting together, and it is a complex nonlinear dynamic system characterized by greyness, randomness, and uncertainty. Existing studies have insufficiently considered the greyness and randomness of the data. LSTM and CNN, which like BPNN are neural network models, also lack a mechanism for identifying input variables, so they still suffer from long computation times and vanishing gradients. The improvement in prediction accuracy is often limited by the omission of valuable time-series information.
One of the key objectives of this study is to enhance the adaptability of BPNN and make it more effective in modeling multivariable complex systems by optimizing the number of input variables. Grey relational analysis is an important part of grey system theory [21] and provides a basis for advantage analysis, factor discrimination, and similar tasks. It can solve the problem of optimally selecting the input variables of the BP neural network algorithm while taking the grey nature of the forecast information into account. Furthermore, as a tool for dealing with uncertain information, the method reflects the actual state of a system more accurately under a wide range of uncertainties, such as randomness, ambiguity, and greyness, and has a good variable screening capability.
In recognition of the strong nonlinear approximation ability of BP neural networks and the variable screening ability of grey relational analysis, this study combines grey system theory with BPNN. Grey relational analysis is used to remedy the inability of BPNN to identify the priority and importance of input variables, and a variable optimization selection algorithm is constructed. The GR-BPNN is then developed by integrating the variable optimization selection algorithm with the metabolic GM (1, 1) model. Models such as the long short-term memory network (LSTM) and the convolutional neural network (CNN) are used as benchmarks to test the prediction performance of the proposed GR-BPNN model [22]. By applying GR-BPNN to grain yield forecasting in Henan Province, another key objective of this study is to improve the accuracy of grain yield forecasts for the province, providing a reliable basis for the formulation of national grain policies and a new way to make grain yield forecasting quantitative and intelligent.
The rest of this paper is organized as follows. Section 2 presents the main steps of the GR-BPNN prediction model. Section 3 verifies the prediction performance of the GR-BPNN model by comparing it with several commonly used benchmark models. Section 4 gives the prediction result and analysis of grain yield in Henan Province. Section 5 draws conclusions and puts forward relevant countermeasures and suggestions for increasing grain yield in Henan Province.
2. Construction of GR-BPNN Model
2.1. Variable Optimization Selection Algorithm
To filter out the key input variables and overcome the defect that BPNN cannot automatically optimize multiple variables, this study combines grey relational analysis and BPNN to build a variable optimization selection algorithm that improves the ability of the BPNN model to recognize important variables. The specific steps are as follows.
2.1.1. Determine Input Variable Priority
In this study, grey relational analysis was used to rank the importance of the input variables and prioritize them. The modeling steps are as follows.
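Let $X_0 = (x_0(1), x_0(2), \ldots, x_0(n))$ be the sequence of system characteristic behaviors and $X_i = (x_i(1), x_i(2), \ldots, x_i(n))$ be the correlation factor sequences, where $X_0$ and $X_i$ have the same length and $i = 1, 2, \ldots, m$.
Obtain the initial image of each sequence. Let
$$X_i' = \frac{X_i}{x_i(1)} = \left(x_i'(1), x_i'(2), \ldots, x_i'(n)\right), \quad i = 0, 1, \ldots, m.$$
Calculate the degree of incidence.
$$\gamma\left(x_0(k), x_i(k)\right) = \frac{\min_i \min_k \left|x_0'(k) - x_i'(k)\right| + \xi \max_i \max_k \left|x_0'(k) - x_i'(k)\right|}{\left|x_0'(k) - x_i'(k)\right| + \xi \max_i \max_k \left|x_0'(k) - x_i'(k)\right|}$$
is Deng’s grey incidence coefficient, where $\xi \in (0, 1)$ is the distinguishing coefficient and generally takes $\xi = 0.5$. We call
$$\gamma(X_0, X_i) = \frac{1}{n} \sum_{k=1}^{n} \gamma\left(x_0(k), x_i(k)\right)$$
Deng’s degree of incidence for the sequences $X_0$ and $X_i$.
Sort the relational order.
In general, the factors can always be ordered as long as a degree of incidence can be calculated for each of them. The grey incidence between the sequence of system characteristic behaviors and the sequence of related factor $i$ is denoted $\gamma_{0i} = \gamma(X_0, X_i)$. The corresponding relational order is obtained by arranging the values $\gamma_{0i}$ from largest to smallest.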
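The ranking procedure of this subsection can be illustrated with a short NumPy sketch. It is only a minimal illustration under stated assumptions: the function names `deng_grey_relational_degree` and `rank_factors` and the toy data are hypothetical, and the distinguishing coefficient defaults to the commonly used value 0.5.

```python
import numpy as np

def deng_grey_relational_degree(x0, factors, xi=0.5):
    """Deng's degree of incidence between x0 and each row of `factors`.

    x0      : 1-D sequence of the system characteristic behavior
    factors : 2-D array, one factor sequence per row (same length as x0)
    xi      : distinguishing coefficient (0.5 is the usual choice)
    """
    x0 = np.asarray(x0, dtype=float)
    factors = np.asarray(factors, dtype=float)
    # Initial image: divide every sequence by its first value.
    x0_img = x0 / x0[0]
    f_img = factors / factors[:, [0]]
    # Absolute differences between the reference and each factor.
    delta = np.abs(f_img - x0_img)
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0.0:
        # All images coincide with the reference: full incidence.
        return np.ones(len(factors))
    # Incidence coefficient at every point, averaged per factor.
    coeff = (d_min + xi * d_max) / (delta + xi * d_max)
    return coeff.mean(axis=1)

def rank_factors(x0, factors, xi=0.5):
    """Return factor indices sorted from largest to smallest incidence."""
    degrees = deng_grey_relational_degree(x0, factors, xi)
    order = np.argsort(-degrees, kind="stable")
    return order, degrees

# Hypothetical toy data: factor 0 is proportional to x0, factor 1 is flat.
order, degrees = rank_factors([1, 2, 3, 4], [[2, 4, 6, 8], [1, 1, 1, 1]])
print(order, degrees)
```

Because the initial image removes scale, a factor that is exactly proportional to the reference sequence receives a degree of 1 and is ranked first.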
2.1.2. Establish BP Neural Network Model
The traditional BPNN divides the learning process into two stages, forward propagation of the signal and backward propagation of the error, and adjusts the “connection weights” and “thresholds” between neurons by training on all input variables. The parameters include the number of neurons in each layer, the activation function, the connection weights, and the thresholds. In this study, the first $k$ variables ($k = 1, 2, \ldots, m$) of the relational order obtained from the grey relational analysis were used as input factors (the number of input nodes is $k$). The characteristic behavior sequences were used as output factors (one output node per sequence).
Let $(x(k), T(k))$, $k = 1, 2, \ldots, K$, be the training pairs, where $x_i(k)$ is the input, $Y_j(k)$ is the actual output, and $T_j(k)$ is the target output. The input, intermediate, and output nodes are denoted by the subscripts $i$, $h$, and $j$, respectively. The weight from input node $i$ to middle-layer node $h$ is $v_{ih}$, and the weight from middle-layer node $h$ to output node $j$ is $w_{hj}$. The thresholds of the functional neurons in the middle layer and the output layer are $a_h$ and $b_j$, respectively, and $f(\cdot)$ is the activation function. The topological structure of the BPNN is shown in Figure 1.

When the $k$-th training pair is input, the weighted input sum and the output of middle-layer node $h$ are, respectively,
$$S_h(k) = \sum_i v_{ih}\, x_i(k) - a_h, \qquad Y_h(k) = f\left(S_h(k)\right).$$
The weighted input sum and the output $Y_j(k)$ of output-layer node $j$ are, respectively,
$$S_j(k) = \sum_h w_{hj}\, Y_h(k) - b_j, \qquad Y_j(k) = f\left(S_j(k)\right).$$
The error of node $j$ of the output layer is as follows:
$$e_j(k) = T_j(k) - Y_j(k).$$
If the sum of squared errors of all output nodes over the $K$ inputs is used as the total network error, the loss function is
$$E = \frac{1}{2} \sum_{k=1}^{K} \sum_j e_j(k)^2.$$
The gradient descent method is adopted: the derivative of $E$ with respect to each weight and threshold is taken in turn, and the direction in which $E$ decreases is used to adjust the weights $v_{ih}$, $w_{hj}$ and the thresholds $a_h$, $b_j$. The adjusted weights are denoted $v_{ih}'$, $w_{hj}'$, and the adjusted thresholds are denoted $a_h'$, $b_j'$. In the formulas below, $\eta$ is the learning rate, $\delta_j(k) = e_j(k)\, f'\left(S_j(k)\right)$, and $\delta_h(k) = f'\left(S_h(k)\right) \sum_j \delta_j(k)\, w_{hj}$.
The adjustment formula for the weights from the middle layer to the output layer is as follows:
$$w_{hj}' = w_{hj} - \eta \frac{\partial E}{\partial w_{hj}} = w_{hj} + \eta \sum_{k=1}^{K} \delta_j(k)\, Y_h(k).$$
The adjustment formula for the weights from the input layer to the middle layer is as follows:
$$v_{ih}' = v_{ih} - \eta \frac{\partial E}{\partial v_{ih}} = v_{ih} + \eta \sum_{k=1}^{K} \delta_h(k)\, x_i(k).$$
The adjustment formula for the middle-layer thresholds is as follows:
$$a_h' = a_h - \eta \frac{\partial E}{\partial a_h} = a_h - \eta \sum_{k=1}^{K} \delta_h(k).$$
The adjustment formula for the output-layer thresholds is as follows:
$$b_j' = b_j - \eta \frac{\partial E}{\partial b_j} = b_j - \eta \sum_{k=1}^{K} \delta_j(k).$$
The connection weights and thresholds of each layer are dynamically adjusted according to the error signal (loss function). Through repeated forward propagation and backward adjustment, the weights between neurons and the thresholds of each functional neuron are continuously revised. When the output error meets the accuracy requirement, learning stops, which gives the neural network model corresponding to the first $k$ variables. By repeating (6)–(21) for $k = 1, 2, \ldots, m$, the BP neural network model corresponding to each number of input variables can be obtained.
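The two-stage learning process described in this subsection (forward propagation of the signal, backward propagation of the error) can be sketched in NumPy as follows. This is a minimal illustration rather than the implementation used in the study: it assumes one hidden layer, sigmoid activations, thresholds that are subtracted from the weighted sums, and batch gradient descent. The names `train_bpnn`, `predict_bpnn`, `eta`, `V`, `W`, `a`, and `b`, as well as the toy data, are hypothetical.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_bpnn(X, T, n_hidden=4, eta=0.5, epochs=2000, seed=0):
    """Train a three-layer BPNN by batch gradient descent.

    X : (K, n_in) inputs, T : (K, n_out) targets scaled to (0, 1).
    Returns the parameters and the loss recorded at every epoch.
    """
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    V = rng.uniform(-1, 1, (n_in, n_hidden))   # input -> middle weights
    W = rng.uniform(-1, 1, (n_hidden, n_out))  # middle -> output weights
    a = rng.uniform(-1, 1, n_hidden)           # middle-layer thresholds
    b = rng.uniform(-1, 1, n_out)              # output-layer thresholds
    losses = []
    for _ in range(epochs):
        # Forward propagation of the signal.
        Yh = sigmoid(X @ V - a)
        Yo = sigmoid(Yh @ W - b)
        err = T - Yo
        losses.append(0.5 * float(np.sum(err ** 2)))
        # Backward propagation of the error (sigmoid derivative is y(1-y)).
        delta_o = err * Yo * (1.0 - Yo)
        delta_h = (delta_o @ W.T) * Yh * (1.0 - Yh)
        # Gradient-descent updates of weights and thresholds.
        W += eta * Yh.T @ delta_o
        V += eta * X.T @ delta_h
        b -= eta * delta_o.sum(axis=0)
        a -= eta * delta_h.sum(axis=0)
    return (V, W, a, b), losses

def predict_bpnn(params, X):
    V, W, a, b = params
    return sigmoid(sigmoid(X @ V - a) @ W - b)

# Hypothetical toy problem (logical OR) used only to exercise the code.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [1]], dtype=float)
params, losses = train_bpnn(X, T)
print(losses[0], losses[-1])
```

In practice the inputs and the target would be normalized before training, and a stopping rule based on the error tolerance would replace the fixed number of epochs.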
2.1.3. Accuracy Evaluation
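We use the mean absolute percentage error (MAPE), mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination ($R^2$) to evaluate the accuracy of the BPNN models [23–25]. Denote the number of input variables corresponding to the model with the highest accuracy as $k^{*}$. The specific formulas are as follows:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100\%,$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2},$$
$$R^2 = 1 - \frac{\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2}{\sum_{t=1}^{n} \left( y_t - \bar{y} \right)^2},$$
where $n$ refers to the number of samples, $y_t$ is the observed value, $\hat{y}_t$ is the predicted value, and $\bar{y}$ is the mean of the observed values. The smaller the values of the three error indexes (MAPE, MAE, and RMSE) and the closer $R^2$ is to 1, the higher the accuracy of model fitting.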
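For reference, the four indexes can be computed with a few lines of NumPy. The sketch below uses the standard definitions of MAPE, MAE, RMSE, and the coefficient of determination; the function name `accuracy_metrics` and the sample numbers are hypothetical.

```python
import numpy as np

def accuracy_metrics(y_true, y_pred):
    """Return MAPE (in percent), MAE, RMSE and R^2 for two sequences."""
    y = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_pred, dtype=float)
    err = y - y_hat
    return {
        "MAPE": float(np.mean(np.abs(err / y)) * 100.0),
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "R2": float(1.0 - np.sum(err ** 2) / np.sum((y - y.mean()) ** 2)),
    }

# Hypothetical observed and predicted values.
print(accuracy_metrics([100, 200, 300, 400], [110, 190, 300, 420]))
```

MAPE is undefined when an observed value is zero, which is not a concern for yield data but should be checked for other indicators.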
2.1.4. Flow of Variable Optimization Selection Algorithm
The variable optimization selection algorithm is based on the correlation sequence of input variables, and the optimal BPNN model is obtained by stepwise input of variables and intermodel accuracy evaluation screening. At the same time, the number of corresponding key variables under the model is determined. Then, the predictive modeling is based on the key variables obtained through the screening and the optimal BPNN model. The specific algorithm flow is shown in Figure 2:
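The stepwise screening described above can be sketched as a simple loop over the ranked variables. This is a hedged illustration only: to keep the example short and deterministic, an ordinary least-squares fit stands in for the BPNN (it is not the learner used in the study), MAPE is the only criterion, and the names `select_key_variables`, `least_squares_fit_predict`, and the synthetic data are hypothetical.

```python
import numpy as np

def mape(y_true, y_pred):
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def least_squares_fit_predict(X_train, y_train, X_test):
    """Stand-in learner (NOT a BPNN): least squares with an intercept."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def select_key_variables(X_train, y_train, X_test, y_test, order,
                         fit_predict=least_squares_fit_predict):
    """Train one model per prefix of the relational order; keep the best.

    order : variable indices sorted by decreasing degree of incidence.
    Returns the best number of variables, their indices, and all scores.
    """
    scores = []
    for k in range(1, len(order) + 1):
        cols = list(order[:k])                 # first k ranked variables
        pred = fit_predict(X_train[:, cols], y_train, X_test[:, cols])
        scores.append(mape(y_test, pred))
    best_k = int(np.argmin(scores)) + 1
    return best_k, list(order[:best_k]), scores

# Synthetic data: the target depends on variables 0 and 1 only.
rng = np.random.default_rng(0)
X = rng.uniform(1, 2, (20, 3))
y = 3 + 2 * X[:, 0] + 5 * X[:, 1]
best_k, key_vars, scores = select_key_variables(
    X[:15], y[:15], X[15:], y[15:], order=[0, 1, 2])
print(best_k, key_vars, scores)
```

Passing a different `fit_predict` callable (for example, a wrapper around a trained BPNN) would turn the loop into the procedure of this section; combining several accuracy indexes instead of MAPE alone is an equally valid design choice.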

2.2. GR-BPNN Prediction Model
2.2.1. Metabolic GM (1, 1) Model
In this study, a metabolic GM (1, 1) model is used to predict key variables. The metabolic GM (1, 1) model is based on the GM (1, 1) model by iterating over old and new information, and the resulting modeling sequence is continuously adjusted to reflect the current operating characteristics of the system as it evolves [26].
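(1) GM (1, 1) modeling steps: Let the observed values of a characteristic behavior sequence of the system be $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$. Its first-order accumulated generating sequence is $X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right)$, where $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$, $k = 1, 2, \ldots, n$. The adjacent-mean sequence of $X^{(1)}$ is $Z^{(1)} = \left(z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)\right)$, where $z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right)$, $k = 2, 3, \ldots, n$.
The first-order linear differential equation model for $X^{(1)}$ is established as
$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b,$$
where $a$ is the development coefficient and $b$ is the grey action quantity. The parameter vector is estimated by the least squares method:
$$\hat{a} = (a, b)^{T} = \left(B^{T} B\right)^{-1} B^{T} Y,$$
where $B$ and $Y$ are as follows:
$$B = \begin{pmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{pmatrix}, \qquad Y = \begin{pmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{pmatrix}.$$
The solution of the differential equation is as follows:
$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right) e^{-ak} + \frac{b}{a}, \quad k = 1, 2, \ldots$$
Restoring by inverse accumulation gives
$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k).$$
The grey prediction model of the original sequence is as follows:
$$\hat{x}^{(0)}(k+1) = \left(1 - e^{a}\right)\left(x^{(0)}(1) - \frac{b}{a}\right) e^{-ak}, \quad k = 1, 2, \ldots$$
(2) Prediction principle of the metabolic GM (1, 1) model: First, the GM (1, 1) model is established from the original sequence to predict the new datum $\hat{x}^{(0)}(n+1)$, which is added to the sequence while the oldest datum $x^{(0)}(1)$ is removed. Then, the GM (1, 1) model is built again on the updated sequence to predict the next datum. Each time, the earliest datum is removed and the newly predicted datum is appended, and this metabolism is repeated until the required forecast horizon is reached.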
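A compact sketch of the GM (1, 1) forecast and its metabolic (rolling) variant is given below. It follows the standard textbook formulation (accumulation, adjacent means, least-squares estimation of the development coefficient and grey action quantity, inverse accumulation) and should be read as an illustration under those assumptions; the function names `gm11_next` and `metabolic_gm11` and the sample sequence are hypothetical.

```python
import numpy as np

def gm11_next(x0):
    """Fit GM(1, 1) to x0 and return the one-step-ahead forecast.

    Assumes a non-constant sequence so that the development
    coefficient a is not zero.
    """
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                         # accumulated sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])              # adjacent-mean sequence
    B = np.column_stack([-z1, np.ones(n - 1)])
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]
    # Restored forecast for position n + 1 (time response with k = n).
    return float((1.0 - np.exp(a)) * (x0[0] - b / a) * np.exp(-a * n))

def metabolic_gm11(x0, steps):
    """Rolling forecast: drop the oldest value, append each new forecast."""
    window = [float(v) for v in x0]
    forecasts = []
    for _ in range(steps):
        nxt = gm11_next(window)
        forecasts.append(nxt)
        window = window[1:] + [nxt]            # metabolism of the sequence
    return forecasts

# Hypothetical series growing by 10% per step; the true next value is 161.051.
series = [100.0, 110.0, 121.0, 133.1, 146.41]
print(metabolic_gm11(series, steps=3))
```

The window length stays constant during the rolling forecast, so every refit uses the same number of points as the original sequence.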
2.2.2. Concrete Steps of GR-BPNN Prediction Model
The priority of the variables was determined by grey relational analysis, and the system variables were ranked in order of importance. The optimal BPNN structure and the corresponding number of input nodes were determined by establishing BPNN prediction models on different variable combinations. Ranking a large number of input variables with the grey relational model makes the selected variables representative and optimizes the number of BPNN input nodes without subjective screening. This enhances the ability of the BPNN algorithm to model multivariable complex systems and the adaptability of the network. The specific steps of the GR-BPNN prediction model are as follows:
(i) Step 1. Based on equations (2)–(5), obtain Deng’s degree of incidence between the sequence of characteristic behaviors and each sequence of relevant factors.
(ii) Step 2. Rank the factors according to their degree of incidence.
(iii) Step 3. Establish the BPNN models based on the first $k$ variables ($k = 1, 2, \ldots, m$) of the relational order by formulas (6)–(21).
(iv) Step 4. According to equations (22)–(24), test the accuracy of the $m$ models to obtain the optimal model structure, and determine the corresponding number of key variables.
(v) Step 5. Apply formulas (26)–(31) to obtain the predicted values of the key impact factors.
(vi) Step 6. Substitute the predicted values of the key impact factors into the best trained BPNN model from Step 4 and predict the characteristic behavior sequence.
3. Prediction Performance Evaluation of GR-BPNN Model
3.1. Selection and Treatment of Predictors
In addition to factors that cannot be determined, three aspects of grain production capacity, grain production guarantee, and economic scale are considered to build a system of indicators related to grain yield forecasting in Henan Province. Relevant indicators are selected according to the basic principles of feasibility, purpose, comprehensiveness, comparability, and a combination of quantitative and qualitative indicators. The grain yield at the target layer is taken as the output of the BPNN prediction model. By consulting relevant literature [27] and combining with expert experience, 14 impact factors at the indicator layer in Table 1 are preliminarily selected as the input of the BPNN prediction model. Data for the relevant indicators are taken from the Henan Statistical Yearbook 2001–2020 [28].
3.1.1. Calculate the Degree of Incidence
Using grey modeling software to analyze the relevant index data, the degree of incidence between the output factor and each input factor is as follows:
3.1.2. Arrange the Order of Correlation
The relational order is obtained by the magnitude of the degree of incidence, which is as follows:
3.2. Determination of Key Variables and the Best BPNN Model
A typical three-layer feedforward BPNN structure was selected. The variable optimization selection algorithm in Figure 2 was implemented in MATLAB. The training samples are the 15 annual observations from 2000 to 2014, and the test samples are the 5 independent observations from 2015 to 2019. The Sigmoid function is used as the activation function of both the hidden layer and the output layer. The initial network connection weights and thresholds are set to random numbers by the random generator procedure. The number of nodes in the input layer equals the number of leading variables taken from the sorted list. The number of nodes in the output layer is 1. The number of hidden-layer nodes is determined by the empirical method of [29]. Following the grey relational analysis results, 14 BPNN models are trained, and the comparison of model accuracy is shown in Figure 3.

According to Figure 3, the BP network model reached its highest prediction accuracy when the first 10 predictors were input. Therefore, the optimal BPNN topology is 10-10-1 (input-layer nodes, hidden-layer nodes, output-layer nodes), and the key factors affecting grain yield are the first 10 factors in the relational order. The observed and predicted values of the training samples under the best BPNN are plotted as a line chart to give the fitting curve of the best model (Figure 4). Figure 4 shows that the GR-BPNN model fitted poorly only in 2002 and adjusted quickly in 2003, indicating that the model has good adaptability.

3.3. Comparison between GR-BPNN Model and Benchmark Model
3.3.1. Benchmark Models and Classification of Accuracy Classes
Considering that GR-BPNN is an improvement on the traditional BPNN, LSTM and CNN are suitable for modeling the time series data in this study, and the traditional econometric models SR, GM (1, N) are equally capable of capturing the feature facts and aggregating the essential elements of interest in this study. Therefore, this study chooses LSTM, CNN, BPNN, SR, GM (1, N) as the benchmark model to compare the accuracy of the model proposed in this research and accurately evaluate the predictive ability of the GR-BPNN model.
The benchmark models are briefly introduced as follows. LSTM introduces a gate mechanism to handle the long-term dependencies that an ordinary recurrent neural network cannot capture, and it performs well in nonlinear time series forecasting. Its parameters mainly include the number of neurons in each network layer and the learning rate. CNN uses local connectivity and weight sharing to transform and abstract the original data matrix in a high-dimensional way and can build models of different dimensional structure according to the characteristics of the data set. Its parameters mainly include the learning rate, the number of nodes in each layer of the network, the activation function, and the step size. The traditional BPNN is described in detail in Section 2.1.2. The basic idea of SR is to reduce multicollinearity by eliminating variables that are less important and highly correlated with other variables; the optimal set of variables is obtained by introducing variables one by one and testing them iteratively, and predictive modeling is then carried out. The GM (1, N) model extends the one-dimensional grey prediction model GM (1, 1). The magnitude and sign of the weight coefficients of the factor variables in this model indicate the degree of influence of each factor on the behavioral variable, and the predictions are modeled in the form of differential equations.
In order to verify the robustness and sensitivity of the model estimation results, refer to the formula (24) and use the coefficient of determination to judge the overall fitting accuracy of the model. The small error probability and the posterior difference ratio are used to divide and judge the accuracy levels of different models. Lu [30] and Cao [31] graded the accuracy of the grain yield prediction model based on the posterior difference ratio and the probability of small errors and qualitatively evaluated different grain yield prediction models. The details are shown in Table 2.
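The posterior difference ratio and the small error probability mentioned above are commonly computed as in the following sketch. It assumes the usual grey-model definitions (ratio of the residual standard deviation to the standard deviation of the observations, and the share of centered residuals smaller than 0.6745 times the standard deviation of the observations) with population standard deviations; the grading thresholds of Table 2 are not reproduced here. The function name `posterior_check` and the sample values are hypothetical.

```python
import numpy as np

def posterior_check(y_true, y_pred):
    """Return (C, P): posterior difference ratio and small error probability."""
    y = np.asarray(y_true, dtype=float)
    e = y - np.asarray(y_pred, dtype=float)    # residual sequence
    s1 = y.std()                               # std of the observations
    s2 = e.std()                               # std of the residuals
    C = float(s2 / s1)
    P = float(np.mean(np.abs(e - e.mean()) < 0.6745 * s1))
    return C, P

# Hypothetical observed and predicted values.
C, P = posterior_check([10, 20, 30, 40, 50], [11, 19, 31, 39, 50])
print(C, P)
```

A smaller ratio and a larger probability both indicate a better model; some references use the sample standard deviation instead, which changes the ratio only slightly for long series.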
3.3.2. Analysis of Comparative Results
The values of the key variables predicted by the metabolic GM (1, 1) model are used as the input of the best BPNN model. The 2015–2019 grain production data of Henan Province are selected as the test sample. For LSTM, a three-layer network structure is used, and the network topology is 14-10-1. The loss function follows formula (11); the Adam optimizer is used, and the learning rate is initially set to 0.01 and decreases with iteration. For CNN, a three-layer network structure is also used, the network topology is 14-7-1, and the learning rate is initially set to 0.01. In addition, taking three evaluation indicators (MAPE, MAE, and RMSE) into account, the accuracy of the six models is evaluated according to formulae (22)–(24). The evaluation results are shown in Figure 5.

Figure 5 shows that the prediction performance of the six models differs markedly. Among them, the SR model and the GM (1, N) model have the worst accuracy, with MAPE values of 20.5% and 18.22%, respectively. The MAPE values of BP, LSTM, and CNN were all above 5%, whereas the MAPE, MAE, and RMSE of GR-BPNN were 2.97%, 1.64 tons, and 2.83 tons, respectively, the smallest errors among all models. The prediction performance of the GR-BPNN model is greatly improved by its factor screening capability and its ability to describe both linear and nonlinear relationships between variables. Therefore, the GR-BPNN model is more suitable for predicting grain yield in Henan Province.
Following [30], the posterior difference ratio ($C$) and the small error probability ($P$) of each prediction model were calculated, and $R^2$ was calculated with reference to formula (25). The results are shown in Table 3. As can be seen from Table 3, leaving aside SR and GM (1, N), the $R^2$ of the GR-BPNN model was 16.25%, 8.14%, and 5.68% higher than those of BPNN, LSTM, and CNN, respectively. Furthermore, the SR model has the worst accuracy grade, which may be due to specification bias caused by the elimination of important related variables by the stepwise regression method. Both GM (1, N) and BP are barely qualified, and their error probability is still large. LSTM, CNN, and GR-BPNN all reached an accuracy level of qualified or above. Among them, the GR-BPNN model performed best, with a model level of “good,” which shows that the grain output prediction algorithm of this study has high prediction accuracy. In establishing the GR-BPNN model, 10 key variables were selected from the 14 candidate prediction variables to participate in the construction of the prediction model, which effectively reduced the complexity of the model and gave it high sparsity. Therefore, this model is superior to the other models in computational efficiency and runs quickly. In addition, compared with the models in [30, 31], the model built in this study achieves higher accuracy, indicating that the GR-BPNN model has certain advantages. The above analysis shows that the GR-BPNN algorithm offers fast calculation, high accuracy, and strong reliability in grain output prediction.
4. Prediction of Grain Yield in Henan Province Based on GR-BPNN Model
4.1. Predicted Results
Based on the original sequences of the 10 key impact factors, namely the corresponding index data from 2010 to 2019, a metabolic GM (1, 1) model is established for each factor to predict its values during 2020–2025. The prediction results are shown in Table 4.
The predicted values of the 10 key impact factors for grain yield (2020–2025) are used as the input of the trained optimal BPNN model, and the predicted values of grain yield in Henan Province from 2020 to 2025 are obtained accordingly. The prediction results are shown in Table 5.
According to the predicted results in Table 5 and the actual grain yield in 2000–2019, the trend chart of grain yield in Henan Province is plotted (Figure 6). As can be seen from Figure 6, the total grain yield of Henan Province will show a steady, increasing trend in the next few years, which is a favorable situation for the grain supply of Henan Province and of the whole country. In 2003 (about 35.6947 million tons), there was a serious reduction in grain production, which was 15.21% lower than in 2002 (about 42.0998 million tons). One reason is that grain yield had increased year by year before this period, but the phenomena of “difficulty selling grain” and “increasing production without increasing income” then appeared, which seriously dampened the enthusiasm of farmers to grow grain. In addition, serious natural disasters occurred in 2003, such as continuous rain, low temperature, lack of light and heat, hail, crop lodging, and floods, which severely affected the grain yield of Henan Province.

4.2. Results Analysis
4.2.1. Temporal Change Trend of Grain Production in Henan Province
Based on the predicted results in Table 5, the trend graph of the annual fluctuation of total grain yield was drawn (Figure 7). The annual fluctuation of total grain yield is $V_t = \frac{Y_t - Y_{t-1}}{Y_{t-1}} \times 100\%$, where $Y_t$ is the total grain yield and $t$ is the year. As shown in Figure 7, grain yield increased significantly after 2003; in 2004 in particular, it increased by 19.35% compared with the previous year. During this period, the state began to adopt a series of measures to stimulate grain yield, and Henan Province fully implemented the national policies for the development of grain production: continuously accelerating economic system reforms, implementing the household contract responsibility system with remuneration linked to output to adjust rural production relations, proposing a program to exempt the whole province from agricultural tax, and so on. Driven by this series of policies supporting and benefiting agriculture, farmers’ enthusiasm for growing grain has improved significantly. Therefore, Henan’s grain production will increase steadily in the coming years, provided that there are no major changes in the environment and policy. The total grain yield of Henan Province continued to grow from 2004 to 2019, making an important contribution to national grain security. However, it is worth noting that the growth rate has been decreasing year by year. The forecast results show that from 2020 to 2025 the annual increase in Henan Province’s grain yield will stay within 1%, at 0.72%, 0.82%, 0.92%, 0.93%, and 0.92%. It can be seen that the overall growth rate of grain yield in Henan Province is slowing down, while the ability to stabilize yield is gradually strengthening.
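As a small worked check of the year-on-year fluctuation, the sketch below computes percentage changes between consecutive years and applies them to the 2002 and 2003 yields quoted in Section 4.1 (42.0998 and 35.6947 million tons). The function name `annual_fluctuation` is hypothetical, and the definition used (change relative to the previous year) is the usual one, stated here as an assumption.

```python
def annual_fluctuation(yields):
    """Year-on-year change in percent for a list of annual yields."""
    return [(cur - prev) / prev * 100.0
            for prev, cur in zip(yields, yields[1:])]

# 2002 and 2003 total grain yield of Henan Province (million tons).
change = annual_fluctuation([42.0998, 35.6947])[0]
print(round(change, 2))  # -15.21
```

The result reproduces the 15.21% drop reported for 2003.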

4.2.2. Spatial Evolution Trend of Grain Yield in Henan Province
In order to further explore the spatial differentiation law of grain yield for Henan Province and its dynamic change trend, according to Table 1, the grain production and related indicators of 18 cities in Henan Province in 2000–2019 were selected (data from Henan Statistical Yearbook 2001–2020). Using the GR-BPNN prediction model, the predicted value of grain yield in 2020–2025 for 18 cities in Henan Province is obtained in the same way. The equal interval grading method in ArcGIS was used to classify the grain yield in 2020–2025 of the 18 cities into 5 classes. The spatial distribution map of grain yield shown in Figure 8 was obtained by selecting 2015, 2020, and 2025, respectively. As can be seen from Figure 8, the main areas of grain yield in Henan Province in 2015 were Shangqiu, Zhumadian, Nanyang, and Xinyang; it is estimated that the main grain production areas in Henan Province in 2020 will be Shangqiu, Zhoukou, Zhumadian, and Nanyang; it is estimated that the main grain production areas in Henan Province in 2025 will be Xinxiang, Shangqiu, Zhoukou, Zhumadian, and Nanyang. The main grain-producing areas in Henan Province are mainly located in the eastern plains, where strict measures have been implemented to protect arable land in a balanced manner, develop reserve arable land resources, and rehabilitate the land, and its excellent natural resource conditions and policy support have largely increased the grain production capacity of the region. The areas with low grain yield in three years were all distributed in the northwest of Henan, which has a relatively fragile ecological environment and has seen a dramatic increase in the area returned to the forest since the national policy of returning farmland to forests was implemented in 1998. Among them, Sanmenxia has a severe lack of water resources and is topographically located in the mountainous, hilly region of western Henan, resulting in extremely low grain production. 
In addition, Zhengzhou and its surrounding cities are highly industrialized and urbanized, and the loss of land to industrial construction has put enormous pressure on arable land conservation and food production. On the whole, the main grain production areas in Henan Province show a trend of moving northward, reflecting the spatial imbalance of grain supply and demand, and grain security problems are gradually emerging in some areas.
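The equal interval grading described above can be illustrated with a minimal sketch. It is not the ArcGIS implementation: the function name `equal_interval_classes`, the city labels, and the yield values are all hypothetical, and the handling of values that fall exactly on a class boundary is an assumption that may differ from ArcGIS.

```python
def equal_interval_classes(values, n_classes=5):
    """Assign each value a class from 1 to n_classes using equal-width bins.

    The range [min, max] is split into n_classes intervals of equal width.
    Boundary handling here is an assumption: a value on an interior boundary
    goes to the higher class, and the maximum goes to the top class.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: a single class is enough.
        return [1] * len(values)
    width = (hi - lo) / n_classes
    classes = []
    for v in values:
        k = int((v - lo) / width) + 1
        classes.append(min(k, n_classes))  # keep the maximum in the top class
    return classes


# Hypothetical grain yields (arbitrary units) for five placeholder cities.
yields = {"City A": 100.0, "City B": 250.0, "City C": 400.0,
          "City D": 700.0, "City E": 900.0}
graded = dict(zip(yields, equal_interval_classes(list(yields.values()))))
print(graded)
```

Because the class width depends only on the minimum and maximum, a single very high or very low value can leave some classes empty, which is worth keeping in mind when reading maps graded this way.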

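Figure 8: Spatial distribution of grain yield in Henan Province, panels (a)–(c), for the selected years 2015, 2020, and 2025.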
The elemental transfer center of gravity model [32] is an analytical tool for studying the spatiotemporal variation of elements in the process of regional development. Fluctuation of grain yield over time causes the spatial center of gravity of grain yield in a region to move, which is of great significance to the rational utilization of regional cultivated land and the guarantee of grain security. Based on the measured values (2015–2019) and predicted values (2020–2025) of grain yield in the 18 regional units of Henan Province, the following elemental transfer center of gravity model of grain yield was established:

$$X_t = \frac{\sum_{i=1}^{n} M_{ti}\, x_i}{\sum_{i=1}^{n} M_{ti}}, \qquad Y_t = \frac{\sum_{i=1}^{n} M_{ti}\, y_i}{\sum_{i=1}^{n} M_{ti}},$$

where $n = 18$ is the number of regional units in Henan Province; $M_{ti}$ is the grain yield of regional unit $i$ in year $t$; $(x_i, y_i)$ are the geospatial barycenter coordinates of regional unit $i$; and $(X_t, Y_t)$ are the spatial barycentric coordinates of grain yield in Henan Province in year $t$. Using ArcGIS software, the coordinates of the grain production center of gravity in Henan Province from 2015 to 2025 were obtained, and the trajectory of the center of gravity was drawn (Figure 9).
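As a minimal sketch of the center of gravity calculation, the function below computes a yield-weighted average of unit centroid coordinates for a single year. It assumes the standard weighted-centroid form of the model; the function name `yield_center_of_gravity`, the three regional units, and all numeric values are hypothetical and are not data from this study.

```python
def yield_center_of_gravity(yields, centroids):
    """Return the yield-weighted mean (x, y) of regional-unit centroids.

    yields    -- grain yield of each regional unit for one year
    centroids -- (x, y) centroid of each unit, e.g. (longitude, latitude)
    """
    if len(yields) != len(centroids):
        raise ValueError("yields and centroids must have the same length")
    total = sum(yields)
    if total <= 0:
        raise ValueError("total yield must be positive")
    x = sum(m * cx for m, (cx, _) in zip(yields, centroids)) / total
    y = sum(m * cy for m, (_, cy) in zip(yields, centroids)) / total
    return x, y


# Hypothetical example: three units with made-up centroids and yields.
centroids = [(113.0, 34.0), (114.0, 33.0), (115.0, 35.0)]
yields_one_year = [1.0, 1.0, 2.0]
print(yield_center_of_gravity(yields_one_year, centroids))

# With equal yields, the result is the plain average of the centroids.
print(yield_center_of_gravity([1.0, 1.0, 1.0], centroids))
```

Repeating the calculation for each year gives a sequence of points whose path is the trajectory of the center of gravity. A unit whose yield grows faster than the others pulls the point toward its own centroid.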

According to the elemental transfer center of gravity model, the center of gravity is located at the geometric center of the region if grain production is balanced across the regions of Henan Province; otherwise, the center of gravity shifts. Based on the calculations in Figure 8, it can be seen that the center of gravity of grain production in Henan Province deviates from the regional geometric center. From 2015 to 2025, the geographical coordinates of the grain yield center of gravity in Henan Province lay between 114°7′20″E and 114°8′20″E and between 33°59′0″N and 34°0′20″N, so the center of gravity shifted only slightly. A comparison of the trajectory of the center of gravity with the spatial distribution of regional grain production shows that the two change consistently. The trajectory reveals that the center of gravity of grain production in Henan Province is gradually shifting to the north. It also shows that the pressure on regional food security is steadily moving northward and that the northern region has to bear greater production pressure.
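To put the reported coordinate range into perspective, the following sketch converts the degree-minute-second bounds quoted above into decimal degrees and estimates the extent of the range in kilometers. The conversion factor of about 111.32 km per degree and the cosine correction for longitude are common spherical approximations assumed here, not values from the paper, and the helper name `dms_to_decimal` is hypothetical.

```python
import math


def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


# Assumed approximation: length of one degree of latitude on a spherical Earth.
KM_PER_DEGREE = 111.32

# Bounds of the center of gravity for 2015-2025 as quoted in the text.
lon_min = dms_to_decimal(114, 7, 20)
lon_max = dms_to_decimal(114, 8, 20)
lat_min = dms_to_decimal(33, 59, 0)
lat_max = dms_to_decimal(34, 0, 20)

# North-south extent: degrees of latitude times km per degree.
north_south_km = (lat_max - lat_min) * KM_PER_DEGREE

# East-west extent: a degree of longitude shrinks with the cosine of latitude.
mid_lat = math.radians((lat_min + lat_max) / 2)
east_west_km = (lon_max - lon_min) * KM_PER_DEGREE * math.cos(mid_lat)

print(f"north-south extent: {north_south_km:.2f} km")
print(f"east-west extent:   {east_west_km:.2f} km")
```

Under these assumptions the whole range spans only a few kilometers in each direction, which is consistent with the observation that the center of gravity moved only slightly over the period.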
5. Conclusions
This study established the GR-BPNN prediction model by combining grey system theory with the BPNN. The model makes full use of the tolerance of grey relational analysis toward sample data, the ability of the grey forecasting model to weaken the randomness of data and bring out the regularity of accumulated data, and the strong nonlinear mapping ability of the neural network. By combining the advantages of the two predictive models, it can effectively handle the analysis and modeling of complex multivariable real-world systems.
Applying the model developed in this paper to grain yield forecasting in Henan Province shows that the total grain yield increases year by year, that its growth slows down and tends to stabilize, and that the ability to maintain stable grain production is gradually enhanced. Grain production in Henan Province is strongly regional, with large differences among regions, and the center of gravity of grain production keeps moving from south to north during 2015–2025. Changes in total grain yield are closely related to grain production capacity and security, the level of economic development, and the input of production factors. The increased use of pesticides, plastic films, and chemical fertilizers, the construction and renovation of irrigation and water conservancy facilities, and the increase in labor and technology input have played a major role in raising the total grain yield. The following aspects should be considered in future efforts to increase grain yield in Henan Province: (1) Strengthen agricultural infrastructure. (2) Deepen policies that strengthen agriculture and benefit farmers, maintain policy support, and achieve a synergistic effect. (3) Increase the role of scientific and technological innovation in grain production. (4) Provide more policy preferences to major grain-producing regions to boost the development of the grain industry. (5) Strengthen interregional communication and exchange to achieve balanced and coordinated development of regional grain production.
For the forecast of grain yield in Henan Province, only three categories of factors were selected in this paper: grain production capacity, food production security, and economic scale. Although many indicators were used, some relevant ones may be missing, and other aspects such as the natural environment and meteorological conditions have not been considered. Changes in grain yield result from the interaction of natural resource endowments, socioeconomic level, progress in agricultural technology, degree of marketization, and policy factors. A comprehensive, multilayered system of indicators for grain yield in Henan Province could further improve the prediction accuracy; this will be the next step in the research.
Data Availability
The index data on grain production in Henan Province from 2000 to 2020 used in this manuscript are taken from the “Henan Statistical Yearbook” and are openly available. Link for data: http://www.ha.stats.gov.cn/.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Key Project of Soft Science Research in Henan Province (202400410051).