Abstract
The gray prediction model based on the GM(1,1) method is the most actively researched and most productive branch of gray theory, and it is widely used because it requires only small samples, has a simple modeling process, and is easy to use. It has been successfully applied in many fields, such as transportation, agriculture, energy, medicine, and the environment, and has gradually developed into a mainstream predictive modeling method. This study combines the Three-parameter Whitenization Grey Model (TWGM(1,1)), which fits sequences following a nonhomogeneous exponential law, with the Particle Swarm Algorithm (PSA) to optimize the order and the background value coefficient under the condition of the minimum sum of squared simulation errors, thereby removing the restrictions that the accumulation order is fixed at “1” and the background value coefficient is fixed at “0.5.” The result is a parameter-optimized gray system model that is flexible, adaptable, and dynamically adjustable, which is used to simulate and predict China’s higher education gross enrollment rate. The application shows that the model has better overall simulation and prediction performance than the comparison models. On the one hand, parameter optimization significantly improves the performance of the model itself; on the other hand, its intelligent and adjustable adaptivity improves accuracy and further extends its range of application.
1. Introduction
Since Professor Deng Ju-Long [1] established gray system theory in 1982, the gray prediction model based on the GM(1,1) method has been the most active, fruitful, and widely used branch of gray theory. Because it requires only a small sample, has a simple modeling process, and is easy to learn, it has been successfully applied in many fields, such as transportation, agriculture, energy, medicine, and the environment [2–6], and has gradually developed into a mainstream predictive modeling method. The enrollment rate of higher education is influenced by many social, economic, cultural, and geographical factors, about which information is partly known and partly unknown; thus, the GM(1,1) model, which is designed for uncertain systems with “partly known and partly unknown” information, has been used to predict the enrollment rate of higher education. For example, Liu et al. [7] used the GM(1,1) prediction model to analyze the gross enrollment rate of higher education, with the aim of anticipating and easing the employment pressure on college students and providing a reference for promoting social stability and economic development. Dong [8] used the GM(1,1) model to predict the fluctuation of China’s medium- and long-term gross enrollment ratio in higher education, which provides a reference for government assessment and for citizens’ choices about higher education.
The GM(1,1) model, the first gray prediction model proposed for univariate modeling, shows favorable simulation and prediction performance for sequences with homogeneous exponential growth characteristics. However, according to the Statistical Yearbook, the gross enrollment rate (GER) of higher education in China was 51.6% in 2019, exceeding the 50% mark for the first time and indicating that China’s higher education has moved from the mass stage to the universal stage. This shift in scale has increased the complexity and uncertainty of the higher education GER system, so the system behavior series now more closely resembles nonhomogeneous exponential growth, and models built for approximately homogeneous exponential series can no longer describe the system adequately. The focus of this research is therefore to develop a high-performance gray prediction model that is more flexible and adaptable and that is suitable for dynamically modeling and predicting the gross enrollment ratio in higher education.
In recent years, researchers have carried out a great deal of work on modeling approximately nonhomogeneous exponential sequences. These studies fall into three categories according to their modeling ideas. The first optimizes the structure of the GM(1,1) model so that the final reduction formula of the model takes an approximately nonhomogeneous exponential form. The NHGM(1,1,k) model [9] and the SIGM model [10] each extend the basic form of the GM(1,1) model, and the KRNGM model [11] introduces a nonlinear function into the whitening differential equation of the GM(1,1) model. The second adapts the modeling object: a nonhomogeneous exponential sequence can be converted into a homogeneous exponential sequence by taking the difference between adjacent elements of the original sequence. The third is the direct modeling method. Wang [12] proposed a direct modeling method in which the GM(1,1) model is built from the original data without accumulation, and this method has been gradually optimized [13, 14]. Ye and Li [15] established a whitening weight gray prediction model to measure the influence of the probability of interval gray numbers on the prediction results. Zeng and Liu [16] established a direct modeling method for DGM(1,1) based on the original sequence by omitting the accumulation and subtraction processes. In addition, the models mentioned above have been applied in various fields. Xiao et al. [17–20] studied parameter optimization of the Grey Riccati model and its application to energy consumption and carbon prediction. A new structure of the Gray Verhulst model was proposed by Xiao et al. [21–24], which improves the ability of the gray model to model saturated S-shaped sequences and has been applied to forecasting China’s tight gas production.
The above optimization models for approximately nonhomogeneous exponential sequences have better properties, stronger modeling capabilities, and a wider range of applications than the GM(1,1) model. The parameters of a gray prediction model (such as the order, background value coefficient, and initial value) are crucial to its performance. However, these performance parameters are often fixed at a specific value in the above models. Optimizing the performance parameters is therefore a key means of improving the stability, applicability, and flexibility of the model. Many studies have addressed the optimization of performance parameters from several directions, such as the initial value [25–27], order [28, 29], and background value [30, 31]. The optimal value of each performance parameter should minimize the sum of squared simulation errors of the model, and the optimization process requires a large amount of calculation. The Particle Swarm Algorithm (PSA), which searches for the global optimum by following the best solution found so far, provides one way to optimize these parameters. Compared with other modern optimization methods, the PSA requires few parameters to be adjusted, is simple and easy to implement, and converges quickly, and it has become a research hotspot in the field of modern optimization methods [32–35]. Zeng and Liu [36] proposed the SAIGM_FO model and used the PSA to optimize the order of the model. Wang and Li [37] used the PSA to optimize the structural parameters (a, b) of the model. These studies show that using the PSA in gray prediction models improves modeling accuracy and model flexibility.
This study proposes, for the first time, applying the PSA to the Three-parameter Whitenization Grey Model (TWGM(1,1)) [38] to optimize the order and the background value coefficient under the condition of the minimum sum of squared simulation errors. The approach not only removes the restrictions that the accumulation order is fixed at “1” and the background value coefficient is fixed at “0.5” but is also flexible, adaptable, and dynamically adjustable when applied to modeling China’s higher education gross enrollment ratio (GER). The empirical analysis shows that the model has better overall simulation and prediction performance than the comparison models. On the one hand, parameter optimization significantly improves the performance of the model itself; on the other hand, its intelligent and adjustable adaptivity improves accuracy and further extends its range of application. The rest of the study is organized as follows. Section 2 describes the original Three-parameter Whitenization Grey Model. Section 3 presents the parameter optimization process of the model, and Section 4 applies the optimized model to China’s higher education gross enrollment rate. The conclusions are given in the last section.
2. Original TWGM(1,1) Model
Following the derivation of the time-response function of the classic GM(1,1) model, the whitenization equation of the Three-parameter Whitenization Grey Model is first established, and the time-response function is then derived by solving this differential equation. The model is called the Three-parameter Whitenization Grey Model, or the TWGM(1,1) model for short.
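Definition 1. Suppose the sequence $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$ is the original data sequence, where $x^{(0)}(k) \ge 0$, $k = 1, 2, \ldots, n$, and $X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right)$ is the 1-AGO sequence of $X^{(0)}$, where
$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n. \tag{1}$$
$Z^{(1)} = \left(z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)\right)$ is the immediate mean-generating sequence of $X^{(1)}$, where
$$z^{(1)}(k) = 0.5\left(x^{(1)}(k) + x^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n. \tag{2}$$
Definition 2. Assuming that the sequences $X^{(0)}$ and $X^{(1)}$ are as shown in Definition 1, then
$$\frac{dx^{(1)}}{dt} + a x^{(1)} = bt + c \tag{3}$$
is the whitening differential equation of the TWGM(1,1) model. The equation
$$x^{(0)}(k) + a z^{(1)}(k) = \frac{1}{2}(2k-1)\,b + c \tag{4}$$
is the basic form of the TWGM(1,1) model.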
According to the modeling ideas of the GM(1,1), the model parameters a, b, and c are estimated through the basic form of the TWGM(1,1) model. The time-response function of the TWGM(1,1) model is obtained by solving the differential equation.
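Theorem 1. Suppose the sequences $X^{(0)}$, $X^{(1)}$, and $Z^{(1)}$ are as shown in Definition 1, $\hat{p} = (a, b, c)^{T}$ is the parameter list, and
$$B = \begin{pmatrix} -z^{(1)}(2) & \frac{3}{2} & 1 \\ -z^{(1)}(3) & \frac{5}{2} & 1 \\ \vdots & \vdots & \vdots \\ -z^{(1)}(n) & \frac{2n-1}{2} & 1 \end{pmatrix}, \quad Y = \begin{pmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{pmatrix}. \tag{5}$$
Then, the least-square estimation parameter list of the TWGM(1,1) model satisfies $\hat{p} = (B^{T}B)^{-1}B^{T}Y$.
Theorem 2. The time-response function of the TWGM(1,1) model is
$$\hat{x}^{(1)}(t) = \left[x^{(0)}(1) - \frac{b}{a} + \frac{b}{a^{2}} - \frac{c}{a}\right] e^{-a(t-1)} + \frac{b}{a}\,t - \frac{b}{a^{2}} + \frac{c}{a}. \tag{6}$$
The inverse-accumulating reduction formula of the TWGM(1,1) model is
$$\hat{x}^{(0)}(k) = \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1) = \left(1 - e^{a}\right)\left[x^{(0)}(1) - \frac{b}{a} + \frac{b}{a^{2}} - \frac{c}{a}\right] e^{-a(k-1)} + \frac{b}{a}, \quad k = 2, 3, \ldots \tag{7}$$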
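The modeling chain of Section 2 (1-AGO, adjacent means, least-squares estimation, time response, inverse accumulation) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' Matlab code: it assumes the basic form $x^{(0)}(k) + a z^{(1)}(k) = \frac{1}{2}(2k-1)b + c$, the helper names `solve`, `twgm_fit`, and `twgm_simulate` are hypothetical, and the test sequence is synthetic (a nonhomogeneous exponential $10 \cdot 1.1^{k} + 5$), not enrollment data.

```python
import math

def solve(A, y):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (M[i][n] - tail) / M[i][i]
    return sol

def twgm_fit(x0):
    """Least-squares estimate of (a, b, c) from the assumed basic form."""
    x1, total = [], 0.0
    for v in x0:                      # 1-AGO sequence
        total += v
        x1.append(total)
    rows, ys = [], []
    for k in range(2, len(x0) + 1):   # k is the 1-based index used in the text
        z = 0.5 * (x1[k - 1] + x1[k - 2])
        rows.append([-z, (2 * k - 1) / 2, 1.0])
        ys.append(x0[k - 1])
    btb = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    bty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(btb, bty)            # normal equations (B^T B) p = B^T Y

def twgm_simulate(x0, params, horizon=0):
    """Time response on the 1-AGO scale, then first-order inverse accumulation."""
    a, b, c = params
    big_k = x0[0] - b / a + b / a**2 - c / a
    n = len(x0) + horizon
    x1_hat = [big_k * math.exp(-a * (k - 1)) + b * k / a - b / a**2 + c / a
              for k in range(1, n + 1)]
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n)]

# Synthetic nonhomogeneous exponential sequence (illustrative only).
x0 = [10 * 1.1**k + 5 for k in range(1, 9)]
params = twgm_fit(x0)
fitted = twgm_simulate(x0, params, horizon=2)   # 8 simulated + 2 predicted values
```

The normal equations are solved directly here because the system is only 3 by 3; a library least-squares routine would be the more robust choice for ill-conditioned data.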
3. Parameter Optimization of the Three-Parameter Whitenization Gray Model
3.1. Background Value Optimization
In the modeling process of the TWGM(1,1) model, the adjacent-mean generated sequence is used as the background value, which is a common smoothing method for weakening the influence of extreme values or outliers in the 1-AGO sequence on the magnitude of the gray effect. The background value coefficient is the weight given to adjacent elements when the adjacent-mean sequence is constructed. Different values of this coefficient change the adjacent-mean sequence and therefore affect the simulation and prediction performance of the model. In the modeling process above, the background value coefficient is fixed at 0.5. This treatment lacks flexibility and cannot guarantee that the model achieves its best simulation performance, so the background value of the original model needs to be optimized. This is done mainly by optimizing the background value coefficient, for which the definition of the adjacent-mean generated sequence is generalized as follows.
Definition 3. Suppose that $X^{(1)}$ is as shown in Definition 1, $Z^{(1)}_{\alpha} = \left(z^{(1)}_{\alpha}(2), z^{(1)}_{\alpha}(3), \ldots, z^{(1)}_{\alpha}(n)\right)$ is the optimized sequence generated by the immediate mean of $X^{(1)}$, and $\alpha$ is the background value coefficient, $0 \le \alpha \le 1$, where
$$z^{(1)}_{\alpha}(k) = \alpha\, x^{(1)}(k) + (1 - \alpha)\, x^{(1)}(k-1), \quad k = 2, 3, \ldots, n. \tag{8}$$
Then,
$$x^{(0)}(k) + a z^{(1)}_{\alpha}(k) = \frac{1}{2}(2k-1)\,b + c \tag{9}$$
is the TWGM(1,1) model with the background value coefficient $\alpha$.
The optimal value $\alpha^{*}$ of the background value coefficient should minimize the sum of squared simulation errors of the model, namely,
$$\min_{\alpha} \sum_{k=1}^{n} \left(\hat{x}^{(0)}(k) - x^{(0)}(k)\right)^{2}. \tag{10}$$
In equation (10), $x^{(0)}(k)$ is the original modeling data and $\hat{x}^{(0)}(k)$ is the simulated data of the model.
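As a small illustration of the weighted background value and the squared-error criterion, the sketch below assumes the convention $z(k) = \alpha x^{(1)}(k) + (1-\alpha) x^{(1)}(k-1)$; some authors place the weight on the other term, and the two conventions coincide at $\alpha = 0.5$. The function names and the toy 1-AGO sequence are hypothetical; the value 0.4538 is the optimized coefficient reported in Section 4.

```python
def background_values(x1, alpha=0.5):
    """Weighted background values z(k) = alpha*x1(k) + (1-alpha)*x1(k-1), k = 2..n."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return [alpha * x1[k] + (1.0 - alpha) * x1[k - 1] for k in range(1, len(x1))]

def sum_squared_error(actual, simulated):
    """Objective to be minimized over the background value coefficient."""
    return sum((x - s) ** 2 for x, s in zip(actual, simulated))

x1 = [1.0, 3.0, 6.0, 10.0]                     # toy 1-AGO sequence
z_mean = background_values(x1)                 # classic adjacent mean, alpha = 0.5
z_opt = background_values(x1, alpha=0.4538)    # shifted toward the earlier value
```

With $\alpha < 0.5$ each background value moves toward the earlier accumulated value, which changes the first column of the matrix $B$ and hence the estimated parameters.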
3.2. Order Optimization
In the original modeling process, the cumulative order is fixed at “1,” which leads to poor flexibility and adaptability of the model. Therefore, Wu et al. [39, 40] introduced fractional order into gray modeling based on the “in between” idea. This realizes the expansion of the cumulative order of the gray prediction model from integer to fraction. Meng et al. [38, 41] used the Gamma function to give the functional expression of the fractional accumulation operator, which provided the basis for constructing the fractional gray prediction model. This section optimizes the order of the TWGM(1,1) model in order to improve model performance and model adaptability.
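Definition 4. Suppose $X^{(0)}$ is as shown in Definition 1 and $X^{(r)}$ is a new sequence; then, sequence $X^{(r)} = \left(x^{(r)}(1), x^{(r)}(2), \ldots, x^{(r)}(n)\right)$ is called the r-order cumulative generating sequence of $X^{(0)}$, where
$$x^{(r)}(k) = \sum_{i=1}^{k} \frac{\Gamma(r + k - i)}{\Gamma(k - i + 1)\,\Gamma(r)}\, x^{(0)}(i), \quad k = 1, 2, \ldots, n. \tag{11}$$
Then, the sequence $Z^{(r)} = \left(z^{(r)}(2), z^{(r)}(3), \ldots, z^{(r)}(n)\right)$ is called the adjacent mean-value generating sequence of $X^{(r)}$, where
$$z^{(r)}(k) = 0.5\left(x^{(r)}(k) + x^{(r)}(k-1)\right), \quad k = 2, 3, \ldots, n. \tag{12}$$
Then,
$$x^{(r-1)}(k) + a z^{(r)}(k) = \frac{1}{2}(2k-1)\,b + c, \quad x^{(r-1)}(k) = x^{(r)}(k) - x^{(r)}(k-1), \tag{13}$$
is the TWGM(1,1) model of r order.
The optimal value $r^{*}$ of the order should minimize the sum of squared simulation errors of the model, namely,
$$\min_{r} \sum_{k=1}^{n} \left(\hat{x}^{(0)}(k) - x^{(0)}(k)\right)^{2}. \tag{14}$$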
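A Gamma-function form of the fractional accumulation operator can be sketched as follows. The weight $\Gamma(r+k-i) / (\Gamma(k-i+1)\Gamma(r))$ used here is the expression commonly given in the fractional gray modeling literature and is assumed, not confirmed, to be the one intended in this section; the function name is hypothetical and the sketch is valid only for $r > 0$.

```python
import math

def frac_accumulate(x0, r):
    """r-order accumulation with Gamma-function weights (assumes r > 0)."""
    if r <= 0:
        raise ValueError("this sketch assumes a positive order r")
    out = []
    for k in range(1, len(x0) + 1):            # 1-based index k as in the text
        s = 0.0
        for i in range(1, k + 1):
            w = math.gamma(r + k - i) / (math.gamma(k - i + 1) * math.gamma(r))
            s += w * x0[i - 1]
        out.append(s)
    return out

x0 = [1.0, 2.0, 3.0, 4.0]
first_order = frac_accumulate(x0, 1.0)     # ordinary cumulative sum
second_order = frac_accumulate(x0, 2.0)    # cumulative sum applied twice
fractional = frac_accumulate(x0, 1.1127)   # order reported in Section 4
```

For $r = 1$ every weight equals 1, so the operator reduces to the 1-AGO of Definition 1; orders slightly above 1 give more weight to older observations than the plain cumulative sum does.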
3.3. Simultaneous Optimization of Background Value and Order
Based on the above optimization analysis, this section derives and constructs a new TWGM(1,1) model under the condition of simultaneous optimization of background value and order. The new model is called the TWGM(1,1|α,r) model.
3.3.1. Definition
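Definition 5. Suppose the sequences $X^{(0)}$ and $X^{(r)}$ are as shown in Definitions 1 and 4; then,
$$\frac{dx^{(r)}}{dt} + a x^{(r)} = bt + c \tag{15}$$
is called the whitening differential equation of the TWGM(1,1|α,r) model.
The sequence $Z^{(r)}_{\alpha} = \left(z^{(r)}_{\alpha}(2), z^{(r)}_{\alpha}(3), \ldots, z^{(r)}_{\alpha}(n)\right)$ is called the optimized sequence generated by the adjacent mean value of $X^{(r)}$ under the condition of the background value coefficient $\alpha$, where
$$z^{(r)}_{\alpha}(k) = \alpha\, x^{(r)}(k) + (1 - \alpha)\, x^{(r)}(k-1), \quad k = 2, 3, \ldots, n. \tag{16}$$
Integrating both ends of formula (15) over the interval $[k-1, k]$, we obtain
$$\int_{k-1}^{k} dx^{(r)} + a \int_{k-1}^{k} x^{(r)}\, dt = b \int_{k-1}^{k} t\, dt + c \int_{k-1}^{k} dt. \tag{17}$$
Because
$$\int_{k-1}^{k} dx^{(r)} = x^{(r)}(k) - x^{(r)}(k-1) = x^{(r-1)}(k), \quad \int_{k-1}^{k} t\, dt = \frac{2k-1}{2}, \quad \int_{k-1}^{k} dt = 1, \tag{18}$$
and the size of $\int_{k-1}^{k} x^{(r)}\, dt$ can be approximately replaced by the area represented by $z^{(r)}_{\alpha}(k)$, namely,
$$\int_{k-1}^{k} x^{(r)}\, dt \approx z^{(r)}_{\alpha}(k), \tag{19}$$
equation (15) can be converted to
$$x^{(r-1)}(k) + a z^{(r)}_{\alpha}(k) = \frac{1}{2}(2k-1)\,b + c. \tag{20}$$
Equation (20) is the basic form of the TWGM(1,1|α,r) model.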
3.3.2. Parameter Estimation
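Theorem 3. Suppose the sequences $X^{(0)}$, $X^{(r)}$, and $Z^{(r)}_{\alpha}$ are as shown in Definitions 1, 4, and 5, $\hat{p} = (a, b, c)^{T}$ is the parameter list, and
$$B = \begin{pmatrix} -z^{(r)}_{\alpha}(2) & \frac{3}{2} & 1 \\ -z^{(r)}_{\alpha}(3) & \frac{5}{2} & 1 \\ \vdots & \vdots & \vdots \\ -z^{(r)}_{\alpha}(n) & \frac{2n-1}{2} & 1 \end{pmatrix}, \quad Y = \begin{pmatrix} x^{(r-1)}(2) \\ x^{(r-1)}(3) \\ \vdots \\ x^{(r-1)}(n) \end{pmatrix}. \tag{21}$$
Then, the least-square estimation parameter list of the TWGM(1,1|α,r) model satisfies $\hat{p} = (B^{T}B)^{-1}B^{T}Y$.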
3.3.3. Time-Response Derivation
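The time-response function is derived from the whitening differential equation of the TWGM(1,1|α,r) model. According to formula (15), the corresponding homogeneous equation is
$$\frac{dx^{(r)}}{dt} + a x^{(r)} = 0. \tag{22}$$
Then,
$$\frac{dx^{(r)}}{x^{(r)}} = -a\, dt. \tag{23}$$
The general solution of homogeneous equation (22) is
$$x^{(r)}(t) = C e^{-at}. \tag{24}$$
Using the constant variation method, replace $C$ in formula (24) with $C(t)$, and let
$$x^{(r)}(t) = C(t)\, e^{-at}. \tag{25}$$
Differentiating both ends of equation (25) with respect to t:
$$\frac{dx^{(r)}}{dt} = C'(t)\, e^{-at} - a\, C(t)\, e^{-at}. \tag{26}$$
Substituting equation (26) into formula (15), we obtain
$$C'(t)\, e^{-at} - a\, C(t)\, e^{-at} + a\, C(t)\, e^{-at} = bt + c. \tag{27}$$
According to equation (27), $C'(t) = (bt + c)\, e^{at}$, so
$$C(t) = \int (bt + c)\, e^{at}\, dt + C. \tag{28}$$
Substituting equation (28) into (25), we obtain
$$x^{(r)}(t) = e^{-at} \left[\int (bt + c)\, e^{at}\, dt + C\right]. \tag{29}$$
Evaluating the integral in the above equation,
$$x^{(r)}(t) = C e^{-at} + \frac{b}{a}\,t - \frac{b}{a^{2}} + \frac{c}{a}, \tag{30}$$
where $C$ is a constant to be determined. The initial value $x^{(r)}(1) = x^{(0)}(1)$ is known, so setting $t = 1$ in equation (30) gives
$$x^{(0)}(1) = C e^{-a} + \frac{b}{a} - \frac{b}{a^{2}} + \frac{c}{a}. \tag{31}$$
We obtain
$$C = \left[x^{(0)}(1) - \frac{b}{a} + \frac{b}{a^{2}} - \frac{c}{a}\right] e^{a}. \tag{32}$$
Substituting equation (32) into (30), we obtain
$$\hat{x}^{(r)}(t) = \left[x^{(0)}(1) - \frac{b}{a} + \frac{b}{a^{2}} - \frac{c}{a}\right] e^{-a(t-1)} + \frac{b}{a}\,t - \frac{b}{a^{2}} + \frac{c}{a}, \tag{33}$$
where $t = 2, 3, \ldots$. Equation (33) is the time-response formula, which completes the derivation.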
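A quick numerical check can confirm that a closed-form time response really solves the whitening equation. The sketch below assumes the whitening equation $dx/dt + ax = bt + c$ with initial value $x(1) = x^{(0)}(1)$ and the solution $x(t) = K e^{-a(t-1)} + \frac{b}{a}t - \frac{b}{a^{2}} + \frac{c}{a}$ with $K = x^{(0)}(1) - \frac{b}{a} + \frac{b}{a^{2}} - \frac{c}{a}$; the parameter values are arbitrary illustrative numbers, not estimates from the paper.

```python
import math

def time_response(t, a, b, c, x_first):
    """Closed-form solution of dx/dt + a*x = b*t + c with x(1) = x_first (a != 0)."""
    big_k = x_first - b / a + b / a**2 - c / a
    return big_k * math.exp(-a * (t - 1)) + (b / a) * t - b / a**2 + c / a

# Arbitrary illustrative parameters (hypothetical, not fitted values).
a, b, c, x_first = -0.08, 0.5, 2.0, 3.0

# Central finite difference approximates dx/dt at t = 4.
t, h = 4.0, 1e-5
derivative = (time_response(t + h, a, b, c, x_first)
              - time_response(t - h, a, b, c, x_first)) / (2 * h)

# Residual of the whitening equation; it should vanish up to rounding error.
residual = derivative + a * time_response(t, a, b, c, x_first) - (b * t + c)
initial_gap = time_response(1.0, a, b, c, x_first) - x_first
```

Both `residual` and `initial_gap` should be zero up to floating-point error, which is a cheap way to catch sign mistakes in the constant-variation derivation.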
3.3.4. Derivation of Accumulative Reduction
Definition 6. Suppose sequence is as shown in Definition 1 and if , then sequence is called the r-order accumulative generating sequence of , where
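Theorem 4. Suppose sequence $X^{(0)}$ is as defined in Definition 1, $p, q \in R^{+}$, $X^{(q)}$ is the q-order cumulative generating sequence of $X^{(0)}$, $X^{(p)}$ is the p-order cumulative generating sequence of $X^{(0)}$, $X^{(p+q)}$ is the (p + q)-order cumulative generating sequence of $X^{(0)}$, $X^{(p)(q)}$ is the q-order cumulative generating sequence of $X^{(p)}$, and $X^{(q)(p)}$ is the p-order cumulative generating sequence of $X^{(q)}$; then, the multiple cumulative generating operator satisfies the commutative law and the exponential law, namely,
$$x^{(p)(q)}(k) = x^{(q)(p)}(k) = x^{(p+q)}(k). \tag{35}$$
Corollary 1. According to Theorem 4, it can be derived that
$$\left(x^{(r)}\right)^{(-r)}(k) = x^{(0)}(k). \tag{36}$$
According to Corollary 1, that is, equation (36), the final reduction formula of the TWGM(1,1|α,r) model can be derived as
$$\hat{x}^{(0)}(k) = \left(\hat{x}^{(r)}\right)^{(-r)}(k). \tag{37}$$
In equation (37), when $k \le n$, $\hat{x}^{(0)}(k)$ is called the simulated value, and when $k > n$, $\hat{x}^{(0)}(k)$ is called the predicted value.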
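The commutative and exponential laws of the accumulation operator, and the reduction by an accumulation of order $-r$, can be checked numerically. The sketch below is an assumption-laden illustration: it builds the accumulation weights by the recursion $w_0 = 1$, $w_j = w_{j-1}(r + j - 1)/j$, which equals $\Gamma(r+j) / (\Gamma(j+1)\Gamma(r))$ for $r > 0$ and stays well defined for negative orders; the names `frac_ago` and `close` and the sample sequence are hypothetical.

```python
def frac_ago(x, r):
    """r-order accumulation; recursive weights allow negative (inverse) orders."""
    w = [1.0]
    for j in range(1, len(x)):
        w.append(w[-1] * (r + j - 1) / j)
    return [sum(w[k - i] * x[i] for i in range(k + 1)) for k in range(len(x))]

def close(u, v, tol=1e-9):
    """Element-wise comparison of two equally long sequences."""
    return all(abs(s - t) <= tol for s, t in zip(u, v))

x0 = [2.0, 3.0, 5.0, 4.0, 6.0]      # arbitrary illustrative sequence
p, q = 0.6, 0.7

pq = frac_ago(frac_ago(x0, p), q)   # order p first, then order q
qp = frac_ago(frac_ago(x0, q), p)   # order q first, then order p
direct = frac_ago(x0, p + q)        # single accumulation of order p + q

# Reduction: accumulating by r and then by -r restores the original sequence.
restored = frac_ago(frac_ago(x0, 1.1127), -1.1127)
```

The three sequences `pq`, `qp`, and `direct` agree, and `restored` matches `x0`, because the weights are the series coefficients of $(1 - z)^{-r}$ and such series multiply by adding exponents.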
3.4. TWGM(1,1) Model Parameter Optimization Based on PSO Algorithm
In the TWGM(1,1|α,r) model, there are two undetermined parameters (the background value coefficient α and the order r). To obtain the best performance of the model, the optimal values of both parameters are required. Solving for the two parameters jointly is prone to error because the parameters influence and interfere with each other, and it also weakens the optimization effect of each individual parameter on the model. Therefore, a step-by-step method is adopted. Specifically, first, the order is held fixed at r = 1, and the optimal background value coefficient α∗ is solved within its value range. Second, with the background value coefficient fixed at α∗, the optimal order r∗ is solved within its value range. According to Definitions 3 and 4, the optimal background value coefficient and the optimal order are found within their respective value ranges and should be obtained under the condition that the sum of squared simulation errors of the model is minimized, that is, they should satisfy equations (10) and (14).
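The step-by-step strategy can be sketched end to end as follows. Several things here are assumptions made only for illustration: a plain grid search stands in for the particle swarm search, the background value is taken as $z(k) = \alpha x^{(r)}(k) + (1-\alpha) x^{(r)}(k-1)$, the search ranges $[0, 1]$ for α and $[0.8, 1.2]$ for r are arbitrary, the data are synthetic, and all function names are hypothetical.

```python
import math

def frac_ago(x, r):
    """r-order accumulation; recursive weights allow negative (inverse) orders."""
    w = [1.0]
    for j in range(1, len(x)):
        w.append(w[-1] * (r + j - 1) / j)
    return [sum(w[k - i] * x[i] for i in range(k + 1)) for k in range(len(x))]

def solve(A, y):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (M[i][n] - tail) / M[i][i]
    return sol

def sse(x0, alpha, r):
    """Sum of squared simulation errors of the model for given alpha and r."""
    n = len(x0)
    xr = frac_ago(x0, r)
    rows, ys = [], []
    for k in range(2, n + 1):                      # 1-based index k
        z = alpha * xr[k - 1] + (1 - alpha) * xr[k - 2]
        rows.append([-z, (2 * k - 1) / 2, 1.0])
        ys.append(xr[k - 1] - xr[k - 2])           # (r-1)-order value
    btb = [[sum(u[i] * u[j] for u in rows) for j in range(3)] for i in range(3)]
    bty = [sum(u[i] * y for u, y in zip(rows, ys)) for i in range(3)]
    try:
        a, b, c = solve(btb, bty)
        big_k = x0[0] - b / a + b / a**2 - c / a
        xr_hat = [big_k * math.exp(-a * (k - 1)) + b * k / a - b / a**2 + c / a
                  for k in range(1, n + 1)]
    except (ZeroDivisionError, OverflowError):
        return math.inf                            # degenerate parameter set
    x0_hat = frac_ago(xr_hat, -r)                  # reduction by order -r
    err = sum((u - v) ** 2 for u, v in zip(x0, x0_hat))
    return err if math.isfinite(err) else math.inf

# Synthetic, illustrative data (not the enrollment series of Section 4).
x0 = [16.2, 17.0, 18.5, 19.4, 21.3, 22.5, 24.8, 26.1, 28.9, 30.6]

alphas = [i / 100 for i in range(101)]             # assumed range [0, 1]
orders = [i / 100 for i in range(80, 121)]         # assumed range [0.8, 1.2]

alpha_star = min(alphas, key=lambda al: sse(x0, al, 1.0))    # step 1: r fixed at 1
r_star = min(orders, key=lambda r: sse(x0, alpha_star, r))   # step 2: alpha fixed

baseline = sse(x0, 0.5, 1.0)                       # unoptimized model
optimized = sse(x0, alpha_star, r_star)
```

Because both grids contain the default values 0.5 and 1, `optimized` can never exceed `baseline`; how much smaller it is depends on the data.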
Obviously, the optimization of each undetermined parameter is time consuming and occupies limited computing resources. Swarm optimization algorithms (such as Particle Swarm Optimization and the ant colony algorithm) provide good solutions to complex distributed optimization problems. Particle Swarm Optimization (PSO), the Particle Swarm Algorithm (PSA) referred to above, is a swarm-intelligence global optimization algorithm whose basic concept comes from the study of the foraging behavior of bird flocks. The algorithm has the advantages of a simple structure, few parameters, and easy programming. In addition, the PSO variant with adaptive mutation based on the variance of the population fitness effectively addresses premature convergence and can significantly improve global convergence performance. It has been widely used in function optimization, neural network training, engineering, and other fields [28–31]. Therefore, this paper uses PSO to optimize the background value coefficient and the order step by step. The solving steps of the PSO algorithm are as follows:
Step 1. Randomly initialize the position and velocity of the particles in the swarm.
Step 2. Set the personal best of each particle to its current position, and set the global best to the position of the best particle in the initial population.
Step 3. Calculate the average relative simulation error of the TWGM(1,1|α,r) model for the candidate value of α (or r) represented by the particle:
Step 3.1. Calculate the cumulative sequence of order r.
Step 3.2. Calculate the adjacent-mean generated sequence with the background value coefficient α.
Step 3.3. Construct the matrices B and Y, and solve for the model parameters (a, b, c).
Step 3.4. Calculate the simulated values.
Step 3.5. Calculate the average relative simulation error.
Step 3.6. Judge whether the average relative simulation error is less than the given convergence value; if so, go to Step 9. Otherwise, execute Step 4.
Step 4. Perform the following operations for all particles in the swarm:
Step 4.1. Update the particle position and velocity.
Step 4.2. If the fitness of the particle is better than the fitness of its personal best, set the personal best to the new position.
Step 4.3. If the fitness of the particle is better than the fitness of the global best, set the global best to the new position.
Step 5. Calculate the variance of the group fitness and the fitness of the global best.
Step 6. Calculate the mutation probability.
Step 7. Generate a random number in [0, 1]; if it is smaller than the mutation probability, execute the mutation operation. Otherwise, go to Step 8.
Step 8. Determine whether the convergence criterion of the algorithm is satisfied; if so, execute Step 9. Otherwise, return to Step 3.
Step 9. Output the optimal value of the background value coefficient α (or the order r) together with the simulation and prediction data of the TWGM(1,1|α,r) model at this value; the algorithm then ends.
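The core of the search loop can be sketched with a standard global-best particle swarm for a single bounded parameter. This is not the adaptive-mutation variant described in the steps above: the variance-based mutation of Steps 5 to 7 is omitted, and the inertia weight, acceleration constants, swarm size, and the name `pso_minimize` are illustrative assumptions. The toy objective merely stands in for the simulation error as a function of α or r.

```python
import random

def pso_minimize(f, lo, hi, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO on the interval [lo, hi] (no mutation step)."""
    rng = random.Random(seed)
    span = hi - lo
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [rng.uniform(-span, span) * 0.1 for _ in range(n_particles)]
    pbest = pos[:]                                  # personal best positions
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]       # global best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))   # keep inside the bounds
            val = f(pos[i])
            if val < pbest_val[i]:                  # update personal best
                pbest[i], pbest_val[i] = pos[i], val
                if val < gbest_val:                 # update global best
                    gbest, gbest_val = pos[i], val
    return gbest, gbest_val

# Toy objective standing in for the simulation error as a function of alpha.
best_alpha, best_err = pso_minimize(lambda al: (al - 0.3) ** 2, 0.0, 1.0)
```

In the setting of this section, `f` would be the sum of squared simulation errors evaluated at a candidate α with r = 1, and then at a candidate r with α fixed at its optimum.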
4. Model Application: Forecast of China’s Higher Education Gross Enrollment Rate
To identify the effect of optimizing each parameter, the background value coefficient α and the order r are simulated in four groups. The first group is the model with α = 0.5 and r = 1, that is, the TWGM(1,1|0.5,1) model; this is the original TWGM(1,1) model, in which neither the order nor the background value coefficient is optimized. The second group is the model with α = 0.5 and r = r∗, that is, the TWGM(1,1|0.5,r∗) model, in which only the order is optimized. The third group is the model with α = α∗ and r = 1, that is, the TWGM(1,1|α∗,1) model, in which only the background value coefficient is optimized. The last group is the model with α = α∗ and r = r∗, that is, the TWGM(1,1|α∗,r∗) model, in which the order and the background value coefficient are both optimized. The simulation results obtained with Matlab for the four parameter groups are shown in Table 1, where the original sequence is China’s higher education gross enrollment rate data from 1991 to 2019.
To reflect the simulation effects of the different models intuitively, the simulation curves of the models are compared in Figure 1, where models with different parameter values show different simulation effects. The curves of the TWGM(1,1|0.5,1) and TWGM(1,1|α∗,1) models overlap and show the same trend. On initial inspection, the TWGM(1,1|α∗,r∗) model shows better simulation and prediction effects for the gross enrollment ratio in higher education in China, since its results are the closest to the raw data.
To measure the optimization performance of the different parameters, the average relative simulation percentage error, the average relative prediction percentage error, and the comprehensive percentage error of each model are calculated from the results in Table 1. The results are shown in Table 2. The relevant calculation equations are as follows:
Among them, represents the simulation/prediction error of , represents the number of experimental data, and represents the number of experimental data used for simulation.
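The three error measures can be sketched as below. The exact formulas are not recoverable from the text, so this block assumes one common convention: each measure is a mean absolute relative error in percent, the first data point is excluded because the model reproduces it exactly, the simulation error averages points 2 to m, the prediction error averages points m + 1 to n, and the comprehensive error averages points 2 to n. The function name and the numbers are hypothetical.

```python
def percentage_errors(actual, fitted, m):
    """Return (simulation, prediction, comprehensive) mean relative errors in percent.

    actual, fitted: sequences of length n; the first m points were used for modeling.
    Assumed convention: the first point is excluded from the averages.
    """
    n = len(actual)
    if not 2 <= m < n:
        raise ValueError("need at least 2 modeling points and 1 prediction point")
    rel = [abs(x - f) / abs(x) * 100.0 for x, f in zip(actual, fitted)]
    simulation = sum(rel[1:m]) / (m - 1)
    prediction = sum(rel[m:]) / (n - m)
    comprehensive = sum(rel[1:]) / (n - 1)
    return simulation, prediction, comprehensive

# Hypothetical numbers for illustration only.
actual = [10.0, 20.0, 40.0, 50.0]
fitted = [10.0, 22.0, 38.0, 55.0]
sim_err, pred_err, comp_err = percentage_errors(actual, fitted, m=3)
```

Under this convention the comprehensive error is a count-weighted average of the other two, so it always lies between the simulation error and the prediction error.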
Based on the results in Table 2, the relative percentage errors for the different parameter values are compared in Figure 2 in the supplementary materials. From Table 2 and Figure 2, the following conclusions are drawn:
(1) The background value optimization effect: comparing the TWGM(1,1|0.5,1) model with the TWGM(1,1|α∗,1) model with α∗ = 0.4538, the comprehensive percentage error of the gray model after the background value optimization is slightly lower than that of the original model, but the difference is small. This suggests that background value optimization improves the comprehensive performance of the model for China’s higher education gross enrollment rate, but the improvement is not significant.
(2) The order optimization effect: comparing the TWGM(1,1|0.5,1) model with the TWGM(1,1|0.5,r∗) model with r∗ = 1.1127 in Figure 2, the average relative percentage prediction error and the comprehensive percentage error after the order optimization are significantly lower than those of the original model, which shows that order optimization markedly improves model performance in simulating China’s higher education gross enrollment rate.
(3) The TWGM(1,1|α∗,r∗) model has the lowest error, 3.7782%, which indicates that optimizing both the background value and the order is the best option for improving model performance. Thus, the TWGM(1,1|α∗,r∗) model provides the best fit and is preferred for simulation and prediction.
5. Conclusions
This study proposed a new optimization model that combines the Three-parameter Whitenization Grey Model with the PSA. To demonstrate the performance of the new model, it was applied to the prediction of China’s higher education gross enrollment ratio using Matlab simulation. The findings suggest that the new optimization model markedly improves the performance of the original gray model. Specifically, compared with the original gray model, the average relative percentage prediction error and the comprehensive percentage error after optimization are significantly reduced, which indicates that the simultaneous optimization of the background value and the order notably improves overall performance in simulating and predicting China’s higher education gross enrollment ratio. In addition, the intelligent and adjustable adaptivity of the model improves its accuracy and extends its potential application in future empirical studies.
Data Availability
The data (High Education Gross Enrollment Ratio of China) can be found in the supplementary information files or in “The Education Statistic Yearbook of China,” Chapter 1, Page 21: Gross Enrollment Ratio of Education by Level (https://data.cnki.net/trade/Yearbook/Single/N2020070382?z=Z017).
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The study is supported by Fundamental Research Funds for the Central Scientific Research Institutes of NIES, “Research on the evaluation of first-class undergraduate education,” No. GYB2019002.
Supplementary Materials
Figure 1: Comparison of simulation curves of different models. Figure 2: Comparison of relative percentage errors of different models. (Supplementary Materials)