Abstract

In view of the poor timeliness of dynamic blood glucose data and the delayed effect of insulin on blood glucose control, and considering the nonlinearity and nonstationarity of glucose data, a new blood Glucose Prediction algorithm combining Correlation coefficient-based complete ensemble empirical mode decomposition with adaptive noise and a back propagation neural network (GPCEMBP) was proposed to extend the prediction horizon and improve prediction accuracy. It refined the mode decomposition procedure and integrated a correlation-based mode filter to extract the characteristic intrinsic mode functions from the original signal. A new neural network prediction model was constructed by optimizing the number of hidden layer neurons, the number of hidden layers, the activation functions, the number of inputs, and the overall structure. Finally, the predicted blood glucose value was reconstructed by phase space reconstruction technology. Ablation and comparison experiments demonstrated that the GPCEMBP algorithm had better prediction accuracy, convergence, and robustness for blood glucose prediction within 84 min. In addition, it adapts well to glucose data of different quality.

1. Introduction

In 2019, the International Diabetes Federation (IDF) released data on the distribution and trends of the number of adult diabetes patients (20–79 years old) worldwide for 2019, 2030, and 2045 [1]. The number of diabetics reached 463 million in 2019 and was expected to reach 510 million in 2030 and 540 million in 2045. In addition, the number of diabetics in the working-age group (20–64 years old) reached about 350 million in 2019 and was projected to reach 417 million by 2030 and 486 million by 2045. In 2019, diabetes and its complications caused about 4.2 million adult deaths, accounting for 11.3% of all deaths worldwide.

The course of diabetes is long, and persistent hyperglycemia, hypoglycemia, and frequent glucose fluctuations readily damage tissues and organs throughout the body and can even cause death [2, 3]. An artificial pancreas is generally considered the most ideal treatment for diabetes and can effectively control and stabilize the patient's glucose level [4]. The artificial pancreas generally consists of three parts: (1) a continuous glucose monitor (CGM); (2) a glucose controller (CG); (3) a continuous insulin injection pump (CIIP) [5]. As the key technology connecting the CGM and the CG in the artificial pancreas, glucose prediction is mainly used to compensate for the time deviation caused by the delays of continuous glucose monitoring, diet, and drug metabolism, so as to achieve model predictive control of glucose. The CG then calculates the CIIP command in advance based on the predicted glucose data to implement effective and continuous insulin therapy.

The time series of a patient's blood glucose concentration is nonlinear and nonstationary, and many factors influence glucose prediction. Therefore, according to these glucose characteristics, domestic and foreign scholars mainly carry out data prediction research based on two kinds of models. The first is the physiological model, which builds a metabolic kinetic model based on the intake, decomposition, consumption, and storage of sugar in the body and the action mechanism of insulin. Early studies of physiological models mainly focused on dynamic models of the production and mechanism of action of insulin, dietary glucose absorption and glycogen decomposition, and glucose action. At present, the widely recognized dynamic models of insulin and glucose mainly include the Dalla Man [6], Hovorka [7], and Cobelli [8] models. Dalla Man [6] proposed a new in silico dietary simulation model that analyzed the various glucose and insulin flux relationships occurring during a meal. Hovorka proposed a nonlinear physiological prediction model that represented the glucoregulatory system with submodels of the subcutaneously administered short-acting insulin Lispro and of gut absorption. Cobelli compared compartmental and noncompartmental models from the physiological perspective of blood glucose to highlight the advantages of the compartmental approach as a more reasonable physiological model. Because the clinical parameters of these models are difficult to collect and confirm, model validation is still immature and needs further study. Compared with the complexity of physiological models, glucose prediction by data-driven models has attracted more researchers' attention.

The second model is a data-driven model, whose data sources are mainly glucose data collected by the CGM and recorded data on diet and insulin injection. These data are more easily accessible, making the data-driven model easier to implement. According to the core prediction algorithm, data-driven models can be divided into linear and nonlinear prediction models. Domestic and foreign studies on linear models mainly include the autoregressive (AR) model. Wang et al. [9] adopted the least squares method and an adaptive AIC criterion to optimize the AR model; the results showed that the AR prediction algorithm could effectively predict blood glucose within 30 min [10]. Subsequently, Yang et al. [11] proposed a new autoregressive integrated moving average (ARIMA) model with an adaptive identification algorithm for the model orders, which brought an average prediction time of 25.5 min for therapeutic action [11]. In addition, Christou et al. [12] proposed a variety of autoregressive moving average (ARMA) prediction models with exogenous inputs; the predicted root-mean-square error (RMSE) for 30, 45, and 60 min was 9.04, 11.84, and 14.82 mg/dl, respectively [12]. As the prediction time becomes longer, the correlation between data points weakens and the performance of linear prediction degrades. Therefore, further studies are needed to predict glucose over a longer prediction horizon (pH).

The nonlinear prediction model is based on the recognition that clinical glucose data vary nonlinearly, and a glucose prediction model is learned and trained by nonlinear fitting. Various neural network algorithms excel at nonlinear fitting, so researchers have used them for prediction. Li et al. [13] proposed a convolutional recurrent neural network (CRNN) glucose prediction model and verified its feasibility. Ali et al. [14] proposed an artificial neural network (ANN) and optimized the evaluation index to improve the prediction accuracy of the algorithm. Dave et al. [15] optimized the RF model by introducing insulin and carbohydrate inputs and changing the amount of historical glucose inputs. Compared with the linear model, there was little difference for 15-min prediction, and the prediction effect beyond 30 min was improved, but the prediction sensitivity still did not exceed 90%. However, the above algorithms introduce a variety of input parameters to improve the pH and accuracy but rarely consider the purity of the original glucose signal. Moreover, the effective 30–45-min pH of most algorithms is too short to cover the delays of glucose detection and food intake.

Based on the clinical data of continuous glucose monitoring, this paper proposes a new data-driven Glucose Prediction algorithm that combines correlation coefficient-based complete ensemble empirical mode decomposition with adaptive noise and back propagation neural network (GPCEMBP). This algorithm could provide a high-accuracy prediction over a longer pH. In addition to the application in blood glucose prediction, the proposed algorithm can also be extended to a variety of multivariate, nonlinear, and nonsteady dynamic prediction intervention application scenarios such as epilepsy seizures [16], wind power generation [17], aquaculture dissolved oxygen [18], traffic flow [19], and so on.

2. Method

In general, the study started by decomposing the glucose signal into multiple time–frequency components, including high-frequency unsteady noise components and low-frequency effective steady-state components. A correlation-based adaptive screening method was then used to distinguish the noise components from the effective components. After the noise components were removed, each effective steady-state component was input into the neural network prediction algorithm to obtain a component prediction. Finally, the glucose prediction was reconstructed from the component predictions.

Based on the above ideas, and in contrast to the wavelet transform, a time–frequency decomposition method that requires the wavelet basis to be chosen in advance, this paper adopts an empirical mode decomposition approach, which is more adaptive and thus better suited to blood glucose signals with large individual differences and many interfering factors. On the basis of an in-depth theoretical analysis of various empirical mode decomposition algorithms, and according to the principles of sufficient decomposition levels and mode "filtering," a new correlation coefficient-based complete ensemble empirical mode decomposition with adaptive noise (CEM) was designed. At the same time, a back propagation neural network (BP) with good predictive performance was designed, and several key parameters of the BP were optimized for glucose signals so as to improve its predictive performance.

2.1. CEEMDAN Algorithm

Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is a good time–frequency domain signal analysis method that is improved from ensemble empirical mode decomposition (EEMD), and the EEMD algorithm adds some noise to reduce the aliasing effect of empirical mode decomposition (EMD). EMD is a basic modal decomposition algorithm proposed by Huang et al. [20] to decompose unstable signals into several relatively stable intrinsic modal functions and a residual component on a time scale. Each intrinsic modal function (IMF) contains different local characteristics of the original data [21]. By adding adaptive white noise to replace the addition of standard white noise each time in the process of EEMD decomposition, the CEEMDAN algorithm reduces the noise residue in each IMF and makes the modal decomposition more complete [22, 23]. The following is the derivation of each mode decomposition algorithm.

First, the decomposition theory of the EMD algorithm is as follows [20]:

(1) Initialize the original signal $x(t)$.

(2) The original signal sequence is decomposed by EMD to obtain the $i$th intrinsic mode component $\mathrm{IMF}_i(t)$, so that

$$x(t) = \sum_{i=1}^{N} \mathrm{IMF}_i(t) + r_N(t),$$

where $r_N(t)$ is the residual component remaining after the $N$th modal component has been extracted and represents the variation trend of the original signal $x(t)$.

(3) Repeat the above two steps for decomposition. Let $u(t)$ and $l(t)$ be the upper and lower envelope sequences of the component being sifted, respectively, and let $m(t)$ be their mean sequence, which satisfies

$$m(t) = \frac{u(t) + l(t)}{2}.$$

When the number of extreme points of the current residue is greater than 2, set the residue as the new signal and return to Step 2; otherwise, break out of the loop and output the decomposition results.
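To make the sifting procedure concrete, the following is a minimal, illustrative Python sketch of EMD (not the authors' implementation): envelopes are built with cubic splines through the interior extrema, their mean is removed repeatedly, and the loop stops when the residue has too few extreme points. The fixed sifting count and boundary handling are simplifying assumptions.

```python
# Minimal, illustrative EMD sketch (not the authors' implementation): cubic-spline
# envelope means are removed repeatedly, then each extracted IMF is subtracted
# from the residue until too few extreme points remain.
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def envelope_mean(x):
    """Mean of the upper and lower cubic-spline envelopes of x (None if too few extrema)."""
    t = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 2 or len(minima) < 2:
        return None
    upper = CubicSpline(np.r_[0, maxima, len(x) - 1], np.r_[x[0], x[maxima], x[-1]])(t)
    lower = CubicSpline(np.r_[0, minima, len(x) - 1], np.r_[x[0], x[minima], x[-1]])(t)
    return (upper + lower) / 2.0

def emd(x, max_imfs=10, sift_iters=10):
    """Decompose x into a list of IMFs plus the final residue."""
    imfs, residue = [], np.asarray(x, float).copy()
    for _ in range(max_imfs):
        if envelope_mean(residue) is None:    # fewer than two extrema left: stop
            break
        h = residue.copy()
        for _ in range(sift_iters):           # fixed-count sifting for simplicity
            m = envelope_mean(h)
            if m is None:
                break
            h = h - m
        imfs.append(h)
        residue = residue - h
    return imfs, residue
```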

The EMD algorithm is prone to modal aliasing when the original signal has an insufficient number of extreme points, while the EEMD algorithm increases the number of extreme points by introducing zero-mean white noise, which cancels out after multiple averaging operations. When the signal-to-noise ratio (SNR) is greater than 10 dB, its layering effect is more prominent. However, the EEMD method has the disadvantage of noise residue in each IMF, and white noise with different amplitudes must be added each time. The CEEMDAN algorithm uses adaptive noise to reduce the noise residue in each IMF.

Second, the decomposition steps of CEEMDAN are as follows [22]:

(1) Initialize the original signal $x(t)$. Adaptive white noise $w^i(t)$ is added to the original signal $x(t)$, where $i = 1, 2, \ldots, I$ denotes the number of noise additions. The $i$th noisy signal can then be expressed as $x^i(t) = x(t) + \varepsilon_0 w^i(t)$, where $\varepsilon_0$ controls the standard deviation of the added white noise.

(2) The signals $x^i(t)$ are decomposed $I$ times by the EMD algorithm to obtain the first mode:

$$\widetilde{\mathrm{IMF}}_1(t) = \frac{1}{I}\sum_{i=1}^{I} \mathrm{IMF}_1^i(t),$$

where the first residue is calculated as $r_1(t) = x(t) - \widetilde{\mathrm{IMF}}_1(t)$.

(3) Construct the new signals $r_1(t) + \varepsilon_1 E_1\!\big(w^i(t)\big)$, $i = 1, 2, \ldots, I$, and carry out $I$ further decompositions, where the operator $E_j(\cdot)$ represents the $j$th mode of a given signal decomposed by EMD. The second mode is then calculated as

$$\widetilde{\mathrm{IMF}}_2(t) = \frac{1}{I}\sum_{i=1}^{I} E_1\!\left(r_1(t) + \varepsilon_1 E_1\big(w^i(t)\big)\right).$$

(4) Repeat the previous two steps; the $k$th residue is calculated as $r_k(t) = r_{k-1}(t) - \widetilde{\mathrm{IMF}}_k(t)$. The $(k+1)$th IMF is then defined as

$$\widetilde{\mathrm{IMF}}_{k+1}(t) = \frac{1}{I}\sum_{i=1}^{I} E_1\!\left(r_k(t) + \varepsilon_k E_k\big(w^i(t)\big)\right).$$

(5) The procedure terminates when the residue can no longer be decomposed; a total of $K$ IMFs are generated, and the final residue is $R(t) = x(t) - \sum_{k=1}^{K}\widetilde{\mathrm{IMF}}_k(t)$, so that $x(t)$ can be expressed as

$$x(t) = \sum_{k=1}^{K}\widetilde{\mathrm{IMF}}_k(t) + R(t).$$
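For readers who want to reproduce this decomposition step, a hedged sketch using the third-party PyEMD package (installable as EMD-signal) is shown below; the package choice, file name, and number of noise realizations are assumptions, not part of the original work.

```python
# Hedged sketch of CEEMDAN decomposition with the third-party PyEMD package
# (pip install EMD-signal); the file name and parameter values are assumptions.
import numpy as np
from PyEMD import CEEMDAN

glucose = np.loadtxt("cgm_series.txt")        # hypothetical CGM series, one value per 3 min

decomposer = CEEMDAN(trials=100)              # number of noise realizations I
imfs = decomposer.ceemdan(glucose)            # rows: IMF_1 ... IMF_K (high to low frequency)
residue = glucose - imfs.sum(axis=0)          # final residue / trend component
print(f"Decomposed into {imfs.shape[0]} IMFs plus a residue")
```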

2.2. CEM Algorithm

In order to achieve better decomposition, the CEEMDAN algorithm takes the global mean of the modes obtained from $I$ EMD decompositions of the noise-added signal as each IMF. Although this effectively reduces the randomness of each superimposed noise, it can also cause mean overfitting. Therefore, in combination with the characteristics of a continuous blood glucose signal, the local mean of the $I$ EMD realizations is used instead. The implementation steps of the Improved CEEMDAN algorithm are as follows:

(1) Initialize the original signal $x(t)$. The local means of the $I$ realizations $x(t) + \varepsilon_0 E_1\!\big(w^i(t)\big)$ are calculated by EMD, and the first residue is obtained as $r_1(t) = \big\langle M\!\big(x(t) + \varepsilon_0 E_1(w^i(t))\big)\big\rangle$, where the operator $M(\cdot)$ represents the local mean of a signal and $\langle\cdot\rangle$ denotes averaging over the $I$ realizations.

(2) According to the above formulas, the first mode is calculated as $\widetilde{d}_1(t) = x(t) - r_1(t)$.

(3) The second residue is calculated as the local mean of the second set of realizations, $r_2(t) = \big\langle M\!\big(r_1(t) + \varepsilon_1 E_2(w^i(t))\big)\big\rangle$, and the second mode is then obtained as $\widetilde{d}_2(t) = r_1(t) - r_2(t)$.

(4) Repeat the third step until the termination of the procedure; the final residue $r_K(t)$ and the final mode $\widetilde{d}_K(t) = r_{K-1}(t) - r_K(t)$ are then calculated.
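A structural sketch of this local-mean recursion is given below. The helper functions local_mean and emd_mode, the fixed noise amplitude eps, and the stopping rule are illustrative stand-ins for the paper's operators M(·) and E_k(·), not the authors' code; the inner EMD calls rely on the PyEMD package.

```python
# Structural sketch of the local-mean recursion described above (illustrative
# assumptions throughout); requires the PyEMD package (pip install EMD-signal).
import numpy as np
from PyEMD import EMD

def local_mean(y):
    """M(y): the signal minus its first EMD mode."""
    modes = EMD().emd(y)
    return y - modes[0] if len(modes) > 1 else y

def emd_mode(y, k):
    """E_k(y): the kth EMD mode of y (zeros if it does not exist)."""
    modes = EMD().emd(y)
    return modes[k - 1] if len(modes) >= k else np.zeros_like(y)

def improved_ceemdan(x, trials=50, eps=0.2, max_modes=12, seed=0):
    rng = np.random.default_rng(seed)
    noises = [rng.standard_normal(len(x)) for _ in range(trials)]
    modes, residue = [], np.asarray(x, float)
    for k in range(1, max_modes + 1):
        # average of the local means over the noise realizations -> next residue
        next_res = np.mean([local_mean(residue + eps * emd_mode(w, k)) for w in noises], axis=0)
        mode = residue - next_res
        if np.allclose(mode, 0.0):   # nothing left to extract
            break
        modes.append(mode)
        residue = next_res
    return np.array(modes), residue
```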

The CEM algorithm is a combination of the Improved CEEMDAN algorithm and the correlation coefficient. Although the Improved CEEMDAN algorithm decomposes the signal sufficiently, invalid IMFs remain. Based on the Improved CEEMDAN algorithm, the new CEM algorithm was therefore proposed to screen out the effective IMFs and reduce the influence of invalid IMFs on the prediction results.

On the basis of improving the level of modal decomposition and obtaining clearer modal stratification, the concept of the IMF correlation coefficient is introduced. The correlation coefficient, widely used across science and technology, is a dimensionless index that expresses the statistical relationship between two sets of variables. Its value ranges from −1 to 1, covering positive correlation, no correlation, and negative correlation. In general, combining negative and positive correlations requires some processing in the calculation; in practical applications, the absolute value in the range 0 to 1 is used, and the larger the value, the stronger the correlation. For two groups of variables $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$, their correlation coefficient is

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y},$$

where $\mathrm{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big]$, $\mu_X = E[X]$, $\mu_Y = E[Y]$, $\sigma_X = \sqrt{E[(X - \mu_X)^2]}$, and $\sigma_Y = \sqrt{E[(Y - \mu_Y)^2]}$.

Therefore, the correlation coefficient can also be expressed in sample form as

$$\rho_{XY} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}.$$

The glucose concentration is affected by many factors, and the original signal contains many interference components, so its main characteristics are not obvious. Direct prediction from the original glucose data therefore yields poor accuracy. A satisfactory prediction can be obtained by removing the useless components, which also simplifies the prediction model. Based on the correlation coefficients between each IMF decomposed by the Improved CEEMDAN algorithm and the original signal, the IMF screening procedure is as follows:

(1) The original signal $x(t)$ is decomposed into $\mathrm{IMF}_1, \mathrm{IMF}_2, \ldots, \mathrm{IMF}_K$ using the Improved CEEMDAN algorithm.

(2) The correlation coefficients $\rho_i$ between the original signal $x(t)$ and each $\mathrm{IMF}_i$ are calculated using Formula (12). A correlation threshold $h$, obtained from Formula (13), is used to select sensitive IMFs. The threshold is built from the maximum correlation coefficient $\rho_{\max}$, a proportional coefficient $a$, and an adjustment coefficient $b$, where $a$ is generally taken as the total number of IMFs and $b$ as the number of high-frequency IMFs. If $\rho_i$ is larger than $h$, then $\mathrm{IMF}_i$ is selected as a sensitive IMF; otherwise, it is removed as an invalid IMF.

(3) The sensitive IMFs are retained for the subsequent prediction.
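This screening step can be sketched as follows. The threshold expression combining ρ_max with the coefficients a and b is a placeholder for Formula (13), whose exact form is not reproduced here; the function name and default values are likewise assumptions.

```python
# Sketch of the correlation-based IMF screening (Formula (13) is not reproduced;
# the threshold below is a placeholder built from rho_max, a, and b).
import numpy as np

def select_sensitive_imfs(signal, imfs, a=10.0, b=3.0):
    """Keep the IMFs whose correlation with the original signal exceeds the threshold h."""
    signal = np.asarray(signal, float)
    rho = np.array([abs(np.corrcoef(signal, np.asarray(imf, float))[0, 1]) for imf in imfs])
    # Placeholder threshold combining rho_max with the proportional coefficient a
    # and the adjustment coefficient b (assumes a * rho_max > b).
    h = rho.max() / (a * rho.max() - b)
    keep = rho >= h
    sensitive = [imf for imf, k in zip(imfs, keep) if k]
    return sensitive, rho, h
```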

2.3. BP Algorithm

The artificial neural network can efficiently deal with nonlinear fitting problems. The BP neural network, a local optimization technique based on multilayer perception and gradient descent, adjusts its connection weights through backward propagation of the network error so as to improve its nonlinear fitting performance [24]. The BP algorithm is mainly used for short-term prediction and fuzzy recognition [25]. Its structure generally consists of three elements: an input layer, hidden layers, and an output layer. The glucose prediction structure of the BP neural network proposed in this paper is shown in Figure 1.

As shown in Figure 1, there are four layers for glucose prediction: one input layer, two hidden layers, and one output layer. Full interconnection is adopted between adjacent layers, and there are no connections within a layer. N historical glucose values $G(t_i)$ (i = 1, 2, ⋯, N) and a time variable $\Delta T$ serve as the input layer, and the output $G(t_N + \Delta T)$ represents the glucose value at time $\Delta T$ after time $t_N$.
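A small illustrative helper for building such input/output samples from a CGM series is shown below; the function name, the 3-min sampling step, and the way the prediction-time variable is appended are assumptions for illustration, not the paper's preprocessing code.

```python
# Hypothetical sliding-window sample builder: N historical values plus a
# prediction-time variable map to the glucose value one horizon ahead.
import numpy as np

def build_samples(series, n_inputs=15, horizon_steps=28, step_min=3.0):
    """series: CGM readings every step_min minutes; horizon_steps*step_min is the pH in minutes."""
    X, y = [], []
    for start in range(len(series) - n_inputs - horizon_steps + 1):
        history = series[start:start + n_inputs]                # N historical glucose values
        target = series[start + n_inputs + horizon_steps - 1]   # value one horizon ahead of the window
        X.append(np.append(history, horizon_steps * step_min))  # history + time variable
        y.append(target)
    return np.array(X), np.array(y)
```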

Before applying the BP neural network to the glucose prediction, sample training and learning are required, and its training process is shown in Figure 2:

(1) First, the parameters of the network model are initialized, and the activation functions for the first iteration are imported to complete the setup of a BP weak predictor.

(2) Random weights and biases are imported into the weak predictor, and training samples are then introduced to train it. The sample weights and biases are updated continuously during training and are output after the training is completed.

(3) Based on the glucose predicted by the trained weak predictor, the error between the predicted glucose and the original signal is calculated. If the error is greater than the threshold, the activation functions are adjusted for the next round of training, repeating steps 1 and 2. Otherwise, the resulting set of activation functions, weights, biases, and other parameters completes the construction of the BP prediction model.

2.4. BP Neural Network Parameter Estimation

In order to build a good BP glucose prediction algorithm, the following key parameters need to be studied: the number of hidden layer neurons, the number of hidden layers, the activation functions, the number of inputs, and the maximum number of iterations. The parameters are determined mainly by the trial-and-error method. In order to evaluate the performance of the algorithm as the parameters are updated, a set of evaluation indexes is introduced: RMSE, FIT, normalized prediction error (NPE), and algorithm elapsed time (ET). Their mathematical expressions are

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\big(y(t) - \hat{y}(t)\big)^2},$$

$$\mathrm{FIT} = \left(1 - \frac{\lVert y - \hat{y} \rVert}{\lVert y - \bar{y} \rVert}\right) \times 100\%,$$

$$\mathrm{NPE} = \frac{\lVert y - \hat{y} \rVert}{\lVert y \rVert} \times 100\%,$$

where $y(t)$ is the glucose concentration collected by CGM at time $t$, $\hat{y}(t)$ is the predicted glucose value (the simplified representation of the network output), and $\bar{y}$ is the mean of $y(t)$.

The smaller the ET, the better the timeliness, although this index is of secondary importance. Smaller RMSE and NPE values are better. FIT measures how well the predicted signal matches the original; larger values are better.
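For reference, assumed implementations of these indexes in their common forms are sketched below; they may differ in detail from the paper's exact expressions (ET is simply the wall-clock time of a run and is omitted).

```python
# Assumed common-form implementations of the evaluation indexes.
import numpy as np

def rmse(y, y_hat):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.sqrt(np.mean((y - y_hat) ** 2))

def fit_index(y, y_hat):
    """FIT: percentage agreement between the predicted and original signals."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return 100.0 * (1.0 - np.linalg.norm(y - y_hat) / np.linalg.norm(y - y.mean()))

def npe(y, y_hat):
    """Normalized prediction error (assumed form)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return 100.0 * np.linalg.norm(y - y_hat) / np.linalg.norm(y)
```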

The methods and principles for obtaining the key parameters are described as follows:

(1) Number of hidden layer neurons. Too few neurons cannot capture the complex relationship, while too many lead to oversaturated fitting and reduce algorithm efficiency. Comparing the results in Table 1, the prediction effect generally improves as the total number of hidden layer neurons increases, but when the number of neurons reached 45, the improvement in the evaluation indexes was no longer significant. It was preliminarily concluded that a better prediction result and computational convergence could be obtained when the total number of neurons was one to three times the number of inputs, that is,

$$N_{\mathrm{in}} \le S \le 3N_{\mathrm{in}},$$

where $S$ is the total number of neurons and $N_{\mathrm{in}}$ is the number of inputs. The total neurons are eventually distributed over multiple hidden layers; the number of neurons in each layer depends on the number of hidden layers and requires further study, and generally decreases from the first hidden layer toward the output layer. In the experimental evaluation, priority is given to lower RMSE and NPE, followed by higher FIT and lower ET. The comparative experimental data are shown in Table 1, from which it was concluded that the number of neurons in this paper should be between 28 and 35.

(2) Number of hidden layers. When data prediction has high real-time requirements, one to three hidden layers are generally used. A single layer mainly solves linear problems, while three layers may take a longer time. As the data in Table 1 show, with the same number of neurons, the comprehensive prediction performance (FIT and ET) of two hidden layers is the best, while that of a single layer is the worst. This paper therefore uses two hidden layers.

(3) Activation functions. The activation functions comprise two parts: the transfer function and the training function. Transfer functions are important parameters that affect the prediction effect; the three common types are "purelin," "tansig," and "logsig," representing the linear, tangent-sigmoid, and log-sigmoid types, respectively. Considering the nonlinearity of blood glucose data, the "tansig" and "logsig" functions are mainly considered. In order to avoid local convergence and improve convergence speed, the "traingdx" algorithm based on learning-rate adjustment, the "trainlm" algorithm based on the Levenberg–Marquardt method, and the "trainbr" algorithm based on Bayesian regularization were compared as training functions. The "12-15-13-1" network with good comprehensive performance in Table 1 was selected for 63-min prediction to compare different transfer and training function combinations, as shown in Table 2. The combination of two "logsig" transfer functions and a "trainlm" training function was finally chosen for its better performance.

(4) Number of inputs. The trial-and-error method and the limitation method are adopted. As the number of inputs gradually increases, an optimal structure is determined according to the principle that the RMSE and NPE indexes decrease to saturation and the ET remains low, with the total number and distribution of neurons fine-tuned synchronously. The experimental data are shown in Table 3. Under the same network structure, the prediction evaluation indexes with 15 inputs tended to saturate, and the convergence rate was faster. Meanwhile, with the same inputs, the "15-13-1" structure converged fastest with better prediction. Therefore, the complete network architecture is "15-15-13-1," as shown in Figure 3.

(5) Maximum number of iterations. Considering the timeliness of algorithm convergence, the maximum number of iterations was set to 2,000 through experiments. If it is too large, ET grows when the data do not converge easily; if it is too small, the model often fails to reach convergence and the prediction accuracy is low.
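As a rough Python analogue of the tuned configuration, the sketch below instantiates a two-hidden-layer regressor with 15 and 13 neurons, a logistic activation, and a 2,000-iteration cap using scikit-learn's MLPRegressor. This is an approximation: the paper's MATLAB "logsig" transfer functions and "trainlm" (Levenberg–Marquardt) training have no exact counterpart in this library.

```python
# Approximate scikit-learn analogue of the tuned "15-15-13-1" network
# (15 input features, hidden layers of 15 and 13 neurons, one output).
from sklearn.neural_network import MLPRegressor

model = MLPRegressor(
    hidden_layer_sizes=(15, 13),   # two hidden layers, 15 and 13 neurons
    activation="logistic",         # closest analogue of the "logsig" transfer function
    max_iter=2000,                 # maximum number of iterations from Table 4
    random_state=0,
)
# Typical use: model.fit(X_train, y_train); y_pred = model.predict(X_new),
# where each row of X holds the 15 input features and y is the glucose target.
```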

The key parameters of the BP network determined by the above methods are shown in Table 4.

2.5. GPCEMBP Algorithm

The components of the new glucose prediction algorithm described above have now been established. In this section, the CEM modal decomposition algorithm and the BP neural network are fused to produce the new GPCEMBP prediction algorithm. The system block diagram is shown in Figure 4. Phase space reconstruction (PSR) is adopted to prepare the original data [26]. The red "CEEMDAN" block in Figure 4 is the Improved CEEMDAN algorithm, which decomposes the glucose data into IMFs of high and low frequency. K effective IMFs are screened out by the correlation-based adaptive modal filtering algorithm, and the remaining IMFs are removed as noise. Each of the K effective IMFs is predicted by the designed BP neural network model. Finally, the component predictions are reconstructed by arithmetic superposition to output the glucose prediction data.
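The overall flow can be summarized by the following illustrative composition of the hypothetical helpers sketched earlier (improved_ceemdan, select_sensitive_imfs, build_samples) with one BP-style regressor per retained component; it is a sketch of the pipeline structure, not the authors' released code, and keeping the residue as a trend component is an assumption.

```python
# Illustrative GPCEMBP-style pipeline, reusing the hypothetical helpers defined
# in the earlier sketches (improved_ceemdan, select_sensitive_imfs, build_samples).
import numpy as np
from sklearn.neural_network import MLPRegressor

def gpcembp_predict(glucose, n_inputs=15, horizon_steps=28, step_min=3.0):
    """Predict the glucose value horizon_steps * step_min minutes ahead."""
    imfs, residue = improved_ceemdan(glucose)                # modal decomposition
    sensitive, _, _ = select_sensitive_imfs(glucose, imfs)   # correlation-based filtering
    components = list(sensitive) + [residue]                 # keeping the trend term is an assumption
    prediction = 0.0
    for comp in components:
        X, y = build_samples(comp, n_inputs, horizon_steps, step_min)
        net = MLPRegressor(hidden_layer_sizes=(15, 13), activation="logistic",
                           max_iter=2000, random_state=0).fit(X, y)
        latest = np.append(comp[-n_inputs:], horizon_steps * step_min)  # newest window + time variable
        prediction += net.predict(latest.reshape(1, -1))[0]  # superpose component forecasts
    return prediction
```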

2.6. Sources of Glucose Data

In this study, the CT-100 continuous glucose monitor from Zhejiang Kailite Medical Device Co., Ltd. was used to collect experimental glucose data. In the experiment, seven diabetic patients (three women and four men) aged between 24 and 55, who had an average history of 16 years of diabetes without serious complications, were selected to have continuous glucose monitoring for 6 days. During this period, intravenous blood glucose data were collected five times a day with a glucometer device from Roche Diabetes Care, Inc. to ensure the effectiveness of the continuous glucose data.

The glucose data of one patient were randomly selected to compare and verify the effect of the proposed prediction algorithm. The glucose data are shown in Figure 5.

The figure shows interstitial glucose (IG) concentrations for six consecutive days, recorded every 3 min, so about 2,880 data points (480 per 24-hr period) were stored for one patient. The red dots are blood glucose (BG) concentrations. Five BG values were collected per 24-hr period, at the following times: 6:00 A.M. before breakfast, 2 hr before lunch, 1 hr after lunch, before dinner, and 2 hr after dinner. In total, 30 effective BG values were collected per person.

In addition, data from the study entitled “Accuracy of Continuous Glucose Monitoring in Children with Type 1 Diabetes” from the Diabetes Research in Children Network were selected. These CGM data were collected with a Medtronic Minimed device at 5-min intervals. Here, 864 uninterrupted sampling points (3 days) from each of three patients were selected to evaluate the prediction performance of the proposed method on data from a different source.

3. Experimental Results and Discussion

In the CEM algorithm of this paper, the noise IMFs in the signal are filtered by the IMF correlation method, and there are also some other noise IMF filtering algorithms. Examples include the Hausdorff distance filter (HD) [27], the consecutive minimum square error filter (CMSE) [28], the selection criterion approach filter (SC) [29], the Energy Level filter (EL) [30], and so on. A comparative experiment was conducted to prove the superiority of the proposed method, and the experimental results are shown in Table 5.

In the experiment, the CEEMDAN algorithm was used to decompose the same glucose data, and the decomposed modes were filtered by the respective filtering algorithms to obtain their effective modes, which were then used by the BP algorithm to predict glucose 63 min ahead. It can be seen from Table 5 that the proposed method performed better on the three evaluation indicators FIT, RMSE, and NPE, while its ET was average. However, considering that the ET was small relative to the 63-min pH, its influence was negligible. Therefore, it is confirmed that the proposed filtering method screens effective modes more effectively.

Then, in order to illustrate the prediction effect of the GPCEMBP prediction algorithm and the role of the modal decomposition algorithms, a comparison experiment of mode decomposition algorithms was designed. Four groups were selected for comparison: direct prediction (BP), classical modal decomposition prediction (EMDBP), improved modal decomposition prediction (CEEMDANBP), and correlation-based adaptive-screening improved modal decomposition prediction (GPCEMBP). EEMD, as a transitional mode decomposition algorithm with limited comparability, was not added to the comparison group.

Meanwhile, in order to better evaluate the performance of each algorithm, Errorsum, SNR, RMSE, the correlation coefficient (R), the mean absolute percentage error (MAPE), and the maximum absolute percentage error (MAXPE) are introduced here, where Errorsum indicates the cumulative absolute error of the predicted values relative to the original glucose values. Their calculation formulas are defined as

$$\mathrm{Errorsum} = \sum_{t=1}^{n}\left| y(t) - \hat{y}(t) \right|,$$

$$\mathrm{SNR} = 10\log_{10}\frac{\sum_{t=1}^{n} y(t)^2}{\sum_{t=1}^{n}\big(y(t) - \hat{y}(t)\big)^2},$$

$$R = \frac{\sum_{t=1}^{n}\big(y(t) - \bar{y}\big)\big(\hat{y}(t) - \bar{\hat{y}}\big)}{\sqrt{\sum_{t=1}^{n}\big(y(t) - \bar{y}\big)^2}\,\sqrt{\sum_{t=1}^{n}\big(\hat{y}(t) - \bar{\hat{y}}\big)^2}},$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y(t) - \hat{y}(t)}{y(t)}\right| \times 100\%, \qquad
\mathrm{MAXPE} = \max_{t}\left|\frac{y(t) - \hat{y}(t)}{y(t)}\right| \times 100\%,$$

where $\bar{y}$ and $\bar{\hat{y}}$ are the averages of $y(t)$ and $\hat{y}(t)$, respectively, and RMSE is defined as before.
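Assumed common-definition implementations of these additional metrics are sketched below; treat them as stand-ins rather than the paper's exact formulas.

```python
# Assumed standard forms of the additional comparison metrics.
import numpy as np

def errorsum(y, y_hat):
    return np.sum(np.abs(np.asarray(y, float) - np.asarray(y_hat, float)))

def snr_db(y, y_hat):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return 10.0 * np.log10(np.sum(y ** 2) / np.sum((y - y_hat) ** 2))

def pearson_r(y, y_hat):
    return np.corrcoef(np.asarray(y, float), np.asarray(y_hat, float))[0, 1]

def mape(y, y_hat):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return 100.0 * np.mean(np.abs((y - y_hat) / y))

def maxpe(y, y_hat):
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return 100.0 * np.max(np.abs((y - y_hat) / y))
```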

In order to show the modal decomposition in more detail, the result of the Improved CEEMDAN modal decomposition of the glucose signal is introduced in Figure 6.

The algorithm in Figure 6 decomposed the signal into 11 IMFs and one residue. The frequency of the components decreased gradually from IMF1 to IMF11, with the residue having the lowest frequency, while the amplitudes of the low-frequency IMFs were relatively larger. Meanwhile, the numbers of modes decomposed by the EMD and EEMD algorithms were 8 and 10, respectively. It can be seen that the Improved CEEMDAN algorithm extracted more time–frequency components from the signal, much as a pile of mixed buttons is automatically sorted one by one according to type, making it easier to identify patterns in data recognition or mapping.

The data of volunteers were introduced for glucose prediction, and the prediction results of each algorithm at 84 min are shown in Figure 7.

As can be seen from Figure 7, the overlap between the predicted glucose curve and the original glucose curve gradually increases from BP to EMDBP, CEEMDANBP, and GPCEMBP. The positions with poor overlap are mainly concentrated at turning points and fluctuation points. For glucose prediction, the main performance concerns are the pH and the accuracy. In order to better quantify the prediction result of each algorithm under different pHs, a set of pHs of 15, 30, 48, 63, and 84 min was designed. The comparison results of the evaluation indexes of each algorithm under these pHs were obtained through experiments, as shown in Figure 8.

The marked points in Figure 8 are the experimental results of each index for different PHs, while the dotted lines corresponding to the same color are the fitting trend lines of each index.

It can be seen from Figure 8(a) that the Errorsum index increases linearly with the pH for all algorithms. The Errorsum curve of the GPCEMBP algorithm lies at the lowest position, indicating that it is superior to the other algorithms. In Figure 8(b), the RMSE index also increases linearly with the pH, and its linearity is better than that of Errorsum; in addition, the trend-line slope of the RMSE index of GPCEMBP is the smallest, indicating that its prediction performance is better over a long pH. In Figure 8(c), the SNR curve of GPCEMBP has the highest starting point and the slowest attenuation. In Figure 8(d), the R curve also decreases exponentially with the pH, and the advantage of the GPCEMBP algorithm becomes prominent for predictions beyond 40 min. As can be seen from Figure 8(e), the correlation between ET and pH is not strong, and most ET values are within 3 min; therefore, the updating and prediction of glucose data are not affected by ET. The absolute value |PER| of the prediction error is used to show the prediction error range. In Figure 8(f), the corresponding curve of each algorithm is a polynomial fit; except for BP, the results of the other algorithms tend to saturate after 60 min. From Figures 8(g) and 8(h), the prediction results of all algorithms show the same linear or polynomial rising trend, and the GPCEMBP algorithm performs excellently.

Analysis of the blood glucose data and the experimental results shows that there are interference signals in the original signal. Direct prediction with the BP neural network alone may therefore produce large errors owing to the randomness of the algorithm; that is, the single BP glucose prediction algorithm tends to fall into local optima in long-term prediction, leading to low accuracy. After adding the EMD mode decomposition, the EMDBP algorithm effectively reduced the noise interference and greatly improved the prediction performance. By adding adaptive noise, the CEEMDANBP algorithm reduces the residual of superimposed noise in the IMFs and alleviates the mode aliasing of the EMD algorithm. Furthermore, the GPCEMBP algorithm selects the high-correlation modal components of the signal and removes the low-correlation components as noise, which brings the best performance in long-term prediction. Compared with CEEMDANBP, there is little difference in prediction effect within 60 min, while the GPCEMBP algorithm shows a clearly better effect when the pH exceeds 60 min. Table 6 lists the improvement rates of the evaluation indexes of each algorithm relative to the previous one at the 84-min pH.

The evaluation indexes of each successive algorithm improve by roughly 10%–50% to different degrees, with Errorsum, RMSE, and SNR being the most obvious. In particular, the Errorsum, RMSE, SNR, and MAXPE of the GPCEMBP algorithm improve over those of CEEMDANBP by −12.77%, 11.69%, −22.07%, and 5.12%, respectively.

In the above experiments, the prediction accuracy of each algorithm was evaluated with the same patient’s data. Meanwhile, the adaptive flexibility of the GPCEMBP algorithm also needs to be verified for other patients, so the algorithm is used to predict the glucose data of nine other patients. The results of the main indexes in 84 min are shown in Figure 9.

The dotted "Patient" line in Figure 9 is the prediction result of the GPCEMBP algorithm from Figure 8. The other three dotted lines are the prediction results for the three patients whose data were collected with the Medtronic CGM, and the solid lines represent the results for the other six patients monitored with the CT-100 CGM. The "Patient" dotted line is used as a reference for comparing and analyzing the data of the other patients. In Figure 9(a), the curves of most patients are clearly stratified, which indicates that the prediction algorithm handles data from different patients and different CGMs with high stability. In addition, compared with the reference dotted line, some lines are higher and some are lower, indicating different blood glucose noise levels for different patients and different CGM devices. However, with the GPCEMBP algorithm, the overall noise levels remain within a clinically acceptable range (MAPE < 20%). Figure 9(b) shows that each SNR curve decreases exponentially with the pH, following the same trend as the reference curve. In addition, Figures 9(c) and 9(d) show that the overall behavior of the prediction indexes is consistent across patients.

For continuous monitoring of glucose concentration in the blood, there is a 20–40 min delay [31]. In addition, taking sugar from food or injecting insulin requires a metabolic process that would take time. Therefore, a prediction algorithm with a longer pH and a higher accuracy will be more favorable for diabetics to take glucose regulation measures or for artificial pancreas control in advance.

In addition, combining the results of the GPCEMBP algorithm for different patients under the long pH in Figure 9, and excluding the deviations caused by individual data points (which stem from accidental errors of the algorithm or from reduced convergence of a patient's data under a long pH), the prediction results of all patients basically fall within the clinically acceptable range. In conclusion, the GPCEMBP algorithm adapts well to blood glucose data from different patients.

To provide a more objective evaluation of the prediction performance of the GPCEMBP algorithm, it is compared with other prediction algorithms with the same data source, number of inputs, and pH.

The long short-term memory network (LSTM) is a prominent prediction algorithm that effectively retains the characteristics of historical data, making it well suited to regression tasks involving time variables [32]. In addition, the proposed IMF filtering algorithm can be combined with the LSTM to form the GPCEM-LSTM algorithm, which absorbs the advantages of both and improves the prediction effect.
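A hedged Keras sketch of such an LSTM component predictor is given below; the layer sizes, optimizer, and training settings are assumptions, and in the GPCEM-LSTM variant one such model would be fitted per retained IMF before superposing the forecasts.

```python
# Hedged sketch of an LSTM component predictor (assumed layer sizes/settings),
# usable in place of the BP regressor in a GPCEM-LSTM-style pipeline.
import tensorflow as tf

def make_lstm(n_inputs=15):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_inputs, 1)),   # history window treated as a sequence
        tf.keras.layers.LSTM(32),                     # assumed hidden size
        tf.keras.layers.Dense(1),                     # predicted component value
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Usage sketch: reshape each component's history windows to (samples, n_inputs, 1),
# fit one LSTM per retained IMF, and superpose the component forecasts.
```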

Particle swarm optimization (PSO) is an optimization algorithm that mimics bird flocking behavior during predation. The algorithm finds the optimal solution from a group of random solutions by iteratively updating particle velocities and positions. The initial weights and thresholds of each layer of the BP neural network have a great impact on model training, but they are generated randomly, which strongly affects the prediction accuracy of the network. Therefore, the initial weights and thresholds of the BP neural network can be optimized by the PSO algorithm, yielding the PSO-BP model [33]. At the same time, considering the filtering of the CGM data, the GPCEM-PSO-BP prediction algorithm is further constructed. The 84-min prediction results are shown in Table 7.

Table 7 shows that, compared with the poor prediction effect of LSTM, the GPCEM-LSTM algorithm brings a good improvement, but its ET increases considerably. The PSO-BP algorithm achieves better prediction accuracy by optimizing the training model of BP, but the optimization is time-consuming, resulting in poor timeliness. The prediction accuracy of the GPCEMBP algorithm is second only to, and close to, that of the GPCEM-PSO-BP algorithm. However, the ET of GPCEM-PSO-BP is too large to meet the timeliness requirement of dynamic glucose prediction. Therefore, GPCEMBP is the most appropriate algorithm for dynamic glucose prediction.

As a key technology for controlling glucose levels, glucose prediction has been the subject of numerous studies reported in the literature. Table 8 presents a horizontal comparison with other algorithms [13, 34–38]. It can be seen from the table that most algorithms use a neural network for prediction and that the pH is mostly less than 60 min. Greater use of clinical data is very important for real-world application. The comparison shows that the proposed algorithm achieves high accuracy over a longer pH.

4. Conclusion

The collection of blood glucose concentrations is a daily task for every diabetic. By observing the changes in glucose concentration, diet, exercise, and medication can be adjusted to stabilize glucose and improve quality of life. Compared with sporadic glucose collection, continuous glucose monitoring has many advantages, but it also has some inherent defects. CGM uses minimally invasive technology to monitor the glucose concentration in the interstitial fluid, which lags behind the blood glucose concentration. In addition, it takes time for diet and medication to regulate the blood glucose level through metabolism. Therefore, the instantaneous blood glucose reading lacks timeliness as a basis for immediate treatment. Blood glucose prediction technology, as a reference data source for the blood glucose control of diabetic patients, is therefore particularly important and is one of the key technologies for the development of the external artificial pancreas.

In this paper, the glucose prediction algorithm was studied based on different clinical data. Fusing modal decomposition, modal filtering, and neural network prediction, a new continuous glucose prediction algorithm, GPCEMBP, was proposed through theoretical derivation and experimental comparison. The neural network structure design and parameter optimization methods were also explained in detail. Compared with other algorithms, the GPCEMBP algorithm showed more complete mode decomposition with less residual noise in the modes in the decomposition part, better screening of effective modes in the filtering part, and more accurate nonlinear fitting with more efficient convergence in the neural network part. The GPCEMBP algorithm has better prediction accuracy and robustness within 84 min. In the future, based on glucose data collected from different CGMs, the proposed algorithm will be integrated into the control system of an artificial pancreas controller to help it implement glucose regulation measures in advance and in time and to stabilize dynamic glucose fluctuations.

In addition, on the one hand, the efficiency of dynamic parameter optimization can be improved to enhance the timeliness of the fused algorithm. On the other hand, the differences between the predicted data and the original data can be further quantified and analyzed to support research on soft alarms for glucose control.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work has been partially funded by the National Natural Science Foundation of China (No. 61471233) and the Key Project of Basic Research of the Shanghai Municipal Science and Technology Commission, China (No. 13NM1401300).