Abstract
Subsidence monitoring is easily disturbed by engineering factors such as weather, hydrology, and construction, so the collected subsidence data contain various kinds of noise. To reduce the influence of this engineering noise on the accuracy of subsidence prediction, this study uses the Daubechies (DB) wavelet to decompose the original subsidence time series. After decomposition, the low-frequency trend items are predicted using a long short-term memory (LSTM) model, the high-frequency noise items are predicted using an autoregressive (AR) time series model, and the two predictions are summed to obtain the predicted value of the total time series. The method is applied to actual subsidence monitoring data from an old goaf and compared with LSTM and recurrent neural network (RNN) models without DB wavelet decomposition and with the gray model GM (1,1). The results show that DB wavelet decomposition clearly reduces the influence of measurement data noise on prediction error. Compared with the single prediction models LSTM, RNN, and GM (1,1), the proposed prediction model has higher prediction accuracy, smaller error, and a better-fitting trend. It can be used as a calculation method to improve the prediction accuracy of surface subsidence in old goafs.
1. Introduction
In an untreated goaf, the overlying rock layers gradually collapse and compact over time. The voids in the collapsed rock mass, the separations and cracks in the overlying rock, and the voids at the boundary of the goaf are the main sources of residual subsidence in an old goaf. Subsidence prediction methods for old goafs mainly include traditional theoretical analysis methods, physical and numerical simulation methods, and intelligent prediction methods based on artificial intelligence and big data. Traditional theoretical analysis methods mainly include the influence function method, the empirical method, and the profile function method [1–4]. These methods assume that the goaf has a regular shape, so they are applicable only to regular goafs. They also require a large amount of measured data; when the measured data are insufficient, accurate prediction parameters cannot be obtained. In addition, the analysis process is relatively complicated and has to be adapted to the specific situation, so the prediction efficiency is low. Physical and numerical simulation methods mainly include model experiments and the finite element, finite difference, and discrete element methods [5–9]. These methods must accurately restore the three-dimensional structure of the old goaf and place high demands on rock mass parameters. In the absence of measured data, their results often deviate considerably and cannot be accepted in engineering practice. Intelligent prediction methods mainly include time series models, gray prediction, and neural network methods [10–15]. These methods have been widely used, but some defects remain.
The time series and gray prediction methods cannot make a rolling prediction of subsidence, and the widely used back-propagation neural network and recurrent neural network (RNN) are prone to vanishing and exploding gradients when the derivatives are very small or very large, which prevents the prediction results from converging [16]. The improved long short-term memory (LSTM) network alleviates the vanishing and exploding gradient problems and can make high-precision predictions of stress and deformation time series in geotechnical engineering [17].
However, the above prediction methods do not consider the impact of noise on the prediction. When monitoring the subsidence of actual project monitoring points, owing to the interference of various random factors such as weather, hydrology, and construction, the collected subsidence data contain various noises. The existence of these noises will make the prediction results inaccurate.
Wavelet noise reduction has been successfully applied in many fields [18–28]; however, it has rarely been applied to surface subsidence prediction in old goafs. Consequently, this study proposes a combined prediction model based on wavelet decomposition and noise reduction that takes engineering noise into account (hereinafter referred to as the combined model LSTM-AR). In this model, the DB wavelet decomposes the measured subsidence data of the old goaf into low-frequency trend items and high-frequency noise items. LSTM is suitable for predicting stepped and trending data, so the low-frequency trend items are predicted using LSTM; autoregressive (AR) models are suitable for predicting stationary time series, so the high-frequency noise items are predicted using AR models. By separating the noise from the original subsidence data through the DB wavelet, the combined prediction model fully considers the influence of noise on subsidence prediction.
The combined prediction model is applied to the surface subsidence prediction of the old goaf in the reconstruction and extension project of the Jixi section of the national highway Dan-A to test the practical application effect of the combined prediction model. Engineering examples show that the combined model LSTM-AR has high prediction accuracy, the prediction trend is in line with reality, and the engineering practicability is good.
The rest of this study is structured as follows. Section 2 proposes a combined model LSTM-AR and introduces the specific steps of the combined model LSTM-AR. Section 3 introduces the engineering overview of the project, on which the study is based, and the layout of on-site monitoring points. Section 4 uses the proposed combined model LSTM-AR for subsidence prediction and compares it with the prediction results of LSTM, RNN, and GM (1,1). Section 5 gives the conclusion of this study.
2. Combined Prediction Model
The combined model LSTM-AR can be applied to the subsidence prediction of long-abandoned goafs according to the following steps:

(1) Wavelet decomposition of the measured subsidence time series. Wavelet analysis converts the time-series function to the time-frequency domain. Through dilation and translation of the wavelet function, the time series can be refined step by step at multiple scales, so that the high-frequency and low-frequency parts are each subdivided in time and frequency. This makes it possible not only to observe the local characteristics of the function better but also to observe its time and frequency information simultaneously. The DB wavelet is a member of the wavelet function family and has a wide range of applications, including but not limited to mechanical fault diagnosis [29], deformation detection of bored piles in civil engineering [30], radar signal denoising [31], decomposition of heart signals in medicine [32], two-dimensional plane elastic problems in mathematics [33], and transient power disturbance detection in power grids [34].

The time series is decomposed with DB wavelet basis functions at different decomposition levels to obtain the low-frequency trend item and the high-frequency noise item. The signal-to-noise ratio (SNR) of the high-frequency noise term (the larger the value, the better the decomposition effect) and the root mean square error (RMSE) of the low-frequency trend term (the smaller the value, the better the decomposition effect) are calculated, and the optimal decomposition level is chosen. Through wavelet decomposition to level $J$, the original time series function $f(t)$ is decomposed into a superposition of subfunctions:

$$f(t) = \sum_{k} a_{J,k}\,\varphi_{J,k}(t) + \sum_{j=1}^{J}\sum_{k} d_{j,k}\,\psi_{j,k}(t)$$

where $\varphi_{J,k}(t)$ and $\psi_{j,k}(t)$ are the approximate function and detail function, respectively, and $a_{J,k}$ and $d_{j,k}$ are the corresponding coefficients.
The approximate function represents the low-frequency part of the original time series, and the detail function represents the high-frequency part. Through multilayer decomposition of the original time series function, a low-frequency function and a high-frequency function can be obtained at each layer. The low-frequency trend item and high-frequency noise item obtained by decomposing the original time series are shown in Figure 1; adding their values recovers the original time series.

(2) Low-frequency trend items are predicted using the LSTM model. LSTM is an improvement of the simple RNN [35]. It adds a cell state $C_t$ to the hidden layer to record long-term information, alongside the hidden state $h_t$, which records short-term information. A “gate” structure consisting of a forget gate, an input gate, and an output gate updates and discards information throughout network training. This design largely overcomes the vanishing and exploding gradient problems and performs very well in predicting long time series [36]. The schematic diagram of the LSTM model is shown in Figure 2, where $x_t$ represents the input value of the sequence and the other symbols are defined as follows.

$i_t$ is the input gate, which is calculated as follows:

$$i_t = \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right)$$

$o_t$ is the output gate, which is calculated as follows:

$$o_t = \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right)$$

$f_t$ is the forget gate, which is calculated as follows:

$$f_t = \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right)$$

$\tilde{C}_t$ is the candidate value of the cell state, which is calculated as follows:

$$\tilde{C}_t = \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right)$$

$C_t$ is the cell state, which is calculated as follows:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

$h_t$ is the state value of the hidden layer, which is calculated as follows:

$$h_t = o_t \odot \tanh(C_t)$$

where $\sigma$ represents the sigmoid function, $\tanh$ is the activation function, $\odot$ denotes element-wise multiplication, and $W$ and $b$ represent the weight and bias, respectively. Neural network training samples must be constructed before the trend items obtained by the decomposition are predicted.
Table 1 lists the construction method of the training samples, where $x_i$ represents the $i$-th trend item in the trend sequence. The number of network input items $n$ is the number of samples used for each prediction, and the number of network output items $k$ is the number of items predicted from that input. For example, $n = 3$, $k = 3$ means that 3 input items predict 3 output items. Different values of $n$ and $k$ give different results, and the accuracy of the prediction changes accordingly. In this study, the rolling prediction method is used; that is, $x_1, \dots, x_n$ are used to predict $x_{n+1}, \dots, x_{n+k}$, the sequence updated with these predictions, $x_1, \dots, x_{n+k}$, is used to predict $x_{n+k+1}, \dots, x_{n+2k}$, and so on. The earlier items always predict the items that follow them, and in this way multiple trend items are predicted.

(3) High-frequency noise items are predicted using the AR(p) method. AR is a model for predicting a stationary time series. The method requires the data to be autocorrelated. Thus, before the AR model is used to make predictions, the autocorrelation of the time series should be calculated; it is represented by the autocorrelation coefficient $r_\tau$. For a time series $y_1, \dots, y_N$, the autocorrelation coefficient $r_\tau$ with a time delay of $\tau$ expresses the degree of correlation between the time series value in period $t$ and the time series value in period $t-\tau$. $r_\tau$ is calculated as follows:

$$r_\tau = \frac{\sum_{t=\tau+1}^{N}\left(y_t-\bar{y}\right)\left(y_{t-\tau}-\bar{y}\right)}{\sum_{t=1}^{N}\left(y_t-\bar{y}\right)^2}$$

where $\bar{y}$ is the average value of the time series. The predicted value of the AR model consists of a constant term, a random error term, and earlier values of the time series. The specific expression of the AR model is as follows:

$$y_t = c + \sum_{i=1}^{p}\phi_i\,y_{t-i} + \varepsilon_t$$

where $c$ is a constant term, $\phi_i$ are the autoregressive coefficients, $p$ is the model order, and $\varepsilon_t$ is the assumed random error value; the average value of $\varepsilon_t$ equals zero, its standard deviation is $\sigma_\varepsilon$, and the value of $\sigma_\varepsilon$ is assumed to remain unchanged at any time $t$.

(4) Total subsidence prediction.
The predicted values of the low-frequency trend item and the high-frequency noise item are summed to obtain the total predicted subsidence, which is compared with the actual monitoring value to calculate the error and the prediction accuracy. The mean absolute error (MAE) is used as the evaluation indicator of the prediction accuracy:

$$\mathrm{MAE} = \frac{1}{m}\sum_{i=N+1}^{N+m}\left|y_i-\hat{y}_i\right| \tag{8}$$

where $y_i$ is the actual monitoring value, $\hat{y}_i$ is the predicted value, $m$ is the number of prediction periods, and $N$ is the number of observation periods.
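The gate updates of step (2) can be traced numerically. The following is a minimal sketch of one LSTM time step in plain Python, assuming the standard LSTM formulation in which every gate acts on the concatenation of the previous hidden state and the current input. The helper names (`affine`, `lstm_step`), the toy weights, and the input values are illustrative assumptions, not the network trained in this study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    # Row-wise W @ v + b for plain Python lists.
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to (W, b) acting on [h_prev, x_t]."""
    z = h_prev + x_t                                          # concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]       # input gate
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]       # output gate
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]       # forget gate
    c_hat = [math.tanh(a) for a in affine(*params["c"], z)]   # candidate cell state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy setup (assumed values): hidden size 2, input size 1, all weights 0.1, biases 0.
H, D = 2, 1
params = {g: ([[0.1] * (H + D) for _ in range(H)], [0.0] * H) for g in "iofc"}
h, c = [0.0] * H, [0.0] * H
for x in [0.5, 0.6, 0.7]:          # a short, made-up input sequence
    h, c = lstm_step([x], h, c, params)
```

The cell state is updated additively (`f * c + i * g`), which is the feature usually credited with keeping gradients from vanishing over long sequences.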



The prediction flowchart is shown in Figure 3. Summing the low-frequency trend term predicted by LSTM and the high-frequency noise term predicted by AR gives the prediction result of the combined model. This result is compared with those of the LSTM prediction model, the RNN prediction model, and the gray prediction model GM (1,1), that is, with single prediction models and a traditional time series model, to evaluate the prediction accuracy of the combined model.
3. Engineering Overview
The reconstruction and expansion project of the Jixi section of the national highway Dan-A is a Class I highway with a design speed of 80 km/h and a pavement width of 25.5 m. The project runs from K1600 + 000 to K1651 + 498.739, and the length of the route is 51.478 km. The old goaf covers K1605 + 000 − K1628 + 660 and K1631 + 985 − K1647 + 100, with a total length of 38.775 km. The schematic diagram of the planned route is shown in Figure 4.
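As a quick check of the stated goaf length, the two chainage intervals can be converted to kilometres and added. This is only a worked restatement of the figures above, assuming chainage of the form K(km) + (m).

```latex
\begin{aligned}
L_{\text{goaf}} &= (1628.660 - 1605.000) + (1647.100 - 1631.985) \\
                &= 23.660 + 15.115 \\
                &= 38.775\ \text{km}
\end{aligned}
```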

The proposed project passes through KQ1 (Muling Mine), KQ2 (Pinggang Mine), KQ3 (Laodagou Mine), KQ4 (Hengshan Mine), and KQ5 (Lixin Mine) from south to north, each with a long history of mining. Private and unregulated mining has been serious, and mining in the various mining areas has basically stopped. The route traverses or is adjacent to 16 working faces in 5 major mines and 21 local coal mines, with a total length of 22.722 km across working faces or coal mines. Muling Mine is located in a low mountain and hilly area with a ground elevation of 230–360 m; the overall terrain is low in the north and high in the south. The inclination of the coal seams is 10–30°, and the average thickness of each coal seam is 0.5–2.0 m. The illegal mining points adopt the lane and pillar type, and the recovery rate is 30%–60%. The Pinggang and Laodagou Mines are located in low mountain and hilly areas with a ground elevation of 360–510 m; from north to south, the overall terrain first rises and then falls. The inclination angle of the coal seams is 30–40°, and the average thickness of each coal seam is 0.8–2.0 m. Longwall mining is used, with a recovery rate of 60%. The Hengshan Mine is located in a low mountain and hilly area with a ground elevation of 220–456 m; the overall terrain is higher in the north and lower in the south. The inclination angle of the coal seams is 14–25°, and the average thickness of each coal seam is 0.5–1.7 m. The mining points adopt the road and pillar type, and the recovery rate is 30%–50%. The Lixin Mine is located in a low mountain and hilly area with a ground elevation of 230–270 m; the overall terrain is higher in the north and lower in the south. The inclination angle of the coal seams is 25–35°, and the average thickness of each coal seam is 0.3–2.0 m. The mining points adopt the lane and pillar type, and the recovery rate is 30%–50%.
Geophysical prospecting, drilling, and other exploration methods show that most of the old goaf areas along the road have collapsed and that large-scale sudden subsidence is not possible; however, subsidence and deformation are continuing. To determine the subsidence deformation trend of the old goaf and its impact on the subgrade and pavement, a total of 78 reference points and 135 monitoring points were arranged along both sides of the road. After one and a half years of monitoring, 135 groups of 25 phases of subsidence and deformation data were obtained. This study selects the monitoring points KQ1-Z, KQ2-Z, KQ3-Z, KQ4-Z, and KQ5-Z near the middle of each mining area to train the subsidence prediction model and test its prediction effect. The pile numbers of the monitoring points and their distances from the mined-out areas are listed in Table 2.
4. Engineering Case Analysis
By analyzing the KQ1-Z subsidence monitoring data, the establishment of the combined model LSTM-AR is explained. The KQ1-Z subsidence monitoring data are presented in Table 3.
4.1. Decomposition of Subsidence Monitoring Data
The DB wavelet decomposes subsidence data into low-frequency trend items and high-frequency noise items effectively, and its denoising effect is better than that of other wavelets [37]; therefore, the DB wavelet function is used to decompose the subsidence data in this study. The subsidence data are decomposed with different DB functions at different decomposition levels. After the low-frequency trend term and high-frequency noise term are obtained, RMSE and SNR are calculated. The calculation results are presented in Table 4.
Because the prediction result of the subsidence data is reconstructed from the low-frequency trend term and the high-frequency noise term decomposed by the DB wavelet function, too many decomposition layers lead to an accumulation of prediction errors [38] and therefore to lower prediction accuracy. For this reason, RMSE and SNR are calculated only for 1 and 2 decomposition layers.
As listed in Table 4, the decomposition of the monitored subsidence data is best when the wavelet function is DB6 and the number of decomposition layers is 1. In this case, the SNR is 45.061 and the RMSE is 0.060. The low-frequency trend item and high-frequency noise item after decomposition are shown in Figure 5.
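The selection procedure above can be illustrated with a small sketch. It uses the Haar wavelet (db1, the simplest member of the Daubechies family) as a stand-in because its filter is trivial to write out; it is not the DB6 wavelet chosen in this study. The RMSE and SNR definitions below are assumptions (RMSE between the series and its trend, SNR as a power ratio in decibels), since the exact formulas are not given in the text, and the sample values are made up.

```python
import math

def haar_level1(x):
    """Split an even-length series into a trend part and a noise part of equal length."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]   # low-pass coefficients
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]   # high-pass coefficients
    trend, noise = [], []
    for a, d in zip(approx, detail):
        trend += [a * s, a * s]       # reconstruction from the approximation only
        noise += [d * s, -d * s]      # reconstruction from the detail only
    return trend, noise

def rmse(x, trend):
    # Assumed definition: root mean square difference between series and trend.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, trend)) / len(x))

def snr_db(x, trend):
    # Assumed definition: signal power over residual power, in decibels.
    signal = sum(a * a for a in x)
    residual = sum((a - b) ** 2 for a, b in zip(x, trend))
    return 10.0 * math.log10(signal / residual)

series = [1.0, 1.2, 2.1, 2.0, 3.2, 3.0, 4.1, 4.3]   # made-up subsidence values (mm)
trend, noise = haar_level1(series)
quality = (rmse(series, trend), snr_db(series, trend))
```

Repeating the same two measurements for each candidate wavelet and level, then keeping the candidate with the smallest RMSE and largest SNR, mirrors the comparison summarized in Table 4.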

Wavelet decomposition is only one of many decomposition algorithms, so other algorithms can be used to verify the effectiveness of DB wavelet decomposition. Variational modal decomposition (VMD) has a rigorous mathematical foundation and is an adaptive, completely nonrecursive method of modal variation and signal processing: it achieves adaptive decomposition of the target signal by iteratively searching for the optimal solution of the variational model. Its decomposition effect is good, and the algorithm and its improved versions have been widely used in many fields [39, 40]. Therefore, the optimal decomposition effect of the VMD algorithm is compared with that of the algorithm adopted in this study.
By varying the penalty factor α, the VMD algorithm decomposes the subsidence time series into a low-frequency trend item and a high-frequency noise item, and the corresponding RMSE is calculated to find the optimal decomposition effect. The optimal decomposition effects of the two algorithms are compared in Table 5.
It can be seen from Table 5 that when the number of decomposition layers is 1, the RMSE difference between the two algorithms is only 0.002; when the number of decomposition layers is 2, the RMSE difference is only 0.003. Therefore, the DB wavelet decomposition algorithm used in this study is sufficiently effective.
4.2. Low-Frequency Trend Item Prediction
Low-frequency trend items were predicted using LSTM models. Regarding the accuracy of the prediction results for different numbers of network input and output items, some scholars have pointed out that a longer network input is not necessarily better, because more distant subsidence information has a negligible correlation with the current subsidence prediction [41]. The average absolute errors of the prediction results under different network input and output items (n-k) are listed in Table 6.
As shown in Table 6, the 5–5 model gives the smallest average absolute error, 0.187. Based on this analysis, the final LSTM prediction mode uses n = 5 and k = 5; that is, items 6–10 are predicted from low-frequency trend items 1–5; after the sequence is updated with items 6–10, items 1–10 are used to predict items 11–15, and so on, until items 31–35 are predicted.
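The sample construction and rolling scheme can be sketched as follows. This is a minimal illustration under stated assumptions: `linear_stub` is a hypothetical stand-in for a trained LSTM, the model is assumed to read the latest n items of the growing history at each round, and the trend values are made up.

```python
def make_samples(series, n, k):
    """Sliding windows: n consecutive items as input, the next k items as target."""
    samples = []
    for start in range(len(series) - n - k + 1):
        samples.append((series[start:start + n], series[start + n:start + n + k]))
    return samples

def rolling_forecast(history, predict_block, n, k, total):
    """Extend the history k items at a time until it holds `total` items."""
    extended = list(history)
    while len(extended) < total:
        block = predict_block(extended[-n:])    # model reads the latest n items
        extended.extend(block[:k])              # predictions become part of the history
    return extended[:total]

def linear_stub(window):
    # Hypothetical stand-in for a trained LSTM: continue the window's mean slope.
    slope = (window[-1] - window[0]) / (len(window) - 1)
    return [window[-1] + slope * (j + 1) for j in range(len(window))]

trend = [0.0, 1.0, 2.0, 3.0, 4.0]               # made-up trend items 1-5
forecast = rolling_forecast(trend, linear_stub, n=5, k=5, total=15)
pairs = make_samples(list(range(8)), n=3, k=3)  # three (input, target) windows
```

Because each round feeds earlier predictions back in as inputs, errors can compound with the forecast horizon, which is one reason the choice of n and k in Table 6 matters.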
4.3. High-Frequency Noise Term Prediction
The high-frequency noise term was predicted using the AR model. The order p of the AR model was determined by calculating the Akaike information criterion (AIC) for different values of p, as listed in Table 7. When p = 4, the AIC is the smallest, indicating that this model order is optimal. Therefore, the AR(4) model is used to predict the high-frequency noise term.
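Order selection of this kind can be sketched in a few lines. The example below is an illustration under assumptions, not the fitting procedure used in this study: it estimates AR coefficients from the sample autocovariances with the Levinson-Durbin recursion (a Yule-Walker fit) and scores each order with AIC = N ln(σ²) + 2p, one common form of the criterion. The function names and the test series are hypothetical.

```python
import math

def autocov(x, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    return [sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / n
            for lag in range(max_lag + 1)]

def levinson_durbin(r, max_order):
    """Yield (order, AR coefficients, innovation variance) for orders 1..max_order."""
    coeffs, err = [], r[0]
    for m in range(1, max_order + 1):
        acc = r[m] - sum(coeffs[j] * r[m - 1 - j] for j in range(m - 1))
        kappa = acc / err                                   # reflection coefficient
        coeffs = [coeffs[j] - kappa * coeffs[m - 2 - j] for j in range(m - 1)] + [kappa]
        err *= (1.0 - kappa * kappa)
        yield m, list(coeffs), err

def select_order_by_aic(x, max_order):
    """Return (p, coefficients) minimizing the assumed AIC = N ln(var) + 2p."""
    r, n, best = autocov(x, max_order), len(x), None
    for p, coeffs, err in levinson_durbin(r, max_order):
        aic = n * math.log(err) + 2 * p
        if best is None or aic < best[0]:
            best = (aic, p, coeffs)
    return best[1], best[2]

def ar_forecast(x, coeffs, steps):
    """Iterate the fitted AR recursion on the mean-removed series."""
    mean = sum(x) / len(x)
    hist = [v - mean for v in x]
    out = []
    for _ in range(steps):
        nxt = sum(c * hist[-1 - i] for i, c in enumerate(coeffs))
        hist.append(nxt)
        out.append(nxt + mean)
    return out
```

For a noise series `noise`, `p, phi = select_order_by_aic(noise, 6)` followed by `ar_forecast(noise, phi, 5)` would give a five-step forecast; the maximum order of 6 is an arbitrary choice for illustration.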
4.4. Total Subsidence Prediction
Adding the prediction results of the low-frequency trend term and the high-frequency noise term gives the total subsidence prediction result, as shown in Figure 6. The subsidence prediction results of the LSTM, RNN, and GM (1,1) prediction models without wavelet decomposition are also shown in Figure 6. The absolute values of the errors between the measured values and the predictions of the combined model and the single models are shown in Figure 7 (the left axis corresponds to the upper left data, and the right axis corresponds to the lower right data; the same applies below).

Figures 6 and 7 show that the absolute error of the combined prediction model LSTM-AR is between 0.05 and 0.5 mm, the prediction results are all within the 95%–100% confidence interval of the actual monitoring results, the predicted subsidence trend is almost the same as the actual subsidence trend, and the predicted result is close to the measured value. The absolute error of the single-model LSTM prediction is between 0.1 and 1.2 mm; the prediction results are mostly within the 90%–95% confidence interval of the actual monitoring results, the predicted subsidence trend is roughly in line with the actual subsidence trend, and the predicted values are too small. The absolute error of the single-model RNN prediction is between 0.05 and 1.3 mm; the prediction results are mostly within the 90%–95% confidence interval of the actual monitoring results, the predicted subsidence trend is roughly in line with the actual subsidence trend, and the predicted values are too small. The absolute error of the GM (1,1) model prediction is between 0.1 and 3.0 mm; only part of the prediction result is within the 90%–100% confidence interval of the actual monitoring result, and the predicted subsidence trend differs significantly from the actual subsidence trend.
The above analysis only qualitatively judges that the prediction accuracy of the proposed combined prediction model is better than that of a single prediction model, and no quantitative analysis is made. The MAE between the predicted results and the measured values of different prediction models can be calculated by formula (8), and the prediction accuracy of different prediction models can be quantitatively analyzed by it. The calculation results are listed in Table 8.
It can be seen from Table 8 that, for the same subsidence time series, the MAE of the combined model LSTM-AR is 0.206 mm, and those of the single models LSTM, RNN, and GM (1,1) are 0.455, 0.511, and 1.013 mm, respectively. The prediction accuracy of the combined model LSTM-AR is clearly better than that of the single models LSTM, RNN, and GM (1,1). To assess how much the DB6 wavelet reduces the prediction error caused by measurement data noise, the absolute value of the average prediction error of a single model is taken as the prediction error caused by measurement data noise. Compared with the single model LSTM, the DB6 wavelet in the LSTM-AR combined model reduces the prediction error caused by measurement data noise by 54.73%; compared with the single model RNN, by 59.69%; and compared with the single model GM (1,1), by 88.55%. Although the single-model predictions are more accurate than the combined model on some data points, over the whole data set and in terms of prediction trend the combined prediction model has higher accuracy and a better-fitting trend. In conclusion, the prediction effect of the combined prediction model considering engineering noise is more in line with the actual working conditions.
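The two quantities used in this comparison are simple to compute. The sketch below assumes that the reported percentage is the relative reduction in MAE with respect to the single model, an interpretation that reproduces the LSTM and RNN figures quoted above for KQ1-Z; the function names are illustrative.

```python
def mean_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over the prediction periods."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def error_reduction_percent(mae_combined, mae_single):
    """Relative MAE reduction of the combined model with respect to a single model."""
    return 100.0 * (mae_single - mae_combined) / mae_single

# MAE values for KQ1-Z as quoted in the text (mm).
print(round(error_reduction_percent(0.206, 0.455), 2))   # 54.73 (vs. LSTM)
print(round(error_reduction_percent(0.206, 0.511), 2))   # 59.69 (vs. RNN)
```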
4.5. Subsidence Prediction of Other Mining Areas
The same method and steps are used to predict the subsidence at the remaining four monitoring points; the results are shown in Figures 8–11.




It can be seen from Figures 8–11 that the subsidence prediction trend of the combined prediction model LSTM-AR is almost consistent with the actual subsidence trend, and the prediction result is close to the actual measured value; the subsidence prediction trend of the single-model LSTM is roughly in line with the actual subsidence trend; the subsidence prediction trend of the single-model RNN is roughly in line with the actual subsidence trend; the subsidence prediction trend of the GM (1,1) model differs significantly from the actual subsidence trend.
The MAE of the predicted results and the measured values of different prediction models are calculated by formula (8), and the prediction accuracy of different prediction models is further quantitatively analyzed by it. The calculation results are listed in Table 9.
It can be seen from Table 9 that for the other four monitoring points, the MAEs of the combined model LSTM-AR are 0.112, 0.151, 0.102, and 0.111 mm, respectively; the MAEs of the single model LSTM are 0.197, 0.226, 0.184, and 0.189 mm, respectively; the MAEs of the single model RNN are 0.240, 0.244, 0.156, and 0.186 mm, respectively; and the MAEs of GM (1,1) are 0.734, 0.423, 0.709, and 0.740 mm, respectively. The prediction accuracy of the combined model LSTM-AR at each monitoring point is better than that of the single models LSTM, RNN, and GM (1,1). In the prediction of the subsidence monitoring data from KQ2-Z to KQ5-Z, the DB6 wavelet reduces the prediction error caused by measurement data noise as follows: compared with the single model LSTM, the DB6 wavelet in the LSTM-AR combined model reduces this error by 43.15%, 33.19%, 44.57%, and 39.67%, respectively; compared with the single model RNN, by 53.33%, 38.11%, 34.62%, and 40.32%, respectively; and compared with the single model GM (1,1), by 84.74%, 57.21%, 85.61%, and 85.00%, respectively. Through comprehensive comparison, the combined prediction model LSTM-AR is superior to the single prediction models LSTM, RNN, and GM (1,1) in prediction trend and prediction accuracy.
5. Conclusions
In this study, the DB wavelet is used to decompose the subsidence data of the old goaf to reduce engineering noise, the combined model LSTM-AR is used to predict the subsidence data, and the application of the combined model to subsidence prediction in the old goaf is discussed on the basis of a practical engineering case. The combined prediction results are compared with those of the single models and the traditional model, and the following conclusions are drawn:

(1) Owing to the interference of various factors during engineering monitoring, the collected subsidence data contain various kinds of engineering noise. This noise affects the accuracy of subsidence prediction.

(2) The DB6 wavelet clearly reduces the prediction error caused by measurement data noise. Compared with the single model LSTM, the prediction error caused by measurement data noise is reduced by 43.06% on average; compared with the single model RNN, by 45.21% on average; and compared with the traditional model GM (1,1), by 80.22% on average.

(3) The combined prediction model, which considers the impact of measurement data noise, exploits the prediction advantages of each component model. The results of the actual case show that the combined prediction model has higher accuracy, smaller errors, and a better-fitting trend than the single prediction models and the traditional prediction model. It can be used as a calculation method to improve the prediction accuracy of surface subsidence in old goafs.
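The averages in conclusion (2) follow from the per-point reductions reported in Sections 4.4 and 4.5, taken in the order KQ1-Z to KQ5-Z. The worked arithmetic below is only a restatement of those reported percentages, assuming a simple unweighted mean over the five monitoring points.

```latex
\begin{aligned}
\text{LSTM:}    &\quad \tfrac{1}{5}\,(54.73 + 43.15 + 33.19 + 44.57 + 39.67) = \tfrac{215.31}{5} \approx 43.06\% \\
\text{RNN:}     &\quad \tfrac{1}{5}\,(59.69 + 53.33 + 38.11 + 34.62 + 40.32) = \tfrac{226.07}{5} \approx 45.21\% \\
\text{GM(1,1):} &\quad \tfrac{1}{5}\,(88.55 + 84.74 + 57.21 + 85.61 + 85.00) = \tfrac{401.11}{5} \approx 80.22\%
\end{aligned}
```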
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Authors’ Contributions
C.D. provided the initial idea and wrote the manuscript; C.D. designed and performed the research; C.P.H and F.J.Z helped in the discussion and financed the research; C.P.H and F.J.Z reviewed the manuscript and made relevant suggestions; F.J.Z was responsible for on-site subsidence monitoring and provision of engineering data. All authors have read and agreed to the published version of the manuscript.
Acknowledgments
This work was supported by the Department of Transportation of Heilongjiang Province. This research was funded by the Science and Technology Project of the Department of Transportation of Heilongjiang Province (20210430).