Abstract

To improve the prediction accuracy of top-coal drawing capability in steep seams, principal component analysis (PCA) and the general regression neural network (GRNN) are combined into a PCA–GRNN model. Nine commonly used influencing factors are selected to establish a predictive index system for top-coal drawing capability in steep seams. The PCA is used to eliminate correlation among the indices and reduce their dimensionality, yielding three linearly uncorrelated principal components (PCs) that form the input vectors of the GRNN. In this way, the factors that most affect top-coal drawing capability in steep seams are found to be floor flatness, dip angle of the coal seam, and hardness of the coal seam. The results show that the PCA–GRNN model outperforms the GRNN and random forest models, which indicates that the PCA improves the prediction accuracy of the GRNN model. Combining different analytical models into one is therefore feasible, and the proposed PCA–GRNN model can be used to predict top-coal drawing capability in steep seams.

1. Introduction

Top-coal drawing capability refers to the ease or difficulty of drawing the top part of a coal seam under mine pressure and its own weight. The selection of an appropriate top-coal caving technology directly determines the mining efficiency, and high top-coal drawing capability is the premise of applying the technology [1, 2]. In many top-coal caving faces, the top-coal drawing capability is not evaluated before mining, which leads to reduced economic benefits. Therefore, research into top-coal drawing capability can not only improve the economic benefit but also provide an important theoretical basis for mine production [3–9].

At present, several methods have been proposed for identifying top-coal drawing capability or difficulty, such as the distance discrimination method, support vector machine (SVM), Fisher discriminant analysis, and neural network methods. Liu et al. [10] established a distance discrimination analysis model for distinguishing the difficulty of top-coal caving in steep seams, which helped the popularization and application of the roadway caving method in steep seams. Liu et al. [11] built an SVM model based on a radial basis function for identifying the difficulty of top-coal caving, which provides a new method for determining the top-coal drawing capability. Long-jun et al. [12] established a Fisher discriminant analysis model for judging top-coal drawing capability. They selected nine indices, including mining depth of main roof, thickness of coal seam, and Protodyakonov coefficient, as classification indices of the model, which allows accurate prediction of top-coal drawing capability in coal seams under different mining conditions. Wang et al. [13] evaluated the top-coal drawing capability in fully mechanized caving faces using the artificial neural network method. They not only assessed the extent of top-coal caving but also predicted the comprehensive technical and economic indicators of the working face. Each of these methods has its own merits and demerits. For example, the distance discrimination method treats the various indices or factors pertaining to samples as equally important when determining the distance, while in fact these indices or factors do not play equally important roles in determining the classification of samples. Therefore, if the importance of the various indices or factors is not determined in advance, the distance discrimination method is likely to overstate the effect of some less important indices, leading to misjudgements in predictions [14].

The SVM is a machine learning method based on statistical learning theory and optimization theory that maximizes the geometric margin of the separating hyperplane. By introducing kernel functions, the method can transform nonlinear classification problems into linear ones in a high-dimensional space. However, the method is limited to the data decomposition itself and ignores the intrinsic structural compactness of the data. This not only results in high complexity of the algorithm but is also likely to lead to errors in processing noisy data, thus reducing classification accuracy. The Fisher discriminant analysis is a statistical analysis technique that identifies newly obtained samples according to existing quantitative characteristics of observed samples. The approach maps high-dimensional data points into a low-dimensional space so that the data points become more densely clustered. However, the matrix inversion and eigenvector calculation increase the computational workload, and pairwise extraction and a classification criterion need to be introduced for classification of multiple classes. The neural network method has not only a self-learning function but also an associative memory function. In the case of an onerous computational burden, a feedback artificial neural network designed for a specific problem can give full play to the high-speed computing power available, so it may find the optimal solution rapidly.

The general regression neural network (GRNN) has the advantages of simple structure, easy training, fast convergence, and strong fault tolerance and is mainly used in pattern classification problems such as fault diagnosis. In fact, most of the prediction indicators of top-coal drawing capability are correlated to some degree, so the correlation between them should be eliminated before applying the GRNN. Common methods to eliminate the correlation between indicators include limiting the number of indicators, separating overlapping elements, modifying indicator weights, principal component analysis (PCA), and factor analysis. Because many prediction indicators are used in this paper, the PCA is used to preprocess the indicator data, which not only eliminates the correlation between the indicators but also reduces the dimensionality of the indicator data and improves the performance of the GRNN. Based on the above analysis, 25 groups of data on the factors influencing top-coal drawing capability are selected, and the PCA is used to convert the multiple indicators into a few linearly uncorrelated ones, which eliminates correlation and reduces the dimensionality of the indices. The PCA–GRNN prediction model for top-coal drawing capability in steep seams is then established by combining the PCA and GRNN. In addition, a random forest model is used to compare the prediction accuracy of the various models. In this way, the performance of the established model in predicting top-coal drawing capability in steep seams is evaluated, which provides a basis for improving the accuracy of such predictions.

2. Factors Influencing Top-Coal Drawing Capability in Steep Seams

2.1. Roof Conditions

Roof conditions influence the top-coal drawing capability mainly through the stability of the immediate roof and the main roof. If the immediate roof can cave following mining, it does not influence the top-coal drawing capability; if the immediate roof is very hard and does not cave over a large area, it will significantly affect the working face upon failure of the roof.

In top-coal caving mining, stress on the main roof is relieved because the top coal buffers and absorbs the weighting, so the working face is less affected and suffers less damage during weighting, while the range of influence of the pressure is enlarged. Meanwhile, intense weighting on the main roof during slicing mining may induce spalling of the working face [15, 16].

2.2. Floor Conditions

Top-coal drawing capability of a coal seam is related to two interdependent factors, i.e., the stability and flatness of floors. That is, the flatter the floor, the more stable it is, and the better the top-coal drawing capability [17, 18].

2.3. Gas

Gas pressure is one of the causes of breakage of top coal. A high gas content can soften coal seams, which is conducive to breakage of top coal; however, the roadway caving method is not applicable to coal seams prone to gas outburst. Close attention should be paid to gas accumulation in coal seams with a high gas content [19–23].

2.4. Mining Depth

The mining depth directly influences the magnitudes of in situ stress and peak abutment pressure in surrounding rocks of a working face. The abutment pressure plays a decisive role in breakage of top coal. When ignoring the influences of the tectonic stress field, the greater the depth of occurrence of a coal seam, the more readily the critical failure condition of top coal is met and the higher the top-coal drawing capability, according to the Griffith strength criterion [24–26].

2.5. Dip Angle of Coal Seams

For coal seams with a large dip angle, the self-weight of a coal mass in the vertical direction is larger than the resultant of other forces acting thereon, so that coal mass is more likely to cave, which is favorable for drawing of top coal. If the dip angle reaches 90°, the force acting along the vertical direction only includes the weight of broken coal gangue in the goaf above the top coal, so the coal does not readily cave [27, 28].

2.6. Thickness of Coal Seams

Top coal is subjected to presplitting and damage in the mining process and stores significant amounts of energy. The energy is released during the migration of top coal, which breaks the coal mass. If the top coal is too thin, it is difficult to ensure that it caves at the tail of the support, and the immediate roof breaks in advance and is drawn out together with the top coal. As a result, a large amount of top coal is lost in the goaf. If the top coal is too thick, it is challenging to ensure sufficient looseness of top coal in the roof-control zone, so the top coal does not readily cave in the caving zone. In addition, there will not be enough room for caving if the top coal is too thick [29].

2.7. Hardness of Coal Seams

The hardness of coal seams is an important index for evaluating damage resistance of coal seams and directly influences the failure process and breakage degree of top coal under compressive stress. Therefore, it is inversely related to the top-coal drawing capability: the lower the hardness, the higher the capability. The hardness is represented by Protodyakonov's coefficient [30–32].

2.8. Dirt Bands

The extent of dirt bands is expressed as the ratio of the total thickness of dirt bands in a coal seam to the thickness of the coal seam. Dirt bands influence top-coal drawing capability as follows: if dirt bands are weaker than the coal mass, they form a weak plane in coal seams and their presence is conducive to the breakage, caving, and drawing of top coal, which improves the top-coal drawing capability. The more numerous, the thicker, and the softer the dirt bands are, the better the top-coal drawing capability, whereas dirt bands that are harder than the coal mass are unfavorable for drawing of top coal [33].

3. Principles Underpinning the Methods

3.1. PCA

The PCA is a dimension reduction method that uses an orthogonal transform to convert a series of possibly linearly correlated variables into a set of linearly uncorrelated new variables, the principal components (PCs). In this way, the new variables characterize the data features in a lower dimension. The PCs are linear combinations of the original variables, and their number is less than that of the original ones. The combination is equivalent to generating a new set of observations, which differ in meaning from the original data but contain most of its features, have lower dimensions, and are therefore convenient for further analysis.

The PCA can be spatially interpreted as mapping the original data into a new coordinate system. The first PC corresponds to the first coordinate axis and represents the range of variation of a new variable obtained by transforming the multiple variables in the original data; the second PC corresponds to the second coordinate axis and represents the range of variation of a second new variable obtained in the same way. In a similar fashion, the differences among samples described by the original data are transformed into differences described by the new variables. To retain as much of the information in the original data as possible, the maximum variance theory or minimum damage theory is generally used to ensure that the first PC has the largest variance (that is, it explains as much of the difference in the original data as possible). Each subsequent PC is orthogonal to the previous ones and has the largest remaining variance.

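The PCs are calculated as follows:

(1) For $n$ samples described by $m$ indices, the data matrix $X$ is

$$X = (x_{ij})_{n \times m}.$$

After standardization,

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, m,$$

where $\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}$ and $s_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_{ij} - \bar{x}_j\right)^2}$.

(2) The symmetric correlation coefficient matrix of standardized variables is calculated as

$$R = (r_{jk})_{m \times m}.$$

The correlation coefficient between variables is

$$r_{jk} = \frac{1}{n-1}\sum_{i=1}^{n} z_{ij} z_{ik}.$$

(3) The matrix $R$ is subjected to eigenvalue decomposition to calculate eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m$ and the corresponding eigenvectors $u_1, u_2, \ldots, u_m$.

(4) PCs are calculated as

$$F_k = Z u_k, \quad k = 1, 2, \ldots, m,$$

where $Z = (z_{ij})_{n \times m}$ is the standardized data matrix.

(5) Contribution and cumulative contribution of each PC are calculated as follows:

$$\eta_k = \frac{\lambda_k}{\sum_{j=1}^{m} \lambda_j}, \qquad \eta_{\Sigma}(p) = \frac{\sum_{k=1}^{p} \lambda_k}{\sum_{j=1}^{m} \lambda_j}.$$

In actual application, the first to the $p$th ($p \le m$) PCs corresponding to eigenvalues whose cumulative contribution is greater than 85%, or PCs whose eigenvalues are greater than 1, are generally selected.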
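Steps (1) to (5) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pca_correlation`, the 85% default threshold argument, and the synthetic demo matrix are all hypothetical, and the data from Table 1 are not used.

```python
import numpy as np

def pca_correlation(X, cum_threshold=0.85):
    """PCA on the correlation matrix of X (rows = samples, columns = indices)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Step (1): standardize each column (zero mean, unit sample std).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step (2): correlation coefficient matrix of the standardized variables.
    R = Z.T @ Z / (n - 1)
    # Step (3): eigen-decomposition; eigh returns ascending order, so reorder.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step (5): contribution and cumulative contribution of each PC.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    # Keep the smallest number of PCs whose cumulative contribution
    # reaches the threshold.
    p = min(int(np.searchsorted(cumulative, cum_threshold)) + 1, len(eigvals))
    # Step (4): PC scores are projections of Z onto the leading eigenvectors.
    scores = Z @ eigvecs[:, :p]
    return scores, eigvals, contribution, cumulative

# Synthetic demo: 25 samples, 4 indices built from 2 underlying signals
# (made-up numbers for illustration only).
rng = np.random.default_rng(0)
base = rng.normal(size=(25, 2))
X_demo = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=25),
    base[:, 1],
    base[:, 1] - 0.1 * rng.normal(size=25),
])
scores, eigvals, contribution, cumulative = pca_correlation(X_demo)
print(scores.shape, np.round(cumulative, 3))
```

Because the decomposition is applied to a correlation matrix, the eigenvalues sum to the number of indices, and the retained score columns are mutually uncorrelated with variances equal to their eigenvalues.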

3.2. GRNN

The GRNN integrates density estimation and Bayesian decision theory on the basis of the radial basis function (RBF) neural network and replaces the sigmoid activation function with an activation function derived by statistical methods. The GRNN is also similar to the probabilistic neural network (PNN) in terms of structure, both comprising the input, model, summation, and output layers. The difference is that the GRNN has two types of neurons on the summation layer, allowing more comprehensive computation than the PNN.

The mapping relationship of the GRNN as shown in Figure 1 is established according to the following steps:

(1) The input layer is responsible for transferring input variables to the model layer via a linear function; neurons on the model layer correspond to different samples, and the transfer function is

$$p_i = \exp\left[-\frac{(X - X_i)^{T}(X - X_i)}{2\sigma^{2}}\right], \quad i = 1, 2, \ldots, n,$$

where $p_i$ denotes the output of the $i$th neuron on the model layer, $X$ is the input vector, $X_i$ is the learning sample corresponding to the $i$th neuron, and $\sigma$ is the smoothing factor.

(2) The output of the model layer is calculated through summation, in two ways. One is the arithmetic summation $S_D$:

$$S_D = \sum_{i=1}^{n} p_i.$$

The other is the weighted summation $S_{Nj}$:

$$S_{Nj} = \sum_{i=1}^{n} y_{ij}\, p_i, \quad j = 1, 2, \ldots, k,$$

where $y_{ij}$ is the connection weight, valued as the $j$th element in the $i$th output sample, and $k$ represents the dimension of the output vector of the learning samples.

(3) By dividing the outputs of the summation layer, the output of each neuron on the output layer is obtained as

$$y_j = \frac{S_{Nj}}{S_D}, \quad j = 1, 2, \ldots, k.$$
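The three layers described in the steps above amount to a Gaussian-weighted average of the training outputs. The following plain-Python sketch illustrates this under stated assumptions: the function name `grnn_predict`, the variable names, and the toy one-dimensional data are hypothetical and are not taken from the paper.

```python
import math

def grnn_predict(x, train_X, train_Y, sigma):
    """Predict the output vector for input x with smoothing factor sigma."""
    # Model layer: one Gaussian neuron per learning sample.
    p = []
    for xi in train_X:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        p.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    # Summation layer: the arithmetic sum of the neuron outputs ...
    s_d = sum(p)
    if s_d == 0.0:
        # All Gaussians underflowed: sigma is too small for this input.
        raise ValueError("smoothing factor too small for this input")
    # ... and one weighted sum per output dimension.
    k = len(train_Y[0])
    s_n = [sum(y[j] * pi for y, pi in zip(train_Y, p)) for j in range(k)]
    # Output layer: weighted sum divided by arithmetic sum.
    return [s / s_d for s in s_n]

# Toy data (made up): three learning samples with scalar outputs.
toy_X = [[0.0], [1.0], [2.0]]
toy_Y = [[0.0], [1.0], [2.0]]
sharp = grnn_predict([0.0], toy_X, toy_Y, sigma=0.1)     # close to the nearest sample
smooth = grnn_predict([0.0], toy_X, toy_Y, sigma=100.0)  # close to the mean output
middle = grnn_predict([1.0], toy_X, toy_Y, sigma=0.5)
print(sharp, smooth, middle)
```

The two extreme calls show the role of the smoothing factor: a very small value reproduces the nearest learning sample, while a very large value pulls every prediction toward the mean of the training outputs.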

3.3. PCA–GRNN Prediction Model for Top-Coal Drawing Capability in Steep Seams

The flowchart of the proposed PCA–GRNN prediction model for top-coal drawing capability in steep seams is shown in Figure 2. The main calculation steps are as follows.

Step 1. Selecting evaluation indices for factors influencing top-coal drawing capability.

Step 2. Collecting case data according to the indices.

Step 3. Zero-mean normalization of data about top-coal drawing capability to eliminate influences of different dimensions across indices on the test results, followed by correlation analysis of normalized data.

Step 4. Using PCA to eliminate correlation and reduce dimensions of indices, thus determining PCs.

Step 5. Establishing the GRNN model and setting its smoothing factor. The model is trained with training samples until satisfactory results are attained.

Step 6. Inputting testing samples for predicting levels of top-coal drawing capability in the trained model and using the evaluation indices to evaluate and compare the prediction accuracy.

4. Calculation Process and Results

4.1. Evaluation Indices and Data Pertaining to Top-Coal Drawing Capability in Steep Seams

Through comprehensive analysis, nine discriminant factors affecting top-coal drawing capability are selected: the type of main roof, the stability of the immediate roof, floor flatness, gas content, mining depth, dip angle of coal seams, thickness of coal seams, hardness of coal seams, and extent of dirt bands. The type of main roof is graded into four levels (insignificant, significant, intense, and very intense) according to the degree of first weighting. The stability of the immediate roof is divided into four grades, that is, unstable, moderately stable, stable, and very stable, based on different first caving steps of the immediate roof: less than 8 m, 8–18 m, 18–28 m, and 28–50 m. The floor flatness is graded at four levels as flat, less flat, rough, and very rough. The gas content, mining depth, dip angle of coal seams, and thickness of coal seams take corresponding values; the hardness of coal seams is represented by the Protodyakonov coefficient; and the extent of dirt bands is expressed as the ratio of the thickness of dirt bands to that of coal seams. The top-coal drawing capability is graded into four levels: very high (A), high (B), general (C), and poor (D). Table 1 lists the 25 groups of data selected in the research.

4.2. PCA Preprocessing

To avoid influences induced by dimensional differences of sample data, data in the training set and the test set are standardized and then subjected to PCA. Correlation coefficients of the various factors are listed in Table 2. The absolute value of the correlation coefficient reflects the degree of correlation between two factors: as it increases from 0 to 1, two factors are classed as uncorrelated, slightly correlated, significantly correlated, extremely significantly correlated, and completely correlated, respectively. Several pairs of factors in Table 2 have correlation coefficients with absolute values greater than 0.5, so they are significantly correlated.
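The screening of factor pairs against the 0.5 threshold can be sketched as follows. This is only an illustration: the helper names `pearson` and `correlated_pairs` are hypothetical, and the three short columns are made-up values, not the data from Table 1 or Table 2.

```python
import math
from itertools import combinations

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def correlated_pairs(columns, threshold=0.5):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs

# Made-up illustrative values (not the paper's data).
demo = {
    "depth": [300.0, 350.0, 420.0, 500.0, 560.0],
    "gas": [2.1, 2.6, 3.0, 3.9, 4.2],
    "dip": [60.0, 45.0, 70.0, 50.0, 65.0],
}
flagged = correlated_pairs(demo)
print(flagged)
```

Pairs flagged in this way are the ones whose shared information the PCA is meant to remove before the data reach the GRNN.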

The total variances, variance percent, and cumulative variance of the initial eigenvalues, extraction sums of squared loadings, and rotation sums of squared loadings of each PC are listed in Table 3: the first three PCs have eigenvalues larger than 1 and initial cumulative contributions of 40.670%, 62.567%, and 75.280%, respectively, which contain the majority of information pertaining to the original factors. Therefore, the first three PCs are selected here as the comprehensive evaluation indices reflecting top-coal drawing capability.
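The reported percentages can be checked against the selection rules of Section 3.1 with a short calculation. The sketch below assumes that the percentages refer to the initial eigenvalues of the correlation matrix of the nine standardized factors, so that the eigenvalues sum to 9; the variable names are hypothetical.

```python
# Number of standardized factors; eigenvalues of a correlation matrix sum to m.
m = 9
# Cumulative contributions of the first three PCs as reported in the text (%).
cumulative_pct = [40.670, 62.567, 75.280]

# Individual contribution of each PC is the difference of successive cumulatives.
individual_pct = [cumulative_pct[0]] + [
    later - earlier for earlier, later in zip(cumulative_pct, cumulative_pct[1:])
]
# Eigenvalue implied by each contribution: lambda_k = m * contribution_k.
eigenvalues = [m * pct / 100.0 for pct in individual_pct]

meets_eigenvalue_rule = all(ev > 1.0 for ev in eigenvalues)
meets_85_percent_rule = cumulative_pct[-1] > 85.0
print([round(ev, 3) for ev in eigenvalues], meets_eigenvalue_rule, meets_85_percent_rule)
```

Under this assumption the implied eigenvalues are about 3.66, 1.97, and 1.14, all above 1, while the cumulative contribution of 75.280% stays below 85%. The choice of three PCs therefore rests on the eigenvalue criterion rather than on the cumulative-contribution criterion.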

The scoring coefficients of components in each PC are listed in Table 4. On this basis, each PC is expressed as a linear combination of the nine standardized factors weighted by the corresponding scoring coefficients.

When the PCA reduces dimensions in the proposed PCA–GRNN model, the original influencing factors contribute to different degrees. According to the contributions of the original influencing factors and those of the PCs after dimension reduction, floor flatness, dip angle of coal seams, and hardness of coal seams are found to make the highest contributions (Table 5). Therefore, it is inferred that they have the most pronounced influence on top-coal drawing capability.

4.3. Parameter Determination of the GRNN

To demonstrate the benefit of the PCA in predicting top-coal drawing capability, the traditional GRNN model is compared with the model established in the present research. The PCA–GRNN model takes the three comprehensive evaluation indices (the PCs) for top-coal drawing capability as the input vectors, while the GRNN uses the nine selected indices as the input vectors. The research finds that the selection of the smoothing factor plays a critical role in the model performance. If the smoothing factor is too small, the network is likely to be overfitted, while if it is too large, the network fails to distinguish between various details. The smoothing factor here is selected through the construction method according to the following steps: training samples are input to train the original GRNN model, and the smoothing factor is then set to different values in the range of [0.1, 1.0], with an increment of 0.1. The relationship between the prediction accuracy of the model for training samples and the value of the smoothing factor is illustrated in Figure 3. The value of the smoothing factor corresponding to the highest accuracy is selected.
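The search over the range [0.1, 1.0] in steps of 0.1 can be sketched as a simple grid search. The sketch rests on assumptions that the paper does not state: the four levels are encoded as the integers 1 to 4, the GRNN output is rounded to the nearest level, and ties are resolved in favor of the smaller smoothing factor. All function names and the two-dimensional demo samples are hypothetical.

```python
import math

def grnn_predict(x, train_X, train_y, sigma):
    """Scalar GRNN output: Gaussian-weighted average of the training targets."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_X
    ]
    total = sum(weights)
    if total == 0.0:
        # Every Gaussian underflowed; fall back to the mean target.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

def accuracy(train_X, train_y, eval_X, eval_y, sigma):
    """Share of evaluation samples whose rounded prediction matches the level."""
    hits = sum(
        round(grnn_predict(x, train_X, train_y, sigma)) == y
        for x, y in zip(eval_X, eval_y)
    )
    return hits / len(eval_y)

def select_sigma(train_X, train_y, eval_X, eval_y):
    """Try sigma = 0.1, 0.2, ..., 1.0 and return the most accurate value."""
    candidates = [round(0.1 * i, 1) for i in range(1, 11)]
    scores = {s: accuracy(train_X, train_y, eval_X, eval_y, s) for s in candidates}
    best = max(candidates, key=lambda s: scores[s])  # first maximum wins on ties
    return best, scores

# Made-up samples: two per level, levels encoded as 1 to 4 (an assumption).
demo_X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9],
          [2.0, 2.0], [2.1, 1.9], [3.0, 3.0], [2.9, 3.1]]
demo_y = [1, 1, 2, 2, 3, 3, 4, 4]

# As in the text, accuracy is measured on the training samples themselves.
best_sigma, sigma_scores = select_sigma(demo_X, demo_y, demo_X, demo_y)
print(best_sigma, sigma_scores)
```

Since a GRNN reproduces its learning samples ever more closely as the smoothing factor shrinks, scoring on the training samples tends to favor small values, which matches the overfitting risk noted in the text; passing separate samples as `eval_X` and `eval_y` is one way to check for this.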

4.4. Predicted Results

In addition to comparing the accuracy of the GRNN model and the PCA–GRNN model (with PCA dimension reduction) in the training stage, the research also introduces the random forest model to compare the test results of different models. First, 20 groups (five groups at each level) of training samples are input as the learning samples to train each model, and then five groups of samples to be judged are input to test the performance of these models.

The test results are illustrated in Figure 4 and Table 6. For testing samples, the PCA–GRNN model always has higher accuracy than the other models. Among the learning samples, the PCA–GRNN model yields a result different from the actual level in only one sample, while its results for the testing samples are the same as the actual levels. It is evident that the PCA can improve the prediction accuracy of the model. The result indicates that the PCA–GRNN model provides a prediction method for accurately determining top-coal drawing capability.

As shown in Table 7, the GRNN and random forest models have lower accuracy than the PCA–GRNN model on both the training samples and the prediction samples. The PCA–GRNN model also takes less time to run. Therefore, applying this model to accurately identify the cavability of coal seams helps reduce blindness in the promotion work and improve the mining effect of top-coal caving in steeply inclined coal seams.

In view of the different levels of top-coal drawing capability, the necessary top-coal weakening measures should be taken to ensure the top-coal discharge rate and effectively prevent the occurrence of roof disasters.

5. Conclusions

(1) The PCA is used to process data related to factors influencing top-coal drawing capability, thus transforming nine influencing factors into three PCs. This reduces the dimensions of the data, simplifies the prediction model, and finally improves prediction efficiency and accuracy.

(2) PCA and GRNN are combined to establish the PCA–GRNN prediction model for top-coal drawing capability in steep seams, and an appropriate smoothing factor is selected. Comparisons with the GRNN and random forest models show that the accuracy of the PCA–GRNN model is higher.

(3) According to analysis of the contributions of each influencing factor to the PCs and of the PCs to the model in the PCA process, the importance of each influencing factor is calculated. In this way, floor flatness, dip angle of coal seams, and hardness of coal seams are found to make the largest contributions, so it is inferred that they exert significant influences on top-coal drawing capability.

(4) The combined PCA–GRNN model proposed in the research can improve the prediction accuracy, while the prediction results can be further optimized. In future research, the number of samples will be increased to improve the generalization ability and prediction accuracy of the model. At the same time, considering the influence of multi-seam mining, groundwater, and other factors on the cavability of top coal, the prediction model will be further improved.

Data Availability

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the State Key Laboratory of Coal Mining and Clean Utilization (2021-CMCU-KF016) and the Basic Scientific Research Projects of Universities in Liaoning Province (LJKZ0343).