Abstract
A rock bolt is a reinforcing bar commonly used in geotechnical engineering, and defect identification of the bolt anchorage system is essential to the safe operation of the reinforced structures. To extract defect information accurately, this paper proposes a CNN model based on time-frequency analysis that covers both time-domain and frequency-domain information. The effect of the number of convolution kernels on the defect identification results is discussed. In laboratory experiments, the STFT-based CNN is compared with 1D CNNs that take time-domain or frequency-domain input, and the results demonstrate that the proposed method achieves higher identification accuracy.
1. Introduction
A rock bolt is a reinforcing element commonly used in engineering to ensure the safe operation of reinforced structures. Rock bolts for reinforcing underground mine roofs have been increasingly used in the mining and tunneling sectors [1–3], where they act as the primary support system in underground mines. The performance of the bolt anchorage system is vital for both the safety of personnel and the productivity of the operation. Each year, millions of rock bolts are installed all over the world, and a feasible method is required for monitoring their integrity, especially for assessing whether a bolt has broken and poses a safety risk.
Various signal-based algorithms have been proposed for the identification of the anchorage system. For these algorithms, identification accuracy is determined by how sensitive the extracted features are to the fault. Feature extraction can be achieved by computing statistical metrics of the signal in the time, frequency, or time-frequency domain [4–7]. The authors of [8] proposed a new time-frequency analysis method, nonlinear sparse mode decomposition (NSMD), which was shown to be a feasible signal decomposition method and an effective method for planetary gearbox fault diagnosis. In [9], 17 time-domain and frequency-domain features were taken to build the characteristic matrix, and the features with the larger cumulative contribution were adopted as the input of an SVM to identify the bolt anchorage. However, designing appropriate features is a critical task that depends heavily on domain expertise and prior knowledge. To design such features, researchers must understand the structure and the principle of the anchorage system very well, which makes new features difficult to mine. Besides, because of environmental vibration and noise, real signals contain considerable noise, so that necessary information may be missed or detected by mistake. Thus, a novel approach that works as closely as possible on the originally acquired signal should be developed to preserve the defect features and reduce the effect of noise.
Fueled by the rapid advancement of deep learning, numerous studies are being conducted on feature extraction. The primary advantage of deep learning is its ability to mine representative information and sensitive features from raw data. High-level abstractions of data can be modeled well by complex deep structures, which makes feature extraction more efficient than with shallow networks. In [10], a 1D deep convolutional neural network was proposed for feature extraction from vibration signals. In [11], a deep multistream CNN was proposed to learn deep features for writer identification. In [12], three deep neural networks (deep Boltzmann machines, deep belief networks, and stacked autoencoders) were employed for feature extraction to identify the fault condition of rolling bearings. In [13], deep learning was adopted to forecast stock data. Benefiting from this feature learning characteristic, deep learning has been extensively used in visual recognition and language understanding [14–18]. The feature learning ability of deep learning matches the requirements of an adaptive feature extraction method, so its use for defect identification of bolt anchorage systems is promising and in high demand.
The convolutional neural network (CNN), one of the most successful network architectures in deep learning, has been applied with great success to learn features from raw data and has been adopted as the dominant approach for almost all recognition and detection tasks [19–24]. This paper develops a CNN model to learn features directly from the raw acceleration signals and tests the performance of feature learning from combined time-frequency data. The raw signal is converted into a time-frequency map using STFT, and the time-frequency map then acts as the input to the CNN. For comparison, three different representations (time-domain, frequency-domain, and time-frequency map) are adopted as input to the CNN. The results suggest that, compared with the other two inputs, the proposed method achieves higher identification accuracy.
Compared with SVM, the BP neural network, and the decision tree, the proposed model has the following apparent advantages:
(1) The model automatically extracts and recognizes the deep features that are embedded in the bolt signal without designing special feature extraction methods.
(2) Compared with a single time-domain or frequency-domain feature, the model mines the time-frequency characteristics of the bolt signal, which provides more abundant information for the accurate identification of subsequent defects.
The rest of this paper is organized as follows. In Section 2, how to build a suitable CNN model for the defect identification of bolt anchorage system is discussed. In Section 3, the experimental results for three different inputs are illustrated. Lastly, in Section 4, some conclusions are drawn.
2. Time-Frequency-Based Deep Model
2.1. Convolutional Neural Network
Convolutional neural networks (CNNs) were first proposed by LeCun [25] for image processing and are characterized by two key properties: spatially shared weights and spatial pooling. A CNN learns abstract features by alternating and stacking convolutional layers and pooling layers. The architecture of a typical CNN, shown in Figure 1, is structured as a series of stages. The first few stages consist of two types of layers (convolutional layers and pooling layers), while the last stage consists of a fully connected layer and a conventional classification model. The convolutional layers convolve multiple local filters (convolutional kernels) with the raw input data and generate translation-invariant local features, and the subsequent pooling layers extract fixed-length features over sliding windows of the preceding feature maps according to a rule such as averaging or taking the maximum.

2.2. Analysis of Input Representations for CNN
CNN models have been applied successfully in many computer vision tasks, where the input data are usually 2D [26–28]. A 1D (one-dimensional) CNN model, which considers only time-domain information, has been studied for anchorage defect identification and exhibits better identification accuracy than shallow learning. Here, both time-domain and frequency-domain information are considered, which is expected to achieve a better recognition rate than the 1D CNN.
The DHDAS dynamic test device is employed to collect the acceleration signals of the normal reinforced bolt and of three types of reinforced anchor bolts with different defects. The signal acquisition process is illustrated in Figure 2. The four types of anchorage models are made of rebar, cement, mortar, C45 concrete, and PVC tube (see Figure 3).


The experimental procedure is as follows. First, an excitation signal is applied to the tip of the anchor bar with a small hammer, producing a stress wave, and the reflected wave is received by the IEPE piezoelectric acceleration sensor. Then, the acceleration signal is collected and stored with the DH5923 N dynamic signal test and analysis device at a sampling frequency of 10 kHz.
The major specifications of the anchorage models are listed in Table 1, where No. 1 denotes the normal reinforced bolt and Nos. 2–4 are reinforced bolts with different defects. The corresponding tangent diagrams are given in Figure 4. Besides, a set of acceleration signals collected in the experiment is shown in Figure 5.

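[Figure 4: Diagrams of the four anchorage models, panels (a)–(d).]

[Figure 5: Acceleration signals collected in the experiment, panels (a)–(d).]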
During the experiment, 260 sets of acceleration signals are collected for each type of anchorage model. If all the sample points were adopted as the CNN input, the training time would be very long and the data features would be redundant, reducing the recognition rate of the anchorage defects. Figure 5 suggests that the acceleration signals differ markedly between 0.02 s and 0.04 s, which reflects the different characteristics of the anchorage systems. Accordingly, the sampling points between 0.02 s and 0.04 s are used for experimental analysis.
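As a rough illustration of this windowing step, the sketch below crops a record to the 0.02 s to 0.04 s interval at the 10 kHz sampling frequency stated earlier. The function names, the placeholder record, and the choice of a half-open interval (end point excluded) are assumptions made for the example, not details given in the paper.

```python
def window_indices(fs_hz, t_start_s, t_end_s):
    """Return (first, last_exclusive) sample indices for [t_start_s, t_end_s)."""
    first = round(t_start_s * fs_hz)
    last = round(t_end_s * fs_hz)
    return first, last


def crop_signal(signal, fs_hz, t_start_s=0.02, t_end_s=0.04):
    """Keep only the samples that fall inside the analysis window."""
    first, last = window_indices(fs_hz, t_start_s, t_end_s)
    return signal[first:last]


FS_HZ = 10_000  # sampling frequency reported for the experiment

# Hypothetical 0.1 s record (1000 samples); zeros stand in for measured data.
record = [0.0] * 1000
segment = crop_signal(record, FS_HZ)
print(len(segment))
```

Under these assumptions, the window spans samples 200 to 399, that is, 200 points per signal.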
In actual engineering, the anchor bolt is disturbed by construction activity or natural conditions, so the signals collected from anchor bolts of the same type differ slightly. To enhance the generalization ability of the established identification model, a generative adversarial network (GAN) is introduced to augment the dataset. The principle of the GAN is shown in Figure 6.

In the experiment, the GAN is used to generate 740 samples for each type of bolt, which are combined with the 260 measured samples per type to form a data set of 4000 samples. The data set is randomly shuffled, with 3300 samples taken for training and 700 for testing.
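The data set sizes above can be checked with a short sketch: four bolt types, each with 260 measured and 740 generated samples, shuffled and split into 3300 training and 700 test samples. The `(label, index)` placeholders and the fixed random seed are assumptions for illustration; the paper does not describe how the shuffle was implemented.

```python
import random

N_TYPES = 4               # one normal bolt and three defect types
MEASURED_PER_TYPE = 260   # signals collected in the laboratory
GENERATED_PER_TYPE = 740  # signals generated by the GAN
N_TRAIN, N_TEST = 3300, 700

# Each sample is represented by a (label, index) placeholder, not real data.
per_type = MEASURED_PER_TYPE + GENERATED_PER_TYPE
samples = [(label, i) for label in range(N_TYPES) for i in range(per_type)]

rng = random.Random(0)    # fixed seed, chosen only for reproducibility
rng.shuffle(samples)

train_set = samples[:N_TRAIN]
test_set = samples[N_TRAIN:N_TRAIN + N_TEST]
print(len(samples), len(train_set), len(test_set))
```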
The short-time Fourier transform (STFT) is a feasible method to analyze how frequency content changes over time. STFT is based on the Fourier transform of short fragments of the signal, which are obtained by moving a window, commonly a Hamming window or a Gaussian window, along the signal. In the experiment, a Hamming window with a window size of 128 and an overlap of 64 samples is employed to transform the acceleration signals into time-frequency spectrograms using STFT, and the results are presented in Figure 7. To reduce the amount of calculation and optimize the training of the CNN, grayscale processing is performed on the spectrograms, as illustrated in Figure 8.
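A minimal sketch of this transform is given below, using only the Python standard library so that each step is visible; in practice a library routine such as `scipy.signal.stft` would normally be used. The window length (128) and overlap (64) follow the text, while the symmetric form of the Hamming window, the linear 0 to 255 grayscale mapping, and the synthetic 1250 Hz test tone are assumptions made for the example.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n (assumed form)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def stft_magnitude(signal, win_len=128, overlap=64):
    """Magnitude STFT; rows are frequency bins 0..win_len/2, columns are frames."""
    hop = win_len - overlap
    window = hamming(win_len)
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + k] * window[k] for k in range(win_len)]
        # Direct DFT of the windowed frame, keeping non-negative frequencies.
        spectrum = [
            abs(sum(seg[k] * cmath.exp(-2j * math.pi * m * k / win_len)
                    for k in range(win_len)))
            for m in range(win_len // 2 + 1)
        ]
        frames.append(spectrum)
    return [list(row) for row in zip(*frames)]


def to_grayscale(spec):
    """Linearly rescale a spectrogram to integer gray levels 0..255."""
    lo = min(min(row) for row in spec)
    hi = max(max(row) for row in spec)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[int(round((v - lo) * scale)) for v in row] for row in spec]


FS_HZ = 10_000
TONE_HZ = 1250  # synthetic tone, not a measured bolt signal
signal = [math.sin(2 * math.pi * TONE_HZ * n / FS_HZ) for n in range(512)]
spec = stft_magnitude(signal)
gray = to_grayscale(spec)
print(len(spec), len(spec[0]))  # frequency bins, frames
```

With a 128-point window at 10 kHz, the bins are spaced 78.125 Hz apart, so the synthetic tone falls exactly on bin 16.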

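[Figure 7: Time-frequency spectrograms of the acceleration signals obtained by STFT, panels (a)–(d).]

[Figure 8: Spectrograms after grayscale processing, panels (a)–(d).]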
2.3. The Architecture of CNN Based on STFT
In this section, the architecture of the CNN is discussed to make it suitable for bolt defect identification. Since a CNN can take an image directly as input, the time-frequency spectrogram after grayscale processing is taken as the input of the CNN.
To avoid overfitting, a CNN model is built with 2 convolutional layers and 2 pooling layers (see Figure 9). The convolution kernel size is set as 5 × 5, and the stride of pooling is 2. To simplify the network, the number of convolution kernels of the second convolutional layer is set as twice that of the first convolutional layer. The defect identification results of the bolt anchorage system with the numbers of convolution kernels of the two convolutional layers set as (2, 4), (3, 6), (4, 8), (5, 10), (6, 12), (7, 14), and (8, 16) are listed in Table 2, where No. 1 represents structure (2, 4), No. 2 represents structure (3, 6), and so on.

Table 2 indicates that, as the number of convolution kernels increases, the defect identification rate rises first and then falls. When the number of convolution kernels is very small, the network cannot learn the internal characteristics of the data well, resulting in a lower identification rate. When the number of convolution kernels is too large, the extracted characteristics become redundant, which also decreases the identification rate. In addition, as the CNN structure becomes more complex, the training time is prolonged. Considering the defect identification rate, mean square error, and training time together, the model adopts the network structure with 5 convolution kernels in the first convolutional layer and 10 in the second.
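One rough way to see why the network becomes more expensive to train as kernels are added is to count the trainable parameters of the two convolutional layers for each candidate structure. The sketch below assumes a single-channel input, one 5 × 5 kernel per pair of input and output feature maps, and one bias per output map; the paper does not state the connectivity between the layers, so these counts are illustrative only.

```python
KERNEL = 5  # kernel width and height, as given in the text


def conv_param_count(n1, n2, in_channels=1, k=KERNEL):
    """Weights plus biases of two stacked convolutional layers (assumed full connectivity)."""
    layer1 = n1 * (in_channels * k * k + 1)
    layer2 = n2 * (n1 * k * k + 1)
    return layer1 + layer2


# Candidate structures (2, 4), (3, 6), ..., (8, 16) from the text.
candidates = [(n, 2 * n) for n in range(2, 9)]
counts = {c: conv_param_count(*c) for c in candidates}
for (n1, n2), total in counts.items():
    print(f"({n1}, {n2}): {total} parameters")
```

Under these assumptions the count grows roughly quadratically with the number of kernels, because the second layer scales with the product of the two layer widths.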
Based on the discussion above, the CNN model for bolt defect identification is illustrated in Figure 9. It comprises an input layer, 2 convolutional layers, 2 pooling layers, and an output layer.
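The feature map sizes quoted in this section can be traced with a few lines of arithmetic. The sketch assumes stride-1 convolutions without padding and non-overlapping pooling with a stride of 2, which is consistent with the 64 × 64, 60 × 60, 30 × 30, and 13 × 13 sizes given in the text; the helper names are hypothetical.

```python
def conv_out(size, kernel=5):
    """Output width of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_out(size, stride=2):
    """Output width of non-overlapping pooling with the given stride."""
    return size // stride


def trace_sizes(input_size=64, kernel=5, pool_stride=2, n_stages=2):
    """List the feature map width after every convolution and pooling stage."""
    sizes = [("input", input_size)]
    size = input_size
    for stage in range(1, n_stages + 1):
        size = conv_out(size, kernel)
        sizes.append((f"conv{stage}", size))
        size = pool_out(size, pool_stride)
        sizes.append((f"pool{stage}", size))
    return sizes


for name, size in trace_sizes():
    print(f"{name}: {size} x {size}")
```

If the output layer operates on the flattened result of the last pooling stage, which is an assumption here, ten 13 × 13 maps would give 1690 input values.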
In Figure 9, five 5 × 5 convolutional kernels are employed to convolve the 64 × 64 input feature map, and five feature maps with a size of 60 × 60 are obtained. The general form of the convolution operation is expressed by

$$x_j^l = f\left(\sum_{i} x_i^{l-1} * k_{ij}^l + b_j^l\right),$$

where $*$ denotes the operator of two-dimensional discrete convolution; $x_j^l$ is the jth output feature map of the lth layer; $x_i^{l-1}$ is the ith feature map of the (l−1)th layer; $k_{ij}^l$ is the jth convolution kernel of the lth layer, applied to the ith input feature map; $b_j^l$ is the jth bias of the lth layer; and $f(\cdot)$ denotes the activation function.

Convolutional and pooling layers appear alternately. The pooling operation reduces the resolution of the output feature map while still maintaining the features extracted from the high-resolution feature maps. The commonly adopted pooling operations are max pooling and average pooling, expressed as follows:

$$x_j^l = f\left(\mathrm{down}\left(x_j^{l-1}\right)\right),$$

where down(·) denotes the pooling function and $f(\cdot)$ is the activation function of the pooling layer. Average pooling is adopted here, taking the average of all values in the sampling window as the feature value, as shown in Figure 10.
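The convolution and average pooling operations described above can be sketched in plain Python as follows. The code implements a true discrete convolution (the kernel is flipped), sums the contributions of all input maps, adds a bias, and applies an activation; the paper does not name the activation function, so `math.tanh` as the default and the indexing `kernels[i][j]` are assumptions for the example.

```python
import math


def conv2d_valid(x, k):
    """Two-dimensional discrete convolution without padding (kernel flipped)."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[r + u][c + v] * k[kh - 1 - u][kw - 1 - v]
                 for u in range(kh) for v in range(kw))
             for c in range(ow)] for r in range(oh)]


def conv_layer(inputs, kernels, biases, f=math.tanh):
    """Output map j = f(sum over input maps i of inputs[i] * kernels[i][j] + biases[j])."""
    outputs = []
    for j, b in enumerate(biases):
        partial = [conv2d_valid(x_i, kernels[i][j]) for i, x_i in enumerate(inputs)]
        oh, ow = len(partial[0]), len(partial[0][0])
        outputs.append([[f(sum(p[r][c] for p in partial) + b)
                         for c in range(ow)] for r in range(oh)])
    return outputs


def avg_pool(x, size=2):
    """Average pooling over non-overlapping size x size windows."""
    oh, ow = len(x) // size, len(x[0]) // size
    return [[sum(x[r * size + u][c * size + v]
                 for u in range(size) for v in range(size)) / size ** 2
             for c in range(ow)] for r in range(oh)]


# Toy 5 x 5 input and one 2 x 2 kernel; identity activation keeps values readable.
image = [[r * 5 + c + 1 for c in range(5)] for r in range(5)]
kernel = [[1, 0], [0, -1]]
maps = conv_layer([image], [[kernel]], [0], f=lambda v: v)
pooled = avg_pool(maps[0])
print(maps[0])
print(pooled)
```

In this toy case every output of the convolution is the difference between two diagonal neighbours of the input, so the 4 × 4 feature map is constant and pooling reduces it to a 2 × 2 map with the same value.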

As shown in Figure 9, five 60 × 60 feature maps are obtained by convolving the input feature map, and five 30 × 30 feature maps are obtained after pooling, so the dimension of the feature maps is reduced. There are two sets of convolutional and pooling layers: the second convolutional layer convolves the output of the first pooling layer with 10 filters of size 5 × 5 to produce a total of 10 feature maps of size 26 × 26. After average pooling, ten 13 × 13 feature maps are yielded. Softmax is taken as the activation function in the output layer; it takes a vector of arbitrary real-valued scores and squashes it into a vector of values between zero and one that sum to one. As discussed above, the convolutional and pooling layers act as feature extractors for the input image, while the output layer serves as a classifier.
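The softmax step can be illustrated with a small sketch. The four scores are hypothetical values standing in for the output layer's responses to the four anchorage types, and subtracting the maximum score before exponentiation is a standard numerical safeguard rather than something described in the paper.

```python
import math


def softmax(scores):
    """Map real-valued scores to positive values that sum to one."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1, -1.0]  # hypothetical scores for the four bolt types
probs = softmax(scores)
predicted = probs.index(max(probs))
print(probs, predicted)
```

The class with the largest score also receives the largest probability, so the predicted bolt type is simply the index of the maximum.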
3. Results and Discussion
In this section, three different data formats are adopted as input to the CNN for the defect identification of the bolt anchorage system. For the time-domain representation, the original acceleration signal collected in the experiment serves directly as the input to the CNN model. For the frequency-domain representation, the original acceleration signal processed by FFT is adopted as the input. For the time-frequency representation, the original acceleration signal processed by STFT is taken as the input.
Shallow learning models, such as SVM, the BP neural network, and the decision tree, are also applied to the defect identification of the bolt anchorage system. The input of the SVM and the BP neural network is the 3-layer wavelet decomposition coefficients of the time-domain signal, and the input of the decision tree is a combination of time-domain and frequency-domain features. The corresponding identification results are listed in Table 3.
Table 4 shows identification results of the CNN model with different inputs.
Tables 3 and 4 show that, compared with SVM, BP, and the decision tree, the defect recognition method based on the CNN model obtains a higher recognition rate. The reason is that SVM, BP, and the decision tree rely on hand-crafted feature extractors before identifying the type of defect, which cannot fully mine the internal features of the anchor bolt signal and thus limit the improvement of the recognition rate. In contrast, the CNN-based recognition method does not require manual feature extraction; it mines the representative information of the original signal automatically and forms higher-level features for identification.
Table 4 shows that the CNN model with the time-frequency spectrogram as input has the highest recognition accuracy of 98.98%, which is 0.84% and 14.85% higher than the recognition rate of the CNN model with time-domain and frequency-domain input, respectively. The reason is that the time-frequency spectrogram covers both time-domain and frequency-domain information and therefore carries more information than time-domain or frequency-domain data alone. For the CNN model with the time-frequency spectrogram, the mean square error is the smallest, while the training time is the longest. Although its test time is also longer than that of the other two models, it remains within an acceptable range. The analysis suggests that the CNN recognition model based on STFT obtains better recognition results at the cost of more training time.
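For readers who want to reproduce this kind of comparison, the sketch below shows one plausible way to compute a recognition rate and a mean square error from network outputs. The paper does not define how its mean square error is calculated, so averaging the squared difference between softmax outputs and one-hot targets is an assumption, and the outputs and labels are made-up values, not results from the experiment.

```python
def recognition_rate(outputs, labels):
    """Fraction of samples whose largest output matches the true label."""
    correct = sum(1 for out, y in zip(outputs, labels)
                  if out.index(max(out)) == y)
    return correct / len(labels)


def mean_square_error(outputs, labels):
    """Mean squared difference between outputs and one-hot targets (assumed definition)."""
    total, count = 0.0, 0
    for out, y in zip(outputs, labels):
        for c, p in enumerate(out):
            target = 1.0 if c == y else 0.0
            total += (p - target) ** 2
            count += 1
    return total / count


# Hypothetical outputs for four test samples and four bolt types.
outputs = [
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.7, 0.1, 0.0],
    [0.1, 0.1, 0.2, 0.6],
    [0.5, 0.3, 0.1, 0.1],
]
labels = [0, 1, 3, 1]
print(recognition_rate(outputs, labels), mean_square_error(outputs, labels))
```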
4. Conclusions
Since a 1D CNN model can only learn either time-domain or frequency-domain information, a CNN model based on STFT is built for defect identification of the bolt anchorage system. By analyzing the effect of the number of convolution kernels on the recognition results, the optimal structure of the STFT-based CNN model is determined, and the model is compared with two other CNN models that take time-domain and frequency-domain input. The experimental results suggest the following:
(1) More convolution kernels are not necessarily better. The optimal structure of the network depends on the input data type, the size of the data set, and the specific problem to be solved.
(2) When the time-frequency spectrogram is adopted as the CNN input, the model obtains a higher recognition rate, whereas the training time is prolonged; that is, the model trades training time for recognition rate.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was financed by the National Natural Science Foundation of China, under Grant no. 51674169, the Department of Education of Hebei Province of China under Grant nos. ZD2019140 and QN2019031, and the Natural Science Foundation of Hebei Province of China under Grant no. F2019210243.