Abstract

To address the low accuracy of existing fault detection methods, a fault detection method based on a residual network and Faster R-CNN is proposed. First, the image is input into the ResNet-50 feature extraction network to obtain the corresponding feature map. The RPN structure then generates candidate boxes, which are projected onto the feature map to obtain the corresponding feature matrices. Finally, the ROI pooling layer scales each feature matrix to a fixed-size feature map, which is flattened and passed through a series of fully connected layers to obtain the prediction result. ResNet-50 mainly addresses the network degradation and overfitting caused by deepening the network when extracting deep features of faults. Faster R-CNN realizes end-to-end training. The method combines the advantages of ResNet-50 and Faster R-CNN and localizes faults precisely and efficiently, with a fault detection accuracy of about 90%. Data augmentation is further optimized, which improves the generalization ability of the network and its detection results, and the feasibility of the method is verified on actual seismic data.

1. Introduction

Fault detection is an important part of seismic data interpretation, and accurate fault detection is of great significance to oil and gas exploration and development. The traditional approach is to manually pick discontinuous sample points in the section and connect them into a line. This requires human-computer interaction and is inefficient, and human factors introduce great uncertainty into the results, which increases the cost and risk of oil and gas exploration and development. To overcome the shortcomings of manual fault detection, various automatic methods have been proposed since the 1990s. Bahorich et al. proposed the first-generation coherence cube algorithm, C1, based on normalized cross-correlation. It detects faults from the correlation between three seismic traces, highlighting discontinuities in the seismic data and suppressing continuous features. Its disadvantage is that it is very susceptible to interference from coherent noise [1, 2]. To address this shortcoming, the second-generation coherence cube algorithm, C2, was proposed. C2 is far less susceptible to coherent noise and has a high overall detection rate, but its detection rate for small faults is low and its computation time is longer. Then came the third-generation coherence cube algorithm, C3. C3 overcomes the shortcomings of C2 and is fast and robust to noise, but the size of the coherence window is difficult to choose and the fault location is blurred. In addition to the coherence cube algorithms, many other fault detection and interpretation methods have emerged [2, 3].
Marfurt and Hale used similarity algorithms to detect discontinuities in seismic data. Marfurt et al. proposed an eigenstructure coherence algorithm, which constructs a covariance matrix from the seismic traces and computes coherence from its eigendecomposition. These algorithms detect faults with high accuracy but require more computation time on seismic data with a low signal-to-noise ratio. Subsequently, to extract small faults and merge them into larger ones, a series of ant tracking algorithms was proposed, and Pedersen et al. gave a detailed description of the ant tracking algorithm. Its advantages are high accuracy and the ability to detect subtle faults. Its disadvantages are that it tracks every region that belongs to a fault with some probability, so its credibility is low and its amount of computation is large. Wang et al. proposed a fault detection method based on the Hough transform. Discontinuity features are first computed and thresholded to enhance the fault area, the Hough transform is then used to detect the linear features of the faults, the detected features are filtered, and finally the remaining fault fragments are connected and labeled. The method has high detection accuracy for larger faults, but small faults are severely lost [4].

With the continuous development and application of machine learning, machine learning algorithms have entered the field of seismic fault detection. Machine learning realizes simple fault detection by learning characteristic parameters. Huang et al. used various seismic attributes as the input of a convolutional neural network to detect faults automatically. Xiong et al. used convolutional neural networks for fault detection and applied a U-Net image segmentation network and a deep residual network combined with transfer learning, which can improve the generalization ability of the network and optimize its detection results. Xiong et al. also extracted seismic sections along different directions of the seismic data volume and used a five-layer CNN to detect faults, but the reliability of the detected faults was low. Although convolutional neural networks are now used for seismic fault detection, current methods still have shortcomings. Trained models predict well only on data with a relatively high signal-to-noise ratio [5–7], and in complex real scenes they cannot locate faults accurately. This paper proposes using ResNet-50 + Faster R-CNN to accurately detect the fault area. Experiments on actual seismic data illustrate the feasibility of using neural networks to detect faults and the significance of further research.

2. Materials and Methods

2.1. Residual Network and Faster R-CNN Network

ResNet was proposed in 2015 and won first place in the ImageNet classification task. It has been widely used in detection, segmentation, recognition, and other fields. Deep neural networks keep growing deeper, but as depth increases, model accuracy does not always improve. After the network is deepened, not only the test error but also the training error becomes higher. Deeper networks also suffer from vanishing and exploding gradients, which hinder convergence. ResNet addresses this degradation problem, in which performance drops as depth increases, by adopting identity mapping. The original network is shown in Figure 1: for an input $x$, the desired output is $H(x)$, and the network produces its output $F(x)$ through a series of operations. ResNet instead adopts the identity mapping shown in Figure 2: let $H(x) = F(x) + x$, so that the network only needs to learn to output the residual $F(x) = H(x) - x$.

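The function $F$ represents a residual function, as shown below:

$$y_l = h(x_l) + F(x_l, W_l), \qquad x_{l+1} = f(y_l), \tag{1}$$

where $x_l$ and $x_{l+1}$ are the input and output of the $l$-th residual unit, $W_l$ denotes its weights, $h$ is the shortcut mapping, and $f$ is the activation function applied after the addition.

The identity mapping is used, so $h(x_l) = x_l$, and the activation function is not used after the addition, so $f(y_l) = y_l$. Formula (2) can then be obtained:

$$x_{l+1} = x_l + F(x_l, W_l). \tag{2}$$

Applying Formula (2) recursively gives the expression for the $L$-th layer, as shown below:

$$x_L = x_l + \sum_{i=l}^{L-1} F(x_i, W_i). \tag{3}$$

With $E$ denoting the loss function, backpropagation gives the gradient at the $l$-th layer:

$$\frac{\partial E}{\partial x_l} = \frac{\partial E}{\partial x_L} \frac{\partial x_L}{\partial x_l} = \frac{\partial E}{\partial x_L} \left( 1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F(x_i, W_i) \right). \tag{4}$$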

The gradient at the shallow layer $l$ contains the gradient at the deep layer $L$. Put simply, the gradient at layer $L$ is transferred directly to layer $l$. Because vanishing gradients mainly occur in the shallow layers, ResNet-50 mitigates the vanishing gradient problem of deep neural networks by passing the deep-layer gradient directly to the shallow layers [8, 9]. This explains the necessity of using ResNet-50.
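The effect of the identity shortcut on the gradient can be checked numerically. The sketch below is only an illustration under stated assumptions: each residual branch is replaced by a hypothetical scalar function $F(x, w) = w \tanh(x)$ with deliberately tiny weights, and the names `branch`, `residual_stack`, and `plain_stack` are invented here. It compares the gradient of the last unit with respect to the first for a stack with shortcuts and for the same stack without them.

```python
import math

def branch(x, w):
    """Toy residual branch F(x, w) = w * tanh(x); a stand-in for the conv layers."""
    return w * math.tanh(x)

def branch_grad(x, w):
    """dF/dx for the toy branch."""
    return w * (1.0 - math.tanh(x) ** 2)

def residual_stack(x, weights):
    """Return (x_L, dx_L/dx_l) for units x_{i+1} = x_i + F(x_i, w_i)."""
    grad = 1.0
    for w in weights:
        grad *= 1.0 + branch_grad(x, w)   # the identity path contributes the "1"
        x = x + branch(x, w)
    return x, grad

def plain_stack(x, weights):
    """Same layers without shortcuts: x_{i+1} = F(x_i, w_i)."""
    grad = 1.0
    for w in weights:
        grad *= branch_grad(x, w)         # no identity term, so small factors multiply
        x = branch(x, w)
    return x, grad

weights = [1e-3] * 50                     # 50 units with deliberately tiny weights
_, res_grad = residual_stack(0.5, weights)
_, plain_grad = plain_stack(0.5, weights)
print(f"residual: {res_grad:.4f}  plain: {plain_grad:.3e}")
```

With the shortcut, the gradient stays close to 1 even though every branch is nearly zero, whereas the plain stack multiplies 50 small factors and the gradient effectively vanishes.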

Faster R-CNN, proposed in 2015, is the first truly end-to-end deep learning detection algorithm. Its biggest advantage is that, by adding the RPN, it generates candidate boxes based on the anchor mechanism and integrates feature extraction, candidate box selection, bounding box regression, and classification into one network, which effectively improves detection accuracy and efficiency [10, 11]. The input image is first resized and passed through the convolutional layers to obtain the feature map. The feature map is sent to the RPN, which generates a series of candidate boxes that may contain the detection object. The original feature map and all candidate boxes output by the RPN are then input to the ROI pooling layer, which extracts the proposals and computes a fixed-size 7 × 7 proposal feature map for each. These feature maps are finally sent to the fully connected layers for target classification and coordinate regression. The specific process is shown in Figure 3.

The target detection system is mainly composed of the modules in Figure 3. ResNet-50 performs feature extraction, a deep fully convolutional network (the RPN) generates region proposal boxes, and the Faster R-CNN detector uses these proposals. Together they form a single, unified object detection network. The algorithm flow can be divided into 3 steps. (1) Input the image into the feature extraction network to obtain the corresponding feature map. (2) Use the RPN structure to generate candidate boxes and project them onto the feature map to obtain the corresponding feature matrices. (3) Scale each feature matrix to a 7 × 7 feature map through the ROI pooling layer, and then flatten the feature map and pass it through a series of fully connected layers to obtain the prediction result [12].
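Step (3) can be illustrated with a minimal ROI max pooling sketch. It is not the implementation used in this paper: it works on a single-channel feature map stored as a list of lists, assumes the candidate box is already given in integer feature-map coordinates, and the function name `roi_max_pool` is hypothetical.

```python
def roi_max_pool(feature, roi, out_size=7):
    """Max-pool roi = (x0, y0, x1, y1) of a 2-D feature map to out_size x out_size.

    Coordinates are feature-map cells, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_size):
        # Rows of bin i span floor(i*h/out) .. ceil((i+1)*h/out); never empty for h >= 1.
        ys = y0 + (i * h) // out_size
        ye = y0 + -(-((i + 1) * h) // out_size)
        row = []
        for j in range(out_size):
            xs = x0 + (j * w) // out_size
            xe = x0 + -(-((j + 1) * w) // out_size)
            row.append(max(feature[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled

# A 14 x 14 toy feature map whose value encodes its position.
feature = [[r * 14 + c for c in range(14)] for r in range(14)]
full = roi_max_pool(feature, (0, 0, 14, 14))   # each bin covers 2 x 2 cells
small = roi_max_pool(feature, (2, 3, 5, 8))    # a 3 x 5 box still yields 7 x 7
print(len(full), len(full[0]), len(small), len(small[0]))
```

Whatever the size of the candidate box, the output is always 7 × 7, which is what allows the subsequent fully connected layers to have a fixed input dimension.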

2.2. Fault Detection Based on Convolutional Neural Network

The ResNet-50 model simplifies the learning goal of the network, extracts fault features accurately, reduces the difficulty of training, and addresses the network degradation and overfitting caused by deepening the network. Faster R-CNN integrates proposal extraction, bounding box regression, and classification into one network. It is a true end-to-end target detection framework that needs only a few milliseconds to generate proposals, which gives it a clear advantage in detection speed. In this paper, the two networks are combined, and the overall performance is significantly improved. Figure 4 shows the overall network structure proposed in this paper [9].

The feature extraction process is introduced first. As shown in Figure 1, in a plain convolutional neural network the image is input and passed through the network, which is equivalent to fitting a composite function, and the result is then output. As the number of layers increases, overfitting and network degradation appear. ResNet-50 marked a turning point for neural networks because it allowed a leap in the number of layers. As shown in Figure 2, ResNet-50 uses residual function learning to counter the degradation that occurs as the network is deepened. Figure 5 shows the ResNet-50 network structure [13].

Here $x_l$ is the input seismic data (the input feature of the $l$-th residual unit), $W_l$ denotes the weights of that unit, and $F(x)$ is the residual function. The formula of the residual structure is shown below:

$$x_{l+1} = x_l + F(x_l, W_l). \tag{5}$$

Through recursion, the feature of a unit $L$ at any depth can be expressed as shown below:

$$x_L = x_l + \sum_{i=l}^{L-1} F(x_i, W_i). \tag{6}$$

For any deep unit $L$, the feature $x_L$ can be expressed as the feature $x_l$ of a shallower unit $l$ plus a residual function of the form $\sum_{i=l}^{L-1} F(x_i, W_i)$, indicating that there is a residual relationship between any two units $L$ and $l$. Similarly, taking $l = 0$ in Formula (6), the feature $x_L$ of an arbitrarily deep unit $L$ is the sum of the outputs of all previous residual functions plus $x_0$.

For backpropagation, assuming that the loss function is $E$, Formula (7) can be obtained according to the chain rule:

$$\frac{\partial E}{\partial x_l} = \frac{\partial E}{\partial x_L} \frac{\partial x_L}{\partial x_l} = \frac{\partial E}{\partial x_L} \left( 1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F(x_i, W_i) \right). \tag{7}$$

The first term, $\partial E / \partial x_L$, guarantees that the gradient can be transmitted directly back to any shallower layer. The formula also guards against vanishing gradients, because the term $\frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F(x_i, W_i)$ cannot always be $-1$. In summary, the residual network not only counters network degradation as the number of layers increases but also makes it practical to increase the number of layers. The learning ability of the network becomes stronger, which is an advantage for extracting fault features [14].

The biggest advantage of Faster R-CNN is that, by adding the RPN, candidate boxes are generated based on the anchor mechanism, and feature extraction, candidate box selection, bounding box regression, and classification are integrated into one network, which effectively improves detection accuracy and efficiency. The RPN (region proposal network) is a fully convolutional network (FCN) that can be trained end-to-end to generate high-quality region proposals, which are then sent to the Faster R-CNN detector. The RPN can predict proposals that vary greatly in scale and aspect ratio. By sharing convolutional features, the RPN and the detector are further merged into one network, and a training scheme is used that alternates between fine-tuning for region proposal and fine-tuning for target detection while keeping the proposals fixed. Figure 6 shows the network structure of the RPN [15]. A window slides over the feature map. At each position it produces a one-dimensional vector, which is fed to two sibling fully connected layers that output the target probability (cls) and the bounding box regression parameters (reg). The target probability consists of a foreground probability and a background probability. Each anchor is centered on the sliding window and is associated with a scale and an aspect ratio. This paper uses 5 scales and 3 aspect ratios, as shown in Figure 6. So that the anchors at each sliding window position cover the target as fully as possible, the center point is determined first, and the boundary is then adjusted using the cls and reg outputs obtained at that position [16, 17].

The anchor sizes are set to $\{32^2, 64^2, 128^2, 256^2, 512^2\}$, and the aspect ratios are set to {2 : 1, 1 : 1, 1 : 2}.
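The anchor configuration can be made concrete with a short sketch. Several details are assumptions rather than statements from this paper: the ratios are interpreted as height to width, anchors are centered on each feature-map cell, the stride of 16 pixels is chosen only because it roughly maps a 1000 × 600 image to a 60 × 40 feature map, and the function names are hypothetical.

```python
import math

SCALES = (32, 64, 128, 256, 512)   # anchor areas are 32^2 ... 512^2 pixels
RATIOS = (0.5, 1.0, 2.0)           # assumed to mean height / width

def anchors_at(cx, cy, scales=SCALES, ratios=RATIOS):
    """Return the (x0, y0, x1, y1) anchors centered at (cx, cy)."""
    boxes = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)   # keeps the area w * h equal to s^2
            h = s * math.sqrt(r)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

def all_anchors(feat_w, feat_h, stride=16):
    """Place the anchors at the center of every feature-map cell (stride is assumed)."""
    boxes = []
    for iy in range(feat_h):
        for ix in range(feat_w):
            boxes.extend(anchors_at((ix + 0.5) * stride, (iy + 0.5) * stride))
    return boxes

def inside_image(box, img_w, img_h):
    """True if the anchor does not cross the image border."""
    x0, y0, x1, y1 = box
    return x0 >= 0 and y0 >= 0 and x1 <= img_w and y1 <= img_h

anchors = all_anchors(60, 40)                                 # 60 * 40 * 15 anchors
kept = [b for b in anchors if inside_image(b, 1000, 600)]     # border-crossing ones dropped
print(len(anchors), len(kept))
```

Five scales and three ratios give 15 anchors per position, so a 60 × 40 feature map yields 36,000 anchors before the border-crossing ones are ignored.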

Each position on the feature map corresponds to $5 \times 3 = 15$ anchors on the original image. For a 1000 × 600 × 3 image, there are about 60 × 40 × 15 anchors, and the anchors that cross the image border are ignored. The candidate boxes generated by the RPN overlap heavily. Nonmaximum suppression is therefore applied on the cls scores of the candidate boxes with an IoU threshold of 0.7, so that only about 2000 candidate boxes are left for each image. Each anchor is adjusted into a candidate box according to the bounding box regression parameters. Training the RPN uses positive and negative samples. Because tens of thousands of anchors are generated during the sliding window process, 256 anchors are randomly sampled from them [18], with a ratio of positive to negative samples of 1 : 1. If there are fewer than 128 positive samples, negative samples are used to fill the batch. An anchor is a positive sample if its IoU with a ground truth box (manually labeled) is greater than 0.7, or if it has the largest IoU with a ground truth box; the second condition covers the case in which no anchor exceeds 0.7. An anchor is a negative sample if its IoU with every ground truth box is less than 0.3. Anchors that are neither positive nor negative are discarded. The RPN loss function is shown below:

$$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*). \tag{8}$$

Here $p_i$ represents the predicted probability that the $i$-th anchor is a target; $p_i^*$ is 1 when the anchor is a positive sample and 0 when it is a negative sample; $t_i$ represents the predicted bounding box regression parameters of the $i$-th anchor; $t_i^*$ represents the regression parameters of the ground truth box corresponding to the $i$-th anchor; $N_{cls}$ is the number of all samples in a mini-batch, which is 256; and $N_{reg}$ is the number of anchor positions (not the number of anchors), which is about 2400 for a 1000 × 600 image. When $\lambda = 10$, $\lambda / N_{reg} \approx 1 / N_{cls}$, so Formula (8) can be simplified as shown in Formula (10):

$$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \frac{1}{N_{cls}} \sum_i p_i^* L_{reg}(t_i, t_i^*). \tag{10}$$

The binary cross-entropy loss function is shown below:

$$L_{cls}(p_i, p_i^*) = -\left[ p_i^* \log p_i + (1 - p_i^*) \log (1 - p_i) \right].$$

Here $p_i$ represents the predicted probability that the $i$-th anchor is a fault; $p_i^*$ is 1 when the anchor is a positive sample (fault) and 0 when it is a negative sample (non-fault). Each region either is or is not a fault area, so there are only two classes and the binary cross-entropy loss is used. In this paper, the ResNet-50 feature extraction network and the Faster R-CNN network are combined to accurately detect the location of the fault. ResNet-50 has the advantage of training deep networks, and Faster R-CNN has the advantage of end-to-end training. The combination of the two has a good positioning efficiency [19, 20].
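The anchor labeling rule, the mini-batch sampling, and the classification loss described in this section can be sketched together in a few lines. This is an illustrative sketch under stated assumptions, not the code used for the experiments: boxes are `(x0, y0, x1, y1)` tuples, the function names are hypothetical, the fixed random seed is only there for repeatability, and the regression term of the loss is omitted.

```python
import math
import random

def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def label_anchors(anchors, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return 1 (fault), 0 (non-fault), or -1 (discarded) for each anchor."""
    ious = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = []
    for row in ious:
        best = max(row)
        labels.append(1 if best > pos_thr else 0 if best < neg_thr else -1)
    # Fallback: the anchor with the largest IoU for each ground truth box is positive.
    for j in range(len(gt_boxes)):
        k = max(range(len(anchors)), key=lambda i: ious[i][j])
        if ious[k][j] > 0:
            labels[k] = 1
    return labels

def sample_minibatch(labels, batch_size=256, seed=0):
    """Sample up to batch_size / 2 positives and fill the rest with negatives."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    pos = rng.sample(pos, min(len(pos), batch_size // 2))
    neg = rng.sample(neg, min(len(neg), batch_size - len(pos)))
    return pos, neg

def bce(p, p_star, eps=1e-12):
    """Binary cross-entropy for one anchor, with p clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p_star * math.log(p) + (1 - p_star) * math.log(1.0 - p))

def cls_loss(probs, labels, chosen):
    """Mean binary cross-entropy over the sampled anchors."""
    return sum(bce(probs[i], labels[i]) for i in chosen) / len(chosen)

gt = [(0, 0, 10, 10)]
anchors = [(0, 0, 10, 6), (20, 20, 30, 30), (0, 0, 10, 5)]
labels = label_anchors(anchors, gt)
pos, neg = sample_minibatch(labels)
loss = cls_loss([0.5, 0.5, 0.9], labels, pos + neg)
print(labels, loss)
```

In this toy case no anchor exceeds an IoU of 0.7, so the first anchor becomes positive only through the largest-IoU rule, which is exactly the situation that rule is meant to cover.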

3. Results and Discussion

After the feature extraction network, the detection model, the loss function, the activation function, and the other key parts have been designed, training and test sample sets must be obtained from seismic data to train the neural network model and detect faults. The training and test sample sets are generated from seismic profile data. Seismic fault images are cropped from the profiles and augmented, and the LabelImg tool is used for data labeling. The fault label category is set to fault, and the labeling information is saved as a text file in XML format, which is the standard file format of PASCAL VOC. Because the original data set consists of collected depth maps, which are not conducive to labeling, the depth maps are first converted, after data augmentation, into corresponding light-colored images that are easy to label. The experimental environment is the Windows 10 64-bit operating system, an Intel(R) Core(TM) i7 processor, the PyTorch deep learning framework, and the Python 3.7 programming language. A dynamic learning rate is used: every 5 training steps, the learning rate is multiplied by 0.33. Since there are only two categories (fault and non-fault), the number of categories is relatively small, and 5000 images are enough for verification. From them, 2500 are randomly selected as the training set, 2500 as the test set, and 1000 as the validation set (training set : validation set = 1 : 1). The training set, the test set, and the validation set must not intersect. After training, the model is tested on seismic sections containing a single fault and multiple intersecting faults.
The selected GPU supports a maximum resolution of 7680 × 4320 and can process multiple images in parallel, which speeds up the training of the network model and thereby improves image processing speed and model training efficiency.
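The dynamic learning rate described above amounts to a step decay schedule. The sketch below shows the arithmetic only; the starting learning rate of 0.01 is a hypothetical value, since the text does not give one, and the function name `step_decay_lr` is invented for illustration.

```python
def step_decay_lr(base_lr, step, every=5, gamma=0.33):
    """Learning rate after `step` training steps, multiplied by gamma every `every` steps."""
    return base_lr * gamma ** (step // every)

base_lr = 0.01  # hypothetical starting value; not stated in the text
schedule = [step_decay_lr(base_lr, s) for s in range(15)]
for s in (0, 4, 5, 10):
    print(s, schedule[s])
```

In PyTorch, a schedule of this form would typically be expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.33)`, assuming the scheduler is stepped once per training step as counted in the text.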

To verify that the feature maps extracted by the ResNet-50 model are detailed and that the model is good at extracting deep features, it is compared with the MobileNet V2 model. The feature maps extracted by the two networks and their detection results are compared to verify the necessity of ResNet-50 for fault detection. To verify the effectiveness of the method in seismic fault detection, an experimental comparison with the manual fault detection method was also carried out.

3.1. MobileNet V2 and ResNet-50

The features extracted by the two networks differ. Figures 7 and 8 show the features extracted by MobileNet V2 and ResNet-50, respectively. Since a fault is very similar to its surrounding area, the feature maps at first sight do not differ much. At deeper levels, however, ResNet-50 produces much more detailed feature maps of the input image than MobileNet V2.

In the above experiments on actual seismic data, ResNet-50 extracts more comprehensive features in the deep layers, so its fault detection results are more complete. The comparison between ResNet-50 and MobileNet V2 shows that the deeper network detects faults well, which verifies the effectiveness of the ResNet-50-based feature extraction network used in this paper.

3.2. Comparison of Fault Detection Results

After the neural network model has been trained, its results are compared with those of the manual method to verify the effectiveness of the proposed ResNet-50 and Faster R-CNN fault detection method. Testing is performed on the generated test set.

In the case of a single fault, the fault locations detected by the proposed method are basically the same as those detected manually. When there are multiple intersecting faults in the seismic profile, the proposed method shows a certain error relative to manual detection. To verify the real-time performance of the proposed method, the time consumed on seismic profiles containing a single fault, multiple non-intersecting faults, and multiple intersecting faults is measured.

The comparison in Table 1 shows that the proposed method is clearly faster than manual labeling. To verify the accuracy of the proposed method, the number of faults it detects is compared with the number identified manually on seismic profiles containing a single fault, multiple non-intersecting faults, and multiple intersecting faults.

The comparison in Table 2 shows that, for single faults, the total number of faults detected by the proposed method is almost the same as that of the manual method, and that the detection accuracy is about 90% for multiple non-intersecting faults and multiple intersecting faults.

To verify the necessity of the backbone feature extraction network, input images of different sizes are fed to different backbone networks on seismic profiles containing a single fault, multiple non-intersecting faults, and multiple intersecting faults, and the experimental results are compared in Table 3.

The two backbone feature extraction networks, ResNet-50 and MobileNet V2, are compared for different input image sizes. With ResNet-50 as the backbone, detection is faster when the input image is 512 × 512.

To verify the influence of anchor boxes of different sizes and aspect ratios on the fault detection results, experiments with different anchor sizes and ratios are compared.

Table 4 compares the ablation experiments for anchor boxes of different sizes and ratios. The anchor boxes with 5 sizes and 3 ratios detect more faults in total than the other configurations, which shows that with 5 sizes and 3 ratios the receptive field of the anchor boxes is larger and the detected targets are more accurate. When the profile contains a single fault, the proposed method finds the same number of faults as manual labeling; when there are multiple non-intersecting and intersecting faults, the proposed method still finds nearly the same number of faults as the manual method, with accuracy rates of 90% and 91%, respectively. After the training set is prepared, the loss values and accuracy rates are logged during network training, and the log is visualized as curves.

The above comparisons with the manual method in terms of time consumption and accuracy show that the results of the proposed method are basically consistent with those of the manual method, with an accuracy rate of about 90%.

4. Conclusions

This paper proposes a fault detection method based on ResNet-50 + Faster R-CNN. The method uses the residual module to simplify the learning objective of the network and reduce the difficulty of training, and it uses Faster R-CNN to integrate proposal extraction, bounding box regression, and classification in one network to achieve end-to-end training. Combining the two networks gives good localization efficiency and accuracy and greatly improves overall performance. The convolutional neural network model for detecting faults is implemented with the ReLU activation function, the binary cross-entropy loss function, and other design choices. The training and test sample sets were generated manually from seismic data, the fault detection experiments were carried out in PyTorch, and faults were detected with an accuracy of about 90%. Because fault structures in actual application scenes are complex and diverse, further research is needed before faults can be detected accurately in such scenes.

Data Availability

All data included in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Acknowledgments

We gratefully acknowledge China University of Petroleum and the Strategic Cooperation Technology Projects of CUPB (ZLZX2020-03) for the financial support.