Abstract

Underground coal transportation relies mainly on mine conveyor belts, and large pieces of coal on the belt threaten transportation safety. To address the problem of real-time monitoring of lump coal, a method named Ghost-ECA-Bi FPN (GEB) YOLOv5 is proposed for lump coal during mining conveyor belt transportation, based on a lightweight neural network and multisource information fusion. First, image preprocessing is performed by adaptive histogram equalization, which reduces the influence of coal dust, dust, and uneven lighting on target monitoring. Second, the redundancy of the convolution process is exploited, and the lightweight neural network GhostNet is introduced to optimize the feature extraction process. In addition, with the efficient channel attention mechanism, a 1D convolution enables local cross-channel information interaction, which addresses the imbalance between model complexity and performance. Finally, the feature information of the three stages is fused using a weighted bidirectional feature pyramid network to enhance the generalization ability of the model. The experimental results show that the improved GEB YOLOv5 algorithm has clear advantages. In terms of model structure, the number of network layers is reduced by 36.97%, and the number of parameters and floating-point operations are reduced by 64.53% and 69.14%, respectively. Moreover, the model volume is reduced from 92.7 M to 33.0 M. Regarding monitoring performance, the precision and recall improve by 1.19% and 1.11%, respectively. Furthermore, the real-time performance improves from 68.34 FPS to 110.70 FPS. These results show that the trade-off between model performance and model complexity is effectively addressed and that real-time monitoring of lump coal is realized.

1. Introduction

A stable coal supply depends on safe and efficient production and transportation, which also affects the country’s energy security [1]. During underground coal transportation, small coal pieces are mixed with much larger ones. The large pieces can compromise the safe operation of mining conveyor belts, causing scratches and punctures on the belt surface and possibly even breaking the steel core inside the belt [2]. Therefore, the real-time monitoring of lump coal during coal extraction and transportation is critical [3, 4].

In reference [5], a computer vision-based large foreign object monitoring method, CBAM-YOLOv5, was designed by fusing deep convolution with the convolutional block attention module (CBAM) mechanism to enhance detection precision by optimizing the detection network. However, its real-time performance was unsatisfactory, at only 31 FPS. In reference [6], a convolutional neural network with an attention mechanism module was designed, but its real-time monitoring performance was only 15 FPS, far below the real-time monitoring requirement. Similarly, in reference [7], the YOLOv7 algorithm was improved by using adaptive histogram equalization (AHE) for data preprocessing and by introducing the Sim attention mechanism and depthwise convolution to improve detection precision and efficiency. Nevertheless, the real-time monitoring performance was only 25.64 FPS. Real-time performance is an important metric for securing the conveyor belt and should be emphasized when improving the algorithm.

Because of long, high-intensity continuous operation, the mining conveyor belt is affected by gangue and large pieces of coal, which easily cause conveyor-belt fatigue and affect coal mine safety production [8, 9]. Therefore, the authors in [10, 11] introduced EfficientNet and MobileNet V2 into YOLOv3 to achieve lightweight feature extraction and improve monitoring precision. In addition, in reference [12], k-means clustering was combined with the YOLOv3 algorithm to analyze the lump coal size accurately and achieve precise localization. However, the real-time performance of the abovementioned models is ignored [13]. The improved GEB YOLOv5 algorithm in this paper takes both the precision and the real-time performance of the model into account and achieves better overall model performance.

In addition, in references [14, 15], computer vision techniques were utilized to preprocess images, and large coal pieces in motion were identified, achieving a detection rate of more than 80%. Subsequently, in reference [16], the authors enhanced the precision of lump coal monitoring by fusing multiple sources of information based on the Faster R-CNN algorithm. However, the two-stage detection model has a complex structure that is difficult to make lightweight, so this paper introduces Ghost-Conv into a single-stage target detection algorithm to reduce the model complexity.

In summary, the existing studies indicate that the real-time monitoring of lump coal on mining conveyor belts suffers from complex model structures, an imbalance between precision and efficiency, and unsatisfactory real-time monitoring performance. To solve these problems, this paper proposes a real-time monitoring method, GEB YOLOv5, for lump coal on mining conveyor belts, with three main contributions:

(1) Image preprocessing is performed by AHE, which reduces the influence of coal dust, dust, and uneven lighting on target monitoring.

(2) Lightweight neural network modules are introduced, which reduce the redundancy of feature extraction. In addition, the Bi FPN is incorporated to balance model precision and real-time performance.

(3) The efficient channel attention (ECA) mechanism is added to the feature extraction network, which improves the model detection performance and addresses the imbalance between model complexity and performance.

The remainder of this paper is organized as follows. Section 2 describes the modules to which the algorithm improvement is applied. Section 3 introduces the ablation experiments, the improvement steps, and the overall structure of the algorithm after the improvement. Finally, Section 4 presents the experimental analysis, while Section 5 provides a summary of the paper’s findings.

2. GEB YOLOv5 Basic Theory

2.1. Adaptive Histogram Equalization

In the real-time monitoring of lump coal on the mining conveyor belt, the conveyor belt runs at a speed of 3∼5 m/s. However, the presence of uneven lighting, dust, and coal dust in underground coal mines is not conducive to the real-time monitoring of lump coal. Therefore, introducing AHE enhances the contrast of lump coal in mining conveyor belts, reinforces its contour information, and improves the brighter or darker areas, which is beneficial for completing the monitoring of lump coal.

AHE can reduce the influence of coal dust and dust in underground coal mines on the real-time monitoring of lump coal in mining conveyor belts. It addresses the problem of the blurred image caused by the high-speed operation of mining conveyor belts and uneven illumination, effectively improves the contrast of images, and makes the outline of lump coal on mining conveyor belts clearer and more conducive to the training of models.
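The tile-wise idea behind AHE can be illustrated with a minimal sketch. This is not the preprocessing code used in this paper: it is a simplified, dependency-free version that equalizes each tile independently, without the interpolation between neighboring tiles (or the contrast clipping of CLAHE) that practical implementations usually add. The function names, the tile size, and the tiny grayscale image are assumptions made for illustration.

```python
def equalize_tile(tile, levels=256):
    """Histogram-equalize one tile given as rows of ints in [0, levels - 1]."""
    hist = [0] * levels
    for row in tile:
        for value in row:
            hist[value] += 1
    # Cumulative distribution of grey levels inside this tile only.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    total = running
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # A perfectly flat tile has no contrast to stretch.
        return [row[:] for row in tile]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in tile]


def adaptive_equalize(image, tile_size):
    """Apply equalization tile by tile (simplified AHE, no interpolation)."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tile = [r[left:left + tile_size] for r in image[top:top + tile_size]]
            for dy, eq_row in enumerate(equalize_tile(tile)):
                out[top + dy][left:left + len(eq_row)] = eq_row
    return out


# Hypothetical image: a dark region (left) next to a bright region (right).
image = [[10, 12, 200, 202],
         [11, 13, 201, 203]]
enhanced = adaptive_equalize(image, tile_size=2)
```

Because each tile is stretched using its own histogram, both the dark and the bright regions end up using the full grey-level range, which is the local contrast enhancement the text attributes to AHE.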

2.2. Ghost Lightweight Network

In the convolutional neural network feature-extraction process, as shown in Figure 1, Ghost-Conv [17] takes full advantage of the redundancy of the feature maps generated during feature extraction. In contrast to the traditional convolutional neural network, Ghost-Conv simplifies the feature-extraction process by combining the feature maps generated by traditional convolution with the Ghost feature maps produced by a linear transformation. Consequently, the computational effort is significantly reduced while still allowing the full utilization of the feature maps.

As can be seen in Figure 1, the input image first undergoes conventional convolution, which generates the conventional feature maps. Subsequently, a low-cost linear transformation is applied to each conventional feature map to produce Ghost feature maps. Finally, the conventional and Ghost feature maps are concatenated to form the full set of feature maps, which constitutes the final feature output of the image.

The computation of Ghost convolution is only a fraction of that of traditional convolution, and the low-cost linear transformation effectively improves convolution efficiency, compresses the model volume, and simplifies the feature extraction process.
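To give a rough sense of the saving, the sketch below counts multiply-accumulate operations for a standard convolution and for a Ghost module, following the cost analysis in the GhostNet paper [17]. The symbols s (ratio of total to conventional feature maps) and d (kernel size of the cheap linear operation), the function names, and the layer sizes are assumptions introduced here for illustration; they are not values reported in this paper.

```python
def conv_macs(c_in, n_out, k, h_out, w_out):
    """Multiply-accumulates of a standard k x k convolution."""
    return n_out * h_out * w_out * c_in * k * k


def ghost_macs(c_in, n_out, k, h_out, w_out, s=2, d=3):
    """Multiply-accumulates of a Ghost module producing n_out maps.

    n_out / s maps come from a conventional convolution; the remaining
    (s - 1) * n_out / s maps come from cheap d x d per-map operations.
    """
    if n_out % s != 0:
        raise ValueError("n_out must be divisible by s")
    m = n_out // s
    primary = conv_macs(c_in, m, k, h_out, w_out)
    cheap = (s - 1) * m * h_out * w_out * d * d
    return primary + cheap


# Hypothetical layer: 64 input channels, 128 output maps, 3 x 3 kernel, 40 x 40 output.
standard = conv_macs(64, 128, 3, 40, 40)
ghost = ghost_macs(64, 128, 3, 40, 40, s=2, d=3)
speedup = standard / ghost  # close to s when c_in is much larger than s
```

When d equals k, the ratio simplifies to s * c_in / (s + c_in - 1), so with many input channels the Ghost module needs roughly 1/s of the computation of the standard convolution.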

2.3. ECA Mechanisms

The ECA mechanism [18] resolves the conflict between model complexity and performance by avoiding dimensionality reduction operations. Unlike the squeeze and excitation (SE) [19] attention mechanism, the ECA mechanism maintains the model performance and reduces complexity through appropriate cross-channel learning.

As depicted in Figure 2, the ECA mechanism first performs global average pooling on the input feature maps to obtain the aggregated per-channel features. Then, the size of the convolution kernel is adaptively determined from the channel dimension, and the channel weights are obtained through a single 1D convolution. Finally, these weights complete the information interaction between each channel and its adjacent channels.
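The three steps can be sketched in plain Python as follows. The kernel-size mapping uses the rule from the ECA-Net paper [18], k = |log2(C)/γ + b/γ| rounded to the nearest odd number, with the defaults γ = 2 and b = 1; that this paper uses the same defaults is an assumption, as are the function names and the example kernel values (in a trained network the 1D kernel is learned).

```python
import math


def global_average_pool(channel_map):
    """Average all spatial positions of one channel (rows of numbers)."""
    values = [v for row in channel_map for v in row]
    return sum(values) / len(values)


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size from the channel dimension (always odd)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_weights(pooled, kernel):
    """1D convolution across neighboring channels, then a sigmoid."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(pooled) + [0.0] * pad
    weights = []
    for i in range(len(pooled)):
        z = sum(kernel[j] * padded[i + j] for j in range(k))
        weights.append(1.0 / (1.0 + math.exp(-z)))
    return weights


# Hypothetical example: 4 channels, each described by its pooled value.
pooled = [0.2, 0.8, 0.5, 0.1]
weights = eca_weights(pooled, kernel=[0.3, 0.4, 0.3])
# Each channel is then rescaled by its weight in (0, 1).
rescaled = [p * w for p, w in zip(pooled, weights)]
```

Because the kernel only spans a few neighboring channels, the number of extra parameters equals the kernel size, which is why the mechanism adds very little complexity.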

To sum up, the ECA mechanism resolves the conflict between model complexity and precision in the real-time monitoring of lump coal on the mining conveyor belt. The cross-channel information interaction strategy of the ECA mechanism also effectively overcomes the poor tracking ability and low efficiency of lump coal monitoring on the mining conveyor belt.

2.4. Bi FPN Information Fusion Mechanism

The weighted bidirectional feature pyramid network (Bi FPN) [20] reduces the number of nodes. It also simplifies the bidirectional feature network by eliminating the intermediate edge nodes with poor information fusion capabilities based on the contribution values. In addition, the corresponding information fusion paths are added in the same hierarchical network of input and output nodes, thereby enabling the feature fusion network to aggregate more feature information.

The Bi FPN has fewer nodes and more information fusion paths, which enables more advanced feature fusion. As shown in Figure 3, the Bi FPN has node weight adjustment capability to give different weights to different nodes. Therefore, this feature can improve the model’s comprehensive monitoring performance in the real-time monitoring of lump coal on a mining conveyor belt.
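The node weighting can be illustrated with the fast normalized fusion rule described in the Bi FPN paper [20], where each input feature is scaled by a non-negative learnable weight divided by the sum of all weights. The sketch below is a minimal version operating on flat lists; the function name, the example features, and the weights are assumptions for illustration, and whether this paper uses exactly this fusion variant is not stated in the text.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-sized feature vectors with normalized non-negative weights."""
    # ReLU keeps every weight non-negative so the normalization is stable.
    w = [max(0.0, x) for x in weights]
    norm = eps + sum(w)
    length = len(features[0])
    return [
        sum(w[i] * features[i][p] for i in range(len(features))) / norm
        for p in range(length)
    ]


# Hypothetical example: two feature vectors of the same size.
shallow = [1.0, 2.0]
deep = [3.0, 4.0]
equal = fast_normalized_fusion([shallow, deep], [1.0, 1.0])
# A negative weight is clipped to zero, so only the first input contributes.
clipped = fast_normalized_fusion([shallow, deep], [1.0, -5.0])
```

With equal weights the result is close to the plain average of the inputs; during training the weights shift so that more informative nodes contribute more to the fused feature.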

3. Improved GEB YOLOv5 Algorithm

In deep learning model training, ablation experiments can effectively evaluate the performance of each module and determine whether it is beneficial to improve the model performance. This paper proposes the real-time monitoring method for lump coal on mining conveyor belts, GEB YOLOv5, which enhances and optimizes the original YOLOv5 algorithm in four aspects as follows: image dataset preprocessing, model lightweight design, the addition of an attention mechanism, and combining multisource information fusion. Consequently, a total of 19 sets of ablation experiments are established to evaluate the efficiency of each module.

3.1. Image Datasets Preprocessing

As shown in Figure 4, the video is acquired from a fixed camera position directly above the conveyor belt in the coal mine, using a KJ707 explosion-proof camera. A total of 1200 images are obtained after single-frame extraction and filtering of the high-quality video data. Among them, 800 images are designated as the training set, while the remaining 400 images are split between the validation and test sets. The dataset labels are created with the LabelImg toolkit. In addition, a 10-second test video is randomly intercepted to verify the actual detection effect of the model.

Meanwhile, as can be seen in Figure 4, the AHE image processing technique is applied to enhance the image contrast, improve the bright or dark parts of the image, enhance the image details, and obtain more image information.

Table 1 shows that, according to coal size classification, lump coal can be divided into 6 major categories. The real-time monitoring of lump coal on the mining conveyor belt mainly targets extra-large lump coal and lump coal accumulation.

3.2. Lightweight Algorithm Structure

The working space in an underground coal mine is narrow, which limits the feasibility of using large inspection equipment. In addition, computer hardware and communication equipment are susceptible to damage from dust and coal dust. Therefore, deep learning algorithms with a simple model structure, low computation, and low hardware resource consumption that meet the development requirements of intelligent embedded devices are preferred [21].

To reduce the number of model parameters and floating-point operations, the improved algorithm decreases the number of residual modules in the feature extraction network and the fusion network. As shown in Figure 5, in the feature extraction network of the original algorithm, the four residual modules are used 3, 6, 9, and 3 times, respectively. In the feature fusion network, the four residual modules are each used 3 times. In the improved algorithm, however, each residual module is used only once.

As the number of used residual modules and convolution operations decreases, the algorithm’s detection precision, recall, and real-time detection efficiency are reduced to varying degrees.

To meet the application requirements of embedded mobile devices, the improved algorithm uses lightweight Ghost-Conv instead of traditional convolution. Ghost-Conv takes advantage of the redundancy between feature maps to simplify the feature extraction process, so that adjacent feature maps contain only a limited amount of similar redundant information.

In addition, the lightweight Ghost bottleneck residual module replaces the original C3 module, which reduces the number of convolutions, while the added batch normalization (BN) layer and activation function make the model structure more lightweight and allow feature extraction to cover a wider range.

3.3. Addition of the Attention Mechanism

To address the reduction in model detection performance caused by the lightweight design, this paper compares the effects of three attention mechanisms, namely CBAM, CA, and ECA, on model performance at different locations of the feature extraction network.

According to Table 2, the precision (P) of the original YOLOv5 algorithm is 59.533% and the recall (R) is 54.176%. As shown in Figure 5, the P and R of the model are gradually improved by adding the CBAM, CA, and ECA mechanisms in turn at the positions of Attention Mechanism_1 (AM_1) and Attention Mechanism_2 (AM_2) of the feature extraction network, respectively.

As seen in Table 2, at position AM_1, introducing the ECA mechanism gives the best results, with precision and recall of 59.867% and 56.836%, respectively. Adding the ECA mechanism at position AM_2 achieves 60.547% precision and 54.529% recall, both surpassing the original algorithm.

3.4. Combining Multisource Information Fusion

In the feature fusion network, the improved GEB YOLOv5 algorithm is combined with Bi FPN. Bi FPN_2 is applied instead of concatenate in the shallow network, while Bi FPN_3 replaces concatenate in the deep network.

To verify the compatibility of different attention mechanisms with Bi FPN, this paper conducted three sets of ablation experiments using CBAM, CA, and ECA mechanisms. In the first and second sets of experiments, Bi FPN_2 and Bi FPN_3 are introduced separately, while in the third set of experiments, both Bi FPN_2 and Bi FPN_3 are introduced simultaneously.

According to the results of the ablation experiments in Table 3, in group 1 the fit of Bi FPN_2 with CBAM is higher than in group 2 and group 3, and the model training P and R are 60.091% and 56.053%, respectively. In group 3, the training effect after adding the CA and ECA mechanisms is better than with the CBAM mechanism, and the ECA mechanism performs best.

Therefore, Table 3 shows that, across the different attention mechanisms, the best training effect is achieved by combining both Bi FPN_2 and Bi FPN_3.

3.5. Improved GEB YOLOv5 Structure

The GEB YOLOv5 method for real-time monitoring of lump coal on mining conveyor belts improves and optimizes the YOLOv5 algorithm. First, AHE is used to improve the contrast of the dataset images, which reduces the effects of coal dust, dust, and uneven illumination in underground coal mines and improves the dataset’s quality. Then, the lightweight Ghost-Conv and the lightweight residual module Ghost bottleneck are introduced to simplify the model structure and reduce the redundancy of feature maps. Finally, the ECA mechanism is incorporated to balance model complexity and performance, and the Bi FPN information fusion mechanism reduces the number of network nodes and fuses the feature information of different stages efficiently.

As seen in Figure 6, the feature extraction network contains four groups of identical convolutional modules with attention mechanisms. The input images generate feature maps of four different sizes: 160 × 160, 80 × 80, 40 × 40, and 20 × 20. In the feature fusion network, feature maps of different sizes pass through Ghost-Conv, upsampling, and Bi FPN to obtain both the high-level semantic information and the low-level detailed information of the images. Fusing the information of the different stages enables the real-time tracking and monitoring of lump coal. Consistent with Figure 6, the GEB YOLOv5 training and validation process is shown in Algorithm 1.

Algorithm 1: GEB YOLOv5 algorithm.
Parameters: epochs, batch size, learning rate, weight-decay coefficient.
Data preprocessing: single-frame extraction from video, AHE.
Input: training and validation datasets, label set.
Loading: training model, validation model.
Ensure: algorithm environment, input, backbone, neck, output.
Training: for the i-th training iteration:
  Train net:
   a: Feature extraction:
    four groups: rectangular Conv, Ghost Conv, Ghost Bottleneck, ECA.
   b: Feature fusion:
    Ghost Conv, upsampling, Bi FPN_2 and Bi FPN_3.
   c: Compute the positioning error, confidence error, category error, and total loss.
  Valid net:
   a: Test the effect of the model.
   b: Calculate the evaluation metrics.
   c: Adjust the learning rate and update the training strategy.
  Save the results of the i-th training iteration: weights and model.
  Update: weights and model.
  Temporarily store the model.
Prediction: identification, location, classification.
Plot: result curves; save the best model; output.
End training

During the real-time monitoring of lump coal on the mining conveyor belt, the proposed algorithm, GEB YOLOv5, can exclude interference in the underground environment and achieve accurate and efficient real-time monitoring of lump coal, improving the safety of mining conveyor belt transportation [22].

4. GEB YOLOv5 Experimental Analysis

4.1. Experimental Environment Platform

The experiments presented in this paper are carried out in a Python 3.8.5 environment with CUDA 11.3, on an Intel Core i9-10900K @ 3.7 GHz CPU, an NVIDIA GeForce RTX 3080 10 G GPU, and dual-channel DDR4 3600 MHz memory. The image input size is set to 2560 × 1440, the learning rate is 0.01, the cosine annealing hyperparameter is 0.1, the weight-decay coefficient is 0.0005, and the momentum parameter in gradient descent with momentum is 0.937. A total of 500 epochs and a batch size of 10 are used during training.
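To show how these settings interact, the sketch below evaluates a cosine annealing schedule of the form used in the public YOLOv5 training code, where the learning rate decays from its initial value to a fraction of that value over the run. Interpreting the "cosine annealing hyperparameter" of 0.1 as this final fraction is an assumption, as are the function and parameter names.

```python
import math


def cosine_lr(epoch, total_epochs=500, lr0=0.01, lrf=0.1):
    """Learning rate at a given epoch under cosine annealing.

    lr0 is the initial learning rate; lrf is the assumed final fraction,
    so the schedule ends at lr0 * lrf.
    """
    factor = ((1 - math.cos(epoch * math.pi / total_epochs)) / 2) * (lrf - 1) + 1
    return lr0 * factor


start = cosine_lr(0)       # initial learning rate
middle = cosine_lr(250)    # halfway through training
end = cosine_lr(500)       # final learning rate
```

Under these assumptions the learning rate starts at 0.01, passes through about 0.0055 at epoch 250, and ends at about 0.001 after 500 epochs.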

4.2. Model Lightweight Analysis

The improved GEB YOLOv5 algorithm is considerably more lightweight. During model training, the number of parameters and floating-point operations are reduced by about 64.53% and 69.14%, respectively. In addition, the training time is reduced from 3.49 h to 2.73 h, a reduction of about 21.78%.

In the context of the real-time monitoring of lump coal on the mining conveyor belt, the GEB YOLOv5 model features a simple structure. As shown in Figure 7, compared with YOLOv5, the training time is decreased from 209.52 min to 163.86 min, and the number of network layers is reduced from 468 to 295. Moreover, the number of floating-point operations, the model volume, and the number of parameters are reduced by approximately 69.14%, 64.40%, and 64.53%, respectively.

Moreover, the hardware requirements and consumption are lower: the GPU utilization is reduced from 5.81 G to 3.12 G, a reduction of about 46.30%. These findings suggest that the redundancy of the improved GEB YOLOv5 model is significantly decreased, making it more suitable for applications in harsh environments, such as underground coal mines.
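The percentages quoted in this subsection follow from the before and after values as relative reductions. The short check below recomputes them from the figures given in the text; the function name is introduced here for illustration.

```python
def relative_reduction(before, after):
    """Reduction relative to the original value, in percent."""
    return (before - after) / before * 100.0


layers = relative_reduction(468, 295)            # network layers
train_time = relative_reduction(3.49, 2.73)      # training time in hours
model_volume = relative_reduction(92.7, 33.0)    # model volume in M
gpu = relative_reduction(5.81, 3.12)             # GPU utilization in G

for name, value in [("layers", layers), ("training time", train_time),
                    ("model volume", model_volume), ("GPU", gpu)]:
    print(f"{name}: {value:.2f}%")
```

The recomputed values agree with the reported 36.97%, 21.78%, 64.40%, and 46.30% to within rounding.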

4.3. Harmonic Mean Performance Analysis

The harmonic mean (F1 value) is a metric that combines the model’s precision and recall at different confidence thresholds, reflecting the model’s overall performance. The F1 value varies with the confidence level and indicates the balance between the precision and recall of the model. The higher the F1 value, the better the model balances precision and recall, and vice versa.
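As a concrete illustration, the F1 value is the harmonic mean of precision and recall, F1 = 2PR / (P + R), and the best operating point is the confidence threshold at which it peaks. The sketch below uses a made-up precision-recall curve; the numbers and function names are hypothetical and are not taken from Figure 8.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_threshold(curve):
    """Return (best F1, confidence) from (confidence, precision, recall) rows."""
    return max((f1_score(p, r), conf) for conf, p, r in curve)


# Hypothetical curve: raising the threshold trades recall for precision.
curve = [
    (0.1, 0.40, 0.80),
    (0.4, 0.60, 0.60),
    (0.8, 0.90, 0.20),
]
best_f1, best_conf = best_threshold(curve)
```

Because the harmonic mean is dominated by the smaller of the two values, a threshold that gives very high precision but low recall (or the reverse) scores worse than a balanced one.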

As shown in Figure 8, YOLOv5 yields an F1 value of 0.59 at a confidence threshold of 0.283. In contrast, GEB YOLOv5 achieves an F1 value of 0.602 at a confidence threshold of 0.388, representing an improvement of 0.012 over the original algorithm. These findings indicate that during the real-time monitoring process of lump coal on the mining conveyor belt, the GEB YOLOv5 algorithm has the best monitoring performance when the confidence threshold is set at 0.388.

4.4. Real-Time Performance Analysis

The underground mining conveyor belt operates continuously, with high intensity and over long periods of time. Thus, the safety monitoring of the mining conveyor belt must be continuous and free of delay. Hence, the minimum requirement for practical applications is that the proposed monitoring method for lump coal on mining conveyor belts has good real-time monitoring performance [23].

Real-time performance represents one of the important performance metrics in the field of deep learning target detection, which is determined by the number of monitored image or video frames processed per second. The higher the number of image frames processed per second, the shorter the time taken to detect each frame, resulting in a better real-time performance of the model.

During the real-time monitoring of lump coal on the mining conveyor belt, GEB YOLOv5 and YOLOv5 are each tested on 30 images and a 10-second video. Figure 9 compares their real-time performance.

The results presented in Figure 9 show that YOLOv5 takes about 0.014 s to monitor each frame, while GEB YOLOv5 takes about 0.009 s, a reduction of about 35.71% in per-frame monitoring time. For the 30 detected images, the real-time monitoring performance of the improved GEB YOLOv5 algorithm is about 110.70 FPS, compared with 68.34 FPS for the original algorithm, which corresponds to about 38.27% less time per frame. In addition, for the first 300 frames of a randomly intercepted 10-second monitoring video, the monitoring efficiency is about 112.69 FPS, compared with 69.25 FPS for the original algorithm, which corresponds to about 38.55% less time per frame.
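The relationship between these figures can be checked with a few lines of arithmetic. The 38.27% and 38.55% values match the reduction in per-frame time implied by the FPS measurements, whereas the increase in frames per second relative to the original algorithm is larger. The function names below are introduced for illustration.

```python
def latency_reduction(fps_before, fps_after):
    """Percent reduction in time per frame implied by two FPS values."""
    return (1 - fps_before / fps_after) * 100.0


def throughput_gain(fps_before, fps_after):
    """Percent increase in frames per second relative to the original."""
    return (fps_after / fps_before - 1) * 100.0


per_frame = (0.014 - 0.009) / 0.014 * 100.0       # from the per-frame times
images_latency = latency_reduction(68.34, 110.70)  # 30 test images
video_latency = latency_reduction(69.25, 112.69)   # first 300 video frames
images_gain = throughput_gain(68.34, 110.70)       # same data, as FPS increase

print(f"per-frame time reduction: {per_frame:.2f}%")
print(f"images: {images_latency:.2f}% less time, {images_gain:.2f}% more FPS")
print(f"video: {video_latency:.2f}% less time per frame")
```

Expressed as throughput, the move from 68.34 FPS to 110.70 FPS is an increase of roughly 62%, so the two ways of stating the improvement should not be mixed.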

The data analysis shows that the GEB YOLOv5 algorithm is more efficient and has a more stable monitoring performance when monitoring images or videos.

The proposed algorithm is applied to the real-time monitoring of lump coal on the mining conveyor belt, and a comparison of the monitoring results of different algorithms is shown in Figure 10. The comparison shows that GEB YOLOv5 has wider monitoring coverage, a lower false detection rate, and higher detection confidence. In Figures 10(a) and 10(b), the number of lumps of coal to be detected is seven. The YOLOv5 algorithm detects four large lumps of coal, while GEB YOLOv5 detects six, indicating that the improved GEB YOLOv5 has a higher recall rate. In Figure 10(c), there is a high density of large lumps of coal in the transportation process, and they obscure each other. However, GEB YOLOv5 has more information fusion channels, enabling it to identify and localize lump coal more precisely based on multisource information fusion. In Figure 10(d), YOLOv5 mistakenly detects the opposite wall as a large piece of coal, while in the same case, GEB YOLOv5 completes the identification, detection, and tracking with more precision.

In addition, to demonstrate the validity of the proposed method, the models proposed in references [5, 10] are reproduced on the same lump coal dataset, and comparative studies are conducted. The comparison of the experimental results is shown in Table 4.

Both the CBAM-YOLOv5 model in reference [5] and the YOLO-CBAM model in reference [10] add the CBAM mechanism to the YOLOv5 algorithm. The experimental results show that, compared with the CBAM-YOLOv5 [5] model, the number of parameters and floating-point operations (FLOPs) of the proposed GEB YOLOv5 are lower by 32.57% and 35.73%, respectively. The mAP@0.5 and mAP@0.5 : 0.95 are higher by 2.6% and 3.4%, respectively, and the real-time performance is higher by 47.52 FPS.

Similarly, compared with the YOLO-CBAM [10] model, the precision of the proposed GEB YOLOv5 is 1.78% higher, and the training time is reduced by 0.21 h. In addition, the GPU utilization is lower by 0.33 G, and the real-time performance is higher by 53.29 FPS.

In summary, the GEB YOLOv5 algorithm applies AHE to effectively solve the negative impact on the detection caused by uneven illumination and high-speed operation of the mining conveyor belt while overcoming the harsh conditions of coal dust and dust diffusion in underground coal mines. Using a lightweight neural network and multisource information fusion, the GEB YOLOv5 algorithm significantly improves the monitoring precision and real-time performance. Furthermore, it solves the contradiction between model complexity and precision by combining the ECA mechanism. The final experimental results prove the effectiveness of the proposed algorithm [24].

5. Conclusion

This paper proposes GEB YOLOv5, an approach based on a lightweight neural network and multisource information fusion for the real-time monitoring of lump coal on mining conveyor belts. The objective of this research is to optimize monitoring results and improve monitoring precision and real-time monitoring performance. The final experimental results show that the proposed algorithm achieves these goals.

In future work, deep learning neural networks and reinforcement learning selection strategies will be combined to give neural networks self-correction and error-correction capabilities, which can further improve the model performance.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the Shanxi Provincial Science and Technology Department surface project (no. 202303021211330) and the Innovation Platform Project of Science and Technology Innovation Program of Higher Education Institutions in Shanxi Province (2022P009). We thank Research Square for publishing the initial manuscript of the paper as a preprint.