Journal of Sensors

Research Article

Research on Multitarget Recognition and Detection Based on Computer Vision

Table 6

Representative models of the one-stage and two-stage algorithms.


One-stage algorithm	Performance summary	Two-stage algorithm	Performance summary

YOLO series	YOLOv1 is very fast and supports real-time detection. However, it recognizes small targets poorly and requires input images of a fixed size.	R-CNN	Proposed by Ross Girshick in 2014. A selective search algorithm replaces the sliding window, which reduces window redundancy and lowers the time complexity of the algorithm. A convolutional neural network replaces the traditional hand-crafted feature extraction stage, which extracts image features more effectively and improves robustness to external interference.
SSD series	YOLOv2 addresses the difficulty of convergence and fine-tunes the network on high-resolution images; it uses anchor boxes and convolutional layers for prediction.	SPPNet	Proposed by Kaiming He et al. in 2015. The feature map is obtained by running the convolutional layers only once over the whole image, which greatly reduces the time spent on feature extraction, reduces the loss of image information, and avoids repeated computation of convolutional features. The speedup is about 24 to 64 times.
M2Det	YOLOv3 uses Darknet-53 as the network backbone and adopts an FPN architecture.	Mask R-CNN	Proposed by He et al. in 2017, Mask R-CNN combines Faster R-CNN and FCN. The multiscale feature extraction ability of the model is strengthened, and small targets are recognized more accurately. The detection speed is about 5 frames per second.
CentripetalNet	YOLOv4 uses CSPDarknet53 and many widely applicable techniques to achieve the best experimental results.	D2Det	Proposed by Cao et al. in 2020. It addresses accurate localization and accurate classification at the same time. Dense local regression and discriminative RoI pooling (DRP) are introduced to extract accurate target feature regions from the first stage and the second stage, respectively, thus improving performance.
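
The SPPNet entry notes that the convolutional layers run only once over the whole image and that each candidate region then reuses the shared feature map. The following is a minimal pure-Python sketch of that idea, not the SPPNet implementation: it uses a single pooling level rather than a full spatial pyramid, the feature map is a toy grid of numbers, and the function name `max_pool_region` and the region coordinates are hypothetical choices made for illustration.

```python
def max_pool_region(feature_map, region, bins):
    """Max-pool a rectangular region of a 2-D feature map into a bins x bins grid.

    region is (top, left, bottom, right) in feature-map coordinates,
    with bottom and right exclusive. The output size depends only on
    `bins`, not on the size of the region.
    """
    top, left, bottom, right = region
    height, width = bottom - top, right - left
    pooled = []
    for i in range(bins):
        # Floor for the bin start, ceiling for the bin end, so that the
        # bins together cover the whole region and none is empty.
        r0 = top + (i * height) // bins
        r1 = top + -((-(i + 1) * height) // bins)
        row = []
        for j in range(bins):
            c0 = left + (j * width) // bins
            c1 = left + -((-(j + 1) * width) // bins)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy stand-in for a feature map computed once for the whole image
# (assumption: in a real detector this would come from the convolutional layers).
feature_map = [[r * 10 + c for c in range(6)] for r in range(4)]

# Two candidate regions of different sizes reuse the same feature map.
pooled_a = max_pool_region(feature_map, (0, 0, 4, 6), bins=2)
pooled_b = max_pool_region(feature_map, (1, 2, 4, 5), bins=2)

print(pooled_a)  # [[12, 15], [32, 35]]
print(pooled_b)  # [[23, 24], [33, 34]]
```

Both regions yield a 2 x 2 output even though they differ in size, and the feature map is never recomputed per region, which is the source of the saving described in the table.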