|
Category | Network architecture | Year introduced | Advantages | Disadvantages | Applications |
|
Image classification | AlexNet | 2012 | Outstanding performance in the ImageNet image classification competition | Large number of parameters; prone to overfitting | [3, 6] |
Image classification | VGGNet | 2014 | Clear layer hierarchy and excellent performance | Large parameter count and large model size | [59] |
Image classification | GoogLeNet | 2014 | Relatively few parameters and high accuracy | Complex network structure | [60] |
Image classification | ResNet | 2015 | Mitigates vanishing and exploding gradients in deep networks | Large parameter count and computational cost | Widely applied |
|
Object detection | R-CNN | 2014 | Uses convolutional features for object detection | High computational cost; slow | - |
Object detection | SPP-Net | 2014 | Handles input images of different sizes; fast recognition | Requires storing many features; long training time | [61] |
Object detection | Fast R-CNN | 2015 | Improves on the speed and performance of R-CNN | Still depends on an external region proposal step | - |
Object detection | Faster R-CNN | 2015 | Further improves speed; well suited to high-precision detection | Two-stage network; cannot detect in real time | [62] |
Object detection | YOLO | 2015 | Maintains accuracy while detecting in real time | Weak at detecting small objects | [7] |
Object detection | SSD | 2016 | Fast and accurate, with excellent overall performance | Many parameters must be set manually | [63] |
|
Pixel-level segmentation | FCN | 2015 | Predicts a class for every pixel instead of one label per image, as in traditional classification networks | Segmentation results are coarse; high storage requirements | [64] |
Pixel-level segmentation | U-Net | 2015 | Partially alleviates the problems of limited memory and limited training data | Struggles with complex scenes and small targets; prone to overfitting | [65] |
Pixel-level segmentation | SegNet | 2015 | Upsampling sharpens edges and reduces storage requirements | Handles spatial relationships poorly | [66] |
Pixel-level segmentation | Mask R-CNN | 2017 | Adds pixel-level localization in the second stage to achieve instance segmentation | Detection and segmentation are slow | [23] |
Pixel-level segmentation | DeepLabv3+ | 2018 | Uses dilated (atrous) convolutions and an ASPP module for high accuracy | Slow training and inference | [67] |
Pixel-level segmentation | SOLO | 2020 | Efficiently recognizes multiscale objects and performs pixel-level segmentation | Weak on complex backgrounds; imprecise on small objects | [68] |
|