Research Article
A Two-Stream Deep Fusion Framework for High-Resolution Aerial Scene Classification
Table 5
Classification performance of the proposed method on the AID dataset using different feature extractors and fusion strategies.
| Architecture | Feature size | Training ratio 20% | Training ratio 50% |
| --- | --- | --- | --- |
| Without fusion (CaffeNet(RGB)) | 4096 | 87.57 ± 0.32 | 90.22 ± 0.42 |
| Without fusion (CaffeNet(saliency)) | 4096 | 84.45 ± 0.28 | 87.21 ± 0.48 |
| Without fusion (VGG-Net-16(RGB)) | 4096 | 87.24 ± 0.18 | 90.60 ± 0.31 |
| Without fusion (VGG-Net-16(saliency)) | 4096 | 84.25 ± 0.11 | 87.62 ± 0.56 |
| Without fusion (GoogLeNet(RGB)) | 1024 | 84.18 ± 0.53 | 87.15 ± 0.69 |
| Without fusion (GoogLeNet(saliency)) | 1024 | 81.12 ± 0.55 | 84.28 ± 0.67 |
| Fusion strategy 1 (CaffeNet) | 8192 | 92.26 ± 0.52 | 94.36 ± 0.29 |
| Fusion strategy 2 (CaffeNet) | 4096 | 92.32 ± 0.41 | 94.42 ± 0.33 |
| Fusion strategy 1 (VGG-Net-16) | 8192 | 92.04 ± 0.28 | 94.53 ± 0.18 |
| Fusion strategy 2 (VGG-Net-16) | 4096 | 92.11 ± 0.31 | 94.58 ± 0.25 |
| Fusion strategy 1 (GoogLeNet) | 2048 | 89.15 ± 0.45 | 91.25 ± 0.59 |
| Fusion strategy 2 (GoogLeNet) | 1024 | 89.21 ± 0.39 | 91.31 ± 0.49 |
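To make the comparison in Table 5 easier to read, the following minimal sketch computes how much fusion strategy 2 improves on the RGB-only stream for each network at both training ratios. The mean values are copied from the table; treating them as accuracy percentages, and the names `rgb_only`, `fusion_2`, and `fusion_gain`, are assumptions made for illustration only and are not part of the original work.

```python
# Mean values copied from Table 5 as (20% training ratio, 50% training ratio).
# Assumption: these are classification accuracies in percent.
rgb_only = {
    "CaffeNet": (87.57, 90.22),
    "VGG-Net-16": (87.24, 90.60),
    "GoogLeNet": (84.18, 87.15),
}
fusion_2 = {
    "CaffeNet": (92.32, 94.42),
    "VGG-Net-16": (92.11, 94.58),
    "GoogLeNet": (89.21, 91.31),
}


def fusion_gain(single, fused):
    """Return per-network gains (percentage points) of fused over single-stream."""
    return {
        net: tuple(round(f - s, 2) for s, f in zip(single[net], fused[net]))
        for net in single
    }


# Hypothetical helper usage: print the gain at each training ratio.
gains = fusion_gain(rgb_only, fusion_2)
for net, (g20, g50) in gains.items():
    print(f"{net}: +{g20:.2f} points (20%), +{g50:.2f} points (50%)")
```

For the values in the table, the gains fall between roughly 4 and 5 percentage points for all three networks. This compares means only and ignores the reported standard deviations.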