Research Article
A Two-Stream Deep Fusion Framework for High-Resolution Aerial Scene Classification
Table 7
Classification performance of the proposed method on the NWPU-RESISC45 dataset using different feature extractors and fusion strategies.
| Architecture | Feature size | Training ratio 10% | Training ratio 20% |
| --- | --- | --- | --- |
| Without fusion (CaffeNet(RGB)) | 4096 | 77.34 ± 0.32 | 80.54 ± 0.22 |
| Without fusion (CaffeNet(saliency)) | 4096 | 75.06 ± 0.51 | 78.20 ± 0.33 |
| Without fusion (VGG-Net-16(RGB)) | 4096 | 77.10 ± 0.14 | 80.45 ± 0.31 |
| Without fusion (VGG-Net-16(saliency)) | 4096 | 74.94 ± 0.23 | 78.09 ± 0.48 |
| Without fusion (GoogLeNet(RGB)) | 1024 | 76.87 ± 0.45 | 79.12 ± 0.23 |
| Without fusion (GoogLeNet(saliency)) | 1024 | 74.67 ± 0.52 | 77.04 ± 0.19 |
| Fusion strategy 1 (CaffeNet) | 8192 | 80.15 ± 0.23 | 83.08 ± 0.21 |
| Fusion strategy 2 (CaffeNet) | 4096 | 80.22 ± 0.22 | 83.16 ± 0.18 |
| Fusion strategy 1 (VGG-Net-16) | 8192 | 79.95 ± 0.12 | 82.96 ± 0.19 |
| Fusion strategy 2 (VGG-Net-16) | 4096 | 80.03 ± 0.19 | 83.02 ± 0.14 |
| Fusion strategy 1 (GoogLeNet) | 2048 | 79.69 ± 0.47 | 81.46 ± 0.22 |
| Fusion strategy 2 (GoogLeNet) | 1024 | 79.75 ± 0.41 | 81.52 ± 0.28 |