Research Article

A Two-Stream Deep Fusion Framework for High-Resolution Aerial Scene Classification

Table 5

Classification performance of the proposed method on the AID dataset using different feature extractors and fusion strategies.

| Architecture | Feature size | Training ratio 20% | Training ratio 50% |
|---|---|---|---|
| Without fusion (CaffeNet (RGB)) | 4096 | 87.57 ± 0.32 | 90.22 ± 0.42 |
| Without fusion (CaffeNet (saliency)) | 4096 | 84.45 ± 0.28 | 87.21 ± 0.48 |
| Without fusion (VGG-Net-16 (RGB)) | 4096 | 87.24 ± 0.18 | 90.60 ± 0.31 |
| Without fusion (VGG-Net-16 (saliency)) | 4096 | 84.25 ± 0.11 | 87.62 ± 0.56 |
| Without fusion (GoogLeNet (RGB)) | 1024 | 84.18 ± 0.53 | 87.15 ± 0.69 |
| Without fusion (GoogLeNet (saliency)) | 1024 | 81.12 ± 0.55 | 84.28 ± 0.67 |
| Fusion strategy 1 (CaffeNet) | 8192 | 92.26 ± 0.52 | 94.36 ± 0.29 |
| Fusion strategy 2 (CaffeNet) | 4096 | 92.32 ± 0.41 | 94.42 ± 0.33 |
| Fusion strategy 1 (VGG-Net-16) | 8192 | 92.04 ± 0.28 | 94.53 ± 0.18 |
| Fusion strategy 2 (VGG-Net-16) | 4096 | 92.11 ± 0.31 | 94.58 ± 0.25 |
| Fusion strategy 1 (GoogLeNet) | 2048 | 89.15 ± 0.45 | 91.25 ± 0.59 |
| Fusion strategy 2 (GoogLeNet) | 1024 | 89.21 ± 0.39 | 91.31 ± 0.49 |
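
To make the effect of fusion in Table 5 easier to read, the following minimal sketch computes the gain of each fusion strategy over the corresponding RGB-only stream, in percentage points, at both training ratios. The mean values are copied from the table; the names `rgb_only`, `fusion`, and `fusion_gain` are illustrative and do not come from the original work, and the standard deviations are ignored here for simplicity.

```python
# Mean values copied from Table 5 as (20% training ratio, 50% training ratio).
# The standard deviations reported in the table are deliberately left out.
rgb_only = {
    "CaffeNet": (87.57, 90.22),
    "VGG-Net-16": (87.24, 90.60),
    "GoogLeNet": (84.18, 87.15),
}

fusion = {
    "strategy 1": {
        "CaffeNet": (92.26, 94.36),
        "VGG-Net-16": (92.04, 94.53),
        "GoogLeNet": (89.15, 91.25),
    },
    "strategy 2": {
        "CaffeNet": (92.32, 94.42),
        "VGG-Net-16": (92.11, 94.58),
        "GoogLeNet": (89.21, 91.31),
    },
}


def fusion_gain(strategy):
    """Return {architecture: (gain at 20%, gain at 50%)} in percentage points,
    measured against the RGB-only stream of the same architecture."""
    return {
        arch: tuple(
            round(fused - base, 2)
            for fused, base in zip(fusion[strategy][arch], rgb_only[arch])
        )
        for arch in rgb_only
    }


if __name__ == "__main__":
    for strategy in fusion:
        for arch, (gain_20, gain_50) in fusion_gain(strategy).items():
            print(f"{strategy}, {arch}: +{gain_20:.2f} (20%), +{gain_50:.2f} (50%)")
```

Under this comparison, every fused configuration improves on its RGB-only counterpart by roughly 4 to 5 percentage points, and the two strategies differ from each other by less than 0.1 point, which is within the reported standard deviations.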