Research Article

A Two-Stream Deep Fusion Framework for High-Resolution Aerial Scene Classification

Table 7

Classification performance of the proposed method on the NWPU-RESISC45 dataset using different feature extractors and fusion strategies.

Architecture                               Feature size   Training ratio 10%   Training ratio 20%

Without fusion (CaffeNet(RGB))             4096           77.34 ± 0.32         80.54 ± 0.22
Without fusion (CaffeNet(saliency))        4096           75.06 ± 0.51         78.20 ± 0.33
Without fusion (VGG-Net-16(RGB))           4096           77.10 ± 0.14         80.45 ± 0.31
Without fusion (VGG-Net-16(saliency))      4096           74.94 ± 0.23         78.09 ± 0.48
Without fusion (GoogLeNet(RGB))            1024           76.87 ± 0.45         79.12 ± 0.23
Without fusion (GoogLeNet(saliency))       1024           74.67 ± 0.52         77.04 ± 0.19
Fusion strategy 1 (CaffeNet)               8192           80.15 ± 0.23         83.08 ± 0.21
Fusion strategy 2 (CaffeNet)               4096           80.22 ± 0.22         83.16 ± 0.18
Fusion strategy 1 (VGG-Net-16)             8192           79.95 ± 0.12         82.96 ± 0.19
Fusion strategy 2 (VGG-Net-16)             4096           80.03 ± 0.19         83.02 ± 0.14
Fusion strategy 1 (GoogLeNet)              2048           79.69 ± 0.47         81.46 ± 0.22
Fusion strategy 2 (GoogLeNet)              1024           79.75 ± 0.41         81.52 ± 0.28
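
As a reading aid, the gain of each fusion strategy over the RGB-only baseline can be computed directly from the mean accuracies in Table 7. The following is a minimal sketch: the dictionary layout and the helper name `fusion_gain` are illustrative assumptions rather than part of the paper, and the reported standard deviations are left out.

```python
# Mean accuracies copied from Table 7 as (10% training ratio, 20% training ratio).
# Standard deviations are omitted; names below are illustrative, not from the paper.
rgb_baseline = {
    "CaffeNet": (77.34, 80.54),
    "VGG-Net-16": (77.10, 80.45),
    "GoogLeNet": (76.87, 79.12),
}

# Keys are (backbone, fusion strategy number).
fused = {
    ("CaffeNet", 1): (80.15, 83.08),
    ("CaffeNet", 2): (80.22, 83.16),
    ("VGG-Net-16", 1): (79.95, 82.96),
    ("VGG-Net-16", 2): (80.03, 83.02),
    ("GoogLeNet", 1): (79.69, 81.46),
    ("GoogLeNet", 2): (79.75, 81.52),
}


def fusion_gain(backbone, strategy):
    """Return fused minus RGB-only mean accuracy at each training ratio."""
    base = rgb_baseline[backbone]
    return tuple(round(f - b, 2) for f, b in zip(fused[(backbone, strategy)], base))


if __name__ == "__main__":
    for backbone, strategy in fused:
        print(backbone, "strategy", strategy, fusion_gain(backbone, strategy))
```

On these numbers, every fused configuration improves on its RGB-only baseline by roughly 2.3 to 2.9 points, and strategy 2 matches or slightly exceeds strategy 1 while using half the feature size.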