Research Article
RDMMFET: Representation of Dense Multimodality Fusion Encoder Based on Transformer
Table 2
Ablation studies on VQA v2.0 test-dev for the number of iterations and the number of layers in each encoder.
| Module | Setting | Accuracy (%) |
| --- | --- | --- |
| Number of iterations | Epoch = 3 | 72.46 |
| | Epoch = 4 | 72.59 |
| | Epoch = 5 | 72.46 |
| | Epoch = 6 | 72.22 |
| Number of encoder layers | | 72.37 |
| | | 72.35 |
| | | 72.47 |
| | | 72.44 |
| | | 72.25 |
| | | 72.37 |
| | | 72.59 |