Research Article
RDMMFET: Representation of Dense Multimodality Fusion Encoder Based on Transformer
Table 3
Comparison with the latest models on the VQA v2.0 data set.
| Label | Method | Test-dev | Test-std |
| No pretraining | DFAF [8] | 70.22 | 70.34 |
| | MCAN [9] | 70.63 | 70.90 |
| | MUAN [38] | 70.82 | 71.10 |
| Pretraining | ViLBERT [23] | 70.55 | 70.92 |
| | VisualBert [27] | 70.80 | 71.00 |
| | VL-BERT(base) [28] | 71.16 | - |
| | VL-BERT(large) [28] | 71.79 | 72.22 |
| | LXMERT [24] | 72.42 | 72.54 |
| | RDMMFET (ours) | 72.59 | 72.67 |