Research Article
Multiple Context Learning Networks for Visual Question Answering
Table 6
Comparison with previous state-of-the-art methods on the VQA v2.0 test dataset.
| Model | Test-dev All | Test-dev Y/N | Test-dev Num | Test-dev Other | Test-std All |
| --- | --- | --- | --- | --- | --- |
| BUTD [13] | 65.32 | 81.82 | 44.21 | 56.05 | 65.67 |
| MFH [32] | 68.76 | 85.31 | 49.56 | 59.89 | - |
| Counter [33] | 68.09 | 83.14 | 51.62 | 58.97 | 68.09 |
| v-AGCN [17] | 65.94 | 82.58 | 45.12 | 56.71 | 66.17 |
| ReGAT [16] | 70.27 | 86.08 | 54.42 | 60.33 | 70.58 |
| DFAF [20] | 70.22 | 86.09 | 53.32 | 60.49 | 70.34 |
| MCAN [21] | 70.63 | 86.82 | 53.26 | 60.72 | 70.90 |
| MEDAN [22] | 70.60 | 87.10 | 52.69 | 60.56 | 71.01 |
| MCLN-LSTM | 70.26 | 85.95 | 53.18 | 60.72 | 70.63 |
| MCLN-BERT | 71.05 | 87.43 | 53.28 | 61.08 | 71.48 |