Research Article
Multiple Context Learning Networks for Visual Question Answering
Table 3
Results of fine-tuning BERT on the VQA v2.0 validation set.
| Model | lr× | All | Y/N | Num | Other |
| --- | --- | --- | --- | --- | --- |
| N = 1, BERT | 0.001 | 65.21 | 82.61 | 45.74 | 57.13 |
| N = 1, BERT | 0.01 | 66.12 | 84.08 | 45.97 | 57.80 |
| N = 1, BERT | 0.1 | 66.61 | 85.09 | 46.14 | 57.98 |
| N = 2, BERT | 0.1 | 67.52 | 85.18 | 49.09 | 58.97 |
| N = 3, BERT | 0.1 | 67.86 | 85.35 | 49.75 | 59.27 |
| N = 4, BERT | 0.1 | 67.80 | 85.73 | 49.94 | 58.94 |
| N = 3, LSTM | - | 66.86 | 84.62 | 48.85 | 58.34 |