Research Article
Multiple Context Learning Networks for Visual Question Answering
Table 2
Results of ablating the context learning modules on the VQA v2.0 and GQA validation sets.
| Model | Module | GQA (All) | VQA v2.0 (All) | VQA v2.0 (Y/N) | VQA v2.0 (Num) | VQA v2.0 (Other) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Without all | 53.08 | 54.60 | 69.79 | 36.02 | 47.50 |
| 2 | Only VCL | 53.45 | 55.13 | 69.82 | 36.09 | 47.99 |
| 3 | Only TCL | 53.50 | 55.53 | 69.82 | 36.44 | 49.72 |
| 4 | Only VTCL | 58.63 | 62.07 | 79.79 | 42.67 | 53.73 |
| 5 | TCL + VTCL | 63.88 | 65.17 | 82.88 | 44.68 | 57.15 |
| 6 | VCL + VTCL | 59.04 | 62.72 | 79.33 | 43.29 | 55.23 |
| 7 | VCL + TCL | 53.72 | 55.76 | 71.01 | 36.37 | 49.29 |
| 8 | Full modules | 64.48 | 65.68 | 83.40 | 45.57 | 57.53 |