Research Article
Visual-Text Reference Pretraining Model for Image Captioning
Table 3
Comparison of the results generated by VTR-PTM under different input modes on MS COCO (B@n: BLEU-n, M: METEOR, R: ROUGE-L, C: CIDEr, S: SPICE).
| Approach | B@1 | B@2 | B@3 | B@4 | M | R | C | S |
|---|---|---|---|---|---|---|---|---|
| VTR-PTM0 | 71.1 | 55.4 | 40.2 | 29.6 | 24.3 | 51.5 | 100.5 | 20.1 |
| VTR-PTM1 | 80.2 | 65.4 | 52.3 | 39.5 | 29.2 | 58.3 | 128.6 | 27.3 |
| VTR-PTM2 | 82.9 | 67.3 | 53.4 | 40.9 | 30.9 | 61.5 | 130.2 | 28.5 |