Research Article
Visual-Text Reference Pretraining Model for Image Captioning
Table 5
Comparison of VTR-PTM results under different initialization methods on MS COCO. B@1 to B@4 denote BLEU-1 to BLEU-4; M, R, C, and S denote METEOR, ROUGE, CIDEr, and SPICE, respectively.
| Approach | B@1 | B@2 | B@3 | B@4 | M | R | C | S |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| VTR-PTM from scratch | 80.3 | 63.2 | 50.9 | 37.7 | 28.5 | 56.9 | 123.4 | 25.8 |
| VTR-PTM from BERT | 81.5 | 66.4 | 52.7 | 38.6 | 29.5 | 58.3 | 125.6 | 27.8 |
| VTR-PTM from UNILM | 82.9 | 67.3 | 53.4 | 40.9 | 30.9 | 61.5 | 130.2 | 28.5 |