Research Article

Visual-Text Reference Pretraining Model for Image Captioning

Table 3

Comparison of the results generated by VTR-PTM in different input modes on MS COCO.

Approach   B@1    B@2    B@3    B@4    M      R      C       S
VTR-PTM0   71.1   55.4   40.2   29.6   24.3   51.5   100.5   20.1
VTR-PTM1   80.2   65.4   52.3   39.5   29.2   58.3   128.6   27.3
VTR-PTM2   82.9   67.3   53.4   40.9   30.9   61.5   130.2   28.5