Research Article

Visual-Text Reference Pretraining Model for Image Captioning

Figure 2. Two different visual reference networks: (a) single-channel visual reference network and (b) dual-channel visual reference network.