Research Article

Research on Video Captioning Based on Multifeature Fusion

Figure 1

A video contains not only physical objects but also other features, such as sound. Paying more attention to these supplementary features makes the generated caption more complete. (a) Video example of fast skiing. (b) Video example of a train sounding its horn as it leaves a tunnel.