Computational Intelligence and Neuroscience

Research Article

Research on Video Captioning Based on Multifeature Fusion

Table 1

Comparison of experimental results on the MSR-VTT dataset for models trained with different experimental parameters and different modal information fusion methods.


Number of layers	Feature	BLEU4 (Coordinated)	BLEU4 (Joint)	METEOR (Coordinated)	METEOR (Joint)	ROUGE-L (Coordinated)	ROUGE-L (Joint)	CIDEr (Coordinated)	CIDEr (Joint)

1		0.306	0.299	0.255	0.251	0.517	0.518	0.391	0.400
		0.359	0.352	0.214	0.200	0.603	0.598	0.397	0.395
		0.401	0.410	0.290	0.287	0.619	0.586	0.422	0.410

2		0.334	0.325	0.235	0.220	0.520	0.499	0.394	0.396
		0.386	0.381	0.243	0.244	0.609	0.587	0.424	0.422
		0.443	0.430	0.327	0.319	0.612	0.600	0.521	0.517

3		0.325	0.319	0.227	0.231	0.542	0.539	0.389	0.391
		0.379	0.377	0.246	0.237	0.597	0.585	0.463	0.459
		0.393	0.390	0.292	0.293	0.599	0.571	0.497	0.469
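
The comparison in Table 1 can be summarized programmatically. The following is a minimal sketch, not part of the original work: it transcribes the scores from the table, finds the best-scoring configuration for each metric, and counts how often coordinated fusion scores higher than joint fusion. The names `ROWS`, `best_per_metric`, and `coordinated_wins` are hypothetical, and because the Feature column carries no labels in the table, rows are identified only by their position within each layer group.

```python
# Scores transcribed from Table 1, in table order.
# Each entry: (number of layers, [BLEU4 coord, BLEU4 joint, METEOR coord, METEOR joint,
#                                 ROUGE-L coord, ROUGE-L joint, CIDEr coord, CIDEr joint])
ROWS = [
    (1, [0.306, 0.299, 0.255, 0.251, 0.517, 0.518, 0.391, 0.400]),
    (1, [0.359, 0.352, 0.214, 0.200, 0.603, 0.598, 0.397, 0.395]),
    (1, [0.401, 0.410, 0.290, 0.287, 0.619, 0.586, 0.422, 0.410]),
    (2, [0.334, 0.325, 0.235, 0.220, 0.520, 0.499, 0.394, 0.396]),
    (2, [0.386, 0.381, 0.243, 0.244, 0.609, 0.587, 0.424, 0.422]),
    (2, [0.443, 0.430, 0.327, 0.319, 0.612, 0.600, 0.521, 0.517]),
    (3, [0.325, 0.319, 0.227, 0.231, 0.542, 0.539, 0.389, 0.391]),
    (3, [0.379, 0.377, 0.246, 0.237, 0.597, 0.585, 0.463, 0.459]),
    (3, [0.393, 0.390, 0.292, 0.293, 0.599, 0.571, 0.497, 0.469]),
]
METRICS = ["BLEU4", "METEOR", "ROUGE-L", "CIDEr"]
FUSIONS = ["Coordinated", "Joint"]


def best_per_metric(rows):
    """Return {metric: (score, layers, fusion, row_index)} for the top score per metric."""
    best = {}
    for row_idx, (layers, scores) in enumerate(rows):
        for col, score in enumerate(scores):
            metric = METRICS[col // 2]   # two columns per metric
            fusion = FUSIONS[col % 2]    # even column: coordinated, odd: joint
            if metric not in best or score > best[metric][0]:
                best[metric] = (score, layers, fusion, row_idx)
    return best


def coordinated_wins(rows):
    """Count (row, metric) pairs where coordinated fusion strictly beats joint fusion."""
    wins = total = 0
    for _, scores in rows:
        for i in range(0, len(scores), 2):
            total += 1
            wins += scores[i] > scores[i + 1]
    return wins, total


if __name__ == "__main__":
    for metric, (score, layers, fusion, row_idx) in best_per_metric(ROWS).items():
        print(f"{metric}: {score:.3f} (layers={layers}, fusion={fusion}, table row {row_idx + 1})")
    wins, total = coordinated_wins(ROWS)
    print(f"Coordinated > Joint in {wins} of {total} comparisons")
```

On the values transcribed above, the highest BLEU4, METEOR, and CIDEr scores all come from the third row of the two-layer group under coordinated fusion, while the highest ROUGE-L score comes from the third row of the one-layer group.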