Research Article

Hierarchical Attention-Based Multimodal Fusion Network for Video Emotion Recognition

Table 6

Emotion recognition accuracy with global attention on the Ekman and VideoEmotion-8 datasets.

Fully connected layers    Ekman (%)    VideoEmotion-8 (%)
No attention fusion       47.9         49.3
G1                        56.68        52.69
G2                        57.7         53.13
G3                        55.31        51.71
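
To make the comparison in Table 6 easier to read, the gain of each setting over the no-attention baseline can be computed directly from the reported accuracies. The following is a minimal sketch: the accuracy values are copied from the table, while the names `table6`, `gains_over_baseline`, and `best_setting` are hypothetical helpers introduced here for illustration and are not part of the original work.

```python
# Accuracies (%) copied from Table 6; keys are the row labels as printed.
# The variable and function names are illustrative assumptions.
table6 = {
    "No attention fusion": {"Ekman": 47.9, "VideoEmotion-8": 49.3},
    "G1": {"Ekman": 56.68, "VideoEmotion-8": 52.69},
    "G2": {"Ekman": 57.7, "VideoEmotion-8": 53.13},
    "G3": {"Ekman": 55.31, "VideoEmotion-8": 51.71},
}

BASELINE = "No attention fusion"


def gains_over_baseline(table, dataset):
    """Return {row label: accuracy gain in percentage points} for one dataset."""
    base = table[BASELINE][dataset]
    return {
        name: round(row[dataset] - base, 2)
        for name, row in table.items()
        if name != BASELINE
    }


def best_setting(table, dataset):
    """Return the row label with the highest accuracy on one dataset."""
    return max(table, key=lambda name: table[name][dataset])


if __name__ == "__main__":
    for dataset in ("Ekman", "VideoEmotion-8"):
        print(dataset, best_setting(table6, dataset), gains_over_baseline(table6, dataset))
```

On the values in Table 6, G2 gives the highest accuracy on both datasets, 9.8 percentage points above the baseline on Ekman and 3.83 on VideoEmotion-8.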