Research Article
Hierarchical Attention-Based Multimodal Fusion Network for Video Emotion Recognition
Table 9
Top-1 accuracy (%) of the proposed method compared with state-of-the-art methods on Ekman-6 and VideoEmotion-8.
| Method | Ekman-6 (%) | VideoEmotion-8 (%) |
| --- | --- | --- |
| Emotion in context [10] | 51.8 | 50.6 |
| Xu et al. [33] | 50.4 | 46.7 |
| Kernelized feature [26] | 54.4 | 49.7 |
| Concept selection [27] | 54.40 | 50.82 |
| Graph-based network [36] | 55.01 | 51.77 |
| CAAN [37] | 56.23 | 52.5 |
| Ours | 57.7 | 53.13 |