Research Article
Multimodal Semantics Extraction from User-Generated Videos
Table 3
Performance comparison for the event genre classification task using different feature-sets.
Columns 3 to 6 give the automatic event genre classification obtained with each feature-set.

| Event | Ground truth event genre | Feature-set (audio) | Feature-set (sensors) | Feature-set (DSIFT) | Feature-set (global visual features) |
| --- | --- | --- | --- | --- | --- |
| Football match 1 | Sport | Live music | Sport | Sport | Sport |
| Football match 2 | Sport | Sport | Sport | Sport | Sport |
| Football match 3 | Sport | Live music | Sport | Sport | Sport |
| Ice-hockey match 1 | Sport | Live music | Sport | Live music | Sport |
| Ice-hockey match 2 | Sport | Live music | Sport | Live music | Live music |
| Concert 1 | Live music | Live music | Live music | Live music | Live music |
| Concert 2 | Live music | Live music | Live music | Live music | Sport |
| Concert 3 | Live music | Sport | Live music | Live music | Live music |
| Concert 4 | Live music | Live music | Sport | Live music | Live music |
| Total accuracy (%) | - | 44.4 | 88.9 | 77.8 | 77.8 |
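The total accuracy row in Table 3 is consistent with the fraction of the nine events whose predicted genre matches the ground truth, expressed as a percentage. The following is a minimal sketch of that arithmetic, assuming this is how the figures were obtained; the labels are transcribed from the table, and the names `GROUND_TRUTH`, `PREDICTIONS`, and `total_accuracy` are illustrative and do not come from the article.

```python
# Labels transcribed from Table 3, in row order:
# Football match 1-3, Ice-hockey match 1-2, Concert 1-4.
GROUND_TRUTH = ["Sport"] * 5 + ["Live music"] * 4

# Predicted genre per feature-set (dictionary keys are illustrative names).
PREDICTIONS = {
    "audio": ["Live music", "Sport", "Live music", "Live music", "Live music",
              "Live music", "Live music", "Sport", "Live music"],
    "sensors": ["Sport", "Sport", "Sport", "Sport", "Sport",
                "Live music", "Live music", "Live music", "Sport"],
    "DSIFT": ["Sport", "Sport", "Sport", "Live music", "Live music",
              "Live music", "Live music", "Live music", "Live music"],
    "global visual features": ["Sport", "Sport", "Sport", "Sport", "Live music",
                               "Live music", "Sport", "Live music", "Live music"],
}


def total_accuracy(predicted, truth):
    """Return the percentage of events classified correctly, to one decimal."""
    correct = sum(p == t for p, t in zip(predicted, truth))
    return round(100.0 * correct / len(truth), 1)


# Accuracy per feature-set, assuming accuracy = correct events / all events.
ACCURACY = {name: total_accuracy(labels, GROUND_TRUTH)
            for name, labels in PREDICTIONS.items()}

for name, value in ACCURACY.items():
    print(f"{name}: {value}%")
```

Under this assumption, the audio feature-set classifies 4 of 9 events correctly, the sensor feature-set 8 of 9, and both visual feature-sets 7 of 9, which matches the percentages reported in the table.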