Research Article
A Video Classification Method Based on Spatiotemporal Detail Attention and Feature Fusion
Table 4
Comparison between the proposed algorithm and other methods on Kinetics-600.
| Method | Top-1 (%) | Top-5 (%) | GFLOPs |
|---|---|---|---|
| I3D [15] | 71.9 | 90.1 | 108 |
| StNet-IRv2 RGB [51] | 79.0 | - | - |
| AttentionNAS [5] | 79.8 | 94.4 | - |
| LGD-3D R101 [52] | 81.5 | 95.6 | - |
| SlowFast, R101 [3] | 81.1 | 95.1 | 213 |
| SlowFast, R101+NL [3] | 81.8 | 95.1 | 234 |
| TSN [1] | 71.7 | 90.6 | 33 |
| TSM [2] | 75.6 | 92.1 | 65 |
| VCM-SDD, R101_NP | 79.6 | 94.3 | 46.8 |
| VCM-SDD, R101 | 80.4 | 94.7 | 46.8 |
| VCM-SDD, R101_NP | 81.3 | 94.9 | 46.8 |
| VCM-SDD, R101 | 81.9 | 95.3 | 46.8 |