Research Article

A Video Classification Method Based on Spatiotemporal Detail Attention and Feature Fusion

Table 3

Comparison of the proposed algorithm with other methods on Kinetics-400.

Method                    Top-1 (%)  Top-5 (%)  GFLOPs

I3D [15]                  72.1       90.3       108
Two-stream I3D [15]       75.7       92.0       216
S3D-G [26]                77.2       93.0       -
Nonlocal R50 [47]         76.5       92.6       -
Nonlocal R101 [47]        77.7       93.3       -
R(2+1)D Flow [25]         67.5       87.2       152
STC [48]                  68.7       88.5       -
ARTNet [49]               69.2       88.3       23.5
S3D [26]                  69.4       89.1       66.4
ECO [50]                  70.0       89.4       216
R(2+1)D [25]              73.9       90.9       152
TSN [1]                   71.3       91.5       33
TSM [2]                   75.1       91.8       65
SlowFast , R101 [3]       78.9       93.5       213
SlowFast , R101+NL [3]    79.8       93.9       234
VCM-SDD , R101_NP         77.4       93.1       46.8
VCM-SDD , R101            78.5       93.5       46.8
VCM-SDD , R101_NP         79.3       93.9       46.8
VCM-SDD , R101            80.1       94.4       46.8