Research Article
RGB-D Human Action Recognition of Deep Feature Enhancement and Fusion Using Two-Stream ConvNet
Table 3
Comparison of accuracy when adding nonlocal blocks at different locations in ST-GCN.
| Network | Top 1 | Top 5 |
| --- | --- | --- |
| Baseline | 81.5% | |
| 1-block | 85.23% | 98.74% |
| 2-block | 86.43% | 98.62% |
| 3-block | 85.43% | 97.12% |
| 4-block | 82.14% | 97.2% |
| 5-block | 85.55% | 97.31% |
| 1-2-block | 85.63% | 96.32% |
| 1-3-block | 84.08% | 95.67% |
| 1-4-block | 84.24% | 92.35% |
| 2-2-block | 87.62% | 97.3% |
| 2-3-block | 84.1% | 95.2% |
| 2-4-block | 84.41% | 94.69% |
| 3-3-block | 83.77% | 94.12% |
| 3-4-block | 80.19% | 91.63% |
| 4-4-block | 77.09% | 91.03% |
| 5-5-block | 77.75% | 90.12% |