Research Article
Multimodal Multiobject Tracking by Fusing Deep Appearance Features and Motion Information
Table 1
Architecture of the CNN used for appearance feature extraction.
| Name | Patch size/stride | Output size |
| --- | --- | --- |
| Conv 1 | 3 × 3/1 | 32 × 128 × 64 |
| Conv 2 | 3 × 3/1 | 32 × 128 × 64 |
| Max pool 3 | 3 × 3/2 | 32 × 64 × 32 |
| Residual 4 | 3 × 3/1 | 32 × 64 × 32 |
| Residual 5 | 3 × 3/1 | 32 × 64 × 32 |
| Residual 6 | 3 × 3/2 | 64 × 32 × 16 |
| Residual 7 | 3 × 3/1 | 64 × 32 × 16 |
| Residual 8 | 3 × 3/2 | 128 × 16 × 8 |
| Residual 9 | 3 × 3/1 | 128 × 16 × 8 |
| Dense 10 | — | 128 |
| Batch and normalization | — | 128 |
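The output sizes in Table 1 can be checked with the standard convolution output-size formula. The sketch below is only an illustration, not the authors' implementation: it assumes a 128 × 64 input crop and a padding of 1 for every 3 × 3 layer, neither of which is stated in the table, and the names `conv_out`, `LAYERS`, and `trace_shapes` are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a conv/pool layer (padding=1 is an assumption)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, output channels, stride) for each 3 x 3 layer listed in Table 1.
LAYERS = [
    ("Conv 1", 32, 1),
    ("Conv 2", 32, 1),
    ("Max pool 3", 32, 2),
    ("Residual 4", 32, 1),
    ("Residual 5", 32, 1),
    ("Residual 6", 64, 2),
    ("Residual 7", 64, 1),
    ("Residual 8", 128, 2),
    ("Residual 9", 128, 1),
]


def trace_shapes(height=128, width=64):
    """Return (name, (channels, height, width)) after each layer.

    The 128 x 64 input size is assumed; it is implied by the Conv 1 row
    only if that layer preserves the spatial size.
    """
    shapes = []
    for name, channels, stride in LAYERS:
        height = conv_out(height, stride=stride)
        width = conv_out(width, stride=stride)
        shapes.append((name, (channels, height, width)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(f"{name:<12} {shape[0]} x {shape[1]} x {shape[2]}")
```

Under these assumptions, each stride-2 layer halves both spatial dimensions, which reproduces the table's progression from 128 × 64 down to 16 × 8 before the dense layer maps the features to a 128-dimensional vector.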