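| Hyperparameter | ConvGRU | SConvGRU | SConvGRU+ | ConvLSTM | SConvLSTM | SConvLSTM+ |
|---|---|---|---|---|---|---|
| CNN layers | 2 | 2 | 2 | 2 | 2 | 2 |
| Number of filters in CNN | 8 (first layer), 16 (second layer) | 8 (first layer), 16 (second layer) | 8 (first layer), 16 (second layer) | 8 (first layer), 16 (second layer) | 8 (first layer), 16 (second layer) | 8 (first layer), 16 (second layer) |
| Kernel size in CNN | (3, 3) | (3, 3) | (3, 3) | (3, 3) | (3, 3) | (3, 3) |
| Stride in CNN | 1 (first layer), 2 (second layer) | 1 (first layer), 2 (second layer) | 1 (first layer), 2 (second layer) | 1 (first layer), 2 (second layer) | 1 (first layer), 2 (second layer) | 1 (first layer), 2 (second layer) |
| Unit layers | 1 | 1 | 1 | 1 | 1 | 1 |
| Hidden channels of unit | 64 | 64 | 64 | 64 | 64 | 64 |
| Kernel size in unit | (3, 3) | (3, 3) | (3, 3) | (3, 3) | (3, 3) | (3, 3) |
| Input of gating mechanism | | | | | | |
| DCNN layers | 2 | 2 | 2 | 2 | 2 | 2 |
| Number of filters in DCNN | 8 (first layer), 2 (second layer) | 8 (first layer), 2 (second layer) | 8 (first layer), 2 (second layer) | 8 (first layer), 2 (second layer) | 8 (first layer), 2 (second layer) | 8 (first layer), 2 (second layer) |
| Kernel size in DCNN | (3, 3) | (3, 3) | (3, 3) | (3, 3) | (3, 3) | (3, 3) |
| Stride in DCNN | 2 (first layer), 1 (second layer) | 2 (first layer), 1 (second layer) | 2 (first layer), 1 (second layer) | 2 (first layer), 1 (second layer) | 2 (first layer), 1 (second layer) | 2 (first layer), 1 (second layer) |
| Batch size | 16 | 16 | 16 | 16 | 16 | 16 |
| Timestep | 10 | 10 | 10 | 10 | 10 | 10 |
| Epoch | 300 | 300 | 300 | 300 | 300 | 300 |
| Optimizer | Adam [36] | Adam | Adam | Adam | Adam | Adam |
| Learning rate | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 |
| Strategy | Early-stopping | Early-stopping | Early-stopping | Early-stopping | Early-stopping | Early-stopping |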
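
Because every setting that survives in the table is identical across the six models, the configuration could be captured once and reused. The following is a minimal sketch of that idea, not the authors' code: the class name `SharedHyperparameters`, its field names, and the `CONFIGS` mapping are hypothetical, and the "Input of gating mechanism" row is omitted because its cells are empty in the table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SharedHyperparameters:
    """Settings from the table that are the same for all six models.

    All names here are illustrative assumptions; only the values come
    from the table.
    """
    # Convolutional encoder (CNN): two layers
    cnn_filters: Tuple[int, int] = (8, 16)
    cnn_kernel_size: Tuple[int, int] = (3, 3)
    cnn_strides: Tuple[int, int] = (1, 2)
    # Recurrent unit (ConvGRU / ConvLSTM variants): one layer
    unit_layers: int = 1
    unit_hidden_channels: int = 64
    unit_kernel_size: Tuple[int, int] = (3, 3)
    # Deconvolutional decoder (DCNN): two layers
    dcnn_filters: Tuple[int, int] = (8, 2)
    dcnn_kernel_size: Tuple[int, int] = (3, 3)
    dcnn_strides: Tuple[int, int] = (2, 1)
    # Training setup
    batch_size: int = 16
    timesteps: int = 10
    max_epochs: int = 300  # upper bound; early stopping may end training sooner
    optimizer: str = "Adam"
    learning_rate: float = 2e-4
    early_stopping: bool = True


# Model names exactly as they appear in the table header.
MODELS = ("ConvGRU", "SConvGRU", "SConvGRU+", "ConvLSTM", "SConvLSTM", "SConvLSTM+")

# One shared configuration per model; the frozen dataclass keeps them read-only.
CONFIGS = {name: SharedHyperparameters() for name in MODELS}
```

Keeping the shared values in one frozen object makes it harder for the models to drift apart accidentally, so any remaining difference between them would have to come from the unit type and its gating input.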