| Main network | Structure | Output (channels, H × W) |
|---|---|---|
| Prelayer | Conv (64, 3 × 3, stride = 1) | (64, 64 × 64) |
| Layer 1 | Concatenation | (128, 64 × 64) |
| | Max pool (2 × 2, stride = 2) | (128, 32 × 32) |
| | Conv (128, 3 × 3, stride = 1) | (128, 32 × 32) |
| Layer 2 | Concatenation | (256, 32 × 32) |
| | Max pool (2 × 2, stride = 2) | (256, 16 × 16) |
| | Conv (256, 3 × 3, stride = 1) | (256, 16 × 16) |
| Layer 3 | Concatenation | (512, 16 × 16) |
| | Max pool (2 × 2, stride = 2) | (512, 8 × 8) |
| | Conv (512, 3 × 3, stride = 1) | (512, 8 × 8) |
| Layer 4 | Max pool (2 × 2, stride = 2) | (512, 4 × 4) |
| | Conv (512, 3 × 3, stride = 1) | (512, 4 × 4) |
| Layer 5 | Max pool (4 × 4, stride = 1) | (512, 1 × 1) |
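
The output sizes in the table can be checked with the standard sliding-window formula, out = floor((in + 2·padding − kernel) / stride) + 1. The sketch below is only an illustrative shape trace, not the network itself. It assumes a 64 × 64 input, a padding of 1 on every 3 × 3 convolution (implied by the unchanged spatial size but not stated in the table), no padding on the pooling layers, and that each concatenation joins a feature map with as many channels as the main branch, which is what the doubling from 64 to 128, 256, and 512 suggests. The names `out_size` and `trace_main_network` are hypothetical.

```python
def out_size(size, kernel, stride, padding=0):
    """Spatial size after a conv or pooling window (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_main_network(input_size=64):
    """Return (step, channels, spatial size) for every row of the table."""
    shapes = []
    size = input_size

    # Prelayer: 3x3 conv, stride 1; padding=1 is an assumption.
    channels = 64
    size = out_size(size, kernel=3, stride=1, padding=1)
    shapes.append(("Prelayer conv", channels, size))

    # Layers 1-3: concatenation, 2x2 max pool, 3x3 conv.
    for layer in (1, 2, 3):
        # Assumption: the concatenated map has the same channel count,
        # so concatenation doubles the channels.
        channels *= 2
        shapes.append((f"Layer {layer} concat", channels, size))
        size = out_size(size, kernel=2, stride=2)
        shapes.append((f"Layer {layer} max pool", channels, size))
        size = out_size(size, kernel=3, stride=1, padding=1)
        shapes.append((f"Layer {layer} conv", channels, size))

    # Layer 4: 2x2 max pool, then 3x3 conv, channels unchanged.
    size = out_size(size, kernel=2, stride=2)
    shapes.append(("Layer 4 max pool", channels, size))
    size = out_size(size, kernel=3, stride=1, padding=1)
    shapes.append(("Layer 4 conv", channels, size))

    # Layer 5: 4x4 max pool with stride 1 collapses 4x4 to 1x1.
    size = out_size(size, kernel=4, stride=1)
    shapes.append(("Layer 5 max pool", channels, size))
    return shapes


if __name__ == "__main__":
    for step, channels, size in trace_main_network():
        print(f"{step:18s} ({channels}, {size} x {size})")
```

Under these assumptions the trace reproduces every row of the table. In particular, the final 4 × 4 pooling window covers the whole 4 × 4 feature map, so Layer 5 acts as a global max pool and leaves a 512-dimensional vector.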