Research Article

Dynamic Warping Network for Semantic Video Segmentation

Table 4

Ablation study of feature fusion and propagation on the Cityscapes validation set.

Method                   mIoU (%)
Baseline                 73.75
Baseline + sum           74.30
Baseline + concatenate   74.25
Baseline + TCLoss        74.87
Baseline + TCLoss        75.25

Sum and concatenate denote feature fusion by weighted sum and by concatenation, respectively, of the warped features and the original features. TCLoss denotes the temporal consistency loss, which comprises the feature consistency loss and the prediction consistency loss. Bold values indicate the best accuracy, which our method achieves when using both the feature consistency loss and the prediction consistency loss.
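
To make the two fusion variants concrete, the following is a minimal NumPy sketch and not the implementation evaluated in the table. The function names `fuse_sum` and `fuse_concat`, the fusion weight `alpha`, the (channels, height, width) layout, and the toy tensor sizes are all assumptions introduced for illustration; the weighting actually used in the experiments is not specified here.

```python
import numpy as np


def fuse_sum(original, warped, alpha=0.5):
    """Weighted sum of warped and original features.

    `alpha` is a hypothetical fusion weight; the output keeps the
    channel count of the inputs.
    """
    return alpha * warped + (1.0 - alpha) * original


def fuse_concat(original, warped):
    """Concatenate along the channel axis; the channel count doubles."""
    return np.concatenate([original, warped], axis=0)


# Toy feature maps in an assumed (channels, height, width) layout.
rng = np.random.default_rng(0)
original = rng.standard_normal((4, 8, 16))
warped = rng.standard_normal((4, 8, 16))

summed = fuse_sum(original, warped)
stacked = fuse_concat(original, warped)

print(summed.shape)   # (4, 8, 16)
print(stacked.shape)  # (8, 8, 16)
```

In this sketch the weighted sum leaves the feature dimensionality unchanged, so later layers need no modification, whereas concatenation doubles the channel count and therefore requires the following layer to accept the wider input.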