Computational Intelligence and Neuroscience

Research Article

Mitigation of Effects of Occlusion on Object Recognition with Deep Neural Networks through Low-Level Image Completion

Figure 6

ConvNet classification accuracy when training on either unoccluded or combined (unoccluded and occluded) data, with no recovery. The two convolutional networks have identical architectures. In both conditions, the input images are passed to the classifier with no attempt to discount occlusion pixels. In the “unoccluded” case, the network is trained on the 24,300 image pairs in the NORB-simple training set. In the “combined” case, the network is trained on the 243,000 image pairs in the SORBO-combined training set. Approximately a quarter of these image pairs are unoccluded; the remaining pairs contain various classes and levels of occlusion. Training on unoccluded images produces higher accuracy on the unoccluded testing images, but performance degrades quickly towards chance as the level of occlusion in the testing images increases. The network trained on combined data is less accurate on unoccluded images but more robust to increasing occlusion.
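
As a rough illustration of how a curve like the one in this figure could be tabulated, the sketch below groups test predictions by occlusion level and reports the accuracy for each group alongside the chance level. It is not the authors' evaluation code. The function name `accuracy_by_occlusion`, the variable names, and the toy data are all hypothetical, and the chance level assumes the five object categories of the NORB dataset.

```python
from collections import defaultdict

# Assumption: NORB has five object categories, so chance accuracy is 1/5.
NUM_CLASSES = 5
CHANCE = 1 / NUM_CLASSES


def accuracy_by_occlusion(labels, predictions, occlusion_levels):
    """Return {occlusion_level: fraction of correct predictions at that level}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y_true, y_pred, level in zip(labels, predictions, occlusion_levels):
        total[level] += 1
        if y_true == y_pred:
            correct[level] += 1
    return {level: correct[level] / total[level] for level in sorted(total)}


# Toy, made-up data purely for illustration (not results from the paper).
labels      = [0, 1, 2, 3, 4, 0, 1, 2]
predictions = [0, 1, 2, 3, 0, 1, 1, 0]
occlusion   = [0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.5]

result = accuracy_by_occlusion(labels, predictions, occlusion)
for level, acc in result.items():
    print(f"occlusion={level:.1f}  accuracy={acc:.2f}  chance={CHANCE:.2f}")
```

Running the same function on the predictions of each trained network, over the same test images, would give one accuracy-versus-occlusion series per training condition for comparison against the chance level.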