Research Article
Objects Classification by Learning-Based Visual Saliency Model and Convolutional Neural Network
Figure 6
Architecture of our CNN. The network contains four convolutional layers (C1 to C4) with kernel sizes of 5, 5, 5, and 4 pixels and 9, 18, 36, and 72 feature maps, respectively, each with a stride of 1. The two subsampling layers (S1 and S2) each use a window of 2 pixels with a stride of 1. The network takes a 3000-dimensional feature vector as input and produces a 648-dimensional feature vector as output.
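The layer sizes in the caption can be checked with the usual output-size rule for an unpadded sliding window, out = floor((in - kernel) / stride) + 1. The sketch below is illustrative only and rests on several assumptions that the caption does not state: the layer order C1, S1, C2, S2, C3, C4, the absence of padding, square feature maps, and a hypothetical 20 x 20 input chosen only so that the trace ends at 3 x 3. It is not the paper's 3000-dimensional input, whose spatial layout the caption does not give. The names `out_size`, `LAYERS`, and `trace` are made up for this example.

```python
def out_size(size: int, kernel: int, stride: int = 1) -> int:
    """Output length along one axis for an unpadded sliding window."""
    return (size - kernel) // stride + 1


# Kernel sizes and strides as stated in the caption.
# ASSUMPTION: the interleaved layer order below is not given in the caption.
LAYERS = [
    ("C1", 5, 1),
    ("S1", 2, 1),
    ("C2", 5, 1),
    ("S2", 2, 1),
    ("C3", 5, 1),
    ("C4", 4, 1),
]

# Number of feature maps produced by C1..C4, as stated in the caption.
FEATURE_MAPS = [9, 18, 36, 72]


def trace(size: int, layers=LAYERS):
    """Return the side length of the (assumed square) map after each layer."""
    sizes = []
    for name, kernel, stride in layers:
        size = out_size(size, kernel, stride)
        sizes.append((name, size))
    return sizes


# ASSUMPTION: a hypothetical 20 x 20 input, used only for illustration.
steps = trace(20)
final_side = steps[-1][1]

# Flattened output: number of final feature maps times values per map.
output_dim = FEATURE_MAPS[-1] * final_side ** 2

for name, side in steps:
    print(f"{name}: {side} x {side}")
print("flattened output:", output_dim)
```

Under these assumptions the final layer yields 72 maps of 3 x 3 values, which matches the stated 648-dimensional output (648 = 72 x 9). A different layer order or subsampling stride would change the intermediate sizes.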