Research Article
Combined Auxiliary Networks and Bird’s Eye View Method for Real-Time Multicategory Object Recognition
Figure 4
The architecture of the auxiliary network. The blue dashed lines in the scene mark the voxel boundaries, and the green box is the ground truth bounding box. The yellow and blue points are the voxels’ highest points that lie inside and outside the bounding boxes, respectively. The red point is the center of the ground truth bounding box. The boxwise feature is passed through two fully connected layers (customarily called “Dense” layers), which generate bounding box regression values similar to Rgt (see equation (4)).
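
The two steps named in the caption, marking which voxel highest points fall inside a ground truth box and passing a boxwise feature through two fully connected layers, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the axis-aligned box test (real ground truth boxes are usually rotated), the ReLU between the two layers, the random weights, and the dimensions (16 input features, 8 hidden units, 7 regression outputs) are all assumptions chosen for illustration.

```python
import random


def points_in_box(points, center, size):
    """Flag each (x, y, z) point that lies inside an axis-aligned box.

    Assumption: the box is axis-aligned; rotation is ignored for brevity.
    """
    flags = []
    for p in points:
        inside = all(abs(p[i] - center[i]) <= size[i] / 2.0 for i in range(3))
        flags.append(inside)
    return flags


def dense(x, weights, bias, relu=False):
    """Fully connected layer: y = W x + b, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out


def init_layer(n_in, n_out, rng):
    """Create small random weights and zero biases (untrained placeholder)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def regression_head(boxwise_feature, layer1, layer2):
    """Two stacked dense layers; the ReLU in between is an assumption."""
    hidden = dense(boxwise_feature, *layer1, relu=True)
    return dense(hidden, *layer2)


rng = random.Random(0)
FEATURE_DIM, HIDDEN_DIM, REG_DIM = 16, 8, 7  # hypothetical sizes

layer1 = init_layer(FEATURE_DIM, HIDDEN_DIM, rng)
layer2 = init_layer(HIDDEN_DIM, REG_DIM, rng)

# Two example "highest points": one inside the box, one outside.
points = [(0.2, 0.1, 0.5), (3.0, 0.0, 0.4)]
flags = points_in_box(points, center=(0.0, 0.0, 0.5), size=(2.0, 1.0, 1.5))

# Placeholder boxwise feature; in the network it would be derived from the scene.
feature = [rng.uniform(-1.0, 1.0) for _ in range(FEATURE_DIM)]
regression = regression_head(feature, layer1, layer2)
```

Here `flags` plays the role of the yellow/blue distinction in the figure, and `regression` stands in for the predicted regression values that would be compared against Rgt during training.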