PointTransformer: Encoding Human Local Features for Small Target Detection

<div>(a) The original YOLOv5 head layer is shown on the left. (b) Our proposed head layer with positional feature mapping is shown on the right. Mapping the positional features to the output layer further strengthens the model's dependence on local features.</div>

Computational Intelligence and Neuroscience

Figure 5
