Research Article

Designing Compact Convolutional Filters for Lightweight Human Pose Estimation

Table 3

Comparison of results on the MSCOCO test-dev2017 set. #Params and FLOPs are calculated for the pose estimation network only; those for human detection are not included.

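| Method | Backbone | Input | #Params | GFLOPs | AP | AP50 | AP75 | APM | APL | AR |
|---|---|---|---|---|---|---|---|---|---|---|
| Bottom-up: key point detection and grouping | | | | | | | | | | |
| OpenPose [34] | | | | | 61.8 | 84.9 | 67.5 | 57.1 | 68.2 | 66.5 |
| Associative embedding [35] | | | | | 65.5 | 86.6 | 72.3 | 60.6 | 72.6 | 70.2 |
| PersonLab [4] | | | | | 68.7 | 89 | 75.4 | 64.1 | 75.5 | 75.4 |
| MultiPoseNet [36] | | | | | 69.6 | 86.3 | 76.6 | 65.0 | 76.3 | 73.5 |
| HigherHRNet [37] | HRNet-W32 | | 28.6M | 47.9 | 66.4 | 87.5 | 72.8 | 61.2 | 74.2 | |
| Top-down: human detection and single-person key point detection | | | | | | | | | | |
| Large network | | | | | | | | | | |
| Mask-RCNN [22] | ResNet-50-FPN | | | | 63.1 | 87.3 | 68.7 | 57.8 | 71.4 | |
| G-RMI [15] | ResNet-101 | | 42.6M | 57 | 64.9 | 85.5 | 71.3 | 62.3 | 70.0 | 69.7 |
| IPR [27] | ResNet-101 | | 45.0M | 11 | 67.8 | 88.2 | 74.8 | 63.9 | 74.0 | |
| RMPE [38] | PyraNet [39] | | 28.1M | 26.7 | 72.3 | 89.2 | 79.1 | 68.0 | 78.6 | |
| CPN [28] | | | | | 72.1 | 91.4 | 80.0 | 68.7 | 77.2 | 78.5 |
| SimpleBaseline [25] | ResNet-152 | | 68.6M | 35.6 | 73.7 | 91.9 | 81.1 | 70.3 | 80.0 | 79.0 |
| Small network | | | | | | | | | | |
| MobileNetV2 [19] | MobileNetV2 | | 9.8M | 3.33 | 66.8 | 90.0 | 74.0 | 62.6 | 73.3 | 72.3 |
| ShuffleNetV2 [33] | ShuffleNetV2 | | 7.6M | 2.87 | 62.9 | 88.5 | 69.4 | 58.9 | 69.3 | 68.9 |
| Small HRNet [17] | HRNet-W16 | | 1.3M | 1.21 | 55.2 | 85.8 | 61.4 | 51.7 | 61.2 | 61.5 |
| Lite-HRNet [17] | Lite-HRNet-18 | | 1.1M | 0.45 | 66.9 | 89.4 | 74.4 | 64.0 | 72.2 | 72.6 |
| MobilePoseNet | MobileNetV3 [13] | | 1.5M | 0.55 | 64.8 | 88.8 | 72.4 | 61.9 | 70.2 | 70.7 |
| MobilePoseNet | MobileNetV3 | | 1.5M | 1.23 | 67.4 | 89.4 | 74.2 | 64.1 | 73.3 | 73.3 |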
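
One way to read the small-network rows is as accuracy obtained per unit of computation. The following is a minimal sketch for illustration only: the AP-per-GFLOP ratio is not a metric reported in the table, the names `SMALL_NETWORKS`, `ap_per_gflop`, and `best_ap` are hypothetical, and the code assumes that the first accuracy value in each row is the overall AP.

```python
# Small-network rows copied from Table 3: (method, #Params in M, GFLOPs, AP).
# Assumption: the first accuracy value in each row is the overall AP.
# The two MobilePoseNet rows are told apart here by their GFLOPs only,
# because the table as extracted does not preserve their input sizes.
SMALL_NETWORKS = [
    ("MobileNetV2", 9.8, 3.33, 66.8),
    ("ShuffleNetV2", 7.6, 2.87, 62.9),
    ("Small HRNet", 1.3, 1.21, 55.2),
    ("Lite-HRNet", 1.1, 0.45, 66.9),
    ("MobilePoseNet (0.55 GFLOPs)", 1.5, 0.55, 64.8),
    ("MobilePoseNet (1.23 GFLOPs)", 1.5, 1.23, 67.4),
]


def ap_per_gflop(rows):
    """Return (method, AP / GFLOPs) pairs, most efficient first."""
    ratios = [(name, ap / gflops) for name, _params, gflops, ap in rows]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


def best_ap(rows):
    """Return the full row with the highest AP."""
    return max(rows, key=lambda row: row[3])


if __name__ == "__main__":
    for name, ratio in ap_per_gflop(SMALL_NETWORKS):
        print(f"{name:30s} {ratio:7.1f} AP per GFLOP")
    print("Highest AP:", best_ap(SMALL_NETWORKS)[0])
```

Under this illustrative ratio, Lite-HRNet ranks first and MobileNetV2 last, while the 1.23-GFLOP MobilePoseNet row has the highest AP (67.4) among the small networks. A single ratio hides the trade-off between the two quantities, so it is best read alongside the raw columns.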