Abstract:
Human pose estimation is a challenging task that requires an understanding of pose structure. The task can be framed as spatial relation inference within a pose structure model, where the critical question is how to model dynamic spatial relations in the presence of unreliable joints. To this end, we propose a Distilling Dynamic Spatial Relation network (DDSR), which builds a pose-based graph representation by extracting spatial relation features from the location distribution of joints. A dynamic message propagation mechanism updates the spatial relations on the edges. Specifically, to filter out noisy predictions, we select only joints with high confidence; to capture spatial relations over a large receptive field, we propagate messages among joints in multiple stages. In addition, to reduce the computation cost of multi-stage message propagation, we design a cross-resolution distillation framework, with a new spatial distillation loss that enforces consistency between the spatial relations of the teacher model and those of the student model. Experimental results on the COCO and MPII datasets show that our method outperforms state-of-the-art methods. Visualization results further demonstrate the interpretability of the learned spatial relations.
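The three ideas summarized above (confidence-based joint selection, multi-stage message propagation, and a spatial distillation loss across resolutions) can be illustrated with a minimal Python sketch. This is not the DDSR implementation. The function names, the 0.5 threshold, the neighbor-averaging update, the use of normalized pairwise offsets as the "spatial relation", and the mean-squared form of the loss are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from itertools import combinations


def select_confident(joints, threshold=0.5):
    """Return indices of joints (x, y, confidence) with confidence >= threshold.
    The threshold value is an assumption, not taken from the paper."""
    return [i for i, (_, _, c) in enumerate(joints) if c >= threshold]


def edge_offsets(joints, keep, size):
    """Hypothetical edge feature: offset between each pair of kept joints,
    normalized by image size so different resolutions are comparable."""
    w, h = size
    return {
        (i, j): ((joints[j][0] - joints[i][0]) / w,
                 (joints[j][1] - joints[i][1]) / h)
        for i, j in combinations(keep, 2)
    }


def propagate(node_feats, edges, stages=2):
    """Toy multi-stage propagation: at each stage a node's scalar feature
    becomes the mean of its own value and its neighbors' values."""
    neighbors = {i: [] for i in node_feats}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    feats = dict(node_feats)
    for _ in range(stages):
        feats = {
            i: (feats[i] + sum(feats[j] for j in nbrs)) / (1 + len(nbrs))
            for i, nbrs in neighbors.items()
        }
    return feats


def spatial_distillation_loss(teacher_edges, student_edges):
    """Assumed loss form: mean squared difference between teacher and
    student edge offsets over the edges both graphs share."""
    shared = teacher_edges.keys() & student_edges.keys()
    if not shared:
        return 0.0
    total = 0.0
    for e in shared:
        (tx, ty), (sx, sy) = teacher_edges[e], student_edges[e]
        total += (tx - sx) ** 2 + (ty - sy) ** 2
    return total / len(shared)


# Teacher predicts at 256x256, student at 64x64 (made-up coordinates).
teacher = [(64, 128, 0.9), (128, 128, 0.8), (192, 64, 0.2)]
student = [(16, 32, 0.9), (36, 32, 0.8), (48, 16, 0.2)]

keep = select_confident(teacher)  # the low-confidence third joint is dropped
t_edges = edge_offsets(teacher, keep, (256, 256))
s_edges = edge_offsets(student, keep, (64, 64))
loss = spatial_distillation_loss(t_edges, s_edges)

# On a chain 0-1-2, information from joint 0 reaches joint 2 only at stage 2.
chain = [(0, 1), (1, 2)]
one_stage = propagate({0: 1.0, 1: 0.0, 2: 0.0}, chain, stages=1)
two_stage = propagate({0: 1.0, 1: 0.0, 2: 0.0}, chain, stages=2)
```

In this toy setting, normalizing offsets by image size is what lets a low-resolution student be compared with a high-resolution teacher, and the chain example shows why more propagation stages widen the set of joints that can influence one another. In this sketch the teacher's confident joints are reused for the student, which is again an assumption.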