Authors: Márton Véges, András Lőrincz
ArXiv: 1904.05947
Abstract URL: http://arxiv.org/abs/1904.05947v1
The common approach to 3D human pose estimation is predicting the body joint
coordinates relative to the hip. This works well for a single person but is
insufficient in the case of multiple interacting people. Methods predicting
absolute coordinates first estimate a root-relative pose and then calculate the
translation via a secondary optimization task. We propose a neural network that
predicts joints in a camera-centered coordinate system instead of a
root-relative one. Unlike previous methods, our network works in a single step
without any post-processing. Our network outperforms previous methods on the
MuPoTS-3D dataset, achieving state-of-the-art results.
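
To illustrate why root-relative coordinates are insufficient for multiple people, the following minimal sketch compares the two representations. It is not the authors' network or code: the function name `to_root_relative`, the two-joint skeleton (hip, head), and all coordinate values are hypothetical and chosen only for illustration. It shows that subtracting each person's hip position discards the translation between people, while camera-centered coordinates keep it.

```python
from math import dist
from typing import List, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) in millimetres


def to_root_relative(pose: List[Joint], root_index: int = 0) -> List[Joint]:
    """Express every joint of one person relative to that person's root (hip)."""
    rx, ry, rz = pose[root_index]
    return [(x - rx, y - ry, z - rz) for x, y, z in pose]


# Two hypothetical people, each given as [hip, head] in camera-centered
# coordinates. The values are made up for this example.
person_a = [(0.0, 0.0, 3000.0), (0.0, -700.0, 3000.0)]
person_b = [(1000.0, 0.0, 4000.0), (1000.0, -700.0, 4000.0)]

rel_a = to_root_relative(person_a)
rel_b = to_root_relative(person_b)

# Camera-centered poses preserve how far apart the two people are.
hip_distance_absolute = dist(person_a[0], person_b[0])

# Root-relative poses place both hips at the origin, so the distance vanishes
# and the two poses become indistinguishable.
hip_distance_relative = dist(rel_a[0], rel_b[0])

print(f"absolute hip distance: {hip_distance_absolute:.1f} mm")
print(f"relative hip distance: {hip_distance_relative:.1f} mm")
print("root-relative poses identical:", rel_a == rel_b)
```

In this toy setup the two people stand in the same pose at different positions, so their root-relative poses coincide. Recovering their spatial relationship would require the per-person translation, which is the quantity that two-stage methods estimate in a separate optimization step.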