Tiny People dataset
• We introduce a new Tiny People dataset, which contains 200 real scene images
capturing people at a distance (the dataset is used for testing only)
Tiny People Pose
Lukáš Neumann and Andrea Vedaldi
Introduction
1. Andriluka, M., Pishchulin, L., Gehler, P., Schiele, B.: 2D human pose estimation: New benchmark and state of the art analysis, CVPR 2014
2. Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation, ECCV 2016
3. Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2D pose estimation using part affinity fields, CVPR 2017
4. Hu, P., Ramanan, D.: Finding tiny faces, CVPR 2017
Visual Geometry Group, Department of Engineering Science, University of Oxford
• Goal: recognizing pose from tiny images of people, down to 24px high
• Tiny People are important in surveillance, autonomous driving, etc.
• Challenge: low resolution data is extremely ambiguous
• Approach
• Models data uncertainty via probability distributions
• A CNN that outputs a distribution over possible body joints from a small
image of a person
• A probabilistic variant of keypoint localization using dense heat maps
• Evaluation
• Downsampled standard benchmarks (MPII Human Pose and MS-COCO)
• New Tiny People dataset of “real” low-resolution people
• On small people, our method outperforms both standard models (Stacked Hourglass and Part Affinity Fields)
References
This research was supported by
Probabilistic Formulation
• Our model emits a continuous Gaussian distribution for each keypoint u by
estimating Gaussian Mixture Model parameters over a coarse 16 × 16 feature
map Ωd generated over the whole image
• Output: a continuous distribution over possible body joint configurations:
• Training: minimize the negative log-likelihood of the ground truth keypoint locations
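The training objective above can be illustrated with a minimal sketch. It assumes a mixture of isotropic 2D Gaussians, p(u) = Σ_v w_v N(u; μ_v, σ_v² I), with one component per feature-map cell; the isotropic covariance, the function names `gmm_nll` and `mixture_mean`, and the toy values are assumptions for illustration and are not taken from the poster.

```python
import math

def gmm_nll(keypoint, weights, means, sigmas):
    """Negative log-likelihood of a 2D keypoint under an isotropic Gaussian mixture.

    weights: positive mixture weights summing to 1 (assumed already normalized)
    means:   list of (x, y) component centres
    sigmas:  list of per-component standard deviations (isotropic assumption)
    """
    ux, uy = keypoint
    log_terms = []
    for w, (mx, my), s in zip(weights, means, sigmas):
        sq_dist = (ux - mx) ** 2 + (uy - my) ** 2
        # log of the 2D isotropic Gaussian density
        log_pdf = -sq_dist / (2 * s * s) - math.log(2 * math.pi * s * s)
        log_terms.append(math.log(w) + log_pdf)
    # log-sum-exp for numerical stability
    m = max(log_terms)
    log_p = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return -log_p

def mixture_mean(weights, means):
    """Expected keypoint location; it is continuous, so not tied to the grid."""
    x = sum(w * mx for w, (mx, _) in zip(weights, means))
    y = sum(w * my for w, (_, my) in zip(weights, means))
    return x, y

# Toy example: two neighbouring cells share the probability mass.
weights = [0.5, 0.5]
means = [(5.0, 5.0), (6.0, 5.0)]
sigmas = [1.0, 1.0]
loss = gmm_nll((5.5, 5.0), weights, means, sigmas)
estimate = mixture_mean(weights, means)
```

In this sketch, a larger `sigmas` entry spreads the density and lowers the penalty for a distant ground truth, which is one way a model could express uncertainty on ambiguous inputs.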
MPII Human Pose dataset
• Evaluation on the standard MPII Human Pose dataset [1], where we reduced the
resolution to approximate people seen at a distance
• Standard-height (128px): comparable to the state-of-the-art Stacked Hourglass [2]
and better than Part Affinity Fields [3]
• Small people (<64px): our method performs significantly better
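The poster does not state how the resolution was reduced. As a rough sketch of one possible protocol, the code below shrinks a grayscale image by block averaging and rescales the keypoint annotations to match a target person height. The helper names `rescale_keypoints` and `box_downsample`, the box filter, and the example sizes are all assumptions.

```python
def rescale_keypoints(keypoints, person_height, target_height):
    """Scale (x, y) annotations so a person of person_height px becomes target_height px."""
    scale = target_height / person_height
    return [(x * scale, y * scale) for x, y in keypoints]

def box_downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a grayscale image (list of rows)."""
    out_h = len(image) // factor
    out_w = len(image[0]) // factor
    out = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            block = [image[by * factor + dy][bx * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

# A 256 px tall person reduced to 64 px: coordinates shrink by a factor of 4.
small_keypoints = rescale_keypoints([(100, 200)], person_height=256, target_height=64)
small_image = box_downsample([[0, 2], [4, 6]], factor=2)
```

A real pipeline would more likely use an image library's resize routine; block averaging is used here only to keep the sketch dependency-free.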
[Figure panels: detection accuracy, model surprise, regression error]
[Figure: person height histogram of human pose datasets]
• We also train the Tiny Faces detector [4] for person detection and use its
detections as input to our method
• We have shown that modelling uncertainty explicitly in a deep network can significantly boost accuracy for ambiguous data, such as low-resolution pose
recognition
Standard Formulation
• The standard approach [2] for landmark detection is to output one heatmap per
joint
• Heatmaps are fitted to the ground truth data by an L2 per-pixel regression of a
heuristic Gaussian-like kernel around the ground truth landmark location
• Limitations:
• Heatmaps only convey information about the
location – they may appear to encode uncertainty, but they do not
• The accuracy is bounded by the heatmap
resolution
• The model allows for modelling data uncertainty (by increasing the variance of
corresponding Gaussians), as well as for sub-pixel accuracy
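The resolution limit of the heatmap formulation can be seen in a small sketch. It builds a Gaussian-like target on an assumed 16 × 16 grid and decodes the joint by taking the highest-scoring cell. The grid size, the kernel width `sigma`, and the function names are illustrative assumptions, not the settings of any cited model.

```python
import math

def make_heatmap(center, size=16, sigma=1.0):
    """Gaussian-like regression target centred on a (possibly fractional) location."""
    cx, cy = center
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)]
            for y in range(size)]

def decode_argmax(heatmap):
    """Return the integer (x, y) cell with the highest score."""
    best = max((v, x, y) for y, row in enumerate(heatmap) for x, v in enumerate(row))
    return best[1], best[2]

# Ground truth at a fractional position, in heatmap coordinates.
truth = (5.4, 9.7)
decoded = decode_argmax(make_heatmap(truth))
# Argmax decoding can only return whole cells, so the fractional part is lost.
error = (abs(decoded[0] - truth[0]), abs(decoded[1] - truth[1]))
```

Even with a perfect heatmap, argmax decoding snaps the estimate to the nearest cell, so the error per axis can reach half a cell. A continuous density over locations has no such grid constraint.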