Mixture Dense Regression for Object Detection and Human Pose Estimation

12/02/2019
by Ali Varamesh, et al.

Mixture models are well-established machine learning approaches that, in computer vision, have mostly been applied to inverse or ill-defined problems. However, they are general-purpose divide-and-conquer techniques that split the input space into relatively homogeneous subsets in a data-driven manner. Therefore, well-defined but complex problems should benefit from them as well. To this end, we devise a multi-modal solution for spatial regression using mixture density networks for dense object detection and human pose estimation. For both tasks, we show that a mixture model converges faster, yields higher accuracy, and divides the input space into interpretable modes. For object detection, the mixture components learn to focus on object scale, with the distribution of components closely following the distribution of ground-truth object scale. For human pose estimation, the mixture model divides the data by viewpoint and uncertainty, namely front and back views, with the back view imposing higher uncertainty. We conduct our experiments on the MS COCO dataset and do not observe any mode collapse. However, to avoid numerical instabilities, we had to slightly modify the activation function for the mixture variance terms.
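To make the mixture density idea concrete, the following is a minimal sketch of the negative log-likelihood of a scalar regression target under a 1-D Gaussian mixture, written in plain Python. It is not the authors' implementation. The function names (`softplus`, `log_softmax`, `mdn_nll`) and the `min_sigma` floor are hypothetical, and the variance activation shown here (softplus plus a small constant) is only an assumed stand-in, since the abstract does not say which modified activation was used.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_softmax(logits):
    # Log of the mixture weights, computed with the log-sum-exp trick.
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def mdn_nll(target, logits, means, raw_scales, min_sigma=1e-3):
    """Negative log-likelihood of a scalar target under a Gaussian mixture.

    logits, means and raw_scales hold one entry per mixture component and
    stand in for the raw outputs of a network head (an assumption here).
    """
    log_pi = log_softmax(logits)
    log_terms = []
    for lp, mu, s in zip(log_pi, means, raw_scales):
        # Assumed activation: softplus with a floor, so sigma never reaches 0.
        sigma = softplus(s) + min_sigma
        z = (target - mu) / sigma
        log_gauss = -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z
        log_terms.append(lp + log_gauss)
    # Log-sum-exp over components keeps the mixture likelihood stable.
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))
```

Flooring the standard deviation is one common way to keep the `log(sigma)` and `1/sigma` terms bounded when a component collapses onto a single target; other choices would serve the same purpose.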

