Hypothesize and Bound: A Computational Focus of Attention Mechanism for Simultaneous N-D Segmentation, Pose Estimation and Classification Using Shape Priors

04/13/2011
by Diego Rother, et al.

Given the ever-increasing bandwidth of the visual information available to many intelligent systems, it is becoming essential to endow them with a sense of what is worth their attention and what can be safely disregarded. This article presents a general mathematical framework to efficiently allocate the available computational resources to the parts of the input that are relevant to solving a given perceptual problem. By solving a perceptual problem we mean finding the hypothesis H (i.e., the state of the world) that maximizes a function L(H), which represents how well each hypothesis "explains" the input. Given the large bandwidth of the sensory input, fully evaluating L(H) for each hypothesis H is computationally infeasible (e.g., because it would imply checking a large number of pixels). To address this problem we propose a mathematical framework with two key ingredients. The first is a Bounding Mechanism (BM) that computes lower and upper bounds of L(H) for a given computational budget. These bounds are much cheaper to compute than L(H) itself, can be refined at any time by increasing the budget allocated to a hypothesis, and are frequently sufficient to discard a hypothesis. To compute these bounds, we develop a novel theory of shapes and shape priors. The second ingredient is a Focus of Attention Mechanism (FoAM) that selects which hypothesis' bounds should be refined next, with the goal of discarding non-optimal hypotheses with the least amount of computation. The proposed framework: 1) is very efficient, since most hypotheses are discarded with minimal computation; 2) is parallelizable; 3) is guaranteed to find the globally optimal hypothesis; and 4) has a running time that depends on the problem at hand, not on the bandwidth of the input. We instantiate the proposed framework for the problem of simultaneously estimating the class, pose, and a noiseless version of a 2D shape in a 2D image.
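
The interplay between the two ingredients can be illustrated with a toy sketch. It is not the paper's Bounding Mechanism (which relies on the theory of shapes and shape priors) nor its actual FoAM policy, neither of which is specified in this abstract. The sketch assumes that L(H) is a sum of per-element terms, each known to lie in [lo, hi], so that evaluating only some terms yields valid lower and upper bounds. It also assumes a simple selection rule: refine the surviving hypothesis with the largest upper bound. All names (`Hypothesis`, `hypothesize_and_bound`, `step`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Hypothesis:
    terms: Sequence[float]  # per-element contributions to L(H), assumed in [lo, hi]
    evaluated: int = 0      # number of terms computed so far (the "budget" spent)
    partial: float = 0.0    # sum of the terms computed so far

    def bounds(self, lo: float, hi: float) -> Tuple[float, float]:
        # Unevaluated terms contribute at least lo and at most hi each.
        remaining = len(self.terms) - self.evaluated
        return self.partial + remaining * lo, self.partial + remaining * hi

    def refine(self, step: int) -> int:
        # Spend more budget on this hypothesis; returns the number of terms evaluated.
        end = min(self.evaluated + step, len(self.terms))
        self.partial += sum(self.terms[self.evaluated:end])
        spent = end - self.evaluated
        self.evaluated = end
        return spent


def hypothesize_and_bound(term_lists: List[Sequence[float]],
                          lo: float = 0.0, hi: float = 1.0,
                          step: int = 1) -> Tuple[int, int]:
    """Return (index of the maximizing hypothesis, number of terms evaluated)."""
    hyps = [Hypothesis(t) for t in term_lists]
    active = set(range(len(hyps)))
    work = 0
    while True:
        bounds = {i: hyps[i].bounds(lo, hi) for i in active}
        best_lower = max(b[0] for b in bounds.values())
        # Discard every hypothesis whose upper bound cannot reach the best lower bound.
        active = {i for i in active if bounds[i][1] >= best_lower}
        refinable = sorted(i for i in active
                           if hyps[i].evaluated < len(hyps[i].terms))
        if len(active) == 1 or not refinable:
            break
        # Assumed selection rule: refine the most promising upper bound next.
        target = max(refinable, key=lambda i: bounds[i][1])
        work += hyps[target].refine(step)
    winner = max(active, key=lambda i: hyps[i].bounds(lo, hi)[0])
    return winner, work


if __name__ == "__main__":
    candidates = [
        [0.9, 0.8, 0.9, 0.7],
        [0.1, 0.0, 0.2, 0.1],
        [0.5, 0.4, 0.6, 0.5],
    ]
    winner, work = hypothesize_and_bound(candidates)
    print(f"best hypothesis: {winner}, terms evaluated: {work} of 12")
```

Because a hypothesis is discarded only when its upper bound falls strictly below another hypothesis' lower bound, the survivor is a global maximizer of L under the stated assumption on the terms, and weak hypotheses are typically eliminated after evaluating only a fraction of their terms.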
