Self-taught Object Localization with Deep Networks

09/13/2014
by   Loris Bazzani, et al.
0

This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training. The key idea is to analyze the change in the recognition scores when artificially masking out different regions of the image. The masking out of a region that includes the object typically causes a significant drop in recognition score. This idea is embedded into an agglomerative clustering technique that generates self-taught localization hypotheses. Our object localization scheme outperforms existing proposal methods in both precision and recall for small number of subwindow proposals (e.g., on ILSVRC-2012 it produces a relative gain of 23.4 state-of-the-art for top-1 hypothesis). Furthermore, our experiments show that the annotations automatically-generated by our method can be used to train object detectors yielding recognition results remarkably close to those obtained by training on manually-annotated bounding boxes.

READ FULL TEXT

page 1

page 6

page 11

page 12

page 13

research
04/25/2019

RepPoints: Point Set Representation for Object Detection

Modern object detectors rely heavily on rectangular bounding boxes, such...
research
09/15/2020

Polyp-artifact relationship analysis using graph inductive learned representations

The diagnosis process of colorectal cancer mainly focuses on the localiz...
research
03/29/2017

Iterative Object and Part Transfer for Fine-Grained Recognition

The aim of fine-grained recognition is to identify sub-ordinate categori...
research
04/27/2016

Simultaneous Food Localization and Recognition

The development of automatic nutrition diaries, which would allow to kee...
research
03/05/2015

BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation

Recent leading approaches to semantic segmentation rely on deep convolut...
research
07/06/2022

GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation

The inherent ambiguity in ground-truth annotations of 3D bounding boxes ...
research
08/01/2018

TraMNet - Transition Matrix Network for Efficient Action Tube Proposals

Current state-of-the-art methods solve spatiotemporal action localisatio...

Please sign up or login with your details

Forgot password? Click here to reset