A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection

by   Cheng Zhang, et al.

Object frequencies in daily scenes follow a long-tailed distribution. Many objects do not appear frequently enough in scene-centric images (e.g., sightseeing, street views) for us to train accurate object detectors. In contrast, these objects are captured at a higher frequency in object-centric images, which are intended to picture the objects of interest. Motivated by this phenomenon, we propose to take advantage of the object-centric images to improve object detection in scene-centric images. We present a simple yet surprisingly effective framework to do so. On the one hand, our approach turns an object-centric image into a useful training example for object detection in scene-centric images by mitigating the domain gap between the two image sources in both the input and label space. On the other hand, our approach employs a multi-stage procedure to train the object detector, such that the detector learns the diverse object appearances from object-centric images while being tied to the application domain of scene-centric images. On the LVIS dataset, our approach can improve the object detection (and instance segmentation) accuracy of rare objects by 50 performance of other classes.



page 1

page 3

page 4

page 5

page 9

page 14

page 16

page 17


Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection

Training on datasets with long-tailed distributions has been challenging...

SWA Object Detection

Do you want to improve 1.0 AP for your object detector without any infer...

Seeing What Is Not There: Learning Context to Determine Where Objects Are Missing

Most of computer vision focuses on what is in an image. We propose to tr...

Stitcher: Feedback-driven Data Provider for Object Detection

Object detectors commonly vary quality according to scales, where the pe...

Context Forest for efficient object detection with large mixture models

We present Context Forest (ConF), a technique for predicting properties ...

HyperDet3D: Learning a Scene-conditioned 3D Object Detector

A bathtub in a library, a sink in an office, a bed in a laundry room – t...

Simple multi-dataset detection

How do we build a general and broad object detection system? We use all ...

Code Repositories


Mosaic of Object-centric Images as Scene-centric Images (MosaicOS) for long-tailed object detection and instance segmentation.

view repo
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.