Learning to Detect Every Thing in an Open World

12/03/2021
by Kuniaki Saito, et al.

Many open-world applications require the detection of novel objects, yet state-of-the-art object detection and instance segmentation networks do not excel at this task. The key issue lies in their assumption that regions without any annotations should be suppressed as negatives, which teaches the model to treat unannotated objects as background. To address this issue, we propose a simple yet surprisingly powerful data augmentation and training scheme we call Learning to Detect Every Thing (LDET). To avoid suppressing hidden objects (background objects that are visible but unlabeled), we paste annotated objects onto a background image sampled from a small region of the original image. Since training solely on such synthetically augmented images suffers from domain shift, we decouple the training into two parts: 1) training the region classification and regression head on augmented images, and 2) training the mask heads on original images. In this way, the model does not learn to classify hidden objects as background, while still generalizing well to real images. LDET leads to significant improvements on many datasets in the open-world instance segmentation task, outperforming baselines on cross-category generalization on COCO, as well as on cross-dataset evaluation on UVO and Cityscapes.
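The pasting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `crop_frac` value of 1/8, and the nearest-neighbor resizing are all assumptions made for illustration. The sketch crops a small region of the image, stretches it to the full image size to form a synthetic background, and copies the annotated object pixels on top of it.

```python
import numpy as np


def synthesize_background(image, crop_frac=0.125, rng=None):
    """Build a full-size background from a small random crop of `image`.

    `crop_frac` and nearest-neighbor upsampling are illustrative assumptions,
    not values or choices taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    ch, cw = max(1, int(h * crop_frac)), max(1, int(w * crop_frac))
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    crop = image[top:top + ch, left:left + cw]
    # Nearest-neighbor resize: map every output row/column to a crop index.
    rows = np.arange(h) * ch // h
    cols = np.arange(w) * cw // w
    return crop[rows][:, cols]


def paste_annotated_objects(image, masks, crop_frac=0.125, rng=None):
    """Keep only annotated pixels; replace everything else with the
    synthetic background.

    image: (H, W, C) array.
    masks: (N, H, W) boolean array, one mask per annotated instance.
    """
    background = synthesize_background(image, crop_frac, rng)
    foreground = np.any(masks, axis=0)  # union of all annotated instances
    augmented = background.copy()
    augmented[foreground] = image[foreground]
    return augmented
```

In this sketch the annotations (masks and boxes) remain valid for the augmented image because the objects are pasted back at their original locations. The decoupled schedule from the abstract, with region classification and regression trained on augmented images and mask heads trained on original images, is not shown here.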

research
08/08/2023

Exploring Transformers for Open-world Instance Segmentation

Open-world instance segmentation is a rising task, which aims to segment...
research
02/19/2019

Augmentation for small object detection

In recent years, object detection has experienced impressive progress. D...
research
03/09/2023

MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation

Few-shot instance segmentation extends the few-shot learning paradigm to...
research
08/04/2017

Augmented Reality Meets Computer Vision : Efficient Data Generation for Urban Driving Scenes

The success of deep learning in computer vision is based on availability...
research
10/18/2022

Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics

State-of-the-art approaches in computer vision heavily rely on sufficien...
research
03/09/2023

Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision

Many top-down architectures for instance segmentation achieve significan...
research
08/29/2023

Unveiling Camouflage: A Learnable Fourier-based Augmentation for Camouflaged Object Detection and Instance Segmentation

Camouflaged object detection (COD) and camouflaged instance segmentation...
