Universal Instance Perception as Object Discovery and Retrieval

03/12/2023
by Bin Yan, et al.

All instance perception tasks aim to find certain objects specified by queries such as category names, language expressions, or target annotations, yet the field as a whole has been split into multiple independent subtasks. In this work, we present a universal instance perception model of the next generation, termed UNINEXT. UNINEXT reformulates diverse instance perception tasks into a unified object discovery and retrieval paradigm and can flexibly perceive different types of objects by simply changing the input prompts. This unified formulation brings two benefits: (1) large amounts of data from different tasks and label vocabularies can be exploited to jointly train general instance-level representations, which is especially beneficial for tasks that lack training data; (2) the unified model is parameter-efficient and avoids redundant computation when handling multiple tasks simultaneously. UNINEXT shows superior performance on 20 challenging benchmarks from 10 instance-level tasks, including classical image-level tasks (object detection and instance segmentation), vision-and-language tasks (referring expression comprehension and segmentation), and six video-level object tracking tasks. Code is available at https://github.com/MasterBin-IIAU/UNINEXT.
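
To make the "discovery and retrieval" idea concrete, here is a minimal toy sketch in Python. It is not taken from the UNINEXT code base: the `Instance` class, the `retrieve` function, the cosine-similarity matching, the threshold, and the hand-written embeddings are all assumptions chosen for illustration. It only shows the general pattern the abstract describes, where one set of discovered candidates is filtered differently depending on the prompt.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    """A candidate object from a hypothetical discovery stage."""
    box: tuple        # (x1, y1, x2, y2) in pixels
    embedding: list   # instance-level feature vector (toy values here)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(instances, prompt_embedding, threshold=0.5):
    """Keep discovered instances whose embedding matches the prompt,
    ordered from best to worst match. The threshold is an assumption."""
    scored = [(cosine(inst.embedding, prompt_embedding), inst) for inst in instances]
    kept = [(score, inst) for score, inst in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Toy stand-in for the discovery stage: three candidates in one image.
candidates = [
    Instance(box=(10, 10, 50, 80), embedding=[1.0, 0.0, 0.0]),
    Instance(box=(60, 20, 120, 90), embedding=[0.0, 1.0, 0.0]),
    Instance(box=(5, 40, 30, 70), embedding=[0.9, 0.1, 0.0]),
]

# Toy prompt embeddings; a real system would encode the prompt with a model.
prompts = {
    "category name": [1.0, 0.0, 0.0],
    "language expression": [0.0, 1.0, 0.0],
}

# The same candidates are reused; only the prompt changes.
results = {
    name: [inst.box for _, inst in retrieve(candidates, emb)]
    for name, emb in prompts.items()
}
print(results)
```

In this sketch the candidate set is computed once and only the prompt embedding varies, which mirrors the abstract's claim that different kinds of objects can be perceived by changing the input prompt.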
