Object Discovery from Motion-Guided Tokens

03/27/2023
by Zhipeng Bao, et al.

Object discovery – separating objects from the background without manual labels – is a fundamental open challenge in computer vision. Previous methods struggle to go beyond clustering of low-level cues, whether handcrafted (e.g., color, texture) or learned (e.g., from auto-encoders). In this work, we augment the auto-encoder representation learning framework with two key components: motion-guidance and mid-level feature tokenization. Although both have been separately investigated, we introduce a new transformer decoder showing that their benefits can compound thanks to motion-guided vector quantization. We show that our architecture effectively leverages the synergy between motion and tokenization, improving upon the state of the art on both synthetic and real datasets. Our approach enables the emergence of interpretable object-specific mid-level features, demonstrating the benefits of motion-guidance (no labeling) and quantization (interpretability, memory efficiency).
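As a rough illustration of the quantization step the abstract refers to, the sketch below shows generic vector quantization: each feature vector is replaced by its nearest entry in a codebook, and the entry's index serves as a discrete token. This is not the authors' motion-guided variant or their architecture. The function names (`squared_distance`, `quantize`), the toy codebook, and the toy features are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each feature vector to its nearest codebook entry.

    Returns the token indices and the quantized vectors. This is plain
    nearest-neighbour vector quantization (hypothetical helper, not the
    paper's motion-guided formulation).
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for feature in features:
        # Pick the codebook index with the smallest squared distance.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


if __name__ == "__main__":
    # Toy 2-D codebook with three entries (assumed values).
    codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    # Toy feature vectors, e.g. one per spatial location of a feature map.
    features = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]
    indices, quantized = quantize(features, codebook)
    print(indices)    # [0, 1, 2]
    print(quantized)  # [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
```

Because every feature collapses to one of a small number of codebook entries, the token indices are compact to store and can be inspected directly, which is the general sense in which quantization aids memory efficiency and interpretability.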


