3D Video Object Detection with Learnable Object-Centric Global Optimization

03/27/2023
by Jiawei He, et al.

We explore optimization based on long-term temporal visual correspondence for 3D video object detection. Visual correspondence refers to one-to-one mappings between pixels across multiple images. Correspondence-based optimization is the cornerstone of 3D scene reconstruction but is less studied in 3D video object detection, because moving objects violate multi-view geometry constraints and are treated as outliers during scene reconstruction. We address this issue by treating objects as first-class citizens during correspondence-based optimization. We propose BA-Det, an end-to-end optimizable object detector with object-centric temporal correspondence learning and featuremetric object bundle adjustment. Empirically, we verify the effectiveness and efficiency of BA-Det with multiple baseline 3D detectors under various setups. BA-Det achieves state-of-the-art (SOTA) performance on the large-scale Waymo Open Dataset (WOD) at only marginal computational cost. Our code is available at https://github.com/jiaweihe1996/BA-Det.
