Priors are Powerful: Improving a Transformer for Multi-camera 3D Detection with 2D Priors

01/31/2023
by Di Feng, et al.

Transformer-based approaches have advanced the recent development of multi-camera 3D detection in both academia and industry. In a vanilla transformer architecture, queries are randomly initialised and optimised over the whole dataset, without considering the differences among input frames. In this work, we propose to leverage the predictions from an image backbone, which is often highly optimised for 2D tasks, as priors for the transformer part of a 3D detection network. The method works by (1) augmenting image feature maps with 2D priors, (2) sampling query locations via ray-casting along 2D box centroids, and (3) initialising query features with object-level image features. Experimental results show that 2D priors not only help the model converge faster, but also improve the baseline approach by up to 12 in average precision.
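Step (2) can be illustrated with a minimal sketch. The abstract does not specify how the rays are sampled, so everything below is an assumption made for illustration: the function name `sample_query_points`, the pinhole intrinsics `K`, the `cam_to_ego` transform, and the fixed list of candidate depths are all hypothetical and are not taken from the paper.

```python
import numpy as np

def sample_query_points(box_xyxy, K, cam_to_ego, depths):
    """Cast a ray through the centre of a 2D box and sample 3D points on it.

    Hypothetical helper: assumes a pinhole camera with intrinsics K (3x3),
    a 4x4 camera-to-ego transform, and depths measured along the camera z-axis.
    """
    x1, y1, x2, y2 = box_xyxy
    # Homogeneous pixel coordinates of the box centroid.
    centre_px = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0, 1.0])
    # Back-project to a ray direction in the camera frame (z component = 1).
    ray_cam = np.linalg.inv(K) @ centre_px
    # One 3D point per candidate depth, shape (D, 3).
    pts_cam = depths[:, None] * ray_cam[None, :]
    # Move the points into the ego frame using homogeneous coordinates.
    pts_h = np.concatenate([pts_cam, np.ones((len(depths), 1))], axis=1)
    return (cam_to_ego @ pts_h.T).T[:, :3]

# Assumed example values, not from the paper.
K = np.array([[1000.0, 0.0, 800.0],
              [0.0, 1000.0, 450.0],
              [0.0, 0.0, 1.0]])
cam_to_ego = np.eye(4)
depths = np.array([5.0, 10.0, 20.0])

points = sample_query_points((900, 450, 1100, 650), K, cam_to_ego, depths)
```

Each returned point is a candidate 3D query location lying on the viewing ray of one 2D detection. How many depths to sample, and how the candidates are scored or filtered, are design choices the abstract leaves open.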

