RCDPT: Radar-Camera fusion Dense Prediction Transformer

11/04/2022
by Chen-Chou Lo, et al.

Transformer networks have recently outperformed traditional deep neural networks in natural language processing, and they show great potential in many computer vision tasks compared with convolutional backbones. In the original transformer, readout tokens serve as designated vectors that aggregate information from the other tokens. In a vision transformer, however, the benefit of readout tokens is limited. We therefore propose a novel fusion strategy that integrates radar data into a dense prediction transformer network by reassembling camera representations with radar representations. In place of readout tokens, the radar representations contribute additional depth information to a monocular depth estimation model and improve its performance. We further investigate different fusion approaches that are commonly used to integrate an additional modality into a dense prediction transformer network. The experiments are conducted on the nuScenes dataset, which includes camera images, lidar, and radar data. The results show that the proposed method outperforms both the commonly used fusion strategies and existing convolutional depth estimation models that fuse camera images and radar.
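The abstract does not spell out how camera and radar representations are reassembled, so the following is only a minimal sketch of one plausible reading: each camera patch token is concatenated with the radar token for the same patch (taking the place a readout token would otherwise occupy), projected back to the original width, and reshaped into an image-like feature map. The function name `reassemble_with_radar`, the parameters `weight` and `bias`, and the token shapes are all hypothetical and are not taken from the paper.

```python
import numpy as np


def reassemble_with_radar(camera_tokens, radar_tokens, weight, bias, grid_hw):
    """Fuse per-patch camera and radar tokens, then reshape to a feature map.

    camera_tokens, radar_tokens: (N, D) arrays, one row per image patch.
    weight: (2 * D, D) projection matrix; bias: (D,) vector.
    grid_hw: (H_p, W_p) patch grid with H_p * W_p == N.
    All names and shapes are illustrative assumptions, not the paper's API.
    """
    h, w = grid_hw
    if camera_tokens.shape != radar_tokens.shape:
        raise ValueError("camera and radar tokens must have the same shape")
    if camera_tokens.shape[0] != h * w:
        raise ValueError("number of tokens must equal H_p * W_p")

    # Concatenate along the feature axis: (N, D) + (N, D) -> (N, 2D).
    fused = np.concatenate([camera_tokens, radar_tokens], axis=-1)
    # Linear projection back to the original token width: (N, 2D) -> (N, D).
    projected = fused @ weight + bias
    # Lay the tokens back out on the patch grid: (N, D) -> (H_p, W_p, D).
    return projected.reshape(h, w, -1)


# Toy inputs: a 4 x 6 patch grid with 8-dimensional tokens.
rng = np.random.default_rng(0)
grid_hw = (4, 6)
dim = 8
num_tokens = grid_hw[0] * grid_hw[1]
camera = rng.normal(size=(num_tokens, dim))
radar = rng.normal(size=(num_tokens, dim))
weight = rng.normal(size=(2 * dim, dim)) * 0.1
bias = np.zeros(dim)

feature_map = reassemble_with_radar(camera, radar, weight, bias, grid_hw)
print(feature_map.shape)  # (4, 6, 8)
```

In a trained network the projection would be learned and followed by the decoder's resampling stages; the sketch covers only the token-level fusion step.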


research
09/30/2020

Depth Estimation from Monocular Images and Sparse Radar Data

In this paper, we explore the possibility of achieving a more accurate d...
research
09/21/2023

Multimodal Transformers for Wireless Communications: A Case Study in Beam Prediction

Wireless communications at high-frequency bands with large antenna array...
research
07/15/2021

Depth Estimation from Monocular Images and Sparse radar using Deep Ordinal Regression Network

We integrate sparse radar data into a monocular depth estimation model a...
research
12/17/2020

Multi-Modal Depth Estimation Using Convolutional Neural Networks

This paper addresses the problem of dense depth predictions from sparse ...
research
04/28/2022

Depth Estimation with Simplified Transformer

Transformer and its variants have shown state-of-the-art results in many...
research
12/28/2018

Spatiotemporal Data Fusion for Precipitation Nowcasting

Precipitation nowcasting using neural networks and ground-based radars h...
research
09/12/2022

Vision Transformer with Convolutional Encoder-Decoder for Hand Gesture Recognition using 24 GHz Doppler Radar

Transformers combined with convolutional encoders have been recently use...
