A Generalized Multi-Modal Fusion Detection Framework

03/13/2023
by Leichao Cui, et al.

LiDAR point clouds have become the most common data source in autonomous driving. However, because point clouds are sparse, accurate and reliable detection cannot be achieved in specific scenarios. Images complement point clouds and are therefore receiving increasing attention. Despite some success, existing fusion methods either perform hard fusion or do not fuse in a direct manner. In this paper, we propose MMFusion, a generic 3D detection framework that uses multi-modal features. The framework aims to achieve accurate fusion between LiDAR and images to improve 3D detection in complex scenes. It consists of two separate streams, a LiDAR stream and a camera stream, each of which is compatible with any single-modal feature extraction network. The Voxel Local Perception Module in the LiDAR stream enhances local feature representation, and the Multi-modal Feature Fusion Module then selectively combines the feature outputs of the two streams to achieve better fusion. Extensive experiments show that our framework not only outperforms existing baselines but also improves their detection performance, especially for cyclists and pedestrians on the KITTI benchmark, with strong robustness and generalization capabilities. We hope our work will stimulate more research into multi-modal fusion for autonomous driving tasks.
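
The abstract does not say how the Multi-modal Feature Fusion Module "selectively combines" the two streams. As a rough illustration only, the sketch below uses a simple per-channel sigmoid gate, which is a common generic choice and not necessarily what MMFusion does. The function name `gated_fusion`, the gate parameters, and the assumption that LiDAR and camera features are already aligned to the same length are all hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    lidar_feat: Sequence[float],
    camera_feat: Sequence[float],
    gate_w_lidar: Sequence[float],
    gate_w_camera: Sequence[float],
    gate_bias: Sequence[float],
) -> List[float]:
    """Blend two aligned feature vectors with a per-channel gate.

    Hypothetical stand-in for a selective fusion step: the gate g is
    computed from both inputs, and the output is g * lidar + (1 - g) * camera.
    In a real network the gate parameters would be learned.
    """
    lengths = {
        len(lidar_feat),
        len(camera_feat),
        len(gate_w_lidar),
        len(gate_w_camera),
        len(gate_bias),
    }
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same number of channels")

    fused = []
    for l, c, wl, wc, b in zip(
        lidar_feat, camera_feat, gate_w_lidar, gate_w_camera, gate_bias
    ):
        g = sigmoid(wl * l + wc * c + b)   # gate in (0, 1) for this channel
        fused.append(g * l + (1.0 - g) * c)  # convex blend of the two streams
    return fused


if __name__ == "__main__":
    lidar = [1.0, 2.0]
    camera = [3.0, 4.0]
    zeros = [0.0, 0.0]
    # With zero gate parameters every gate is 0.5, so the result is the mean.
    print(gated_fusion(lidar, camera, zeros, zeros, zeros))
```

With all gate parameters set to zero the gate is 0.5 in every channel and the output is the plain average of the two streams; a large positive bias pushes the output toward the LiDAR features, and a large negative bias toward the camera features.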

research
11/03/2022

PointSee: Image Enhances Point Cloud

There is a trend to fuse multi-modal information for 3D object detection...
research
08/24/2023

SkipcrossNets: Adaptive Skip-cross Fusion for Road Detection

Multi-modal fusion is increasingly being used for autonomous driving tas...
research
12/06/2022

Attention-Enhanced Cross-modal Localization Between 360 Images and Point Clouds

Visual localization plays an important role for intelligent robots and a...
research
03/17/2023

PersonalTailor: Personalizing 2D Pattern Design from 3D Garment Point Clouds

Garment pattern design aims to convert a 3D garment to the corresponding...
research
05/11/2023

Multi-modal Multi-level Fusion for 3D Single Object Tracking

3D single object tracking plays a crucial role in computer vision. Mains...
research
05/14/2022

Multi-modal curb detection and filtering

Reliable knowledge of road boundaries is critical for autonomous vehicle...
research
09/25/2022

From One to Many: Dynamic Cross Attention Networks for LiDAR and Camera Fusion

LiDAR and cameras are two complementary sensors for 3D perception in aut...
