LiDAR-based Online 3D Video Object Detection with Graph-based Message Passing and Spatiotemporal Transformer Attention

04/03/2020
by Junbo Yin, et al.

Existing LiDAR-based 3D object detectors usually focus on single-frame detection and ignore the spatiotemporal information in consecutive point cloud frames. In this paper, we propose an end-to-end online 3D video object detector that operates on point cloud sequences. The proposed model comprises a spatial feature encoding component and a spatiotemporal feature aggregation component. In the former, a novel Pillar Message Passing Network (PMPNet) encodes each discrete point cloud frame. It adaptively collects information for a pillar node from its neighbors by iterative message passing, which effectively enlarges the receptive field of the pillar feature. In the latter, we propose an Attentive Spatiotemporal Transformer GRU (AST-GRU) to aggregate the spatiotemporal information; it enhances the conventional ConvGRU with an attentive memory gating mechanism. AST-GRU contains a Spatial Transformer Attention (STA) module, which emphasizes the foreground objects, and a Temporal Transformer Attention (TTA) module, which aligns the dynamic objects. Experimental results demonstrate that the proposed 3D video object detector achieves state-of-the-art performance on the large-scale nuScenes benchmark.
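To make the idea of iterative message passing over pillar nodes concrete, here is a minimal NumPy sketch. It is not the PMPNet implementation: the k-nearest-neighbor graph, the mean aggregator, the tanh update, and the names `knn_graph`, `message_passing`, `w_msg`, and `w_upd` are all assumptions chosen for illustration. It only shows how each extra iteration lets a node draw on information from one hop further away, which is the sense in which the receptive field grows.

```python
import numpy as np


def knn_graph(centers, k):
    """For each node, return the indices of its k nearest neighbors (self excluded).

    centers: (N, 2) array of hypothetical pillar center coordinates.
    """
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # a node is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def message_passing(features, neighbors, w_msg, w_upd, steps):
    """Run `steps` rounds of message passing (illustrative, not the paper's exact rule).

    features:  (N, D) initial node states, one per pillar
    neighbors: (N, k) neighbor indices from knn_graph
    w_msg:     (D, D) weights turning a neighbor state into a message
    w_upd:     (2D, D) weights combining a node state with its aggregated message
    """
    h = features
    for _ in range(steps):
        msgs = h[neighbors] @ w_msg   # (N, k, D): one message per neighbor
        agg = msgs.mean(axis=1)       # (N, D): mean aggregation (an assumption)
        h = np.tanh(np.concatenate([h, agg], axis=1) @ w_upd)
    return h


# Five hypothetical pillars on a line; each connects to its 2 nearest neighbors.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
neighbors = knn_graph(centers, k=2)

rng = np.random.default_rng(0)
dim = 4
features = rng.normal(size=(len(centers), dim))
w_msg = 0.3 * rng.normal(size=(dim, dim))
w_upd = 0.3 * rng.normal(size=(2 * dim, dim))

out = message_passing(features, neighbors, w_msg, w_upd, steps=2)
```

In this toy graph, node 0 is connected to nodes 1 and 2 only, so after one step its state cannot depend on node 3; after two steps it can, because node 2 has already absorbed a message from node 3.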


research
07/26/2022

Graph Neural Network and Spatiotemporal Transformer Attention for 3D Video Object Detection from Point Clouds

Previous works for LiDAR-based 3D object detection mainly focus on the s...
research
08/18/2022

Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes

Current efficient LiDAR-based detection frameworks are lacking in exploi...
research
09/30/2022

D-Align: Dual Query Co-attention Network for 3D Object Detection Based on Multi-frame Point Cloud Sequence

LiDAR sensors are widely used for 3D object detection in various mobile ...
research
02/28/2022

Spatiotemporal Transformer Attention Network for 3D Voxel Level Joint Segmentation and Motion Prediction in Point Cloud

Environment perception including detection, classification, tracking, an...
research
03/28/2022

Equivariant Point Cloud Analysis via Learning Orientations for Message Passing

Equivariance has been a long-standing concern in various fields ranging ...
research
06/02/2022

Unified Recurrence Modeling for Video Action Anticipation

Forecasting future events based on evidence of current conditions is an ...
research
01/19/2020

Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks

This work proposes a novel attentive graph neural network (AGNN) for zer...
