Spatiotemporal Transformer Attention Network for 3D Voxel Level Joint Segmentation and Motion Prediction in Point Cloud

02/28/2022
by Zhensong Wei, et al.

Environment perception, including detection, classification, tracking, and motion prediction, is a key enabler for automated driving systems and intelligent transportation applications. Fueled by advances in sensing technologies and machine learning techniques, LiDAR-based sensing systems have become a promising solution. The current challenges of this solution are how to effectively combine different perception tasks into a single backbone and how to efficiently learn spatiotemporal features directly from point cloud sequences. In this research, we propose a novel spatiotemporal attention network based on a transformer self-attention mechanism for joint semantic segmentation and motion prediction within a point cloud at the voxel level. The network is trained to simultaneously output the voxel-level class and predicted motion by learning directly from a sequence of point cloud datasets. The proposed backbone includes both a temporal attention module (TAM) and a spatial attention module (SAM) to learn and extract the complex spatiotemporal features. This approach has been evaluated on the nuScenes dataset, and promising performance has been achieved.
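To illustrate the idea of pairing temporal and spatial self-attention over voxel features, here is a minimal sketch in plain Python. It is not the paper's TAM or SAM: the abstract does not specify their internals, so everything here is an assumption made for illustration. That includes the function names `self_attention` and `temporal_then_spatial`, the identity query/key/value projections (a real transformer learns these), and the ordering of temporal before spatial.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention over a list of feature vectors.

    Assumption: identity Q/K/V projections and a single head, to keep
    the sketch short. Returns one output vector per input token.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

def temporal_then_spatial(seq):
    """seq[t][v] is the feature vector of voxel v at frame t (hypothetical layout)."""
    T, V = len(seq), len(seq[0])
    # Temporal step: each voxel attends over its own history across frames.
    per_voxel = [self_attention([seq[t][v] for t in range(T)]) for v in range(V)]
    # Keep the output aligned with the most recent frame for each voxel.
    latest = [per_voxel[v][T - 1] for v in range(V)]
    # Spatial step: voxels of the fused frame attend over one another.
    return self_attention(latest)

# Toy input: 2 frames, 2 voxels, 2 features per voxel.
seq = [[[1.0, 0.0], [0.0, 1.0]],
       [[1.0, 0.0], [0.0, 1.0]]]
fused = temporal_then_spatial(seq)
```

Each output vector is a convex combination of the input features, so in this toy case every fused voxel keeps most of its weight on its own feature while mixing in some of its neighbor's. Per-voxel class and motion heads, as described in the abstract, would sit on top of such fused features and are omitted here.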

research
10/19/2021

Spatial-Temporal Transformer for 3D Point Cloud Sequences

Effective learning of spatial-temporal information within a point cloud ...
research
03/19/2022

Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds

Transformer has demonstrated promising performance in many 2D vision tas...
research
04/17/2022

Learning 3D Semantics from Pose-Noisy 2D Images with Hierarchical Full Attention Network

We propose a novel framework to learn 3D point cloud semantics from 2D m...
research
07/11/2022

Learning Spatial and Temporal Variations for 4D Point Cloud Segmentation

LiDAR-based 3D scene perception is a fundamental and important task for ...
research
04/03/2020

LiDAR-based Online 3D Video Object Detection with Graph-based Message Passing and Spatiotemporal Transformer Attention

Existing LiDAR-based 3D object detectors usually focus on the single-fra...
research
04/27/2021

Cross-Level Cross-Scale Cross-Attention Network for Point Cloud Representation

Self-attention mechanism recently achieves impressive advancement in Nat...
