Anchor-Based Spatial-Temporal Attention Convolutional Networks for Dynamic 3D Point Cloud Sequences

12/20/2020
by Guangming Wang, et al.

Learning-based methods for robot perception from images and video have advanced considerably in recent years, but deep learning methods for dynamic 3D point cloud sequences remain underexplored. With the widespread use of 3D sensors such as LiDAR and depth cameras, efficient and accurate perception of the 3D environment from 3D sequence data is pivotal to autonomous driving and service robots. This paper proposes an Anchor-based Spatial-Temporal Attention Convolution operation (ASTAConv) to process dynamic 3D point cloud sequences. The proposed convolution builds a regular receptive field around each point by placing several virtual anchors around it. The features of neighboring points are first aggregated to each anchor using a spatial-temporal attention mechanism. Then, an anchor-based sparse 3D convolution aggregates the features of these anchors to the core point. The proposed method makes better use of the structured information within the local region and learns spatial-temporal embedding features from dynamic 3D point cloud sequences. Building on this operation, Anchor-based Spatial-Temporal Attention Convolutional Neural Networks (ASTACNNs) are proposed for classification and segmentation and are evaluated on action recognition and semantic segmentation tasks. Experimental results on the MSRAction3D and Synthia datasets demonstrate that this multi-frame fusion strategy achieves higher accuracy than the previous state-of-the-art method.
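The two-stage aggregation described above (neighbors to anchors, then anchors to the core point) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The six axis-aligned anchors, the distance-based attention score (a stand-in for a learned attention network), the per-anchor weight tensor, and the names `anchor_offsets` and `asta_conv_sketch` are all assumptions made for illustration.

```python
import numpy as np

def anchor_offsets(radius):
    # ASSUMPTION: six anchors on the coordinate axes at distance `radius`;
    # the anchor layout used in the paper may differ.
    eye = np.eye(3)
    return radius * np.concatenate([eye, -eye], axis=0)  # shape (6, 3)

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def asta_conv_sketch(core, points, feats, dt, weights, radius=1.0):
    """Hypothetical single-point version of an anchor-based attention convolution.

    core:    (3,)      coordinates of the core point
    points:  (N, 3)    neighbor coordinates pooled from several frames
    feats:   (N, C)    neighbor features
    dt:      (N,)      frame offset of each neighbor relative to the core frame
    weights: (A, C, D) one weight matrix per anchor (A = 6 here)
    """
    anchors = core + anchor_offsets(radius)
    out = np.zeros(weights.shape[2])
    for a, anchor in enumerate(anchors):
        dist = np.linalg.norm(points - anchor, axis=1)
        mask = dist <= radius
        if not mask.any():
            continue  # an empty anchor contributes nothing
        # ASSUMPTION: attention score is the negative spatial distance plus
        # temporal gap; a learned network would replace this hand-set score.
        attn = softmax(-(dist[mask] + np.abs(dt[mask])))
        anchor_feat = attn @ feats[mask]   # stage 1: neighbors -> anchor, (C,)
        out += anchor_feat @ weights[a]    # stage 2: anchors -> core, (D,)
    return out
```

Because the anchors sit at fixed offsets from the core point, each anchor can own its own weight matrix, which is what lets a regular convolution kernel be applied to an irregular point set in this sketch.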

research
10/19/2021

Spatial-Temporal Transformer for 3D Point Cloud Sequences

Effective learning of spatial-temporal information within a point cloud ...
research
10/16/2020

Human Segmentation with Dynamic LiDAR Data

Consecutive LiDAR scans compose dynamic 3D sequences, which contain more...
research
01/17/2022

Action Keypoint Network for Efficient Video Recognition

Reducing redundancy is crucial for improving the efficiency of video rec...
research
03/27/2023

Binarizing Sparse Convolutional Networks for Efficient Point Cloud Analysis

In this paper, we propose binary sparse convolutional networks called BS...
research
07/11/2022

Learning Spatial and Temporal Variations for 4D Point Cloud Segmentation

LiDAR-based 3D scene perception is a fundamental and important task for ...
research
12/01/2021

Point Cloud Segmentation Using Sparse Temporal Local Attention

Point clouds are a key modality used for perception in autonomous vehicl...
research
11/05/2019

Dynamic Time Warp Convolutional Networks

Where dealing with temporal sequences it is fair to assume that the same...
