
ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation

by Hanwen Cao, et al.

Recent work on point clouds shows that multi-frame spatio-temporal modeling outperforms single-frame approaches by exploiting cross-frame information. In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP, which considers both attention and structure information across frames; we find these to be two important factors for successful segmentation of dynamic point clouds. First, our ASAP module contains a novel attentive temporal embedding layer that fuses the relatively informative local features across frames in a recurrent fashion. Second, we propose an efficient spatio-temporal correlation method that exploits more local structure for embedding while enforcing temporal consistency and reducing computational complexity. Finally, we show the generalization ability of the proposed ASAP module with different backbone networks for point cloud sequence segmentation. Our ASAP-Net (backbone plus ASAP module) outperforms baselines and previous methods on both the Synthia and SemanticKITTI datasets (+3.4 to +15.2 mIoU points with different backbones). Code is available at
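To make the idea of attention-weighted, recurrent fusion across frames more concrete, the following is a minimal sketch in plain Python. It is not the authors' attentive temporal embedding layer: the function names (`attentive_fuse`, `fuse_sequence`), the dot-product scoring against a fixed `query` vector, and the use of a single feature vector per frame are all assumptions made for illustration. A learned layer would compute the scores from trained parameters and operate on local neighborhoods of points.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fuse(state, current, query):
    """Blend the running state with the current frame's feature.

    Hypothetical scoring: each candidate is scored by its dot product with
    `query`; softmax turns the scores into weights that sum to 1.
    """
    candidates = [state, current]
    scores = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    weights = softmax(scores)
    dim = len(current)
    return [
        sum(w * cand[i] for w, cand in zip(weights, candidates))
        for i in range(dim)
    ]


def fuse_sequence(frames, query):
    """Recurrently fuse per-frame features, carrying a state across frames."""
    state = frames[0]
    for feat in frames[1:]:
        state = attentive_fuse(state, feat, query)
    return state


# Two frames with orthogonal features and a zero query: both candidates get
# the same score, so the fused feature is their plain average.
fused = fuse_sequence([[1.0, 0.0], [0.0, 1.0]], query=[0.0, 0.0])
```

With a non-zero `query`, the candidate that aligns better with it receives the larger weight, which is the sense in which the more informative feature dominates the fused result in this sketch.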




STTracker: Spatio-Temporal Tracker for 3D Single Object Tracking

3D single object tracking with point clouds is a critical task in 3D com...

Spatio-temporal Graph-RNN for Point Cloud Prediction

In this paper, we propose an end-to-end learning network to predict futu...

Point Cloud Segmentation Using Sparse Temporal Local Attention

Point clouds are a key modality used for perception in autonomous vehicl...

Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction

This paper focuses on the task of 4D shape reconstruction from a sequenc...

IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding Alignment

This paper investigates the problem of temporally interpolating dynamic ...

Frame Mining: a Free Lunch for Learning Robotic Manipulation from 3D Point Clouds

We study how choices of input point cloud coordinate frames impact learn...

3D Dynamic Point Cloud Denoising via Spatio-temporal Graph Modeling

The prevalence of accessible depth sensing and 3D laser scanning techniq...