PointCMC: Cross-Modal Multi-Scale Correspondences Learning for Point Cloud Understanding

Several self-supervised cross-modal learning approaches have recently demonstrated the potential of image signals for enhancing point cloud representations. However, how to directly model cross-modal local and global correspondences in a self-supervised fashion remains an open question. To address it, we propose PointCMC, a novel cross-modal method that models multi-scale correspondences across modalities for self-supervised point cloud representation learning. In particular, PointCMC is composed of: (1) a local-to-local (L2L) module that learns local correspondences through optimized cross-modal local geometric features, (2) a local-to-global (L2G) module that learns the correspondences between local and global features across modalities via local-global discrimination, and (3) a global-to-global (G2G) module that leverages an auxiliary global contrastive loss between the point cloud and image to learn high-level semantic correspondences. Extensive experimental results show that our approach outperforms existing state-of-the-art methods on various downstream tasks such as 3D object classification and segmentation. Code will be made publicly available upon acceptance.
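To make the idea of a global contrastive loss between point cloud and image features concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the loss, so the symmetric InfoNCE form, the temperature value of 0.07, and the function names `l2_normalize`, `info_nce`, and `global_contrastive_loss` are all assumptions made for illustration. Each row of `point_feats` is assumed to be paired with the same-index row of `image_feats`, and all feature vectors are assumed to be non-zero.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.07):
    """Mean cross-entropy of matching anchors[i] to positives[i],
    with every other row of `positives` acting as a negative.
    (Illustrative InfoNCE form; an assumption, not the paper's loss.)"""
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def global_contrastive_loss(point_feats, image_feats, temperature=0.07):
    """Symmetric loss: point-to-image and image-to-point directions averaged."""
    return 0.5 * (
        info_nce(point_feats, image_feats, temperature)
        + info_nce(image_feats, point_feats, temperature)
    )


# Hypothetical 2-D global features for two objects.
points = [[1.0, 0.0], [0.0, 1.0]]
images_matched = [[2.0, 0.1], [0.1, 3.0]]   # row i agrees with points[i]
images_swapped = [[0.1, 3.0], [2.0, 0.1]]   # pairing deliberately broken

loss_matched = global_contrastive_loss(points, images_matched)
loss_swapped = global_contrastive_loss(points, images_swapped)
```

In this toy setup the correctly paired features give a lower loss than the swapped ones, which is the behavior a global cross-modal objective of this kind is meant to reward. A real pipeline would compute the features with point cloud and image encoders and use a tensor library so the loss can be differentiated.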

03/01/2022

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding

Manual annotation of large-scale point cloud dataset for varying tasks s...
12/13/2022

DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization

LiDAR mapping is important yet challenging in self-driving and mobile ro...
05/28/2020

Self-supervised Modal and View Invariant Feature Learning

Most of the existing self-supervised feature learning methods for 3D dat...
09/20/2022

Cross-modal Learning for Image-Guided Point Cloud Shape Completion

In this paper we explore the recent topic of point cloud completion, gui...
07/05/2022

Open-Vocabulary 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning

Current point-cloud detection methods have difficulty detecting the open...
03/31/2022

Cross-modal Learning of Graph Representations using Radar Point Cloud for Long-Range Gesture Recognition

Gesture recognition is one of the most intuitive ways of interaction and...
12/24/2020

P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding

Self-supervised representation learning is a critical problem in compute...