
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis

by Xu Yan, et al.

Although point cloud analysis has made impressive progress recently, representation learning from a single modality is gradually reaching its bottleneck. In this work, we take a step towards more discriminative 3D point cloud representation by taking full advantage of images, which inherently contain richer appearance information, e.g., texture, color, and shade. Specifically, this paper introduces a simple but effective point cloud cross-modality training (PointCMT) strategy, which utilizes view-images, i.e., rendered or projected 2D images of the 3D object, to boost point cloud analysis. In practice, to effectively acquire auxiliary knowledge from view-images, we develop a teacher-student framework and formulate cross-modal learning as a knowledge distillation problem. PointCMT eliminates the distribution discrepancy between different modalities through novel feature and classifier enhancement criteria and effectively avoids potential negative transfer. Note that PointCMT improves the point-only representation without architecture modification. Extensive experiments verify significant gains on various datasets using appealing backbones, i.e., equipped with PointCMT, PointNet++ and PointMLP achieve state-of-the-art performance on two benchmarks, i.e., 94.4 respectively. Code will be made available at
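To make the teacher-student formulation more concrete, the following is a minimal sketch of a generic soft-label knowledge distillation loss, in which a student (here standing in for the point cloud network) is trained to match the softened predictions of a teacher (standing in for the image network). It is not PointCMT's feature and classifier enhancement criteria, which the abstract does not spell out; the function names, the temperature, and the weighting factor alpha are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic distillation objective (hypothetical helper, not from the paper).

    Combines cross-entropy on the ground-truth label with a KL term that
    pulls the student's softened distribution towards the teacher's.
    """
    # Hard-label term: standard cross-entropy at temperature 1.
    student_probs = softmax(student_logits)
    cross_entropy = -math.log(student_probs[label] + 1e-12)

    # Soft-label term: KL between temperature-softened distributions,
    # scaled by T^2 as is common practice in distillation.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_term = kl_divergence(teacher_soft, student_soft) * temperature ** 2

    return (1.0 - alpha) * cross_entropy + alpha * soft_term


if __name__ == "__main__":
    # Made-up logits for a 3-class toy example.
    image_teacher_logits = [3.0, 0.5, -1.0]
    point_student_logits = [1.0, 1.2, -0.5]
    print(distillation_loss(point_student_logits, image_teacher_logits, label=0))
```

In this sketch only the student receives a training signal at test time it runs alone, which mirrors the abstract's point that the point-only network is improved without architecture modification.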


X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

3D dense captioning aims to describe individual objects by natural langu...

Multi-View Vision-to-Geometry Knowledge Transfer for 3D Point Cloud Shape Analysis

As two fundamental representation modalities of 3D objects, 2D multi-vie...

Open-Vocabulary 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning

Current point-cloud detection methods have difficulty detecting the open...

Cross-modal Learning for Image-Guided Point Cloud Shape Completion

In this paper we explore the recent topic of point cloud completion, gui...

Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning

In this paper, we propose a simple and general framework for self-superv...

Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline

Processing point cloud data is an important component of many real-world...

Multi-modal Alignment using Representation Codebook

Aligning signals from different modalities is an important step in visio...

Code Repositories


2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds (ECCV 2022)



[NeurIPS2022] Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis
