Multi-view Vision-Prompt Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning?

04/20/2023
by Haoyang Peng, et al.

Deep models for 3D point clouds are widely used in applications such as autonomous driving and household robotics. Inspired by recent prompt learning in natural language processing, this work proposes a novel Multi-view Vision-Prompt Fusion Network (MvNet) for few-shot 3D point cloud classification. MvNet investigates whether off-the-shelf 2D pre-trained models can be leveraged for few-shot classification, which would alleviate the over-dependence of existing baseline models on large-scale annotated 3D point cloud data. Specifically, MvNet first encodes a 3D point cloud into image features from several different views. Then, a novel multi-view prompt fusion module fuses information from the different views to bridge the gap between 3D point cloud data and 2D pre-trained models. A set of 2D image prompts is then derived to supply suitable prior knowledge to a large-scale pre-trained image model for few-shot 3D point cloud classification. Extensive experiments on the ModelNet, ScanObjectNN, and ShapeNet datasets demonstrate that MvNet achieves new state-of-the-art performance for few-shot 3D point cloud classification. The source code of this work will be available soon.
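To make the first step of the pipeline concrete, the following is a minimal sketch of how a point cloud might be turned into several 2D views before any image model sees it. It is not the authors' encoder: the abstract does not say how views are rendered, so the orthographic projection, the rotation about the y-axis, the depth-as-brightness convention, and the function names `normalize`, `depth_map`, and `multi_view_depth_maps` are all assumptions made for illustration.

```python
import math


def normalize(points):
    """Center the cloud and scale it to fit inside the unit sphere."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    centered = [(x - cx, y - cy, z - cz) for x, y, z in points]
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if scale == 0:
        return centered
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]


def depth_map(points, angle, resolution):
    """Orthographic depth image of the cloud rotated by `angle` about the y-axis.

    Assumed convention: points nearer the viewer (larger rotated z) are brighter,
    and each pixel keeps the brightest point that lands on it.
    """
    c, s = math.cos(angle), math.sin(angle)
    image = [[0.0] * resolution for _ in range(resolution)]
    for x, y, z in points:
        xr = c * x + s * z    # rotated x coordinate
        zr = -s * x + c * z   # rotated depth coordinate
        col = min(resolution - 1, max(0, round((xr + 1) / 2 * (resolution - 1))))
        row = min(resolution - 1, max(0, round((1 - y) / 2 * (resolution - 1))))
        intensity = min(1.0, max(0.0, (zr + 1) / 2))
        image[row][col] = max(image[row][col], intensity)
    return image


def multi_view_depth_maps(points, num_views=6, resolution=32):
    """Render `num_views` depth images at evenly spaced angles around the y-axis."""
    normalized = normalize(points)
    return [
        depth_map(normalized, 2 * math.pi * k / num_views, resolution)
        for k in range(num_views)
    ]


# Usage: the eight corners of a cube rendered from four viewpoints.
cube = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
views = multi_view_depth_maps(cube, num_views=4, resolution=8)
print(len(views), len(views[0]), len(views[0][0]))
```

In a pipeline of the kind the abstract describes, images like these would be passed to a 2D feature extractor and then fused across views; those later stages are not sketched here because the abstract gives no detail on them.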


