Self-supervised Modal and View Invariant Feature Learning

by Longlong Jing, et al.
CUNY Law School

Most existing self-supervised feature learning methods for 3D data learn 3D features either from point cloud data or from multi-view images. By exploring the inherent multi-modality attributes of 3D objects, in this paper we propose to jointly learn modal-invariant and view-invariant features from different modalities, including image, point cloud, and mesh, with heterogeneous networks for 3D data. To learn modal- and view-invariant features, we propose two types of constraints: a cross-modal invariance constraint and a cross-view invariance constraint. The cross-modal invariance constraint forces the network to maximize the agreement of features from different modalities for the same objects, while the cross-view invariance constraint forces the network to maximize the agreement of features from different views of images for the same objects. The quality of the learned features is tested on different downstream tasks with three modalities of data: point cloud, multi-view images, and mesh. Furthermore, the invariance across different modalities and views is evaluated with the cross-modal retrieval task. Extensive evaluation results demonstrate that the learned features are robust and have strong generalizability across different tasks.
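The abstract does not give the exact form of the two constraints, so the following is only a rough sketch of what "maximizing the agreement of features" could look like. It assumes an InfoNCE-style contrastive loss over L2-normalized features, which is a common choice but not necessarily the one used in the paper. The function name `agreement_loss`, the `temperature` value, and the random stand-in features are all hypothetical; in practice the features would come from the image, point cloud, and mesh networks.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def agreement_loss(feats_a, feats_b, temperature=0.1):
    """InfoNCE-style loss (an assumed form, not taken from the paper).

    Row i of feats_a and row i of feats_b are features of the same object.
    The loss is low when matching rows are more similar than non-matching rows.
    """
    a = l2_normalize(feats_a)
    b = l2_normalize(feats_b)
    logits = a @ b.T / temperature                       # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))            # matching pairs lie on the diagonal


# Hypothetical stand-ins for network outputs: 8 objects, 16-dimensional features.
rng = np.random.default_rng(0)
image_view1 = rng.normal(size=(8, 16))
image_view2 = image_view1 + 0.05 * rng.normal(size=(8, 16))  # another view, same objects
point_cloud = image_view1 + 0.05 * rng.normal(size=(8, 16))  # another modality, same objects
unrelated = rng.normal(size=(8, 16))                         # features of different objects

cross_modal = agreement_loss(image_view1, point_cloud)  # cross-modal invariance term
cross_view = agreement_loss(image_view1, image_view2)   # cross-view invariance term
mismatched = agreement_loss(image_view1, unrelated)     # baseline with no correspondence

print(f"cross-modal: {cross_modal:.4f}")
print(f"cross-view:  {cross_view:.4f}")
print(f"mismatched:  {mismatched:.4f}")
```

Under this assumed formulation, the same function serves both constraints: only the source of the paired features changes (two modalities versus two image views). Features that already agree yield a much lower loss than unrelated features, which is the signal a network would be trained to minimize.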

