Self-supervised Feature Learning by Cross-modality and Cross-view Correspondences

04/13/2020
by Longlong Jing, et al.

Supervised learning depends on large-scale ground-truth labels, which are very expensive or time-consuming to obtain, or may require special skills to annotate. To address this issue, many self-supervised and unsupervised methods have been developed. Unlike most existing self-supervised methods, which learn only 2D image features or only 3D point cloud features, this paper presents a novel self-supervised learning approach that jointly learns both 2D image features and 3D point cloud features by exploiting cross-modality and cross-view correspondences, without using any human-annotated labels. Specifically, 2D image features of images rendered from different views are extracted by a 2D convolutional neural network, and 3D point cloud features are extracted by a graph convolutional neural network. The two types of features are fed into a two-layer fully connected neural network that estimates the cross-modality correspondence. The three networks are jointly trained by verifying whether two sampled data of different modalities belong to the same object (the cross-modality task). In addition, the 2D convolutional neural network is optimized by minimizing the intra-object distance and maximizing the inter-object distance of images rendered from different views (the cross-view task). The learned 2D and 3D features are evaluated by transferring them to five tasks: multi-view 2D shape recognition, 3D shape recognition, multi-view 2D shape retrieval, 3D shape retrieval, and 3D part segmentation. Evaluations on all five tasks across different datasets demonstrate the generalization and effectiveness of the 2D and 3D features learned by the proposed self-supervised method.
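The two training signals described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: the function names, the feature dimensions, the use of binary cross-entropy for the correspondence task, the use of a margin-based triplet loss for the cross-view task, and the plain sum of the two losses. The feature vectors stand in for the outputs of the 2D CNN and the graph convolutional network, which are not modeled here.

```python
import math
import random


def linear(weights, bias, x):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_linear(n_out, n_in, rng):
    """Small random weights and zero bias (hypothetical initialization)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def correspondence_logit(head, feat_2d, feat_3d):
    """Two-layer fully connected head on the concatenated 2D and 3D features."""
    (w1, b1), (w2, b2) = head
    hidden = [max(0.0, h) for h in linear(w1, b1, feat_2d + feat_3d)]  # ReLU
    return linear(w2, b2, hidden)[0]


def binary_cross_entropy(logit, label):
    """Stable BCE on a logit; label 1.0 = same object, 0.0 = different objects."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_view_triplet_loss(anchor, positive, negative, margin=1.0):
    """Assumed triplet form: pull views of one object together, push others away."""
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)


# Toy features standing in for network outputs (made-up values).
rng = random.Random(0)
head = (init_linear(8, 8, rng), init_linear(1, 8, rng))
view_a = [0.9, 0.1, 0.0, 0.2]       # object A, first rendered view
view_b = [1.0, 0.0, 0.1, 0.2]       # object A, second rendered view
view_other = [0.0, 1.0, 0.8, 0.0]   # a different object
cloud_a = [0.5, 0.5, 0.1, 0.3]      # point cloud feature of object A

loss_cross_modality = binary_cross_entropy(
    correspondence_logit(head, view_a, cloud_a), 1.0)
loss_cross_view = cross_view_triplet_loss(view_a, view_b, view_other)
total_loss = loss_cross_modality + loss_cross_view  # assumed unweighted sum
```

In this sketch the cross-view loss is zero for the toy values because the two views of object A are already much closer to each other than to the other object. In actual training, gradients of the total loss would flow back into all three networks, while the cross-view term would affect only the 2D network.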


research
05/28/2020

Self-supervised Modal and View Invariant Feature Learning

Most of the existing self-supervised feature learning methods for 3D dat...
research
10/13/2020

Audio-Visual Self-Supervised Terrain Type Discovery for Mobile Platforms

The ability to both recognize and discover terrain characteristics is an...
research
07/20/2023

SCA-PVNet: Self-and-Cross Attention Based Aggregation of Point Cloud and Multi-View for 3D Object Retrieval

To address 3D object retrieval, substantial efforts have been made to ge...
research
12/02/2018

PVRNet: Point-View Relation Neural Network for 3D Shape Recognition

Three-dimensional (3D) shape recognition has drawn much research attenti...
research
01/13/2022

SnapshotNet: Self-supervised Feature Learning for Point Cloud Data Segmentation Using Minimal Labeled Data

Manually annotating complex scene point cloud datasets is both costly an...
research
06/01/2021

Bootstrap Your Own Correspondences

Geometric feature extraction is a crucial component of point cloud regis...
research
12/20/2017

SuperPoint: Self-Supervised Interest Point Detection and Description

This paper presents a self-supervised framework for training interest po...
