S2-Net: Self-supervision Guided Feature Representation Learning for Cross-Modality Images

03/28/2022
by   Shasha Mei, et al.

Combining the respective advantages of cross-modality images can compensate for the information missing from any single modality, which has drawn increasing research attention to multi-modal image matching. However, because of the large appearance differences between cross-modality image pairs, existing methods often fail to make the feature representations of correspondences as close as possible. In this letter, we design a cross-modality feature representation learning network, S2-Net, based on the recently successful detect-and-describe pipeline, which was originally proposed for visible images and is adapted here to cross-modality image pairs. To address the resulting optimization difficulties, we introduce self-supervised learning with a well-designed loss function that guides training without discarding the pipeline's original advantages. This strategy simulates image pairs of the same modality, which also provides useful guidance for training on cross-modality images. Notably, it requires no additional data, significantly improves performance, and is applicable to any method following the detect-and-describe pipeline. Extensive experiments compare the proposed strategy against both handcrafted and deep learning-based methods. The results show that our formulation, which jointly optimizes supervised and self-supervised objectives, outperforms the state of the art on the RoadScene and RGB-NIR datasets.
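To make the training strategy concrete, the following is a minimal sketch of how a combined supervised/self-supervised descriptor loss of this kind could look. It is an illustration only, not the paper's actual loss: the function names (`s2net_style_loss`, `triplet_loss`), the use of a triplet margin objective, the augmentation-based same-modality positive, and the weighting factor `alpha` are all assumptions introduced here for clarity.

```python
import numpy as np

def descriptor_distance(a, b):
    # Euclidean distance between L2-normalized descriptors (rows).
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return np.linalg.norm(a - b, axis=-1)

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Hinge loss: pull matching descriptors together, push non-matches
    # apart by at least `margin`.
    d_pos = descriptor_distance(anchor, positive)
    d_neg = descriptor_distance(anchor, negative)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

def s2net_style_loss(desc_vis, desc_ir, desc_vis_aug,
                     neg_vis, neg_ir, alpha=0.5):
    """Hypothetical combined objective (not the paper's exact loss).

    desc_vis / desc_ir  : descriptors of cross-modality correspondences
    desc_vis_aug        : descriptors of an augmented copy of the visible
                          image, simulating a same-modality pair
    neg_vis / neg_ir    : descriptors of non-matching points
    alpha               : assumed weight of the self-supervised term
    """
    # Supervised cross-modality term: visible vs. infrared correspondences.
    l_cross = triplet_loss(desc_vis, desc_ir, neg_ir)
    # Self-supervised same-modality term: visible vs. augmented visible.
    l_self = triplet_loss(desc_vis, desc_vis_aug, neg_vis)
    return l_cross + alpha * l_self

# Toy usage with random 128-D descriptors for 8 keypoints.
rng = np.random.default_rng(0)
vis = rng.normal(size=(8, 128))
ir = vis + 0.1 * rng.normal(size=(8, 128))        # noisy cross-modality match
vis_aug = vis + 0.05 * rng.normal(size=(8, 128))  # simulated same-modality pair
loss = s2net_style_loss(vis, ir, vis_aug,
                        rng.normal(size=(8, 128)), rng.normal(size=(8, 128)))
```

The point of the sketch is the structure of the objective: the same descriptors receive gradients from both an easier same-modality term and the harder cross-modality term, so the self-supervised signal can guide optimization when the cross-modality appearance gap makes the supervised term difficult on its own.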

