Taxonomy of multimodal self-supervised representation learning

12/25/2020
by Alex Fedorov, et al.

Sensory input from multiple sources is crucial for robust and coherent human perception. Different sources contribute complementary explanatory factors and are combined based on the factors they share. This system has motivated the design of powerful unsupervised representation-learning algorithms. In this paper, we unify recent work on multimodal self-supervised learning under a single framework. Observing that most self-supervised methods optimize similarity metrics between a set of model components, we propose a taxonomy of all reasonable ways to organize this process. We show empirically, on two versions of multimodal MNIST and a multimodal brain imaging dataset, that (1) multimodal contrastive learning has significant benefits over its unimodal counterpart, (2) the specific composition of multiple contrastive objectives is critical to performance on a downstream task, and (3) maximizing the similarity between representations has a regularizing effect on a neural network, which can sometimes reduce downstream performance but can still reveal multimodal relations. Consequently, we outperform previous unsupervised encoder-decoder methods based on CCA or variational mixtures (MMVAE) on various datasets under the linear evaluation protocol.
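
To make the idea of "optimizing a similarity metric between model components" concrete, the following is a minimal sketch of one common contrastive objective (InfoNCE) applied across two modalities. It is an illustration under stated assumptions, not the paper's exact objective: the function names, the cosine-similarity critic, and the temperature value are all hypothetical choices, and only one direction of the loss (modality A to modality B) is computed.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length so that dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def cross_modal_info_nce(z_a, z_b, temperature=0.1):
    """One-directional InfoNCE loss between two modalities (illustrative sketch).

    z_a[i] and z_b[i] are assumed to be embeddings of the same sample produced
    by two modality-specific encoders. Every z_b[j] with j != i serves as a
    negative for z_a[i]. The temperature of 0.1 is an assumed default.
    """
    z_a = [l2_normalize(v) for v in z_a]
    z_b = [l2_normalize(v) for v in z_b]
    n = len(z_a)
    total = 0.0
    for i in range(n):
        logits = [dot(z_a[i], z_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in logits))
        # Negative log-probability of picking the true cross-modal pair.
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings: pairs are aligned in the first call and mismatched in the second.
modality_a = [[1.0, 0.0], [0.0, 1.0]]
modality_b = [[0.9, 0.1], [0.2, 0.8]]
loss_aligned = cross_modal_info_nce(modality_a, modality_b)
loss_shuffled = cross_modal_info_nce(modality_a, modality_b[::-1])
print(loss_aligned < loss_shuffled)
```

The loss is lower when paired embeddings from the two modalities are more similar to each other than to the other samples in the batch. A taxonomy of the kind the abstract describes would vary which components (for example, intermediate features or final representations, within or across modalities) are fed into such a similarity objective.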


research
12/25/2020

On self-supervised multi-modal representation learning: An application to Alzheimer's disease

Introspection of deep supervised predictive models trained on functional...
research
12/21/2022

Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning

Contrastive representation learning has proven to be an effective self-s...
research
10/28/2022

Improving the Modality Representation with Multi-View Contrastive Learning for Multimodal Sentiment Analysis

Modality representation learning is an important problem for multimodal ...
research
09/07/2022

Self-supervised multimodal neuroimaging yields predictive representations for a spectrum of Alzheimer's phenotypes

Recent neuroimaging studies that focus on predicting brain disorders via...
research
09/29/2022

Understanding Collapse in Non-Contrastive Siamese Representation Learning

Contrastive methods have led a recent surge in the performance of self-s...
research
03/01/2023

Can representation learning for multimodal image registration be improved by supervision of intermediate layers?

Multimodal imaging and correlative analysis typically require image alig...
research
06/09/2022

Rethinking 360° Image Visual Attention Modelling with Unsupervised Learning

Despite the success of self-supervised representation learning on plana...
