Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

06/07/2023
by Paul Pu Liang, et al.

In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: the emergence of new task-relevant information during learning from both modalities that was not present in either alone. We study this challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data and naturally co-occurring multimodal data (e.g., unlabeled images and captions, or video and corresponding audio) for which labeling is time-consuming. Using a precise information-theoretic definition of interactions, our key contributions are derivations of lower and upper bounds that quantify the amount of multimodal interaction in this semi-supervised setting. We propose two lower bounds: one based on the amount of information shared between modalities, and one based on the disagreement between separately trained unimodal classifiers. We then derive an upper bound through connections to approximate algorithms for min-entropy couplings. We validate these estimated bounds and show that they accurately track true interactions. Finally, we explore two semi-supervised multimodal applications based on these theoretical results: (1) analyzing the relationship between multimodal performance and estimated interactions, and (2) self-supervised learning that embraces disagreement between modalities, going beyond the agreement that is typically exploited.
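To make the disagreement-based quantity concrete, the sketch below (a minimal illustration with synthetic data and logistic-regression classifiers, not the authors' implementation) trains one classifier per labeled unimodal dataset and measures how often their predictions differ on unlabeled, naturally co-occurring pairs; higher disagreement signals task-relevant information that neither modality captures alone.

```python
# Minimal sketch of the disagreement statistic behind the lower bound.
# The synthetic data and model choices are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d = 1000, 8

# Labeled unimodal data: each modality comes with its own labels.
x1_labeled = rng.normal(size=(n, d))
y1 = (x1_labeled[:, 0] > 0).astype(int)
x2_labeled = rng.normal(size=(n, d))
y2 = (x2_labeled[:, 0] > 0).astype(int)

# Train one unimodal classifier per labeled dataset.
f1 = LogisticRegression().fit(x1_labeled, y1)
f2 = LogisticRegression().fit(x2_labeled, y2)

# Unlabeled but naturally co-occurring multimodal pairs (x1, x2).
m = 5000
x1_paired = rng.normal(size=(m, d))
x2_paired = x1_paired + rng.normal(scale=0.5, size=(m, d))  # correlated view

# Fraction of pairs on which the unimodal predictions differ.
disagreement = np.mean(f1.predict(x1_paired) != f2.predict(x2_paired))
print(f"unimodal disagreement on paired data: {disagreement:.3f}")
```

The upper bound rests on min-entropy couplings: among all joint distributions with two fixed marginals, find one whose joint entropy is (approximately) minimal. A standard greedy approximation, shown here as an assumed stand-in for the paper's coupling subroutine, repeatedly pairs the largest remaining probability masses:

```python
# Greedy approximate min-entropy coupling of two discrete marginals.
# A hedged sketch of the general technique, not the paper's exact routine.
import heapq
import math

def greedy_min_entropy_coupling(p, q):
    """Return a dict {(i, j): mass} coupling marginals p and q."""
    # Max-heaps via negated masses.
    hp = [(-pi, i) for i, pi in enumerate(p) if pi > 0]
    hq = [(-qj, j) for j, qj in enumerate(q) if qj > 0]
    heapq.heapify(hp)
    heapq.heapify(hq)
    coupling = {}
    while hp and hq:
        pi, i = heapq.heappop(hp)
        qj, j = heapq.heappop(hq)
        pi, qj = -pi, -qj
        m = min(pi, qj)  # assign the overlap to cell (i, j)
        coupling[(i, j)] = coupling.get((i, j), 0.0) + m
        # Push back whatever mass remains on either side.
        if pi - m > 1e-12:
            heapq.heappush(hp, (-(pi - m), i))
        if qj - m > 1e-12:
            heapq.heappush(hq, (-(qj - m), j))
    return coupling

def entropy_bits(masses):
    return -sum(m * math.log2(m) for m in masses if m > 0)

c = greedy_min_entropy_coupling([0.5, 0.3, 0.2], [0.6, 0.4])
print(c, entropy_bits(c.values()))
```

The greedy rule preserves both marginals by construction and concentrates mass on as few joint cells as possible, which is why its joint entropy approximates the minimum over all couplings.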

Related research

07/14/2020
TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning
Fusing data from multiple modalities provides more information to train ...

02/24/2022
An Information-theoretical Approach to Semi-supervised Learning under Covariate-shift
A common assumption in semi-supervised learning is that the labeled, unl...

02/23/2023
Quantifying & Modeling Feature Interactions: An Information Decomposition Framework
The recent explosion of interest in multimodal applications has resulted...

06/08/2023
Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
In a wide range of multimodal tasks, contrastive learning has become a p...

01/18/2021
Multimodal Variational Autoencoders for Semi-Supervised Learning: In Defense of Product-of-Experts
Multimodal generative models should be able to learn a meaningful latent...

12/09/2017
Semi-supervised Multimodal Hashing
Retrieving nearest neighbors across correlated data in multiple modaliti...

11/17/2020
ABC-Net: Semi-Supervised Multimodal GAN-based Engagement Detection using an Affective, Behavioral and Cognitive Model
We present ABC-Net, a novel semi-supervised multimodal GAN framework to ...
