
What Makes Multimodal Learning Better than Single (Provably)

06/08/2021
by Yu Huang, et al.

The world provides us with data of multiple modalities. Intuitively, models fusing data from different modalities outperform unimodal models, since more information is aggregated. Recently, joining the success of deep learning, there has been an influential line of work on deep multimodal learning, which has achieved remarkable empirical results on various applications. However, theoretical justifications in this field are notably lacking. Can multimodal learning provably perform better than unimodal learning? In this paper, we answer this question under one of the most popular multimodal learning frameworks, which first encodes features from different modalities into a common latent space and seamlessly maps the latent representations into the task space. We prove that learning with multiple modalities achieves a smaller population risk than using only a subset of those modalities. The main intuition is that the former yields a more accurate estimate of the latent space representation. To the best of our knowledge, this is the first theoretical treatment to capture important qualitative phenomena observed in real multimodal applications. Combined with experimental results, we show that multimodal learning does possess an appealing formal guarantee.
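The two-stage framework described in the abstract (per-modality encoding into a common latent space, followed by a map into the task space) can be sketched roughly as below. This is a minimal illustration, not the paper's construction: the names `linear_map` and `multimodal_predict`, the modality names `image` and `audio`, the linear encoders, and the averaging fusion are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Vector = List[float]
Mapping = Callable[[Vector], Vector]


def linear_map(weights: List[Vector]) -> Mapping:
    """Return a function computing the matrix-vector product weights @ x."""
    def apply(x: Vector) -> Vector:
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return apply


def multimodal_predict(inputs: Dict[str, Vector],
                       encoders: Dict[str, Mapping],
                       task_head: Mapping) -> Vector:
    """Encode each available modality, fuse in latent space, map to task space."""
    # Stage 1: every modality present in `inputs` is encoded into the same latent space.
    latents = [encoders[name](x) for name, x in inputs.items()]
    # Fusion by element-wise averaging is an assumption chosen for simplicity.
    fused = [sum(values) / len(latents) for values in zip(*latents)]
    # Stage 2: the shared task head maps the latent representation to the task space.
    return task_head(fused)


# Hypothetical encoders for two modalities, both mapping into a 2-dimensional latent space.
encoders = {
    "image": linear_map([[1.0, 0.0], [0.0, 1.0]]),
    "audio": linear_map([[2.0], [0.0]]),
}
task_head = linear_map([[1.0, -1.0]])

# Prediction from both modalities versus from a subset (image only).
both = multimodal_predict({"image": [1.0, 3.0], "audio": [2.0]}, encoders, task_head)
image_only = multimodal_predict({"image": [1.0, 3.0]}, encoders, task_head)
```

The same task head is reused for both calls; only the estimate of the latent representation changes with the set of modalities, which is the quantity the abstract identifies as the source of the advantage. The sketch shows the structure only and does not demonstrate the risk bound itself.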

05/25/2023

Score-Based Multimodal Autoencoders

Multimodal Variational Autoencoders (VAEs) represent a promising group o...

05/04/2023

Multimodal Understanding Through Correlation Maximization and Minimization

Multimodal learning has mainly focused on learning large models on, and ...

04/10/2023

On Robustness in Multimodal Learning

Multimodal learning is defined as learning over multiple heterogeneous i...

11/22/2019

Factorized Multimodal Transformer for Multimodal Sequential Learning

The complex world around us is inherently multimodal and sequential (con...

03/08/2023

Comparing Trajectory and Vision Modalities for Verb Representation

Three-dimensional trajectories, or the 3D position and rotation of objec...

02/18/2022

A Review on Methods and Applications in Multimodal Deep Learning

Deep Learning has implemented a wide range of applications and has becom...