SMIL: Multimodal Learning with Severely Missing Modality

03/09/2021
by Mengmeng Ma, et al.

A common assumption in multimodal learning is the completeness of training data, i.e., full modalities are available in all training examples. Although there exist research endeavors to tackle the incompleteness of testing data, e.g., modalities partially missing in testing examples, few of them can handle incomplete training modalities. The problem becomes even more challenging in the severely missing case, e.g., when 90% of training examples have incomplete modality. For the first time in the literature, this paper formally studies multimodal learning with missing modality in terms of flexibility (missing modalities in training, testing, or both) and efficiency (most training data have incomplete modality). Technically, we propose a new method named SMIL that leverages Bayesian meta-learning to uniformly achieve both objectives. To validate our idea, we conduct a series of experiments on three popular benchmarks: MM-IMDb, CMU-MOSI, and avMNIST. The results demonstrate the state-of-the-art performance of SMIL over existing methods and generative baselines, including autoencoders and generative adversarial networks. Our code is available at https://github.com/mengmenm/SMIL.
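To make the core idea concrete, below is a minimal conceptual sketch in PyTorch of one way to handle a missing modality as the abstract describes: when a modality is absent, its features are approximated from the observed modality, with stochastic weight/output sampling giving the Bayesian flavor. This is not the authors' implementation (see the GitHub link above for that); the class names, dimensions, and architecture here are hypothetical, and the sketch omits SMIL's meta-learning loop and regularization.

```python
# Conceptual sketch only -- NOT the SMIL implementation.
# Illustrates: if modality B is missing (in training or testing),
# reconstruct a sampled approximation of it from modality A.
import torch
import torch.nn as nn

class BayesianReconstructor(nn.Module):
    """Approximates missing-modality features from the observed modality.

    Rather than a point estimate, it predicts a mean and log-variance and
    samples via the reparameterization trick, loosely mirroring the
    Bayesian treatment described in the abstract."""
    def __init__(self, obs_dim: int, miss_dim: int, hidden: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, miss_dim)
        self.logvar = nn.Linear(hidden, miss_dim)

    def forward(self, observed: torch.Tensor) -> torch.Tensor:
        h = self.backbone(observed)
        mu, logvar = self.mu(h), self.logvar(h)
        eps = torch.randn_like(mu)
        return mu + eps * torch.exp(0.5 * logvar)  # sampled reconstruction

class MultimodalClassifier(nn.Module):
    """Fuses two modalities; falls back to reconstruction when B is missing."""
    def __init__(self, dim_a: int, dim_b: int, num_classes: int):
        super().__init__()
        self.reconstruct_b = BayesianReconstructor(dim_a, dim_b)
        self.head = nn.Sequential(
            nn.Linear(dim_a + dim_b, 128), nn.ReLU(), nn.Linear(128, num_classes)
        )

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor = None):
        if feat_b is None:  # the missing-modality case
            feat_b = self.reconstruct_b(feat_a)
        return self.head(torch.cat([feat_a, feat_b], dim=-1))

# Toy usage: a complete example vs. a severely-missing example.
model = MultimodalClassifier(dim_a=64, dim_b=32, num_classes=10)
a = torch.randn(8, 64)
logits_full = model(a, torch.randn(8, 32))   # both modalities present
logits_miss = model(a)                       # modality B missing
print(logits_full.shape, logits_miss.shape)  # torch.Size([8, 10]) twice
```

Because the reconstruction is sampled rather than deterministic, repeated forward passes with a missing modality yield different logits, which is what allows uncertainty over the missing input to be modeled at all; the actual method trains this behavior with Bayesian meta-learning rather than plain backpropagation.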


