Multimodal Prompting with Missing Modalities for Visual Recognition

03/06/2023
by Yi-Lun Lee, et al.

In this paper, we tackle two challenges in multimodal learning for visual recognition: 1) when a modality is missing during either training or testing in real-world situations; and 2) when computation resources are not available to finetune heavy transformer models. To this end, we propose to utilize prompt learning to mitigate both challenges together. Specifically, our modality-missing-aware prompts can be plugged into multimodal transformers to handle general missing-modality cases, while requiring less than 1% learnable parameters compared to training the entire model. We further explore the effect of different prompt configurations and analyze the robustness to missing modality. Extensive experiments show the effectiveness of our prompt learning framework, which improves performance under various missing-modality cases while alleviating the requirement of heavy model re-training. Code is available.
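
The idea of plugging modality-missing-aware prompts into a frozen transformer can be illustrated with a minimal sketch. This is not the authors' implementation: the case names, the sizes `EMBED_DIM` and `PROMPT_LEN`, and the helper functions below are all hypothetical, and plain Python lists stand in for tensors. It only shows the general pattern of keeping one small learnable prompt per missing-modality case, picking the prompt that matches the sample, and prepending it to the input token sequence while the backbone stays untouched.

```python
import random

# Hypothetical sizes for illustration only; not taken from the paper.
EMBED_DIM = 8
PROMPT_LEN = 4

# One learnable prompt per missing-modality case (case names are assumptions).
CASES = ("complete", "missing_text", "missing_image")


def init_prompts(seed=0):
    """Create one small random prompt (PROMPT_LEN x EMBED_DIM) per case."""
    rng = random.Random(seed)
    return {
        case: [[rng.gauss(0.0, 0.02) for _ in range(EMBED_DIM)]
               for _ in range(PROMPT_LEN)]
        for case in CASES
    }


def missing_case(has_text, has_image):
    """Map the modalities present in a sample to a prompt key."""
    if has_text and has_image:
        return "complete"
    if has_image:
        return "missing_text"
    if has_text:
        return "missing_image"
    raise ValueError("at least one modality must be present")


def prepend_prompt(prompts, tokens, has_text, has_image):
    """Prepend the prompt matching the sample's missing-modality case."""
    return prompts[missing_case(has_text, has_image)] + tokens


def trainable_fraction(prompts, backbone_params):
    """Share of parameters that are learnable when only prompts are trained."""
    prompt_params = sum(len(row) for p in prompts.values() for row in p)
    return prompt_params / (prompt_params + backbone_params)
```

For example, a sample with text but no image would receive the `"missing_image"` prompt in front of its five token embeddings, giving a sequence of nine vectors. With a hypothetical frozen backbone of one million parameters, `trainable_fraction` for these toy prompts is far below 1%, which mirrors the kind of parameter budget the abstract describes.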

research
07/26/2023

Visual Prompt Flexible-Modal Face Anti-Spoofing

Recently, vision transformer based multimodal learning methods have been...
research
03/09/2021

SMIL: Multimodal Learning with Severely Missing Modality

A common assumption in multimodal learning is the completeness of traini...
research
03/02/2021

Listen, Read, and Identify: Multimodal Singing Language Identification of Music

We propose a multimodal singing language classification model that uses ...
research
04/28/2022

Tag-assisted Multimodal Sentiment Analysis under Uncertain Missing Modalities

Multimodal sentiment analysis has been studied under the assumption that...
research
10/02/2020

Training Strategies to Handle Missing Modalities for Audio-Visual Expression Recognition

Automatic audio-visual expression recognition can play an important role...
research
04/17/2023

MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning

Multimodal learning has shown great potentials in numerous scenes and at...
research
04/12/2022

Are Multimodal Transformers Robust to Missing Modality?

Multimodal data collected from the real world are often imperfect due to...
