Enhancing Multi-modal Cooperation via Fine-grained Modality Valuation

09/12/2023
by Yake Wei, et al.

One primary topic of multi-modal learning is to jointly incorporate heterogeneous information from different modalities. However, most models suffer from unsatisfactory multi-modal cooperation and fail to jointly utilize all modalities well. Some methods have been proposed to identify and enhance the worse-learnt modality, but they often cannot provide a fine-grained, sample-level observation of multi-modal cooperation with theoretical support. Hence, it is essential to reasonably observe and improve the fine-grained cooperation between modalities, especially in realistic scenarios where the modality discrepancy can vary across samples. To this end, we introduce a fine-grained modality valuation metric that evaluates the contribution of each modality at the sample level. Via modality valuation, we observe that the multi-modal model tends to rely on one specific modality, leaving the other modalities low-contributing. We further analyze this issue and improve cooperation between modalities by enhancing the discriminative ability of low-contributing modalities in a targeted manner. Overall, our methods reasonably observe the fine-grained uni-modal contribution at the sample level and achieve considerable improvement on different multi-modal models.
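
The abstract does not give the formula of the valuation metric. As a rough illustration of what a sample-level contribution score could look like, the sketch below assumes a Shapley-value attribution over modalities, which is one standard way to split a joint outcome among cooperating inputs. The function name `shapley_contributions`, the modality names, and the toy benefit table are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_contributions(modalities, value_fn):
    """Return the Shapley value of each modality for a single sample.

    value_fn maps a frozenset of modality names to a scalar benefit
    (assumed here: 1.0 if the model is correct using only those
    modalities, else 0.0). value_fn(frozenset()) is the baseline.
    """
    n = len(modalities)
    phi = {}
    for m in modalities:
        others = [x for x in modalities if x != m]
        total = 0.0
        for k in range(n):
            # Weight of coalitions of size k that exclude modality m.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain from adding m to coalition s.
                total += weight * (value_fn(s | {m}) - value_fn(s))
        phi[m] = total
    return phi


# Hypothetical benefit table for one audio-visual sample: the model is
# correct whenever audio is present, so visual adds nothing here.
toy_benefit = {
    frozenset(): 0.0,
    frozenset({"audio"}): 1.0,
    frozenset({"visual"}): 0.0,
    frozenset({"audio", "visual"}): 1.0,
}

contrib = shapley_contributions(["audio", "visual"], toy_benefit.__getitem__)
print(contrib)
```

In this toy sample the whole benefit is attributed to audio and none to visual, which is the kind of per-sample imbalance the abstract describes as one modality being low-contributing. The cost grows exponentially with the number of modalities, which is acceptable for the two- or three-modality settings this sketch assumes.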

