An Efficient Approach to Informative Feature Extraction from Multimodal Data

11/22/2018
by Lichen Wang, et al.

A primary focus in multimodal feature extraction is to find representations of the individual modalities that are maximally correlated. As a well-known measure of dependence, the Hirschfeld-Gebelein-Rényi (HGR) maximal correlation is an appealing objective because of its operational meaning and desirable properties. However, the strict whitening constraints formalized in the HGR maximal correlation limit its applicability. To address this problem, this paper proposes Soft-HGR, a novel framework for extracting informative features from multiple data modalities. Specifically, our framework removes the "hard" whitening constraints while preserving the same feature geometry as the HGR maximal correlation. The Soft-HGR objective is simple, involving only two inner products, which guarantees efficient and stable optimization. We further generalize the framework to handle more than two modalities as well as missing modalities. When labels are partially available, a semi-supervised adaptation enhances the discriminative power of the learned feature representations. Empirical evaluations show that our approach learns more informative feature mappings and is more efficient to optimize.
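As a rough illustration of the objective described above, the sketch below implements a mini-batch estimate of a Soft-HGR style loss in PyTorch, assuming the objective takes the form E[f(X)^T g(Y)] - (1/2) tr(cov(f(X)) cov(g(Y))) with zero-mean feature mappings. The function name soft_hgr_loss, the batch-based covariance estimates, and the toy random inputs are illustrative choices, not the authors' reference implementation.

    import torch


    def soft_hgr_loss(f, g):
        """Negative Soft-HGR objective estimated over a mini-batch.

        f, g: tensors of shape (batch_size, feature_dim) holding the feature
        mappings of the two modalities. Returns a scalar to minimize.
        """
        n = f.size(0)
        # Center the features; the objective assumes zero-mean feature mappings.
        f = f - f.mean(dim=0, keepdim=True)
        g = g - g.mean(dim=0, keepdim=True)
        # First inner product: the expected inner product E[f(X)^T g(Y)].
        inner = (f * g).sum(dim=1).mean()
        # Empirical covariance matrices of the two feature mappings.
        cov_f = f.t() @ f / (n - 1)
        cov_g = g.t() @ g / (n - 1)
        # Second inner product: trace of the covariance product, acting as a
        # "soft" surrogate for the hard whitening constraints.
        cov_penalty = torch.trace(cov_f @ cov_g)
        return -(inner - 0.5 * cov_penalty)


    # Minimal usage with random features standing in for two modality encoders.
    f = torch.randn(128, 16, requires_grad=True)
    g = torch.randn(128, 16, requires_grad=True)
    loss = soft_hgr_loss(f, g)
    loss.backward()

In practice the two feature mappings would come from trainable encoders (one per modality), and the loss above would be minimized jointly over both; the random tensors here only demonstrate that the loss is differentiable end to end.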


