Harmonized Multimodal Learning with Gaussian Process Latent Variable Models

08/14/2019
by   Guoli Song, et al.

Multimodal learning aims to discover the relationships among multiple modalities. It has become an important research topic because of its many applications, such as cross-modal retrieval. This paper addresses the modality heterogeneity problem by using Gaussian process latent variable models (GPLVMs) to represent multimodal data in a common space. Previous multimodal GPLVM extensions generally learn the latent representations and the kernel hyperparameters separately, which ignores their intrinsic relationship. To exploit the strong complementarity among different modalities and GPLVM components, we develop a novel learning scheme called Harmonization, in which the latent model parameters are learned jointly, each informing the others. Going beyond the correlation fitting and intra-modal structure preservation paradigms widely used in existing studies, harmonization is derived in a model-driven manner to encourage agreement between the modality-specific GP kernels and the similarity of the latent representations. We present a range of multimodal learning models by incorporating the harmonization mechanism into several representative GPLVM-based approaches. Experimental results on four benchmark datasets show that the proposed models outperform strong baselines on cross-modal retrieval tasks, and that the harmonized multimodal learning method is superior at discovering semantically consistent latent representations.
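The abstract does not state the harmonization objective itself, so the following is only a rough illustration of what "agreement between modality-specific GP kernels and the similarity of latent representations" could look like numerically. Everything in it is an assumption made for illustration: the squared-exponential kernel, the squared Frobenius gap used as the disagreement measure, and the hypothetical names `rbf_kernel`, `frobenius_gap`, and `harmonization_penalty`. It is not the paper's formulation.

```python
import math


def rbf_kernel(points, lengthscale=1.0):
    """Squared-exponential kernel matrix over a list of latent points."""
    n = len(points)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            kernel[i][j] = math.exp(-sq_dist / (2.0 * lengthscale ** 2))
    return kernel


def frobenius_gap(mat_a, mat_b):
    """Squared Frobenius norm of the difference between two matrices."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(mat_a, mat_b)
        for a, b in zip(row_a, row_b)
    )


def harmonization_penalty(latents, lengthscales, similarity_scale=1.0):
    """Illustrative penalty (an assumption, not the paper's loss).

    Each modality gets its own kernel over the shared latent points, with
    its own lengthscale. The penalty sums how far each modality-specific
    kernel is from a similarity matrix computed on the same latent points.
    """
    similarity = rbf_kernel(latents, similarity_scale)
    return sum(
        frobenius_gap(rbf_kernel(latents, scale), similarity)
        for scale in lengthscales
    )


# Three shared latent points, two modalities with different lengthscales.
latents = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
print(harmonization_penalty(latents, lengthscales=[0.5, 2.0]))
```

In this toy setting the penalty vanishes when every modality kernel coincides with the latent similarity matrix and grows as the kernel hyperparameters drift away from it, which is the kind of coupling between latent representations and kernel hyperparameters that the abstract describes.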


