Generalized Product-of-Experts for Learning Multimodal Representations in Noisy Environments

11/07/2022
by   Abhinav Joshi, et al.
8

A real-world application or setting involves interaction between different modalities (e.g., video, speech, text). In order to process the multimodal information automatically and use it for an end application, Multimodal Representation Learning (MRL) has emerged as an active area of research in recent times. MRL involves learning reliable and robust representations of information from heterogeneous sources and fusing them. However, in practice, the data acquired from different sources are typically noisy. In some extreme cases, a noise of large magnitude can completely alter the semantics of the data leading to inconsistencies in the parallel multimodal data. In this paper, we propose a novel method for multimodal representation learning in a noisy environment via the generalized product of experts technique. In the proposed method, we train a separate network for each modality to assess the credibility of information coming from that modality, and subsequently, the contribution from each modality is dynamically varied while estimating the joint distribution. We evaluate our method on two challenging benchmarks from two diverse domains: multimodal 3D hand-pose estimation and multimodal surgical video segmentation. We attain state-of-the-art performance on both benchmarks. Our extensive quantitative and qualitative evaluations show the advantages of our method compared to previous approaches.

READ FULL TEXT

page 6

page 7

page 8

research
07/15/2021

MultiBench: Multiscale Benchmarks for Multimodal Representation Learning

Learning multimodal representations involves integrating information fro...
research
06/16/2018

Learning Factorized Multimodal Representations

Learning representations of multimodal data is a fundamentally complex r...
research
02/07/2022

GMC – Geometric Multimodal Contrastive Representation Learning

Learning representations of multimodal data that are both informative an...
research
03/02/2022

HighMMT: Towards Modality and Task Generalization for High-Modality Representation Learning

Learning multimodal representations involves discovering correspondences...
research
03/06/2020

Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning

One of the key factors of enabling machine learning models to comprehend...
research
05/29/2018

Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data

Multimodal sensory data resembles the form of information perceived by h...
research
04/17/2018

Multimodal Co-Training for Selecting Good Examples from Webly Labeled Video

We tackle the problem of learning concept classifiers from videos on the...

Please sign up or login with your details

Forgot password? Click here to reset