Generalized K-fan Multimodal Deep Model with Shared Representations

03/26/2015
by   Gang Chen, et al.

Multimodal learning with deep Boltzmann machines (DBMs) is a generative approach to fusing multimodal inputs, and can learn a shared representation via Contrastive Divergence (CD) for classification and information retrieval tasks. However, it is a 2-fan DBM model, and cannot effectively handle multiple prediction tasks. Moreover, this model cannot recover the hidden representations well by sampling from the conditional distribution when more than one modality is missing. In this paper, we propose a K-fan deep structure model, which can handle multi-input and multi-output learning problems effectively. In particular, the deep structure has K branches for different inputs, where each branch can be composed of a multi-layer deep model, and a shared representation is learned in a discriminative manner to tackle multimodal tasks. Given the deep structure, we propose two objective functions to handle two multi-input and multi-output tasks: joint visual restoration and labeling, and multi-view multi-class object recognition. To estimate the model parameters, we initialize the deep model parameters with CD to maximize the joint distribution, and then use backpropagation to update the model according to the specific objective function. The experimental results demonstrate that the model can effectively leverage multi-source information and predict multiple tasks well compared with competitive baselines.
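As a rough illustration of the K-fan structure described in the abstract, the following NumPy sketch passes K inputs through separate multi-layer branches, fuses the branch outputs into one shared representation, and feeds that representation to several output heads. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how branches are fused, so concatenation followed by a sigmoid layer is assumed here; weights are random rather than CD-pretrained; the linear output heads, all layer sizes, and the names init_branch, branch_forward, and k_fan_forward are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_branch(rng, layer_sizes):
    """Random (W, b) pairs for one branch. The paper initializes with CD;
    random initialization here is a stand-in assumption."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def branch_forward(x, params):
    """Run one input through its own stack of sigmoid layers."""
    h = x
    for W, b in params:
        h = sigmoid(h @ W + b)
    return h

def k_fan_forward(inputs, branches, W_shared, b_shared, heads):
    """Forward pass: K branches -> shared representation -> output heads.
    Fusion by concatenation is an assumption, not taken from the source."""
    tops = [branch_forward(x, p) for x, p in zip(inputs, branches)]
    shared = sigmoid(np.concatenate(tops, axis=1) @ W_shared + b_shared)
    outputs = [shared @ W + b for W, b in heads]  # hypothetical linear heads
    return shared, outputs

# Toy setup: K = 3 inputs of different sizes, two output tasks.
rng = np.random.default_rng(0)
input_dims = [20, 30, 10]
branches = [init_branch(rng, [d, 32, 16]) for d in input_dims]
W_shared = rng.normal(0.0, 0.1, (16 * len(input_dims), 24))
b_shared = np.zeros(24)
heads = [(rng.normal(0.0, 0.1, (24, 5)), np.zeros(5)),
         (rng.normal(0.0, 0.1, (24, 20)), np.zeros(20))]

inputs = [rng.normal(size=(4, d)) for d in input_dims]  # batch of 4
shared, outputs = k_fan_forward(inputs, branches, W_shared, b_shared, heads)
print(shared.shape, [o.shape for o in outputs])
```

In a full training loop, the random initialization would be replaced by CD pretraining and the parameters would then be updated by backpropagating a task-specific objective through the heads, the shared layer, and every branch.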


research
12/04/2016

Joint Visual Denoising and Classification using Deep Learning

Visual restoration and recognition are traditionally addressed in pipeli...
research
06/16/2018

Learning Factorized Multimodal Representations

Learning representations of multimodal data is a fundamentally complex r...
research
04/11/2017

Deep Multimodal Representation Learning from Temporal Data

In recent years, Deep Learning has been successfully applied to multimod...
research
03/16/2023

Identifiability Results for Multimodal Contrastive Learning

Contrastive learning is a cornerstone underlying recent progress in mult...
research
06/15/2020

Multimodal Generative Learning Utilizing Jensen-Shannon-Divergence

Learning from different data types is a long-standing goal in machine le...
research
03/04/2016

Learning deep representation of multityped objects and tasks

We introduce a deep multitask architecture to integrate multityped repre...
research
02/07/2019

Multimodal Conditional Learning with Fast Thinking Policy-like Model and Slow Thinking Planner-like Model

This paper studies the supervised learning of the conditional distributi...
