Multimodal Conditional Learning with Fast Thinking Policy-like Model and Slow Thinking Planner-like Model

02/07/2019
by   Jianwen Xie, et al.
10

This paper studies the supervised learning of the conditional distribution of a high-dimensional output given an input, where the output and input belong to two different modalities, e.g., the output is an image and the input is a sketch. We solve this problem by learning two models that bear similarities to those in reinforcement learning and optimal control. One model is policy-like. It generates the output directly by a non-linear transformation of the input and a noise vector. This amounts to fast thinking because the conditional generation is accomplished by direct sampling. The other model is planner-like. It learns an objective function in the form of a conditional energy function, so that the output can be generated by optimizing the objective function, or more rigorously by sampling from the conditional energy-based model. This amounts to slow thinking because the sampling process is accomplished by an iterative algorithm such as Langevin dynamics. We propose to learn the two models jointly, where the fast thinking policy-like model serves to initialize the sampling of the slow thinking planner-like model, and the planner-like model refines the initial output by an iterative algorithm. The planner-like model learns from the difference between the refined output and the observed output, while the policy-like model learns from how the planner-like model refines its initial output. We demonstrate the effectiveness of the proposed method on various image generation tasks.

READ FULL TEXT

page 5

page 6

page 7

page 9

research
12/08/2020

Deep Energy-Based NARX Models

This paper is directed towards the problem of learning nonlinear ARX mod...
research
02/28/2020

Policy-Aware Model Learning for Policy Gradient Methods

This paper considers the problem of learning a model in model-based rein...
research
02/03/2022

Generative Flow Networks for Discrete Probabilistic Modeling

We present energy-based generative flow networks (EB-GFN), a novel proba...
research
04/13/2021

Bi-level Off-policy Reinforcement Learning for Volt/VAR Control Involving Continuous and Discrete Devices

In Volt/Var control (VVC) of active distribution networks(ADNs), both sl...
research
09/10/2023

Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood

Training energy-based models (EBMs) with maximum likelihood estimation o...
research
12/03/2018

Mitigating Planner Overfitting in Model-Based Reinforcement Learning

An agent with an inaccurate model of its environment faces a difficult c...
research
03/26/2015

Generalized K-fan Multimodal Deep Model with Shared Representations

Multimodal learning with deep Boltzmann machines (DBMs) is an generative...

Please sign up or login with your details

Forgot password? Click here to reset