Progressive Energy-Based Cooperative Learning for Multi-Domain Image-to-Image Translation

by Weinan Song, et al.

This paper studies a novel energy-based cooperative learning framework for multi-domain image-to-image translation. The framework consists of four components: a descriptor, a translator, a style encoder, and a style generator. The descriptor is a multi-head energy-based model that represents a multi-domain image distribution. Together, the translator, style encoder, and style generator constitute a diversified image generator. Specifically, given an input image from a source domain, the translator turns it into a stylized output image of the target domain according to a style code, which can be inferred by the style encoder from a reference image or produced by the style generator from random noise. Since the style generator is represented as a domain-specific distribution of style codes, the translator can provide a one-to-many transformation (i.e., diversified generation) between the source domain and the target domain. To train the framework, we propose a likelihood-based multi-domain cooperative learning algorithm that jointly trains the multi-domain descriptor and the diversified image generator (including the translator, style encoder, and style generator modules) via multi-domain MCMC teaching: the descriptor guides the diversified image generator to shift its probability density toward the data distribution, while the diversified image generator uses its randomly translated images to initialize the descriptor's Langevin dynamics process for efficient sampling.
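The cooperative loop described above can be illustrated with a deliberately tiny sketch. This is not the paper's implementation: it replaces the image-translation networks with a one-dimensional toy "generator" (a shifting mean) and the multi-head descriptor with a fixed quadratic energy, keeping only the MCMC-teaching structure: the generator proposes initial samples, Langevin dynamics revises them toward the descriptor's density, and the generator is updated to chase the revised samples. All names (`energy_grad`, `langevin_revise`, `mu_gen`, `mu_data`) are illustrative inventions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "descriptor": energy E(x) = (x - mu_data)^2 / 2, i.e. a density
# centered on the data mean. Stands in for the multi-head EBM.
mu_data = 3.0

def energy_grad(x):
    """Gradient of the toy energy with respect to the sample x."""
    return x - mu_data

def langevin_revise(x, n_steps=30, step=0.3):
    """Langevin dynamics: drift down the energy gradient plus Gaussian noise."""
    for _ in range(n_steps):
        x = x - 0.5 * step**2 * energy_grad(x) + step * rng.normal(size=x.shape)
    return x

# Toy "generator": samples x = mu_gen + eps; learning just shifts mu_gen.
mu_gen = -2.0
for _ in range(200):
    x0 = mu_gen + rng.normal(size=64)   # generator initializes the MCMC chains
    x1 = langevin_revise(x0)            # descriptor revises them (MCMC teaching)
    mu_gen += 0.1 * np.mean(x1 - x0)    # generator chases the revised samples
```

After training, `mu_gen` drifts from its initial value of -2.0 toward the data mean of 3.0, mirroring how, in the full framework, the generator's density is pulled toward the data distribution while its outputs keep the descriptor's Langevin chains cheap to run.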




Multimodal Unsupervised Image-to-Image Translation

Unsupervised image-to-image translation is an important and challenging ...

Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound

Unsupervised image-to-image translation is a class of computer vision pr...

Bridging the Gap between Label- and Reference-based Synthesis in Multi-attribute Image-to-Image Translation

The image-to-image translation (I2IT) model takes a target label or a re...

Unified cross-modality feature disentangler for unsupervised multi-domain MRI abdomen organs segmentation

Our contribution is a unified cross-modality feature disentangling approa...

Learning Descriptor Networks for 3D Shape Synthesis and Analysis

This paper proposes a 3D shape descriptor network, which is a deep convo...

Cooperative Training of Descriptor and Generator Networks

This paper studies the cooperative training of two probabilistic models ...

SHUNIT: Style Harmonization for Unpaired Image-to-Image Translation

We propose a novel solution for unpaired image-to-image (I2I) translatio...
