Unsupervised Multimodal Video-to-Video Translation via Self-Supervised Learning

04/14/2020
by Kangning Liu, et al.

Existing unsupervised video-to-video translation methods fail to produce translated videos that are simultaneously realistic at the frame level, faithful to the semantic information of the source, and consistent at the video level. In this work, we propose UVIT, a novel unsupervised video-to-video translation model. Our model decomposes style and content, uses a specialized encoder-decoder structure, and propagates inter-frame information through bidirectional recurrent neural network (RNN) units. The style-content decomposition mechanism enables style-consistent video translation and provides a convenient interface for modality-flexible translation. In addition, by changing the input frames and style codes incorporated in our translation, we propose a video interpolation loss, which captures temporal information within the sequence to train our building blocks in a self-supervised manner. Our model can produce photo-realistic, spatio-temporally consistent translated videos in a multimodal way. Subjective and objective experimental results validate the superiority of our model over existing methods. More details can be found on our project website: https://uvit.netlify.com
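
To make the idea of a video interpolation loss concrete, the following is a minimal sketch of the general self-supervised pattern: hold out an interior frame, predict it from its neighbors, and penalize the L1 error. It is not the UVIT implementation. The names `Frame`, `average_neighbors`, and `interpolation_loss` are hypothetical, frames are flattened lists of floats rather than image tensors, and simple neighbor averaging stands in for the bidirectional RNN and style codes described in the abstract.

```python
from typing import Callable, List, Sequence

# Assumption: a frame is a flattened list of pixel intensities.
# A real model would operate on image tensors.
Frame = List[float]


def average_neighbors(prev_frame: Frame, next_frame: Frame) -> Frame:
    """Stand-in predictor (assumption): average the two neighboring frames.

    The abstract describes bidirectional RNN units for this role; averaging
    is used here only to keep the sketch self-contained.
    """
    return [(a + b) / 2.0 for a, b in zip(prev_frame, next_frame)]


def interpolation_loss(
    frames: Sequence[Frame],
    predict: Callable[[Frame, Frame], Frame] = average_neighbors,
) -> float:
    """Mean L1 error when each interior frame is held out and re-predicted."""
    if len(frames) < 3:
        raise ValueError("need at least three frames to hold one out")
    total = 0.0
    count = 0
    for t in range(1, len(frames) - 1):
        # Predict frame t using only its temporal neighbors.
        predicted = predict(frames[t - 1], frames[t + 1])
        total += sum(abs(p - x) for p, x in zip(predicted, frames[t]))
        count += len(frames[t])
    return total / count
```

The supervision signal comes from the video itself: the held-out frame serves as its own target, so no paired or labeled data is required. With the averaging stand-in, a sequence whose pixel values change linearly over time yields zero loss, and any deviation from linear motion is penalized.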


research
06/10/2018

Unsupervised Video-to-Video Translation

Unsupervised image-to-image translation is a recently proposed task of t...
research
02/07/2018

Self-Supervised Video Hashing with Hierarchical Binary Auto-encoder

Existing video hash functions are built on three isolated stages: frame ...
research
04/02/2022

Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency

In recent years, creative content generations like style transfer and ne...
research
08/15/2018

Recycle-GAN: Unsupervised Video Retargeting

We introduce a data-driven approach for unsupervised video retargeting t...
research
01/28/2021

Playable Video Generation

This paper introduces the unsupervised learning problem of playable vide...
research
10/03/2021

Disarranged Zone Learning (DZL): An unsupervised and dynamic automatic stenosis recognition methodology based on coronary angiography

We proposed a novel unsupervised methodology named Disarranged Zone Lear...
research
08/07/2023

Recurrent Self-Supervised Video Denoising with Denser Receptive Field

Self-supervised video denoising has seen decent progress through the use...
