I2V-GAN: Unpaired Infrared-to-Visible Video Translation

08/02/2021
by   Shuang Li, et al.

Human vision is often adversely affected by complex environmental factors, especially in night vision scenarios. Infrared cameras are therefore often used to enhance visibility by detecting infrared radiation in the surrounding environment, but infrared videos are undesirable because they lack detailed semantic information. An effective video-to-video translation method from the infrared domain to its visible light counterpart is thus strongly needed, one that overcomes the large intrinsic gap between the infrared and visible domains. To address this challenging problem, we propose I2V-GAN, an infrared-to-visible (I2V) video translation method that generates fine-grained and spatial-temporally consistent visible light videos from unpaired infrared videos. Technically, our model capitalizes on three types of constraints: 1) an adversarial constraint to generate synthetic frames that are similar to the real ones, 2) cyclic consistency with an introduced perceptual loss for effective content conversion as well as style preservation, and 3) similarity constraints across and within domains to enhance content and motion consistency in both spatial and temporal spaces at a fine-grained level. Furthermore, the currently publicly available infrared and visible light datasets are mainly used for object detection or tracking, and some are composed of discontinuous images that are not suitable for video tasks. We therefore provide a new dataset for I2V video translation, named IRVI. Specifically, it has 12 consecutive video clips of vehicle and monitoring scenes, and both the infrared and visible light videos can be split into 24352 frames. Comprehensive experiments validate that I2V-GAN is superior to the compared SOTA methods in the translation of I2V videos, with higher fluency and finer semantic details. The code and IRVI dataset are available at https://github.com/BIT-DA/I2V-GAN.
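
To make the three constraint types in the abstract more concrete, the following is a minimal sketch of how an adversarial term, a cycle-consistency term, and a similarity term might be combined into one objective. It is not the I2V-GAN implementation: the function names, the least-squares form of the adversarial term, the frame-difference proxy used for the similarity term, and the weights `lambda_cyc` and `lambda_sim` are all assumptions made for illustration, and the perceptual loss is omitted. Frames are represented as flat lists of pixel intensities so the example runs with the standard library only.

```python
# Illustrative sketch only: toy stand-ins, not the I2V-GAN implementation.
from typing import Callable, List

Frame = List[float]  # one frame, flattened to pixel intensities in [0, 1]
Generator = Callable[[Frame], Frame]


def l1_distance(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def adversarial_loss(disc_scores: List[float]) -> float:
    """Least-squares generator loss (assumed form): push scores toward 1."""
    return sum((s - 1.0) ** 2 for s in disc_scores) / len(disc_scores)


def cycle_loss(frames: List[Frame], forward: Generator, backward: Generator) -> float:
    """Mean L1 error after translating infrared -> visible -> infrared."""
    errors = [l1_distance(f, backward(forward(f))) for f in frames]
    return sum(errors) / len(errors)


def temporal_similarity_loss(src: List[Frame], out: List[Frame]) -> float:
    """Hypothetical proxy for a similarity constraint: the amount of
    frame-to-frame change in the output should match that of the source."""
    terms = [
        abs(l1_distance(src[t], src[t + 1]) - l1_distance(out[t], out[t + 1]))
        for t in range(len(src) - 1)
    ]
    return sum(terms) / len(terms)


def total_loss(adv: float, cyc: float, sim: float,
               lambda_cyc: float = 10.0, lambda_sim: float = 1.0) -> float:
    """Weighted sum of the three terms; the weights are assumed values."""
    return adv + lambda_cyc * cyc + lambda_sim * sim


def invert(frame: Frame) -> Frame:
    """Toy 'generator' that inverts intensities; it is its own inverse."""
    return [1.0 - v for v in frame]


if __name__ == "__main__":
    infrared_clip = [[0.0, 0.5], [0.25, 0.75]]
    visible_clip = [invert(f) for f in infrared_clip]
    loss = total_loss(
        adversarial_loss([0.0, 1.0]),
        cycle_loss(infrared_clip, invert, invert),
        temporal_similarity_loss(infrared_clip, visible_clip),
    )
    print(f"total loss: {loss:.3f}")
```

In this toy setup the inverting generator reconstructs every frame exactly and preserves frame-to-frame change, so only the adversarial term contributes to the total. In a real model each term would be computed on network outputs and minimized by gradient descent.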


