Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis

07/11/2022
by Long Zhuo, et al.

Video-to-video synthesis (Vid2Vid) has achieved remarkable results in generating a photo-realistic video from a sequence of semantic maps. However, this pipeline suffers from high computational cost and long inference latency, which depend largely on two factors: 1) the network architecture parameters and 2) the sequential data stream. Recently, the parameters of image-based generative models have been significantly compressed via more efficient network architectures. Nevertheless, existing methods mainly focus on slimming network architectures and ignore the size of the sequential data stream. Moreover, because image-based compression lacks temporal coherence, it is not sufficient for compressing the video task. In this paper, we present a spatial-temporal compression framework, Fast-Vid2Vid, which focuses on the data aspects of generative models. It makes the first attempt to exploit the time dimension to reduce computational resources and accelerate inference. Specifically, we compress the input data stream spatially and reduce the temporal redundancy. After the proposed spatial-temporal knowledge distillation, our model can synthesize key-frames using the low-resolution data stream. Finally, Fast-Vid2Vid interpolates intermediate frames by motion compensation with slight latency. On standard benchmarks, Fast-Vid2Vid achieves near real-time performance of around 20 FPS and reduces computational cost by around 8x on a single V100 GPU.
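The data flow described in the abstract (spatially compress the input, run the generator only on key-frames, then fill in the intermediate frames) can be illustrated with a minimal sketch. This is not the authors' implementation: `downsample`, `blend`, `synthesize_video`, and the `generator` callable are hypothetical names, the strided downsampling is an assumed stand-in for the paper's spatial compression, and the linear blend is a deliberately crude substitute for motion compensation. The knowledge-distillation step is not modeled at all.

```python
from typing import Callable, List

Frame = List[List[float]]  # a single-channel frame as rows of pixel values


def downsample(frame: Frame, factor: int) -> Frame:
    """Assumed spatial compression: keep every `factor`-th pixel per axis."""
    return [row[::factor] for row in frame[::factor]]


def blend(a: Frame, b: Frame, t: float) -> Frame:
    """Linear blend of two frames (a stand-in for motion compensation)."""
    return [[(1 - t) * pa + t * pb for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def synthesize_video(semantic_maps: List[Frame],
                     generator: Callable[[Frame], Frame],
                     spatial_factor: int = 2,
                     key_interval: int = 4) -> List[Frame]:
    """Run `generator` on key-frames only and interpolate the rest."""
    n = len(semantic_maps)
    if n == 0:
        return []

    # Temporal compression: pick every `key_interval`-th frame as a key-frame,
    # and always include the last frame so every gap has two endpoints.
    key_ids = list(range(0, n, key_interval))
    if key_ids[-1] != n - 1:
        key_ids.append(n - 1)

    # The expensive generator sees only low-resolution key-frame inputs.
    keys = {i: generator(downsample(semantic_maps[i], spatial_factor))
            for i in key_ids}

    # Fill each gap between consecutive key-frames by blending its endpoints.
    output: List[Frame] = []
    for left, right in zip(key_ids, key_ids[1:]):
        for i in range(left, right):
            t = (i - left) / (right - left)
            output.append(blend(keys[left], keys[right], t))
    output.append(keys[key_ids[-1]])
    return output
```

With nine input maps and `key_interval=4`, the generator in this sketch is called three times instead of nine, which is where the temporal saving would come from. Quality then depends on how well the interpolation step recovers motion, which a linear blend does not attempt to do.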


research
06/22/2018

Video Inpainting by Jointly Learning Temporal Structure and Spatial Details

We present a new data-driven video inpainting method for recovering miss...
research
08/15/2023

Shortcut-V2V: Compression Framework for Video-to-Video Translation based on Temporal Redundancy Reduction

Video-to-video translation aims to generate video frames of a target dom...
research
09/21/2023

Spatial-Temporal Transformer based Video Compression Framework

Learned video compression (LVC) has witnessed remarkable advancements in...
research
05/17/2023

EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging

Video snapshot compressive imaging (SCI) uses a two-dimensional detector...
research
09/15/2023

Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval

Optimizing video inference efficiency has become increasingly important ...
research
01/07/2022

Microdosing: Knowledge Distillation for GAN based Compression

Recently, significant progress has been made in learned image and video ...
research
03/27/2021

Video Rescaling Networks with Joint Optimization Strategies for Downscaling and Upscaling

This paper addresses the video rescaling task, which arises from the nee...
