Accelerating the Training of Video Super-Resolution Models

05/10/2022
by   Lijian Lin, et al.

Although convolutional neural networks (CNNs) have recently demonstrated high-quality reconstruction for video super-resolution (VSR), efficiently training competitive VSR models remains a challenging problem. Training a VSR model usually takes an order of magnitude more time than training its image counterpart, leading to long research cycles. Existing VSR methods typically train models with fixed spatial and temporal sizes from beginning to end. These fixed sizes are usually set to large values for good performance, resulting in slow training. However, is such a rigid training strategy necessary for VSR? In this work, we show that it is possible to gradually train video models from small to large spatial/temporal sizes, i.e., in an easy-to-hard manner. In particular, the whole training is divided into several stages, with earlier stages using smaller spatial sizes. Within each stage, the temporal size varies from short to long while the spatial size remains unchanged. This multigrid training strategy accelerates training because most of the computation is performed on smaller spatial and shorter temporal shapes. For further acceleration with GPU parallelization, we also investigate large-minibatch training without loss of accuracy. Extensive experiments demonstrate that our method substantially speeds up training (up to a 6.2× speedup in wall-clock training time) without a performance drop for various VSR models. The code is available at https://github.com/TencentARC/Efficient-VSR-Training.
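
To make the staged schedule described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that training iterations are split evenly across spatial stages and, within each stage, evenly across temporal lengths; the function names `multigrid_schedule` and `shape_at`, and the example patch sizes and frame counts, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def multigrid_schedule(
    total_iters: int,
    spatial_sizes: List[int],
    temporal_sizes: List[int],
) -> List[Tuple[int, int, int]]:
    """Build a list of (start_iter, spatial_size, temporal_size) entries.

    Assumption (not from the paper): iterations are divided evenly among
    stages, and evenly among temporal lengths inside each stage.
    """
    n_stages = len(spatial_sizes)
    n_phases = len(temporal_sizes)
    schedule = []
    for s, spatial in enumerate(spatial_sizes):
        stage_start = s * total_iters // n_stages
        stage_end = (s + 1) * total_iters // n_stages
        stage_len = stage_end - stage_start
        # Spatial size is fixed within a stage; temporal size grows.
        for p, temporal in enumerate(temporal_sizes):
            start = stage_start + p * stage_len // n_phases
            schedule.append((start, spatial, temporal))
    return schedule


def shape_at(schedule: List[Tuple[int, int, int]], it: int) -> Tuple[int, int]:
    """Return the (spatial_size, temporal_size) in effect at iteration `it`."""
    current = schedule[0]
    for entry in schedule:
        if entry[0] <= it:
            current = entry
        else:
            break
    return current[1], current[2]


if __name__ == "__main__":
    # Hypothetical sizes: patches grow 32 -> 48 -> 64, clips grow 5 -> 10 -> 15 frames.
    sched = multigrid_schedule(1200, [32, 48, 64], [5, 10, 15])
    for start, spatial, temporal in sched:
        print(f"from iter {start}: patch {spatial}x{spatial}, {temporal} frames")
```

A training loop would call `shape_at(schedule, it)` before sampling each batch and crop clips to that shape, so that the cheap small-shape batches dominate the early part of training.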


research
12/02/2019

A Multigrid Method for Efficiently Training Video Models

Training competitive deep video models is an order of magnitude slower t...
research
05/11/2022

Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video Super-Resolution via Cycle-Projected Mutual Learning

Spatial-Temporal Video Super-Resolution (ST-VSR) aims to generate super-...
research
03/27/2022

RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution

Space-time video super-resolution (STVSR) is the task of interpolating v...
research
04/21/2021

Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Space-time video super-resolution (STVSR) aims to increase the spatial a...
research
03/21/2021

PGT: A Progressive Method for Training Models on Long Videos

Convolutional video models have an order of magnitude larger computation...
research
12/28/2021

AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition

Recent works have shown that the computational efficiency of video recog...
research
06/15/2017

FreezeOut: Accelerate Training by Progressively Freezing Layers

The early layers of a deep neural net have the fewest parameters, but ta...
