RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution

03/27/2022
by Zhicheng Geng, et al.

Space-time video super-resolution (STVSR) is the task of interpolating videos that have both a Low Frame Rate (LFR) and Low Resolution (LR) to produce High-Frame-Rate (HFR) and High-Resolution (HR) counterparts. Existing methods based on Convolutional Neural Networks (CNNs) achieve visually satisfying results but suffer from slow inference because of their heavy architectures. We propose to resolve this issue with a spatial-temporal transformer that naturally incorporates the spatial and temporal super-resolution modules into a single model. Unlike CNN-based methods, we do not use separate building blocks for temporal interpolation and spatial super-resolution; instead, we use a single end-to-end transformer architecture. Specifically, the encoders build a reusable dictionary from the input LFR and LR frames, and the decoder then uses this dictionary to synthesize the HFR and HR frames. Compared with the state-of-the-art TMNet <cit.>, our network is 60% smaller (4.5M vs 12.3M parameters) and 80% faster (26.2fps vs 14.3fps on 720×576 frames) without sacrificing much performance. The source code is available at https://github.com/llmpass/RSTT.
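The size and speed comparison above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the helper names `relative_reduction` and `relative_speedup` are hypothetical and not part of the RSTT code base, and the only inputs are the four figures quoted in the abstract. It suggests that the quoted "60%" and "80%" are rounded-down values of roughly 63% and 83%.

```python
def relative_reduction(ours: float, baseline: float) -> float:
    """Fraction by which `ours` is smaller than `baseline`."""
    return 1.0 - ours / baseline


def relative_speedup(ours_fps: float, baseline_fps: float) -> float:
    """Fractional gain in frames per second over the baseline."""
    return ours_fps / baseline_fps - 1.0


# Figures quoted in the abstract (RSTT vs TMNet).
param_reduction = relative_reduction(4.5, 12.3)  # parameters, in millions
speedup = relative_speedup(26.2, 14.3)           # fps on 720x576 frames

# Per-frame latency in milliseconds, derived from the same fps figures.
rstt_ms = 1000.0 / 26.2
tmnet_ms = 1000.0 / 14.3

print(f"parameter reduction: {param_reduction:.1%}")
print(f"frame-rate gain:     {speedup:.1%}")
print(f"latency per frame:   {rstt_ms:.1f} ms vs {tmnet_ms:.1f} ms")
```

Expressed as latency, the same fps figures correspond to about 38 ms per frame versus about 70 ms per frame.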


research
07/18/2022

Enhancing Space-time Video Super-resolution via Spatial-temporal Feature Interaction

The target of space-time video super-resolution (STVSR) is to increase b...
research
04/17/2022

VDTR: Video Deblurring with Transformer

Video deblurring is still an unsolved problem due to the challenging spa...
research
03/13/2020

Is There Tradeoff between Spatial and Temporal in Video Super-Resolution?

Recent advances of deep learning lead to great success of image and vide...
research
05/10/2022

Accelerating the Training of Video Super-Resolution Models

Despite that convolution neural networks (CNN) have recently demonstrate...
research
09/15/2023

Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval

Optimizing video inference efficiency has become increasingly important ...
research
04/21/2021

Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Space-time video super-resolution (STVSR) aims to increase the spatial a...
research
02/25/2021

Learning for Unconstrained Space-Time Video Super-Resolution

Recent years have seen considerable research activities devoted to video...
