Rethinking Alignment in Video Super-Resolution Transformers

07/18/2022
by Shuwei Shi, et al.

Aligning adjacent frames is considered an essential operation in video super-resolution (VSR). Advanced VSR models, including the latest VSR Transformers, are generally equipped with well-designed alignment modules. However, progress in self-attention mechanisms may challenge this assumption. In this paper, we rethink the role of alignment in VSR Transformers and make several counter-intuitive observations. Our experiments show that (i) VSR Transformers can directly utilize multi-frame information from unaligned videos, and (ii) existing alignment methods are sometimes harmful to VSR Transformers. These observations indicate that the performance of VSR Transformers can be further improved simply by removing the alignment module and adopting a larger attention window. Nevertheless, such a design dramatically increases the computational burden and cannot handle large motions. Therefore, we propose a new and efficient alignment method called patch alignment, which aligns image patches instead of pixels. VSR Transformers equipped with patch alignment demonstrate state-of-the-art performance on multiple benchmarks. Our work provides valuable insights into how multi-frame information is used in VSR and how to select alignment methods for different networks and datasets. Code and models will be released at https://github.com/XPixelGroup/RethinkVSRAlignment.
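
The abstract states only that patch alignment moves image patches rather than individual pixels; it does not say how each patch is matched. The sketch below is one possible reading, not the authors' implementation. It assumes a dense flow field is already available from some estimator, takes the rounded mean flow inside each patch as that patch's single motion vector, and copies the whole patch from the supporting frame without per-pixel resampling. The function name `patch_align`, its parameters, and the clamping at frame borders are all hypothetical choices made for illustration.

```python
import numpy as np

def patch_align(support, flow, patch=8):
    """Warp `support` toward the reference frame one patch at a time.

    support: (H, W) or (H, W, C) array, the neighbouring frame.
    flow:    (H, W, 2) array of (dx, dy) offsets from each reference pixel
             to its match in `support` (assumed input, any flow estimator).
    patch:   patch side in pixels; H and W must be multiples of it.
    """
    h, w = flow.shape[:2]
    if h % patch or w % patch:
        raise ValueError("frame size must be a multiple of the patch size")
    aligned = np.empty_like(support)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            # Assumption: one integer motion vector per patch (rounded mean flow).
            block = flow[y:y + patch, x:x + patch].reshape(-1, 2)
            dx, dy = np.rint(block.mean(axis=0)).astype(int)
            # Clamp so the source patch stays inside the frame.
            sy = int(np.clip(y + dy, 0, h - patch))
            sx = int(np.clip(x + dx, 0, w - patch))
            # Copy the whole patch unchanged, so pixels inside it keep
            # their relative positions.
            aligned[y:y + patch, x:x + patch] = support[sy:sy + patch, sx:sx + patch]
    return aligned

# Usage: a supporting frame that is the reference shifted 4 pixels to the right.
ref = np.arange(16 * 32, dtype=float).reshape(16, 32)
support = np.roll(ref, 4, axis=1)
flow = np.zeros((16, 32, 2))
flow[..., 0] = 4  # every reference pixel is found 4 pixels to the right
aligned = patch_align(support, flow, patch=8)
```

In this toy case every patch except the right-most column of patches is recovered exactly; the right-most patches are clamped at the border and therefore do not match the reference.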


