Video Frame Interpolation with Flow Transformer

07/30/2023
by Pan Gao, et al.

Video frame interpolation has been actively studied alongside the development of convolutional neural networks. However, because convolution shares the same kernel weights across all spatial positions, frames interpolated by convolutional models may lose details. In contrast, the attention mechanism in the Transformer can better distinguish the contribution of each pixel and can also capture long-range pixel dependencies, which offers great potential for video interpolation. Nevertheless, the original Transformer is commonly used for 2D images; how to develop a Transformer-based framework that accounts for temporal self-attention in video frame interpolation remains an open issue. In this paper, we propose the Video Frame Interpolation Flow Transformer, which incorporates motion dynamics from optical flow into the self-attention mechanism. Specifically, we design a Flow Transformer Block that calculates temporal self-attention in a matched local area under the guidance of flow, making our framework suitable for interpolating frames with large motion while maintaining reasonably low complexity. In addition, we construct a multi-scale architecture to account for multi-scale motion, further improving overall performance. Extensive experiments on three benchmarks demonstrate that the proposed method generates interpolated frames with better visual quality than state-of-the-art methods.
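To make the idea of attention "in a matched local area with the guidance of flow" concrete, the following is a minimal NumPy sketch. It is not the paper's Flow Transformer Block: the function name `flow_guided_local_attention`, the `(dx, dy)` flow layout, nearest-pixel rounding, the single attention head, and the absence of learned projections are all assumptions made for illustration. Each query pixel attends only to a small window in the other frame, centred on the position its flow vector points to.

```python
import numpy as np

def flow_guided_local_attention(query, key, value, flow, window=3):
    """Illustrative flow-guided local attention (hypothetical helper).

    query, key, value: arrays of shape (H, W, C).
    flow: array of shape (H, W, 2), assumed to hold (dx, dy) per pixel.
    window: odd side length of the local attention window.
    """
    H, W, C = query.shape
    r = window // 2
    out = np.zeros(value.shape, dtype=np.float64)
    for y in range(H):
        for x in range(W):
            # Window centre: query position displaced by the flow, rounded
            # to the nearest pixel and clamped to the frame (a simplification).
            cx = int(np.clip(np.rint(x + flow[y, x, 0]), 0, W - 1))
            cy = int(np.clip(np.rint(y + flow[y, x, 1]), 0, H - 1))
            y0, y1 = max(cy - r, 0), min(cy + r + 1, H)
            x0, x1 = max(cx - r, 0), min(cx + r + 1, W)
            k = key[y0:y1, x0:x1].reshape(-1, C)
            v = value[y0:y1, x0:x1].reshape(-1, C)
            # Scaled dot-product scores with a numerically stable softmax.
            scores = (k @ query[y, x]) / np.sqrt(C)
            scores = scores - scores.max()
            weights = np.exp(scores)
            weights = weights / weights.sum()
            out[y, x] = weights @ v
    return out

# Example call on random features with a uniform one-pixel rightward flow.
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 5, 8))
k = rng.standard_normal((4, 5, 8))
v = rng.standard_normal((4, 5, 8))
flow = np.zeros((4, 5, 2))
flow[..., 0] = 1.0
attended = flow_guided_local_attention(q, k, v, flow, window=3)
```

In this sketch each of the H·W queries scores at most `window * window` keys, whereas global attention would score all H·W keys per query. That difference is one way to read the abstract's claim of handling large motion at reasonably low complexity: the flow moves the window to where the content went, so the window itself can stay small.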


