Video Frame Interpolation with Transformer

05/15/2022
by Liying Lu, et al.

Video frame interpolation (VFI), which aims to synthesize intermediate frames of a video, has made remarkable progress with the development of deep convolutional networks over the past years. Existing methods built upon convolutional networks generally struggle to handle large motion because convolution operations are local. To overcome this limitation, we introduce a novel framework that uses a Transformer to model long-range pixel correlation among video frames. Further, our network is equipped with a novel cross-scale window-based attention mechanism, in which windows at different scales interact with each other. This design effectively enlarges the receptive field and aggregates multi-scale information. Extensive quantitative and qualitative experiments demonstrate that our method achieves new state-of-the-art results on various benchmarks.
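
To make the idea of cross-scale window attention more concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and it rests on several simplifying assumptions: a single feature map rather than a pair of frames, one attention head, no learned query/key/value projections, and a coarse scale obtained by 2x average pooling. The function names (`window_partition`, `avg_pool2`, `cross_scale_window_attention`) are hypothetical. Each fine-scale window attends to its own tokens plus the tokens of a coarse-scale window that covers a region twice as wide, which is one simple way a window's receptive field can be enlarged.

```python
import numpy as np

def window_partition(x, w):
    """Split an (H, W, C) map into non-overlapping w x w windows.
    Returns an array of shape (H // w, W // w, w * w, C)."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H // w, W // w, w * w, C)

def avg_pool2(x):
    """Downsample an (H, W, C) map by a factor of 2 with average pooling."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).mean(axis=(1, 3))

def softmax(a, axis=-1):
    """Numerically stable softmax."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def cross_scale_window_attention(x, w):
    """Illustrative cross-scale window attention (assumed design, see lead-in).
    x: (H, W, C) feature map with H and W divisible by 2 * w.
    Returns the attended map (H, W, C) and the attention weights."""
    H, W, C = x.shape
    nh, nw = H // w, W // w
    fine = window_partition(x, w)                # (nh, nw, w*w, C)
    coarse = window_partition(avg_pool2(x), w)   # (nh/2, nw/2, w*w, C)
    # Pair fine window (i, j) with coarse window (i // 2, j // 2), which
    # covers the 2w x 2w region of the original map containing it.
    coarse = np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1)
    # Keys/values: tokens of the fine window plus tokens of the coarse window.
    kv = np.concatenate([fine, coarse], axis=2)  # (nh, nw, 2*w*w, C)
    # Scaled dot-product attention with identity projections (assumption).
    scores = fine @ kv.transpose(0, 1, 3, 2) / np.sqrt(C)
    attn = softmax(scores, axis=-1)              # (nh, nw, w*w, 2*w*w)
    out = attn @ kv                              # (nh, nw, w*w, C)
    # Undo the window partition to recover the (H, W, C) layout.
    out = out.reshape(nh, nw, w, w, C).transpose(0, 2, 1, 3, 4)
    return out.reshape(H, W, C), attn
```

For a 16 x 16 map with window size 4, each of the 16 query tokens in a window attends over 32 key tokens (16 fine plus 16 coarse), so attention cost stays local while the coarse tokens summarize an 8 x 8 neighborhood.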


research
07/30/2023

Video Frame Interpolation with Flow Transformer

Video frame interpolation has been actively studied with the development...
research
04/05/2023

BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation

A novel 4K video frame interpolator based on bilateral transformer (BiFo...
research
07/12/2023

Efficient Convolution and Transformer-Based Network for Video Frame Interpolation

Video frame interpolation is an increasingly important research task wit...
research
07/08/2022

Cross-Attention Transformer for Video Interpolation

We propose TAIN (Transformers and Attention for video INterpolation), a ...
research
07/27/2022

Meta-Interpolation: Time-Arbitrary Frame Interpolation via Dual Meta-Learning

Existing video frame interpolation methods can only interpolate the fram...
research
05/08/2022

Transformer Tracking with Cyclic Shifting Window Attention

Transformer architecture has been showing its great strength in visual o...
research
09/04/2023

Cross-Consistent Deep Unfolding Network for Adaptive All-In-One Video Restoration

Existing Video Restoration (VR) methods always necessitate the individua...
