Video Frame Interpolation via Adaptive Separable Convolution

by Simon Niklaus, et al.

Standard video frame interpolation methods first estimate optical flow between input frames and then synthesize an intermediate frame guided by motion. Recent approaches merge these two steps into a single convolution process by convolving input frames with spatially adaptive kernels that account for motion and re-sampling simultaneously. These methods require large kernels to handle large motion, and the resulting memory demand limits the number of pixels whose kernels can be estimated at once. To address this problem, this paper formulates frame interpolation as local separable convolution over input frames using pairs of 1D kernels. Compared to regular 2D kernels, the 1D kernels require significantly fewer parameters to be estimated. Our method develops a deep fully convolutional neural network that takes two input frames and estimates pairs of 1D kernels for all pixels simultaneously. Because our method estimates kernels and synthesizes the whole video frame at once, it allows a perceptual loss to be incorporated when training the neural network to produce visually pleasing frames. This deep neural network is trained end-to-end using widely available video data without any human annotation. Both qualitative and quantitative experiments show that our method provides a practical solution to high-quality video frame interpolation.
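To make the separable formulation concrete, the following is a minimal sketch of how one output pixel could be computed from a pair of 1D kernels per input frame, assuming each 2D kernel is approximated by the outer product of a vertical and a horizontal 1D kernel. The function names, the patch layout, and the kernel size n are hypothetical choices for illustration. They are not taken from the authors' implementation, and the neural network that estimates the kernels is not shown.

```python
def separable_response(patch, k_vertical, k_horizontal):
    """Apply the 2D kernel (k_vertical outer k_horizontal) to an n x n patch
    without ever materializing the n x n kernel. Names are hypothetical."""
    n = len(k_vertical)
    # Horizontal pass: collapse each patch row with the horizontal 1D kernel.
    row_sums = [
        sum(patch[r][c] * k_horizontal[c] for c in range(n))
        for r in range(n)
    ]
    # Vertical pass: weight the row results with the vertical 1D kernel.
    return sum(k_vertical[r] * row_sums[r] for r in range(n))


def interpolate_pixel(patch1, patch2, kernels1, kernels2):
    """Interpolated pixel = separable response on the patch from frame 1
    plus separable response on the co-located patch from frame 2.
    Each kernels argument is a (k_vertical, k_horizontal) pair."""
    return (separable_response(patch1, *kernels1)
            + separable_response(patch2, *kernels2))


def parameter_counts(n):
    """Kernel values to estimate per output pixel for two input frames:
    two full n x n kernels versus two pairs of length-n 1D kernels."""
    return {"full_2d": 2 * n * n, "separable": 2 * 2 * n}


if __name__ == "__main__":
    # Toy 3 x 3 patches; the kernels simply pick the centre pixel and
    # average the two frames (an assumption made for this demo only).
    p1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    p2 = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]
    centre_half = ([0.0, 0.5, 0.0], [0.0, 1.0, 0.0])
    print(interpolate_pixel(p1, p2, centre_half, centre_half))
    print(parameter_counts(41))
```

For a hypothetical kernel size of n = 41, the count drops from 2 × 41 × 41 = 3362 values per pixel to 4 × 41 = 164. This is the kind of reduction that makes it feasible to estimate kernels for all pixels simultaneously.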




