Learning Joint Spatial-Temporal Transformations for Video Inpainting

07/20/2020
by   Yanhong Zeng, et al.

High-quality video inpainting that completes missing regions in video frames is a promising yet challenging task. State-of-the-art approaches adopt attention models to complete a frame by searching reference frames for the missing content, and then complete whole videos frame by frame. However, these approaches can suffer from inconsistent attention results along spatial and temporal dimensions, which often leads to blurriness and temporal artifacts in videos. In this paper, we propose to learn a joint Spatial-Temporal Transformer Network (STTN) for video inpainting. Specifically, we simultaneously fill missing regions in all input frames by self-attention, and propose to optimize STTN by a spatial-temporal adversarial loss. To show the superiority of the proposed model, we conduct both quantitative and qualitative evaluations using standard stationary masks and more realistic moving object masks. Demo videos are available at https://github.com/researchmm/STTN.
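
The idea of filling missing regions in all frames at once by self-attention can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each frame has already been split into patch feature vectors, uses plain scaled dot-product attention with no learned projections, and the function name `joint_spatial_temporal_attention` and its argument layout are hypothetical. Every missing patch attends over one shared pool of visible patches drawn from all frames, rather than over one reference frame at a time.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_spatial_temporal_attention(frames, masks):
    """Fill missing patches in every frame from visible patches of all frames.

    frames: list of T frames, each a list of N patch vectors (lists of floats).
    masks:  list of T lists of N booleans, True where the patch is missing.
    Assumption: patch vectors at missing positions hold query features
    (for example, produced by an encoder from surrounding context).
    """
    # One shared key/value pool across space AND time: all visible patches.
    pool = [
        patch
        for frame, mask in zip(frames, masks)
        for patch, missing in zip(frame, mask)
        if not missing
    ]
    if not pool:
        raise ValueError("no visible patches to attend to")

    dim = len(pool[0])
    scale = math.sqrt(dim)
    output = []
    for frame, mask in zip(frames, masks):
        new_frame = []
        for patch, missing in zip(frame, mask):
            if not missing:
                # Visible patches are copied through unchanged.
                new_frame.append(list(patch))
                continue
            # Scaled dot-product similarity between the query and every key.
            scores = [
                sum(q * k for q, k in zip(patch, key)) / scale for key in pool
            ]
            weights = softmax(scores)
            # The filled patch is the attention-weighted sum of the pool.
            filled = [
                sum(w * key[d] for w, key in zip(weights, pool))
                for d in range(dim)
            ]
            new_frame.append(filled)
        output.append(new_frame)
    return output
```

Because the pool is built once from every frame, all missing patches in the clip are completed against the same set of candidates, which is the property the abstract contrasts with frame-by-frame completion.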

research
11/05/2021

Spatial-Temporal Residual Aggregation for High Resolution Video Inpainting

Recent learning-based inpainting algorithms have achieved compelling res...
research
04/14/2021

Decoupled Spatial-Temporal Transformer for Video Inpainting

Video inpainting aims to fill the given spatiotemporal holes with realis...
research
01/26/2021

Deep Video Inpainting Detection

This paper studies video inpainting detection, which localizes an inpain...
research
06/17/2023

Fast Fourier Inception Networks for Occluded Video Prediction

Video prediction is a pixel-level task that generates future frames by e...
research
08/14/2022

Flow-Guided Transformer for Video Inpainting

We propose a flow-guided transformer, which innovatively leverage the mo...
research
06/14/2022

Stand-Alone Inter-Frame Attention in Video Models

Motion, as the uniqueness of a video, has been critical to the developme...
research
07/08/2021

Multi-frame Collaboration for Effective Endoscopic Video Polyp Detection via Spatial-Temporal Feature Transformation

Precise localization of polyp is crucial for early cancer screening in g...
