Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors

08/17/2022
by   Sindhu B Hegde, et al.

In this paper, we explore the question of what can be recovered from an 8×8 pixel video sequence. Surprisingly, it turns out to be quite a lot. We show that when we process this 8×8 video with the right set of audio and image priors, we can obtain a full-length, 256×256 video. We achieve this 32× scaling of an extremely low-resolution input using our novel audio-visual upsampling network. The audio prior helps to recover the elemental facial details and precise lip shapes, and a single high-resolution target identity image prior provides rich appearance details. Our approach is an end-to-end multi-stage framework. The first stage produces a coarse intermediate output video, which is then used to animate the single target identity image and generate realistic, accurate, and high-quality outputs. Our approach is simple and performs exceedingly well (an 8× improvement in FID score) compared to previous super-resolution methods. We also extend our model to talking-face video compression and show that we obtain a 3.5× improvement in bits/pixel over the previous state of the art. The results from our network are thoroughly analyzed through extensive ablation experiments (in the paper and supplementary material). We also provide a demo video along with code and models on our website: <http://cvit.iiit.ac.in/research/projects/cvit-projects/talking-face-video-upsampling>.
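
The 32× figure in the abstract is a per-side scale factor. As a quick arithmetic check, here is a minimal sketch; the helper name `upsampling_factors` is hypothetical and not part of the authors' code. It computes the per-side factor and the corresponding ratio of pixel counts for square frames.

```python
from typing import Tuple


def upsampling_factors(low_res: int, high_res: int) -> Tuple[int, int]:
    """Return (per-side scale factor, pixel-count ratio) for square frames.

    Hypothetical helper for illustration only; assumes high_res is an
    integer multiple of low_res.
    """
    if low_res <= 0 or high_res % low_res != 0:
        raise ValueError("high_res must be a positive integer multiple of low_res")
    side = high_res // low_res
    # Each output frame has side * side times as many pixels as the input.
    return side, side * side


side_factor, pixel_ratio = upsampling_factors(8, 256)
print(side_factor, pixel_ratio)
```

For the 8×8 to 256×256 setting this gives a per-side factor of 32, which means each output frame contains 1024 times as many pixels as the input frame.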


research
09/27/2019

Learning to Have an Ear for Face Super-Resolution

We propose a novel method to perform extreme (16x) face super-resolution...
research
10/06/2022

Audio-Visual Face Reenactment

This work proposes a novel method to generate realistic talking head vid...
research
07/18/2020

Face Super-Resolution Guided by 3D Facial Priors

State-of-the-art face super-resolution methods employ deep convolutional...
research
05/28/2018

Face hallucination using cascaded super-resolution and identity priors

In this paper we address the problem of hallucinating high-resolution fa...
research
11/06/2018

Super-Identity Convolutional Neural Network for Face Hallucination

Face hallucination is a generative task to super-resolve the facial imag...
research
03/14/2022

GCFSR: a Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors

Face image super resolution (face hallucination) usually relies on facia...
research
05/06/2022

VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution

Most of the existing video face super-resolution (VFSR) methods are trai...
