Audio-Driven Talking Face Video Generation with Dynamic Convolution Kernels

01/16/2022
by Zipeng Ye, et al.

In this paper, we present a dynamic convolution kernel (DCK) strategy for convolutional neural networks. Using a fully convolutional network with the proposed DCKs, we generate high-quality talking-face video from multi-modal sources (i.e., unmatched audio and video) in real time, and the trained model is robust to different identities, head postures, and input audio. The DCKs are designed specifically for audio-driven talking-face video generation, which leads to a simple yet effective end-to-end system. We also provide a theoretical analysis that explains why DCKs work. Experimental results show that our method generates high-quality talking-face video, including the background, at 60 fps. Comparisons with state-of-the-art methods demonstrate the superiority of our method.
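The abstract does not spell out how a DCK is computed. As a rough illustration of the general idea only, the sketch below assumes that a convolution kernel is predicted from an audio feature vector by a single linear layer and then applied to an image like an ordinary kernel. The function names (`predict_kernel`, `conv2d_valid`, `dynamic_conv`), the linear mapping, and the single-channel setup are all hypothetical choices made for this example, not the authors' architecture.

```python
def predict_kernel(audio_feat, weight, bias, k):
    """Map an audio feature vector to a k x k kernel.

    Assumption: a single linear layer (weight has k*k rows, one per
    kernel entry); the paper's actual kernel predictor may differ.
    """
    flat = [sum(w * a for w, a in zip(row, audio_feat)) + b
            for row, b in zip(weight, bias)]
    return [flat[i * k:(i + 1) * k] for i in range(k)]


def conv2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation on a single-channel image."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(k) for j in range(k))
             for x in range(w - k + 1)]
            for y in range(h - k + 1)]


def dynamic_conv(image, audio_feat, weight, bias, k=3):
    """Convolve the image with a kernel that depends on the audio input."""
    kernel = predict_kernel(audio_feat, weight, bias, k)
    return conv2d_valid(image, kernel)
```

In this toy setup the same image yields different outputs for different audio features, because the kernel itself changes with the audio rather than being a fixed learned parameter. A static convolution would correspond to ignoring `audio_feat` and using the bias alone as the kernel.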


research
11/27/2022

VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

We present VideoReTalking, a new system to edit the faces of a real-worl...
research
02/24/2020

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose

Real-world talking faces often accompany with natural head movement. How...
research
04/29/2021

Text2Video: Text-driven Talking-head Video Synthesis with Phonetic Dictionary

With the advance of deep learning technology, automatic video generation...
research
03/17/2023

MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation

Audio-Driven Face Animation is an eagerly anticipated technique for appl...
research
11/21/2020

Stochastic Talking Face Generation Using Latent Distribution Matching

The ability to envisage the visual of a talking face based just on heari...
research
10/25/2020

APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment

Audio-guided face reenactment aims to generate a photorealistic face tha...
research
04/13/2020

From Inference to Generation: End-to-end Fully Self-supervised Generation of Human Face from Speech

This work seeks the possibility of generating the human face from voice ...
