Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions

06/19/2023
by Yuqi Sun, et al.

Recent neural talking radiance field methods have shown great success in photorealistic audio-driven talking face synthesis. In this paper, we propose a novel interactive framework that uses human instructions to edit such implicit neural representations, achieving real-time personalized talking face generation. Given a short speech video, we first build an efficient talking radiance field. We then apply the latest conditional diffusion model to edit images according to the given instructions and to guide the optimization of the implicit representation towards the editing target. To ensure audio-lip synchronization during the editing process, we propose an iterative dataset updating strategy and use a lip-edge loss to constrain changes in the lip region. We also introduce a lightweight refinement network that complements image details and enables controllable detail generation in the final rendered image. Our method also enables real-time rendering at up to 30 FPS on consumer hardware. Multiple metrics and user verification show that our approach provides a significant improvement in rendering quality compared to state-of-the-art methods.
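The abstract does not spell out the iterative dataset updating strategy, so the following is only a toy sketch of what such a loop could look like in general: training frames are periodically replaced by edited renders of the current model, and the model keeps training on the gradually changing dataset. Every name here (`render`, `edit_image`, `train_step`, `iterative_dataset_update`) is hypothetical, images are reduced to single floats, and the diffusion editor is replaced by a simple interpolation towards a target value. It is not the authors' implementation.

```python
import random


def render(model, frame_id):
    """Hypothetical stand-in for rendering one frame from the radiance field.
    In this toy, every frame renders to the model's single scalar value."""
    return model["value"]


def edit_image(image, target):
    """Hypothetical stand-in for an instruction-conditioned diffusion edit:
    moves the image halfway towards the editing target."""
    return image + 0.5 * (target - image)


def train_step(model, dataset, lr=0.1):
    """One toy optimization step: pull the model towards the dataset mean."""
    mean = sum(dataset) / len(dataset)
    model["value"] += lr * (mean - model["value"])


def iterative_dataset_update(num_frames=8, steps=200, update_every=10,
                             target=1.0, seed=0):
    """Alternate between replacing one training frame with an edited render
    and continuing to train on the partially edited dataset."""
    rng = random.Random(seed)
    model = {"value": 0.0}          # unedited model
    dataset = [0.0] * num_frames    # unedited training frames
    for step in range(steps):
        if step % update_every == 0:
            i = rng.randrange(num_frames)
            # Render the current model, edit the render, store it back.
            dataset[i] = edit_image(render(model, i), target)
        train_step(model, dataset)
    return model, dataset


if __name__ == "__main__":
    model, dataset = iterative_dataset_update()
    print(model["value"], dataset)
```

In this toy, the model drifts from the unedited value towards the target as more frames are replaced. Constraints such as the lip-edge loss mentioned above would be extra terms in `train_step`; they are omitted because the abstract does not define them.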


