Text-based Editing of Talking-head Video

06/04/2019
by   Ohad Fried, et al.

Editing talking-head video to change the speech content or to remove filler words is challenging. We propose a novel method to edit talking-head video based on its transcript to produce a realistic output video in which the dialogue of the speaker has been modified, while maintaining a seamless audio-visual flow (i.e., no jump cuts). Our method automatically annotates an input talking-head video with phonemes, visemes, 3D face pose and geometry, reflectance, expression, and scene illumination per frame. To edit a video, the user only has to edit the transcript; an optimization strategy then chooses segments of the input corpus as base material. The annotated parameters corresponding to the selected segments are seamlessly stitched together and used to produce an intermediate video representation in which the lower half of the face is rendered with a parametric face model. Finally, a recurrent video generation network transforms this representation into a photorealistic video that matches the edited transcript. We demonstrate a large variety of edits, such as the addition, removal, and alteration of words, as well as convincing language translation and full sentence synthesis.
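
As a rough illustration of the segment-selection step, the sketch below covers an edited phoneme sequence with as few contiguous runs from the input corpus as possible, using dynamic programming. This is only a toy under stated assumptions: the abstract does not specify the optimization, so the exact-match criterion, the "fewest segments" objective, and the names `find_run` and `select_segments` are all hypothetical and are not the authors' method.

```python
from typing import List, Optional, Tuple

# (start, end) indices into the corpus phoneme list, end exclusive.
Segment = Tuple[int, int]


def find_run(corpus: List[str], query: List[str]) -> Optional[int]:
    """Return the first index where `query` occurs contiguously in `corpus`."""
    n, m = len(corpus), len(query)
    for i in range(n - m + 1):
        if corpus[i:i + m] == query:
            return i
    return None


def select_segments(corpus: List[str], query: List[str]) -> Optional[List[Segment]]:
    """Cover `query` with the fewest contiguous corpus runs.

    Hypothetical objective: exact phoneme matches only, minimizing the
    number of segments (and therefore the number of stitching points).
    Returns None if some phoneme in `query` never occurs in `corpus`.
    """
    m = len(query)
    inf = float("inf")
    best = [inf] * (m + 1)   # best[j]: fewest segments covering query[:j]
    back = [None] * (m + 1)  # back[j]: (previous split point, corpus start)
    best[0] = 0
    for j in range(1, m + 1):
        for i in range(j):
            if best[i] == inf:
                continue
            start = find_run(corpus, query[i:j])
            if start is not None and best[i] + 1 < best[j]:
                best[j] = best[i] + 1
                back[j] = (i, start)
    if best[m] == inf:
        return None
    # Walk the back-pointers from the end of the query to recover segments.
    segments: List[Segment] = []
    j = m
    while j > 0:
        i, start = back[j]
        segments.append((start, start + (j - i)))
        j = i
    return segments[::-1]


if __name__ == "__main__":
    corpus = ["HH", "AH", "L", "OW", "W", "ER", "L", "D"]
    edited = ["L", "OW", "ER"]
    print(select_segments(corpus, edited))
```

A realistic system would presumably score approximate matches (for example on visemes) and penalize visually poor transitions rather than require exact phoneme equality; those costs are left out here to keep the sketch short.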


research
11/27/2022

VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

We present VideoReTalking, a new system to edit the faces of a real-worl...
research
06/05/2023

Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions

We propose a method for synthesizing edited photo-realistic digital avat...
research
07/27/2018

X2Face: A network for controlling face generation by using images, audio, and pose codes

The objective of this paper is a neural network model that controls the ...
research
12/27/2021

Responsive Listening Head Generation: A Benchmark Dataset and Baseline

Responsive listening during face-to-face conversations is a critical ele...
research
06/29/2023

PVP: Personalized Video Prior for Editable Dynamic Portraits using StyleGAN

Portrait synthesis creates realistic digital avatars which enable users ...
research
01/15/2020

Everybody's Talkin': Let Me Talk as You Want

We present a method to edit a target portrait footage by taking a sequen...
research
05/21/2022

Towards the Effects of Alignment Edits on the Quality of Experience of 360 Videos

The optimization of viewers' quality of experience (QoE) in 360 videos f...
