AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction

04/25/2023
by   Aggelina Chatziagapi, et al.

In this work, we present a multimodal solution to the problem of 4D face reconstruction from monocular videos. 3D face reconstruction from 2D images is an under-constrained problem because depth is ambiguous. State-of-the-art methods address this problem by leveraging visual information from a single image or video, whereas 3D mesh animation approaches rely more on audio. However, in most cases (e.g. AR/VR applications), videos include both visual and speech information. We propose AVFace, which incorporates both modalities and accurately reconstructs the 4D facial and lip motion of any speaker, without requiring any 3D ground truth for training. A coarse stage estimates the per-frame parameters of a 3D morphable model, a lip refinement step follows, and a fine stage then recovers facial geometric details. Because transformer-based modules capture temporal audio and video information, our method is robust when either modality is insufficient (e.g. face occlusions). Extensive qualitative and quantitative evaluation demonstrates the superiority of our method over the current state-of-the-art.
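
To make the coarse stage more concrete, the following is a minimal sketch of how per-frame 3D morphable model parameters can yield a 4D (time-varying 3D) mesh sequence. It assumes the standard linear morphable-model formulation (mean shape plus a weighted sum of basis directions); the abstract does not specify which model AVFace uses. The function names, the two-vertex toy mesh, and the basis values are all hypothetical and are not taken from the paper. The audio-visual networks that would predict the coefficients, the lip refinement, and the fine detail stage are not modeled here.

```python
from typing import List, Sequence

Vector = List[float]


def morphable_vertices(mean: Sequence[float],
                       basis: Sequence[Sequence[float]],
                       coeffs: Sequence[float]) -> Vector:
    """Linear morphable model: vertices = mean + sum_k coeffs[k] * basis[k].

    All vectors are flattened (x, y, z) coordinates of the mesh vertices.
    """
    out = list(mean)
    for c, direction in zip(coeffs, basis):
        for i, d in enumerate(direction):
            out[i] += c * d
    return out


def reconstruct_sequence(mean: Sequence[float],
                         basis: Sequence[Sequence[float]],
                         per_frame_coeffs: Sequence[Sequence[float]]) -> List[Vector]:
    """Apply the model to each frame's coefficients, giving one mesh per frame."""
    return [morphable_vertices(mean, basis, c) for c in per_frame_coeffs]


# Toy model (hypothetical values): 2 vertices, i.e. 6 coordinates, 2 basis directions.
MEAN = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
BASIS = [
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],   # moves both vertices along y
    [0.0, 0.0, 1.0, 0.0, 0.0, -1.0],  # moves the vertices apart along z
]

# In a real system these would be predicted per frame; here they are hand-picked.
FRAME_COEFFS = [[0.0, 0.0], [0.5, 0.0], [1.0, 2.0]]

SEQUENCE = reconstruct_sequence(MEAN, BASIS, FRAME_COEFFS)
```

With all coefficients at zero, the first frame reproduces the mean shape; later frames deform it along the basis directions, which is the sense in which a sequence of per-frame parameters describes facial motion over time.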

research
11/15/2016

Learning Detailed Face Reconstruction from a Single Image

Reconstructing the detailed geometric structure of a face from a given i...
research
07/22/2022

Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos

The recent state of the art on monocular 3D face reconstruction from ima...
research
08/11/2021

SIDER: Single-Image Neural Optimization for Facial Geometric Detail Recovery

We present SIDER (Single-Image neural optimization for facial geometric D...
research
11/20/2022

Audio-visual video face hallucination with frequency supervision and cross modality support by speech based lip reading loss

Recently, there have been numerous breakthroughs in face hallucination ta...
research
10/25/2019

Self-supervised Learning of Detailed 3D Face Reconstruction

In this paper, we present an end-to-end learning framework for detailed ...
research
09/13/2018

Video to Fully Automatic 3D Hair Model

Imagine taking a selfie video with your mobile phone and getting as outp...
research
04/16/2021

Multimodal Deception Detection in Videos via Analyzing Emotional State-based Feature

Deception detection is an important task that has been a hot research to...
