MARLIN: Masked Autoencoder for facial video Representation LearnINg

11/12/2022
by Zhixi Cai, et al.

This paper proposes a self-supervised approach to learn universal facial representations from videos that can transfer across a variety of facial analysis tasks such as Facial Attribute Recognition (FAR), Facial Expression Recognition (FER), DeepFake Detection (DFD), and Lip Synchronization (LS). Our proposed framework, named MARLIN, is a facial video masked autoencoder that learns highly robust and generic facial embeddings from abundantly available, non-annotated, web-crawled facial videos. As a challenging auxiliary task, MARLIN reconstructs the spatio-temporal details of the face from densely masked facial regions, which mainly include the eyes, nose, mouth, lips, and skin, to capture local and global aspects that in turn help in encoding generic and transferable features. Through a variety of experiments on diverse downstream tasks, we demonstrate that MARLIN is an excellent facial video encoder as well as feature extractor that performs consistently well across a variety of downstream tasks including FAR (1.13 [...]), [...] (2.64 [...] benchmark), LS (29.36 [...]), and [...] data regime. Our code and pre-trained models will be made public.
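The masked-reconstruction objective described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cube size (2 x 16 x 16), the 90% mask ratio, and the helper names `patchify`, `random_mask`, and `masked_mse` are assumptions made for illustration, and uniform random masking stands in for MARLIN's masking of facial regions. The "decoder" is a trivial placeholder so that the loss computation can run end to end.

```python
import numpy as np

def patchify(video, t=2, p=16):
    """Split a (T, H, W, C) clip into flattened t x p x p spatio-temporal cubes."""
    T, H, W, C = video.shape
    assert T % t == 0 and H % p == 0 and W % p == 0
    x = video.reshape(T // t, t, H // p, p, W // p, p, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)  # (nT, nH, nW, t, p, p, C)
    return x.reshape(-1, t * p * p * C)

def random_mask(num_tokens, mask_ratio=0.9, rng=None):
    """Boolean mask with True for hidden tokens (assumed ratio, uniform sampling)."""
    rng = np.random.default_rng() if rng is None else rng
    num_masked = int(round(num_tokens * mask_ratio))
    mask = np.zeros(num_tokens, dtype=bool)
    mask[rng.choice(num_tokens, size=num_masked, replace=False)] = True
    return mask

def masked_mse(target, prediction, mask):
    """Mean squared error computed over the masked tokens only."""
    diff = (prediction - target)[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
clip = rng.random((16, 224, 224, 3), dtype=np.float32)  # synthetic stand-in for a face clip
tokens = patchify(clip)                                  # one row per spatio-temporal cube
mask = random_mask(len(tokens), mask_ratio=0.9, rng=rng)
visible = tokens[~mask]                                  # the only tokens an encoder would see

# Placeholder "decoder": predict the mean visible token at every position.
prediction = np.broadcast_to(visible.mean(axis=0), tokens.shape)
loss = masked_mse(tokens, prediction, mask)
```

In a real model, the placeholder prediction would be replaced by an encoder over `visible` and a decoder that fills in the masked positions; only the loss over masked tokens drives training.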


