Semi-supervised Facial Action Unit Intensity Estimation with Contrastive Learning

11/03/2020
by Enrique Sanchez, et al.
This paper tackles the challenging problem of estimating the intensity of Facial Action Units (AUs) from few labeled images. Unlike previous works, our method does not require manual selection of key frames, and it produces state-of-the-art results with as little as 2% of annotated frames, which are chosen at random. To this end, we propose a semi-supervised learning approach in which a spatio-temporal model, combining a feature extractor and a temporal module, is learned in two stages. The first stage uses datasets of unlabeled videos to learn a strong spatio-temporal representation of facial behavior dynamics based on contrastive learning. To our knowledge, we are the first to build upon this framework for modeling facial behavior in an unsupervised manner. The second stage uses another dataset of randomly chosen labeled frames to train a regressor on top of our spatio-temporal model for estimating AU intensity. We show that, although backpropagation through time is applied only with respect to the output of the network at extremely sparse and randomly chosen labeled frames, our model can be trained effectively to estimate AU intensity accurately, thanks to the unsupervised pre-training of the first stage. We experimentally validate that our method outperforms existing methods when working with as little as 2% of randomly chosen data on both the DISFA and BP4D datasets, without the careful choice of labeled frames, a time-consuming task still required by previous approaches.
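The two training signals described above can be illustrated with a minimal sketch. The abstract does not specify the exact contrastive objective, so the example assumes an InfoNCE-style loss, a common choice for contrastive learning, for the first stage, and a mean squared error restricted to a random 2% of frames for the second stage. The function names `info_nce_loss`, `sample_labeled_frames`, and `sparse_mse`, the temperature value, and the use of plain Python lists in place of network outputs are all hypothetical and are not taken from the paper.

```python
import math
import random


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss (assumed objective, not the paper's).

    anchors[i] and positives[i] are embeddings of two views of the same
    clip; every other positives[j] in the batch serves as a negative.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denom - logits[i]
    return total / len(a)


def sample_labeled_frames(num_frames, fraction=0.02, seed=0):
    """Pick a random subset of frame indices to treat as labeled."""
    rng = random.Random(seed)
    k = max(1, round(num_frames * fraction))
    return sorted(rng.sample(range(num_frames), k))


def sparse_mse(predictions, labels, labeled_idx):
    """Mean squared error evaluated only at the labeled frame indices."""
    return sum((predictions[t] - labels[t]) ** 2 for t in labeled_idx) / len(labeled_idx)
```

In this sketch, the loss for the second stage ignores every frame outside `labeled_idx`, which mirrors the idea that gradients flow only from the sparse labeled outputs while the temporal model still processes the full sequence.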

research
03/19/2023

Spatio-Temporal AU Relational Graph Representation Learning For Facial Action Units Detection

This paper presents our Facial Action Units (AUs) recognition submission...
research
10/01/2021

SMATE: Semi-Supervised Spatio-Temporal Representation Learning on Multivariate Time Series

Learning from Multivariate Time Series (MTS) has attracted widespread at...
research
05/09/2018

Joint Action Unit localisation and intensity estimation through heatmap regression

This paper proposes a supervised learning approach to jointly perform fa...
research
04/04/2019

Inferring Dynamic Representations of Facial Actions from a Still Image

Facial actions are spatio-temporal signals by nature, and therefore thei...
research
04/13/2020

Unsupervised Facial Action Unit Intensity Estimation via Differentiable Optimization

The automatic intensity estimation of facial action units (AUs) from a s...
research
03/30/2022

Knowledge-Spreader: Learning Facial Action Unit Dynamics with Extremely Limited Labels

Recent studies on the automatic detection of facial action unit (AU) hav...
research
04/14/2020

A Transfer Learning approach to Heatmap Regression for Action Unit intensity estimation

Action Units (AUs) are geometrically-based atomic facial muscle movement...
