Knowledge-Spreader: Learning Facial Action Unit Dynamics with Extremely Limited Labels

03/30/2022
by   Xiaotian Li, et al.

Recent studies on the automatic detection of facial action units (AUs) have relied extensively on large-scale annotations. However, manual AU labeling is difficult, time-consuming, and costly. Most existing semi-supervised works ignore the informative cues from the temporal domain and depend heavily on densely annotated videos, making the learning process less efficient. To alleviate these problems, we propose a deep semi-supervised framework, Knowledge-Spreader (KS), which differs from conventional methods in two aspects. First, rather than only encoding human knowledge as constraints, KS also learns spatial-temporal AU correlation knowledge in order to strengthen its out-of-distribution generalization ability. Second, we realize KS by applying consistency regularization and pseudo-labeling in multiple student networks alternately and dynamically. It spreads the spatial knowledge from labeled frames to unlabeled data and completes the temporal information of partially labeled video clips. This design allows KS to learn AU dynamics from video clips with only one label allocated, which significantly reduces the annotation requirements. Extensive experiments demonstrate that the proposed KS achieves performance competitive with the state of the art when using only 2 labels on DISFA. In addition, we test it on our newly developed large-scale comprehensive emotion database, which contains a considerable number of samples across well-synchronized and aligned sensor modalities, easing the scarcity of annotations and identities in human affective computing. The new database will be released to the research community.
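The two generic ingredients named in the abstract, pseudo-labeling and consistency regularization, can be illustrated with a minimal sketch for multi-label AU prediction on a single unlabeled frame. This is not the authors' implementation: the function names, the confidence thresholds (0.9 and 0.1), the squared-difference consistency term, and the loss weight are all assumptions chosen for illustration, and the alternating multi-student schedule of KS is not modeled.

```python
import math

def pseudo_labels(probs, hi=0.9, lo=0.1):
    """Turn per-AU probabilities into hard labels.

    None marks AUs whose prediction is too uncertain to use as a target.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    return [1 if p >= hi else 0 if p <= lo else None for p in probs]

def masked_bce(probs, labels, eps=1e-7):
    """Binary cross-entropy averaged over AUs that carry a (pseudo-)label."""
    terms = []
    for p, y in zip(probs, labels):
        if y is None:
            continue  # skip AUs without a confident pseudo-label
        p = min(max(p, eps), 1 - eps)  # clamp to keep log() finite
        terms.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return sum(terms) / len(terms) if terms else 0.0

def consistency(probs_a, probs_b):
    """Mean squared difference between predictions on two views of a frame."""
    return sum((a - b) ** 2 for a, b in zip(probs_a, probs_b)) / len(probs_a)

def unlabeled_loss(target_probs, view_a_probs, view_b_probs, weight=1.0):
    """Hypothetical combined loss for one unlabeled frame.

    target_probs: predictions used to generate pseudo-labels (e.g. from
                  another network); view_a_probs / view_b_probs: predictions
                  of the network being trained on two augmented views.
    """
    targets = pseudo_labels(target_probs)
    return masked_bce(view_b_probs, targets) + weight * consistency(view_a_probs, view_b_probs)

if __name__ == "__main__":
    # Three AUs: confidently active, uncertain, confidently inactive.
    print(pseudo_labels([0.95, 0.5, 0.05]))
    print(unlabeled_loss([0.95, 0.5, 0.05], [0.9, 0.4, 0.1], [0.8, 0.6, 0.2]))
```

Masking uncertain AUs keeps low-confidence predictions from being reinforced as targets, while the consistency term still draws a training signal from every AU.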

