Facial Expression Recognition using Vanilla ViT backbones with MAE Pretraining

07/22/2022
by Jia Li, et al.

Humans convey emotions, voluntarily or involuntarily, through facial expressions. Automatically recognizing basic expressions (such as happiness, sadness, and neutral) from a facial image, i.e., facial expression recognition (FER), is extremely challenging and attracts much research interest. Large-scale datasets and powerful inference models have been proposed to address the problem. Though considerable progress has been made, most state-of-the-art methods, which employ convolutional neural networks (CNNs) or elaborately modified Vision Transformers (ViTs), depend heavily on upstream supervised pretraining. Transformers are displacing CNNs as the dominant architecture in more and more computer vision tasks, but they usually need much more training data because they have fewer inductive biases than CNNs. To explore whether a vanilla ViT without extra training samples from upstream tasks can achieve competitive accuracy, we use a plain ViT with MAE pretraining to perform the FER task. Specifically, we first pretrain the original ViT as a Masked Autoencoder (MAE) on a large facial expression dataset without expression labels. Then, we fine-tune the ViT on popular facial expression datasets with expression labels. The presented method is quite competitive, reaching 90.22% on RAF-DB and 61.73% on AffectNet, and can serve as a simple yet strong ViT-based baseline for FER studies.
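The abstract does not spell out how MAE pretraining hides image content, so the following is only a minimal sketch of the random patch masking step that MAE-style pretraining relies on. The helper name `random_masking` is hypothetical, and the 224x224 input, 16x16 patch size, and 75% mask ratio are assumptions taken from common ViT and MAE defaults, not values stated in the abstract.

```python
import random


def random_masking(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) lists, MAE-style.

    Hypothetical helper for illustration; not the authors' code.
    """
    rng = random.Random(seed)
    num_keep = int(num_patches * (1 - mask_ratio))
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices
    visible = sorted(order[:num_keep])  # patches the encoder would see
    masked = sorted(order[num_keep:])   # patches the decoder would reconstruct
    return visible, masked


# Assumed settings (not from the abstract): 224x224 image, 16x16 patches, 75% masked.
image_size, patch_size, mask_ratio = 224, 16, 0.75
num_patches = (image_size // patch_size) ** 2
visible, masked = random_masking(num_patches, mask_ratio, seed=0)
print(num_patches, len(visible), len(masked))
```

Under these assumptions, the encoder would process only the visible quarter of the patches during pretraining, while fine-tuning on labeled expression data would use all patches with no masking.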


research
07/07/2021

Learning Vision Transformer with Squeeze and Excitation for Facial Expression Recognition

As various databases of facial expressions have been made accessible ove...
research
03/01/2023

CLIPER: A Unified Vision-Language Framework for In-the-Wild Facial Expression Recognition

Facial expression recognition (FER) is an essential task for understandi...
research
12/09/2016

Facial Expression Recognition using Convolutional Neural Networks: State of the Art

The ability to recognize facial expressions automatically enables novel ...
research
07/20/2022

AU-Supervised Convolutional Vision Transformers for Synthetic Facial Expression Recognition

The paper describes our proposed methodology for the six basic expressio...
research
06/28/2022

Generating near-infrared facial expression datasets with dimensional affect labels

Facial expression analysis has long been an active research area of comp...
research
07/19/2021

Facial Expressions Recognition with Convolutional Neural Networks

Over the centuries, humans have developed and acquired a number of ways ...
research
05/17/2023

Facial Expression Recognition at the Edge: CPU vs GPU vs VPU vs TPU

Facial Expression Recognition (FER) plays an important role in human-com...
