POSTER V2: A simpler and stronger facial expression recognition network

by Jiawei Mao, et al.

Facial expression recognition (FER) plays an important role in a variety of real-world applications such as human-computer interaction. POSTER V1 achieves state-of-the-art (SOTA) performance in FER by effectively combining facial landmark and image features through a two-stream pyramid cross-fusion design. However, the architecture of POSTER V1 is complex, which makes it computationally expensive. To relieve this computational pressure, we propose POSTER V2 in this paper. It improves POSTER V1 in three directions: cross-fusion, two-stream design, and multi-scale feature extraction. For cross-fusion, we replace the vanilla cross-attention mechanism with a window-based cross-attention mechanism. In the two-stream design, we remove the image-to-landmark branch. For multi-scale feature extraction, POSTER V2 combines images with the landmarks' multi-scale features to replace POSTER V1's pyramid design. Extensive experiments on several standard datasets show that POSTER V2 achieves SOTA FER performance with the minimum computational cost. For example, POSTER V2 reaches 92.21% on RAF-DB, 67.49% on AffectNet (7 cls), and 63.77% on AffectNet (8 cls), using only 8.4G floating point operations (FLOPs) and 43.7M parameters (Param). This demonstrates the effectiveness of our improvements. The code and models are available at <>.
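To make the cross-fusion change concrete, the following is a minimal NumPy sketch of window-based cross-attention, not the authors' implementation. It assumes that queries come from landmark features and keys/values from image features (the direction that remains once the image-to-landmark branch is removed), that both feature maps share the same spatial size, and that there are no learned projections or multiple heads. All function names and shapes are hypothetical and chosen for illustration only.

```python
import numpy as np

def window_partition(x, w):
    """Split an (H, W, C) map into non-overlapping w x w windows.

    Returns an array of shape (num_windows, w*w, C).
    Assumes H and W are divisible by w.
    """
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def window_reverse(windows, w, H, W):
    """Inverse of window_partition: rebuild the (H, W, C) map."""
    C = windows.shape[-1]
    x = windows.reshape(H // w, W // w, w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def window_cross_attention(landmark_feat, image_feat, w):
    """Cross-attention restricted to local windows.

    Assumption (illustrative): landmark features act as queries,
    image features act as keys and values. Each query token only
    attends to the w*w image tokens in its own window.
    """
    H, W, C = image_feat.shape
    q = window_partition(landmark_feat, w)          # (nW, w*w, C)
    k = window_partition(image_feat, w)             # (nW, w*w, C)
    v = k                                           # no learned projections here
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(C)  # (nW, w*w, w*w)
    attn = softmax(scores, axis=-1)
    out = attn @ v                                  # (nW, w*w, C)
    return window_reverse(out, w, H, W)

# Usage with random stand-in features (shapes are arbitrary).
rng = np.random.default_rng(0)
landmark = rng.normal(size=(8, 8, 4))
image = rng.normal(size=(8, 8, 4))
fused = window_cross_attention(landmark, image, w=4)
print(fused.shape)  # (8, 8, 4)
```

In this sketch, the attention score matrix has N·w² entries for N tokens, rather than the N² entries of vanilla cross-attention over the whole map, which is the general reason windowed attention lowers computational cost. When a single window covers the entire map, the function reduces to vanilla cross-attention.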




Distract Your Attention: Multi-head Cross Attention Network for Facial Expression Recognition

We present a novel facial expression recognition network, called Distrac...

ARBEx: Attentive Feature Extraction with Reliability Balancing for Robust Facial Expression Learning

In this paper, we introduce a framework ARBEx, a novel attentive feature...

Facial Expression Recognition using Facial Landmark Detection and Feature Extraction on Neural Networks

The proposed framework in this paper has the primary objective of classi...

Face Trees for Expression Recognition

We propose an end-to-end architecture for facial expression recognition....

More comprehensive facial inversion for more effective expression recognition

Facial expression recognition (FER) plays a significant role in the ubiq...

Cross-scale Attention Guided Multi-instance Learning for Crohn's Disease Diagnosis with Pathological Images

Multi-instance learning (MIL) is widely used in the computer-aided inter...

Feature Extraction Matters More: Universal Deepfake Disruption through Attacking Ensemble Feature Extractors

Adversarial example is a rising way of protecting facial privacy securit...
