Local Multi-Head Channel Self-Attention for Facial Expression Recognition

11/14/2021
by Roberto Pecoraro, et al.

Since the Transformer architecture was introduced in 2017, there have been many attempts to bring the self-attention paradigm to the field of computer vision. In this paper we propose a novel self-attention module that can be easily integrated into virtually any convolutional neural network and that is specifically designed for computer vision: the LHC, Local (multi) Head Channel (self-attention). LHC is based on two main ideas. First, we argue that in computer vision the best way to leverage the self-attention paradigm is channel-wise application rather than the more widely explored spatial attention, and that convolution will not be replaced by attention modules the way recurrent networks were in NLP. Second, a local approach has the potential to overcome the limitations of convolution better than global attention does. With LHC-Net we achieved a new state of the art on the well-known FER2013 dataset, with significantly lower complexity and a smaller computational impact on the "host" architecture than the previous SOTA.
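The core idea can be illustrated with a minimal NumPy sketch of channel-wise self-attention: attention scores are computed between channels of a feature map rather than between spatial positions, so the attention matrix is C x C instead of (H*W) x (H*W). This is an illustrative single-head, global simplification, not the authors' LHC implementation; the projection shapes and random weights are assumptions for the sketch.

```python
import numpy as np

def channel_self_attention(x, rng):
    """Toy channel-wise self-attention.

    x   : feature map of shape (C, N), where N = H * W flattened spatial positions
    rng : numpy Generator used to draw illustrative (untrained) projection weights
    """
    C, N = x.shape
    d = N  # key/query dimension (simplified: same as the spatial size)

    # Random projections stand in for learned weights in this sketch.
    Wq = rng.standard_normal((N, d)) / np.sqrt(N)
    Wk = rng.standard_normal((N, d)) / np.sqrt(N)

    q = x @ Wq                      # (C, d) queries, one per channel
    k = x @ Wk                      # (C, d) keys, one per channel
    scores = q @ k.T / np.sqrt(d)   # (C, C) channel-to-channel affinities

    # Softmax over channels: each channel attends to all channels.
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)

    return attn @ x                 # (C, N) re-weighted mixture of channels

rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16))        # e.g. 8 channels, 4x4 spatial map
out = channel_self_attention(features, rng)    # same shape as the input: (8, 16)
```

Note the complexity argument in the abstract: because the attention matrix here scales with the number of channels rather than with the number of pixels, the module stays cheap even for large spatial resolutions, which is part of why a channel-wise design adds little cost to the host network.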


Related research:

- Searching for TrioNet: Combining Convolution with Local and Global Self-Attention (11/15/2021)
- Adaptive Attention Span in Computer Vision (04/18/2020)
- A Simple Approach to Image Tilt Correction with Self-Attention MobileNet for Smartphones (10/31/2021)
- FsaNet: Frequency Self-attention for Semantic Segmentation (11/28/2022)
- TransFER: Learning Relation-aware Facial Expression Representations with Transformers (08/25/2021)
- ARBEx: Attentive Feature Extraction with Reliability Balancing for Robust Facial Expression Learning (05/02/2023)
- ConvTransformer: A Convolutional Transformer Network for Video Frame Synthesis (11/20/2020)
