Fully-attentive and interpretable: vision and video vision transformers for pain detection

10/27/2022
by Giacomo Fiorentini, et al.

Pain is a serious and costly issue globally, but to be treated, it must first be detected. Vision transformers are a top-performing architecture in computer vision, yet there has been little research on their use for pain detection. In this paper, we propose the first fully-attentive automated pain detection pipeline that achieves state-of-the-art performance on binary pain detection from facial expressions. The model is trained on the UNBC-McMaster dataset, after faces are 3D-registered and rotated to the canonical frontal view. In our experiments, we identify important areas of the hyperparameter space and their interaction with vision and video vision transformers, obtaining three noteworthy models. We analyse the attention maps of one of our models, finding reasonable interpretations for its predictions. We also evaluate Mixup, an augmentation technique, and Sharpness-Aware Minimization, an optimizer, with no success. Our presented models, ViT-1 (F1 score 0.55 ± 0.15), ViViT-1 (F1 score 0.55 ± 0.13), and ViViT-2 (F1 score 0.49 ± 0.04), all outperform earlier works, showing the potential of vision transformers for pain detection. Code is available at https://github.com/IPDTFE/ViT-McMaster
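The results above are reported as F1 scores for binary pain detection. As a reminder of what that metric measures, the following is a minimal sketch of a binary F1 computation in plain Python. The function name `binary_f1` and the toy labels are hypothetical and chosen only for illustration; they are not taken from the paper or its repository, and the paper's own evaluation protocol may differ.

```python
def binary_f1(y_true, y_pred, positive=1):
    """Return the F1 score of the positive class (here: 'pain').

    F1 is the harmonic mean of precision and recall, which simplifies to
    2*TP / (2*TP + FP + FN).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels (1 = pain, 0 = no pain), not data from the paper.
y_true = [1, 0, 0, 1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 1, 0, 0, 0]

score = binary_f1(y_true, y_pred)
print(f"F1 = {score:.2f}")  # prints "F1 = 0.67"
```

Because F1 ignores true negatives, it is a common choice when the positive class is the one of interest, although the abstract itself does not state why the metric was chosen.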


