Using Self-Supervised Feature Extractors with Attention for Automatic COVID-19 Detection from Speech

06/30/2021
by John Mendonça, et al.

The ComParE 2021 COVID-19 Speech Sub-challenge provides a test-bed for evaluating automatic detectors of COVID-19 from speech. Such models can be of value to health authorities by providing test triaging capabilities that work alongside traditional testing methods. Herein, we leverage pre-trained, problem-agnostic speech representations and evaluate their use for this task. We compare the results against a CNN architecture trained from scratch and against traditional frequency-domain representations. We also evaluate Self-Attention Pooling as an utterance-level information aggregation method. Experimental results demonstrate that models trained on features extracted from self-supervised models perform similarly to or better than fully-supervised models and models based on handcrafted features. Our best model improves the Unweighted Average Recall (UAR) from 69.0% to 72.3% on a development set composed only of full-band examples and achieves 64.4% on the test set. Furthermore, we study where the network is attending in an attempt to draw some conclusions regarding its explainability. In this relatively small dataset, we find that the network attends especially to vowels and aspirates.
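
As a rough illustration of two terms used in the abstract, the sketch below shows one common form of self-attention pooling (a softmax-weighted sum of frame-level features, with weights scored by a single learnable vector) and the Unweighted Average Recall metric (the mean of per-class recalls). This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the pooling formulation, and the function names, the vector `w`, and the toy inputs are all hypothetical.

```python
import numpy as np


def self_attention_pooling(frames, w):
    """Pool frame-level features (T, D) into one utterance vector (D,).

    `w` (D,) stands in for a learnable scoring vector; here it is simply
    passed in (assumption: a single-vector scoring function).
    Returns the pooled vector and the per-frame attention weights (T,).
    """
    scores = frames @ w                      # one scalar score per frame
    scores = scores - scores.max()           # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over frames
    pooled = weights @ frames                # weighted sum of the frames
    return pooled, weights


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(50, 8))        # toy input: 50 frames, 8 dims
    w = rng.normal(size=8)                   # toy scoring vector
    pooled, weights = self_attention_pooling(frames, w)
    print(pooled.shape, weights.sum())

    # A classifier that always predicts the majority class: recall is 1.0
    # for class 0 and 0.0 for class 1, so UAR is 0.5.
    print(unweighted_average_recall([0, 0, 0, 1], [0, 0, 0, 0]))
```

The attention weights returned alongside the pooled vector are the kind of quantity one would inspect to see which frames a network attends to. UAR is a natural choice when classes are imbalanced, since plain accuracy would reward always predicting the majority class.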

research
07/20/2022

End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge

In this paper, we present end-to-end and speech embedding based systems ...
research
10/13/2021

EIHW-MTG DiCOVA 2021 Challenge System Report

This paper aims to automatically detect COVID-19 patients by analysing t...
research
10/18/2021

EIHW-MTG: Second DiCOVA Challenge System Report

This work presents an outer product-based approach to fuse the embedded ...
research
04/22/2022

FAIR4Cov: Fused Audio Instance and Representation for COVID-19 Detection

Audio-based classification techniques on body sounds have long been stud...
research
03/22/2023

Self-supervised Learning with Speech Modulation Dropout

We show that training a multi-headed self-attention-based deep network t...
research
06/01/2023

Stuttering Detection Using Speaker Representations and Self-supervised Contextual Embeddings

The adoption of advanced deep learning architectures in stuttering detec...
research
11/09/2021

Membership Inference Attacks Against Self-supervised Speech Models

Recently, adapting the idea of self-supervised learning (SSL) on continu...
