Dysfluencies Seldom Come Alone – Detection as a Multi-Label Problem

10/28/2022
by Sebastian P. Bayerl, et al.

Specially adapted speech recognition models are necessary to handle stuttered speech. For these to be used in a targeted manner, stuttered speech must be reliably detected. Recent work has treated stuttering detection either as a multi-class classification problem or as a set of isolated tasks, one per dysfluency type. Neither view captures the nature of stuttering, in which a dysfluency seldom comes alone but co-occurs with others. This work explores an approach based on a modified wav2vec 2.0 system for end-to-end stuttering detection and classification as a multi-label problem. The method is evaluated on combinations of three datasets containing English and German stuttered speech, yielding state-of-the-art results for stuttering detection on the SEP-28k-Extended dataset. The experimental results provide evidence that the learned features transfer and that the method generalizes across datasets and languages.
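The difference between the multi-class and multi-label framings can be made concrete with a small sketch. This is not the authors' wav2vec 2.0 system; it only contrasts the two output conventions on made-up numbers. The label names, the logits, and the 0.5 threshold are assumptions chosen for illustration, since the abstract does not list the classes or the decision rule.

```python
import math

# Illustrative label set (an assumption; the abstract does not name the classes).
LABELS = ["block", "prolongation", "sound_repetition", "word_repetition", "interjection"]


def sigmoid(x):
    """Logistic function, applied to each label's logit independently."""
    return 1.0 / (1.0 + math.exp(-x))


def multi_hot(active):
    """Encode a set of co-occurring dysfluency types as a 0/1 target vector."""
    return [1 if name in active else 0 for name in LABELS]


def predict_multi_label(logits, threshold=0.5):
    """Multi-label decision: every label whose probability passes the threshold fires."""
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


def predict_multi_class(logits):
    """Multi-class decision: argmax returns exactly one label per clip."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return LABELS[best]


def bce_loss(logits, target):
    """Mean binary cross-entropy over labels, a common multi-label training loss."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, target):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)


# Hypothetical model outputs for one clip containing a block and an interjection.
logits = [2.0, -1.5, -3.0, -2.0, 1.0]
target = multi_hot({"block", "interjection"})

print(predict_multi_label(logits))  # both co-occurring types can be reported
print(predict_multi_class(logits))  # only the single highest-scoring type
print(round(bce_loss(logits, target), 4))
```

With a multi-hot target, a clip that contains two dysfluency types is represented faithfully, whereas the argmax rule has to discard one of them.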


