The Influence of Dataset Partitioning on Dysfluency Detection Systems

06/07/2022
by Sebastian P. Bayerl, et al.
This paper empirically investigates how different data splits and splitting strategies influence the performance of dysfluency detection systems. To detect dysfluencies, we run experiments with wav2vec 2.0 models with a classification head, as well as with support vector machines (SVMs) trained on features extracted from the wav2vec 2.0 model. We train and evaluate the systems on different non-speaker-exclusive and speaker-exclusive splits of the Stuttering Events in Podcasts (SEP-28k) dataset to shed light on how much results vary with the partitioning method used. Furthermore, we show that the SEP-28k dataset is dominated by only a few speakers, which makes it difficult to evaluate. To remedy this problem, we created SEP-28k-Extended (SEP-28k-E), which contains semi-automatically generated speaker and gender information for the SEP-28k corpus, and we suggest different data splits, each useful for evaluating a different aspect of dysfluency detection methods.
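To make the distinction between speaker-exclusive and non-speaker-exclusive splits concrete, the following is a minimal sketch of a speaker-exclusive partition in plain Python. It is not the authors' splitting procedure: the function name `speaker_exclusive_split`, its parameters, and the toy speaker labels are all assumptions introduced for illustration. The idea is simply that whole speakers, not individual clips, are assigned to the test set, so no speaker is heard in both training and evaluation.

```python
import random
from collections import Counter


def speaker_exclusive_split(speaker_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no speaker appears in both.

    Hypothetical helper for illustration; speaker_ids holds one label per clip.
    """
    speakers = sorted(set(speaker_ids))
    rng = random.Random(seed)
    rng.shuffle(speakers)

    counts = Counter(speaker_ids)
    target = test_fraction * len(speaker_ids)

    # Move whole speakers into the test set until the clip target is reached.
    # With a dominant speaker, the realised test size can overshoot the target.
    test_speakers, n_test = set(), 0
    for spk in speakers:
        if n_test >= target:
            break
        test_speakers.add(spk)
        n_test += counts[spk]

    train_idx = [i for i, s in enumerate(speaker_ids) if s not in test_speakers]
    test_idx = [i for i, s in enumerate(speaker_ids) if s in test_speakers]
    return train_idx, test_idx


# Made-up labels: one speaker contributes most clips, mimicking an imbalanced corpus.
clip_speakers = ["spk_a"] * 6 + ["spk_b"] * 2 + ["spk_c"] + ["spk_d"]

train_idx, test_idx = speaker_exclusive_split(clip_speakers, test_fraction=0.3, seed=1)
train_speakers = {clip_speakers[i] for i in train_idx}
test_speakers = {clip_speakers[i] for i in test_idx}

print("clips per speaker:", Counter(clip_speakers).most_common())
print("train speakers:", sorted(train_speakers))
print("test speakers:", sorted(test_speakers))
```

A non-speaker-exclusive split would instead shuffle the clip indices directly, which lets clips from the same speaker land on both sides. When a few speakers dominate the data, as the abstract reports for SEP-28k, the size and composition of a speaker-exclusive test set depend heavily on which speakers are drawn.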


