Optimal Transport-based Adaptation in Dysarthric Speech Tasks

04/06/2021
by Rosanna Turrisi, et al.

In many real-world applications, a mismatch between the distributions of the training data (source) and the test data (target) significantly degrades the performance of machine learning algorithms. In speech data, causes of this mismatch include differing acoustic environments and speaker characteristics. In this paper, we address this issue in the challenging context of dysarthric speech through multi-source domain/speaker adaptation (MSDA/MSSA). Specifically, we propose an optimal-transport-based approach called MSDA via Weighted Joint Distribution Optimal Transport (MSDA-WJDOT). We confront the mismatch problem in dysarthria detection, where the proposed approach outperforms both the Baseline and the state-of-the-art MSDA models, improving the detection accuracy by 0.9% [...] dysarthric speaker adaptation in command speech recognition. This provides a Command Error Rate relative reduction of 16% [...] best competitor model, respectively. Interestingly, MSDA-WJDOT provides a similarity score between the source and the target, i.e., between speakers in this case. We leverage this similarity measure to define a Dysarthric and Healthy score of the target speaker and diagnose dysarthria with an accuracy of 95%.
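The abstract does not say how the Dysarthric and Healthy scores are computed from the similarity measure. As a minimal sketch of one plausible reading, assume the method yields one non-negative similarity weight per source speaker and that each score is the share of total weight assigned to source speakers of that class. The function names, the normalisation, and the decision rule below are hypothetical illustrations, not the authors' implementation.

```python
from typing import Dict, Sequence


def speaker_scores(weights: Sequence[float],
                   is_dysarthric: Sequence[bool]) -> Dict[str, float]:
    """Aggregate per-source similarity weights into two class scores.

    weights[i] is the (assumed) similarity between source speaker i and
    the target speaker; is_dysarthric[i] is that source speaker's label.
    """
    if len(weights) != len(is_dysarthric):
        raise ValueError("weights and labels must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Normalise so the weights sum to 1 (hypothetical convention).
    normalised = [w / total for w in weights]
    dysarthric = sum(w for w, d in zip(normalised, is_dysarthric) if d)
    return {"dysarthric": dysarthric, "healthy": 1.0 - dysarthric}


def diagnose(scores: Dict[str, float]) -> str:
    """Hypothetical decision rule: pick the class with the larger score."""
    return "dysarthric" if scores["dysarthric"] > scores["healthy"] else "healthy"


# Toy weights for three source speakers; these are not data from the paper.
example = speaker_scores([0.5, 0.3, 0.2], [True, False, True])
print(example, diagnose(example))
```

Under these assumptions, a target speaker whose weight concentrates on dysarthric source speakers receives a high Dysarthric score, which is the intuition behind using speaker similarity for diagnosis.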

research
03/14/2022

Interpretable Dysarthric Speaker Adaptation based on Optimal-Transport

This work addresses the mismatch problem between the distribution of tra...
research
10/16/2021

A Unified Speaker Adaptation Approach for ASR

Transformer models have been used in automatic speech recognition (ASR) ...
research
09/20/2019

CDOT: Continuous Domain Adaptation using Optimal Transport

In this work, we address the scenario in which the target domain is cont...
research
05/29/2023

ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation

There are significant challenges for speaker adaptation in text-to-speec...
research
08/11/2020

Why Did the x-Vector System Miss a Target Speaker? Impact of Acoustic Mismatch Upon Target Score on VoxCeleb Data

Modern automatic speaker verification (ASV) relies heavily on machine le...
research
12/04/2018

Domain Mismatch Robust Acoustic Scene Classification using Channel Information Conversion

In a recent acoustic scene classification (ASC) research field, training...
research
02/25/2019

Channel adversarial training for cross-channel text-independent speaker recognition

The conventional speaker recognition frameworks (e.g., the i-vector and ...