Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments

11/16/2022
by Dominik Wagner, et al.

We analyze the impact of speaker adaptation in end-to-end architectures based on transformers and wav2vec 2.0 under different noise conditions. We demonstrate that the proven method of concatenating speaker vectors to the acoustic features and supplying them as an auxiliary model input remains a viable option to increase the robustness of end-to-end architectures. By including speaker embeddings obtained from x-vector and ECAPA-TDNN models, we achieve relative word error rate improvements of up to 9.6% on LibriSpeech and up to 14.5% on Switchboard. The effect on transformer-based architectures is approximately inversely proportional to the signal-to-noise ratio (SNR) and is strongest in heavily noised environments (SNR=0). In systems based on wav2vec 2.0, the most substantial benefit of speaker adaptation is achieved under moderate noise conditions (SNR≥18). We also find that x-vectors tend to yield larger improvements than ECAPA-TDNN embeddings.
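The three quantities the abstract relies on can be illustrated with a minimal sketch: appending one utterance-level speaker vector to every acoustic frame, mixing noise into speech at a target SNR, and computing a relative word error rate improvement. This is not the authors' implementation. The function names, the plain-list data layout, the toy dimensions, and the reading of SNR as a value in dB are all assumptions made for illustration; a real system would operate on tensors and take its embeddings from a trained x-vector or ECAPA-TDNN model.

```python
import math


def append_speaker_embedding(features, speaker_embedding):
    """Concatenate the same speaker vector to every acoustic frame.

    features: list of frames, each a list of floats (T x D).
    speaker_embedding: one utterance-level vector (length E).
    Returns a list of T frames, each of length D + E.
    """
    return [list(frame) + list(speaker_embedding) for frame in features]


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the mixture has the requested SNR (assumed in dB)."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise signal has zero power")
    # SNR_dB = 10 * log10(P_speech / (scale^2 * P_noise)), solved for scale.
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


def relative_wer_improvement(baseline_wer, adapted_wer):
    """Relative WER improvement in percent, measured against the baseline."""
    return (baseline_wer - adapted_wer) / baseline_wer * 100.0


# Toy usage: 3 frames of 2-dimensional features, 3-dimensional speaker vector.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
speaker_vector = [1.0, 2.0, 3.0]
augmented = append_speaker_embedding(frames, speaker_vector)
print(len(augmented), len(augmented[0]))
```

Because the speaker vector is constant across the utterance, the model input grows from D to D + E dimensions per frame while the number of frames stays the same.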

research
10/23/2021

A Study of Acoustic Features in Arabic Speaker Identification under Noisy Environmental Conditions

One of the major parts of the voice recognition field is the choice of a...
research
08/07/2020

Investigation of Speaker-adaptation methods in Transformer based ASR

End-to-end models are fast replacing conventional hybrid models in autom...
research
01/04/2019

Speaker Adaptation for End-to-End CTC Models

We propose two approaches for speaker adaptation in end-to-end (E2E) aut...
research
03/09/2022

An Environmental Feature Representation in I-vector Space for Room Verification and Metadata Estimation

This paper investigates the application of environmental feature represe...
research
10/20/2019

Deep speech inpainting of time-frequency masks

In particularly noisy environments, transient loud intrusions can comple...
research
06/14/2019

Cumulative Adaptation for BLSTM Acoustic Models

This paper addresses the robust speech recognition problem as an adaptat...
research
10/22/2020

Combination of Deep Speaker Embeddings for Diarisation

Recently, significant progress has been made in speaker diarisation afte...