Multi-Accent Adaptation based on Gate Mechanism

11/05/2020
by Han Zhu, et al.

When only a limited amount of accented speech data is available, the conventional way to improve multi-accent speech recognition is accent-specific adaptation, which adapts the baseline model to each target accent independently. To simplify the adaptation procedure, we explore adapting the baseline model to multiple target accents simultaneously using mixed multi-accent data. To this end, we propose an accent-specific top layer with gate mechanism (AST-G) to realize multi-accent adaptation. Compared with the baseline model and accent-specific adaptation, AST-G achieves 9.8 average relative WER reduction respectively. In real-world applications, however, the accent label is not available in advance at inference time. We therefore use an accent classifier to predict the accent label. To jointly train the acoustic model and the accent classifier, we propose multi-task learning with gate mechanism (MTL-G). Because the predicted accent label can be inaccurate, MTL-G performs worse than accent-specific adaptation. Yet, in comparison with the baseline model, MTL-G achieves 5.1
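The abstract does not spell out how the gate operates, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each accent has its own top (output) layer over a shared hidden representation, and that a gate vector weights the logits of those layers before the softmax. With a one-hot gate taken from a known accent label, this resembles the AST-G setting; with a soft gate taken from an accent classifier's output, it resembles the MTL-G setting. All names (`gated_output`, `random_layer`, the layer sizes, the classifier scores) are hypothetical and chosen for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def gated_output(hidden, top_layers, gate):
    """Weight the logits of each accent-specific top layer by its gate value.

    ASSUMPTION: the gate mixes logits before the softmax; the source does
    not state where the gate is applied.
    """
    n_out = len(top_layers[0][1])
    mixed = [0.0] * n_out
    for g, (weights, bias) in zip(gate, top_layers):
        logits = linear(weights, bias, hidden)
        mixed = [m + g * l for m, l in zip(mixed, logits)]
    return softmax(mixed)


def random_layer(rng, n_in, n_out):
    """Hypothetical stand-in for a trained accent-specific top layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, bias


rng = random.Random(0)
N_ACCENTS, N_HIDDEN, N_OUT = 3, 4, 5  # illustrative sizes, not from the paper
top_layers = [random_layer(rng, N_HIDDEN, N_OUT) for _ in range(N_ACCENTS)]
hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]  # shared-layer output

# Known accent label (AST-G-like): a one-hot gate selects a single top layer.
hard_gate = [0.0, 1.0, 0.0]
posterior_known = gated_output(hidden, top_layers, hard_gate)

# Predicted accent (MTL-G-like): the gate is a classifier's softmax output.
soft_gate = softmax([0.2, 1.5, -0.3])  # made-up classifier scores
posterior_predicted = gated_output(hidden, top_layers, soft_gate)

print(posterior_known)
print(posterior_predicted)
```

Under these assumptions, a one-hot gate reduces the model to the selected accent's top layer. A soft gate blends the layers, which is one way an inaccurate accent prediction could degrade the output relative to using the true label.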


research
02/23/2021

Senone-aware Adversarial Multi-task Training for Unsupervised Child to Adult Speech Adaptation

Acoustic modeling for child speech is challenging due to the high acoust...
research
04/21/2022

Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition

Accent variability has posed a huge challenge to automatic speech recogn...
research
01/04/2019

Speaker Adaptation for End-to-End CTC Models

We propose two approaches for speaker adaptation in end-to-end (E2E) aut...
research
02/07/2018

Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition

The performance of automatic speech recognition systems degrades with in...
research
07/07/2016

Sequence Training and Adaptation of Highway Deep Neural Networks

Highway deep neural network (HDNN) is a type of depth-gated feedforward ...
research
03/27/2018

A Multi-Discriminator CycleGAN for Unsupervised Non-Parallel Speech Domain Adaptation

Domain adaptation plays an important role for speech recognition models,...
research
02/16/2018

Interpreting DNN output layer activations: A strategy to cope with unseen data in speech recognition

Unseen data can degrade performance of deep neural net acoustic models. ...
