Efficient acoustic feature transformation in mismatched environments using a Guided-GAN

10/03/2022
by Walter Heymans, et al.

We propose a new framework to improve automatic speech recognition (ASR) systems in resource-scarce environments using a generative adversarial network (GAN) operating on acoustic input features. The GAN is used to enhance the features of mismatched data prior to decoding, or can optionally be used to fine-tune the acoustic model. We achieve improvements that are comparable to multi-style training (MTR), but at a lower computational cost. With less than one hour of data, an ASR system trained on good-quality data and evaluated on mismatched audio is improved by between 11.5 rate (WER). Experiments demonstrate that the framework can be very useful in under-resourced environments where training data and computational resources are limited. The GAN does not require parallel training data because it utilises a baseline acoustic model to provide an additional loss term that guides the generator to create acoustic features that are better classified by the baseline.
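The guiding idea in the abstract, a generator loss that combines the usual adversarial term with a classification loss from a frozen baseline acoustic model, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the non-saturating adversarial loss, the `guide_weight` parameter, and the assumption that per-frame target labels are available for the mismatched data are all hypothetical choices made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    return -math.log(softmax(logits)[target])

def adversarial_loss(disc_prob_real):
    """Non-saturating generator loss (an assumption): small when the
    discriminator believes the generated frame is real."""
    return -math.log(disc_prob_real)

def guided_generator_loss(disc_probs, am_logits, targets, guide_weight=1.0):
    """Hypothetical guided loss for a batch of generated feature frames.

    disc_probs: discriminator's probability that each generated frame is real
    am_logits:  frozen baseline acoustic model scores for each generated frame
    targets:    assumed per-frame class labels for the mismatched data
    """
    n = len(targets)
    adv = sum(adversarial_loss(p) for p in disc_probs) / n
    guide = sum(cross_entropy(l, t) for l, t in zip(am_logits, targets)) / n
    return adv + guide_weight * guide

# Toy batch: two frames, two classes. The discriminator output is identical
# in both cases, so only the guiding term differs.
well_classified = guided_generator_loss(
    [0.9, 0.8], [[5.0, 0.0], [0.0, 5.0]], [0, 1])
misclassified = guided_generator_loss(
    [0.9, 0.8], [[0.0, 5.0], [5.0, 0.0]], [0, 1])
print(well_classified < misclassified)
```

In this sketch the generator is rewarded both for fooling the discriminator and for producing frames that the baseline model classifies correctly, which is why no frame-aligned clean and mismatched pairs appear anywhere in the loss.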
