Quantifying and Maximizing the Benefits of Back-End Noise Adaption on Attention-Based Speech Recognition Models

05/03/2021
by Coleman Hooper, et al.

This work analyzes how attention-based Bidirectional Long Short-Term Memory (BLSTM) models adapt to noise-augmented speech. We identify the components that are crucial for noise adaptation in BLSTM models by freezing model components during fine-tuning. We first freeze larger model subnetworks and then, after identifying the encoder's importance for noise adaptation, pursue a fine-grained freezing approach within the encoder. The first encoder layer is shown to be crucial for noise adaptation, and its weights are shown to be more important than those of the other layers. When tested on a target noisy environment, fine-tuning on that environment from a model pretrained with noisy speech yields appreciable accuracy benefits relative to fine-tuning from a model pretrained with only clean speech. For this analysis, we built our own dataset augmentation tool, which is open-sourced to encourage future work on noise adaptation in ASR.
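
As a rough illustration of the freezing idea described above, the following is a minimal sketch in plain Python, not the authors' implementation. The parameter names (`encoder.layer0.weight` and so on), the scalar weights, the gradient values, and the helper `fine_tune_step` are all hypothetical and chosen only to show how frozen components are skipped during an update while the rest of the model adapts.

```python
def fine_tune_step(params, grads, frozen_prefixes, lr=0.5):
    """Apply one plain SGD step, leaving frozen components untouched.

    params, grads: dicts mapping (hypothetical) parameter names to floats.
    frozen_prefixes: name prefixes of the components to keep fixed.
    """
    updated = {}
    for name, value in params.items():
        if any(name.startswith(prefix) for prefix in frozen_prefixes):
            updated[name] = value  # frozen: keep the pretrained weight
        else:
            updated[name] = value - lr * grads[name]  # trainable: SGD update
    return updated


# Toy "pretrained" model with made-up component names and scalar weights.
params = {
    "encoder.layer0.weight": 1.0,
    "encoder.layer1.weight": 1.0,
    "attention.weight": 1.0,
    "decoder.weight": 1.0,
}
grads = {name: 0.5 for name in params}  # made-up gradients

# Coarse-grained freezing: hold the whole encoder fixed.
coarse = fine_tune_step(params, grads, frozen_prefixes=("encoder.",))

# Fine-grained freezing: hold only the first encoder layer fixed.
fine = fine_tune_step(params, grads, frozen_prefixes=("encoder.layer0.",))
```

Comparing downstream accuracy across such freezing configurations is what indicates which components matter for adaptation: if freezing a component hurts accuracy on the noisy target, that component was needed for adapting to it.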

research
09/14/2021

Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech

Automatic Speech Recognition (ASR) systems are often optimized to work b...
research
11/13/2018

An Online Attention-based Model for Speech Recognition

Attention-based end-to-end (E2E) speech recognition models such as Liste...
research
07/29/2022

Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge

This paper presents our efforts to build a robust ASR model for the shar...
research
05/17/2019

End-to-end Adaptation with Backpropagation through WFST for On-device Speech Recognition System

An on-device DNN-HMM speech recognition system efficiently works with a ...
research
04/02/2022

Speaker adaptation for Wav2vec2 based dysarthric ASR

Dysarthric speech recognition has posed major challenges due to lack of ...
research
07/17/2018

Learning Noise-Invariant Representations for Robust Speech Recognition

Despite rapid advances in speech recognition, current models remain brit...
research
08/09/2019

The role of cue enhancement and frequency fine-tuning in hearing impaired phone recognition

A speech-based hearing test is designed to identify the susceptible erro...
