Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition

11/17/2022
by Xurong Xie, et al.

Modeling speaker variability is a key challenge for automatic speech recognition (ASR) systems. In this paper, learning hidden unit contributions (LHUC) based adaptation techniques with compact speaker-dependent (SD) parameters are used to facilitate both speaker adaptive training (SAT) and unsupervised test-time speaker adaptation for end-to-end (E2E) lattice-free MMI (LF-MMI) models. An unsupervised model-based adaptation framework is proposed to estimate the SD parameters in the E2E paradigm using the LF-MMI and cross-entropy (CE) criteria. Various regularization methods for standard LHUC adaptation, e.g., Bayesian LHUC (BLHUC) adaptation, are systematically investigated on E2E LF-MMI CNN-TDNN and CNN-TDNN-BLSTM models to mitigate the risk of overfitting. Lattice-based confidence score estimation is used for adaptation data selection to reduce the supervision label uncertainty. Experiments on the 300-hour Switchboard task suggest that applying BLHUC in the proposed unsupervised E2E adaptation framework to byte pair encoding (BPE) based E2E LF-MMI systems consistently outperformed the baseline systems, with relative word error rate (WER) reductions of up to 10.5% [...] Hub5'00 and RT03 evaluation sets, and achieved the best performance in WERs of 9.0% [...] state-of-the-art adapted LF-MMI hybrid systems and adapted Conformer-based E2E systems.
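As a rough illustration of the two ideas the abstract combines, the sketch below shows LHUC-style scaling of hidden activations by per-speaker parameters, plus a confidence-based filter on adaptation utterances. It is not the authors' implementation: the amplitude function 2*sigmoid(r) is the form commonly used for LHUC, while the function names, the `confidence` field, and the 0.7 threshold are assumptions made only for this example.

```python
import math


def lhuc_scale(hidden, r):
    """Scale each hidden unit by 2 * sigmoid(r_i).

    `r` holds the speaker-dependent parameters, one per hidden unit.
    The resulting scale lies in (0, 2); r_i = 0 gives a scale of 1.
    """
    return [h * 2.0 / (1.0 + math.exp(-ri)) for h, ri in zip(hidden, r)]


def select_by_confidence(utterances, threshold=0.7):
    """Keep utterances whose confidence score reaches the threshold.

    The `confidence` key and the default threshold are hypothetical;
    the paper derives its scores from lattices.
    """
    return [u for u in utterances if u["confidence"] >= threshold]


hidden = [0.5, -1.0, 2.0]
r_init = [0.0, 0.0, 0.0]  # unadapted: every unit keeps its original value
print(lhuc_scale(hidden, r_init))

first_pass = [
    {"id": "utt1", "confidence": 0.92},
    {"id": "utt2", "confidence": 0.41},
]
print([u["id"] for u in select_by_confidence(first_pass)])
```

Initialising the speaker-dependent parameters at zero leaves the speaker-independent model unchanged, so adaptation starts from the baseline behaviour and only moves away from it as the parameters are estimated on the selected data.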

