Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition

01/06/2020
by Zhong Meng, et al.

Teacher-student (T/S) learning has been shown to be effective for domain adaptation of deep neural network acoustic models in hybrid speech recognition systems. In this work, we extend T/S learning to large-scale unsupervised domain adaptation of an attention-based end-to-end (E2E) model through two levels of knowledge transfer: the teacher's token posteriors as soft labels and its one-best predictions as decoder guidance. To further improve T/S learning with the help of ground-truth labels, we propose adaptive T/S (AT/S) learning. Instead of conditionally choosing between the teacher's soft token posteriors and the one-hot ground-truth label, in AT/S the student always learns from both the teacher and the ground truth, with a pair of adaptive weights assigned to the soft and one-hot labels that quantify the confidence in each knowledge source. The confidence scores are dynamically estimated at each decoder step as a function of the soft and one-hot labels. With 3,400 hours of parallel close-talk and far-field Microsoft Cortana data for domain adaptation, T/S and AT/S achieve 6.3 [...] trained with the same amount of far-field data.
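The adaptive weighting described above can be illustrated with a minimal sketch for a single decoder step. The abstract does not give the confidence function, so the choice here is purely an assumption for illustration: the teacher's weight is taken to be the probability the teacher assigns to the ground-truth token, and the ground-truth weight is its complement. The names `adaptive_target` and `cross_entropy` are hypothetical and do not come from the paper.

```python
import math


def adaptive_target(teacher_probs, gt_index):
    """Mix the teacher's soft label with the one-hot label for one decoder step.

    ASSUMPTION (not from the paper): the teacher's confidence is its own
    probability on the ground-truth token; the ground truth gets the rest.
    """
    w_teacher = teacher_probs[gt_index]
    w_truth = 1.0 - w_teacher
    # Weighted soft label from the teacher.
    target = [w_teacher * p for p in teacher_probs]
    # Weighted one-hot label: all of its mass sits on the ground-truth token.
    target[gt_index] += w_truth
    return target


def cross_entropy(target, student_probs, eps=1e-12):
    """Cross-entropy between the mixed target and the student's posteriors."""
    return -sum(t * math.log(max(s, eps)) for t, s in zip(target, student_probs))


# Toy three-token vocabulary; token 0 is the ground truth.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]
target = adaptive_target(teacher, gt_index=0)
loss = cross_entropy(target, student)

# A teacher that is wrong at this step contributes less to the target.
wrong_teacher_target = adaptive_target([0.1, 0.8, 0.1], gt_index=0)
```

Because the two weights sum to one, the mixed target remains a valid distribution. Under this assumed weighting, a teacher that puts little mass on the correct token is largely overridden by the one-hot label, whereas conditional selection would discard one of the two sources entirely.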


