
Distilling Knowledge Using Parallel Data for Far-field Speech Recognition

02/20/2018
by Jiangyan Yi, et al.

To improve the performance of far-field speech recognition, this paper proposes distilling knowledge from a close-talking model to a far-field model using parallel data. The close-talking model is called the teacher model, and the far-field model is called the student model. The student model is trained to imitate the output distributions of the teacher model. This constraint is realized by minimizing the Kullback-Leibler (KL) divergence between the output distributions of the student and teacher models. Experimental results on the AMI corpus show that the best student model achieves up to a 4.7% relative improvement over conventionally-trained baseline models.
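The distillation objective described in the abstract amounts to training the student on the teacher's soft targets computed from parallel utterances. Below is a minimal sketch of such a KL-divergence loss, assuming a PyTorch implementation; the model names, feature tensors, and temperature parameter are illustrative placeholders, not details taken from the paper.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    # Teacher posteriors act as soft targets; gradients do not flow
    # through the teacher (detach).
    teacher_probs = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as target,
    # giving KL(teacher || student) averaged over the batch of frames.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

# Hypothetical usage with parallel close-talking / far-field features:
# teacher_logits = teacher_model(close_talk_feats)   # frozen close-talking model
# student_logits = student_model(far_field_feats)    # trainable far-field model
# loss = distillation_loss(student_logits, teacher_logits)
# loss.backward()

In practice the teacher is trained on close-talking data first and then held fixed, so only the student's parameters are updated by this loss.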

Related research

06/26/2019 · Essence Knowledge Distillation for Speech Recognition
It is well known that a speech recognition system that combines multiple...

04/14/2018 · Developing Far-Field Speaker System Via Teacher-Student Learning
In this study, we develop the keyword spotting (KWS) and acoustic model ...

07/09/2019 · Teach an all-rounder with experts in different domains
In many automatic speech recognition (ASR) tasks, an ideal model has to ...

01/06/2020 · Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition
Teacher-student (T/S) has shown to be effective for domain adaptation of...

03/29/2021 · Shrinking Bigfoot: Reducing wav2vec 2.0 footprint
Wav2vec 2.0 is a state-of-the-art speech recognition model which maps sp...

11/06/2017 · Improved training for online end-to-end speech recognition systems
Achieving high accuracy with end-to-end speech recognizers requires care...