Exploiting Large-scale Teacher-Student Training for On-device Acoustic Models

06/11/2021
by Jing Liu, et al.

We present results from Alexa speech teams on semi-supervised learning (SSL) of acoustic models (AM) with experiments spanning over 3000 hours of GPU time, making our study one of the largest of its kind. We discuss SSL for AMs in a small footprint setting, showing that a smaller capacity model trained with 1 million hours of unsupervised data can outperform a baseline supervised system by 14.3% word error rate reduction (WERR). When increasing the supervised data to seven-fold, our gains diminish to 7.1% WERR; to improve SSL efficiency at larger supervised data regimes, we employ a step-wise distillation into a smaller model, obtaining a WERR of 14.4%. We then switch to SSL using larger student models in low data regimes; while learning efficiency with unsupervised data is higher, student models may outperform teacher models in such a setting. We develop a theoretical sketch to explain this behavior.
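For readers unfamiliar with the WERR figures quoted in the abstract, the following is a minimal sketch of how a relative word error rate reduction is commonly computed. The function name `werr` and the WER values used are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's exact evaluation setup may differ.

```python
def werr(baseline_wer: float, candidate_wer: float) -> float:
    """Relative word error rate reduction (WERR), in percent.

    Assumes the common definition:
        WERR = 100 * (WER_baseline - WER_candidate) / WER_baseline
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - candidate_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical WERs for illustration only (not results from the paper).
    supervised_baseline = 10.0  # WER (%) of a supervised baseline
    ssl_student = 8.5           # WER (%) of a student trained with SSL
    print(f"WERR: {werr(supervised_baseline, ssl_student):.1f}%")
```

Under this definition a positive WERR means the candidate model makes fewer word errors than the baseline, so the percentage is a relative improvement and not an absolute drop in WER.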


