Enhancing Quantised End-to-End ASR Models via Personalisation

09/17/2023
by Qiuming Zhao, et al.

Recent end-to-end automatic speech recognition (ASR) models have grown increasingly large, making them particularly challenging to deploy on resource-constrained devices. Model quantisation is an effective solution, but it sometimes causes the word error rate (WER) to increase. In this paper, a novel strategy of personalisation for a quantised model (PQM) is proposed, which combines speaker adaptive training (SAT) with model quantisation to improve the performance of heavily compressed models. Specifically, PQM uses a 4-bit NormalFloat Quantisation (NF4) approach for model quantisation and low-rank adaptation (LoRA) for SAT. Experiments have been performed on the LibriSpeech and the TED-LIUM 3 corpora. Remarkably, with a 7x reduction in model size and 1 [...] reductions were achieved on quantised Whisper and Conformer-based attention-based encoder-decoder ASR models respectively, compared to the original full-precision models.
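
The combination the abstract describes, a frozen 4-bit base weight plus a small low-rank update per speaker, can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the 16-level codebook is built from evenly spaced normal quantiles as a simplified stand-in for the actual NF4 table, and the names `make_codebook`, `quantise_block`, `dequantise_block` and `lora_linear` are hypothetical.

```python
from statistics import NormalDist


def make_codebook(levels=16):
    """Build 16 levels in [-1, 1] from normal quantiles.

    Simplified stand-in for NF4: the real NF4 table is constructed
    differently (for example, it contains an exact zero).
    """
    nd = NormalDist()
    qs = [nd.inv_cdf((i + 0.5) / levels) for i in range(levels)]
    m = max(abs(q) for q in qs)
    return [q / m for q in qs]


CODEBOOK = make_codebook()


def quantise_block(block):
    """Map a block of floats to 4-bit codebook indices plus one absmax scale."""
    scale = max(abs(w) for w in block) or 1.0  # avoid dividing by zero
    idx = [
        min(range(len(CODEBOOK)), key=lambda k: abs(CODEBOOK[k] - w / scale))
        for w in block
    ]
    return idx, scale


def dequantise_block(idx, scale):
    """Recover approximate floats from indices and the block scale."""
    return [CODEBOOK[k] * scale for k in idx]


def lora_linear(x, w_deq, a, b, alpha=1.0):
    """Compute y = W x + (alpha / r) * B (A x).

    w_deq: dequantised frozen weight, shape (out, in)
    a:     trainable low-rank factor, shape (r, in)
    b:     trainable low-rank factor, shape (out, r)
    """
    r = len(a)
    base = [sum(wi * xi for wi, xi in zip(row, x)) for row in w_deq]
    ax = [sum(ai * xi for ai, xi in zip(row, x)) for row in a]
    delta = [sum(bi * h for bi, h in zip(row, ax)) for row in b]
    return [y + (alpha / r) * d for y, d in zip(base, delta)]


# Usage: quantise each weight row once, then adapt only A and B per speaker.
w = [[0.12, -0.5, 0.33], [0.9, 0.01, -0.27]]
packed = [quantise_block(row) for row in w]
w_deq = [dequantise_block(idx, s) for idx, s in packed]
a = [[0.1, 0.0, -0.1]]   # rank r = 1
b = [[0.0], [0.0]]       # zero-initialised, so the update starts as a no-op
y = lora_linear([1.0, 2.0, 3.0], w_deq, a, b)
```

In this sketch only `a` and `b` would be stored per speaker, while the 4-bit indices and scales are shared. That mirrors the general idea of adapting a compressed model with few extra parameters, under the assumptions stated above.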


research
07/02/2021

Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition

Recently, attention-based encoder-decoder (AED) models have shown high p...
research
10/29/2018

Improved hybrid CTC-Attention model for speech recognition

Recently, end-to-end speech recognition with a hybrid model consisting o...
research
07/08/2019

ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning

End-to-end automatic speech recognition (ASR) models are increasingly la...
research
08/06/2020

Iterative Compression of End-to-End ASR Model using AutoML

Increasing demand for on-device Automatic Speech Recognition (ASR) syste...
research
09/13/2023

Can Whisper perform speech-based in-context learning

This paper investigates the in-context learning abilities of the Whisper...
research
09/01/2021

Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition

Contextual knowledge is important for real-world automatic speech recogn...
research
01/04/2019

Speaker Adaptation for End-to-End CTC Models

We propose two approaches for speaker adaptation in end-to-end (E2E) aut...
