A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization

07/24/2023
by   Edward Fish, et al.

Recent advances in Automatic Speech Recognition (ASR) have produced large AI models that are impractical to deploy on mobile devices. Model quantization is effective at producing compressed general-purpose models; however, such models may only be deployed to a restricted sub-domain of interest. We show that ASR models can be personalized during quantization while relying on just a small set of unlabelled samples from the target domain. To this end, we propose myQASR, a mixed-precision quantization method that generates tailored quantization schemes for diverse users under any memory requirement with no fine-tuning. myQASR automatically evaluates the quantization sensitivity of network layers by analysing the full-precision activation values. We are then able to generate a personalized mixed-precision quantization scheme for any pre-determined memory budget. Results for large-scale ASR models show how myQASR improves performance for specific genders, languages, and speakers.
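
The general idea of budget-constrained mixed-precision assignment can be illustrated with a minimal sketch. This is not the myQASR algorithm itself: the abstract does not specify the sensitivity statistic or the search procedure, so the median absolute activation used as a sensitivity proxy, the greedy lowering loop, the candidate bit-widths, and all function and layer names below are assumptions made purely for illustration.

```python
from statistics import median


def layer_sensitivity(activations):
    """Hypothetical sensitivity proxy: median absolute full-precision activation.

    The statistic is an assumption; the abstract only says that sensitivity
    is derived from full-precision activation values.
    """
    return median(abs(a) for a in activations)


def assign_bits(param_counts, sensitivities, budget_bits, choices=(8, 6, 4, 2)):
    """Greedily pick a bit-width per layer so the total size fits the budget.

    Every layer starts at the widest bit-width. In each round, layers are
    visited from least to most sensitive and lowered by one step, stopping
    as soon as the total size (in bits) fits within budget_bits.
    """
    widths = sorted(choices, reverse=True)
    level = {name: 0 for name in param_counts}  # index into widths
    order = sorted(param_counts, key=lambda n: sensitivities[n])

    def total_bits():
        return sum(param_counts[n] * widths[level[n]] for n in param_counts)

    for _ in range(len(widths) - 1):
        for name in order:
            if total_bits() <= budget_bits:
                return {n: widths[level[n]] for n in param_counts}
            level[name] += 1

    if total_bits() > budget_bits:
        raise ValueError("budget cannot be met even at the lowest bit-width")
    return {n: widths[level[n]] for n in param_counts}


# Toy example: three hypothetical layers with activations recorded on a few
# unlabelled samples from the target user.
param_counts = {"enc0": 100, "enc1": 100, "enc2": 100}
activations = {
    "enc0": [0.1, -0.2, 0.05],
    "enc1": [1.0, -2.0, 3.0],
    "enc2": [0.5, -0.5, 0.6],
}
sensitivities = {n: layer_sensitivity(a) for n, a in activations.items()}
scheme = assign_bits(param_counts, sensitivities, budget_bits=2000)
print(scheme)
```

Because the sensitivities are computed from a given user's unlabelled samples, a different user or a different memory budget would yield a different scheme without any retraining, which is the property the abstract emphasises.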



research
10/28/2020

INT8 Winograd Acceleration for Conv1D Equipped ASR Models Deployed on Mobile Devices

The intensive computation of Automatic Speech Recognition (ASR) models o...
research
03/29/2022

4-bit Conformer with Native Quantization Aware Training for Speech Recognition

Reducing the latency and model size has always been a significant resear...
research
08/12/2020

Leveraging Automated Mixed-Low-Precision Quantization for tiny edge microcontrollers

The severe on-chip memory limitations are currently preventing the deplo...
research
03/31/2021

Q-ASR: Integer-only Zero-shot Quantization for Efficient Speech Recognition

End-to-end neural network models achieve improved performance on various...
research
06/23/2022

Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus

State of the art time automatic speech recognition (ASR) systems are bec...
research
07/02/2023

Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning

Neural network quantization is a very promising solution in the field of...
research
06/15/2023

MobileASR: A resource-aware on-device personalisation framework for automatic speech recognition in mobile phones

We describe a comprehensive methodology for developing user-voice person...
