Iterative Compression of End-to-End ASR Model using AutoML

by Abhinav Mehrotra, et al.

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has renewed interest in automatic model compression techniques. Past research has shown that an AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.
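To make the compression figures concrete, the following is a minimal sketch of the parameter arithmetic behind low-rank factorization in general, not the authors' search procedure. It assumes the standard formulation in which an m x n weight matrix is replaced by the product of an m x r and an r x n matrix. The function names and the 1024 x 1024 layer shape are hypothetical and chosen only for illustration.

```python
import math


def lrf_param_count(m: int, n: int, rank: int) -> int:
    """Parameters after replacing an m x n matrix W with U (m x r) @ V (r x n)."""
    return rank * (m + n)


def compression_ratio(m: int, n: int, rank: int) -> float:
    """Original parameter count divided by the factorized count, for one layer."""
    return (m * n) / lrf_param_count(m, n, rank)


def max_rank_for_target(m: int, n: int, target: float) -> int:
    """Largest rank whose per-layer compression ratio is at least `target`.

    From m*n / (r*(m+n)) >= target it follows that r <= m*n / (target*(m+n)).
    Returns 0 if no positive rank reaches the target.
    """
    return math.floor((m * n) / (target * (m + n)))


if __name__ == "__main__":
    # Hypothetical layer shape, not taken from the paper.
    m, n = 1024, 1024
    rank = max_rank_for_target(m, n, target=5.0)
    print(f"rank={rank}, ratio={compression_ratio(m, n, rank):.2f}")
```

The sketch covers only the size side of the trade-off: a lower rank gives a smaller layer, while the effect on WER has to be measured, which is why rank selection is treated as a search problem.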



ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning

End-to-end automatic speech recognition (ASR) models are increasingly la...

MP3 Compression To Diminish Adversarial Noise in End-to-End Speech Recognition

Audio Adversarial Examples (AAE) represent specially created inputs mean...

HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks

Low-rank compression is an important model compression strategy for obta...

Enhancing Quantised End-to-End ASR Models via Personalisation

Recent end-to-end automatic speech recognition (ASR) models have become ...

Sparsification via Compressed Sensing for Automatic Speech Recognition

In order to achieve high accuracy for machine learning (ML) applications...

Extremely Low Footprint End-to-End ASR System for Smart Device

Recently, end-to-end (E2E) speech recognition has become popular, since ...

Performance Monitoring for End-to-End Speech Recognition

Measuring performance of an automatic speech recognition (ASR) system wi...
