Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR

05/18/2023
by   Hang Shao, et al.

Due to the rapid development of computing hardware and the dramatic growth of data, pre-trained speech recognition models such as Whisper have significantly improved the performance of speech recognition tasks. However, these models usually carry a high computational overhead, making them difficult to run effectively on resource-constrained devices. To speed up inference and reduce model size while maintaining performance, we propose a novel guided knowledge distillation and quantization method for the large pre-trained Whisper model. The student model selects which layers to distill and which layers to quantize based on the quantization loss and the distillation loss, respectively. We compressed Whisper_small to the Whisper_base and Whisper_tiny levels, making it 5.18x and 10.48x smaller, respectively. Moreover, the compressed models also achieve relative character error rate (CER) reductions over the original Whisper_base and Whisper_tiny (11.3% in the former comparison).
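As a rough illustration of this loss-guided layer selection, the minimal PyTorch sketch below computes a per-layer distillation loss against teacher hidden states and a per-layer fake-quantization loss on student weights, then uses each loss to guide which layers to quantize and which to distill. The fake-quantization scheme, the teacher-to-student layer mapping, and the selection rule are illustrative assumptions, not the paper's exact recipe.

# Hedged sketch of loss-guided layer selection for joint distillation and
# quantization, in the spirit of the abstract. Layer counts, the fake-quantize
# scheme, and the selection rule are illustrative assumptions.
import torch
import torch.nn.functional as F

def fake_quantize(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Symmetric uniform fake quantization of a weight tensor."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

def quantization_loss(layer_weights):
    """Per-layer MSE between full-precision and fake-quantized weights."""
    return torch.stack([F.mse_loss(w, fake_quantize(w)) for w in layer_weights])

def distillation_loss(student_hidden, teacher_hidden):
    """Per-student-layer MSE against the corresponding teacher hidden state,
    with teacher layers mapped uniformly onto student layers."""
    n_s, n_t = len(student_hidden), len(teacher_hidden)
    losses = []
    for i, h_s in enumerate(student_hidden):
        j = round(i * (n_t - 1) / max(n_s - 1, 1))
        losses.append(F.mse_loss(h_s, teacher_hidden[j]))
    return torch.stack(losses)

# Toy example: 6 teacher layers distilled into 4 student layers, hidden size 8.
torch.manual_seed(0)
teacher_hidden = [torch.randn(2, 10, 8) for _ in range(6)]   # (batch, time, dim)
student_hidden = [torch.randn(2, 10, 8) for _ in range(4)]
student_weights = [torch.randn(8, 8) for _ in range(4)]

q_loss = quantization_loss(student_weights)     # one value per student layer
d_loss = distillation_loss(student_hidden, teacher_hidden)

# Guided (cross) selection, illustrative only: quantize the layers whose
# distillation loss is lowest (they already match the teacher well), and
# focus distillation on the layers whose quantization loss is highest.
layers_to_quantize = torch.argsort(d_loss)[:2]
layers_to_distill = torch.argsort(q_loss, descending=True)[:2]

total_loss = d_loss.mean() + q_loss.mean()      # joint training objective
print("quantize layers:", layers_to_quantize.tolist())
print("distill layers:", layers_to_distill.tolist())
print("joint loss:", float(total_loss))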

Related research

09/18/2023
Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation
Much research effort is being applied to the task of compressing the kno...

01/30/2023
Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation
Large-scale pre-trained language models (PLMs) with powerful language mo...

03/16/2023
DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model
Wav2vec 2.0 (W2V2) has shown impressive performance in automatic speech ...

04/17/2019
Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation
Conventional automatic speech recognition (ASR) systems trained from fra...

03/24/2023
Mixed-Type Wafer Classification For Low Memory Devices Using Knowledge Distillation
Manufacturing wafers is an intricate task involving thousands of steps. ...

06/27/2023
Reducing the gap between streaming and non-streaming Transducer-based ASR by adaptive two-stage knowledge distillation
Transducer is one of the mainstream frameworks for streaming speech reco...

07/01/2019
Compression of Acoustic Event Detection Models With Quantized Distillation
Acoustic Event Detection (AED), aiming at detecting categories of events...
