Task-Agnostic Structured Pruning of Speech Representation Models

06/02/2023
by Haoyu Wang, et al.

Self-supervised pre-trained models such as Wav2vec 2.0, HuBERT, and WavLM have been shown to significantly improve many speech tasks. However, their large memory footprint and high computational cost hinder their industrial applicability. Structured pruning is a hardware-friendly model compression technique, but it usually incurs a larger accuracy loss than unstructured pruning. In this paper, we propose a fine-grained attention head pruning method to compensate for this performance degradation. In addition, we introduce the straight-through estimator into L0 regularization to further accelerate the pruned model. Experiments on the SUPERB benchmark show that our model achieves performance comparable to the dense model on multiple tasks and outperforms the Wav2vec 2.0 base model on average, with a 72% inference speedup.
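
The sketch below illustrates the second ingredient named in the abstract: an L0 sparsity penalty on attention heads, realized here with the hard-concrete relaxation of Louizos et al. (2018), combined with a straight-through estimator so the forward pass uses exact 0/1 gates. This is a minimal sketch, not the authors' released code; the class name, hyper-parameters, and per-head gating granularity are illustrative assumptions.

```python
# Minimal sketch (assumptions noted above): L0-regularized attention-head
# gates with a straight-through estimator, using the hard-concrete
# relaxation of Louizos et al. (2018).
import math
import torch
import torch.nn as nn

class HardConcreteHeadGate(nn.Module):
    """One learnable gate per attention head."""

    def __init__(self, num_heads, beta=2/3, gamma=-0.1, zeta=1.1):
        super().__init__()
        self.log_alpha = nn.Parameter(torch.zeros(num_heads))
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self):
        if self.training:
            # sample a stretched, rectified concrete variable per head
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid(
                (torch.log(u) - torch.log(1 - u) + self.log_alpha) / self.beta
            )
        else:
            s = torch.sigmoid(self.log_alpha)
        # stretch to (gamma, zeta) and clip to [0, 1]
        z = torch.clamp(s * (self.zeta - self.gamma) + self.gamma, 0.0, 1.0)
        # straight-through estimator: binarize in the forward pass so pruned
        # heads contribute exactly zero (and can be skipped at inference),
        # while gradients still flow through the soft gate z
        z_hard = (z > 0.5).float()
        return z_hard + z - z.detach()

    def expected_l0(self):
        # closed-form P(z != 0) under the hard-concrete distribution, summed
        # over heads; added to the task loss as the sparsity penalty
        return torch.sigmoid(
            self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)
        ).sum()
```

In such a setup, each layer's head outputs of shape (batch, heads, time, dim) would be multiplied by gate()[None, :, None, None], and the training objective becomes the task loss plus a weighted sum of expected_l0() over layers. Heads whose gates settle at zero can then be physically removed, which is what makes the pruning structured and hardware-friendly.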

