Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

06/04/2021
by Peiyu Liu, et al.

This paper presents a novel compression approach for pre-trained language models (PLMs) based on the matrix product operator (MPO) from quantum many-body physics. An MPO decomposes an original matrix into central tensors (containing the core information) and auxiliary tensors (holding only a small proportion of the parameters). Building on this decomposed structure, we propose a novel fine-tuning strategy that updates only the parameters of the auxiliary tensors, and we design an optimization algorithm for MPO-based approximation over stacked network architectures. Our approach applies generally to both original and already-compressed PLMs, yielding a lighter network and significantly reducing the number of parameters to be fine-tuned. Extensive experiments demonstrate the effectiveness of the proposed approach in model compression, especially in reducing the number of fine-tuned parameters (91% on average).
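The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the MPO is built by sequential SVD over an interleaved reshaping of the weight matrix, and it treats the largest local tensor as the central one. The function names `mpo_decompose` and `mpo_reconstruct`, the factor shapes, and the `max_bond` truncation parameter are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def mpo_decompose(W, in_factors, out_factors, max_bond=None):
    """Split W (I x J) into n local 4-way tensors via sequential SVD.

    in_factors / out_factors are factorizations of I and J (assumed inputs).
    max_bond optionally truncates each bond dimension (lossy if too small).
    """
    n = len(in_factors)
    assert W.shape == (int(np.prod(in_factors)), int(np.prod(out_factors)))
    # Reshape to (i1..in, j1..jn), then interleave axes to (i1, j1, ..., in, jn).
    T = W.reshape(*in_factors, *out_factors)
    T = T.transpose([ax for k in range(n) for ax in (k, n + k)])
    cores, bond = [], 1
    for k in range(n - 1):
        T = T.reshape(bond * in_factors[k] * out_factors[k], -1)
        U, S, Vh = np.linalg.svd(T, full_matrices=False)
        new_bond = len(S) if max_bond is None else min(max_bond, len(S))
        cores.append(U[:, :new_bond].reshape(bond, in_factors[k], out_factors[k], new_bond))
        # Carry the singular values forward into the remaining tensor.
        T = S[:new_bond, None] * Vh[:new_bond]
        bond = new_bond
    cores.append(T.reshape(bond, in_factors[-1], out_factors[-1], 1))
    return cores

def mpo_reconstruct(cores):
    """Contract the local tensors back into a dense matrix."""
    n = len(cores)
    T = cores[0]
    for core in cores[1:]:
        T = np.tensordot(T, core, axes=([-1], [0]))
    T = T.reshape(T.shape[1:-1])            # drop the two boundary bonds of size 1
    in_f, out_f = T.shape[0::2], T.shape[1::2]
    # Undo the interleaving: (i1, j1, ..., in, jn) -> (i1..in, j1..jn).
    T = T.transpose(list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2)))
    return T.reshape(int(np.prod(in_f)), int(np.prod(out_f)))

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
cores = mpo_decompose(W, (4, 4, 4), (4, 4, 4))
W_hat = mpo_reconstruct(cores)

sizes = [c.size for c in cores]
central = int(np.argmax(sizes))             # largest tensor plays the central role here
aux_params = sum(sizes) - sizes[central]    # parameters that would remain trainable
print("core shapes:", [c.shape for c in cores])
print("central index:", central, "auxiliary parameters:", aux_params)
print("exact reconstruction:", np.allclose(W, W_hat))
```

Without truncation the reconstruction is exact; passing a smaller `max_bond` shrinks the local tensors at the cost of approximation error. In the fine-tuning strategy the abstract describes, the central tensor would be frozen and only the auxiliary tensors updated, which is why the auxiliary parameter count is the quantity of interest in this sketch.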


research
03/02/2022

Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models

The state-of-the-art Mixture-of-Experts (short as MoE) architecture has ...
research
03/27/2023

Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture

In this paper, we propose a highly parameter-efficient approach to scali...
research
05/26/2023

Parameter-Efficient Fine-Tuning without Introducing New Latency

Parameter-efficient fine-tuning (PEFT) of pre-trained language models ha...
research
05/08/2023

HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation

To fully leverage the advantages of large-scale pre-trained language mod...
research
04/17/2023

Frequency Regularization: Restricting Information Redundancy of Convolutional Neural Networks

Convolutional neural networks have demonstrated impressive results in ma...
research
02/26/2022

An Improved Deep Learning Approach For Product Recognition on Racks in Retail Stores

Automated product recognition in retail stores is an important real-worl...
research
02/25/2023

Automated tuning for the parameters of linear solvers

Robust iterative methods for solving systems of linear algebraic equatio...
