On the Compression of Natural Language Models

12/13/2021
by   Saeed Damadi, et al.
0

Deep neural networks are effective feature extractors but they are prohibitively large for deployment scenarios. Due to the huge number of parameters, interpretability of parameters in different layers is not straight-forward. This is why neural networks are sometimes considered black boxes. Although simpler models are easier to explain, finding them is not easy. If found, a sparse network that can fit to a data from scratch would help to interpret parameters of a neural network. To this end, lottery ticket hypothesis states that typical dense neural networks contain a small sparse sub-network that can be trained to a reach similar test accuracy in an equal number of steps. The goal of this work is to assess whether such a trainable subnetwork exists for natural language models (NLM)s. To achieve this goal we will review state-of-the-art compression techniques such as quantization, knowledge distillation, and pruning.

READ FULL TEXT
research
12/30/2021

Automatic Mixed-Precision Quantization Search of BERT

Pre-trained language models such as BERT have shown remarkable effective...
research
03/09/2018

The Lottery Ticket Hypothesis: Finding Small, Trainable Neural Networks

Neural network compression techniques are able to reduce the parameter c...
research
01/21/2022

Can Model Compression Improve NLP Fairness

Model compression techniques are receiving increasing attention; however...
research
04/18/2021

Rethinking Network Pruning – under the Pre-train and Fine-tune Paradigm

Transformer-based pre-trained language models have significantly improve...
research
02/20/2018

Do Deep Learning Models Have Too Many Parameters? An Information Theory Viewpoint

Deep learning models often have more parameters than observations, and s...
research
07/03/2022

FasterAI: A Lightweight Library for Creating Sparse Neural Networks

FasterAI is a PyTorch-based library, aiming to facilitate the utilizatio...
research
06/16/2022

Not All Lotteries Are Made Equal

The Lottery Ticket Hypothesis (LTH) states that for a reasonably sized n...

Please sign up or login with your details

Forgot password? Click here to reset