Finding the Winning Ticket of BERT for Binary Text Classification via Adaptive Layer Truncation before Fine-tuning

11/22/2021
by Jing Fan, et al.

In light of the success of transferring pre-trained language models to NLP tasks, we ask whether the full BERT model is always the best choice, and whether there exists a simple yet effective method for finding the winning ticket in state-of-the-art deep neural networks without complex computation. We construct a series of BERT-based models of different sizes and compare their predictions on 8 binary classification tasks. The results show that smaller sub-networks do exist that outperform the full model. We then present a further study and propose a simple method to shrink BERT appropriately before fine-tuning. Extended experiments indicate that our method can substantially reduce training time and storage overhead with little or no loss in accuracy.
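The abstract does not spell out the adaptive truncation criterion, so the following is only a minimal sketch of the general idea of shrinking BERT by dropping encoder layers before fine-tuning. It assumes the Hugging Face transformers library, and the kept depth k=6 is a hypothetical choice, not the paper's adaptively selected value.

# Minimal sketch: truncate a pre-trained BERT to its first k encoder layers
# before fine-tuning. Assumes Hugging Face `transformers`; k=6 is hypothetical.
from transformers import BertForSequenceClassification

k = 6  # hypothetical number of encoder layers to keep

# Load the full 12-layer BERT with a binary classification head.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Keep only the first k encoder layers (slicing an nn.ModuleList
# returns a new nn.ModuleList) and update the config to match.
model.bert.encoder.layer = model.bert.encoder.layer[:k]
model.config.num_hidden_layers = k

# `model` is now a smaller sub-network, ready for standard fine-tuning.

The truncated model fine-tunes exactly like the full one, which is why selecting the depth before fine-tuning saves both training time and storage.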


Related research

02/24/2020 · Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation
Fine-tuning pre-trained language models like BERT has become an effectiv...

11/10/2019 · Improving BERT Fine-tuning with Embedding Normalization
Large pre-trained sentence encoders like BERT start a new chapter in nat...

10/12/2020 · Layer-wise Guided Training for BERT: Learning Incrementally Refined Document Representations
Although BERT is widely used by the NLP community, little is known about...

07/10/2021 · Noise Stability Regularization for Improving BERT Fine-tuning
Fine-tuning pre-trained language models such as BERT has become a common...

05/25/2020 · Pointwise Paraphrase Appraisal is Potentially Problematic
The prevailing approach for training and evaluating paraphrase identific...

04/23/2021 · Optimizing small BERTs trained for German NER
Currently, the most widespread neural network architecture for training ...

05/03/2021 · Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review
Technology-assisted review (TAR) refers to iterative active learning wor...
