Can We Use Probing to Better Understand Fine-tuning and Knowledge Distillation of the BERT NLU?

01/27/2023
by Jakub Hościłowicz, et al.

In this article, we use probing to investigate phenomena that occur during fine-tuning and knowledge distillation of a BERT-based natural language understanding (NLU) model. Our ultimate purpose was to use probing to better understand practical production problems and, consequently, to build better NLU models. We designed experiments to examine how fine-tuning changes the linguistic capabilities of BERT, what the optimal size of the fine-tuning dataset is, and how much information is retained in a distilled NLU based on a tiny Transformer. The results show that the probing paradigm in its current form is not well suited to answering such questions: Structural, Edge, and Conditional probes do not take into account how easy it is to decode the probed information. We therefore conclude that quantifying the decodability of information is critical for many practical applications of the probing paradigm.
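To make the probing paradigm referenced above concrete, the sketch below trains a small linear classifier ("probe") on frozen BERT representations to test whether some linguistic property is decodable from them. This is a minimal illustration, not the paper's Structural, Edge, or Conditional probes; the model name "bert-base-uncased" is real, but the [CLS]-based setup, the label count, and the `probe_step` helper are illustrative assumptions.

```python
# Minimal linear-probe sketch on frozen BERT representations (illustrative only).
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.eval()  # the encoder stays frozen; only the probe's weights are trained

num_labels = 5  # hypothetical number of linguistic classes being probed for
probe = nn.Linear(bert.config.hidden_size, num_labels)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def probe_step(sentence: str, label: int) -> float:
    """One training step: encode the sentence with frozen BERT,
    then fit the linear probe on the [CLS] representation."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():                       # no gradients flow into BERT
        hidden = bert(**inputs).last_hidden_state[:, 0, :]  # [CLS] vector
    logits = probe(hidden)
    loss = loss_fn(logits, torch.tensor([label]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A high probe accuracy is usually read as "the information is present in the representation"; the article's point is that this reading ignores how hard that information is to decode, which such a setup does not quantify.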

