BAM! Born-Again Multi-Task Networks for Natural Language Understanding

07/10/2019
by Kevin Clark, et al.

It can be challenging to train multi-task neural networks that outperform or even match their single-task counterparts. To help address this, we propose using knowledge distillation where single-task models teach a multi-task model. We enhance this training with teacher annealing, a novel method that gradually transitions the model from distillation to supervised learning, helping the multi-task model surpass its single-task teachers. We evaluate our approach by multi-task fine-tuning BERT on the GLUE benchmark. Our method consistently improves over standard single-task and multi-task training.
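To make the teacher-annealing idea concrete, below is a minimal PyTorch-style sketch (not the authors' released code): the training target is an interpolation between the single-task teacher's predicted distribution and the gold label, with the mixing weight annealed from 0 (pure distillation) toward 1 (pure supervised learning) as training progresses. The function name, the linear schedule, and the use of plain softmax targets are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def teacher_annealed_loss(student_logits, teacher_logits, gold_labels, progress):
        """Cross-entropy of the multi-task student against an annealed target.

        `progress` is the fraction of training completed (0.0 -> 1.0). Early in
        training the target is mostly the single-task teacher's distribution;
        late in training it is mostly the gold labels.
        """
        num_classes = student_logits.size(-1)
        teacher_probs = F.softmax(teacher_logits, dim=-1)
        gold_one_hot = F.one_hot(gold_labels, num_classes).float()

        # Annealing weight: assumed linear schedule from distillation to supervision.
        lam = progress
        target = lam * gold_one_hot + (1.0 - lam) * teacher_probs

        log_probs = F.log_softmax(student_logits, dim=-1)
        return -(target * log_probs).sum(dim=-1).mean()

    # Usage example: batch of 4, 3-way classification, 60% of the way through training.
    student_logits = torch.randn(4, 3)
    teacher_logits = torch.randn(4, 3)
    gold = torch.tensor([0, 2, 1, 0])
    loss = teacher_annealed_loss(student_logits, teacher_logits, gold, progress=0.6)

In this sketch the teacher's parameters are assumed frozen, so only the multi-task student receives gradients from the loss.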

Related research

04/20/2019 - Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
This paper explores the use of knowledge distillation to improve a Multi...

10/05/2020 - Lifelong Language Knowledge Distillation
It is challenging to perform lifelong language learning (LLL) on a strea...

07/12/2021 - A Flexible Multi-Task Model for BERT Serving
In this demonstration, we present an efficient BERT-based multi-task (MT...

08/12/2019 - Feature Partitioning for Efficient Multi-Task Architectures
Multi-task learning holds the promise of less data, parameters, and time...

07/12/2020 - HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections
Achieving state-of-the-art performance on natural language understanding...

04/21/2019 - Model Compression with Multi-Task Knowledge Distillation for Web-scale Question Answering System
Deep pre-training and fine-tuning models (like BERT, OpenAI GPT) have de...

03/24/2022 - Multitask Emotion Recognition Model with Knowledge Distillation and Task Discriminator
Due to the collection of big data and the development of deep learning, ...
