Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models

05/02/2023
by Junmo Kang, et al.

Fine-tuning large models is highly effective; however, inference with these models can be expensive and produce carbon emissions. Knowledge distillation has been shown to be a practical solution for reducing inference costs, but the distillation process itself requires significant computational resources. Rather than buying or renting GPUs to fine-tune and then distill a large model, an NLP practitioner who needs a compact model might instead choose to allocate an available budget to hiring annotators and manually labeling additional fine-tuning data. In this paper, we investigate how to most efficiently use a fixed budget to build a compact model. Through extensive experiments on six diverse NLP tasks, we find that distilling from T5-XXL (11B) to T5-Small (60M) is almost always a more cost-efficient option than annotating more data to directly train a compact model (T5-Small, 60M). We further demonstrate that the optimal amount of distillation that maximizes utility varies across different budgetary scenarios.
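To make the comparison concrete, the sketch below illustrates the kind of distillation pipeline the abstract describes: a large fine-tuned T5 teacher generates pseudo-labels for unlabeled task inputs, and a T5-Small student is fine-tuned on those pseudo-labels. This is a minimal illustration assuming the Hugging Face transformers and PyTorch stack; the model names, toy inputs, and hyperparameters are placeholders rather than the paper's exact setup (the paper's teacher is a fine-tuned T5-XXL, 11B).

```python
# Minimal sequence-level distillation sketch: a T5 teacher produces pseudo-labels,
# and a T5-Small student is fine-tuned on them. Assumes `transformers` and `torch`.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Teacher placeholder: the paper distills from a fine-tuned T5-XXL (11B);
# "t5-large" is used here only so the sketch runs on modest hardware.
teacher_name = "t5-large"
student_name = "t5-small"

tokenizer = T5Tokenizer.from_pretrained(teacher_name)
teacher = T5ForConditionalGeneration.from_pretrained(teacher_name).to(device).eval()
student = T5ForConditionalGeneration.from_pretrained(student_name).to(device)

# Illustrative unlabeled task inputs; in practice these come from the task's
# unlabeled pool, and their number is set by the available budget.
unlabeled_inputs = [
    "sst2 sentence: the film is a charming, funny ride.",
    "sst2 sentence: a dull, lifeless script sinks the whole picture.",
]

# Step 1: the teacher generates pseudo-labels for the unlabeled inputs.
with torch.no_grad():
    enc = tokenizer(unlabeled_inputs, return_tensors="pt", padding=True).to(device)
    pseudo_ids = teacher.generate(**enc, max_new_tokens=8)
pseudo_labels = tokenizer.batch_decode(pseudo_ids, skip_special_tokens=True)

# Step 2: fine-tune the student on (input, pseudo-label) pairs.
optimizer = torch.optim.AdamW(student.parameters(), lr=3e-4)
student.train()
for epoch in range(3):  # illustrative; real training uses many more steps
    inputs = tokenizer(unlabeled_inputs, return_tensors="pt", padding=True).to(device)
    targets = tokenizer(pseudo_labels, return_tensors="pt", padding=True).to(device)
    # Replace padding token ids with -100 so they are ignored by the loss.
    labels = targets.input_ids.masked_fill(
        targets.input_ids == tokenizer.pad_token_id, -100
    )
    loss = student(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In a setup like this, the size of the pseudo-labeled pool is the knob a practitioner can turn under a fixed budget, which corresponds to the "amount of distillation" the abstract says should vary across budgetary scenarios.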

Related research

05/27/2023
One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification
The application of speech self-supervised learning (SSL) models has achi...

09/02/2022
Petals: Collaborative Inference and Fine-tuning of Large Models
Many NLP tasks benefit from using large language models (LLMs) that ofte...

07/20/2021
Learning ULMFiT and Self-Distillation with Calibration for Medical Dialogue System
A medical dialogue system is essential for healthcare service as providi...

01/27/2023
Can We Use Probing to Better Understand Fine-tuning and Knowledge Distillation of the BERT NLU?
In this article, we use probing to investigate phenomena that occur duri...

05/21/2023
Understanding the Effect of Data Augmentation on Knowledge Distillation
Knowledge distillation (KD) requires sufficient data to transfer knowled...

06/08/2023
The economic trade-offs of large language models: A case study
Contacting customer service via chat is a common practice. Because emplo...

06/20/2018
Doubly Nested Network for Resource-Efficient Inference
We propose doubly nested network (DNNet) where all neurons represent thei...
