Is It Worth the (Environmental) Cost? Limited Evidence for the Benefits of Diachronic Continuous Training

10/13/2022
by   Giuseppe Attanasio, et al.

Language is constantly changing and evolving, causing language models to quickly become outdated, both factually and linguistically. Recent research proposes that we continuously update our models with new data. Continuous training allows us to teach language models about new events, new facts, and changing norms. However, continuous training also means continuous costs. We show that there is currently limited evidence for the benefits of continuous training, whether measured by downstream performance or weighed against its environmental cost. Our results show that continuous training does not significantly improve performance. While it is clear that, sooner or later, our language models need to be updated, it is unclear when this effort is worth the cost. We call for a critical reflection on when and how to use continuous training, and for more benchmarks to support this research direction.


research
07/20/2023

Dynamic Large Language Models on Blockchains

Training and deploying large language models requires a large amount ...
research
02/02/2022

Co-training Improves Prompt-based Learning for Large Language Models

We demonstrate that co-training (Blum & Mitchell, 1998) can improve th...
research
06/11/2021

Dynamic Language Models for Continuously Evolving Content

The content on the web is in a constant state of flux. New entities, iss...
research
12/17/2019

Analyzing Privacy Loss in Updates of Natural Language Models

To continuously improve quality and reflect changes in data, machine lea...
research
04/29/2022

TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models

Language Models (LMs) become outdated as the world changes; they often f...
research
02/24/2021

When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute

Large language models have become increasingly difficult to train becaus...
research
02/23/2023

Dynamic Benchmarking of Masked Language Models on Temporal Concept Drift with Multiple Views

Temporal concept drift refers to the problem of data changing over time....
