Language Modeling Is Compression

09/19/2023
by Grégoire Delétang, et al.

It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-supervised (language) models. Since these large language models exhibit impressive predictive capabilities, they are well-positioned to be strong compressors. In this work, we advocate for viewing the prediction problem through the lens of compression and evaluate the compression capabilities of large (foundation) models. We show that large language models are powerful general-purpose predictors and that the compression viewpoint provides novel insights into scaling laws, tokenization, and in-context learning. For example, Chinchilla 70B, while trained primarily on text, compresses ImageNet patches to 43.4% and LibriSpeech samples to 16.4% of their raw size, beating domain-specific compressors like PNG (58.5%) or FLAC (30.3%), respectively. Finally, we show that the prediction-compression equivalence allows us to use any compressor (like gzip) to build a conditional generative model.
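The equivalence the abstract invokes runs in both directions, and a tiny sketch makes it concrete. Direction one: an arithmetic coder driven by a predictive model spends about -log2 p(x_t | x_<t) bits per symbol, so a model's log-loss is its compressed size. Direction two: a compressor's output length behaves like a negative log-probability, so ranking continuations by compressed length yields a (crude) conditional generative model. The Python below is a minimal sketch of both directions under those assumptions, not the paper's implementation; compressor_generate and its greedy selection rule are illustrative choices of ours, and it uses zlib (DEFLATE, the algorithm behind gzip) from the standard library.

    import math
    import zlib

    def ideal_code_length_bits(probs):
        # Direction 1 (predictor -> compressor): under arithmetic coding,
        # a sequence whose symbols the model predicts with probabilities
        # p_1, ..., p_n costs about -sum_t log2(p_t) bits, i.e. the log-loss.
        return -sum(math.log2(p) for p in probs)

    def compressor_generate(context, candidates=range(97, 123), steps=12):
        # Direction 2 (compressor -> generator): greedily append whichever
        # candidate byte yields the shortest compressed context, i.e. the
        # highest implied conditional probability. (Illustrative helper,
        # not taken from the paper's code.)
        out = bytearray(context)
        for _ in range(steps):
            best = min(candidates,
                       key=lambda b: len(zlib.compress(bytes(out) + bytes([b]))))
            out.append(best)
        return bytes(out)

    print(ideal_code_length_bits([0.5, 0.25, 0.9]))  # ~3.15 bits
    print(compressor_generate(b"abcabcabcabc"))      # may extend the repeating pattern

Byte-granular compressed lengths are coarse: ties between candidates are common and DEFLATE's header overhead dominates on short inputs. The paper instead derives proper conditional probabilities via arithmetic coding, which is why gzip-based generation is possible but weak.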
