Current Limitations of Language Models: What You Need is Retrieval

09/15/2020
by Aran Komatsuzaki et al.

We classify and re-examine some of the current approaches to improving the performance-compute trade-off of language models, including (1) non-causal models (such as masked language models), (2) extension of batch length with efficient attention, (3) recurrence, (4) conditional computation, and (5) retrieval. We identify limitations that (1)-(4) suffer from. For example, (1) currently struggles with open-ended text generation where the output is only loosely constrained by the input, and with performing general textual tasks in the manner of GPT-2/3, because it requires a task-specific fine-tuning dataset. (2) and (3) do not improve the prediction of the first ~10^3 tokens. Scaling up model size (e.g., efficiently via (4)) still results in poor performance scaling for some tasks. We argue that (5) would resolve many of these limitations, since it can (a) reduce the amount of supervision needed and (b) efficiently extend the context over the entire training dataset and the entire past of the current sample. We speculate on how to modify MARGE to perform unsupervised causal modeling that achieves (b) with a jointly trained retriever.
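To make the retrieval idea in (5) concrete, here is a minimal, illustrative sketch: index past text chunks, retrieve the ones most similar to the recent context, and prepend them to the causal LM's input, thereby extending the effective context beyond the model's window. The names embed and RetrievalContext and the hash-based embedding are hypothetical stand-ins for illustration only; they are not the paper's proposed method, which speculates about adapting MARGE with a retriever trained jointly with the language model.

# Sketch of retrieval-extended causal language modeling (approach (5)).
# Assumes only numpy; `embed` is a toy stand-in for a learned retriever.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hash character trigrams into a fixed-size vector."""
    v = np.zeros(dim)
    for i in range(len(text) - 2):
        v[hash(text[i:i + 3]) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

class RetrievalContext:
    """Index text chunks and retrieve the top-k most similar to a query.
    Retrieved chunks can come from the whole training corpus or from the
    distant past of the current sample."""
    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.matrix = np.stack([embed(c) for c in self.chunks])

    def retrieve(self, query: str, k: int = 2):
        scores = self.matrix @ embed(query)      # cosine similarity (unit vectors)
        top = np.argsort(-scores)[:k]
        return [self.chunks[i] for i in top]

# Usage: condition next-token prediction on retrieved chunks instead of
# relying only on the limited recent context.
corpus = ["the cat sat on the mat",
          "retrieval extends context",
          "masked language models need fine-tuning"]
index = RetrievalContext(corpus)
recent_context = "extending context with retrieval"
augmented_context = " ".join(index.retrieve(recent_context)) + " " + recent_context
print(augmented_context)  # feed this to a causal LM in place of recent_context

In a full system the toy hash embedding would be replaced by a learned retriever trained jointly with the language model, which is the direction the paper speculates on for MARGE.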

