Small Language Models Improve Giants by Rewriting Their Outputs

05/22/2023
by Giorgos Vernikos, et al.

Large language models (LLMs) have demonstrated impressive few-shot learning capabilities, but they often underperform compared to fine-tuned models on challenging tasks. Furthermore, their large size and access that is often restricted to APIs make task-specific fine-tuning impractical. Moreover, LLMs are sensitive to different aspects of prompts (e.g., the selection and order of demonstrations) and can thus require time-consuming prompt engineering. In this light, we propose a method to correct LLM outputs without relying on their weights. First, we generate a pool of candidates by few-shot prompting an LLM. Second, we refine the LLM-generated outputs using a smaller model, the LM-corrector (LMCor), which is trained to rank, combine, and rewrite the candidates to produce the final target output. Our experiments demonstrate that even a small LMCor model (250M parameters) substantially improves the few-shot performance of LLMs (62B parameters) across diverse tasks. Moreover, the LMCor is robust to different prompts, thereby reducing the need for extensive prompt engineering. Finally, we show that the LMCor can be seamlessly integrated with different LLMs at inference time, serving as a plug-and-play module that improves their performance.

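The abstract outlines a two-stage pipeline: an LLM is few-shot prompted to produce a pool of candidate outputs, and a small corrector model then ranks, combines, and rewrites those candidates into the final output. The following minimal sketch illustrates that flow with Hugging Face transformers; the model names, prompt format, candidate-pool size, and candidate separator are placeholder assumptions rather than the paper's actual setup (which pairs a 62B LLM with a trained 250M LMCor):

from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModelForSeq2SeqLM

# Stage 1: few-shot prompt an LLM and sample a pool of candidate outputs.
# "gpt2-large" is a stand-in for a much larger few-shot-prompted LLM.
llm_tok = AutoTokenizer.from_pretrained("gpt2-large")
llm = AutoModelForCausalLM.from_pretrained("gpt2-large")

prompt = (
    "Translate English to German:\n"
    "English: I like tea.\nGerman: Ich mag Tee.\n"
    "English: The cat sleeps.\nGerman:"
)
enc = llm_tok(prompt, return_tensors="pt")
gen = llm.generate(
    **enc,
    do_sample=True,          # sampling yields a diverse candidate pool
    num_return_sequences=5,  # pool size (an illustrative choice, not the paper's value)
    max_new_tokens=32,
    pad_token_id=llm_tok.eos_token_id,
)
prompt_len = enc["input_ids"].shape[1]
candidates = [
    llm_tok.decode(ids[prompt_len:], skip_special_tokens=True).strip()
    for ids in gen
]

# Stage 2: a small seq2seq corrector reads the source plus all candidates and
# rewrites them into the final output. "t5-base" (~250M parameters) stands in
# for a trained LMCor checkpoint; the real corrector is fine-tuned to rank,
# combine, and rewrite the candidates.
cor_tok = AutoTokenizer.from_pretrained("t5-base")
corrector = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

cor_input = "source: The cat sleeps. candidates: " + " ; ".join(candidates)
cor_out = corrector.generate(**cor_tok(cor_input, return_tensors="pt"), max_new_tokens=32)
print(cor_tok.decode(cor_out[0], skip_special_tokens=True))

In practice the corrector would be trained on (source, candidates, reference) triples so that it learns when to copy a good candidate verbatim and when to combine or rewrite them.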