Shepherd: A Critic for Language Model Generation

08/08/2023
by Tianlu Wang, et al.

As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs. In this work, we introduce Shepherd, a language model specifically tuned to critique responses and suggest refinements, extending beyond the capabilities of an untuned model to identify diverse errors and provide suggestions to remedy them. At the core of our approach is a high-quality feedback dataset, which we curate from community feedback and human annotations. Even though Shepherd is small (7B parameters), its critiques are either equivalent or preferred to those from established models including ChatGPT. Using GPT-4 for evaluation, Shepherd reaches an average win-rate of 53-87% compared to competitive alternatives. In human evaluation, Shepherd strictly outperforms other models and on average closely ties with ChatGPT.


