Recursively Summarizing Books with Human Feedback

09/22/2021
by Jeff Wu et al.

A major challenge for scaling machine learning is training models to perform tasks that are very difficult or time-consuming for humans to evaluate. We present progress on this problem on the task of abstractive summarization of entire fiction novels. Our method combines learning from human feedback with recursive task decomposition: we use models trained on smaller parts of the task to assist humans in giving feedback on the broader task. We collect a large volume of demonstrations and comparisons from human labelers, and fine-tune GPT-3 using behavioral cloning and reward modeling to do summarization recursively. At inference time, the model first summarizes small sections of the book and then recursively summarizes these summaries to produce a summary of the entire book. Our human labelers are able to supervise and evaluate the models quickly, despite not having read the entire books themselves. Our resulting model generates sensible summaries of entire books, even matching the quality of human-written summaries in a few cases (∼5% of books). We achieve state-of-the-art results on the recent BookSum dataset for book-length summarization. A zero-shot question-answering model using these summaries achieves state-of-the-art results on the challenging NarrativeQA benchmark for answering questions about books and movie scripts. We release datasets of samples from our model.
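To make the recursive decomposition concrete, here is a minimal Python sketch of the inference-time procedure described in the abstract: split the book into sections, summarize each section, then recursively summarize the concatenated section summaries until a single book-level summary remains. The chunking scheme and the summarize stub are hypothetical placeholders, not the paper's actual model, prompts, or chunk sizes.

    # Minimal sketch of recursive summarization at inference time.
    # The summarize() stub below stands in for a call to a fine-tuned
    # summarization model; here it just truncates so the sketch runs end to end.

    def chunk(text: str, max_chars: int) -> list[str]:
        """Split text into contiguous sections of at most max_chars characters."""
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    def summarize(text: str) -> str:
        """Dummy stand-in for a model call that maps a passage to a shorter summary."""
        return text[:200]  # replace with a real summarization model call

    def recursive_summarize(text: str, max_chars: int = 8000) -> str:
        """Summarize small sections first, then recursively summarize the
        concatenation of those summaries until the text fits in one pass."""
        if len(text) <= max_chars:
            return summarize(text)
        section_summaries = [summarize(section) for section in chunk(text, max_chars)]
        return recursive_summarize("\n".join(section_summaries), max_chars)

In the paper, the model applied at each step is trained with human feedback (behavioral cloning and reward modeling) on the smaller subtasks, which is what lets labelers supervise the full-book task without reading entire books; the sketch above only illustrates the control flow of the recursion.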


Related research

- Self-critiquing models for assisting human evaluators (06/12/2022)
- Human-in-the-loop Abstractive Dialogue Summarization (12/19/2022)
- Learning to summarize from human feedback (09/02/2020)
- Benchmarking Large Language Models for News Summarization (01/31/2023)
- WebGPT: Browser-assisted question-answering with human feedback (12/17/2021)
- QuALITY: Question Answering with Long Input Texts, Yes! (12/16/2021)
- Time-Efficient Reward Learning via Visually Assisted Cluster Ranking (11/30/2022)
