Large Language Models Can Self-Improve

10/20/2022
by Jiaxin Huang, et al.

Large Language Models (LLMs) have achieved excellent performance on various tasks. However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand, can improve their reasoning abilities by self-thinking, without external inputs. In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets. We use a pre-trained LLM to generate "high-confidence" rationale-augmented answers for unlabeled questions using Chain-of-Thought prompting and self-consistency, and fine-tune the LLM on those self-generated solutions as target outputs. We show that our approach improves the general reasoning ability of a 540B-parameter LLM (e.g., from 74.4% to 82.1% on GSM8K and from 63.4% to 67.9% on ANLI-A3) without any ground truth label. We conduct ablation studies and show that fine-tuning on reasoning is critical for self-improvement.
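To make the described pipeline concrete, below is a minimal sketch of the self-consistency filtering step used to build the self-training data: sample several Chain-of-Thought rationales per unlabeled question, majority-vote the final answers, and keep only rationales that agree with a sufficiently confident majority. The sampler `sample_rationales`, the vote threshold, and the target formatting are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of self-consistency filtering for building self-training examples.
# `sample_rationales` is a hypothetical stand-in for an LLM sampling call that
# returns (chain-of-thought, final_answer) pairs for a question.

from collections import Counter
from typing import Callable, List, Tuple


def build_self_training_examples(
    questions: List[str],
    sample_rationales: Callable[[str, int], List[Tuple[str, str]]],
    num_samples: int = 32,
    confidence_threshold: float = 0.5,
) -> List[dict]:
    """Collect high-confidence (question, rationale + answer) pairs for fine-tuning."""
    examples = []
    for question in questions:
        # Draw multiple temperature-sampled rationales for the same question.
        samples = sample_rationales(question, num_samples)
        answers = [answer for _, answer in samples]

        # Self-consistency: majority vote over the sampled final answers.
        majority_answer, votes = Counter(answers).most_common(1)[0]
        confidence = votes / len(samples)
        if confidence < confidence_threshold:
            continue  # skip questions the model is not confident about

        # Keep only rationales whose final answer matches the majority vote.
        for rationale, answer in samples:
            if answer == majority_answer:
                examples.append({
                    "input": question,
                    "target": f"{rationale} The answer is {majority_answer}.",
                })
    return examples
```

The resulting `examples` list can then be fed to any standard supervised fine-tuning loop, with the self-generated rationale-plus-answer strings serving as the target outputs.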


research · 09/15/2023
Chain-of-Thought Reasoning is a Policy Improvement Operator
Large language models have astounded the world with fascinating new capa...

research · 05/09/2023
MoT: Pre-thinking and Recalling Enable ChatGPT to Self-Improve with Memory-of-Thoughts
Large Language Models have shown impressive abilities on various tasks. ...

research · 08/03/2023
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
Mathematical reasoning is a challenging task for large language models (...

research · 09/16/2023
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
Despite the power of Large Language Models (LLMs) like GPT-4, they still...

research · 07/11/2023
Self-consistency for open-ended generations
In this paper, we present a novel approach for improving the quality and...

research · 07/20/2023
Multi-Method Self-Training: Improving Code Generation With Text, And Vice Versa
Large Language Models have many methods for solving the same problem. Th...

research · 09/15/2023
Self-Consistent Narrative Prompts on Abductive Natural Language Inference
Abduction has long been seen as crucial for narrative comprehension and ...
