The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

05/23/2023
by Seungone Kim, et al.

Large Language Models (LLMs) have shown an enhanced ability to solve novel tasks by reasoning step-by-step, known as Chain-of-Thought (CoT) reasoning. How can we instill the same step-by-step reasoning capability on unseen tasks into LMs with fewer than 100B parameters? To address this question, we first introduce the CoT Collection, a new instruction-tuning dataset that augments 1.88 million CoT rationales across 1,060 tasks. We show that continually fine-tuning Flan-T5 (3B and 11B) with the CoT Collection enables the 3B and 11B LMs to perform CoT better on unseen tasks, leading to an improvement in the average zero-shot accuracy on 27 datasets of the BIG-Bench-Hard benchmark by +4.34 [...]. We also show that instruction tuning with CoT allows LMs to possess stronger few-shot learning capabilities, resulting in an improvement of +2.97 [...] on domain-specific tasks over Flan-T5 (3B and 11B), respectively. We make our CoT Collection data and our trained models publicly available at https://github.com/kaist-lklab/CoT-Collection.

research
09/03/2021

Finetuned Language Models Are Zero-Shot Learners

This paper explores a simple method for improving the zero-shot learning...
research
01/31/2023

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

We study the design decisions of publicly available instruction tuning m...
research
11/22/2022

Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks

Recently, there has been significant progress in teaching language model...
research
02/07/2023

Exploring the Benefits of Training Expert Language Models over Instruction Tuning

Recently, Language Models (LMs) instruction-tuned on multiple tasks, als...
research
12/20/2022

Large Language Models Are Reasoning Teachers

Language models (LMs) have demonstrated remarkable performance on downst...
research
10/06/2022

Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners

Meta-training, which fine-tunes the language model (LM) on various downs...
research
03/07/2023

CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification

Chain-of-thought (CoT) prompting enables large language models (LLMs) to...
