Instruction Tuning with GPT-4

04/06/2023
by Baolin Peng, et al.
Prior work has shown that finetuning large language models (LLMs) on machine-generated instruction-following data enables such models to achieve remarkable zero-shot capabilities on new tasks, with no human-written instructions needed. In this paper, we present the first attempt to use GPT-4 to generate instruction-following data for LLM finetuning. Our early experiments on instruction-tuned LLaMA models show that the 52K English and Chinese instruction-following examples generated by GPT-4 lead to zero-shot performance on new tasks superior to that obtained with instruction-following data generated by previous state-of-the-art models. We also collect feedback and comparison data from GPT-4 to enable comprehensive evaluation and reward model training. We make the data generated with GPT-4, as well as our codebase, publicly available.
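To make the setup concrete, the sketch below shows what a single machine-generated instruction-following training record and its finetuning prompt might look like, in the Alpaca style commonly used for this kind of data. The field names, the prompt template, and the example record are illustrative assumptions, not taken from the paper's released dataset.

```python
# Hypothetical sketch: one instruction-following record (Alpaca-style fields)
# and a template that turns it into a finetuning prompt. Field names and
# wording are assumptions for illustration.

def build_prompt(record: dict) -> str:
    """Format one instruction-following record into a training prompt."""
    header = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
    )
    if record.get("input"):
        # Records with extra context get an additional Input section.
        return (
            header
            + f"### Instruction:\n{record['instruction']}\n\n"
            + f"### Input:\n{record['input']}\n\n"
            + "### Response:\n"
        )
    return header + f"### Instruction:\n{record['instruction']}\n\n### Response:\n"

# Illustrative record; in the paper's setting, `output` would be generated
# by GPT-4 rather than written by a human.
example = {
    "instruction": "Give three tips for staying healthy.",
    "input": "",
    "output": "1. Eat a balanced diet. 2. Exercise regularly. 3. Get enough sleep.",
}

prompt = build_prompt(example)
# During finetuning, the model is trained to continue the prompt with `output`.
target = prompt + example["output"]
```

During supervised finetuning, the loss is typically computed only on the response tokens, so the model learns to produce `output` when conditioned on the formatted prompt.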

