Guiding Pretraining in Reinforcement Learning with Large Language Models

02/13/2023
by   Yuqing Du, et al.
0

Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function. Intrinsically motivated exploration methods address this limitation by rewarding agents for visiting novel states or transitions, but these methods offer limited benefits in large environments where most discovered novelty is irrelevant for downstream tasks. We describe a method that uses background knowledge from text corpora to shape exploration. This method, called ELLM (Exploring with LLMs) rewards an agent for achieving goals suggested by a language model prompted with a description of the agent's current state. By leveraging large-scale language model pretraining, ELLM guides agents toward human-meaningful and plausibly useful behaviors without requiring a human in the loop. We evaluate ELLM in the Crafter game environment and the Housekeep robotic simulator, showing that ELLM-trained agents have better coverage of common-sense behaviors during pretraining and usually match or improve performance on a range of downstream tasks.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/23/2023

Language Reward Modulation for Pretraining Reinforcement Learning

Using learned reward functions (LRFs) as a means to solve sparse-reward ...
research
10/06/2020

Pretrained Language Model Embryology: The Birth of ALBERT

While behaviors of pretrained language models (LMs) have been thoroughly...
research
10/16/2021

Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora

Pretrained language models (PTLMs) are typically learned over a large, s...
research
05/19/2023

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

A centerpiece of the ever-popular reinforcement learning from human feed...
research
03/25/2023

Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For Language Model Synonym-Aware Pretraining

The model's ability to understand synonymous expression is crucial in ma...
research
11/21/2017

Transferring Agent Behaviors from Videos via Motion GANs

A major bottleneck for developing general reinforcement learning agents ...
research
04/08/2022

Semantic Exploration from Language Abstractions and Pretrained Representations

Continuous first-person 3D environments pose unique exploration challeng...

Please sign up or login with your details

Forgot password? Click here to reset