Learning to Query Internet Text for Informing Reinforcement Learning Agents

05/25/2022
by Kolby Nottingham, et al.

Generalization to out-of-distribution tasks in reinforcement learning is a challenging problem. One successful approach improves generalization by conditioning policies on task or environment descriptions that provide information about the current transition or reward functions. Previously, these descriptions were often expressed as generated or crowdsourced text. In this work, we begin to tackle the problem of extracting useful information from natural language found in the wild (e.g., internet forums, documentation, and wikis). These natural, pre-existing sources are noisy and large, and they present novel challenges compared to the text used in previous approaches. We propose to address these challenges by training reinforcement learning agents to query these sources as a human would, and we experiment with how and when an agent should query. To address the how, we demonstrate that pretrained QA models perform well at executing zero-shot queries in our target domain. Using information retrieved by a QA model, we then train an agent to learn when it should execute queries. We show that our method correctly learns to execute queries to maximize reward in a reinforcement learning setting.
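
The "when to query" idea can be illustrated with a toy sketch. It is not the authors' environment, agent, or QA model, none of which are specified in the abstract. It assumes a hypothetical one-step task in which a hidden target item must be chosen, a `lookup` function that stands in for a QA model over text, a small fixed query cost, and tabular Q-learning with optimistic initial values. All names and constants below are illustrative assumptions.

```python
import random

N_ITEMS = 3          # hypothetical number of candidate items
QUERY = N_ITEMS      # index of the hypothetical "query" action
QUERY_COST = 0.1     # assumed small penalty for issuing a query


def lookup(target):
    """Stand-in for a QA model over text; here it just returns the hidden fact."""
    return target


def train(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning (undiscounted) over 'what the agent knows' states."""
    rng = random.Random(seed)
    # State 0: nothing known. State k + 1: the agent knows the target is item k.
    # Optimistic initial values (1.0) encourage trying every action early on.
    q = [[1.0] * (N_ITEMS + 1) for _ in range(N_ITEMS + 1)]
    for _ in range(episodes):
        target = rng.randrange(N_ITEMS)
        state = 0
        for _ in range(10):  # cap the number of steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(N_ITEMS + 1)
            else:
                action = max(range(N_ITEMS + 1), key=lambda a: q[state][a])
            if action == QUERY:
                # Querying costs a little and reveals the target.
                next_state = lookup(target) + 1
                td_target = -QUERY_COST + max(q[next_state])
                q[state][action] += alpha * (td_target - q[state][action])
                state = next_state
            else:
                # Choosing an item ends the episode with reward 1 if correct.
                reward = 1.0 if action == target else 0.0
                q[state][action] += alpha * (reward - q[state][action])
                break
    return q


def greedy(q, state):
    """Return the highest-valued action in a state (first index wins ties)."""
    return max(range(N_ITEMS + 1), key=lambda a: q[state][a])
```

In this sketch, guessing blindly is worth about 1/3 on average, while querying first is worth about 1 minus the query cost, so the learned greedy policy is expected to query when nothing is known and to choose the revealed item afterwards. Calling `greedy(train(), 0)` gives the action preferred in the uninformed state.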

research
06/17/2022

AnyMorph: Learning Transferable Polices By Inferring Agent Morphology

The prototypical approach to reinforcement learning involves training po...
research
05/12/2022

Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language

To solve difficult tasks, humans ask questions to acquire knowledge from...
research
02/21/2020

Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration

Autonomous reinforcement learning agents must be intrinsically motivated...
research
06/19/2023

LARG, Language-based Automatic Reward and Goal Generation

Goal-conditioned and Multi-Task Reinforcement Learning (GCRL and MTRL) a...
research
07/19/2018

Multitask Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies

We introduce a new RL problem where the agent is required to execute a g...
research
05/18/2023

Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models

Recently, reward-conditioned reinforcement learning (RCRL) has gained po...
research
05/23/2023

Video Prediction Models as Rewards for Reinforcement Learning

Specifying reward signals that allow agents to learn complex behaviors i...
