ELLA: Exploration through Learned Language Abstraction

03/10/2021
by Suvir Mirchandani, et al.

Building agents capable of understanding language instructions is critical to effective and robust human-AI collaboration. Recent work focuses on training these instruction-following agents via reinforcement learning in environments with synthetic language; however, these instructions often define long-horizon, sparse-reward tasks, and learning policies requires many episodes of experience. To address this, we introduce ELLA: Exploration through Learned Language Abstraction, a reward shaping approach that correlates high-level instructions with simpler low-level instructions to enrich the sparse rewards afforded by the environment. ELLA has two key elements: 1) a termination classifier that identifies when agents complete low-level instructions, and 2) a relevance classifier that correlates low-level instructions with success on high-level tasks. We learn the termination classifier offline from pairs of instructions and terminal states. Notably, departing from prior work in language and abstraction, we learn the relevance classifier online, without relying on an explicit decomposition of high-level instructions into low-level instructions. On a suite of complex grid world environments with varying instruction complexities and reward sparsity, ELLA shows a significant gain in sample efficiency across several environments compared to competitive language-based reward shaping and no-shaping methods.
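
The interplay of the two classifiers can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed bonus value, the once-per-episode bookkeeping, and the toy string-matching "classifiers" below are all assumptions made for illustration, standing in for the learned termination and relevance models described in the abstract.

```python
from typing import Callable, Iterable, Set


def shaped_reward(
    env_reward: float,
    state: dict,
    high_level: str,
    low_level_instructions: Iterable[str],
    is_terminated: Callable[[str, dict], bool],
    is_relevant: Callable[[str, str], bool],
    already_rewarded: Set[str],
    bonus: float = 0.1,  # hypothetical bonus size, not taken from the paper
) -> float:
    """Add a bonus for each newly completed, relevant low-level instruction."""
    total = env_reward
    for low in low_level_instructions:
        if low in already_rewarded:
            continue  # assumption: each low-level instruction pays out once per episode
        if is_terminated(low, state) and is_relevant(high_level, low):
            already_rewarded.add(low)
            total += bonus
    return total


# Toy stand-ins for the learned classifiers (hypothetical, string matching only).
def toy_terminated(low: str, state: dict) -> bool:
    target = state.get("adjacent_to")
    return target is not None and low == f"go to the {target}"


def toy_relevant(high: str, low: str) -> bool:
    # Treat a low-level instruction as relevant if its object appears in the task.
    return low.replace("go to the ", "") in high


rewarded: Set[str] = set()
lows = ["go to the red door", "go to the blue ball"]
state = {"adjacent_to": "red door"}
first = shaped_reward(0.0, state, "open the red door", lows,
                      toy_terminated, toy_relevant, rewarded)
second = shaped_reward(0.0, state, "open the red door", lows,
                       toy_terminated, toy_relevant, rewarded)
print(first, second)
```

In this sketch the agent receives the bonus the first time it reaches the red door and nothing on later steps, so the sparse environment reward is supplemented only by intermediate progress judged relevant to the high-level task.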
