Long-Context Language Decision Transformers and Exponential Tilt for Interactive Text Environments

02/10/2023
by Nicolas Gontier, et al.

Text-based game environments are challenging because agents must process long sequences of text, compose actions out of text, and learn from sparse rewards. We address these challenges with Long-Context Language Decision Transformers (LLDTs), a framework built on long-context transformer language models and decision transformers (DTs). LLDTs extend DTs with three components: (1) exponential tilt, which guides the agent toward high, obtainable goals; (2) novel goal-conditioning methods that yield significantly better results than the traditional return-to-go (the sum of all future rewards); and (3) a model of future observations. Our ablation results show that predicting future observations improves agent performance. To the best of our knowledge, LLDTs are the first to apply offline RL with DTs to these challenging games. Our experiments show that LLDTs achieve the highest scores among many different types of agents on some of the most challenging Jericho games, such as Enchanter.
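The abstract's two key quantities can be made concrete. Return-to-go at step t is the sum of rewards from t onward, and exponential tilt reweights a distribution over returns by exp(kappa * R) so that sampled goal returns skew toward high values that were actually observed. The sketch below illustrates both ideas in a minimal form; the function names and the `kappa` parameter are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def return_to_go(rewards):
    # Return-to-go at step t: sum of rewards from t to the end of the episode.
    return np.cumsum(np.asarray(rewards, dtype=float)[::-1])[::-1]

def tilted_goal_sample(returns, kappa, rng):
    # Exponential tilt: reweight empirical returns by exp(kappa * R),
    # biasing goal conditioning toward high but observed (obtainable) returns.
    returns = np.asarray(returns, dtype=float)
    w = np.exp(kappa * (returns - returns.max()))  # subtract max for numerical stability
    return rng.choice(returns, p=w / w.sum())

rewards = [0.0, 1.0, 0.0, 5.0]
print(return_to_go(rewards))  # [6. 6. 5. 5.]
```

With kappa = 0 the tilt reduces to uniform sampling over observed returns; as kappa grows, sampling concentrates on the best returns seen in the offline data.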

Related research

07/18/2021 - Pre-trained Language Models as Prior Knowledge for Playing Text-based Games
  Recently, text world games have been proposed to enable artificial agent...

10/13/2022 - Behavior Cloned Transformers are Neurosymbolic Reasoners
  In this work, we explore techniques for augmenting interactive agents wi...

05/30/2022 - Multi-Game Decision Transformers
  A longstanding goal of the field of AI is a strategy for compiling diver...

05/31/2022 - You Can't Count on Luck: Why Decision Transformers Fail in Stochastic Environments
  Recently, methods such as Decision Transformer that reduce reinforcement...

07/31/2023 - Learning to Model the World with Language
  To interact with humans in the world, agents need to understand the dive...

10/24/2022 - Dichotomy of Control: Separating What You Can Control from What You Cannot
  Future- or return-conditioned supervised learning is an emerging paradig...

09/13/2019 - Toward Automated Quest Generation in Text-Adventure Games
  Interactive fictions, or text-adventures, are games in which a player in...
