Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers

04/28/2022
by Micah Carroll, et al.

Randomly masking and predicting word tokens has been a successful approach to pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea applies naturally to sequential decision making: many well-studied tasks, such as behavior cloning, offline RL, inverse dynamics, and waypoint conditioning, correspond to different maskings over a sequence of states, actions, and returns. We introduce the FlexiBiT framework, which provides a unified way to specify models that can be trained on many different sequential decision-making tasks. We show that a single FlexiBiT model can carry out many of these tasks simultaneously, with performance similar to or better than that of specialized models. Additionally, we show that performance can be further improved by fine-tuning the general model on specific tasks of interest.
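
To make the correspondence between tasks and maskings concrete, the following is a minimal sketch, not the FlexiBiT implementation. The function name `task_mask`, the task labels, and the exact choice of which tokens are visible for each task are illustrative assumptions; the abstract states only that each task corresponds to some masking over states, actions, and returns. Here `True` marks a token shown to the model, and every other token is masked and left to be predicted.

```python
import numpy as np

def task_mask(task, T, t, waypoint=None):
    """Hypothetical helper: visibility masks over a length-T trajectory.

    True = token is given to the model as input; False = token is masked.
    `t` is the current timestep. The task definitions below are assumptions
    made for illustration, not the paper's exact masking schemes.
    """
    vis = {k: np.zeros(T, dtype=bool) for k in ("states", "actions", "returns")}
    if task == "behavior_cloning":
        # See states up to t and earlier actions; the action at t is masked.
        vis["states"][: t + 1] = True
        vis["actions"][:t] = True
    elif task == "offline_rl":
        # As above, but additionally condition on returns up to t.
        vis["states"][: t + 1] = True
        vis["actions"][:t] = True
        vis["returns"][: t + 1] = True
    elif task == "inverse_dynamics":
        # See two consecutive states; the action between them is masked.
        vis["states"][t : t + 2] = True
    elif task == "waypoint":
        # See the history plus one future state the agent should reach.
        if waypoint is None:
            raise ValueError("waypoint index required")
        vis["states"][: t + 1] = True
        vis["actions"][:t] = True
        vis["states"][waypoint] = True
    else:
        raise ValueError(f"unknown task: {task}")
    return vis

if __name__ == "__main__":
    for name in ("behavior_cloning", "offline_rl", "inverse_dynamics"):
        masks = task_mask(name, T=5, t=2)
        print(name, {k: v.astype(int).tolist() for k, v in masks.items()})
```

Under these assumptions, a single model trained to fill in masked tokens could be queried for any of the tasks by changing only the mask passed alongside the trajectory.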


