RePreM: Representation Pre-training with Masked Model for Reinforcement Learning

03/03/2023
by Yuanying Cai, et al.

Inspired by the recent success of sequence modeling in RL and the use of masked language models for pre-training, we propose a masked model for pre-training in RL, RePreM (Representation Pre-training with Masked Model), which trains an encoder combined with transformer blocks to predict masked states or actions in a trajectory. RePreM is simple yet effective compared with existing representation pre-training methods in RL. By relying on sequence modeling, it avoids algorithmic sophistication such as data augmentation or estimating multiple models, and it produces a representation that captures long-term dynamics well. Empirically, we demonstrate the effectiveness of RePreM on various tasks, including dynamics prediction, transfer learning, and sample-efficient RL with both value-based and actor-critic methods. Moreover, we show that RePreM scales well with dataset size, dataset quality, and the scale of the encoder, which indicates its potential toward big RL models.
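
To make the pre-training objective concrete, the following is a minimal PyTorch-style sketch of masked trajectory modeling in the spirit of the abstract: encode states and actions into tokens, mask a subset, and train a transformer to reconstruct the masked elements. The encoder architecture, token layout, masking ratio, and reconstruction loss here are illustrative assumptions, not the paper's exact implementation.

import torch
import torch.nn as nn


class MaskedTrajectoryModel(nn.Module):
    """Sketch of an encoder + transformer trained to predict masked states/actions."""

    def __init__(self, state_dim, action_dim, d_model=128, n_layers=4, n_heads=4):
        super().__init__()
        # Assumed simple linear encoders; the paper's encoder may be a CNN/MLP.
        self.state_encoder = nn.Linear(state_dim, d_model)
        self.action_encoder = nn.Linear(action_dim, d_model)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        self.state_head = nn.Linear(d_model, state_dim)    # reconstruct masked states
        self.action_head = nn.Linear(d_model, action_dim)  # reconstruct masked actions

    def forward(self, states, actions, mask):
        # states: (B, T, state_dim), actions: (B, T, action_dim)
        # mask:   (B, 2T) boolean, True where a token is masked out
        s = self.state_encoder(states)
        a = self.action_encoder(actions)
        # Interleave into one token sequence (s_0, a_0, s_1, a_1, ...).
        tokens = torch.stack([s, a], dim=2).flatten(1, 2)   # (B, 2T, d_model)
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token, tokens)
        h = self.transformer(tokens)
        return self.state_head(h[:, 0::2]), self.action_head(h[:, 1::2])


def masked_reconstruction_loss(model, states, actions, mask_ratio=0.15):
    """Randomly mask tokens and regress the originals; mask_ratio is an assumption."""
    B, T = states.shape[:2]
    mask = torch.rand(B, 2 * T) < mask_ratio
    pred_s, pred_a = model(states, actions, mask)
    loss_s = ((pred_s - states) ** 2)[mask[:, 0::2]].mean()
    loss_a = ((pred_a - actions) ** 2)[mask[:, 1::2]].mean()
    return loss_s + loss_a


# Example usage with dummy trajectory data:
# model = MaskedTrajectoryModel(state_dim=17, action_dim=6)
# loss = masked_reconstruction_loss(model, torch.randn(8, 50, 17), torch.randn(8, 50, 6))
# loss.backward()

The pre-trained state encoder can then be reused downstream (e.g., for transfer learning or as the representation for a value-based or actor-critic agent), which is the setting the abstract evaluates.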

