MDP Playground: Meta-Features in Reinforcement Learning

09/17/2019
by Raghu Rajan, et al.

Reinforcement Learning (RL) algorithms usually assume that their environment is a Markov Decision Process (MDP). They also do not try to identify specific features of environments that could help them perform better. Here, we present a few key meta-features of environments: delayed rewards, specific reward sequences, sparsity of rewards, and stochasticity of environments. These meta-features may violate the MDP assumptions, and adapting to them should help RL agents perform better. Because running RL algorithms on standard benchmarks is very time-consuming, we define a parameterised collection of fast-to-run toy benchmarks in OpenAI Gym by varying these meta-features. Despite their toy nature and low compute requirements, we show that these benchmarks present substantial difficulties to current RL algorithms. Furthermore, since we can generate environments with a desired value for each of the meta-features, we have fine-grained control over the environments' difficulty and also have the ground truth available for evaluating algorithms. We believe that devising algorithms that can detect such meta-features of environments and adapt to them will be key to creating robust RL algorithms that work across a variety of real-world problems.
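To make the idea of parameterised meta-features concrete, the following is a minimal sketch of a toy environment with three adjustable knobs: reward delay, reward sparsity, and transition noise. It is an illustration only and not the paper's implementation. The class name `ToyDelayedMDP`, the helper `rollout`, the parameter names, and the cyclic transition rule are all hypothetical. To keep it self-contained it uses only the Python standard library rather than OpenAI Gym, although `reset` and `step` loosely mirror that style.

```python
import random
from collections import deque


class ToyDelayedMDP:
    """Hypothetical toy environment (an assumption, not the paper's code)."""

    def __init__(self, n_states=8, n_actions=2, delay=0,
                 transition_noise=0.0, reward_density=0.25, seed=0):
        self.n_states = n_states
        self.n_actions = n_actions
        self.delay = delay                        # steps before a reward is delivered
        self.transition_noise = transition_noise  # probability of a random next state
        self.rng = random.Random(seed)
        # Sparsity: only a fraction of the states yield a reward when entered.
        n_rewarding = max(1, int(reward_density * n_states))
        self.rewarding = set(self.rng.sample(range(n_states), n_rewarding))
        self.reset()

    def reset(self):
        self.state = 0
        # Queue pre-filled with zeros so every reward arrives `delay` steps late.
        self.pending = deque([0.0] * self.delay)
        return self.state

    def step(self, action):
        # Stochasticity: with some probability, ignore the action entirely.
        if self.rng.random() < self.transition_noise:
            next_state = self.rng.randrange(self.n_states)
        else:
            # Assumed deterministic rule: move forward on a cycle of states.
            next_state = (self.state + action + 1) % self.n_states
        self.state = next_state
        # The reward earned now is queued; the oldest queued reward is emitted.
        self.pending.append(1.0 if next_state in self.rewarding else 0.0)
        reward = self.pending.popleft()
        return next_state, reward


def rollout(env, actions):
    """Run a fixed action sequence and return the list of observed rewards."""
    env.reset()
    return [env.step(a)[1] for a in actions]


if __name__ == "__main__":
    actions = [0, 1] * 10
    print(rollout(ToyDelayedMDP(delay=0), actions))
    print(rollout(ToyDelayedMDP(delay=3), actions))
```

With the same seed and no transition noise, the reward sequence for `delay=3` is the `delay=0` sequence shifted right by three steps. Because the generating parameters are known, the true delay is available as ground truth when evaluating whether an agent has adapted to it.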


