Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments

12/04/2020
by   Alessandro Sestini, et al.

Deep Reinforcement Learning achieves very good results in domains where reward functions can be manually engineered. At the same time, there is growing interest within the community in using games based on Procedural Content Generation (PCG) as benchmark environments, since this type of environment is well suited to studying overfitting and generalization of agents under domain shift. Inverse Reinforcement Learning (IRL) can instead extrapolate reward functions from expert demonstrations, with good results even on high-dimensional problems; however, there are no examples of applying these techniques to procedurally generated environments. This is mostly due to the number of demonstrations needed to find a good reward model. We propose a technique based on Adversarial Inverse Reinforcement Learning which can significantly decrease the need for expert demonstrations in PCG games. By using an environment with a limited set of initial seed levels, plus some modifications to stabilize training, we show that our approach, DE-AIRL, is demonstration-efficient and still able to extrapolate reward functions that generalize to the fully procedural domain. We demonstrate the effectiveness of our technique on two procedural environments, MiniGrid and DeepCrawl, for a variety of tasks.

research
10/24/2018

Inverse reinforcement learning for video games

Deep reinforcement learning achieves superhuman performance in a range o...
research
07/31/2021

Inverse Reinforcement Learning for Strategy Identification

In adversarial environments, one side could gain an advantage by identif...
research
05/21/2019

Stochastic Inverse Reinforcement Learning

Inverse reinforcement learning (IRL) is an ill-posed inverse problem sin...
research
04/12/2019

Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations

A critical flaw of existing inverse reinforcement learning (IRL) methods...
research
06/22/2018

Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning

Humans are able to understand and perform complex tasks by strategically...
research
07/07/2021

Learning Time-Invariant Reward Functions through Model-Based Inverse Reinforcement Learning

Inverse reinforcement learning is a paradigm motivated by the goal of le...
research
07/13/2021

A Hierarchical Bayesian model for Inverse RL in Partially-Controlled Environments

Robots learning from observations in the real world using inverse reinfo...
