X-MEN: Guaranteed XOR-Maximum Entropy Constrained Inverse Reinforcement Learning

03/22/2022
by Fan Ding, et al.

Inverse Reinforcement Learning (IRL) is a powerful way of learning from demonstrations. In this paper, we address IRL problems in which prior knowledge is available that optimal policies never violate certain constraints. Conventional approaches that ignore these constraints need many demonstrations to converge. We propose XOR-Maximum Entropy Constrained Inverse Reinforcement Learning (X-MEN), which is guaranteed to converge to the optimal policy at a linear rate w.r.t. the number of learning iterations. X-MEN embeds XOR-sampling, a provable sampling approach that transforms the #P-complete sampling problem into queries to NP oracles, into the framework of maximum entropy IRL. X-MEN also guarantees that the learned policy never generates trajectories that violate constraints. Empirical results in navigation demonstrate that X-MEN converges to the optimal policies faster than baseline approaches and always generates trajectories that satisfy multi-state combinatorial constraints.
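
To make the idea of constrained maximum entropy IRL concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy problem (4 states, length-3 trajectories, one forbidden state) and replaces XOR-sampling with exact enumeration of the constraint-satisfying trajectories, which is only feasible at this tiny scale. All names (`FORBIDDEN`, `features`, `fit`, and so on) are illustrative. The update rule is the standard maximum entropy IRL gradient: demonstration feature counts minus the model's expected feature counts.

```python
import itertools
import math

# Hypothetical toy problem, chosen only for illustration.
N_STATES = 4
HORIZON = 3
FORBIDDEN = 2  # assumed constraint: a valid trajectory never visits state 2


def features(traj):
    """State-visitation counts, one entry per state."""
    counts = [0.0] * N_STATES
    for s in traj:
        counts[s] += 1.0
    return counts


def feasible_trajectories():
    """Enumerate every trajectory that satisfies the constraint.

    Exact enumeration stands in for XOR-sampling here.
    """
    return [t for t in itertools.product(range(N_STATES), repeat=HORIZON)
            if FORBIDDEN not in t]


def trajectory_distribution(theta, trajs):
    """Maximum entropy distribution: P(t) proportional to exp(theta . f(t))."""
    scores = [sum(w * f for w, f in zip(theta, features(t))) for t in trajs]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]


def expected_features(theta, trajs):
    """Feature expectation under the constrained distribution."""
    probs = trajectory_distribution(theta, trajs)
    expected = [0.0] * N_STATES
    for p, t in zip(probs, trajs):
        for i, f in enumerate(features(t)):
            expected[i] += p * f
    return expected


def fit(demos, steps=200, lr=0.1):
    """Gradient ascent on the demonstration log-likelihood."""
    trajs = feasible_trajectories()
    demo_mean = [sum(features(d)[i] for d in demos) / len(demos)
                 for i in range(N_STATES)]
    theta = [0.0] * N_STATES
    for _ in range(steps):
        model_mean = expected_features(theta, trajs)
        theta = [w + lr * (d - m)
                 for w, d, m in zip(theta, demo_mean, model_mean)]
    return theta


if __name__ == "__main__":
    demos = [(0, 1, 3), (0, 3, 3)]  # made-up demonstrations
    print(fit(demos))
```

Because the distribution is normalized over feasible trajectories only, any trajectory drawn from it satisfies the constraint by construction. In this sketch that property comes from enumeration; the paper obtains it through XOR-sampling.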
