Maximum Causal Entropy Inverse Constrained Reinforcement Learning

05/04/2023
by Mattijs Baert, et al.

When deploying artificial agents in real-world environments where they interact with humans, it is crucial that their behavior is aligned with the values, social norms, or other requirements of those environments. However, many environments have implicit constraints that are difficult to specify and transfer to a learning agent. To address this challenge, we propose a novel method that uses the principle of maximum causal entropy to learn both the constraints and an optimal policy that adheres to them, from demonstrations of agents that abide by the constraints. We prove convergence in a tabular setting and provide an approximation that scales to complex environments. We evaluate the learned policy by the reward it receives and the number of constraint violations it incurs, and we evaluate the learned cost function by its transferability to other agents. Our method outperforms state-of-the-art approaches across a variety of tasks and environments, and it handles problems with stochastic dynamics and continuous state-action spaces.
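
To make the tabular setting mentioned in the abstract more concrete, the following is a minimal sketch of the standard soft Bellman backup that underlies maximum causal entropy policies. It is not the authors' algorithm: it only computes the policy for a fixed, known cost, whereas the paper learns the cost from demonstrations. The function name `soft_value_iteration`, the penalty weight `lam`, and the two-state toy MDP are all assumptions made for illustration.

```python
import math

def soft_value_iteration(P, reward, cost, lam, gamma=0.9, iters=500):
    """Soft value iteration on a tabular MDP (illustrative sketch).

    P[s][a] is a list of (probability, next_state) pairs.
    reward[s][a] and cost[s][a] are floats.
    lam is a hypothetical penalty weight applied to the cost.
    Returns (policy, V) with policy[s][a] = exp(Q[s][a] - V[s]).
    """
    n_states = len(P)
    V = [0.0] * n_states
    Q = []
    for _ in range(iters):
        # Soft Bellman backup: penalized reward plus discounted expected value.
        Q = [
            [
                reward[s][a] - lam * cost[s][a]
                + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in range(len(P[s]))
            ]
            for s in range(n_states)
        ]
        # V(s) = log sum_a exp Q(s, a), computed stably by subtracting the max.
        V = [max(q) + math.log(sum(math.exp(x - max(q)) for x in q)) for q in Q]
    # The maximum causal entropy policy is a softmax over Q.
    policy = [[math.exp(x - V[s]) for x in Q[s]] for s in range(n_states)]
    return policy, V

# Hypothetical toy MDP with stochastic dynamics: 2 states, 2 actions each.
P = [
    [[(1.0, 0)], [(0.8, 1), (0.2, 0)]],  # state 0
    [[(1.0, 1)], [(1.0, 0)]],            # state 1
]
reward = [[0.0, 0.0], [0.0, 0.0]]
cost = [[0.0, 1.0], [0.0, 0.0]]  # only action 1 in state 0 is costly

pi_free, _ = soft_value_iteration(P, reward, cost, lam=0.0)
pi_penalized, _ = soft_value_iteration(P, reward, cost, lam=5.0)
print("P(costly action | state 0), no penalty:  ", pi_free[0][1])
print("P(costly action | state 0), with penalty:", pi_penalized[0][1])
```

With zero reward and no penalty the policy is uniform, since nothing distinguishes the actions; raising `lam` shifts probability away from the costly action while keeping the policy stochastic, which is the qualitative behavior a maximum-entropy formulation is meant to give.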


