Contrastive Explanations for Comparing Preferences of Reinforcement Learning Agents

12/17/2021
by Jasmina Gajcin, et al.

In complex tasks where the reward function is not straightforward and consists of a set of objectives, multiple reinforcement learning (RL) policies that perform the task adequately but employ different strategies can be trained by adjusting the impact of individual objectives on the reward function. Understanding the differences in strategy between policies is necessary to enable users to choose between the offered policies, and can help developers understand the different behaviors that emerge from various reward functions and training hyperparameters in RL systems. In this work, we compare the behavior of two policies trained on the same task but with different preferences over objectives. We propose a method for distinguishing differences in behavior that stem from the agents' different abilities from those that are a consequence of their opposing preferences. Furthermore, we use only data on preference-based differences to generate contrastive explanations about the agents' preferences. Finally, we test and evaluate our approach on an autonomous driving task, comparing the behavior of a safety-oriented policy with that of one that prefers speed.
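To make the idea of "adjusting the impact of individual objectives" concrete, the following is a minimal sketch, assuming the reward is a weighted sum of per-objective terms. The abstract does not state the actual reward formulation, so the linear form, the objective names (`speed`, `safety`), the weight values, and the helper functions are all hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objectives:
    """Hypothetical per-objective scores for one action, each in [0, 1]."""
    speed: float
    safety: float


def scalarized_reward(obj: Objectives, weights: dict) -> float:
    # Assumed linear scalarization: each objective is scaled by its weight.
    return weights["speed"] * obj.speed + weights["safety"] * obj.safety


def preferred_action(candidates: dict, weights: dict) -> str:
    # Pick the action whose objective scores give the highest weighted reward.
    return max(candidates, key=lambda name: scalarized_reward(candidates[name], weights))


# Two illustrative preference profiles over the same objectives (made-up values).
SAFETY_ORIENTED = {"speed": 0.2, "safety": 0.8}
SPEED_ORIENTED = {"speed": 0.8, "safety": 0.2}

# Made-up objective scores for two driving actions in the same situation.
candidates = {
    "overtake": Objectives(speed=1.0, safety=0.2),
    "follow": Objectives(speed=0.4, safety=0.9),
}

print(preferred_action(candidates, SAFETY_ORIENTED))
print(preferred_action(candidates, SPEED_ORIENTED))
```

Under these assumed numbers the two weightings select different actions in the same situation, which is the kind of preference-based difference in behavior the abstract refers to. Both hypothetical agents are equally able to evaluate either action, so the disagreement here comes from the weights alone rather than from a difference in ability.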

