One-shot Policy Elicitation via Semantic Reward Manipulation

01/06/2021
by Aaquib Tabrez, et al.
Synchronizing expectations and knowledge about the state of the world is an essential capability for effective collaboration. For robots to collaborate effectively with humans and other autonomous agents, they must be able to generate intelligible explanations that reconcile differences between their understanding of the world and that of their collaborators. In this work we present Single-shot Policy Explanation for Augmenting Rewards (SPEAR), a novel sequential optimization algorithm that uses semantic explanations derived from combinations of planning predicates to augment agents' reward functions, driving their policies toward more nearly optimal behavior. We experimentally validate the algorithm's policy manipulation capabilities in two practically grounded applications and conclude with a performance analysis of SPEAR on domains of increasing state-space complexity and predicate count. We demonstrate that our method substantially improves on the state of the art in runtime and addressable problem size, enabling an agent to leverage its own expertise to communicate actionable information that improves another agent's performance.
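
To make the idea of predicate-based reward augmentation concrete, the following is a minimal sketch and not the SPEAR algorithm itself: it skips the optimization that selects which predicates to communicate and only illustrates how a single explanation built from a conjunction of predicates could change a collaborator's policy. The 2x3 grid, the predicate names, the reward values, and every function name below are assumptions made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, int]          # (row, column) in a hypothetical 2x3 grid
Reward = Callable[[State], float]

ROWS, COLS = 2, 3
GOAL: State = (0, 2)             # terminal state
GAMMA = 0.9
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Hypothetical planning predicates over states.
PREDICATES: Dict[str, Callable[[State], bool]] = {
    "in_top_row": lambda s: s[0] == 0,
    "in_middle_column": lambda s: s[1] == 1,
}


def step(s: State, a: str) -> State:
    """Deterministic move; bumping into a wall leaves the state unchanged."""
    r, c = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    return (r, c) if 0 <= r < ROWS and 0 <= c < COLS else s


def augment(reward: Reward, names: List[str], delta: float) -> Reward:
    """Add `delta` to the reward of every state where all named predicates hold."""
    def augmented(s: State) -> float:
        holds = all(PREDICATES[n](s) for n in names)
        return reward(s) + (delta if holds else 0.0)
    return augmented


def greedy_policy(reward: Reward) -> Dict[State, str]:
    """Value iteration with reward received on entering the next state."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)]
    V = {s: 0.0 for s in states}

    def q(s: State, a: str) -> float:
        nxt = step(s, a)
        return reward(nxt) + GAMMA * V[nxt]

    for _ in range(200):
        for s in states:
            if s != GOAL:
                V[s] = max(q(s, a) for a in ACTIONS)
    return {s: max(ACTIONS, key=lambda a: q(s, a)) for s in states if s != GOAL}


def believed_reward(s: State) -> float:
    """The collaborator only knows about the goal, not the hazard at (0, 1)."""
    return 1.0 if s == GOAL else 0.0


# Explanation: "states in the top row AND the middle column cost 1".
explained_reward = augment(believed_reward, ["in_top_row", "in_middle_column"], -1.0)

before = greedy_policy(believed_reward)
after = greedy_policy(explained_reward)
print(before[(0, 0)])  # right: the shortest path crosses the unknown hazard
print(after[(0, 0)])   # down: the augmented reward routes around it
```

In this toy setting one explanation is enough to redirect the greedy policy at the start state, which mirrors the single-shot framing in the abstract; how SPEAR chooses the predicates and the reward change is not shown here.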
