Diversity Through Exclusion (DTE): Niche Identification for Reinforcement Learning through Value-Decomposition

02/02/2023
by   Peter Sunehag, et al.
0

Many environments contain numerous available niches of variable value, each associated with a different local optimum in the space of behaviors (policy space). In such situations it is often difficult to design a learning process capable of evading distraction by poor local optima long enough to stumble upon the best available niche. In this work we propose a generic reinforcement learning (RL) algorithm that performs better than baseline deep Q-learning algorithms in such environments with multiple variably-valued niches. The algorithm we propose consists of two parts: an agent architecture and a learning rule. The agent architecture contains multiple sub-policies. The learning rule is inspired by fitness sharing in evolutionary computation and applied in reinforcement learning using Value-Decomposition-Networks in a novel manner for a single-agent's internal population. It can concretely be understood as adding an extra loss term where one policy's experience is also used to update all the other policies in a manner that decreases their value estimates for the visited states. In particular, when one sub-policy visits a particular state frequently this decreases the value predicted for other sub-policies for going to that state. Further, we introduce an artificial chemistry inspired platform where it is easy to create tasks with multiple rewarding strategies utilizing different resources (i.e. multiple niches). We show that agents trained this way can escape poor-but-attractive local optima to instead converge to harder-to-discover higher value strategies in both the artificial chemistry environments and in simpler illustrative environments.

READ FULL TEXT

page 1

page 7

page 9

page 11

research
07/21/2023

Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization

Offline reinforcement learning (RL) has received considerable attention ...
research
11/09/2021

Dealing with the Unknown: Pessimistic Offline Reinforcement Learning

Reinforcement Learning (RL) has been shown effective in domains where th...
research
02/13/2018

Diversity-Driven Exploration Strategy for Deep Reinforcement Learning

Efficient exploration remains a challenging research problem in reinforc...
research
04/06/2022

Federated Reinforcement Learning with Environment Heterogeneity

We study a Federated Reinforcement Learning (FedRL) problem in which n a...
research
07/17/2020

Discovering Reinforcement Learning Algorithms

Reinforcement learning (RL) algorithms update an agent's parameters acco...
research
12/02/2020

Policy Supervectors: General Characterization of Agents by their Behaviour

By studying the underlying policies of decision-making agents, we can le...
research
11/18/2022

Provable Defense against Backdoor Policies in Reinforcement Learning

We propose a provable defense mechanism against backdoor policies in rei...

Please sign up or login with your details

Forgot password? Click here to reset