Modularity benefits reinforcement learning agents with competing homeostatic drives

04/13/2022
by Zack Dulberg, et al.

The problem of balancing conflicting needs is fundamental to intelligence. Standard reinforcement learning algorithms maximize a scalar reward, which requires combining different objective-specific rewards into a single number. Alternatively, objectives could be combined at the level of action values: specialist modules, each responsible for one objective and each trained on a reward that is independent of the others, submit action suggestions to a shared decision process. In this work, we explore the potential benefits of this alternative strategy. We investigate a biologically relevant multi-objective problem, the continual homeostasis of a set of variables, and compare a monolithic deep Q-network to a modular network with a dedicated Q-learner for each variable. We find that the modular agent: a) requires minimal exogenously determined exploration; b) has improved sample efficiency; and c) is more robust to out-of-domain perturbation.
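
The difference between the two strategies can be illustrated with a minimal sketch. The abstract does not say how the decision process combines the modules' suggestions, so the summation rule below is an assumption, and the function names and Q-values are hypothetical, chosen only for illustration.

```python
import numpy as np


def monolithic_action(q_values):
    """Greedy action from a single Q-vector learned on one combined scalar reward."""
    return int(np.argmax(q_values))


def modular_action(module_q_values):
    """Greedy action after combining per-module Q-values.

    module_q_values has shape (n_modules, n_actions); each row comes from a
    module trained only on its own objective's reward.
    ASSUMPTION: modules are combined by summing their Q-values per action.
    """
    combined = np.sum(module_q_values, axis=0)
    return int(np.argmax(combined))


# Hypothetical Q-values for two homeostatic variables and three actions.
module_q = np.array([
    [1.0, 0.2, 0.6],  # module for variable A prefers action 0
    [0.0, 0.3, 0.9],  # module for variable B prefers action 2
])

chosen = modular_action(module_q)
print(chosen)
```

In this toy case neither module dictates the outcome alone: the chosen action is the one with the largest combined value across both objectives.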


