Expected Scalarised Returns Dominance: A New Solution Concept for Multi-Objective Decision Making

06/02/2021
by Conor F. Hayes, et al.

In many real-world scenarios, a user derives utility from a single execution of a policy. In such cases, applying multi-objective reinforcement learning requires optimising the expected utility of the returns. A user's preferences over objectives (also known as the utility function) are often unknown or difficult to specify, and a set of optimal policies must then be learned. However, settings where the expected utility must be maximised have been largely overlooked by the multi-objective reinforcement learning community and, as a consequence, a set of optimal solutions has yet to be defined for them. In this paper we address this challenge by proposing first-order stochastic dominance as a criterion for building solution sets that maximise expected utility. We also propose a new dominance criterion, expected scalarised returns (ESR) dominance, which extends first-order stochastic dominance so that a set of optimal policies can be learned in practice. We then define a new solution concept, the ESR set, which is the set of policies that are ESR dominant. Finally, we define a new multi-objective distributional tabular reinforcement learning (MOT-DRL) algorithm that learns the ESR set in a multi-objective multi-armed bandit setting.
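To make the dominance idea concrete, the following is a minimal Python sketch and not the paper's MOT-DRL algorithm. It assumes the common CDF-based form of first-order stochastic dominance, applied here to joint empirical CDFs of vector-valued returns: a return distribution X dominates Y if F_X(v) <= F_Y(v) at every point v and F_X(v) < F_Y(v) at some v. The function names (`joint_cdf`, `esr_dominates`, `esr_set`) and the sample data are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np


def joint_cdf(samples, points):
    """Empirical joint CDF of `samples` (n, d) evaluated at `points` (m, d)."""
    # leq[i, j] is True when sample j is <= point i in every objective.
    leq = (samples[None, :, :] <= points[:, None, :]).all(axis=2)
    return leq.mean(axis=1)


def esr_dominates(x, y):
    """True if the empirical distribution of x dominates that of y.

    Assumed criterion: F_x(v) <= F_y(v) for all v, strictly for some v.
    Empirical CDFs are step functions, so it is enough to check the grid
    built from the coordinate values observed in either sample set.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    both = np.vstack([x, y])
    axes = [np.unique(both[:, k]) for k in range(both.shape[1])]
    grid = np.array(list(itertools.product(*axes)))
    fx, fy = joint_cdf(x, grid), joint_cdf(y, grid)
    return bool(np.all(fx <= fy) and np.any(fx < fy))


def esr_set(arms):
    """Names of arms whose return distribution no other arm dominates."""
    return [
        name
        for name, z in arms.items()
        if not any(esr_dominates(other, z)
                   for o, other in arms.items() if o != name)
    ]


# Hypothetical two-objective bandit arms, each given as sampled return vectors.
arms = {
    "A": [[1, 1], [2, 2]],
    "B": [[2, 2], [3, 3]],
    "C": [[0, 5], [5, 0]],
}
print(esr_set(arms))  # ['B', 'C']
```

In this toy example arm B dominates arm A, while C is incomparable with both, so the resulting set keeps B and C. Checking every grid point scales exponentially with the number of objectives, so this sketch is only suitable for small illustrative cases.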

research
07/01/2022

Multi-Objective Coordination Graphs for the Expected Scalarised Returns with Generative Flow Models

Many real-world problems contain multiple objectives and agents, where a...
research
05/09/2023

Distributional Multi-Objective Decision Making

For effective decision support in scenarios with conflicting objectives,...
research
02/01/2021

Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search

In many risk-aware and multi-objective reinforcement learning settings, ...
research
11/23/2022

Monte Carlo Tree Search Algorithms for Risk-Aware and Multi-Objective Reinforcement Learning

In many risk-aware and multi-objective reinforcement learning settings, ...
research
07/03/2021

Multi-Objective Congestion Control

Decades of research on Internet congestion control (CC) has produced a p...
research
08/22/2022

Efficient Utility Function Learning for Multi-Objective Parameter Optimization with Prior Knowledge

The current state-of-the-art in multi-objective optimization assumes eit...
research
10/02/2019

Relationship Explainable Multi-objective Optimization Via Vector Value Function Based Reinforcement Learning

Solving multi-objective optimization problems is important in various ap...