A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation

08/21/2019
by Runzhe Yang, et al.

We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear preferences, with the goal of enabling few-shot adaptation to new tasks. In MORL, the aim is to learn policies over multiple competing objectives whose relative importance (preferences) is unknown to the agent. While this alleviates dependence on scalar reward design, the expected return of a policy can change significantly with varying preferences, making it challenging to learn a single model to produce optimal policies under different preference conditions. We propose a generalized version of the Bellman equation to learn a single parametric representation for optimal policies over the space of all possible preferences. After this initial learning phase, our agent can quickly adapt to any given preference, or automatically infer an underlying preference with very few samples. Experiments across four different domains demonstrate the effectiveness of our approach.
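To make "linear preferences" concrete, here is a minimal sketch in Python of how a preference vector turns vector-valued returns into a scalar and how the greedy action can change as the preference changes. It only illustrates linear scalarization and is not the paper's generalized Bellman update. The function names `scalarize` and `greedy_action` and the Q-value table are hypothetical, chosen for illustration.

```python
from typing import Sequence


def scalarize(returns: Sequence[float], preference: Sequence[float]) -> float:
    """Weighted sum of per-objective returns under a linear preference."""
    return sum(w * r for w, r in zip(preference, returns))


def greedy_action(q_values: Sequence[Sequence[float]],
                  preference: Sequence[float]) -> int:
    """Index of the action whose scalarized value is largest."""
    scores = [scalarize(q, preference) for q in q_values]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical vector-valued Q-values for one state:
# three actions (rows), two competing objectives (columns).
Q = [
    [10.0, 0.0],   # action 0: strong on objective 1 only
    [6.0, 6.0],    # action 1: balanced
    [0.0, 10.0],   # action 2: strong on objective 2 only
]

for preference in ([0.9, 0.1], [0.5, 0.5], [0.1, 0.9]):
    print(preference, "->", greedy_action(Q, preference))
```

With these made-up numbers, each of the three preferences selects a different action, which is why a single scalar reward fixed in advance cannot serve every preference and why a representation conditioned on the preference is attractive.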

research
01/18/2023

Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization

Multi-objective reinforcement learning (MORL) algorithms tackle sequenti...
research
05/15/2020

A Distributional View on Multi-Objective Policy Optimization

Many real-world problems require trading off multiple competing objectiv...
research
06/16/2023

Fairness in Preference-based Reinforcement Learning

In this paper, we address the issue of fairness in preference-based rein...
research
04/30/2023

Scaling Pareto-Efficient Decision Making Via Offline Multi-Objective RL

The goal of multi-objective reinforcement learning (MORL) is to learn po...
research
11/25/2020

Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning

In this paper we consider multi-objective reinforcement learning where t...
research
04/27/2023

Preference Inference from Demonstration in Multi-objective Multi-agent Decision Making

It is challenging to quantify numerical preferences for different object...
research
10/03/2019

Using Logical Specifications of Objectives in Multi-Objective Reinforcement Learning

In the multi-objective reinforcement learning (MORL) paradigm, the relat...
