Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm

05/28/2021
by Qinbo Bai, et al.

Many engineering problems have multiple objectives, and the overall aim is to optimize a non-linear function of these objectives. In this paper, we formulate the problem of maximizing a non-linear concave function of multiple long-term objectives and propose a model-free, policy-gradient-based algorithm to solve it. The gradient is estimated with a biased estimator. The proposed algorithm is shown to converge to within ϵ of the global optimum after sampling 𝒪(M^4 σ^2 / ((1-γ)^8 ϵ^4)) trajectories, where γ is the discount factor and M is the number of agents, thus achieving the same dependence on ϵ as the policy gradient algorithm for standard reinforcement learning.
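To make the role of a biased gradient estimate concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a single-state problem with a softmax policy, a finite horizon that truncates the discounted sum, and the concave scalarization f(J) = Σ_m log(1 + J_m); all function and parameter names are hypothetical. By the chain rule, ∇_θ f(J(θ)) = Σ_m ∂f/∂J_m · ∇_θ J_m(θ). Plugging sampled estimates of J into the non-linear ∂f/∂J_m is one natural way bias arises, even when each ∇_θ J_m is estimated without bias.

```python
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of action preferences."""
    top = max(theta)
    exps = [math.exp(t - top) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def sample_trajectory(theta, rewards, gamma, horizon, rng):
    """Roll out one trajectory in a single-state problem (an assumption of this sketch).

    Returns the summed score function (gradient of the log-likelihood of the
    sampled actions) and the discounted return of each objective.
    """
    probs = softmax(theta)
    n_actions, n_obj = len(theta), len(rewards[0])
    score = [0.0] * n_actions
    returns = [0.0] * n_obj
    for t in range(horizon):
        a = rng.choices(range(n_actions), weights=probs)[0]
        # Gradient of log softmax w.r.t. theta is one_hot(a) - probs.
        for k in range(n_actions):
            score[k] += (1.0 if k == a else 0.0) - probs[k]
        for m in range(n_obj):
            returns[m] += (gamma ** t) * rewards[a][m]
    return score, returns


def plug_in_gradient(theta, rewards, grad_f, gamma=0.9, horizon=20,
                     n_traj=200, seed=0):
    """Chain-rule gradient estimate of f(J_1, ..., J_M).

    Each grad J_m uses a REINFORCE-style estimate; grad_f is evaluated at the
    sampled J_hat, which makes the overall estimate biased when f is non-linear.
    """
    rng = random.Random(seed)
    n_actions, n_obj = len(theta), len(rewards[0])
    j_hat = [0.0] * n_obj
    grad_j_hat = [[0.0] * n_actions for _ in range(n_obj)]
    for _ in range(n_traj):
        score, returns = sample_trajectory(theta, rewards, gamma, horizon, rng)
        for m in range(n_obj):
            j_hat[m] += returns[m] / n_traj
            for k in range(n_actions):
                grad_j_hat[m][k] += score[k] * returns[m] / n_traj
    weights = grad_f(j_hat)  # partial derivatives of f at the estimated J
    grad = [sum(weights[m] * grad_j_hat[m][k] for m in range(n_obj))
            for k in range(n_actions)]
    return grad, j_hat


def grad_log_utility(j):
    """Gradient of the assumed concave utility f(J) = sum_m log(1 + J_m)."""
    return [1.0 / (1.0 + jm) for jm in j]


if __name__ == "__main__":
    # Two actions, two objectives: action m earns reward 1 for objective m only.
    rewards = [[1.0, 0.0], [0.0, 1.0]]
    grad, j_hat = plug_in_gradient([2.0, 0.0], rewards, grad_log_utility)
    print("estimated objectives:", j_hat)
    print("estimated gradient:", grad)
```

A gradient-ascent step would then update theta by a small multiple of the returned gradient. The reward table and utility here are illustrative choices only; any differentiable concave f could be substituted through grad_f.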


research
12/28/2022

On the Convergence of Discounted Policy Gradient Methods

Many popular policy gradient methods for reinforcement learning follow a...
research
10/03/2022

Policy Gradient for Reinforcement Learning with General Utilities

In Reinforcement Learning (RL), the goal of agents is to discover an opt...
research
06/18/2019

Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination

A key challenge for Multiagent RL (Reinforcement Learning) is the design...
research
01/08/2021

Learning Low-Correlation GPS Spreading Codes with a Policy Gradient Algorithm

With the birth of the next-generation GPS III constellation and the upco...
research
05/24/2023

Policy Learning based on Deep Koopman Representation

This paper proposes a policy learning algorithm based on the Koopman ope...
research
09/15/2022

Multi-Objective Policy Gradients with Topological Constraints

Multi-objective optimization models that encode ordered sequential const...
research
06/30/2021

Inverse Design of Grating Couplers Using the Policy Gradient Method from Reinforcement Learning

We present a proof-of-concept technique for the inverse design of electr...
