Loss and Reward Weighing for increased learning in Distributed Reinforcement Learning

04/25/2023
by Martin Holen, et al.

This paper introduces two learning schemes for distributed agents in Reinforcement Learning (RL) environments, namely Reward-Weighted (R-Weighted) and Loss-Weighted (L-Weighted) gradient merger. The R/L-Weighted methods replace standard practices for training multiple agents, such as summing or averaging the gradients. The core of our methods is to scale the gradient of each actor based on how high its reward (for R-Weighted) or its loss (for L-Weighted) is compared to those of the other actors. During training, each agent operates in a differently initialized version of the same environment, so different actors produce different gradients. In essence, the R-Weight or L-Weight of each agent informs the other agents of its potential, which in turn indicates which environment should be prioritized for learning. This approach to distributed learning is possible because environments that yield higher rewards, or lower losses, carry more critical information than environments that yield lower rewards or higher losses. We empirically demonstrate that the R-Weighted methods outperform the state of the art in multiple RL environments.
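
The weighted merger described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact weighting formula, so the proportional normalization, the shift applied to negative scores, the uniform fallback, and the function names `merge_weights` and `weighted_merge` are all assumptions made for this example. Gradients are represented as flat lists of floats, one list per actor.

```python
from typing import List, Sequence


def merge_weights(scores: Sequence[float]) -> List[float]:
    """Turn per-actor scores (rewards or losses) into weights that sum to 1.

    Assumption: plain proportional normalization. If any score is negative,
    all scores are shifted so the smallest becomes zero. The paper may use a
    different normalization.
    """
    lowest = min(scores)
    shifted = [s - lowest for s in scores] if lowest < 0 else list(scores)
    total = sum(shifted)
    if total == 0:
        # All actors scored the same: fall back to plain averaging.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in shifted]


def weighted_merge(
    gradients: Sequence[Sequence[float]], scores: Sequence[float]
) -> List[float]:
    """Merge per-actor gradients, scaling each by its actor's weight.

    Passing rewards as `scores` gives an R-Weighted-style merge; passing
    losses gives an L-Weighted-style merge.
    """
    weights = merge_weights(scores)
    merged = [0.0] * len(gradients[0])
    for weight, grad in zip(weights, gradients):
        for k, g in enumerate(grad):
            merged[k] += weight * g
    return merged


# Two actors with toy two-parameter gradients; actor 0 earned more reward.
actor_gradients = [[1.0, 0.0], [0.0, 1.0]]
actor_rewards = [3.0, 1.0]
print(weighted_merge(actor_gradients, actor_rewards))  # [0.75, 0.25]
```

With equal scores, the sketch reduces to the averaging baseline that the abstract says these methods replace, which makes the difference between the two schemes easy to see on toy inputs.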


research
02/13/2023

Universal Agent Mixtures and the Geometry of Intelligence

Inspired by recent progress in multi-agent Reinforcement Learning (RL), ...
research
02/19/2022

Shaping Advice in Deep Reinforcement Learning

Reinforcement learning involves agents interacting with an environment t...
research
05/23/2017

Reinforcement Learning with a Corrupted Reward Channel

No real-world reward function is perfect. Sensory errors and software bu...
research
06/26/2019

Towards Empathic Deep Q-Learning

As reinforcement learning (RL) scales to solve increasingly complex task...
research
02/06/2022

Learning Synthetic Environments and Reward Networks for Reinforcement Learning

We introduce Synthetic Environments (SEs) and Reward Networks (RNs), rep...
research
02/13/2018

Evolved Policy Gradients

We propose a meta-learning approach for learning gradient-based reinforc...
research
06/03/2015

Multi-Objective Optimization for Self-Adjusting Weighted Gradient in Machine Learning Tasks

Much of the focus in machine learning research is placed in creating new...
