Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation

11/29/2019
by   Dmitry Akimov, et al.

In this paper, we describe the physics-based environment of the NeurIPS 2019 Learning to Move - Walk Around challenge and present our solution to this competition, which scored 1303.727 mean reward points and took 3rd place. Our method combines recent advances from both continuous- and discrete-action space reinforcement learning, such as Soft Actor-Critic and Recurrent Experience Replay in Distributed Reinforcement Learning. We trained our agent in two stages: in the first stage it learned to move somewhere, and in the second stage it learned to follow the target velocity field. We also introduce a novel Q-function split technique, which we believe facilitates the task of training an agent, allows pretraining the critic and reusing it to solve harder problems, and mitigates reward-shaping design efforts.
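The abstract does not spell out how the Q-function split works, so the following is only a minimal sketch of one plausible reading: the reward is kept as a vector of components, the critic estimates one Q-value per component, and the scalar Q-value is their sum. The function names, the sum aggregation, and the omission of the Soft Actor-Critic entropy term are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def split_td_targets(
    rewards: Sequence[float],
    next_q_components: Sequence[float],
    gamma: float,
    done: bool,
) -> List[float]:
    """Per-component TD targets: y_i = r_i + gamma * (1 - done) * Q_i(s', a').

    Hypothetical helper: each reward component gets its own bootstrap target.
    The SAC entropy bonus is deliberately left out to keep the sketch short.
    """
    if len(rewards) != len(next_q_components):
        raise ValueError("rewards and next_q_components must have equal length")
    bootstrap = 0.0 if done else gamma
    return [r + bootstrap * q for r, q in zip(rewards, next_q_components)]


def total_q(q_components: Sequence[float]) -> float:
    """Assumed aggregation: the scalar Q-value is the sum of the components."""
    return sum(q_components)


# Example: two reward components (for instance, a velocity term and an
# effort term; both labels are hypothetical).
targets = split_td_targets([1.0, 0.5], [2.0, 4.0], gamma=0.5, done=False)
scalar_target = total_q(targets)
```

Under this reading, a critic head trained on one component in an easier stage could be carried over to a later stage while new heads are added for new reward terms, which is one way the reuse mentioned in the abstract might work.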


