Evolutionary Action Selection for Gradient-based Policy Learning

01/12/2022
by Yan Ma, et al.

Evolutionary Algorithms (EAs) and Deep Reinforcement Learning (DRL) have recently been combined to integrate the advantages of both approaches for better policy learning. However, existing hybrid methods use the EA to train the policy network directly, which leads to sample inefficiency and an unpredictable impact on policy performance. To integrate the two approaches more effectively and avoid the drawbacks introduced by the EA, we seek a more efficient and principled way of combining EA and DRL. In this paper, we propose Evolutionary Action Selection-Twin Delayed Deep Deterministic Policy Gradient (EAS-TD3), a novel combination of EA and DRL. In EAS, we focus on optimizing the action chosen by the policy network, using an evolutionary algorithm to obtain high-quality actions that guide policy learning. We conduct several experiments on challenging continuous control tasks. The results show that EAS-TD3 outperforms other state-of-the-art methods.
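The idea of evolving the policy's action rather than its weights can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which evolutionary algorithm EAS uses or how candidate actions are scored, so the sketch assumes a simple elitist mutate-and-select loop and a hypothetical critic `q_fn(state, action)` as the fitness. The names `evolve_action` and `toy_q`, and all parameter values, are illustrative only.

```python
import random
from typing import Callable, List, Sequence


def evolve_action(
    q_fn: Callable[[Sequence[float], Sequence[float]], float],
    state: Sequence[float],
    action: Sequence[float],
    pop_size: int = 16,
    generations: int = 10,
    sigma: float = 0.1,
    low: float = -1.0,
    high: float = 1.0,
    seed: int = 0,
) -> List[float]:
    """Search around the policy's action for one with higher fitness.

    Assumption: fitness is the value returned by q_fn (a stand-in critic).
    """
    rng = random.Random(seed)

    def mutate(a: Sequence[float]) -> List[float]:
        # Gaussian perturbation, clipped to the assumed action bounds.
        return [min(high, max(low, x + rng.gauss(0.0, sigma))) for x in a]

    def fitness(a: Sequence[float]) -> float:
        return q_fn(state, a)

    # Seed the population with the policy's own action plus noisy copies.
    population = [list(action)] + [mutate(action) for _ in range(pop_size - 1)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with mutated elites.
        population.sort(key=fitness, reverse=True)
        elites = population[: max(1, pop_size // 2)]
        children = [mutate(rng.choice(elites)) for _ in range(pop_size - len(elites))]
        population = elites + children
    return max(population, key=fitness)


def toy_q(state: Sequence[float], a: Sequence[float]) -> float:
    # Hypothetical critic: prefers actions close to 0.5 in every dimension.
    return -sum((x - 0.5) ** 2 for x in a)


policy_action = [0.0, 0.0]
better = evolve_action(toy_q, state=[0.0], action=policy_action)
```

Because the original action stays in the population and elites are never discarded, the returned action scores at least as well as the policy's action under `q_fn`. In a full hybrid method, such evolved actions would then serve as targets that guide the gradient-based policy update; that step is omitted here.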

research
01/08/2020

Sample-based Distributional Policy Gradient

Distributional reinforcement learning (DRL) is a recent reinforcement le...
research
12/10/2020

An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Deep reinforcement learning (DRL) algorithms and evolution strategies (E...
research
11/24/2019

Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning

Deep reinforcement learning (DRL) on Markov decision processes (MDPs) wi...
research
03/13/2018

Policy Search in Continuous Action Domains: an Overview

Continuous action policy search, the search for efficient policies in co...
research
04/24/2023

Efficient Halftoning via Deep Reinforcement Learning

Halftoning aims to reproduce a continuous-tone image with pixels whose i...
research
09/24/2021

Combing Policy Evaluation and Policy Improvement in a Unified f-Divergence Framework

The framework of deep reinforcement learning (DRL) provides a powerful a...
