Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space

10/10/2018
by Jiechao Xiong, et al.

Most existing deep reinforcement learning (DRL) frameworks consider either a discrete action space or a continuous action space, but not both. Motivated by applications in computer games, we consider the setting with a discrete-continuous hybrid action space. To handle a hybrid action space, previous works either approximate it by discretization or relax it into a continuous set. In this paper, we propose a parametrized deep Q-network (P-DQN) framework for the hybrid action space without approximation or relaxation. Our algorithm combines the ideas of DQN (which handles discrete action spaces) and DDPG (which handles continuous action spaces) by integrating them into a single framework. Empirical results on a simulation example, on scoring a goal in simulated RoboCup soccer, and on the solo mode of the game King of Glory (KOG) validate the efficiency and effectiveness of our method.
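To make the hybrid action space concrete, the following is a minimal sketch of how an action could be selected when each discrete action carries its own continuous parameter. It is not the authors' implementation: the linear "actor" and "critic", the dimensions, and all names (`continuous_params`, `q_values`, `select_action`) are assumptions chosen for illustration. The DDPG-like part proposes a continuous parameter for every discrete action, and the DQN-like part picks the discrete action with the highest Q-value given those parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes, chosen only for illustration.
STATE_DIM = 4      # size of the state vector
NUM_DISCRETE = 3   # number of discrete actions
PARAM_DIM = 2      # size of the continuous parameter attached to each action

# Toy linear stand-ins for the networks (assumption: real models would be deep nets).
# Actor: one weight matrix per discrete action, mapping state -> continuous parameter.
actor_weights = rng.normal(size=(NUM_DISCRETE, PARAM_DIM, STATE_DIM))
# Critic: one weight vector per discrete action, scoring [state, parameter].
critic_weights = rng.normal(size=(NUM_DISCRETE, STATE_DIM + PARAM_DIM))


def continuous_params(state):
    """Propose a continuous parameter for every discrete action, bounded to [-1, 1]."""
    return np.tanh(actor_weights @ state)  # shape: (NUM_DISCRETE, PARAM_DIM)


def q_values(state, params):
    """Score each (discrete action, its continuous parameter) pair."""
    states = np.tile(state, (NUM_DISCRETE, 1))         # (NUM_DISCRETE, STATE_DIM)
    inputs = np.concatenate([states, params], axis=1)  # (NUM_DISCRETE, STATE_DIM + PARAM_DIM)
    return np.sum(critic_weights * inputs, axis=1)     # (NUM_DISCRETE,)


def select_action(state):
    """Greedy hybrid action: best discrete index plus that action's parameter."""
    params = continuous_params(state)
    best = int(np.argmax(q_values(state, params)))
    return best, params[best]


state = np.ones(STATE_DIM)
discrete_action, parameter = select_action(state)
print(discrete_action, parameter)
```

Because the continuous parameter is produced per discrete action and then scored directly, the sketch neither discretizes the continuous part nor relaxes the discrete part into a continuous set.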

