Handling Large Discrete Action Spaces via Dynamic Neighborhood Construction

05/31/2023
by Fabian Akkerman, et al.

Large discrete action spaces remain a central challenge for reinforcement learning methods. Such spaces are encountered in many real-world applications, e.g., recommender systems, multi-step planning, and inventory replenishment. The mapping of continuous proxies to discrete actions is a promising paradigm for handling large discrete action spaces. Existing continuous-to-discrete mapping approaches involve searching for discrete neighboring actions in a static pre-defined neighborhood, which requires discrete neighbor lookups across the entire action space. Hence, scalability issues persist. To mitigate this drawback, we propose a novel Dynamic Neighborhood Construction (DNC) method, which dynamically constructs a discrete neighborhood to map the continuous proxy, thus efficiently exploiting the underlying action space. We demonstrate the robustness of our method by benchmarking it against three state-of-the-art approaches designed for large discrete action spaces across three different environments. Our results show that DNC matches or outperforms state-of-the-art approaches while being more computationally efficient. Furthermore, our method scales to action spaces that so far remained computationally intractable for existing methodologies.


Related research

12/24/2015: Deep Reinforcement Learning in Large Discrete Action Spaces
Being able to reason in an environment with a large number of discrete a...

01/22/2020: Q-Learning in enormous action spaces via amortized approximate maximization
Applying Q-learning to high-dimensional or continuous action spaces can ...

04/15/2021: Generalising Discrete Action Spaces with Conditional Action Trees
There are relatively few conventions followed in reinforcement learning ...

10/09/2020: Joint State-Action Embedding for Efficient Reinforcement Learning
While reinforcement learning has achieved considerable successes in rece...

06/28/2023: DCT: Dual Channel Training of Action Embeddings for Reinforcement Learning with Large Discrete Action Spaces
The ability to learn robust policies while generalizing over large discr...

06/10/2020: Marginal Utility for Planning in Continuous or Large Discrete Action Spaces
Sample-based planning is a powerful family of algorithms for generating ...

02/23/2023: Revisiting the Gumbel-Softmax in MADDPG
MADDPG is an algorithm in multi-agent reinforcement learning (MARL) that...
