
Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics
Many real-world control problems involve both discrete decision variable...

Learning Action Representations for Reinforcement Learning
Most model-free reinforcement learning methods leverage state representa...

Q-Learning in enormous action spaces via amortized approximate maximization
Applying Q-learning to high-dimensional or continuous action spaces can ...

Lifelong Learning with a Changing Action Set
In many real-world sequential decision making problems, the number of av...

A Goal-Based Movement Model for Continuous Multi-Agent Tasks
Despite increasing attention paid to the need for fast, scalable methods...

Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning
In an effort to better understand the different ways in which the discou...

Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization
The use of Reinforcement Learning in real-world scenarios is strongly li...

Deep Reinforcement Learning in Large Discrete Action Spaces
Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems. Recommender systems, industrial plants and language models are only some of the many real-world tasks involving large numbers of discrete actions for which current methods are difficult or even often impossible to apply. An ability to generalize over the set of actions as well as sub-linear complexity relative to the size of the set are both necessary to handle such tasks. Current approaches are not able to provide both of these, which motivates the work in this paper. Our proposed approach leverages prior information about the actions to embed them in a continuous space upon which it can generalize. Additionally, approximate nearest-neighbor methods allow for logarithmic-time lookup complexity relative to the number of actions, which is necessary for time-wise tractable training. This combined approach allows reinforcement learning methods to be applied to large-scale learning problems previously intractable with current methods. We demonstrate our algorithm's abilities on a series of tasks having up to one million actions.
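
The lookup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `k_nearest_actions`, `select_action`, `proto_action`, `action_embeddings` and `q_value` are hypothetical, the search is an exact brute-force scan standing in for an approximate nearest-neighbor index (so it is linear, not logarithmic, in the number of actions), and re-scoring the candidates with a value function is an assumption that the abstract does not state.

```python
import numpy as np


def k_nearest_actions(proto_action, action_embeddings, k):
    """Return indices of the k embedded actions closest (L2) to proto_action.

    Brute-force stand-in: an approximate nearest-neighbor index would
    replace this scan to obtain sub-linear lookup time.
    """
    dists = np.linalg.norm(action_embeddings - proto_action, axis=1)
    k = min(k, len(dists))
    # Select the k smallest distances, then order them from nearest to farthest.
    nearest = np.argpartition(dists, k - 1)[:k]
    return nearest[np.argsort(dists[nearest])]


def select_action(proto_action, action_embeddings, q_value, k=10):
    """Pick one discrete action index for a continuous proto-action.

    Assumption (not in the abstract): the k candidates are re-scored by a
    hypothetical callable q_value(embedding) and the best one is returned.
    """
    candidates = k_nearest_actions(proto_action, action_embeddings, k)
    scores = np.array([q_value(action_embeddings[i]) for i in candidates])
    return int(candidates[np.argmax(scores)])


# Toy usage: four actions embedded on a line, proto-action near action 1.
embeddings = np.array([[0.0], [1.0], [2.0], [3.0]])
proto = np.array([1.2])
neighbors = k_nearest_actions(proto, embeddings, k=2)
chosen = select_action(proto, embeddings, q_value=lambda e: e[0], k=2)
```

Because nearby embeddings are treated as similar actions, a policy that outputs a point in the embedding space can generalize to actions it has rarely tried; with k=1 the sketch reduces to a plain nearest-neighbor lookup.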