We propose a novel solution to challenging sparse-reward, continuous con...
We propose a novel reinforcement learning algorithm, QD-RL, that incorpor...
Bidding in real-time auctions can be a difficult stochastic control task...
The exploration-exploitation trade-off is at the heart of reinforcement
...
In environments with continuous state and action spaces, state-of-the-ar...
In a context where several policies can be observed as black boxes on
di...
We propose a novel reinforcement learning algorithm, AlphaNPI, that
inco...
In this paper, we provide an overview of first-order and second-order
va...
Deep neuroevolution, that is evolutionary policy search methods based on...