How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies

12/07/2015
by Vincent Francois-Lavet, et al.

Using deep neural nets as function approximators for reinforcement learning tasks has recently been shown to be very powerful for solving problems approaching real-world complexity. Using these results as a benchmark, we discuss the role that the discount factor may play in the quality of the learning process of a deep Q-network (DQN). When the discount factor progressively increases up to its final value, we empirically show that it is possible to significantly reduce the number of learning steps. When used in conjunction with a varying learning rate, we empirically show that this approach outperforms the original DQN on several experiments. We relate this phenomenon to the instabilities of neural networks when they are used in an approximate Dynamic Programming setting. We also describe the possibility of falling into a local optimum during the learning process, thus connecting our discussion with the exploration/exploitation dilemma.
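As a rough illustration of the idea in the abstract, the sketch below raises the discount factor toward a final value while lowering the learning rate, and shows where the discount factor enters a Q-learning target. The abstract does not state the update rules, so the geometric forms used here, the function names (`discount_schedule`, `learning_rate_schedule`, `q_target`), and every constant are assumptions chosen for illustration only.

```python
def discount_schedule(gamma_start, gamma_final, shrink, n_epochs):
    """Return one discount factor per epoch, rising toward gamma_final.

    Assumed rule (illustrative): the gap (1 - gamma) shrinks by the
    factor `shrink` each epoch, and gamma is capped at gamma_final.
    """
    gammas = []
    gamma = gamma_start
    for _ in range(n_epochs):
        gammas.append(gamma)
        gamma = min(gamma_final, 1.0 - shrink * (1.0 - gamma))
    return gammas


def learning_rate_schedule(lr_start, decay, n_epochs):
    """Return one learning rate per epoch, decaying geometrically (assumed rule)."""
    return [lr_start * decay ** k for k in range(n_epochs)]


def q_target(reward, next_q_max, gamma, terminal):
    """One-step Q-learning target using the current epoch's discount factor."""
    if terminal:
        return reward
    return reward + gamma * next_q_max


if __name__ == "__main__":
    # Hypothetical settings, not taken from the paper's abstract.
    gammas = discount_schedule(gamma_start=0.9, gamma_final=0.99,
                               shrink=0.98, n_epochs=200)
    lrs = learning_rate_schedule(lr_start=0.00025, decay=0.98, n_epochs=200)
    for epoch in (0, 50, 100, 199):
        print(epoch, round(gammas[epoch], 4), lrs[epoch])
```

A low initial discount factor shortens the effective planning horizon early in training, and the schedule then lengthens it gradually. Whether a given schedule helps on a particular task is an empirical question.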


research
08/15/2023

Deep reinforcement learning for process design: Review and perspective

The transformation towards renewable energy and feedstock supply in the ...
research
12/12/2020

Tutoring Reinforcement Learning via Feedback Control

We introduce a control-tutored reinforcement learning (CTRL) algorithm. ...
research
12/01/2021

Neural Stochastic Dual Dynamic Programming

Stochastic dual dynamic programming (SDDP) is a state-of-the-art method ...
research
08/07/2020

Towards Sample Efficient Agents through Algorithmic Alignment

Deep reinforcement-learning agents have demonstrated great success on va...
research
09/25/2018

Anderson Acceleration for Reinforcement Learning

Anderson acceleration is an old and simple method for accelerating the c...
research
11/11/2021

Agent Spaces

Exploration is one of the most important tasks in Reinforcement Learning...
research
06/09/2011

Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks

This paper discusses a system that accelerates reinforcement learning by...
