research ∙ 04/08/2022
Approximate discounting-free policy evaluation from transient and recurrent states
In order to distinguish policies that prescribe good from bad actions in...
research ∙ 07/03/2021
Examining average and discounted reward optimality criteria in reinforcement learning
In reinforcement learning (RL), the goal is to obtain an optimal policy,...
research ∙ 05/28/2021
A nearly Blackwell-optimal policy gradient method
For continuing environments, reinforcement learning methods commonly max...
research ∙ 10/18/2020