In this paper, we study the problem of optimal data collection for polic...
In real-world robotics applications, Reinforcement Learning (RL) agents ...
Intrinsic rewards are commonly applied to improve exploration in reinfor...
Temporal difference (TD) learning is one of the main foundations of mode...
The sim-to-real transfer problem deals with leveraging large amounts of ...
Signalized intersections are managed by controllers that assign right of...
In reinforcement learning, off-policy evaluation is the task of using da...
This paper is devoted to fair optimization in Multiobjective Markov Deci...