Reinforcement learning algorithms commonly seek to optimize policies for...
In this paper, we present a data-driven strategy to simplify the deploym...
We study the problem of conservative off-policy evaluation (COPE) where ...
Trajectory optimization methods have achieved an exceptional level of
pe...
Learning optimal control policies directly on physical systems is challe...