We study parameterized MDPs (PMDPs) in which the key parameters of inter...
We present pyRDDLGym, a Python framework for auto-generation of OpenAI G...
Planning provides a framework for optimizing sequential decisions in com...
Resolving the exploration-exploitation trade-off remains a fundamental p...
Learning from demonstrations (LfD) improves the exploration efficiency o...
Reinforcement learning methods that consider the context, or current sta...