In an offline reinforcement learning setting, the safe policy improvemen...
This position paper reflects on the state-of-the-art in decision-making ...
We study safe policy improvement (SPI) for partially observable Markov d...
Markov decision processes (MDPs) are formal models commonly used in sequ...
Uncertain partially observable Markov decision processes (uPOMDPs) allow...