We study the problem of unsupervised skill discovery, whose goal is to l...
In reinforcement learning, continuous time is often discretized by a tim...
Having the ability to acquire inherent skills from environments without ...
We propose a novel information bottleneck (IB) method named Drop-Bottlen...
Policy optimization struggles when the reward feedback signal is very sp...