CEIP: Combining Explicit and Implicit Priors for Reinforcement Learning with Demonstrations

10/18/2022
by   Kai Yan, et al.

Although reinforcement learning has found widespread use in dense reward settings, training autonomous agents with sparse rewards remains challenging. To address this difficulty, prior work has shown promising results when using not only task-specific demonstrations but also task-agnostic, albeit somewhat related, demonstrations. In most cases, the available demonstrations are distilled into an implicit prior, commonly represented via a single deep net. Explicit priors in the form of a database that can be queried have also been shown to lead to encouraging results. To better benefit from available demonstrations, we develop a method to Combine Explicit and Implicit Priors (CEIP). CEIP exploits multiple implicit priors in the form of normalizing flows in parallel to form a single complex prior. Moreover, CEIP uses an effective explicit retrieval and push-forward mechanism to condition the implicit priors. In three challenging environments, we find that the proposed CEIP method improves upon sophisticated state-of-the-art techniques.
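
To make the two kinds of prior in the abstract concrete, the following is a toy sketch only, not the CEIP implementation. It assumes an explicit prior can be pictured as a nearest-neighbor lookup in a small database of (state, action) demonstration pairs, and that several implicit priors can be pictured as 1-D affine flows merged by a weighted average of their scales and shifts. The names `nearest_demo` and `combined_affine_flow`, the data, and the weighting rule are all hypothetical; the abstract does not specify how CEIP combines its flows or how retrieval conditions them.

```python
import math

def nearest_demo(database, state):
    """Explicit prior (assumed form): return the stored (state, action)
    pair whose state is closest to the query in Euclidean distance."""
    return min(database, key=lambda pair: math.dist(pair[0], state))

def combined_affine_flow(z, flows, weights):
    """Implicit prior (assumed form): merge 1-D affine flows
    a = scale * z + shift by averaging scales and shifts with
    normalized weights. With positive scales and weights the
    merged map stays invertible."""
    total = sum(weights)
    scale = sum(w * s for w, (s, _) in zip(weights, flows)) / total
    shift = sum(w * b for w, (_, b) in zip(weights, flows)) / total
    return scale * z + shift

# Hypothetical demonstration database: ((state), action) pairs.
database = [((0.0, 0.0), -1.0), ((1.0, 1.0), 0.5), ((2.0, 0.0), 1.0)]
# Hypothetical per-task flows as (scale, shift), with mixing weights.
flows = [(1.0, 0.0), (2.0, 1.0)]
weights = [1.0, 3.0]

# Query the explicit prior for the demonstration nearest to a new state.
_, retrieved_action = nearest_demo(database, (0.9, 1.2))
# Push a latent sample through the merged implicit prior.
action = combined_affine_flow(0.5, flows, weights)
```

In this sketch the two priors are kept separate on purpose; how the retrieved demonstration would condition the flows is left open, since the abstract only names the mechanism.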

research
09/17/2018

Automata Guided Reinforcement Learning With Demonstrations

Tasks with complex temporal structures and long horizons pose a challeng...
research
03/21/2022

Self-Imitation Learning from Demonstrations

Despite the numerous breakthroughs achieved with Reinforcement Learning ...
research
05/21/2017

Experience enrichment based task independent reward model

For most reinforcement learning approaches, the learning is performed by...
research
12/01/2021

Using Deep Image Prior to Assist Variational Selective Segmentation Deep Learning Algorithms

Variational segmentation algorithms require a prior imposed in the form ...
research
10/26/2022

Leveraging Demonstrations with Latent Space Priors

Demonstrations provide insight into relevant state or action space regio...
research
02/21/2022

Learning Behavioral Soft Constraints from Demonstrations

Many real-life scenarios require humans to make difficult trade-offs: do...
research
09/01/2021

Implicit Behavioral Cloning

We find that across a wide range of robot policy learning scenarios, tre...
