Learning to Solve Tasks with Exploring Prior Behaviours

07/06/2023
by Ruiqi Zhu, et al.

Demonstrations are widely used in Deep Reinforcement Learning (DRL) to help solve tasks with sparse rewards. However, in real-world scenarios the initial conditions of a task often differ from those in the demonstration, which requires the agent to acquire additional prior behaviours. For example, suppose we are given a demonstration of picking up an object from an open drawer, but the drawer is closed during training. Without acquiring the prior behaviour of opening the drawer, the robot is unlikely to solve the task. To address this, we propose Intrinsic Rewards Driven Example-based Control (IRDEC). Our method enables agents to explore and acquire the required prior behaviours and then connect them to the task-specific behaviours in the demonstration, solving sparse-reward tasks without requiring additional demonstrations of the prior behaviours. Our method outperforms the baselines on three navigation tasks and one robotic manipulation task with sparse rewards. Code is available at https://github.com/Ricky-Zhu/IRDEC.


