PLUNDER: Probabilistic Program Synthesis for Learning from Unlabeled and Noisy Demonstrations

03/02/2023
by   Jimmy Xin, et al.

Learning from demonstration (LfD) is a widely researched paradigm for teaching robots to perform novel tasks. LfD works particularly well with program synthesis because the resulting programmatic policy is data-efficient, interpretable, and amenable to formal verification. However, existing synthesis approaches to LfD rely on precise, labeled demonstrations and cannot reason about the uncertainty inherent in human decision-making. In this paper, we propose PLUNDER, a new LfD approach that integrates a probabilistic program synthesizer into an expectation-maximization (EM) loop to overcome these limitations. PLUNDER requires only unlabeled low-level demonstrations of the intended task (e.g., remote-controlled motion trajectories), which frees end users from providing explicit labels and makes for a more intuitive LfD experience. PLUNDER also generates a probabilistic policy that captures actuation errors and the uncertainties inherent in human decision-making. Our experiments compare PLUNDER with state-of-the-art LfD techniques and demonstrate its advantages across different robotic tasks.
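To make the EM idea in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a one-dimensional robot with two hypothetical high-level labels (`GO`, `STOP`), a hypothetical Gaussian motor model that maps each label to a noisy acceleration, and a one-parameter probabilistic policy `p_go`. A grid search over the policy threshold stands in for the probabilistic program synthesizer, and each sample is treated independently, which ignores the temporal structure of real trajectories. All names and constants are illustrative assumptions.

```python
import math
import random

# Hypothetical motor model: each high-level label yields a noisy acceleration.
MOTOR_MEAN = {"GO": 1.0, "STOP": -1.0}
MOTOR_SIGMA = 0.5
SLOPE = 2.0  # assumed fixed sharpness of the probabilistic policy


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def p_go(dist, threshold):
    """Toy probabilistic policy: probability of choosing GO at a given distance."""
    return sigmoid(SLOPE * (dist - threshold))


def make_demos(n, true_threshold, rng):
    """Simulate unlabeled demonstrations: only (distance, acceleration) is kept."""
    demos = []
    for _ in range(n):
        dist = rng.uniform(0.0, 10.0)
        label = "GO" if rng.random() < p_go(dist, true_threshold) else "STOP"
        demos.append((dist, rng.gauss(MOTOR_MEAN[label], MOTOR_SIGMA)))
    return demos


def e_step(demos, threshold):
    """Posterior probability that each sample's hidden label is GO."""
    posts = []
    for dist, accel in demos:
        prior = p_go(dist, threshold)
        w_go = prior * gauss_pdf(accel, MOTOR_MEAN["GO"], MOTOR_SIGMA)
        w_stop = (1.0 - prior) * gauss_pdf(accel, MOTOR_MEAN["STOP"], MOTOR_SIGMA)
        posts.append(w_go / (w_go + w_stop))
    return posts


def m_step(demos, posts, candidates):
    """Stand-in for synthesis: pick the candidate policy with the best
    expected log-likelihood under the soft labels from the E-step."""
    def score(threshold):
        total = 0.0
        for (dist, _), q in zip(demos, posts):
            p = min(max(p_go(dist, threshold), 1e-9), 1.0 - 1e-9)
            total += q * math.log(p) + (1.0 - q) * math.log(1.0 - p)
        return total
    return max(candidates, key=score)


def fit(demos, init_threshold=2.0, iters=20):
    """Alternate E-step and M-step starting from an initial guess."""
    candidates = [i * 0.25 for i in range(41)]  # thresholds 0.0 .. 10.0
    threshold = init_threshold
    for _ in range(iters):
        posts = e_step(demos, threshold)
        threshold = m_step(demos, posts, candidates)
    return threshold


if __name__ == "__main__":
    demos = make_demos(500, true_threshold=6.0, rng=random.Random(0))
    print("estimated threshold:", fit(demos))
```

The E-step infers soft labels from the low-level signal alone, and the M-step refits the policy to those soft labels, so no hand-labeled demonstrations are needed. In this sketch the policy space is a single numeric parameter; a program synthesizer would instead search over a space of candidate programs.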

research
09/07/2023

Learning from Demonstration via Probabilistic Diagrammatic Teaching

Learning for Demonstration (LfD) enables robots to acquire new skills by...
research
05/04/2023

Program Synthesis for Robot Learning from Demonstrations

This paper presents a new synthesis-based approach for solving the Learn...
research
06/05/2023

Knowledge-Driven Robot Program Synthesis from Human VR Demonstrations

Aging societies, labor shortages and increasing wage costs call for assi...
research
05/20/2018

Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications

Inverse reinforcement learning (IRL) infers a reward function from demon...
research
07/23/2020

Semi-supervised Learning From Demonstration Through Program Synthesis: An Inspection Robot Case Study

Semi-supervised learning improves the performance of supervised machine ...
research
02/18/2021

Learning Memory-Dependent Continuous Control from Demonstrations

Efficient exploration has presented a long-standing challenge in reinfor...
research
03/15/2012

Anytime Planning for Decentralized POMDPs using Expectation Maximization

Decentralized POMDPs provide an expressive framework for multi-agent seq...
