Tolerance-Guided Policy Learning for Adaptable and Transferrable Delicate Industrial Insertion

08/04/2021
by Boshen Niu*, et al.

Policy learning for delicate industrial insertion tasks (e.g., PC board assembly) is challenging. This paper considers two major problems: how to learn a diversified policy (instead of a single average policy) that can efficiently handle different workpieces with a minimal amount of training data, and how to handle workpiece defects during insertion. To address these problems, we propose tolerance-guided policy learning. To encourage transferability of the learned policy to different workpieces, we add a task embedding, derived from the insertion tolerance, to the policy's input space. We then train the policy using generative adversarial imitation learning with reward shaping (RS-GAIL) on a variety of representative situations. To encourage adaptability of the learned policy to defects, we build a probabilistic inference model that uses the tolerance model to output the best insertion pose given previously failed insertions. The best insertion pose is then used as a reference for the learned policy. The proposed method is validated on a sequence of IC socket insertion tasks in simulation. The results show that 1) RS-GAIL can efficiently learn optimal policies under sparse rewards; 2) the tolerance embedding can enhance the transferability of the learned policy; and 3) the probabilistic inference makes the policy robust to defects on the workpieces.
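The abstract does not spell out the probabilistic inference model, so the following is only a minimal sketch of the general idea of inferring a likely insertion pose from failed attempts. Everything in it is an assumption made for illustration: the one-dimensional grid of candidate poses, the Gaussian-shaped success likelihood scaled by the tolerance, and the names `update_posterior` and `best_pose` are hypothetical and are not taken from the paper.

```python
import math


def update_posterior(prior, candidates, attempted, tolerance):
    """Bayes update over candidate poses after one FAILED insertion.

    Assumption (not from the paper): an attempt at `attempted` would have
    succeeded with probability exp(-0.5 * ((x - attempted) / tolerance)**2)
    if the true pose were x, so a failure makes nearby poses less likely.
    """
    unnormalized = []
    for p, x in zip(prior, candidates):
        p_success = math.exp(-0.5 * ((x - attempted) / tolerance) ** 2)
        unnormalized.append(p * (1.0 - p_success))
    total = sum(unnormalized)
    if total == 0.0:
        # Every candidate was ruled out; fall back to the previous belief.
        return list(prior)
    return [u / total for u in unnormalized]


def best_pose(candidates, failed_attempts, tolerance):
    """Return the most probable candidate pose and the full posterior."""
    # Start from a uniform belief over the candidate poses.
    posterior = [1.0 / len(candidates)] * len(candidates)
    for attempted in failed_attempts:
        posterior = update_posterior(posterior, candidates, attempted, tolerance)
    best_index = max(range(len(candidates)), key=posterior.__getitem__)
    return candidates[best_index], posterior


# Hypothetical offsets (arbitrary units) and a hypothetical tolerance.
candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
pose, posterior = best_pose(candidates, failed_attempts=[0.0, 0.5], tolerance=0.2)
```

In this sketch a tighter tolerance narrows the region that each failure rules out, which is one simple way the tolerance could shape the inference. In the setting the abstract describes, the resulting pose would then be passed to the learned policy as its reference.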


research
03/06/2022

A Composable Framework for Policy Design, Learning, and Transfer Toward Safe and Efficient Industrial Insertion

Delicate industrial insertion tasks (e.g., PC board assembly) remain cha...
research
12/03/2018

Generative Adversarial Self-Imitation Learning

This paper explores a simple regularizer for reinforcement learning by p...
research
12/01/2022

Multi-Task Imitation Learning for Linear Dynamical Systems

We study representation learning for efficient imitation learning over l...
research
05/25/2018

Learning Self-Imitating Diverse Policies

Deep reinforcement learning algorithms, including policy gradient method...
research
04/05/2023

Goal-Conditioned Imitation Learning using Score-based Diffusion Policies

We propose a new policy representation based on score-based diffusion mo...
research
12/06/2021

Guided Imitation of Task and Motion Planning

While modern policy optimization methods can do complex manipulation fro...
