Value-Informed Skill Chaining for Policy Learning of Long-Horizon Tasks with Surgical Robot

07/31/2023
by Tao Huang, et al.

Reinforcement learning still struggles to solve long-horizon surgical robot tasks, which involve multiple steps over an extended duration, because of the policy exploration challenge. Recent methods tackle this problem with skill chaining: the long-horizon task is decomposed into multiple subtasks to ease the exploration burden, and the subtask policies are temporally connected to complete the whole task. However, smoothly connecting all subtask policies is difficult in surgical robot scenarios. Not all states are equally suitable for connecting two adjacent subtasks: an undesirable terminal state of the previous subtask can make the current subtask policy unstable and result in a failed execution. In this work, we introduce value-informed skill chaining (ViSkill), a novel reinforcement learning framework for long-horizon surgical robot tasks. The core idea is to distinguish which terminal states are suitable for starting all the following subtask policies. To this end, we introduce a state value function that estimates the expected success probability of the entire task given a state. Based on this value function, a chaining policy is learned to instruct subtask policies to terminate at the state with the highest value, so that all subsequent policies are more likely to be connected and accomplish the task. We demonstrate the effectiveness of our method on three complex surgical robot tasks from SurRoL, a comprehensive surgical simulation platform, achieving high task success rates and execution efficiency. Code is available at https://github.com/med-air/ViSkill.
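The value-informed selection described in the abstract can be illustrated with a minimal sketch. This is not the ViSkill implementation: the paper learns a chaining policy, whereas the sketch below swaps in a plain argmax over a hand-supplied set of candidate terminal states. The names `toy_success_value` and `select_terminal_state`, the 2-D states, and the distance-based value function are all hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Sequence, Tuple

State = Tuple[float, ...]


def toy_success_value(state: State, goal: State = (1.0, 1.0)) -> float:
    """Hypothetical stand-in for a learned state value function.

    Returns a pseudo-probability in (0, 1] that decays with the
    Euclidean distance from the state to an assumed goal.
    """
    return math.exp(-math.dist(state, goal))


def select_terminal_state(
    candidates: Sequence[State],
    value_fn: Callable[[State], float],
) -> State:
    """Pick the candidate terminal state with the highest estimated value.

    Assumption: a simple argmax replaces the learned chaining policy.
    """
    if not candidates:
        raise ValueError("at least one candidate terminal state is required")
    return max(candidates, key=value_fn)


# Three made-up states in which the previous subtask could terminate.
candidates = [(0.0, 0.0), (0.5, 0.9), (0.9, 1.0)]
best = select_terminal_state(candidates, toy_success_value)
print(best, round(toy_success_value(best), 3))
```

In this toy setting the state closest to the assumed goal is selected, which mirrors the idea that the previous subtask should end where the remaining subtasks are most likely to succeed.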

research
02/20/2023

Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot

Task automation of surgical robot has the potentials to improve surgical...
research
11/15/2021

Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization

Skill chaining is a promising approach for synthesizing complex behavior...
research
11/04/2021

Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning

Reinforcement learning can train policies that effectively perform compl...
research
05/12/2021

Learning a Skill-sequence-dependent Policy for Long-horizon Manipulation Tasks

In recent years, the robotics community has made substantial progress in...
research
01/01/2023

Human-in-the-loop Embodied Intelligence with Interactive Simulation Environment for Surgical Robot Learning

Surgical robot automation has attracted increasing research interest ove...
research
09/21/2021

Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks

In this paper, we study the problem of learning a repertoire of low-leve...
research
11/24/2020

C-Learning: Horizon-Aware Cumulative Accessibility Estimation

Multi-goal reaching is an important problem in reinforcement learning ne...
