Composable Learning with Sparse Kernel Representations

03/26/2021
by Ekaterina Tolstaya, et al.

We present a reinforcement learning algorithm for learning sparse non-parametric controllers in a Reproducing Kernel Hilbert Space. We improve the sample complexity of this approach by imposing a structure on the state-action function through a normalized advantage function (NAF). This representation of the policy makes it possible to compose multiple learned models efficiently, without additional training samples or interaction with the environment. We demonstrate the performance of this algorithm by learning obstacle-avoidance policies in multiple simulations of a robot equipped with a laser scanner navigating a 2D environment. We apply the composition operation to various policy combinations and test them to show that the composed policies retain the performance of their components. We also transfer the composed policy directly to a physical platform operating in an arena with obstacles in order to demonstrate a degree of generalization.
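
To make the NAF structure concrete, the following is a minimal sketch, not the authors' implementation. It uses the standard NAF form, Q(s, a) = V(s) - 0.5 (a - pi(s))^T P (a - pi(s)), with V and pi written as kernel expansions over a small dictionary of states. The Gaussian kernel, the constant precision matrix P, the class name KernelNAF, and all numeric values are assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def gaussian_kernel(x, centers, bandwidth=1.0):
    """Gaussian kernel between one state x and each dictionary point (assumed kernel)."""
    diffs = centers - x
    return np.exp(-np.sum(diffs ** 2, axis=1) / (2.0 * bandwidth ** 2))


class KernelNAF:
    """Hypothetical NAF-structured Q-function with kernel expansions for V and pi.

    Q(s, a) = V(s) - 0.5 * (a - pi(s))^T P (a - pi(s))
    """

    def __init__(self, centers, v_weights, pi_weights, precision, bandwidth=1.0):
        self.centers = np.asarray(centers, dtype=float)        # (M, state_dim)
        self.v_weights = np.asarray(v_weights, dtype=float)    # (M,)
        self.pi_weights = np.asarray(pi_weights, dtype=float)  # (M, action_dim)
        # Assumed constant and positive definite, so Q is concave in the action.
        self.precision = np.asarray(precision, dtype=float)    # (action_dim, action_dim)
        self.bandwidth = bandwidth

    def value(self, s):
        """V(s) as a weighted sum of kernel evaluations."""
        return gaussian_kernel(s, self.centers, self.bandwidth) @ self.v_weights

    def policy(self, s):
        """pi(s): the action that maximizes Q(s, .) under the NAF structure."""
        return gaussian_kernel(s, self.centers, self.bandwidth) @ self.pi_weights

    def q(self, s, a):
        """Q(s, a): value minus a quadratic penalty for deviating from pi(s)."""
        delta = np.asarray(a, dtype=float) - self.policy(s)
        return self.value(s) - 0.5 * delta @ self.precision @ delta


# Illustrative numbers only: two dictionary points, 2-D state, 1-D action.
model = KernelNAF(
    centers=[[0.0, 0.0], [1.0, 1.0]],
    v_weights=[1.0, -0.5],
    pi_weights=[[0.3], [-0.2]],
    precision=[[2.0]],
)
s = np.array([0.5, 0.2])
a_star = model.policy(s)
```

Because the advantage term is a negative quadratic centered at pi(s), the greedy action is available in closed form as `model.policy(s)`, with no search over actions. In this sketch, `model.q(s, a_star)` equals `model.value(s)`, and any other action scores lower.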
