Learning to Plan via a Multi-Step Policy Regression Method

06/18/2021
by Stefan Wagner, et al.

We propose a new approach to improve inference performance in environments that can only be solved by a specific sequence of actions. This is the case, for example, in maze environments, where ideally an optimal path is determined. Instead of learning a policy for a single step, we want to learn a policy that can predict n actions in advance. Our proposed method, called policy horizon regression (PHR), uses knowledge of the environment sampled by A2C to learn an n-dimensional policy vector in a policy distillation setup, which yields n sequential actions per observation. We test our method on the MiniGrid and Pong environments and show a drastic speedup at inference time by successfully predicting sequences of actions from a single observation.
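
The idea of emitting n actions from one observation can be illustrated with a minimal sketch. It is not the authors' implementation: the linear student model, the cross-entropy distillation loss, and all names (init_params, predict_action_sequence, distillation_loss) are assumptions made for illustration, and the teacher distributions stand in for whatever an A2C teacher would provide.

```python
import numpy as np

def init_params(obs_dim, n_steps, n_actions, seed=0):
    # Hypothetical linear student: one block of action logits per future step.
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(obs_dim, n_steps * n_actions))
    b = np.zeros(n_steps * n_actions)
    return W, b

def policy_logits(obs, params, n_steps, n_actions):
    # A single forward pass yields an (n_steps, n_actions) grid of logits.
    W, b = params
    return (obs @ W + b).reshape(n_steps, n_actions)

def predict_action_sequence(obs, params, n_steps, n_actions):
    # Greedy action for each of the n_steps future steps, from one observation.
    return policy_logits(obs, params, n_steps, n_actions).argmax(axis=1)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # shift for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_probs):
    # Assumed loss: mean cross-entropy between the teacher's per-step action
    # distributions and the student's, both shaped (n_steps, n_actions).
    log_p = np.log(softmax(student_logits) + 1e-12)
    return -(teacher_probs * log_p).sum(axis=1).mean()

if __name__ == "__main__":
    n_steps, n_actions, obs_dim = 4, 3, 8
    params = init_params(obs_dim, n_steps, n_actions)
    obs = np.ones(obs_dim)
    print(predict_action_sequence(obs, params, n_steps, n_actions))
```

In this sketch the environment would be queried once per n actions rather than once per action, which is where an inference-time saving could come from under these assumptions.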


