Model-Based Action Exploration

01/11/2018
by Glen Berseth, et al.

Deep reinforcement learning has made great strides in solving challenging motion control tasks. Recently there has been a significant amount of work on methods to exploit the data gathered during training, but less work has been done on good methods for generating data to learn from. For continuous action domains, the typical method for generating exploratory actions is to sample from a Gaussian distribution centred around the mean of a policy. Although these methods can find an optimal policy, in practice they do not scale well, and solving environments with many action dimensions becomes impractical. We consider learning a forward dynamics model to predict the result, (s_{t+1}), of taking a particular action, (a), given a specific observation of the state, (s_t). With such a model, we can do what comes more naturally to biological systems that have already collected experience: perform internal predictions of outcomes and endeavour to try actions we believe have a reasonable chance of success. This method greatly reduces the space of exploratory actions, increasing learning speed and enabling higher-quality solutions to difficult problems, such as robotic locomotion.
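
The contrast between Gaussian exploration and model-guided exploration can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify how predicted outcomes are turned into actions, so the candidate-sampling scheme, the function names (`gaussian_exploration`, `model_guided_exploration`), the `score` callback, and the toy dynamics below are all assumptions made for illustration.

```python
import numpy as np


def gaussian_exploration(policy_mean, sigma, rng):
    """Baseline: perturb the policy mean with isotropic Gaussian noise."""
    return policy_mean + sigma * rng.standard_normal(policy_mean.shape)


def model_guided_exploration(state, policy_mean, sigma, forward_model,
                             score, rng, n_candidates=32):
    """Hypothetical model-guided variant (an assumption, not the paper's method).

    Samples several candidate actions around the policy mean, predicts
    s_{t+1} for each with the forward model, and returns the candidate
    whose predicted next state has the highest score.
    """
    noise = sigma * rng.standard_normal((n_candidates, policy_mean.shape[0]))
    candidates = policy_mean + noise
    predicted = [forward_model(state, a) for a in candidates]
    scores = np.array([score(s_next) for s_next in predicted])
    return candidates[int(np.argmax(scores))]


def toy_forward_model(state, action):
    """Stand-in for a learned dynamics model: s_{t+1} = s_t + 0.1 * a."""
    return state + 0.1 * action


def toy_score(next_state):
    """Stand-in for a value estimate: prefer states close to the origin."""
    return -float(np.linalg.norm(next_state))
```

In this sketch the forward model filters the Gaussian samples before anything is executed in the environment, which is one simple way to narrow the set of exploratory actions; a learned model and a learned value function would replace the two toy functions.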

research
05/09/2011

A Real-Time Model-Based Reinforcement Learning Architecture for Robot Control

Reinforcement Learning (RL) is a method for learning decision-making tas...
research
09/05/2019

Learning Action-Transferable Policy with Action Embedding

Despite achieving great success on performance in various sequential dec...
research
09/18/2018

SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions

Although many reinforcement learning methods have been proposed for lear...
research
05/10/2019

Multi-Pass Q-Networks for Deep Reinforcement Learning with Parameterised Action Spaces

Parameterised actions in reinforcement learning are composed of discrete...
research
06/18/2021

Learning to Plan via a Multi-Step Policy Regression Method

We propose a new approach to increase inference performance in environme...
research
03/13/2020

Action for Better Prediction

Good prediction is necessary for autonomous robotics to make informed de...
research
03/20/2019

Reinforcing Classical Planning for Adversary Driving Scenarios

Adversary scenarios in driving, where the other vehicles may make mistak...
