Keep it stupid simple

09/10/2018
by Erik J Peterson, et al.

Deep reinforcement learning can match and exceed human performance, but if even minor changes are introduced to the environment, artificial networks often can't adapt. Humans, meanwhile, are quite adaptable. We hypothesize that this is partly because humans use heuristics, and partly because humans can imagine new and more challenging environments to learn from. We've developed a model of hierarchical reinforcement learning that combines both of these elements into a stumbler-strategist network. We test the transfer performance of this network using Wythoff's game, a gridworld environment with a known optimal strategy. We show that combining imagined play with a heuristic--labeling each position as "good" or "bad"--both accelerates learning and promotes transfer to novel games, while also improving model interpretability.
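The "good"/"bad" labeling the abstract mentions has a closed form for Wythoff's game: the cold (losing-to-move) positions are the pairs (⌊nφ⌋, ⌊nφ²⌋) built from the golden ratio φ. A minimal sketch of such a labeler is below; the function names are illustrative, not taken from the paper.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def is_cold(x, y):
    """True if (x, y) is a cold position in Wythoff's game.

    Cold positions are (floor(n*PHI), floor(n*PHI**2)) for n >= 0,
    in either order; every other position is hot ("bad" to be in,
    "good" to move an opponent into).
    """
    a, b = min(x, y), max(x, y)
    n = b - a
    return a == math.floor(n * PHI)

def label_board(size):
    """Label each position of a size x size gridworld 'good' or 'bad'."""
    return [["good" if is_cold(x, y) else "bad" for y in range(size)]
            for x in range(size)]
```

For example, (1, 2) and (3, 5) are cold, while (1, 1) is hot; a heuristic built this way gives the strategist layer a cheap position-quality signal without any lookahead.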

