On the Reliability and Generalizability of Brain-inspired Reinforcement Learning Algorithms

07/09/2020
by Dongjae Kim et al.

Although deep RL models have shown great potential for solving various types of tasks with minimal supervision, several key challenges remain in terms of learning from limited experience, adapting to environmental changes, and generalizing learning from a single task. Recent evidence in decision neuroscience has shown that the human brain has an innate capacity to resolve these issues, leading to optimism regarding the development of neuroscience-inspired solutions toward sample-efficient and generalizable RL algorithms. We show that a computational model combining model-based and model-free control, which we term the prefrontal RL, reliably encodes the high-level policies that humans learned, and that this model can generalize the learned policy to a wide range of tasks. First, we trained the prefrontal RL and deep RL algorithms on data from 82 subjects, collected while human participants were performing two-stage Markov decision tasks in which we manipulated the goal, state-transition uncertainty, and state-space complexity. In the reliability test, which includes the latent behavior profile and the parameter recoverability test, we showed that the prefrontal RL reliably learned the latent policies of the humans, while all the other models failed. Second, to test the ability of these models to generalize what they learned from the original task, we situated them in the context of environmental volatility. Specifically, we ran large-scale simulations with 10 Markov decision tasks in which latent context variables change over time. Our information-theoretic analysis revealed that the prefrontal RL achieved the highest level of adaptability and episodic encoding efficacy. This is the first attempt to formally test the possibility that computational models mimicking the way the brain solves general problems can lead to practical solutions to key challenges in machine learning.
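The core idea of combining model-based (MB) and model-free (MF) control can be sketched as a weighted mixture of the two controllers' action values. The weighting rule, parameter names, and values below are illustrative assumptions for exposition, not the authors' exact prefrontal RL model.

```python
import numpy as np

def softmax(q, beta=5.0):
    """Convert action values to choice probabilities (numerically stable)."""
    z = beta * (q - q.max())
    p = np.exp(z)
    return p / p.sum()

def arbitrate(q_mf, q_mb, w_mb):
    """Mix MF and MB action values, with weight w_mb on the MB controller.

    In arbitration schemes, w_mb would itself be updated over time
    (e.g., from each controller's prediction-error reliability); here it
    is a fixed illustrative parameter.
    """
    return w_mb * q_mb + (1.0 - w_mb) * q_mf

# Toy example: two actions; the MF controller favors action 0,
# the MB controller favors action 1.
q_mf = np.array([1.0, 0.2])
q_mb = np.array([0.1, 1.5])

q_mix = arbitrate(q_mf, q_mb, w_mb=0.8)  # MB-dominant regime
probs = softmax(q_mix)                   # action 1 is now preferred
```

In an MB-dominant regime (large `w_mb`), the mixed policy follows the model-based preference; lowering `w_mb` shifts control back toward the habitual, model-free values.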

