Exploration With a Finite Brain

01/27/2022
by Marcel Binz, et al.

Equipping artificial agents with useful exploration mechanisms remains a challenge to this day. Humans, on the other hand, seem to manage the trade-off between exploration and exploitation effortlessly. In the present article, we put forward the hypothesis that humans accomplish this by making optimal use of limited computational resources. We study this hypothesis by meta-learning reinforcement learning algorithms that sacrifice performance for a shorter description length. The emerging class of models captures human exploration behavior better than previously considered approaches, such as Boltzmann exploration, upper confidence bound algorithms, and Thompson sampling. We additionally demonstrate that changes in description length produce the intended effects: reducing description length captures the behavior of brain-lesioned patients, while increasing it echoes cognitive development during adolescence.
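For readers unfamiliar with the three baseline strategies named above, the following is a minimal sketch of how each one might choose an arm in a multi-armed bandit. It is not the authors' implementation or their meta-learned model. It assumes each arm is summarized by a Gaussian belief with a mean and a standard deviation, and the function names and the `temperature` and `bonus` parameters are illustrative choices only.

```python
import math
import random


def boltzmann_choice(means, temperature, rng):
    """Sample an arm with probability proportional to exp(mean / temperature)."""
    top = max(means)  # subtract the maximum for numerical stability
    weights = [math.exp((mu - top) / temperature) for mu in means]
    return rng.choices(range(len(means)), weights=weights, k=1)[0]


def ucb_choice(means, stds, bonus):
    """Pick the arm with the highest mean plus an uncertainty bonus."""
    scores = [mu + bonus * sd for mu, sd in zip(means, stds)]
    return max(range(len(scores)), key=scores.__getitem__)


def thompson_choice(means, stds, rng):
    """Draw one sample per arm from its (assumed Gaussian) belief, pick the largest."""
    samples = [rng.gauss(mu, sd) for mu, sd in zip(means, stds)]
    return max(range(len(samples)), key=samples.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    means = [0.0, 0.5]   # hypothetical belief means for two arms
    stds = [2.0, 0.1]    # arm 0 is far more uncertain than arm 1
    print("Boltzmann:", boltzmann_choice(means, temperature=1.0, rng=rng))
    print("UCB:", ucb_choice(means, stds, bonus=1.0))
    print("Thompson:", thompson_choice(means, stds, rng=rng))
```

In this sketch, Boltzmann exploration ignores uncertainty and randomizes over estimated values, the upper confidence bound rule adds a deterministic bonus for uncertain arms, and Thompson sampling randomizes in proportion to uncertainty.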

