Zipfian environments for Reinforcement Learning

03/15/2022
by Stephanie C. Y. Chan, et al.

As humans and animals learn in the natural world, they encounter distributions of entities, situations and events that are far from uniform. Typically, a relatively small set of experiences is encountered frequently, while many important experiences occur only rarely. The highly skewed, heavy-tailed nature of reality poses particular learning challenges that humans and animals have met by evolving specialised memory systems. By contrast, most popular RL environments and benchmarks involve approximately uniform variation of properties, objects, situations or tasks. How will RL algorithms perform in worlds (like ours) where the distribution of environment features is far less uniform? To explore this question, we develop three complementary RL environments where the agent's experience varies according to a Zipfian (discrete power law) distribution. On these benchmarks, we find that standard Deep RL architectures and algorithms acquire useful knowledge of common situations and tasks, but fail to adequately learn about rarer ones. To understand this failure better, we explore how different aspects of current approaches may be adjusted to help improve performance on rare events, and show that the RL objective function, the agent's memory system and self-supervised learning objectives can all influence an agent's ability to learn from uncommon experiences. Together, these results show that learning robustly from skewed experience is a critical challenge for applying Deep RL methods beyond simulations or laboratories, and our Zipfian environments provide a basis for measuring future progress towards this goal.
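
As a rough illustration of what a Zipfian (discrete power law) distribution over experiences looks like, the sketch below samples item indices with probability proportional to 1 / rank^a. The function names, the exponent a = 1.0, the item count of 20 and the sample size are all assumptions chosen for illustration; they are not the parameters of the environments described above.

```python
import random
from collections import Counter


def zipf_probabilities(num_items, exponent=1.0):
    """Return p(k) proportional to 1 / k**exponent for ranks k = 1..num_items."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_items(num_items, num_samples, exponent=1.0, seed=0):
    """Draw item indices (0 = most common) from the Zipfian distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    probs = zipf_probabilities(num_items, exponent)
    return rng.choices(range(num_items), weights=probs, k=num_samples)


# Hypothetical setting: 20 items (e.g. objects or tasks), 10,000 encounters.
probs = zipf_probabilities(num_items=20)
counts = Counter(sample_items(num_items=20, num_samples=10_000))

# Compare the most common item with the rarest one.
print(f"p(rank 1) = {probs[0]:.3f}, p(rank 20) = {probs[-1]:.3f}")
print(f"sampled counts: rank 1 = {counts[0]}, rank 20 = {counts[19]}")
```

With an exponent of 1, the top-ranked item is 20 times more likely than the 20th, so an agent trained on such a stream sees the rare items only a small fraction of the time. Evaluating separately on common and rare items is one way to expose the gap described above.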


