Extending Environments To Measure Self-Reflection In Reinforcement Learning

10/13/2021
by Samuel Allen Alexander, et al.

We consider an extended notion of reinforcement learning in which the environment can simulate the agent and base its outputs on the agent's hypothetical behavior. Since good performance usually requires paying attention to whatever the environment's outputs are based on, we argue that an agent must self-reflect in order to achieve good average performance across many such extended environments. Thus, an agent's self-reflection ability can be numerically estimated by running the agent through a battery of extended environments. We are simultaneously releasing an open-source library of extended environments to serve as a proof of concept of this technique. As the library is the first of its kind, we have avoided the difficult problem of optimizing it. Instead, we have chosen environments with interesting properties: some seem paradoxical, some lead to interesting thought experiments, and some even suggest how self-reflection might have evolved in nature. We give examples and introduce a simple transformation which, in experiments, seems to increase self-reflection.
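To make the idea of an environment that simulates the agent more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the API of the released library. The agent interface `agent(history, reward, obs)`, the function names, and the specific reward rule (reward the agent for acting as it would have acted had all past rewards been zero) are all hypothetical choices made for this example. Agents are assumed to be deterministic so that simulating them is meaningful.

```python
from typing import Callable, List, Tuple

# Hypothetical interface for this sketch only (not the released library's API).
Step = Tuple[float, int, int]                     # (reward, observation, action)
Agent = Callable[[List[Step], float, int], int]   # (history, reward, obs) -> action


def ignore_rewards_reward(agent: Agent, history: List[Step], obs: int, action: int) -> float:
    """Extended-environment reward: simulate the agent on a counterfactual history.

    The environment asks what the agent WOULD do if every reward so far had
    been zero, and rewards the real action only if it matches.
    """
    zeroed = [(0.0, o, a) for (_, o, a) in history]
    hypothetical = agent(zeroed, 0.0, obs)
    return 1.0 if action == hypothetical else -1.0


def run(agent: Agent, steps: int) -> float:
    """Run the agent for a number of steps and return its average reward."""
    history: List[Step] = []
    reward, obs = 0.0, 0   # initial percept; the observation is constant here
    total = 0.0
    for _ in range(steps):
        action = agent(history, reward, obs)
        next_reward = ignore_rewards_reward(agent, history, obs, action)
        history.append((reward, obs, action))
        reward = next_reward
        total += reward
    return total / steps


def constant_agent(history: List[Step], reward: float, obs: int) -> int:
    """Ignores all rewards, so it always matches its own simulation."""
    return 0


def reward_chaser(history: List[Step], reward: float, obs: int) -> int:
    """Switches action whenever the latest reward is positive."""
    return 1 if reward > 0 else 0


if __name__ == "__main__":
    print(run(constant_agent, 10))  # 1.0
    print(run(reward_chaser, 10))   # 0.0
```

In this sketch, the agent that reacts to rewards is penalized every time its reaction makes it diverge from its simulated counterpart, while the agent that ignores rewards scores perfectly. Averaging such scores over many different extended environments would give the kind of numerical estimate described above.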
