A State Representation for Diminishing Rewards

09/07/2023
by Ted Moskovitz, et al.

A common setting in multitask reinforcement learning (RL) demands that an agent rapidly adapt to various stationary reward functions randomly sampled from a fixed distribution. In such situations, the successor representation (SR) is a popular framework that supports rapid policy evaluation by decoupling a policy's expected discounted, cumulative state occupancies from a specific reward function. However, in the natural world, sequential tasks are rarely independent, and instead reflect shifting priorities based on the availability and subjective perception of rewarding stimuli. Reflecting this disjunction, in this paper we study the phenomenon of diminishing marginal utility and introduce a novel state representation, the λ representation (λR), which, surprisingly, is required for policy evaluation in this setting and which generalizes the SR as well as several other state representations from the literature. We establish the λR's formal properties and examine its normative advantages in the context of machine learning, as well as its usefulness for studying natural behaviors, particularly foraging.
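To make the decoupling mentioned in the abstract concrete, here is a minimal sketch of tabular SR policy evaluation. It covers only the standard SR with stationary rewards, not the λR. The three-state chain, the discount `GAMMA`, the reward vectors, and the helper names are illustrative assumptions rather than anything taken from the paper, and the sketch assumes the convention that a reward is collected in the currently occupied state.

```python
# Sketch only: a tabular successor representation (SR) for a fixed policy.
# The 3-state chain, GAMMA, and the reward vectors are illustrative
# assumptions, not values taken from the paper.

GAMMA = 0.9

# P[s][s2]: probability of moving from state s to s2 under the fixed policy.
# Here the policy deterministically cycles 0 -> 1 -> 2 -> 0.
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]


def successor_representation(P, gamma, n_iters=500):
    """Iterate M <- I + gamma * P M, whose fixed point is (I - gamma P)^-1."""
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(n_iters):
        M = [
            [
                identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M


def evaluate(M, r):
    """Value of each state under reward vector r: V = M r."""
    return [sum(m_ij * r_j for m_ij, r_j in zip(row, r)) for row in M]


# M is computed once; it depends on the policy and dynamics, not on rewards.
M = successor_representation(P, GAMMA)

# The same M is reused to evaluate the policy under two reward functions.
v_goal_at_2 = evaluate(M, [0.0, 0.0, 1.0])
v_goal_at_0 = evaluate(M, [1.0, 0.0, 0.0])
```

Evaluating a new reward function here costs a single matrix-vector product because `M` is not recomputed. That shortcut relies on the reward at each state staying fixed across visits, which is the assumption the abstract relaxes when it turns to diminishing marginal utility.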


