Time Consistent Discounting

07/27/2011
by Tor Lattimore, et al.

A possibly immortal agent tries to maximise its summed discounted rewards over time, where discounting is used to avoid infinite utilities and encourage the agent to value current rewards more than future ones. Some commonly used discount functions lead to time-inconsistent behavior where the agent changes its plan over time. These inconsistencies can lead to very poor behavior. We generalise the usual discounted utility model to one where the discount function changes with the age of the agent. We then give a simple characterisation of time-(in)consistent discount functions and show the existence of a rational policy for an agent that knows its discount function is time-inconsistent.
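The plan-changing behavior described above can be illustrated with a minimal sketch. It is not the paper's formalism: the function names, the two-option scenario, and the parameter values (`gamma`, `kappa`, the rewards and arrival times) are all assumptions chosen for illustration. It compares geometric discounting with a hyperbolic discount applied to the delay from the current time, a textbook example of a discount function that produces preference reversal.

```python
# Illustrative sketch only: names and numbers are hypothetical, not from the paper.

def geometric(delay, gamma=0.9):
    """Geometric discount: weight gamma**delay for a reward `delay` steps away."""
    return gamma ** delay

def hyperbolic(delay, kappa=1.0):
    """Hyperbolic discount: weight 1 / (1 + kappa * delay)."""
    return 1.0 / (1.0 + kappa * delay)

def preferred(discount, now, options):
    """Return the label of the option with the highest discounted value at time `now`."""
    return max(options, key=lambda o: o[2] * discount(o[1] - now))[0]

# Each option is (label, arrival time, reward).
OPTIONS = [("small-soon", 10, 1.0), ("large-late", 11, 1.5)]

for name, discount in [("geometric", geometric), ("hyperbolic", hyperbolic)]:
    # Ask the agent which option it prefers at time 0 and again at time 10.
    plan = [preferred(discount, t, OPTIONS) for t in (0, 10)]
    print(name, plan)
```

Under these assumed numbers the geometric agent prefers the larger, later reward at both times, while the hyperbolic agent prefers it at time 0 but switches to the smaller, sooner reward once time 10 arrives. This is the kind of change of plan that the abstract calls time-inconsistent.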


research
06/23/2019

An AGI with Time-Inconsistent Preferences

This paper reveals a trap for artificial general intelligence (AGI) theo...
research
02/16/2023

Analytically Tractable Models for Decision Making under Present Bias

Time-inconsistency is a characteristic of human behavior in which people...
research
12/01/2021

On the Practical Consistency of Meta-Reinforcement Learning Algorithms

Consistency is the theoretical property of a meta learning algorithm tha...
research
12/06/2021

Inconsistent Planning: When in doubt, toss a coin!

One of the most widespread human behavioral biases is the present bias –...
research
11/16/2011

Model-based Utility Functions

Orseau and Ring, as well as Dewey, have recently described problems, inc...
research
04/27/2020

Diversity in Action: General-Sum Multi-Agent Continuous Inverse Optimal Control

Traffic scenarios are inherently interactive. Multiple decision-makers p...
research
09/05/2007

Simple Algorithmic Principles of Discovery, Subjective Beauty, Selective Attention, Curiosity & Creativity

I postulate that human or other intelligent agents function or should fu...
